Shadermark benches for NV30 Vs R300

BoardBonobo

My hat is white(ish)!
Veteran
Don't think anybody's posted this yet, but there's a posting over @ H[OCP] with a set of ShaderMark benches. The FX gets systematically slaughtered in the DX9 tests. See it here.
 
Yes, but it doesn't mean anything...yet.

If the DX9 shader performance is still looking this poor in a week, be worried. Until then, there's no point.
 
Given that the fixed-function stuff appears to be running fine and fast, the drivers must be utter crap right now! There's definitely the potential for some awesome performance, though...
 
Yes, but the fixed-function stuff should also improve significantly. It's likely using the exact same code right now as the GeForce3/4 used. As nVidia's driver team learns the idiosyncrasies of the NV30 architecture, additional tweaking should help performance a fair bit in these fixed-function cases as well.

And from John Carmack's statements, it looks like the NV30 is performing quite well with nVidia's proprietary extensions at the moment, but the performance in the ARB extensions is rather low, as we have also seen in DirectX. It seems to me that nVidia just hasn't put much work toward increasing performance in these areas, which isn't surprising, since no games use any of them yet. As for nVidia's proprietary extensions, those are certainly based directly on the architecture, making high performance with them nearly trivial to accomplish.
 
It's not a matter of "the fixed function path is fast, so the PS2.0 path has the potential to be fast with the right optimizations."

NVidia themselves have said that the PS2.0 path will be a fraction of the speed of the fixed-function path. That is why they included a separate fixed-function path, "to get the performance on older applications."

It looks like the NV30 has a fixed-function, 32-bit path that is quite fast, keeping up with the R300 clock-for-clock (and so surpassing it thanks to its higher clock rate). It sounds like Doom III's NV30 renderer uses this functionality, since it does not have quite the visual quality of the floating-point ARB2 path.

If you use 128-bit FP with ARB2, you get half the speed of the R300. From this I surmise that if you use 64-bit FP, you would get approximately the same speed as the R300. The NV30 would be executing fewer instructions per clock, but its higher clock rate would make up for it.

I suppose the drivers could do better at scheduling instructions in the shaders than they do now, but the basic hardware limitations will still be there. The NV30 dispatches fewer shader instructions per clock than the R300 at 64-bit, and half of that again at 128-bit.
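
To put rough numbers on that reasoning, here is a minimal sketch in Python. It treats shader throughput as instructions per clock times clock rate, which is a simplification. The 500 MHz and 325 MHz core clocks are my assumption (the retail 5800 Ultra and 9700 Pro figures), not something quoted in this thread, and the function names are made up for illustration. The "half the speed at 128-bit" figure and the "64-bit is twice as fast as 128-bit" premise are taken straight from the post above.

```python
# Assumed core clocks in MHz (retail 5800 Ultra and 9700 Pro); not stated in the thread.
NV30_CLOCK_MHZ = 500.0
R300_CLOCK_MHZ = 325.0


def ipc_ratio_for(speed_ratio, clock_mhz, ref_clock_mhz):
    """Per-clock instruction rate (relative to the reference chip) needed
    to reach a given overall speed ratio at the given clocks."""
    return speed_ratio * ref_clock_mhz / clock_mhz


def relative_throughput(ipc_ratio, clock_mhz, ref_clock_mhz):
    """Overall speed relative to the reference chip:
    (instructions per clock ratio) * (clock ratio)."""
    return ipc_ratio * clock_mhz / ref_clock_mhz


# Premise from the post: at 128-bit FP the NV30 runs at half the R300's speed.
ipc_128 = ipc_ratio_for(0.5, NV30_CLOCK_MHZ, R300_CLOCK_MHZ)

# Premise from the post: 64-bit FP dispatches twice as many instructions per clock.
ipc_64 = 2.0 * ipc_128

speed_64 = relative_throughput(ipc_64, NV30_CLOCK_MHZ, R300_CLOCK_MHZ)

print(f"NV30 per-clock rate vs R300 at 128-bit: {ipc_128:.3f}")
print(f"NV30 per-clock rate vs R300 at 64-bit:  {ipc_64:.3f}")
print(f"NV30 overall speed vs R300 at 64-bit:   {speed_64:.3f}")
```

Under those assumptions the NV30 comes out at roughly a third of the R300's per-clock rate at 128-bit and about two thirds at 64-bit, with the clock advantage bringing the 64-bit case back to parity. The parity follows directly from the two premises, so this only shows the surmise is self-consistent; it is not a measurement.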
 
antlers4 said:
It's not a matter of "the fixed function path is fast, so the PS2.0 path has the potential to be fast with the right optimizations."

NVidia themselves have said that the PS2.0 path will be a fraction of the speed of the fixed-function path. That is why they included a separate fixed-function path, "to get the performance on older applications."

Right on. We might as well get used to the 3 levels of precision on NV30. You got the fixed-function path, which is the old register combiners from the good ol' GF3/4 days, then you got fairly fast 16-bit FP, and finally slow 32-bit FP.

After seeing Carmack's comments I kinda like the simplicity of the R300 from a design point of view because you can spare some silicon to do other fun stuff. But hey, that's just me. ;)
 