NV35 might be misunderstood...

DaveBaumann said:
Oh, guess I paid too much attention to NVIDIA marketing there! ;)

Actually it doesn't matter once you turn on trilinear and/or AF, as they won't be able to output 4 pixels/clock anyway.

The same is of course true of the R300: it won't be able to output 8 pixels/clock in most real-world situations.

I tested our terrain rendering routine to see whether multitexturing gives a speed advantage, and with AF there is none. (The cards are fillrate limited, not bandwidth limited.)
Multitexturing is also more complicated, which means more CPU usage, so I won't do that.
(Of course this is with compressed textures, but that's a given today.)
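
To put rough numbers on the trilinear point, here is a minimal sketch. It assumes each texture unit delivers one bilinear sample per clock and that a trilinear sample costs two bilinear samples (one per mip level). The function name, the model, and the pipeline configurations are illustrative assumptions, not vendor figures, and the extra cost of AF is not modeled.

```python
def pixels_per_clock(pipelines, tmus_per_pipe, textures, bilinear_samples_per_texture):
    """Peak pixels/clock under a simple texture-sampling bound (assumed model)."""
    # Total bilinear samples one pixel needs.
    samples_needed = textures * bilinear_samples_per_texture
    # Clocks a pipeline spends on one pixel: ceiling division, at least one clock.
    clocks = max(1, -(-samples_needed // tmus_per_pipe))
    return pipelines / clocks


BILINEAR, TRILINEAR = 1, 2  # assumed cost in bilinear samples

# Hypothetical configurations, single texture layer.
four_by_one_bilinear = pixels_per_clock(4, 1, 1, BILINEAR)
four_by_one_trilinear = pixels_per_clock(4, 1, 1, TRILINEAR)
eight_by_one_trilinear = pixels_per_clock(8, 1, 1, TRILINEAR)

print(four_by_one_bilinear, four_by_one_trilinear, eight_by_one_trilinear)
```

Under this model, the nominal pixels/clock figure only holds for single-texture bilinear rendering; any filtering that needs more samples than a pipeline has texture units divides the output rate accordingly.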
 
Uttar said:
Also, something I'd really like to test is whether the register usage performance hit is bigger with 8 pipelines than with 4 pipelines.

You mean with 4 than with 2? NV30 vs NV34/31?

It's really difficult to use more than 2 registers in NV34 when it's in 4x1 mode. It's the same with NV30 when it's in 8x0 (I think).

I believe the performance hit is exactly the same for NV30 and NV31/34.
 
Tridam: Actually, no, that's not what I want. I'm limiting myself to the NV30. I mean, the NV30 has a 4-pipeline mode and an 8-pipeline mode, depending on whether it outputs color or not.

But I think you're making a good point, indirectly, here: Can the NV30 even do loopback in 8x0?


Uttar
 
DaveBaumann said:
8x0 is Z/Stencil only - no other ops work in this mode, and Z/stencil doesn't require loopback.

I think NV30 can have a throughput of 8 z 'pixels' per clock but not 8 stencil 'pixels' per clock. It has a throughput of 4 stencil per clock. However, it can do 8 stencil ops per clock.
 
Tridam said:
DaveBaumann said:
8x0 is Z/Stencil only - no other ops work in this mode, and Z/stencil doesn't require loopback.

I think NV30 can have a throughput of 8 z 'pixels' per clock but not 8 stencil 'pixels' per clock. It has a throughput of 4 stencil per clock. However, it can do 8 stencil ops per clock.

I'm pretty sure you're right. The double-sided stencil test... Although this could possibly be decoupled to provide extra stencil test throughput if required.
 
Tridam said:
DaveBaumann said:
8x0 is Z/Stencil only - no other ops work in this mode, and Z/stencil doesn't require loopback.

I think NV30 can have a throughput of 8 z 'pixels' per clock but not 8 stencil 'pixels' per clock. It has a throughput of 4 stencil per clock. However, it can do 8 stencil ops per clock.

Hmm, makes sense.
http://www.theinquirer.net/?article=7920

"GeForce FX 5800 and 5800 Ultra run at 8 pixels per clock for all of the following:
a) z-rendering
b) stencil operations
c) texture operations
d) shader operations"

and

"Only color+Z rendering is done at 4 pixels per clock"

Notice how they clearly indicate "operations" and not rendering, unlike for Z.
The "shader operations" working at 8 pixels per clock most likely refers to there being 8 FX12 units :rolleyes:

Although, could the NV30 run a shader consisting of a single FX12 instruction in 8x0 mode at full speed? I know, nobody cares about that, but eh! :)


Uttar
 
Hyp-X said:
DaveBaumann said:
Oh, guess I paid too much attention to NVIDIA marketing there! ;)

Actually it doesn't matter once you turn on trilinear and/or AF, as they won't be able to output 4 pixels/clock anyway.

There are certain textures you don't want to use trilinear or AF on. Of course most of those are going to be used in multitexturing, e.g. lightmaps. A HUD would seem to fit the bill, though.

Also, under Nvidia's performance filtering modes, a large number of screen pixels won't be getting trilinear or AF.
 
Tridam said:
DaveBaumann said:
8x0 is Z/Stencil only - no other ops work in this mode, and Z/stencil doesn't require loopback.

I think NV30 can have a throughput of 8 z 'pixels' per clock but not 8 stencil 'pixels' per clock. It has a throughput of 4 stencil per clock. However, it can do 8 stencil ops per clock.
No, it can output 8 'stencil pixels' per clock.
It doesn't make sense to separate 'stencil ops' from 'stencil pixels'; it's always one op per pixel. Two-sided stencil does *not* mean two stencil ops per pixel. Just different stencil ops depending on face orientation.
It doesn't make sense to separate stencil ops from Z either, because stencil operations take the Z test result into account. And both are stored together.
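
A minimal sketch of that point, for illustration only: with two-sided stencil, face orientation picks which state applies, the Z test result picks which op within that state, and exactly one op then runs per pixel. The names (`StencilState`, `stencil_update`), the saturating 8-bit ops, and the omission of the stencil compare itself (assumed to always pass) are assumptions of this sketch, not a description of any particular hardware or API.

```python
from dataclasses import dataclass


@dataclass
class StencilState:
    zfail_op: str  # op applied when the Z test fails
    zpass_op: str  # op applied when the Z test passes


# Saturating 8-bit stencil ops (assumed for simplicity).
OPS = {
    "keep": lambda s: s,
    "incr": lambda s: min(s + 1, 255),
    "decr": lambda s: max(s - 1, 0),
}


def stencil_update(stencil, front_facing, z_pass, front, back):
    """Apply exactly one stencil op to one pixel."""
    # Face orientation selects the state; it does not add a second op.
    state = front if front_facing else back
    # The Z test result selects which op of that state is used.
    op = state.zpass_op if z_pass else state.zfail_op
    return OPS[op](stencil)


# Example setup in the style of depth-fail shadow volumes.
front = StencilState(zfail_op="decr", zpass_op="keep")
back = StencilState(zfail_op="incr", zpass_op="keep")

value = stencil_update(0, front_facing=False, z_pass=False, front=front, back=back)
value = stencil_update(value, front_facing=True, z_pass=False, front=front, back=back)
print(value)
```

Each call touches the stencil value once, whichever face is being drawn, which is why counting 'stencil ops' separately from 'stencil pixels' gives the same number.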
 