3DLabs P20 : 12 pixel pipelines, only 4 ROPs ?

Zeross

I've had a Wildcat Realizm 200 for some time and have put it through its paces. The results were sometimes impressive and sometimes disappointing. In particular, the fillrate was not what I expected from a 12-pipeline GPU. I first thought shading speed was the issue, but even the raw fillrate was clearly subpar: FFP - Pure fillrate - 1432.220581M pixels/sec

Unless the chip was clocked in the 120-150 MHz range, something was wrong. But the other day I found something that might explain these results. In the white paper "Wildcat Realizm technology", linked from this page: http://developer.3dlabs.com/openGL2/presentations/index.htm , we can see on page 10 that while there really are 48 fragment processors (12 pipelines in 3Dlabs terminology), there are only 16 pixel processors (4 pipelines) that can write to the frame buffer.
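As a rough sanity check on that clock estimate, here is a small sketch that converts the measured pure fillrate into the clock it would imply. It assumes each pipeline writes exactly one pixel per clock, which is an idealization of mine and not something the white paper states; the function name `implied_clock_mhz` is likewise just for illustration.

```python
# Measured "FFP - Pure fillrate" result, in Mpixels/sec.
PURE_FILLRATE_MPIX = 1432.220581


def implied_clock_mhz(fillrate_mpix, pipelines):
    """Clock (in MHz) needed to reach fillrate_mpix.

    Assumption: every pipeline writes exactly one pixel per clock.
    """
    return fillrate_mpix / pipelines


# Compare the advertised 12 pipelines with the 4 pipelines that can
# actually write to the frame buffer.
for pipelines in (12, 4):
    clock = implied_clock_mhz(PURE_FILLRATE_MPIX, pipelines)
    print(f"{pipelines:2d} pipelines -> {clock:.1f} MHz")
```

With 12 writing pipelines the result implies a clock of roughly 119 MHz, while with 4 it implies roughly 358 MHz.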

So the P20 is like the NV43: more shading pipes than ROPs, if I understand correctly. I find it strange, because while nVidia's choice is justified by the limited bandwidth of the NV43's 128-bit bus, the P20 comes with high-speed GDDR3 and a 256-bit bus. The only reason I can think of is to save transistors, because these pixel processors are FP16. Any thoughts or comments on the topic?

Sorry if this was already well known and I was the only one to miss it, but with all this talk about NV40 and R420, or even NV48 and R480, I think the P20 deserves a topic of its own ;)
 
I think the card's aim is to do much more intensive rendering. That is to say, fragment programs will be computationally intensive rather than bandwidth intensive. The large amount of bandwidth could simply be for HUGE geometry, textures and fragment programs. Of course, the biases will be dramatically different from those of current games.
 
My understanding some time back was that it may only have 3 texture units as well (one per quad?). Might be nice to see some texture fill tests again (there are some around here, but I didn't really analyze them).
 
DaveBaumann said:
My understanding some time back was that it may only have 3 texture units as well (one per quad?). Might be nice to see some texture fill tests again (there are some around here, but I didn't really analyze them).

Latest results I've had with the card
Code:
Fillrate Tester
--------------------------
Display adapter: 3Dlabs Wildcat Realizm 200
Driver version: 6.14.1.0
Display mode: 1024x768 A8R8G8B8 85Hz
Z-Buffer format: D24S8
--------------------------

FFP - Pure fillrate - 1432.220581M pixels/sec
FFP - Z pixel rate - 1267.341553M pixels/sec
FFP - Single texture - 958.715881M pixels/sec
FFP - Dual texture - 543.938049M pixels/sec
FFP - Triple texture - 267.401581M pixels/sec
FFP - Quad texture - 194.633072M pixels/sec
PS 1.1 - Simple - 705.833557M pixels/sec
PS 1.4 - Simple - 344.463776M pixels/sec
PS 2.0 - Simple - 705.630615M pixels/sec
PS 2.0 PP - Simple - 705.570007M pixels/sec
PS 2.0 - Longer - 346.786133M pixels/sec
PS 2.0 PP - Longer - 346.778992M pixels/sec
PS 2.0 - Longer 4 Registers - 366.767029M pixels/sec
PS 2.0 PP - Longer 4 Registers - 366.806244M pixels/sec
PS 2.0 - Per Pixel Lighting - 45.032333M pixels/sec
PS 2.0 PP - Per Pixel Lighting - 45.036903M pixels/sec

Notice the improvement over previous drivers :
Code:
Fillrate Tester
--------------------------
Display adapter: 3Dlabs Wildcat Realizm 200
Driver version: 6.14.1.0
Display mode: 1024x768 A8R8G8B8 85Hz
Z-Buffer format: D24S8
--------------------------

FFP - Pure fillrate - 937.775940M pixels/sec
FFP - Z pixel rate - 774.708740M pixels/sec
FFP - Single texture - 869.273560M pixels/sec
FFP - Dual texture - 450.414642M pixels/sec
FFP - Triple texture - 273.096252M pixels/sec
FFP - Quad texture - 172.649231M pixels/sec
PS 1.1 - Simple - 656.983215M pixels/sec
PS 1.4 - Simple - 350.396698M pixels/sec
PS 2.0 - Simple - 656.978699M pixels/sec
PS 2.0 PP - Simple - 656.979431M pixels/sec
PS 2.0 - Longer - 305.609955M pixels/sec
PS 2.0 PP - Longer - 305.610535M pixels/sec
PS 2.0 - Longer 4 Registers - 334.277527M pixels/sec
PS 2.0 PP - Longer 4 Registers - 334.246582M pixels/sec
PS 2.0 - Per Pixel Lighting - 21.987038M pixels/sec
PS 2.0 PP - Per Pixel Lighting - 21.987391M pixels/sec

+100% in PS2.0 Per pixel Lighting :D
Still slow though :?
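For anyone who wants the per-test change rather than eyeballing the two dumps, here is a quick sketch. The figures are copied from a subset of the two runs above; the helper `percent_change` is just an illustrative name.

```python
# (previous driver, latest driver) results in Mpixels/sec,
# copied from the two Fillrate Tester runs above (subset of tests).
RESULTS = {
    "FFP - Pure fillrate": (937.775940, 1432.220581),
    "FFP - Z pixel rate": (774.708740, 1267.341553),
    "FFP - Single texture": (869.273560, 958.715881),
    "FFP - Triple texture": (273.096252, 267.401581),
    "PS 2.0 - Simple": (656.978699, 705.630615),
    "PS 2.0 - Per Pixel Lighting": (21.987038, 45.032333),
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


for name, (old, new) in RESULTS.items():
    print(f"{name:30s} {percent_change(old, new):+7.1f}%")
```

Per pixel lighting comes out slightly above +100%, pure fillrate gains a bit over 50%, and the triple texture test is marginally slower than before.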
 
It seems like something else is going on here besides the conjecture that the P20 has more fragment pipes than ROPs. Its pixel fillrate drops so dramatically with added complexity that it does not seem there are many more fragment ALUs than ROPs. Perhaps this could be attributed to what Wavey said about 3 texture units total.

I would've guessed that fillrate would drop less dramatically if the fragment pipe to ROP ratio was 3:1 (12:4), although I'm not sure how texture- or bandwidth-dependent the more complex fillrate tests are.
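To put a number on the drop, here is a small sketch that expresses some of the latest-driver results as a fraction of the pure fillrate. It only normalizes the posted figures; it says nothing about which unit is the actual bottleneck, and `relative_to_pure` is just an illustrative helper name.

```python
# Latest-driver results in Mpixels/sec, copied from the first run above
# (subset of tests).
LATEST = {
    "FFP - Pure fillrate": 1432.220581,
    "FFP - Single texture": 958.715881,
    "FFP - Dual texture": 543.938049,
    "FFP - Triple texture": 267.401581,
    "FFP - Quad texture": 194.633072,
    "PS 2.0 - Simple": 705.630615,
    "PS 2.0 - Longer": 346.786133,
    "PS 2.0 - Per Pixel Lighting": 45.032333,
}


def relative_to_pure(results):
    """Return each result as a fraction of the pure fillrate."""
    pure = results["FFP - Pure fillrate"]
    return {name: rate / pure for name, rate in results.items()}


for name, fraction in relative_to_pure(LATEST).items():
    print(f"{name:30s} {fraction:6.1%}")
```

Single texturing already falls to about two thirds of the pure fillrate, the simple PS 2.0 shader to about half, and per pixel lighting to around 3%.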
 