Weird NV40 Fillrate results

Evildeus said:
I don't know how they obtained that score. From HFR, I can see that:
IMG0007678.gif
Please give links so we can get to the review more quickly (besides, it is plagiarism not to....).

Anyway, what we don't know is whether, assuming xbitlabs' results are valid, there are limits on when the chip can achieve that kind of Z fillrate. For all we know, Hardware.fr used the same test that Dave used (they don't say which benchmark they used to measure the z-only fillrate). So the question is: if xbitlabs' results are valid, will the conditions required to reach the maximal z-only fillrate allow the extra fillrate to be useful in games?

Side note: Hardware.fr has the best nalu pic I've seen yet:
http://www.hardware.fr/articles/491/page7.html
 
When I saw that result I assumed it was an error, but it did occur to me that it's possible that the fillrate tester might not disable Z compare, and perhaps NV40 could reject Z pixels at some larger rate.

Without knowing exactly what the test does, it's pretty much impossible to tell.
 
jolle said:
again, could one of the PS units, since there seem to be 2 of them per
pipeline, act like an extra pipeline when it comes to Z passes?
I need someone to poke a hole in that theory so I can move on, hehe..
Well, we know that this is the case. What we don't know is:
1. Are there limits upon when this can happen? Are those limits realistic for this functionality to be useful in real-game situations?
2. Are the Z ROPs capable of multiple z writes per clock without FSAA?

Edit: Btw, the PS units don't do the color outputting. It's the color ROPs (render outputs) that we're talking about here.
 
Chalnoth said:
jolle said:
again, could one of the PS units, since there seem to be 2 of them per
pipeline, act like an extra pipeline when it comes to Z passes?
I need someone to poke a hole in that theory so I can move on, hehe..
Well, we know that this is the case. What we don't know is:
1. Are there limits upon when this can happen? Are those limits realistic for this functionality to be useful in real-game situations?
2. Are the Z ROPs capable of multiple z writes per clock without FSAA?

Edit: Btw, the PS units don't do the color outputting. It's the color ROPs (render outputs) that we're talking about here.

And I guess there is also a chance the test showed some sort of error as well..

But about the PS units: I don't know a lot about this stuff, but since the color
ROP can act like a ZOP, or whatever you call it, which theoretically doubles
the number of pipelines in Z passes, I thought perhaps one of the PS units
could do that as well..
All a Z pass does is put in z values for every pixel in the framebuffer
for future reference in the rendering of the frame, isn't it?
And the PS unit wouldn't be dealing with color, just raw Z values in that case..
So that wouldn't be any color outputting, not that I know what that means,
hehe, but still..

I guess it is too early to say anything for sure, but can use of one of the PS
units be excluded totally?
For all I know it could be totally impossible, or not..
 
Easy way to figure out what happened: somebody contact xbitlabs and ask for whatever details about that test they're willing to share.
 
991060 said:
Easy way to figure out what happened: somebody contact xbitlabs and ask for whatever details about that test they're willing to share.

Dunno about that, they didn't even attempt any theories in the review
when they noted this. They did say how weird it was and all that,
but they didn't speculate on why it was like that..

But since it has been noted, there is probably some effort going on to find
out why, and if there is any result they will probably post it when found..
 
Just to note that NV40 acting as a 48-pipeline chip really isn't even an option here, since, as noted earlier, 48 x 400 MHz = 19200 Mpixels/s.

Xbitlabs were, however, getting scores higher than that, so that wouldn't solve the problem of the chip scoring better than theoretically possible. 50 would be the number we are after.
For example (an extremely unlikely and uneducated example, I might add):

32 z pipelines + 6 shader units * (2 shader ops + 1 color op abused as z op)

32+(6*3) = 50

50 x 400 = 20000 > 19889 (xbitlabs result)


All this is just theoretical mathematical mumbling and wild guesses!
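
The arithmetic above can be checked with a short Python sketch. It only assumes the 400 MHz core clock and the 19889 Mpixels/s xbitlabs figure quoted in this thread; the function names are made up for illustration, and it says nothing about how the chip actually reaches that number.

```python
# Theoretical peak fillrate = pixels emitted per clock * clock in MHz.
CLOCK_MHZ = 400    # NV40 core clock used in this thread
MEASURED = 19889   # xbitlabs z-only result quoted above, in Mpixels/s

def peak_fillrate(units, clock_mhz=CLOCK_MHZ):
    """Return the peak in Mpixels/s for `units` pixels per clock."""
    return units * clock_mhz

def min_units_needed(measured, clock_mhz=CLOCK_MHZ):
    """Smallest whole number of pixels per clock whose peak covers `measured`."""
    return -(-measured // clock_mhz)  # ceiling division on integers

for units in (16, 32, 48, 50):
    peak = peak_fillrate(units)
    verdict = "covers the result" if peak >= MEASURED else "too low"
    print(units, peak, verdict)

print(min_units_needed(MEASURED))
```

48 per clock falls short of the measured score and 50 is the first whole number that covers it, which is why 50 is the number being chased here.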
 
Megadrive1988 said:
48 pipes ?? :oops: :oops: :LOL:

gesus chwist, I'm just happy that NV40 has 16 pipes!

well it is "only" 16 pipes, and 32 in some cases...
It MIGHT be 48 or 50 or something under special conditions..
but if those conditions don't arise, it won't matter..
and it would only be during Z-only passes at any rate...

so, it sounds better than it actually is..
 
ERP is right. You guys are forgetting about effective Z fill vs real Z fill. The results could be because of extra culling coming into play. UltraShadow is more than just the 32x0 mode.

I can't wait to see Doom3 benchies on this card.
 
In Dave's article it is stated that with early Z on, the NV40 can
reject 16 quads (64 pixels) per clock. Can't remember the
page but it's there...
 
DemoCoder said:
ERP is right. You guys are forgetting about effective Z fill vs real Z fill. The results could be because of extra culling coming into play. UltraShadow is more than just the 32x0 mode.

I can't wait to see Doom3 benchies on this card.

UltraShadow is sort of occlusion culling for shadows, isn't it?
Is the 32x0 thing even related to UltraShadow?
 
psurge said:
In Dave's article it is stated that with early Z on, the NV40 can
reject 16 quads (64 pixels) per clock. Can't remember the
page but it's there...
Well, if the test in question drew a lot of hidden pixels, then that could definitely account for the high z fillrate numbers. Anyway, as I said, we need more tests!
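
To put rough numbers on "effective" versus real Z fill, here is a toy Python model. It assumes the figures mentioned in this thread (400 MHz, 32 z-only writes per clock, 64 early-Z rejects per clock) and, as a pure guess, that rejects and writes never overlap in time. The function names are hypothetical and this is not a description of how the test or the hardware really works.

```python
CLOCK_MHZ = 400        # NV40 core clock used in this thread
WRITE_PER_CLOCK = 32   # z-only writes per clock (the "32x0" mode)
REJECT_PER_CLOCK = 64  # early-Z rejects per clock (16 quads of 4 pixels)

def effective_z_fill(hidden_fraction):
    """Apparent Mpixels/s when `hidden_fraction` of the pixels are rejected.

    Assumption: a pixel costs either a reject slot or a write slot, and the
    two never run in parallel.
    """
    clocks_per_pixel = (hidden_fraction / REJECT_PER_CLOCK
                        + (1.0 - hidden_fraction) / WRITE_PER_CLOCK)
    return CLOCK_MHZ / clocks_per_pixel

def hidden_fraction_for(target_mpixels):
    """Invert the model: hidden fraction needed to report `target_mpixels`."""
    clocks_per_pixel = CLOCK_MHZ / target_mpixels
    inv_write = 1.0 / WRITE_PER_CLOCK
    inv_reject = 1.0 / REJECT_PER_CLOCK
    return (inv_write - clocks_per_pixel) / (inv_write - inv_reject)

for f in (0.0, 0.5, 1.0):
    print(f, effective_z_fill(f))

print(hidden_fraction_for(19889))
```

Under this toy model a score near the xbitlabs figure would need roughly 70% of the test's pixels to be hidden, so a test that leaves Z compare enabled and draws heavily overlapping geometry could plausibly report such a number.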
 
Mendel said:
Just to note that NV40 acting as a 48-pipeline chip really isn't even an option here, since, as noted earlier, 48 x 400 MHz = 19200 Mpixels/s.

Xbitlabs were, however, getting scores higher than that, so that wouldn't solve the problem of the chip scoring better than theoretically possible. 50 would be the number we are after.
For example (an extremely unlikely and uneducated example, I might add):

32 z pipelines + 6 shader units * (2 shader ops + 1 color op abused as z op)

32+(6*3) = 50

50 x 400 = 20000 > 19889 (xbitlabs result)


All this is just theoretical mathematical mumbling and wild guesses!

hey, I got more crazy ideas!

6 VS units, right? Again thinking about the old rumour of how the VS units were
supposedly able to work as pipelines...

ok.. so, 32x0 eh.. suppose 1 VS unit were capable of processing 3 pixels per clock somehow.
that would be 6 x 3 = 18, so it would be the equivalent of 18 pipes..
32 + 18 = 50...
50 x 400 MHz = 20,000 Mpixels/sec like you said...

Far-fetched? Extremely.. hehe..
I have no clue about anything, just spitting ideas..
and again, it might all just come down to an error in the test program...
 
I don't think there's any way the VS units could have anything to do with those performance scores.

And just fyi, the VS units are typically less powerful than pixel units.
 
ok, another idea in the bin, hehe..
Hope someone comes up with the final say on it, be it error in the test or special feature thingie, soon..
 
Guys, isn't R3XX also able to do some sort of early z reject tricks? Does anybody know under what circumstances they take effect?
 
991060 said:
Guys, isn't R3XX also able to do some sort of early z reject tricks? Does anybody know under what circumstances they take effect?
I believe it's mostly related to hierarchical-z. This basically means that if every pixel in a tile of some specific size is behind some maximum z value allotted to that tile, the entire tile is rejected. There is also, of course, a similar trivial z-pass if every pixel in the tile being rendered is in front of some minimum z value.
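
As a rough software illustration of that tile test (not how R3XX implements it in hardware), here is a minimal Python sketch. It assumes a "less than" depth test where smaller z means closer to the viewer; `classify_tile` and the sample depth values are invented for the example.

```python
def classify_tile(stored_depths, incoming_depths):
    """Decide what to do with a tile of incoming pixels.

    stored_depths:   z values already in the depth buffer for this tile
    incoming_depths: z values of the pixels about to be drawn into it
    Assumes an incoming pixel passes when its z is less than the stored z.
    """
    tile_min = min(stored_depths)  # nearest depth already in the tile
    tile_max = max(stored_depths)  # farthest depth already in the tile

    # Every incoming pixel is at or behind the farthest stored value,
    # so none of them can pass: drop the whole tile at once.
    if min(incoming_depths) >= tile_max:
        return "reject"

    # Every incoming pixel is in front of the nearest stored value,
    # so all of them pass without per-pixel compares.
    if max(incoming_depths) < tile_min:
        return "accept"

    # Mixed case: fall back to the ordinary per-pixel z test.
    return "per-pixel"

print(classify_tile([0.2, 0.3], [0.5, 0.6]))
print(classify_tile([0.5, 0.6], [0.1, 0.2]))
print(classify_tile([0.2, 0.6], [0.3, 0.4]))
```

Real hardware would keep a precomputed min or max per tile rather than scanning the depth values each time; the scan here is only to keep the sketch self-contained.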
 
991060 said:
Guys, isn't R3XX also able to do some sort of early z reject tricks? Does anybody know under what circumstances they take effect?

that has been around since.. what is it, GeForce3, was it?
Occlusion culling in hardware; of course the implementations on modern GPUs
such as R3xx are probably a lot more evolved.. or maybe that's not the same thing.

But it's not related to NV's capability to sort of simulate 32 pipes in Z passes..
 