New benchmark

trinibwoy said:
geo said:
My 6800GT opened a can of whup-ass on yours, Trini! Same settings BF, mine was 350% faster!!! 71.89 Official drivers.

Hmmmm, what'd you get on front to back?

Meh. Never mind; i-o error [idiot-operator] on this end. When I did it right I'm consistent (and lost my mouse too.)
 
geo said:
Meh. Never mind; i-o error [idiot-operator] on this end. When I did it right I'm consistent (and lost my mouse too.)

Hehe. Humus, how do we revert to standard settings without the use of our mouse - I don't see a config file - are you setting registry keys or something.
 
Ailuros said:
As for the overdraw level I'm not sure where the average depth complexity factor would lie today, but I guess an average of 8.0 shouldn't be too far from reality, especially for the foreseeable future.

I would say it's more like 4, maybe 5. Some parts of a scene will probably get 8 or more, but if the engine is decent, there shouldn't be that much on average.
 
trinibwoy said:
Hehe. Humus, how do we revert to standard settings without the use of our mouse - I don't see a config file - are you setting registry keys or something.

Yeah, I store settings in HKEY_CURRENT_USER\Software\Humus. Just remove all keys there to reset to defaults.
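
For anyone who would rather script the reset than open regedit, here is a minimal sketch using Python's standard `winreg` module. It assumes the settings sit in subkeys under the path given above; the helper names `delete_tree` and `reset_to_defaults` are made up for this example and are not part of the framework. It is Windows only, and it deletes data, so check the path first.

```python
SETTINGS_PATH = r"Software\Humus"  # under HKEY_CURRENT_USER, as stated in the post


def delete_tree(root, path):
    """Delete the key at `path` and everything below it (Windows only)."""
    import winreg  # imported here so the file still loads on other platforms

    with winreg.OpenKey(root, path, 0, winreg.KEY_READ | winreg.KEY_WRITE) as key:
        while True:
            try:
                # Always ask for index 0: the list shrinks as children are deleted.
                child = winreg.EnumKey(key, 0)
            except OSError:
                break  # no subkeys left
            delete_tree(root, path + "\\" + child)
    winreg.DeleteKey(root, path)


def reset_to_defaults():
    """Remove every subkey below HKCU\\Software\\Humus (hypothetical helper)."""
    import winreg

    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, SETTINGS_PATH) as key:
        # Collect the names first, then delete, so enumeration is not disturbed.
        count = winreg.QueryInfoKey(key)[0]
        children = [winreg.EnumKey(key, i) for i in range(count)]
    for child in children:
        delete_tree(winreg.HKEY_CURRENT_USER, SETTINGS_PATH + "\\" + child)
```

Calling `reset_to_defaults()` on Windows should leave the `Humus` key itself in place but empty of subkeys.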
 
Rys said:
Press F1 twice to get your mouse back.

Or just hit Esc should work too. I need to add an option to the framework to not have it capture the mouse for apps like this where there's no interactive free flying.
 
Humus said:
Ailuros said:
As for the overdraw level I'm not sure where the average depth complexity factor would lie today, but I guess an average of 8.0 shouldn't be too far from reality, especially for the foreseeable future.

I would say it's more like 4, maybe 5. Some parts of a scene will probably get 8 or more, but if the engine is decent, there shouldn't be that much on average.

Even in let's say Serious Sam 2 or games based on UE3? I'm having more the foreseeable future in mind than what is available today (which is rather boring I might add *cough*).
 
Humus said:
trinibwoy said:
Hehe. Humus, how do we revert to standard settings without the use of our mouse - I don't see a config file - are you setting registry keys or something.

Yeah, I store settings in HKEY_CURRENT_USER\Software\Humus. Just remove all keys there to reset to defaults.

Cool. Thanks!
 
x800pro/cat5.6 @ 1024*768 with 8x overdraw
Code:
         FtB       BtF      Random     Pre-Z
0xAA    118.81    14.97     43.33     111.60
2xAA    113.98    14.88     43.27     107.22
4xAA    113.39    14.86     44.22     102.52
6xAA    113.39    14.88     43.51      95.82
 
Ailuros said:
Even in let's say Serious Sam 2 or games based on UE3? I'm having more the foreseeable future in mind than what is available today (which is rather boring I might add *cough*).

Well, it's likely to grow in the future, but 8x on average is still quite a lot, unless you have tons of foliage.


Anyway, I fixed a bunch of GUI issues, so there's a new release available.
 
tEd said:
x800pro/cat5.6 @ 1024*768 with 8x overdraw
Code:
         FtB       BtF      Random     Pre-Z
0xAA    118.81    14.97     43.33     111.60
2xAA    113.98    14.88     43.27     107.22
4xAA    113.39    14.86     44.22     102.52
6xAA    113.39    14.88     43.51      95.82


It seems the X800Pro is not very efficient here, or is it?

(1024 x 768 x 8 x 118.81 fps) / (475 MHz x 12) = 0.13, or 13%


The X850XT-PE, as seen on the first page, seems to work far more efficiently than the X800Pro, reaching around 32.5% with the same 8x overdraw. With an overdraw of 16 the X850XT-PE seems to reach an efficiency of 64%, and with 64x overdraw, Pre-Z and random ordering the efficiency is 214%. So only with a very high overdraw does the X850XT-PE stop rendering every pixel and start saving fillrate.

2 questions:

Is this small calculation correct? Or does the test use multitexturing or something else that distorts the picture?

If the calculation is correct, why do early Z and/or HyperZ not work better? It seems a lot of fillrate is wasted despite all these fancy bandwidth- and fillrate-saving systems in today's IMRs.
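
The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the same back-of-the-envelope estimate: the function name `fill_efficiency` is made up here, and it assumes one pixel per pipe per clock as the theoretical peak, which is the assumption the calculation already makes.

```python
def fill_efficiency(width, height, overdraw, fps, core_mhz, pipes):
    """Pixels actually drawn per second divided by theoretical peak fillrate."""
    pixels_per_second = width * height * overdraw * fps
    peak_pixels_per_second = core_mhz * 1e6 * pipes  # assumes 1 pixel/pipe/clock
    return pixels_per_second / peak_pixels_per_second


# X800Pro, front to back, 0xAA, taken from tEd's table
x800pro = fill_efficiency(1024, 768, 8, 118.81, 475, 12)
print(f"X800Pro efficiency: {x800pro:.1%}")
```

Swapping in another card's clock, pipe count and score gives the matching figure for that card.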
 
mboeller said:
If the calculation is correct, why do early Z and/or HyperZ not work better? It seems a lot of fillrate is wasted despite all these fancy bandwidth- and fillrate-saving systems in today's IMRs.

When you just have a simple texture there's not much work to save, so the improvement from using HyperZ is much smaller than in the complex shader case, where a very significant improvement can be seen.
 
Mordenkainen said:
Humus: Would FtB pose problems for geometry instancing?

Well, if you can sort loads of objects quick enough it shouldn't be a problem. But if you're using instancing it's likely that your main bottleneck isn't the fragment shader but rather the number of draw calls.
 
Humus said:
When you just have a simple texture there's not much work to save, so the improvement from using HyperZ is much smaller than in the complex shader case, where a very significant improvement can be seen.


Now I'm confused.

I thought HyperZ and early Z were implemented to save fillrate? And the 8x overdraw should burn a lot of fillrate, so HyperZ and/or early Z should make a big difference here.

Please can you explain what you mean?
 
mboeller said:
Now I'm confused.

I thought HyperZ and early Z were implemented to save fillrate? And the 8x overdraw should burn a lot of fillrate, so HyperZ and/or early Z should make a big difference here.

Please can you explain what you mean?

Early Z/HierZ can only reject a certain number of pixels per clock. The more expensive it is to render those pixels, the higher the gains.

Say, GPU X can output 16 pixels per clock and reject 256 via early Z methods. Now you're using a single cycle shader (bilinear texture). If that surface is hidden behind another one (FtB), (non-)rendering is up to 16 times faster (256/16) than if it were not hidden and had to be rendered at a rate of 16 pixels/clock.
If you use a 16 cycle shader, you only get 1 pixel/clock, but early Z reject rate stays at 256 pixels/clock. So with a complex shader, hidden surfaces are now 256 times faster theoretically.
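
To put numbers on that, here is a small sketch of the model. GPU X and its rates are the hypothetical figures from this post, and `hidden_surface_speedup` is a name invented for the example. It simply divides the reject rate by the rate at which visible pixels can be shaded.

```python
def hidden_surface_speedup(output_rate, reject_rate, shader_cycles):
    """How much faster a hidden pixel is rejected than a visible one is shaded.

    output_rate   -- pixels per clock with a single cycle shader
    reject_rate   -- pixels per clock the early Z hardware can throw away
    shader_cycles -- cycles the shader needs per pixel
    """
    shaded_rate = output_rate / shader_cycles  # visible pixels per clock
    return reject_rate / shaded_rate


simple = hidden_surface_speedup(16, 256, 1)    # bilinear texture only
complex_ = hidden_surface_speedup(16, 256, 16)  # 16 cycle shader
print(simple, complex_)  # 16.0 256.0
```

With these figures the single cycle case works out to 256/16 = 16, and the 16 cycle case to 256, so the gain scales directly with shader cost while the reject rate stays fixed.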
 
Xmas said:
Early Z/HierZ can only reject a certain number of pixels per clock. The more expensive it is to render those pixels, the higher the gains.

Say, GPU X can output 16 pixels per clock and reject 256 via early Z methods. Now you're using a single cycle shader (bilinear texture). If that surface is hidden behind another one (FtB), (non-)rendering is up to 16 times faster (256/16) than if it were not hidden and had to be rendered at a rate of 16 pixels/clock.
If you use a 16 cycle shader, you only get 1 pixel/clock, but early Z reject rate stays at 256 pixels/clock. So with a complex shader, hidden surfaces are now 256 times faster theoretically.

Thanks Xmas!

Now it's clear what Humus meant.
But the mystery remains why HyperZ does not work better, because I cannot see the 8 times faster rendering in the benchmark.
Well, IMHO HyperZ does help, because the X850XT-PE is around 2.5 times more efficient than the X800Pro, and AFAIR HyperZ on the X800Pro is not working because only 12 pipes are activated.
But still, 32.5% efficiency does not seem great (I even included the 8x overdraw in the calculation for both chips, so IMHO the real efficiency is even lower).

If I understand all that correctly, a TBDR would have to render only one surface instead of 8 and would have a far higher efficiency too, wouldn't it?
So a TBDR with the same spec would be up to 24 times faster than an IMR like the X850XT-PE, which seems ridiculously fast.
 
mboeller said:
But the mystery remains why HyperZ does not work better, because I cannot see the 8 times faster rendering in the benchmark.

Are you going by tEd's scores you quoted above? Then 118.81 / 14.97 = 7.94, so I'd say it's pretty darn good. I wouldn't worry about the missing percent. ;)

mboeller said:
If I understand all that correctly, a TBDR would have to render only one surface instead of 8 and would have a far higher efficiency too, wouldn't it?

It would perform better in worst case scenarios. But just adding a Pre-Z pass makes IMRs with HyperZ perform nearly as well. 111.60/118.81 = 94%.

Edit: would->wouldn't
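
For reference, both ratios can be recomputed for every AA level in tEd's table. This is just a sketch over the posted numbers; the dictionary layout and variable names are my own.

```python
# tEd's X800Pro scores (fps) at 1024*768 with 8x overdraw
scores = {
    "0xAA": {"FtB": 118.81, "BtF": 14.97, "Random": 43.33, "Pre-Z": 111.60},
    "2xAA": {"FtB": 113.98, "BtF": 14.88, "Random": 43.27, "Pre-Z": 107.22},
    "4xAA": {"FtB": 113.39, "BtF": 14.86, "Random": 44.22, "Pre-Z": 102.52},
    "6xAA": {"FtB": 113.39, "BtF": 14.88, "Random": 43.51, "Pre-Z": 95.82},
}

# How much faster front to back is than back to front (ideal would be 8 here)
ftb_over_btf = {aa: row["FtB"] / row["BtF"] for aa, row in scores.items()}

# How close a Pre-Z pass gets to the ideal front to back ordering
prez_over_ftb = {aa: row["Pre-Z"] / row["FtB"] for aa, row in scores.items()}

for aa in scores:
    print(f"{aa}: FtB/BtF = {ftb_over_btf[aa]:.2f}, Pre-Z/FtB = {prez_over_ftb[aa]:.0%}")
```

The 0xAA row gives the 7.94 and 94% quoted above; the other rows show how the Pre-Z figure moves as AA goes up.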
 