ATI Hierarchical-Z issue with Doom 3

I'm not saying he crippled ATI on purpose; it's just odd that this is the only game to break ATI's hierZ, especially when ATI performs much better than NVidia in games with similar rendering styles (Deus Ex 2, Thief 3).
 
hovz said:
I'm not saying he crippled ATI on purpose; it's just odd that this is the only game to break ATI's hierZ, especially when ATI performs much better than NVidia in games with similar rendering styles (Deus Ex 2, Thief 3).

You're not saying it, but you're suggesting it as if it is one of the plausible explanations.
 
Hmm. . . I have no idea as to just how much was revealed about the algorithms used in Deus Ex 2 and Thief 3. Is it actually mentioned anywhere that those games render starting off with a z-pass to prevent overdraw?
 
hovz said:
I'm not saying he crippled ATI on purpose; it's just odd that this is the only game to break ATI's hierZ, especially when ATI performs much better than NVidia in games with similar rendering styles (Deus Ex 2, Thief 3).
Each chip has its strengths and weaknesses. Every game engine has different characteristics, which usually aren't inherently better or worse.
Every game with the same rendering style as Doom3 breaks ATI's hierZ. Maybe those games use other advantages of ATI cards.
You could equally ask why those developers didn't adjust the characteristics of their engine to better suit NVidia hardware.

But that's not the way game engines are built, especially not those which are supposed to be licensed.

I think Carmack didn't even know there was a problem with hierZ when he developed Carmack's Reverse. He did a lot of research in shadow rendering, and he certainly did a lot of tests. He chose the algorithm that came out best (in whatever metric he used). It ran on available hardware, and from the results he extrapolated that performance on the projected minimum system would be acceptable. He couldn't have foreseen how the hardware would eventually turn out, though he has some influence on it (especially with NVidia).
 
ERP said:
But it's all relative.

Would you have been happier if the rendering path was slower on all hardware, but made ATI's hardware look better?

Carmack clearly made a number of decisions that affect performance in a number of ways; without trying all of them, you can't know what the tradeoffs were.

The Z prepass in and of itself might be enough to make the difference, and it could just be "Carmack's Reverse" that's the issue.

We don't know, and speculating that Carmack deliberately crippled ATI hardware is a bit tin-foil-hat-like, IMO.

It's obvious that Carmack had enough time to mess around with an NV3x path that didn't even make it into the game. Why couldn't he have messed around with special R3x0 or R42x paths, perhaps to work around problems?

That is why I put the blame on both of them.
 
That doesn't explain why older ATI hardware has no issues. It would seem to me that the fault (if there even is one) would be on ATI. If there are no issues on older hardware but there are on newer hardware, that suggests to me that ATI has changed something in their hardware.
 
plat said:
That doesn't explain why older ATI hardware has no issues. It would seem to me that the fault (if there even is one) would be on ATI. If there are no issues on older hardware but there are on newer hardware, that suggests to me that ATI has changed something in their hardware.

Dude, what the hell are you talking about?
 
plat said:
That doesn't explain why older ATI hardware has no issues. It would seem to me that the fault (if there even is one) would be on ATI. If there are no issues on older hardware but there are on newer hardware, that suggests to me that ATI has changed something in their hardware.

Who says ATI's older hardware has no faults?

For all we know, all the Radeons could be handicapped by this problem with Hyper-Z.

The reason we can't tell is that they are better suited to Doom 3 than the GeForce line.

The R200 can do things in one pass that the GeForce 3/4 need multiple passes for. The R300 has better shader performance in other games.

That could be why we see the FX and R3x0s so close to each other: because they are handicapped by Hyper-Z.
 
jvd said:
The R200 can do things in one pass that the GeForce 3/4 need multiple passes for. The R300 has better shader performance in other games.

That could be why we see the FX and R3x0s so close to each other: because they are handicapped by Hyper-Z.

I seem to remember Carmack commenting on some hardware issues with the R200 series that prevented it from being faster than the GF3/4 series even though it should be (fewer passes). Though I must say that I'm surprised by the performance of the R300 series in Doom 3; they do much better than I thought compared to the FX. I would say that it's yet another win for the R3XX vs NV3X, since it'll probably dominate in Half-Life 2.
 
I don't know why you people even bother to respond to hovz. I have yet to see an intelligent statement from him.
 
Chalnoth said:
I don't know why you people even bother to respond to hovz. I have yet to see an intelligent statement from him.

Do you need to resort to personal insults?
 
I don't see any reason to bother to argue a point with somebody who appears to have nothing intelligent to add to the conversation.
 
Chalnoth said:
I don't see any reason to bother to argue a point with somebody who appears to have nothing intelligent to add to the conversation.
Well, I should take your advice and not argue with you, since you appear to have nothing intelligent to add to this conversation.
 
Uttar said:
Mintmaster: I just did that, thanks for the idea :) I'm on a 6800GT at 410/1000 btw. The tests were run at 1280x1024 60hz, V-Sync disabled, 32bpp.
www.notforidiots.com/0AA.txt
www.notforidiots.com/2AA.txt
www.notforidiots.com/4AA.txt

Key numbers for Doom 3 probably are:
0x AA: Overdraw factor 3, front to back: 1902.61 fps
2x AA: Overdraw factor 3, front to back: 1749.25 fps
4x AA: Overdraw factor 3, front to back: 1567.55 fps
Seems like that could be quite an advantage...

Uttar
Thanks, Uttar. Actually, you need the front-to-back (ftb) results for overdraw factors of 3 and 8, and use this formula:
Code:
z-reject rate = (# pixels on screen) * 5 / (1 / ftb8 - 1 / ftb3)
I found it works extremely well for all architectures. Try it with the numbers from this page.
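
The formula above can be sketched in a few lines of Python. My reading (an assumption, not something stated in the post) is that the 5 is the difference between the two overdraw factors, so 1/ftb8 - 1/ftb3 is the extra time per frame spent rejecting five more screens' worth of pixels. The function name, the parameter names, and the round fps figures in the example are made up for illustration; they are not measurements from this thread.

```python
def z_reject_rate(width, height, ftb3_fps, ftb8_fps, extra_layers=5):
    """Estimate the z-reject rate in pixels per second.

    ftb3_fps / ftb8_fps: front-to-back frame rates at overdraw factors 3 and 8.
    extra_layers: assumed to be the difference in overdraw factor (8 - 3 = 5).
    """
    pixels_on_screen = width * height
    # Extra time per frame needed to reject the additional hidden layers.
    extra_time = 1.0 / ftb8_fps - 1.0 / ftb3_fps
    if extra_time <= 0:
        raise ValueError("ftb8 must be slower than ftb3 for the estimate to make sense")
    return pixels_on_screen * extra_layers / extra_time


# Illustrative round numbers only (NOT benchmark results).
rate = z_reject_rate(1280, 1024, ftb3_fps=2000.0, ftb8_fps=1000.0)
print(f"{rate / 1e9:.2f} Gpix/s")
```

To use it with real data, plug in the resolution and the two front-to-back results from the same benchmark run.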

For your card, we get over 25 Gpix/s for all AA settings, a few percent shy of 64 pixels per clock. That means Xmas is right. With 4xAA, NVidia can reject 256 samples per clock. I doubt they have this many Z-units, so does it mean NVidia has a form of HiZ as well? We saw a hint of this in another thread where pixels could only be rejected at a rapid rate when there is enough difference in Z value.
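
As a rough check of the per-clock figures, here is the arithmetic as a small Python sketch. It assumes the "410" in "410/1000" is the core clock in MHz, and it takes "over 25 Gpix/s" as exactly 25e9 for illustration, so the real figure would be slightly higher. The names are hypothetical.

```python
def per_clock(rate_per_second, core_clock_hz):
    """Convert a per-second rate into a per-clock rate."""
    return rate_per_second / core_clock_hz


core_clock_hz = 410e6          # assumed core clock of the 6800GT in this test
z_reject_per_second = 25e9     # lower bound taken from "over 25 Gpix/s"

pixels_per_clock = per_clock(z_reject_per_second, core_clock_hz)
shortfall = 1.0 - pixels_per_clock / 64  # fraction below 64 pixels per clock

# If 64 pixels per clock are rejected with 4xAA, that is 64 * 4 samples.
samples_per_clock_4x = 64 * 4

print(f"{pixels_per_clock:.1f} pixels/clock, {shortfall:.1%} shy of 64")
print(f"{samples_per_clock_4x} samples/clock at 4xAA")
```

With the 25e9 lower bound this comes out at roughly 61 pixels per clock, which is consistent with "a few percent shy of 64".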

R420 can do 256 pixels per clock when HiZ is enabled, and thus 1024 samples when 4xAA is enabled, but only 32 samples per clock when HiZ is disabled, I think. It could be possible that ATI's early Z can do all samples in a quad at once as well. If not, this could definitely be a big reason NV40 does so well in Quake3 even with AA.

Has anyone done benchmarks without shadows enabled? We could probably get a good idea of shading speed from that.
 
Why don't we argue about the topic instead of arguing about someone's arguments?

With that said, what's so special about rejection rate?
 
Mintmaster said:
For your card, we get over 25 Gpix/s for all AA settings, a few percent shy of 64 pixels per clock. That means Xmas is right. With 4xAA, NVidia can reject 256 samples per clock. I doubt they have this many Z-units, so does it mean NVidia has a form of HiZ as well? We saw a hint of this in another thread where pixels could only be rejected at a rapid rate when there is enough difference in Z value.
I think it does, but I believe the thread showed that nVidia's optimizations gave more performance when surfaces were closer to one another, which seems a strange result indeed.

Anyway, it's probably not so much that they can reject 256 samples per clock, but that they can reject 16 quads per clock, such that their z optimization either rejects a quad or doesn't, with nothing in between. I would expect that would be the only way that they would be able to have the rejection rate nearly independent of FSAA.
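
A minimal sketch of that all-or-nothing idea, in Python. It assumes a quad is a 2x2 pixel block, that smaller z means closer, and that the hardware keeps one conservative depth value per quad. None of this is confirmed NVidia behaviour; the function names are hypothetical.

```python
PIXELS_PER_QUAD = 4  # assumed: a quad is a 2x2 block of pixels


def quad_hidden(stored_farthest_z, incoming_nearest_z):
    """Conservative whole-quad test (smaller z = closer).

    The quad is rejected only if even the nearest incoming point lies
    behind the farthest depth already stored for that quad.
    """
    return incoming_nearest_z > stored_farthest_z


def samples_rejected_per_clock(quads_per_clock, samples_per_pixel):
    """Samples covered by the quads that can be rejected in one clock."""
    return quads_per_clock * PIXELS_PER_QUAD * samples_per_pixel


# One test per quad, so the quad rate stays at 16 regardless of the AA mode,
# while the number of samples covered scales with the sample count.
for aa in (1, 2, 4):
    print(aa, samples_rejected_per_clock(16, aa))
```

Under these assumptions, 16 quads per clock covers 64 pixels without AA and 256 samples at 4xAA, which matches the figures discussed above.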
 
pat777 said:
Why don't we argue about the topic instead of arguing about someone's arguments?

With that said, what's so special about rejection rate?

Carmack does a Z-only pass, so a card's fill rate when doing the shaded pass is largely dictated by its Z-reject rate, not its color fill rate.
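
To illustrate why, here is a toy software depth buffer in Python. It is a simplified sketch, not Doom 3's actual renderer: it assumes smaller z means closer and uses a less-or-equal depth test in the shaded pass. The function and variable names are made up.

```python
def render(fragments, num_pixels, z_prepass):
    """Count shaded and z-rejected fragments in the shaded pass.

    fragments: list of (pixel_index, depth) in draw order, smaller depth = closer.
    """
    depth = [float("inf")] * num_pixels

    if z_prepass:
        # Z-only pass: write depth, no shading work at all.
        for pixel, z in fragments:
            if z < depth[pixel]:
                depth[pixel] = z

    shaded = rejected = 0
    for pixel, z in fragments:
        if z <= depth[pixel]:
            depth[pixel] = z
            shaded += 1      # expensive: runs the pixel shader
        else:
            rejected += 1    # cheap: limited only by the z-reject rate
    return shaded, rejected


# One pixel covered by three layers, drawn back to front (worst case).
frags = [(0, 0.9), (0, 0.5), (0, 0.1)]
print(render(frags, 1, z_prepass=False))
print(render(frags, 1, z_prepass=True))
```

Without the prepass every layer gets shaded; with it, only the visible layer is shaded and the rest of the work in the shaded pass is z rejection.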
 
Z-only passes are used in 3DMark03 GT2 and GT3 and also, evidently, in Tomb Raider: AoD. I'd wager there are a few others as well.
 