r420 may beat nv40 in doom3 with anti-aliasing

Joe DeFuria said:
Quality differences between FP24 and FP32? Or FP24 and FP16? I remember comments on the latter, but not the former.

I think he commented on that also. But we're probably talking about very small differences, since the difference between the R200 and ARB2 paths was supposed to be close to nonexistent. Although that has probably changed drastically now:

I did decide rather late in the development to go ahead and implement a nice, flexible vertex / fragment programs / Cg interface with our material system. This is strictly for the ARB2 path, but you can conditionally enable stages with fallbacks for older hardware. I made a couple demo examples, and the artists have gone and put them all over the place...
 
AlphaWolf said:
jvd said:
thnx

I wonder if something is broken.

My 9700pro on an athlon xp 2500+ on gt 2 in 3dmark2001 gets 29 frames

You're telling me the X800 XT only manages to get another 21 frames over it?

I dunno that can't be right

gt2 in 3dmark03 not 01.

Yes, that was a typo.
 
DaveBaumann said:
Pete - compare the front-to-back render performances.
The GL_REME tests? ATi seems to far outperform nV there in F to B. Is this responsible for ATi's slight wins in those tests? Is this advantage negated by AA? What becomes the limiting factor with AA, the memory controller?

Sorry if these are elementary questions.
 
Hi Pete,
looking at mdolenc’s fillrate tester, there seems to be a problem with fillrate efficiency on R420, maybe with the memory controller or the driver. With 4 texture samples per pixel you only get 65% of the possible fillrate... dunno if this is related.
 
Looks like we might not have to wait too long after all:

http://www.gamespot.com/pc/action/doom3/news_6097046.html

the article:

Summer release for Doom 3


Activision and id Software announce that the PC version of Doom 3 will ship to stores this summer.

Activision and id Software have today announced that the PC version of Doom 3, which is also in development for the Xbox, will ship to stores this summer. No specific release date has been confirmed at the time of writing, but today's announcement is sure to be welcomed by fans of the series who have been awaiting the third installment since it was announced back in 2001.

"Doom 3 is the most anticipated PC game of the year and fans around the world are eagerly awaiting its release," said Ron Doornink, CEO at Activision Publishing. "id is delivering the most compelling horror gaming experience to date, so get ready for Doom."

"This summer we're sending game fans on a first-class trip to Mars--with a layover in Hell," added Todd Hollenshead, CEO at id Software. "Doom 3 is coming to the PC and delivering the most atmospheric, terrifying and visually mind-blowing gaming experience ever conceived."

We'll bring you more information on Doom 3 as soon as it becomes available.

By Justin Calvert, GameSpot POSTED: 05/11/04 09:33AM


Sounds as vague as the HL2 'this summer' promise. They're probably trying to compete by releasing around the same time... thoughts, anyone?
 
NV40 does not have a fixed-function vertex pipeline. The FFP results come from the fact that the hand-tuned vertex shaders used to implement FFP T&L are more efficient than the ones compiled on the fly. This shows you some of the growing room left for the vertex program compiler.
 
Sure, but wouldn't R420 have similar hand-tuned shaders, or are you saying ATi just lets its shader compiler handle things (and thus still has room to grow so long after R300's debut)?

I'm not clear on how hand-tuned shaders affect the fillrate tests, though...?
 
@democoder
The R360 doesn't have this problem and gets around 91% of the possible fillrate under the same conditions...

@pete
About the front-to-back/stencil performance: I think hierarchical Z gets disabled on R420 due to the Z-test performed in GT2, but I'm not sure.
 
davebaumann wrote:

Would someone please enlighten me as to why R420 doesn't live up to its theoretical stencil performance in GT2 and other stencil tests, then?

.....maybe 'up to 32 z/stencil operations per clock cycle' is limited to more special conditions than just multisampling.....
 
christoph said:
davebaumann wrote:

Would someone please enlighten me as to why R420 doesn't live up to its theoretical stencil performance in GT2 and other stencil tests, then?

.....maybe 'up to 32 z/stencil operations per clock cycle' is limited to more special conditions than just multisampling.....

I'm guessing HyperZ is much better at back to front while NVIDIA is better at front to back. That would be why ATI does worse in this test.

At least that's my dumb guess.
 
HyperZ can do nothing at all when rendering back to front. You can't discard pixels that are only hidden by something not rendered yet. Unless you go the deferred rendering route.

edited for clarity.
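
To put a toy model behind that point, here is a rough sketch that counts how many fragments pass a depth test for one pixel under different submission orders. It assumes every layer is opaque and covers the same pixel, and it uses a plain less-than depth test; it says nothing about how HyperZ or any real chip is built, and the function name `shaded_fragments` is made up for illustration.

```python
import random


def shaded_fragments(depths):
    """Count fragments that pass a less-than depth test, in submission order.

    Each entry in `depths` is one opaque layer covering the same pixel;
    a smaller depth means closer to the camera.
    """
    nearest = float("inf")  # cleared depth buffer
    shaded = 0
    for z in depths:
        if z < nearest:
            # Fragment is visible so far, so it has to be shaded and written.
            nearest = z
            shaded += 1
        # Otherwise something already drawn hides it and it can be rejected.
    return shaded


layers = [0.1 * i for i in range(1, 9)]  # 8 layers, i.e. overdraw factor 8

front_to_back = shaded_fragments(sorted(layers))
back_to_front = shaded_fragments(sorted(layers, reverse=True))

rng = random.Random(0)  # fixed seed so the shuffled order is repeatable
shuffled = layers[:]
rng.shuffle(shuffled)
random_order = shaded_fragments(shuffled)

print("front to back:", front_to_back)
print("back to front:", back_to_front)
print("random order: ", random_order)
```

Front to back, only the first layer survives the test; back to front, every layer passes because nothing drawn earlier can hide it, so there is nothing for an early rejection scheme to discard. Random order lands somewhere in between.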
 
jvd said:
I'm guessing HyperZ is much better at back to front while NVIDIA is better at front to back. That would be why ATI does worse in this test.
Well, Dave's GL_REME Overdraw tests appear to show the X800XT as about equal to the 6800U in Back to Front, but way faster in Front to Back and somewhat to way faster in Random (its lead grows as the overdraw factor increases). Otherwise, the 6800U is ~10% slower in Villagemark, and ~10% faster in Fablemark.

So the implication of Dave's reference to the X800's FtB performance still eludes me. Is it that the X800 keeps up with the 6800 in shadow-heavy benches like X2 and 3DM03 GT2 b/c of ATi's superior culling? That still doesn't clarify (to this ignorant mind) why X800 loses performance when enabling AA+AF, especially if ATi claims that it gains stencilling performance with AA.

The truth is out there, it's just taking me a while to find it. :)
 
Pete said:
So the implication of Dave's reference to the X800's FtB performance still eludes me. Is it that the X800 keeps up with the 6800 in shadow-heavy benches like X2 and 3DM03 GT2 b/c of ATi's superior culling? That still doesn't clarify (to this ignorant mind) why X800 loses performance when enabling AA+AF, especially if ATi claims that it gains stencilling performance with AA.

The truth is out there, it's just taking me a while to find it. :)
One comment. 4x AA requires 4x the Z samples of no AA. X800 XT can do 16 Z/stencil ops per clock without AA and 32 with AA. However, 4x AA for those same 16 pixels requires 64 operations. 4x AA is not free, but it is certainly faster than it would be at only 16 operations per clock.

One more thing. The cost of MSAA can be partially, or even completely, hidden if other bottlenecks are at play. For example, complex pixel shaders can shift the load so that the extra time spent doing the Z samples is "free".
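
The arithmetic above can be written out as a quick sketch. It uses only the figures quoted in this post (16 Z/stencil ops per clock without AA, 32 with AA, 16 pixels, 4 samples per pixel) and assumes the Z/stencil units are the only limit, which is exactly the assumption the second paragraph says often does not hold. The helper name `clocks_for_pixels` is hypothetical.

```python
def clocks_for_pixels(pixels, samples_per_pixel, ops_per_clock):
    """Clocks needed if Z/stencil throughput is the only bottleneck."""
    ops = pixels * samples_per_pixel  # one Z/stencil op per sample
    return ops / ops_per_clock


PIXELS = 16

no_aa = clocks_for_pixels(PIXELS, samples_per_pixel=1, ops_per_clock=16)
aa_4x_at_32 = clocks_for_pixels(PIXELS, samples_per_pixel=4, ops_per_clock=32)
aa_4x_at_16 = clocks_for_pixels(PIXELS, samples_per_pixel=4, ops_per_clock=16)

print("no AA, 16 ops/clock:", no_aa)        # 1.0
print("4x AA, 32 ops/clock:", aa_4x_at_32)  # 2.0
print("4x AA, 16 ops/clock:", aa_4x_at_16)  # 4.0
```

So under these numbers 4x AA costs twice the Z/stencil time of no AA, rather than the four times it would cost at 16 ops per clock.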
 
And how does this explain the GeForce 6800 Ultra's 50% lead in GT2 at 1600x1200 with 4x AA, given the X800 XT's theoretical stencil fillrate advantage of over 30%?
 
christoph said:
And how does this explain the GeForce 6800 Ultra's 50% lead in GT2 at 1600x1200 with 4x AA, given the X800 XT's theoretical stencil fillrate advantage of over 30%?
There's still some tuning work to be done on the X800, especially for AA. As someone mentioned in a review, the memory controller is very programmable; however, it's not always obvious what the best settings are (i.e. what's best for one mode may not be best for another) and the number of combinations is enormous. Expect large improvements with AA as we find more optimal settings. Stay tuned.
 