Mind helping me interpret AoE3 #s?
I'm losing my bearings again, b/c I can't figure out why ATI does so much worse than NV at AoE3. I thought the X1900 proved that AoE3 liked shader power, given that it outperformed the X1800 at the same clocks. Say the X1800 was maxing out its shader units but not its texture units; surely the situation would be reversed with the X1900? If so, won't the X1600, with its similar 3:1 ALU:TMU ratio, leave some shaders dangling while maxing out its pittance of texture units? Yet we see the 6600GT outperforming the X1600XT and the 7600GT doing the same to the X1800GTO.
Looking at HW.fr's GPU table for clues, how can I mesh the math and texture ops #s to determine AoE3's bottleneck(s) WRT mainstream parts? If the X1900 (per HW.fr's table) gains fps just by tripling math ops, can we assume the X1600XT's shaders are well fed and all of its texture units are in use, i.e. it's TMU constrained? So let's compare the X1600XT with the 6600GT. The X1600XT gets 2k tex ops/s and 7k math ops/s; scaled to the same tex rate, the 6600GT would get 2k tex ops/s and 4k math ops/s.
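For anyone checking my arithmetic, here's the units × clock math I'm doing, as a sketch. Unit counts are from public spec sheets rather than HW.fr's exact table, and counting NV43 as two math ops per pipe per clock is a simplification:

```python
# Back-of-envelope throughput = functional units x core clock (MHz).
# Unit counts below are spec-sheet approximations, not HW.fr's exact figures.
cards = {
    # name: (tex units, math ops per clock, core MHz)
    "X1600XT": (4, 12, 590),  # RV530: 4 TMUs, 12 pixel-shader ALUs
    "6600GT":  (8, 16, 500),  # NV43: 8 TMUs, ~2 math ops/pipe/clock (simplified)
}

for name, (tmus, mops, mhz) in cards.items():
    tex, math = tmus * mhz, mops * mhz  # Mops/s
    print(f"{name}: {tex} Mtex/s, {math} Mmath/s, math:tex = {math / tex:.1f}:1")
```

That gives the X1600XT roughly 2.4k tex / 7.1k math (a 3:1 ratio) and the 6600GT a 2:1 ratio, which is where the "2k vs. 4k at the same tex rate" comparison comes from.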
Keeping Xbit's X1900 vs. X1800 #s in mind (roughly similar improvements both w/ and w/o AA+AF), how is the X1600XT's extra shader power not putting on a better show in HW.fr's #s? Is it down to NV's more flexible shader pipe allowing more tex ops, packing in more shader ops with more flexible co-issue (3+1/2+2 vs. ATI's 2+2), FP16 optimizations, better shader compiler, (better) app-specific optimizations, or what?
Let's say the 6600GT is faster b/c of more tex fetches, which is possible with 16xAF. Let's scale according to framerates. The GT is 40-50% faster than the XT, so let's say 3k tex ops/s for the GT. That leaves just 2k math ops/s. If tex ops are the limitation and we're seeing fewer math than tex ops, then why isn't the 16*625MHz=10k X1800XT as fast as the 24*430=10k 7800GTX, and why is it slower than the X1900XT/X?
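For the record, the peak-tex-fetch arithmetic behind that last comparison (TMU count × core clock, again spec-sheet numbers):

```python
# Peak texture fetches per second = TMUs x core clock (MHz).
x1800xt  = 16 * 625  # 10000 Mtex/s -- the "10k" above
gtx_7800 = 24 * 430  # 10320 Mtex/s -- effectively the same peak tex rate
print(x1800xt, gtx_7800)
```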
The 4xAA #s further confound me, as NV loses up to three times as much performance as ATI. ATI's 256b X1800GTO and 128b X1600XT each drop a curiously meager 10%, while NV's 128b 7600GT and 6600GT each drop 30% and its 256b 6800GS drops 16-20%. This is a clue, I'm sure; I just don't know how to interpret it. Is the fact that 16xAF is on for all tests a problem? I could guess that helps NV's more TMU-endowed GPUs shine w/o AA, but that doesn't explain NV dropping more with AA--unless this is ATI's memory controller showing off.
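To pin down "three times as much," here's the quoted-drop comparison spelled out (the 6800GS figure is just the midpoint of its 16-20% range):

```python
# Relative 4xAA performance drops as read off HW.fr's charts.
aa_drop = {"X1800GTO": 0.10, "X1600XT": 0.10,
           "7600GT": 0.30, "6600GT": 0.30,
           "6800GS": 0.18}  # midpoint of the 16-20% range
ratio = aa_drop["7600GT"] / aa_drop["X1800GTO"]
print(f"NV's worst AA hit is {ratio:.0f}x ATI's best case")
```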
Maybe there's just something wrong with the game engine, as Xbit said.