R6XX Performance Problems

Me? I was right on the money (just embarrassingly too forgetful/lazy to look up the exact numbers). Perhaps you meant Rangers, although R520 was hardly "twice as big" as G71 (196mm², 13.5mm x 14.5mm) either.
 
R520 ...was massively texture limited.
Funny how adding an extra 200% of ALUs & not increasing TMUs or clock speed made R580 faster than R520 in most situations then...
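That inference can be made concrete with a toy bottleneck model. All of the workload numbers and per-clock rates below are illustrative, not measured, and `frame_time` is a hypothetical helper:

```python
# Toy bottleneck model: frame time is set by whichever unit is the
# limiter. Workloads and per-clock rates are illustrative, not measured.

def frame_time(alu_ops, tex_ops, alu_rate, tex_rate):
    """Time to shade a frame when ALU and TEX are independent bottlenecks."""
    return max(alu_ops / alu_rate, tex_ops / tex_rate)

# Hypothetical shader-heavy workload: R520-like (16 ALUs, 16 TMUs)
# versus R580-like (48 ALUs, same TMU count and clock).
t_r520 = frame_time(alu_ops=480, tex_ops=160, alu_rate=16, tex_rate=16)
t_r580 = frame_time(alu_ops=480, tex_ops=160, alu_rate=48, tex_rate=16)
print(t_r520, t_r580)  # 30.0 10.0: tripling ALUs helped, so not texture limited
```

If the workload were texture limited (tex_ops dominating), tripling alu_rate would change nothing, which is exactly the point being made above.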
 
I'd blame the fact that these parts have been in development for years. When they were releasing R300, R600 was probably being drawn up. It seems obvious that they thought we'd be texturing a lot less these days.
 
R5x0 had a hybrid ringbus layout. R6x0 has an int/ext ringbus. Who said only 4 ringstops?
Sure, but at the time everyone had expected a full R600-style ringbus, and that was what was being discussed.
4 ringstops is from ATI's diagrams, Dave's quotes & other official ATI word.
There is a 5th one for PCIE I/O, but it's physically halfway between two stops rather than one of 5 equally spaced ones, so in terms of the length of die traversed it's pretty much irrelevant.
 
But R520 was substantially slower than G70 per-clock, with the same TMU/ALU ratio ATI had always used up until that point. R580 actually closed this gap, delivering 20%+ increases in performance for only 20% more transistors. That seems like a pretty good deal to me.
 
There is a 5th one for PCIE I/O, but it's physically halfway between two stops rather than one of 5 equally spaced ones, so in terms of the length of die traversed it's pretty much irrelevant.
Is it? Every additional node adds latency.
 
But R520 was substantially slower than G70 per-clock, with the same TMU/ALU ratio ATI had always used up until that point. R580 actually closed this gap, delivering 20%+ increases in performance for only 20% more transistors. That seems like a pretty good deal to me.

Actually, R520 was faster than G70, and by a fair margin, after 2-3 months once the drivers had for the most part matured. It was only at launch that it was slower.

Although in OGL it was a completely different story.

Regards,
SB
 
I said slower per-clock. R520 was around 20% faster than G70 w/AA at launch (in Direct3D), but was clocked 45% higher (430 MHz vs. 625 MHz). That's why Nvidia were able to quickly take back the performance crown with the 7800 GTX 512, and had ATI not introduced the R580, the X1800 would have had to face off against the 7900 GTX (still only 650 MHz vs. 625 for the X1800).
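For anyone checking the arithmetic, a quick sketch using only the figures quoted above (the 20% number is the launch D3D w/AA figure):

```python
# Per-clock comparison of R520 (X1800 XT, 625 MHz) vs G70 (7800 GTX, 430 MHz),
# using the figures quoted above.
perf_ratio = 1.20          # R520 / G70 frame rate, D3D w/AA at launch
clock_ratio = 625 / 430    # ~1.45, i.e. 45% higher clock
per_clock = perf_ratio / clock_ratio
print(f"{per_clock:.2f}")  # 0.83: ~17% slower per clock despite being faster overall
```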
 
Is it? Every additional node adds latency.
Only for data that has to traverse that section of the bus, which is just 2 paths out of 20. Another 4 could go that way or could go the other way and bypass that ringstop.
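That path count is easy to sanity-check with a toy model: put the 4 memory ringstops equally around the ring, the PCIe stop halfway between two of them, route each ordered pair of stops the physically shorter way, and count who has to pass the PCIe stop. The layout and stop names are illustrative assumptions, not ATI's actual floorplan:

```python
# Toy model of a 5-stop bidirectional ring: 4 memory ringstops equally
# spaced, plus the PCIe I/O stop ('P') halfway between two of them.
# Positions are fractions of ring circumference (illustrative layout).
stops = {'A': 0.0, 'P': 0.125, 'B': 0.25, 'C': 0.5, 'D': 0.75}

def cw(a, b):
    """Clockwise distance from stop a to stop b."""
    return (stops[b] - stops[a]) % 1.0

def via(a, b):
    """Stops strictly between a and b when travelling clockwise."""
    return {s for s in stops if 0 < cw(a, s) < cw(a, b)}

must = either = total = 0
for a in stops:
    for b in stops:
        if a == b:
            continue
        total += 1
        d = cw(a, b)
        if d == 0.5:  # both directions tie: traffic can route around P
            if ('P' in via(a, b)) != ('P' in via(b, a)):
                either += 1
        else:         # unique shorter direction
            route = via(a, b) if d < 0.5 else via(b, a)
            if 'P' in route:
                must += 1

print(total, must, either)  # 20 2 4: of 20 ordered pairs, 2 must pass P, 4 can dodge it
```

With the PCIe stop halfway between two neighbours, only the traffic between those two neighbours is forced through it, matching the 2-out-of-20 claim above.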
 
They continue squandering enormous opportunities (R520 was twice as big as G70..it should therefore by all reasonable measures have been twice as fast). It was massively texture limited.
First of all, you should be comparing R580 to G71. R520 was rarely texturing limited.

Secondly, R580 was not massively texture limited compared to G71. In fact, for ALU:TEX ratios greater than 2:1, R580 would achieve higher texture throughput. Moreover, G71 would stall when the texturing unit needed more than one cycle (AF, trilinear, >32-bit per texel, etc.), whereas R580 could do math in parallel. The problem with R580's perf/mm² is that ATI made a real PS3.0 pixel shader with usable dynamic branching. When G71 is handling only 6 massive batches concurrently and R580 is handling 512 small ones, R580 is doing gobs more work for almost nothing.
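A quick sketch of why batch size matters for branching. The batch sizes are the commonly quoted ballpark figures; the branch probability and cycle costs are invented:

```python
# SIMD divergence model: a batch pays for a branch side if *any* pixel in
# it takes that side. Pixels are modelled as independent with probability
# p of taking the expensive path; costs are invented cycle counts.

def expected_cost(batch, p, cheap=10, expensive=100):
    """Expected cycles each pixel spends in lockstep with its batch."""
    p_any_cheap = 1 - p ** batch
    p_any_expensive = 1 - (1 - p) ** batch
    return cheap * p_any_cheap + expensive * p_any_expensive

# ~1024-pixel batches (G7x-class ballpark) vs 48-pixel batches (R580)
big = expected_cost(1024, 0.01)   # ~110: almost every batch runs both sides
small = expected_cost(48, 0.01)   # ~48: most batches skip the expensive side
print(big, small)
```

Real pixels are far more spatially coherent than independent coin flips, which softens the big-batch penalty in practice, but the direction of the effect is the same.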

R600 vs. G80 is where the real texturing deficit came in. R600 made irrelevant improvements over R580 (texture cache BW for FP32 use in GPGPU, FP16 filtering), whereas G80 not only doubled the filtering rate but also let the texture units run in parallel to math.
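The serial-vs-parallel texturing difference amounts to this (cycle counts invented for illustration):

```python
# Issue-model sketch: math stalls behind a multi-cycle texture fetch
# (serial) versus math overlapping the fetch (decoupled/parallel).
# Cycle counts are invented for illustration.

def serial_cycles(alu, tex):
    return alu + tex       # shader waits while the TMU filters

def parallel_cycles(alu, tex):
    return max(alu, tex)   # math hides under the texture latency

# e.g. 8 ALU cycles alongside a 4-cycle trilinear+AF fetch
print(serial_cycles(8, 4), parallel_cycles(8, 4))  # 12 8
```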

Now we have RV670..half as big as G92..it should be the same size instead, but twice as fast..AMD 55nm edge=totally squandered, yet again.
This is screwed up logic. Did you lambaste G71 for not being twice the size so that it could beat R580?

RV670 is exactly what ATI needed to get its margins back up. It's only 20% bigger than G84, but lays a complete beatdown on it. Who cares if the 8800GT is faster, as its advantage isn't as big as its size suggests. It can't compete in the OEM market like RV670 can, and we're hearing about supply issues too. ATI scaled R600 down to RV670 very impressively, almost completely erasing the large perf/cost deficit ATI had vs. NVidia.
 
R580 is doing gobs more work for almost nothing.

Oh I don't know about that. Looking at R580's performance vs G71 in modern games, the R580 really does slaughter it. Oblivion was the first game where this became really evident but since then there have been quite a few examples.
 
Oh I don't know about that. Looking at R580's performance vs G71 in modern games, the R580 really does slaughter it. Oblivion was the first game where this became really evident but since then there have been quite a few examples.
Sorry, I meant to use past tense. R580 was doing gobs more work for almost nothing.

Even today, though, I don't think much of this has to do with dynamic branching. Longer pixel shaders and two-component normal maps mean less relevance of free normalization, and though ALU:TEX is not high enough to give R600 an advantage over G80, it's high enough to give R580 an advantage over G71.
 
Oh I don't know about that. Looking at R580's performance vs G71 in modern games, the R580 really does slaughter it. Oblivion was the first game where this became really evident but since then there have been quite a few examples.

From Tech Report's 8800 review:
oblivion.gif

Seems that G71 is beating R580. I too thought it was the other way around, but this seems to say otherwise.
 
Oblivion isn't a perfect example. Results depend on location. Better examples:

Age of Empires III
Anno 1701
Call of Juarez
Colin McRae Rally 05
Company of Heroes
Earth 2160
Gothic III
Need For Speed Carbon
Rainbow Six Vegas
Splinter Cell 4
X3
 
TomsHardware VGAcharts is a good resource for comparing video cards across different generations. The latest, 2007 version, obviously has the most recent games (7 games + 3DMark06). Whether this accurately describes what people actually play is a different question - for some reason the hardware sites don't want to test WoW and its peers. :) If you go back in time to the earlier charts you get a few more comparison points.

http://www23.tomshardware.com/graphics_2007.html

The 7900GTX is faster than the X1900XTX in five out of seven 2007 games, one of two wins for the X1900XTX incidentally being Oblivion.

Most review sites only test new cards versus competing new cards, which doesn't tell a prospective upgrader much about what would be gained compared to what he/she already has, making this particular resource quite useful.
 