RV770 vs GT200 : hidden potential?

Which solution will gain more gaming performance via new drivers?


  • Total voters
    137
  • Poll closed.
This page is interesting:

http://www.firingsquad.com/hardware/ati_radeon_4850_4870_performance/page16.asp

The HD4870 is overclocked 5% on the core and 22% on the memory. Strangely, performance goes up by more than 5%, implying that HD4870 is "bandwidth constrained" in stock form.

Yet HD4870 is clocked 20% higher than HD4850, with 80% more bandwidth, and comes in at ~27% faster - implying that HD4850 is bandwidth constrained.

If HD4850 is bandwidth constrained, how can HD4870 also be bandwidth constrained when it has effectively 50% more bandwidth per core clock cycle?

The overclocked HD4870 has, effectively, 16% more bandwidth per core clock cycle than the stock HD4870.

Looking at HD4850 scaling, core overclock is 10% with 15% on the memory. Performance increases by 8% on average.
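
For anyone who wants to check the per-clock figures, here is a quick sketch. The function name is made up for illustration, and the only inputs are the percentages quoted in this post.

```python
def bw_per_clock_gain(core_gain, bw_gain):
    """Fractional change in bandwidth per core clock cycle.

    Both arguments are fractional gains, e.g. 0.20 for +20%.
    """
    return (1 + bw_gain) / (1 + core_gain) - 1


# HD4870 vs HD4850: +20% core clock, +80% bandwidth
hd4870_vs_hd4850 = bw_per_clock_gain(0.20, 0.80)

# Overclocked vs stock HD4870: +5% core, +22% memory
hd4870_oc_vs_stock = bw_per_clock_gain(0.05, 0.22)

# Overclocked vs stock HD4850: +10% core, +15% memory
hd4850_oc_vs_stock = bw_per_clock_gain(0.10, 0.15)

for label, gain in [
    ("HD4870 vs HD4850", hd4870_vs_hd4850),
    ("HD4870 OC vs stock", hd4870_oc_vs_stock),
    ("HD4850 OC vs stock", hd4850_oc_vs_stock),
]:
    print(f"{label}: {gain:+.1%} bandwidth per core clock")
```

This gives the 50% and 16% figures above. The HD4850 overclock shifts the bandwidth-per-clock ratio by under 5%, so it says little about bandwidth limits either way.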

All this implies to me that HD4870 is far short of utilising its bandwidth effectively but it should scale considerably with drivers.

The only counter-argument I can think of is that at 2560x1600 HD4870 4xMSAA has run out of memory. In this situation texturing from system memory is having a proportionally lower effect on overall performance as RV770 is overclocked. Is that a reasonable argument?...

Jawed
 
Jawed, the only scenario I can come up with here is that the decrease in latency resulting from the increased memory clocks is the culprit, rather than the increased bandwidth.
 
The HD4870 is overclocked 5% on the core and 22% on the memory. Strangely, performance goes up by more than 5%, implying that HD4870 is "bandwidth constrained" in stock form.
I think you're looking at GPU characteristics as too black and white. Some parts of the scene are BW limited, some clock limited. Sometimes you even hit parts of a timedemo that are CPU limited.

If we assume that the CPU is never a limitation, then the ~9% increases for the OC'd 4870 (except in Crysis) imply you're BW constrained 25% of the time.

(I am making one fairly safe simplification in assuming that there isn't any significant part of the workload that is just slightly BW constrained before the OC).

Yet HD4870 is clocked 20% higher than HD4850, with 80% more bandwidth, and comes in at ~27% faster - implying that HD4850 is bandwidth constrained.
I see quite a bit of variability in the 4850->4870 improvement, but I get 30% ignoring Crysis (which obviously has some CPU limitation, given the sub-20% increase).

Using the same methodology, it seems the 4850 is BW constrained 23% of the time. That's a bit low (or the 25% figure above is too high), but probably within the margin of error.
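
Numbers of this kind can be reproduced with a simple two-regime model. The model is an assumption here and not necessarily the exact method used above: a fraction f of the stock frame time is purely BW limited and shrinks with the bandwidth increase, and the rest is purely clock limited and shrinks with the core clock increase. The function name is hypothetical.

```python
def bw_limited_fraction(core_scale, bw_scale, speedup):
    """Solve 1/speedup = f/bw_scale + (1 - f)/core_scale for f.

    f is the share of the *baseline* frame time assumed to be purely
    bandwidth limited; the remainder is assumed purely core-clock limited.
    All arguments are ratios, e.g. 1.22 for a 22% increase.
    """
    inv_core = 1.0 / core_scale
    inv_bw = 1.0 / bw_scale
    inv_speedup = 1.0 / speedup
    return (inv_core - inv_speedup) / (inv_core - inv_bw)


# Overclocked vs stock HD4870: +5% core, +22% memory, ~9% faster
f_4870_stock = bw_limited_fraction(1.05, 1.22, 1.09)

# HD4870 vs HD4850: +20% core, +80% bandwidth, ~30% faster (ignoring Crysis)
f_4850_stock = bw_limited_fraction(1.20, 1.80, 1.30)

print(f"HD4870 stock: {f_4870_stock:.0%} of frame time BW limited")
print(f"HD4850 stock: {f_4850_stock:.0%} of frame time BW limited")
```

Under these assumptions the results land at roughly a quarter of the frame time in both cases, in line with the 25% and 23% figures. The result is sensitive to the measured speedup: changing the 9% by half a point moves f by a few points.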

If HD4850 is bandwidth constrained, how can HD4870 also be bandwidth constrained when it has effectively 50% more bandwidth per core clock cycle?
Basically it means those parts of the rendering load that are BW limited (probably alpha blending, resolving, and post-processing) are really BW limited. 50% more BW/clk is not enough to alleviate the difference.

What's interesting is how this mix can lead to two schools of thought. One is that we should forget about matching BW to that 25% as it needs too much, and just concentrate on being optimal (horsepower to BW) for the other 75%. The other is that more BW is always useful to tap into that 25%, so any measure that increases BW by at least 4% for 1% increase in cost is worth it, even if the GPU remains the same speed.
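
The 4%-for-1% break-even can be sanity-checked with a rough sketch, assuming the same two-regime split (25% of frame time purely BW limited, the rest purely clock limited). The helper name is made up for illustration.

```python
def speedup(bw_fraction, core_scale, bw_scale):
    """Overall speedup when a bw_fraction share of baseline frame time
    scales with bandwidth and the remainder scales with core clock."""
    new_time = bw_fraction / bw_scale + (1 - bw_fraction) / core_scale
    return 1.0 / new_time


# GPU clock unchanged, bandwidth up 4%, 25% of frame time BW limited
gain = speedup(0.25, 1.00, 1.04) - 1
print(f"Performance gain from +4% bandwidth alone: {gain:.2%}")
```

So +4% bandwidth buys close to +1% performance under this model, which is where the "at least 4% BW per 1% cost" threshold comes from.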

Thanks for pointing out this matter, Jawed. It's not often that we get to really isolate how much of a rendering workload needs more BW. The last time it was this clear was with 9500 Pro vs. 9700 Pro.

All this implies to me that HD4870 is far short of utilising its bandwidth effectively but it should scale considerably with drivers.
Actually, it looks like it's the other way around. I know I said that the 25% figure is high for the 4870, but that means the performance improvement (9%) is higher than expected.
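
One way to read this, sketched under the same two-regime assumption (names are hypothetical): take the 23% BW-limited share estimated for the stock 4850, carry it over to the 4870's clocks, and predict what the 4870 overclock should then gain.

```python
def rescale_fraction(bw_fraction, core_scale, bw_scale):
    """BW-limited share of frame time after scaling core and bandwidth."""
    bw_time = bw_fraction / bw_scale
    core_time = (1 - bw_fraction) / core_scale
    return bw_time / (bw_time + core_time)


def speedup(bw_fraction, core_scale, bw_scale):
    """Overall speedup for a given baseline BW-limited share."""
    return 1.0 / (bw_fraction / bw_scale + (1 - bw_fraction) / core_scale)


# 23% BW limited on the HD4850; the HD4870 has +20% core, +80% bandwidth
f_4870_expected = rescale_fraction(0.23, 1.20, 1.80)

# Predicted gain for the HD4870 overclock (+5% core, +22% memory)
predicted_oc_gain = speedup(f_4870_expected, 1.05, 1.22) - 1

print(f"Expected BW-limited share on stock HD4870: {f_4870_expected:.1%}")
print(f"Predicted overclock gain: {predicted_oc_gain:.1%} (measured: ~9%)")
```

With these inputs the predicted overclock gain comes out below the measured ~9%, which matches the point that the 4870 gains more from the overclock than the 4850-derived figure would suggest.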

Hang on, I'm going to quantify this stuff a bit more...
 
Well what we really need to look at is the impact on minimum fps... I mean who cares if you are bandwidth limited @ 150 fps.
 
So you could write the poll this way

Nvidia currently has the worst drivers so G280 will gain most
ATI currently has the worst drivers so RV770 will gain the most

Somehow I think the results would change then ;)
 
At that point you're baiting the sample group with loaded questions though and the poll becomes rather moot. Save the "analysis" and just present the options.
 
Somehow I think the results would change then ;)

Heh yeah the results would be pretty different. It seems to me that GT200 is the one under-performing but people seem to be expecting more gains from RV770. It's probably just new toy syndrome skewing people's opinions.
 
And a much lower bar to get over.

Let's face it, G80 was/is just that good.
 
At that point you're baiting the sample group with loaded questions though and the poll becomes rather moot. Save the "analysis" and just present the options.

It is baiting them now Shaidar, I guess you don't see it though.

Saying shine, excel, whatever, it amounts to the same thing.
 
The problem is you've injected conjecture into the questions, as phrased.

Here's what the poll options should be:
NV will gain the most perf. from new drivers
ATi will gain the most perf. from new drivers
no significant perf. gains for either IHV
similarly significant perf. gains for both IHVs
not sure

See? No "loaded" words.
 