AMD: R7xx Speculation

I said exactly that earlier in this thread.

Would ATI really do what they did with R600 and RV670 again? Increase costs for useless bandwidth?

A 30% performance increase does not sound like useless bandwidth to me. In fact, it sounds to me like the HD 4850 may be a good bit bandwidth-choked.
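For reference, a quick back-of-envelope on where those bandwidth figures come from, assuming a 256-bit bus on both parts and the data rates floated in this thread (~2 Gbps effective for 1GHz GDDR3, roughly 3.9 Gbps for the rumoured GDDR5); these clocks are assumptions, not confirmed specs:

```python
# Rough memory-bandwidth arithmetic; all data rates are assumptions based on
# the figures floated in this thread, not confirmed specs.

def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: bus width in bytes times per-pin data rate."""
    return (bus_width_bits / 8) * data_rate_gbps

# HD 4850: 256-bit GDDR3 at ~1GHz (2 Gbps effective per pin)
gddr3 = bandwidth_gb_s(256, 2.0)   # ~64 GB/s
# HD 4870: 256-bit GDDR5 at ~3.9 Gbps effective per pin (rumoured)
gddr5 = bandwidth_gb_s(256, 3.9)   # ~125 GB/s

print(f"GDDR3: {gddr3:.0f} GB/s, GDDR5: {gddr5:.0f} GB/s, "
      f"ratio: {gddr5 / gddr3:.2f}x")
```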
 
Why would bandwidth per ROP be an indicator of AA performance? Samples/clock should be your benchmark. If both G92 and GT200 are equally bandwidth limited but the bandwidth is doubled on GT200 you should get your 2x scaling.

Also, isn't CPU limitation at lower resolution a good thing? What's the "risk" you're referring to?

Generally, AA scenarios at high resolutions are also limited by bandwidth, IIRC. 2x scaling normally requires that everything scales 2x, but the TMUs are not doubled in GT200, if we believe the 24-SP-per-cluster structure. Moreover, we don't yet know the final clocks. So, even if I'm inclined to believe that GT200 could on average be superior to the 9800 GX2, more than 2x scaling with respect to G92 seems a little too much IMHO (Jawed spoke about "more than 2x").

The "risk" I'm referring to when speaking about the "CPU limited" is that at lower resolutions with GT200 the CPU could limit the card capabilities so this card will not demonstrate its capabilitues and shine up to very, very high resolution (+AA). Of course Crysis is another story ;)
 
Huh??
What's so great about being completely bandwidth limited?
Besides, by all accounts they WILL sell the same GPU with GDDR3, so what's the problem?

The question is rather if the additional performance is worth it in terms of component cost. But given how happily people seem to pay 35% more for the 15% higher clocks of the top end cards, getting 30% higher performance without any additional power draw, and at reasonable cost seems like a comparatively good deal.
Obviously, whether the additional cost is worth it will depend on usage patterns. People who like to use settings that are bandwidth hungry will fork over the money for the higher end part, others will save their pennies and get a relative bargain. Just as it usually works out.
Hmm, sounds just like trying to justify R600's memory configuration. Didn't work before and I don't see why it's going to work now.

If there are D3D10 or 10.1 features that shine due to high bandwidth then it'd be nice to be educated about them because so far there's nary a sign. It might be developer reticence or it might be the biasing effect of TWIMTBP. Maybe someone should benchmark Ruby: Whiteout on R600 and RV670 and report any differences they find...

I think it's worth pausing to consider just what wonderful progress we've seen in bandwidth-efficiency over the last couple of years, from the 64GB/s for R580 to the ~64GB/s for G92. RV770 appears to be catching up with the curve, at least as far as regular DX9 games are concerned. D3D10 remains a mystery...

Jawed
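As a rough illustration of what ~64GB/s actually buys per frame, a back-of-envelope that ignores framebuffer compression, texture traffic, overdraw and everything else that matters in practice:

```python
# Back-of-envelope: how many full framebuffer read/write passes per frame
# does ~64GB/s allow at 1920x1200, 60fps, 4xAA?  Ignores compression,
# texture fetches, overdraw, etc.; purely illustrative.

bandwidth_bytes_per_s = 64e9
fps = 60
width, height, aa_samples = 1920, 1200, 4
bytes_per_sample = 4 + 4                 # 32-bit colour + 32-bit depth

per_frame_budget = bandwidth_bytes_per_s / fps                        # ~1.07 GB/frame
framebuffer_pass = width * height * aa_samples * bytes_per_sample     # ~74 MB

print(f"Budget per frame: {per_frame_budget / 1e6:.0f} MB")
print(f"One full 4xAA framebuffer pass: {framebuffer_pass / 1e6:.1f} MB")
print(f"Passes per frame: {per_frame_budget / framebuffer_pass:.1f}")
```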
 
Generally, AA scenarios at high resolutions are also limited by bandwidth, IIRC. 2x scaling normally requires that everything scales 2x, but the TMUs are not doubled in GT200, if we believe the 24-SP-per-cluster structure. Moreover, we don't yet know the final clocks. So, even if I'm inclined to believe that GT200 could on average be superior to the 9800 GX2, more than 2x scaling with respect to G92 seems a little too much IMHO (Jawed spoke about "more than 2x").

Hmmm, maybe I'm missing what you're saying, but you seem to be contradicting your own point. If AA is bandwidth limited and GT200 has twice the bandwidth, why do you doubt a potential 2x speedup?

There is no need to double the TMUs if they weren't used to capacity in G80/G92 configurations. I'm not saying that there will be a 2x speedup, but there's nothing rumoured so far that would eliminate it as a possibility.

The "risk" I'm referring to when speaking about the "CPU limited" is that at lower resolutions with GT200 the CPU could limit the card capabilities so this card will not demonstrate its capabilitues and shine up to very, very high resolution (+AA). Of course Crysis is another story ;)

Understood but the only way to limit that risk is to limit the performance of the chip in the first place. So you're saying having too much performance potential is a bad thing?
 
Hmmm, maybe I'm missing what you're saying, but you seem to be contradicting your own point. If AA is bandwidth limited and GT200 has twice the bandwidth, why do you doubt a potential 2x speedup?

There is no need to double the TMUs if they weren't used to capacity in G80/G92 configurations. I'm not saying that there will be a 2x speedup, but there's nothing rumoured so far that would eliminate it as a possibility.

I'm saying the same thing; I was only doubtful about more than 2x scaling (I was commenting on a sentence of Jawed's). I expect GT200 to be on par with the GX2 (when it scales well).


Understood but the only way to limit that risk is to limit the performance of the chip in the first place. So you're saying having too much performance potential is a bad thing?

No, I'm only saying that in most applications, with current systems, there is a risk that such a card can't express its full potential, to the point of needing to be benched on overclocked configurations. Having more power is definitely a very good thing.
 
30% more performance for ~double the bandwidth. Crass. Why not just use GDDR3 and sell the damn thing for a better price?
I must be missing something, but isn't that what the 4850 is for? Unless "crass" refers to the accuracy of the performance boost (how much is the extra 512MB responsible for?), rather than the efficiency of the higher clocks.

30% doesn't seem like performance you'd want to leave on the table if the extra costs are mainly materials (GDDR5) rather than labor.
 
30% doesn't seem like performance you'd want to leave on the table if the extra costs are mainly materials (GDDR5) rather than labor.
How much of that 30% is available with GDDR3, since 1GHz GDDR3 isn't the fastest available?

We could talk about headroom for the overclockers being solely available with GDDR5. That may indeed make AIBs happier, if they can go to 40%+ performance over HD4850. That is if some HD4850s aren't, themselves, given a 20% overclock by AIBs with the fastest available GDDR3 + core-clock boost.

If GDDR4 is reckoned to cost something like 30% more than the fastest GDDR3, what sort of margin is there on GDDR5?

With a bit of luck RV770 will be a happy overclocker and GDDR5 will be the only way to realise such a gain. As a matter of interest, I have a suspicion that GDDR5 is not going to suffer the same degree of latency overhead that GDDR4 seems to have. 90 or 100GB/s of GDDR5 bandwidth, if it were available, would seem to be the sweet spot if RV770 can overclock another ~20%. That's ~50% more than HD4850.

Erring on the side of bandwidth is good - but to the extent of ~125GB/s just seems bizarre - or crass just like R600's bandwidth. If there's a big deadzone between the top of GDDR3 and the bottom of GDDR5, well I guess there's no choice. It seems like a rather costly choice though.

Of course, given the scant bandwidth-efficiency analysis of RV670, it does seem rather pointless dwelling on the price/performance of HD4870.

Jawed
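Sketching the headroom numbers above, using the ~125GB/s figure rumoured in this thread and the hypothetical overclock percentages from the post (none of these are measurements):

```python
# Headroom arithmetic for the scenarios above; all figures are the
# hypotheticals from this thread, not measurements.

hd4850_bw = 64.0           # ~GB/s with ~1GHz GDDR3
hd4870_bw = 125.0          # ~GB/s rumoured with GDDR5
hd4870_perf = 1.30         # the claimed ~30% over HD4850
oc_gain = 1.20             # hypothetical further ~20% overclock

print(f"Bandwidth ratio GDDR5/GDDR3: {hd4870_bw / hd4850_bw:.2f}x")                 # ~1.95x
print(f"90-100 GB/s vs GDDR3: {90 / hd4850_bw - 1:.0%} to {100 / hd4850_bw - 1:.0%}")  # ~41-56%
print(f"HD4870 + 20% OC vs HD4850: {hd4870_perf * oc_gain - 1:.0%}")                # ~56%, i.e. ~50%+
```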
 
If GDDR4 is reckoned to cost something like 30% more than the fastest GDDR3, what sort of margin is there on GDDR5?
:LOL::LOL:

The difference in price between:
Powercolor ATI HD 3870 PCS 512MB GDDR3 PCI-E 1.8Ghz
http://www.powercolor.com/Global/products_features.asp?ProductID=1950
149,89€
Powercolor ATI HD 3870 PCS 512MB GDDR4 PCI-E 2.3Ghz
http://www.powercolor.com/Global/products_features.asp?ProductID=1632
158,99€

A 9€ difference.
I know that you know quite a lot about GPUs, but don't guess numbers.
 
:LOL::LOL:

The difference in price between:
Powercolor ATI HD 3870 PCS 512MB GDDR3 PCI-E 1.8Ghz
http://www.powercolor.com/Global/products_features.asp?ProductID=1950
149,89€
Powercolor ATI HD 3870 PCS 512MB GDDR4 PCI-E 2.3Ghz
http://www.powercolor.com/Global/products_features.asp?ProductID=1632
158,99€

A 9€ difference.
I know that you know quite a lot about GPUs, but don't guess numbers.
When Apple lowered the price of the iPhone by $200, what did that tell you about the price of included hardware? Squat!
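For what it's worth, a single-digit euro delta at retail isn't inconsistent with a ~30% chip premium if the memory is only a modest slice of the bill of materials. A sketch with entirely made-up BOM numbers:

```python
# Illustrative only: made-up bill-of-materials numbers showing how a ~30%
# memory-chip premium can surface as a single-digit euro difference at retail.

card_price_gddr3 = 149.89     # EUR, the GDDR3 card above
memory_cost_gddr3 = 25.00     # EUR, assumed cost of the GDDR3 chips (made up)
gddr4_premium = 0.30          # the ~30% chip premium being discussed

extra_chip_cost = memory_cost_gddr3 * gddr4_premium       # 7.50 EUR
print(f"Extra memory cost: {extra_chip_cost:.2f} EUR")
print(f"Implied card price: {card_price_gddr3 + extra_chip_cost:.2f} EUR "
      f"(observed: 158.99 EUR)")
```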
 
Erring on the side of bandwidth is good - but to the extent of ~125GB/s just seems bizarre - or crass just like R600's bandwidth. If there's a big deadzone between the top of GDDR3 and the bottom of GDDR5, well I guess there's no choice. It seems like a rather costly choice though.

Of course, given the scant bandwidth-efficiency analysis of RV670, it does seem rather pointless dwelling on the price/performance of HD4870.

Jawed

Perhaps AMD is erring on the side of bandwidth for the sake of R700. If a single chip fails to consistently make use of most of its bandwidth, then the worst-case scenario where two chips hit some off-balance load on one memory controller won't tank the card's performance. This would presuppose that the ring bus and interconnection between the chips is pretty hefty as well.

Sure, better drivers or software allocation could avoid much of the unbalancing. But maybe it's better to count on higher bandwidth that is physically guaranteed as opposed to praying for software and drivers that are far less likely to work broadly.
 
Why is there a massive difference in bandwidth between the RV770XT and the 770PRO? I mean, just like when R600 was nearing its launch, everyone thought that ATi had "features" that would utilize the massive bandwidth it had. Next thing you know, it was a complete waste of material/money/time, since it didn't do anything good except make the R600XT cost a lot to produce. We know this because RV670 had better performance with less bandwidth.

Yet we have RV770 with even more bandwidth than R600. If RV770 is based on RV670, then one would assume that 125GB/s of bandwidth is pretty ridiculous, seeing as RV670 was never bandwidth limited, as shown by benchmarks all over the web.

Also, the use of GDDR5 could potentially bite AMD back. It's completely brand new and hasn't been used in any hardware so far, which I think is pretty risky since, as of yet, nobody knows how it will perform in terms of performance/power consumption/heat, etc. Being a relatively new memory technology, what about the supply of these memory chips potentially jacking up the price? This kind of reminds me of GDDR2 and NV30.

edit - http://www.digitimes.com/mobos/a20080515PD220.html
Not sure if this is a repost or not, but the HD4870 is supposedly priced at $349, while the HD4850 is $229.
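Taking those prices at face value, together with the ~30% performance delta claimed earlier in the thread (both unconfirmed), the price/performance picture would look roughly like this:

```python
# Price/performance back-of-envelope using the rumoured prices above and the
# ~30% performance delta claimed earlier in the thread (both unconfirmed).

hd4850_price, hd4870_price = 229.0, 349.0
perf_delta = 1.30                          # HD4870 vs HD4850

price_ratio = hd4870_price / hd4850_price  # ~1.52x the price
perf_per_dollar = perf_delta / price_ratio # ~0.85x the perf per dollar

print(f"Price ratio: {price_ratio:.2f}x, performance ratio: {perf_delta:.2f}x")
print(f"Relative performance per dollar: {perf_per_dollar:.2f}x")
```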
 
Quick question for you guys: could one of the main reasons that R700 may implement a single 512-bit memory controller shared between the two chips be to improve multi-card performance in addition to multi-GPU performance? Could one of the main bottlenecks of going 2x4 on the GPU front have been that there were 4 individual memory controllers involved in the transactions? Could this mean some interesting and tangible gains in multi-card performance for AMD? If so, do you think Nvidia will be able to match it with their multi-chip setup?
 
Perhaps AMD is erring on the side of bandwidth for the sake of R700. If a single chip fails to consistently make use of most of its bandwidth, then the worst-case scenario where two chips hit some off-balance load on one memory controller won't tank the card's performance. This would presuppose that the ring bus and interconnection between the chips is pretty hefty as well.

Sure, better drivers or software allocation could avoid much of the unbalancing. But maybe it's better to count on higher bandwidth that is physically guaranteed as opposed to praying for software and drivers that are far less likely to work broadly.
Not my exact thinking but pretty close.
Extra bandwidth for R700's sake while being able to keep a leg up on some of the competition?

Why is there a massive difference in bandwidth between the RV770XT and the 770PRO? I mean, just like when R600 was nearing its launch, everyone thought that ATi had "features" that would utilize the massive bandwidth it had. Next thing you know, it was a complete waste of material/money/time, since it didn't do anything good except make the R600XT cost a lot to produce. We know this because RV670 had better performance with less bandwidth.

Yet we have RV770 with even more bandwidth than R600. If RV770 is based on RV670, then one would assume that 125GB/s of bandwidth is pretty ridiculous, seeing as RV670 was never bandwidth limited, as shown by benchmarks all over the web.
Let's remember the supposed doubling of TMUs, the 50% increase in ALUs and the hopefully tweaked ROPs. These improvements should fill up some of that extra bandwidth.

Also, the use of GDDR5 could potentially bite AMD back. It's completely brand new and hasn't been used in any hardware so far, which I think is pretty risky since, as of yet, nobody knows how it will perform in terms of performance/power consumption/heat, etc. Being a relatively new memory technology, what about the supply of these memory chips potentially jacking up the price? This kind of reminds me of GDDR2 and NV30.

edit - http://www.digitimes.com/mobos/a20080515PD220.html
Not sure if this is a repost or not, but the HD4870 is supposedly priced at $349, while the HD4850 is $229.
Also remember the power savings that GDDR5 offers.
They get early samples, so they know what GDDR5 can do for them before they sign contracts...
Let's not forget that GDDR5 should eventually be cheaper to produce than other GDDR, once it ramps to full production.

Are there any other players in the GDDR world other than Samsung, Hynix and Qimonda?
 
When you guys say "massive difference in bandwidth", just how big do you expect the difference between the PRO and the XT to be, after all? I was doing some speculative math yesterday and I'd expect the highest-end upcoming GeForce to have something like a 30-40% bandwidth advantage over the next best thing.
 
The fact that Digitimes says the 4870X2 is $549 means they intend it to compete with GT200.

I wonder how that will work out.

The $349 tag for the 4870 seems more reasonable to me. It slots right in above the 9800GTX in the current market. If we assume GT260/280 are going to offer price/performance well above that, the price could stand. GT260 might be the interesting one: exactly where it slots in, and whether it offers enough extra performance at >$349 to make the 4870 unattractive.

A couple of things occur to me on the side. One is that AMD has more low-hanging fruit. If the rumors of 64 or 80 TMUs in GT200 are true, let's take the high end of 80: it's only 25% more. AMD would have 100% more than their previous architecture.

Another is that at 240 SPs, Nvidia will probably no longer have a brute-force shader deficit: 240 SPs double-clocked = 480. Plus Nvidia supposedly has the efficiency edge per SP.

It seems AMD would have been better off shooting for something like 48 TMUs; the shaders are already up there near Nvidia's. 48+ TMUs probably could have competed a lot better straight up with GT260/280, at not all that much die cost.

Who knows though, the engineering effort to make such a leap and enlarge the die yet more might have been out of AMD's reach at this time.

I still think AMD's constant philosophy that the newest games require more shaders may make their architecture more forward-looking, though Nvidia will still be faster. The shader/TMU ratio "advantage" will still lie heavily with AMD, if more forthcoming games like Crysis rely intensely on shaders.
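A rough tally of the rumoured unit counts behind the TMU and shader comparisons in the post above (all counts and clocks are thread rumours or assumptions, not confirmed specs):

```python
# Rough tally of the rumoured unit counts discussed above; all counts and
# clocks here are thread rumours / assumptions, not confirmed specs.

g80_tmus, gt200_tmus = 64, 80
rv670_tmus, rv770_tmus = 16, 32

print(f"GT200 TMUs vs G80:   +{gt200_tmus / g80_tmus - 1:.0%}")    # +25%
print(f"RV770 TMUs vs RV670: +{rv770_tmus / rv670_tmus - 1:.0%}")  # +100%

# Shader throughput, counting a MAD as 2 flops per clock per ALU lane.
gt200_flops = 240 * 2 * 1.5e9    # 240 SPs at an assumed ~1.5GHz shader clock
rv770_flops = 480 * 2 * 0.85e9   # rumoured 480 ALU lanes at an assumed ~850MHz core clock
print(f"GT200 MAD rate: {gt200_flops / 1e9:.0f} GFLOPS, "
      f"RV770 MAD rate: {rv770_flops / 1e9:.0f} GFLOPS")
```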
 
=>v_rr: He doesn't mean a 30% difference in the final cost of the card, but just the memory chips. That could very well be the 9€.

=>Rangers: Current nVidia chips have too much texturing power for their own good. I haven't tested it myself, but I heard that synthetic texture fill-rate tests can send the GPU temperature and power consumption skyrocketing. In normal usage, the cards don't use the texturing units to their full potential. So it would appear logical for GT200 to have relatively fewer TUs.

=>mao5: No, it's just the unified shaders that can do GPGPU code (physics for example). The only problem is implementation - will the game devs use it since it's not compatible with nVidia's CUDA and there isn't a unified API like Direct3D for physics?
 