AMD: R8xx Speculation

How soon will Nvidia respond with GT300 to the upcoming ATI RV870 lineup of GPUs?

  • Within 1 or 2 weeks

    Votes: 1 0.6%
  • Within a month

    Votes: 5 3.2%
  • Within couple months

    Votes: 28 18.1%
  • Very late this year

    Votes: 52 33.5%
  • Not until next year

    Votes: 69 44.5%

  • Total voters
    155
  • Poll closed.
Why would there be one before 40nm? AMD has better things to spend its R&D budget on than work it doesn't need to do and that won't help in the longer run...
For all we know, the HD4870 replacement was supposed to have been Jan/Feb 40nm, but now 40nm has been pushed back...

Apart from that, NVidia is expected to refresh its enthusiast SKUs in the same time frame, so the question could read "will AMD be competitive? Or is AMD going to give it up without a refresh?"

As GDDR5 becomes the norm rather than the exception, it makes sense to double the number of ROPs per MC.
I agree.

A 128-bit GDDR5 RV770 on 40nm could definitely be 20% faster than RV770... (remember in that timeframe, 2.5GHz GDDR5 is perfectly reasonable and 40nm promises a pretty nice clockspeed improvement)
Hmm, 20% faster than HD4870?

Being generous, HD4870 needs about 100GB/s on GDDR5 to perform at its current levels. 20% more than that, i.e. 120GB/s is what you're suggesting. How is GDDR5 supposed to get there on only a 128-bit bus? That's ~3.8GHz.

If RV740 is 128-bit 2.5GHz GDDR5, that's 80GB/s, far below HD4870, and hampered further by the slight clock-for-clock loss we've seen so far comparing GDDR3 and GDDR5 at the same speeds.

~20% faster than HD4850 I can accept, but not HD4870.

Jawed
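Jawed's bandwidth arithmetic is easy to sanity-check. A quick sketch (my own, assuming the clock convention used in this thread, where the quoted GDDR5 clock delivers two bits per pin per cycle, so 2.5GHz on 128-bit = 80GB/s):

```python
def gb_per_s(bus_bits, clock_ghz, transfers_per_clock=2):
    # bandwidth = bus width in bytes * clock * transfers per clock
    return bus_bits / 8 * clock_ghz * transfers_per_clock

# RV740-style 128-bit 2.5GHz GDDR5:
print(gb_per_s(128, 2.5))        # 80.0 GB/s -- well short of HD4870

# Clock needed on a 128-bit bus to hit 120GB/s (100GB/s + 20%):
print(120 / (128 / 8 * 2))       # 3.75 GHz, i.e. the "~3.8GHz" above
```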
 
Also, I think I said "GT216" very clearly... GT212 also exists, but that's another chip obviously and presumably aimed at 2Q09
Thanks for this info.

But I wonder what this GT216 would refer to. An intermediate chip between GT212 and the DX11 generation? :???:
 
Hmm, 20% faster than HD4870?

Being generous, HD4870 needs about 100GB/s on GDDR5 to perform at its current levels. 20% more than that, i.e. 120GB/s is what you're suggesting. How is GDDR5 supposed to get there on only a 128-bit bus? That's ~3.8GHz.
I think Arun is counting on a core clock boost, but he's still overestimating how well it can overcome a BW deficit.

Under the assumption that the 4870 is 20% BW limited and using 2.5 GHz GDDR5 (80GB/s @ 128-bit), this new value chip would have to hit 1100MHz to be 20% faster. Not very realistic, IMO, but 10% is doable if the core can reach 965MHz.
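Mintmaster's figures can be reproduced with a simple split-frame model (my reconstruction, not necessarily his exact method): assume 20% of HD4870's frame time scales with bandwidth and the rest with core clock, starting from a 750MHz core and HD4870's actual 115.2GB/s:

```python
# Split-frame model (an assumption on my part): a fraction f_bw of frame time
# scales with bandwidth, the remainder with core clock.
# Baseline: HD4870 at 750MHz core and 115.2GB/s (900MHz GDDR5, 256-bit).
def core_clock_needed(speedup, f_bw=0.2, core0=750.0, bw0=115.2, bw=80.0):
    # 1/speedup = (1 - f_bw) * core0/x + f_bw * bw0/bw  ->  solve for x
    core_share = 1.0 / speedup - f_bw * bw0 / bw
    return (1.0 - f_bw) * core0 / core_share

print(round(core_clock_needed(1.2)))  # ~1100 MHz for a 20% gain
print(round(core_clock_needed(1.1)))  # ~966 MHz for a 10% gain
```

With those baseline assumptions the model lands almost exactly on the 1100MHz and ~965MHz figures quoted above.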
 
Nice break? :smile:
I think Arun is counting on a core clock boost, but he's still overestimating how well it can overcome a BW deficit.

Under the assumption that the 4870 is 20% BW limited and using 2.5 GHz GDDR5 (80GB/s @ 128-bit), this new value chip would have to hit 1100MHz to be 20% faster. Not very realistic, IMO, but 10% is doable if the core can reach 965MHz.
Assuming you're talking about a refresh of HD4870, this would still be 55nm I reckon. Considering that RV770 was downclocked from RV670, I think 965MHz is wildly optimistic.

So, erm, right now I'm thinking that RV740 could make a tasty, very cheap, replacement for HD4850.

But HD4870's refresh still leaves me mystified unless there is a moderate bump in clocks, something we did see with RV670 (in X2 form, 825MHz). But that could be a rather negligible 5-10% :cry:

Jawed
 
I think Arun is counting on a core clock boost, but he's still overestimating how well it can overcome a BW deficit.
Yeah. My intuition is that there are enough parts of the average frame that aren't very BW limited to be able to extract a 20% boost out of a 33%+ clock speed increase (->1GHz) and a small GDDR5 efficiency boost. However, I realize now that I was assuming the 'starting point' would have the same bandwidth as a HD4870, and clearly that makes no sense whatsoever. So yeah, I'm almost certainly wrong here. Also, I was obviously thinking of a 40nm chip... Once again, I have no idea why you want AMD to waste its precious R&D resources on yet another 55nm chip partially overlapping their current line-up.

WRT GT21x, this is what I'm expecting: GT218: 1C/32SP, GT216: 3C/96SP, GT214: 6C/192SP, GT212: 12C/384SP. Given how wrong I was with G92, I hope to only be wrong by 1 to 3 orders of magnitude this time around! :) Sorry for the OT, at this rate I probably should just move this entire sub-discussion to the other thread.
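For what it's worth, the guessed line-up above is just a 32-SPs-per-cluster pattern scaled up (the cluster counts are speculation from the post, not confirmed specs):

```python
# Speculative GT21x cluster counts from the post above; 32 SPs per cluster.
lineup = {"GT218": 1, "GT216": 3, "GT214": 6, "GT212": 12}
for chip, clusters in sorted(lineup.items(), key=lambda kv: kv[1]):
    print(f"{chip}: {clusters}C/{clusters * 32}SP")
```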
 
One thing I wonder is about the "bandwidth efficiency" of RBEs versus MCs.

Could 8 quad-RBEs with a 128-bit bus be more bandwidth efficient than 8 quad-RBEs and a 256-bit bus (both configurations having "the same bandwidth", GDDR5 and GDDR3 respectively, say)?

Presumably the L2 cache, which in RV7xx is evenly split across MCs, would be doubled per MC in a 128-bit version, in order to retain the quantity of L2. Again, is reducing the count of L2 caches going to increase cache-system efficiency, or will that be offset by the increased L2 size?

Alternatively, could we be looking at 4 MCs of 32-bit, instead? So, instead of increasing the number of RBEs per MC, narrow the MC per quad of RBEs.

After all the to-ing and fro-ing that ATI's done with 32-bit and 64-bit MCs, I dunno.

Jawed
 
So yeah, I'm almost certainly wrong here. Also, I was obviously thinking of a 40nm chip... Once again, I have no idea why you want AMD to waste its precious R&D resources on yet another 55nm chip partially overlapping their current line-up.
I think it's reasonable to expect AMD to produce a 40nm RV740, which, with up to 80GB/s, would replace HD4850 while costing much less to build and being notably faster. Another SKU could be slower, say 80-90% of HD4850.

But that doesn't answer the 1-year gap between HD4870 and HD5870. NVidia will have something to put in that gap, I'm sure. And there'd be no reason for AMD to have planned for NVidia not to do that. So, what's AMD going to deploy to compete - I don't buy the R&D wastage argument. You're basically saying it isn't worth the R&D costs to compete for half the year.

RV870 is, according to nomenclature, a full generational increment. It should be way more than 20-50% extra performance - I'm thinking 100%++ extra performance and coming 1 year-ish after RV770 that would seem reasonable, being 40nm.

Jawed
 
Also, I was obviously thinking of a 40nm chip... Once again, I have no idea why you want AMD to waste its precious R&D resources on yet another 55nm chip partially overlapping their current line-up.
I know, and that's why I think ~950MHz is doable. Not sure why Jawed thinks you were talking about 55nm.
 
RV870 is, according to nomenclature, a full generational increment. It should be way more than 20-50% extra performance - I'm thinking 100%++ extra performance and coming 1 year-ish after RV770 that would seem reasonable, being 40nm.

Jawed

I'm thinking along these lines, but maybe there wasn't meant to be a refresh. (And they're lying without remorse :LOL:)
As it currently goes, ATI/AIBs can certainly afford a little ASP slashing to remain competitive at all levels, and if current AIB indications are correct, current RV770 prices are pretty lofty to partners.


I'm not with the higher clocking rumours though - at least not for a new chip.
 
But that doesn't answer the 1-year gap between HD4870 and HD5870. NVidia will have something to put in that gap, I'm sure. And there'd be no reason for AMD to have planned for NVidia not to do that. So, what's AMD going to deploy to compete - I don't buy the R&D wastage argument. You're basically saying it isn't worth the R&D costs to compete for half the year.

Well if you think about it, ATI has been on 1-year refresh cycles since R300, really. 9800 Pro wasn't that much of an upgrade for 9700 Pro users, and then there was the 9800 XT... and then R420 came along almost 2 years after. R420 wasn't replaced until R520 in 2005... about a 1.5-year wait there. I guess there's a little exception here with R580, but anyways you get my point. 1-year cycles are pretty common for ATI.

Development cycles are generally getting longer these days as well, so I wouldn't be surprised in the slightest if ATI didn't have anything to fill the time gap between the RV770 and RV870 releases.
 
Refreshes, as in R420/423->R480, R520->r580, R600->RV670. Though it's fair to say all of these refreshes have different "headlines", with R480 being the weakest.

Also, one could argue that R600, being so massively late (shoulda been autumn 2006?) makes RV670 more like a year later - but 65nm appeared to cause a 1 quarter delay, which I guess applied to 55nm too, so...

Jawed
 
I think it makes economic sense for them to not try and push a refresh product through.

I don't think NV will have a real response until they depart completely from G80's scalar approach... GT200 arch is sorta dead in the water IMO. If they cut it down any, they won't be competitive on performance all the while still costing more to produce. So given the competitive landscape I would argue that there's not much incentive for a refresh (apart from trivial things like mem and clock speeds).

I would also argue that ATI's R&D resources are stretched thin with the development of both fusion and rv870 leaving little room for a hypothetical RV780.
 
I think it makes economic sense for them to not try and push a refresh product through.
Hmm, depends on whether a GX2 based on 55nm chips turns up from NVidia.

I don't think NV will have a real response until they depart completely from G80's scalar approach... GT200 arch is sorta dead in the water IMO.
As Arun has pointed out in the other thread:

http://forum.beyond3d.com/showpost.php?p=1230016&postcount=251

ALU:TEX is due to increase and if NVidia plays with MAD:SF ratio (increasing it) then the effect on the density of compute (single-precision MADs, for the sake of argument, ignoring the rest) will be fairly dramatic. Sure the burden of all that control gubbins will continue to be very heavy, but ALU:TEX + very high shader clocks (compared with ATI - presumably much more achievable with 40nm than with 65nm) will claw back quite a bit of the deficit.

Perhaps NVidia can get down to a ~30% performance per mm2 deficit?...

If they cut it down any they wont be competitive on performance all the while still costing more to produce. So given the competitive landscape I would argue that there's not much incentive for a refresh (apart from trivial things like mem and clock speeds).
Every Christmas from 2004 onwards, NVidia has beaten ATI with some kind of novelty that ATI simply couldn't respond to. Why would AMD expect NVidia's going to behave differently this year (or at least do their darndest)?

I would also argue that ATI's R&D resources are stretched thin with the development of both fusion and rv870 leaving little room for a hypothetical RV780.
OK, so how did ATI produce its fastest ever top-to-bottom architectural replacement this past quarter? HD4xxx GPUs all in one quarter? HD2xxx is moot since R600 was epic in its lateness (November to April?), while X1K was planned to span May-October, roughly.

You could argue that AMD's done this in order to clear the deck for the next tranche of work. Dunno.

It might simply reflect the fact that AMD is not trying to design a 400-600mm2 monolithic ultra-enthusiast GPU - time/money saved there making for an awful lot of progress in other areas...

Jawed
 
Hmm, depends on whether a GX2 based on 55nm chips turns up from Nvidia.

I don't think that's really relevant... ATI will give up that market segment if it comes down to it. Of course it'd be nice if they didn't have to, it's always good to kick NV's ass, but it's not a hard requirement. Anyways I'm very doubtful that a GX2 part is feasible even on 55nm... do you think it is?

As Arun has pointed out in the other thread:

http://forum.beyond3d.com/showpost.p...&postcount=251

ALU:TEX is due to increase and if NVidia plays with MAD:SF ratio (increasing it) then the effect on the density of compute (single-precision MADs, for the sake of argument, ignoring the rest) will be fairly dramatic. Sure the burden of all that control gubbins will continue to be very heavy, but ALU:TEX + very high shader clocks (compared with ATI - presumably much more achievable with 40nm than with 65nm) will claw back quite a bit of the deficit.

Perhaps NVidia can get down to a ~30% performance per mm2 deficit?...

That's very interesting, but they'd still be losing the perf-per-mm2 game, so ATI would still hold the power to set the prices in each performance segment.

Every Christmas from 2004 onwards, NVidia has beaten ATI with some kind of novelty that ATI simply couldn't respond to. Why would AMD expect NVidia's going to behave differently this year (or at least do their darndest)?

It's a big gamble for sure... However GT200 took ages to come out after G80 so ATI could be expecting a similar gap between the GT200 release and the next major revision of the arch.

I really think it all comes down to a resources issue (lack of) in the end though... obviously if they had the resources to do it they would.

You could argue that AMD's done this in order to clear the deck for the next tranche of work. Dunno.

It might simply reflect the fact that AMD is not trying to design a 400-600mm2 monolithic ultra-enthusiast GPU - time/money saved there making for an awful lot of progress in other areas...

By not designing a 400-600mm2 monolithic GPU they aren't saving any money or resources, they just don't have to increase the R&D budget beyond what it already is. RV870 will have what, ~2 billion transistors? That's a pretty huge leap in complexity already. A 500mm2 chip on 40nm would be absolutely massive! 3 billion? 4? It'd require some pretty serious investment.
 
One of the 40nm chips seems to have taped out. A11 revision, it seems. (nVidia uses another way of referring to revisions, no?)

RV740 is obviously the "experience" chip, it seems.
 
How likely is it to see the next X2 implemented with HyperTransport and a NUMA architecture? I heard the FlexIO idea being tossed around a few pages back, but why not HyperTransport, since Fusion will inevitably make use of it too and this'll be a nice practice part?
 
Something like that could only come in handy if the hypothetical card used something other than AFR. Then, perhaps, it wouldn't need to share the memory, but a fast link through which GPU1 could write into GPU2's memory could do the job.
 