AMD: R8xx Speculation

How soon will Nvidia respond with GT300 to the upcoming ATI RV870 lineup of GPUs?

  • Within 1 or 2 weeks

    Votes: 1 0.6%
  • Within a month

    Votes: 5 3.2%
  • Within a couple of months

    Votes: 28 18.1%
  • Very late this year

    Votes: 52 33.5%
  • Not until next year

    Votes: 69 44.5%

  • Total voters
    155
  • Poll closed.

Shtal

Veteran
AMD next generation RV870 key specs revealed?

It is said that AMD's next-generation graphics core may be named RV870 and, depending on TSMC's technology, it will use a 40nm or 45nm process. The core area of RV870 will be about 140 mm², much smaller than RV770's 260 mm². As far as we know, it will have 192 ALUs; with RV770's 5 SPs per ALU, that gives RV870 960 SPs. To keep the core area in check, the bus remains 256-bit. We believe RV870 will deliver 1.2 times the performance of RV770, though this will depend on RV870's clock speed.

It is also said that AMD's next-generation R800 will use a new design. We know Radeon HD3870X2 and the coming Radeon HD4870X2 both use a single-PCB, dual-GPU design, while R800 may use a true dual-core design. If so, AMD's next-generation flagship R800 would be the first dual-core GPU. The specs of R800 would be double those of RV870.

An advanced 45nm (40nm?) process will give RV870 a smaller core area. The current RV770 performs well, but it runs really hot. If RV870 can solve this problem and further improve performance, it will be really exciting for us, and it could be the first real dual-core GPU. http://en.hardspell.com/doc/showcont.asp?news_id=3768
 
140 sq.mm? I don't believe it, at least with all that I/O on the chip... unless 40nm scales the pad areas damn well.
 
The current RV770 performs well, but it runs really hot. If RV870 can solve this problem and further improve performance,

I wonder by how much the temperature would drop?

We know Radeon HD3870X2 and the coming Radeon HD4870X2 both use a single-PCB, dual-GPU design, while R800 may use a true dual-core design.

Well, it's interesting.....

We believe RV870 will deliver 1.2 times the performance of RV770

:( :(
 
The only way those specs would make sense is if they described some form of X2 card and not a single chip. Divide all those specs by two, throw in the 140mm², and it would sort of make sense.

Wasn't RV770 pad-limited with the 256-bit bus at 260mm²? Cutting the die size in half while maintaining that bus just wouldn't make sense.

Targeting a chip for the low end and utilizing 2-4 chip configurations for the mid/high end parts might make sense but I'd think that PCIe bus would get saturated fast at that rate. A really high end card would likely appear as >4 cores in some form of crossfire configuration. I'm not sure the scaling would still hold at that point in time. AFR would definitely be out if that was the direction they took.
 
At 140mm², I think it's time to consider more than just 2 GPUs for R800. ;)

That said - maybe AMD is starting to separate out the 2D stuff as well? That could significantly improve the shrinking potential on future manufacturing processes - at least AFAIK.

So - how about a 5-chip R800? 4 Computing-Containers and one I/O?
 
That would make quite a lot of sense, but only if it were something like this:

RV740 - 128bit - ~140mm² [1.2 TF]
RV870 - 2xRV740 (MCM) - 2x128bit (256bit) - 2x140mm² (280mm²) [2.4TF]
R800 - 2xRV870 (2xMCM) - 2x 256bit - 4x 140mm² (560mm²) [4.8TF]

Now, packaging costs would be quite high, but this would give them an endless list of advantages. If they could pull that off and they found a way not to rely on AFR, so that both ASICs on a package work together (almost) transparently, AMD/ATI would not be limited by process maturity of newer nodes, the volume would be there pretty early, prices could be _very_ competitive again and they could also minimize tape-out costs.

With GDDR5 maturing - mass availability by next year and lower prices (more suppliers) - bandwidth per core (ASIC) would be about equal to where it is now.
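The MCM stack above is straight multiplication of the single-chip figures; as a sketch (the RV740 numbers are the speculative ones from this post, and linear scaling ignores packaging overhead and inter-chip links):

```python
# Hypothetical RV740-based lineup from the post above (speculative figures only).
rv740 = {"bus_bits": 128, "area_mm2": 140, "tflops": 1.2}

def scale(name, n_chips):
    # Linear scaling by chip count; a real MCM would pay packaging/link overhead.
    return (name, n_chips * rv740["bus_bits"], n_chips * rv740["area_mm2"],
            n_chips * rv740["tflops"])

lineup = [scale("RV740", 1), scale("RV870 (MCM)", 2), scale("R800 (2x MCM)", 4)]
for name, bus, area, tf in lineup:
    print(f"{name}: {bus}-bit aggregate, {area} mm^2 total, {tf:.1f} TFLOPS")
```

Whether the aggregate bus behaves like a single wide bus or two private ones is exactly the AFR/transparency question raised above.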
 
Last edited by a moderator:
If it's only going to be a 20% performance increase over RV770, then the specs should be something like this:
RV870
800 MHz GPU
192 × 5D = 960 SPs
20 ROPs
44 or 48 TMUs
256-bit GDDR5, ~120 to 160 GB/s bandwidth.
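For what it's worth, here is the arithmetic those speculated specs imply (all inputs are the rumoured numbers; the 120-160 GB/s range corresponds to roughly 3.75-5 Gbps per pin on a 256-bit bus):

```python
# Shader throughput implied by the speculated specs (rumoured, not confirmed).
sps, core_mhz = 960, 800
tflops = sps * 2 * core_mhz * 1e6 / 1e12   # 2 FLOPs (multiply-add) per SP per clock
print(f"{tflops:.3f} TFLOPS")              # vs RV770's 800 SPs * 2 * 750 MHz = 1.2 TFLOPS

# The quoted bandwidth range on a 256-bit bus:
bus_bytes = 256 // 8
for gbps_per_pin in (3.75, 5.0):
    print(f"{gbps_per_pin} Gbps/pin -> {bus_bytes * gbps_per_pin:.0f} GB/s")
```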
 
If ROPs are tied to memory bus width, then they'd stay at 16. TMUs are tied to the shader core, so they'd be at 48. The rest of their numbers are just naive estimates: 960 / 800 = 1.2, 260 * 40^2 / 55^2 ~= 140.

They seem pretty confident about the 960 (though last time we heard both 480 and 800 :devilish:). How high is GDDR5 supposed to clock by next year? Could they get close to 4870 bandwidth with a ~128-bit bus?
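Spelling out that back-of-envelope arithmetic (RV770's 800 SPs and 260 mm² at 55nm are the knowns; perfectly linear area scaling is the assumption):

```python
# The two back-of-envelope numbers behind the rumour.
rv770_sps, rv870_sps = 800, 960
print(rv870_sps / rv770_sps)  # 1.2 -> the "1.2x RV770 performance" figure

# Ideal (perfectly linear) area scaling from 55 nm down to 40 nm:
rv770_area_mm2 = 260.0
rv870_area_mm2 = rv770_area_mm2 * (40 / 55) ** 2
print(f"{rv870_area_mm2:.0f} mm^2")  # ~138 mm^2 -> the "about 140 mm^2" figure
```

Pad and I/O area don't shrink linearly with the process, which is exactly the objection raised earlier in the thread.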
 
If ROPs are tied to memory bus width, then they'd stay at 16.
I could see them move to 32 if GDDR5 got fast enough and they couldn't increase the GPU clock enough. If the core is 800 MHz and GDDR5 reaches 3 GHz, then 8 ROPs per 64-bit memory channel is actually 18% more BW per ROP than what the 4850 (only 4 ROPs per 64-bit channel) gets.

RV770 is 10 SIMDs with one Quad-TMU each.
This rumour suggests RV870 is 12 SIMDs, and one Quad-TMU each would equal Pete's claim.
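The 18% figure checks out once you normalize bandwidth per ROP per core clock; a sketch assuming a 4850 at 625 MHz core with 993 MHz GDDR3 (~1.986 Gbps per pin), versus the hypothetical 800 MHz part with GDDR5 at a 3 GHz data clock (6 Gbps per pin):

```python
def bw_per_rop_per_clock(gbps_per_pin, rops_per_channel, core_mhz):
    """Bytes of memory bandwidth per ROP per core clock, for one 64-bit channel."""
    channel_gbs = 64 / 8 * gbps_per_pin              # GB/s on a 64-bit channel
    return channel_gbs * 1e9 / rops_per_channel / (core_mhz * 1e6)

# HD4850: 625 MHz core, 993 MHz GDDR3 (~1.986 Gbps/pin), 4 ROPs per channel.
hd4850 = bw_per_rop_per_clock(1.986, 4, 625)
# Hypothetical part: 800 MHz core, GDDR5 at 6 Gbps/pin, 8 ROPs per channel.
hypo = bw_per_rop_per_clock(6.0, 8, 800)
print(f"{hypo / hd4850 - 1:.0%} more BW per ROP per clock")  # -> 18%
```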
 
The 4870's memory runs at 900 MHz. Samsung is transitioning to the K4G52324FG-HC05 memory chip; the last two digits give the speed grade of 0.5 ns, which would be 2 GHz.

http://www.samsung.com/global/business/semiconductor/productInfo.do?fmly_id=675&partnum=K4G52324FG
No, that's not the same clock. The 05 part would be a base clock of 1 GHz, and the 06 part would even be too slow for the HD4870...
I don't think anything faster than 2.5 GHz will be available early next year. Faster parts may appear eventually...
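The dispute above is about which clock the grade refers to; the suffix itself just encodes a cycle time. A sketch of the 1/tCK arithmetic (part numbers as quoted in the thread; how that clock maps to an effective data rate per pin differs between GDDR3 and GDDR5):

```python
def grade_to_clock_mhz(suffix):
    """Decode a Samsung-style speed grade ('05' -> 0.5 ns, '06' -> 0.6 ns) to a clock in MHz."""
    ns = int(suffix) / 10.0
    return 1000.0 / ns

print(grade_to_clock_mhz("05"))  # 2000.0 MHz (a 0.5 ns cycle time)
print(grade_to_clock_mhz("06"))  # ~1666.7 MHz
```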
 
Is it conceivable that the "crossfire link" on each chip could be used on an MCM package to share a memory controller, given that both chips would be on the same package using the same pins?

If so, would this result in a pseudo-256-bit bus shared by both cores with load balancing, or in memory sharing between two exclusive (perhaps 128-bit) buses, or both?

Either/or would be a boon to the mgpu market.
 
MCM packaging on a graphics card seems like an illogical choice, from both a cost perspective and a technical standpoint, IMHO.
A GPU doesn't work like a CPU, no matter how much anyone tries to spin it...
An MCM would only make sense if the cost of making the package with the two GPUs and the memory controller were lower than building it all into one chip, and if the shared memory controller performed substantially better and consumed less die space than an otherwise built-in one.

Yet even future Intel Atom CPUs (not to mention "Nehalem" and Athlon64/Phenom) are moving away from that, integrating memory controllers directly into the core.
That should tell us something...
 
Last edited by a moderator:
You mean 2x performance!

Oh, really? How much L1/L2 cache would you have to build into a GPU core to compensate for the lousy communication to an off-core die where the memory controller would reside?
Now multiply the resulting GPU by 2, add the external memory controller and the decidedly more complex packaging of all three chips, and do the math...

Even Nvidia eventually folded NV45's BR02 (which was an MCM package, after all) and G80's NVIO functionality back into the GPU core...
In a special-purpose chip design, it is especially useful to keep off-chip communication to and from co-processors to a minimum.
Look at how simply switching to a PCIe 2.0 bridge made the current Crossfire-on-a-Stick solution scale performance that much better than its predecessor (PCIe 1.1 bridge).
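The bridge switch doubles the raw link rate: PCIe 1.1 signals at 2.5 GT/s per lane and PCIe 2.0 at 5 GT/s, both with 8b/10b encoding. A quick sketch of the x16 numbers:

```python
# Per-direction link bandwidth for PCIe 1.1 vs 2.0 (both use 8b/10b encoding).
def pcie_gbs(gen, lanes=16):
    gtps = {1: 2.5, 2: 5.0}[gen]       # GT/s per lane
    return gtps * 8 / 10 / 8 * lanes   # GB/s after 8b/10b overhead

print(f"PCIe 1.1 x16: {pcie_gbs(1):.0f} GB/s per direction")  # 4 GB/s
print(f"PCIe 2.0 x16: {pcie_gbs(2):.0f} GB/s per direction")  # 8 GB/s
```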
 
Last edited by a moderator:
Look at how simply switching to a PCIe 2.0 bridge made the current Crossfire-on-a-Stick solution scale performance that much better than its predecessor (PCIe 1.1 bridge).

I did not realize that there were any X2 boards with 2.0 PLX chips on the market yet...
 
Oh, really? How much L1/L2 cache would you have to build into a GPU core to compensate for the lousy communication to an off-core die where the memory controller would reside?
Now multiply the resulting GPU by 2, add the external memory controller and the decidedly more complex packaging of all three chips, and do the math...

Even Nvidia eventually folded NV45's BR02 (which was an MCM package, after all) and G80's NVIO functionality back into the GPU core...
In a special-purpose chip design, it is especially useful to keep off-chip communication to and from co-processors to a minimum.
Look at how simply switching to a PCIe 2.0 bridge made the current Crossfire-on-a-Stick solution scale performance that much better than its predecessor (PCIe 1.1 bridge).

Maybe ATI/AMD has some kind of idea that we don't know about... Maybe using 512-bit GDDR5 instead of 256-bit GDDR5.
EDIT: I still believe ATI's R600 with its 512-bit bus was an experimental project for the future. (Because everyone knows R600's 512-bit bandwidth was largely wasted.) Unless I'm wrong and it was simply a mistake on ATI's part.
 
Last edited by a moderator: