AMD: R8xx Speculation

Shtal · Jul 19, 2008

AMD next generation RV870 key specs revealed?

It is said that AMD's next-generation graphics core may be named RV870 and, going by TSMC's process technology, it will be built on 40nm or 45nm. The core area of RV870 will be about 140mm², much smaller than RV770's 260mm². As far as we know, it will have 192 ALUs. In RV770 each ALU comprises 5 SPs, so RV870 would have 960 SPs. To keep the core area under control, the memory bus stays at 256-bit. We believe RV870 will deliver about 1.2 times RV770's performance, but that will depend on RV870's clock.

It is also said that AMD's next-generation R800 will use a new design. The Radeon HD3870X2 and the coming Radeon HD4870X2 both use a single PCB with two graphics cores, while R800 will possibly use a real dual-core design. If so, AMD's next-generation flagship R800 will be the first dual-core GPU. The specs of R800 will be double those of RV870.

The advanced 45nm (40nm?) process will give RV870 a smaller core area. The current RV770 performs well, but its temperature is really terrible. If RV870 can solve this problem and further improve performance, it will be really exciting for us, and it will possibly be the first real dual-core GPU. http://en.hardspell.com/doc/showcont.asp?news_id=3768

fellix · Jul 19, 2008

140 sq.mm? I don't believe it, at least not with all that I/O in the chip... unless 40nm scales the pad areas damn well.

Shtal · Jul 19, 2008

The current RV770 did well in performances but the temperature is really terrible. If RV870 can settle this problem and further improve performances,

I wonder by how much temperature drop?

we know Radeon HD3870X2 and the coming Radeon HD4870X2 both used single PCB+dual graphic core design while R800 will possible use real dual core design.

Well, it's interesting.....

We believe RV870 will be 1.2 times than RV770 in performances

Anarchist4000 · Jul 19, 2008

The only way those specs would make sense is if it were some form of x2 card and not a chip. Divide all those specs by two and throw in the 140mm2 and it would sort of make sense.

Wasn't RV770 pad limited with the 256bit bus at 260mm2? Cutting the die size in half and maintaining that bus just wouldn't make sense.

Targeting a chip for the low end and utilizing 2-4 chip configurations for the mid/high end parts might make sense but I'd think that PCIe bus would get saturated fast at that rate. A really high end card would likely appear as >4 cores in some form of crossfire configuration. I'm not sure the scaling would still hold at that point in time. AFR would definitely be out if that was the direction they took.

CarstenS · Jul 19, 2008

At 140mm² I think it's time to consider more than just 2 GPUs for R800.

That said - maybe AMD starts to separate out the 2D stuff as well? That way they could significantly improve the shrinking potential on future manufacturing processes - at least AFAIK.

So - how about a 5-chip R800? 4 Computing-Containers and one I/O?

Sunrise · Jul 19, 2008

That would make quite a lot of sense, but only if it were something like this:

RV740 - 128bit - ~140mm² [1.2 TF]
RV870 - 2xRV740 (MCM) - 2x128bit (256bit) - 2x140mm² (280mm²) [2.4TF]
R800 - 2xRV870 (2xMCM) - 2x 256bit - 4x 140mm² (560mm²) [4.8TF]

Now, packaging costs would be quite high, but this would give them an endless list of advantages. If they could pull that off and they found a way not to rely on AFR, so that both ASICs on a package work together (almost) transparently, AMD/ATI would not be limited by process maturity of newer nodes, the volume would be there pretty early, prices could be _very_ competitive again and they could also minimize tape-out costs.

With GDDR5 maturing, mass availability by next year and lower prices (more semis), BW per core (ASIC) would be about equal to where it is now.

Shtal · Jul 19, 2008

If it's only going to be a 20% performance increase over RV770, then the specs should look like this:
RV870
800MHz GPU
192 x 5D = 960 SPs
20 ROPs
44 or 48 TMUs
256-bit GDDR5, ~120 to 160GB/s bandwidth

Pete · Jul 19, 2008

If ROPs are tied to memory bus width, then they'd stay at 16. TMUs are tied to the shader core, so they'd be at 48. The rest of their numbers are just naive estimates: 960 / 800 = 1.2, 260 * 40^2 / 55^2 ~= 140.

They seem pretty confident about the 960 (though last time we heard both 480 and 800). How high is GDDR5 supposed to clock by next year? Could they get close to 4870 bandwidth with a ~128-bit bus?
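Pete's two back-of-the-envelope figures are easy to reproduce. A minimal sketch, assuming ideal scaling of die area with the square of the feature size (pads and I/O will shrink far less, as fellix points out above); the function name is made up for illustration:

```python
def ideal_shrink_area(area_mm2: float, old_nm: float, new_nm: float) -> float:
    """Die area after a shrink, assuming every feature scales with the node."""
    return area_mm2 * (new_nm / old_nm) ** 2


# Rumoured RV870 SP count over RV770's SP count.
sp_ratio = 960 / 800

# RV770 (260 mm^2 at 55nm) shrunk to 40nm under the ideal-scaling assumption.
area_40nm = ideal_shrink_area(260, 55, 40)

print(f"SP ratio: {sp_ratio:.2f}")
print(f"Ideal 40nm area: {area_40nm:.1f} mm^2")
```

Both rumoured numbers fall straight out of RV770's specs, which is why they look like estimates rather than leaked data.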

AnarchX · Jul 19, 2008

Pete said:
If ROPs are tied to memory bus width, then they'd stay at 16. TMUs are tied to the shader core, so they'd be at 48.

RV770 is 10 SIMDs with one Quad-TMU each.

Pete said:
Could they get close to 4870 bandwidth with a ~128-bit bus?

3.6GHz GDDR5?

The highest number I heard was 3.5GHz, which will probably be reached in 2010.
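The 128-bit question comes down to bus width times per-pin data rate. A rough sketch, assuming the HD 4870's GDDR5 moves 3.6 Gbps per pin (900 MHz command clock x 4) and that the "GHz" figures quoted in this thread are half the per-pin data rate; the helper names are illustrative only:

```python
def bandwidth_gb_s(bus_bits: int, gbps_per_pin: float) -> float:
    """Peak memory bandwidth in GB/s for a bus width and per-pin data rate."""
    return bus_bits / 8 * gbps_per_pin


def required_gbps_per_pin(target_gb_s: float, bus_bits: int) -> float:
    """Per-pin data rate needed to reach a target bandwidth on a given bus."""
    return target_gb_s * 8 / bus_bits


# Assumption: HD 4870 = 256-bit bus at 3.6 Gbps per pin.
hd4870_gb_s = bandwidth_gb_s(256, 3.6)

# Data rate a 128-bit bus would need to match it.
needed_gbps = required_gbps_per_pin(hd4870_gb_s, 128)

print(f"HD 4870: {hd4870_gb_s:.1f} GB/s")
print(f"128-bit needs {needed_gbps:.1f} Gbps per pin "
      f"(quoted here as {needed_gbps / 2:.1f} GHz)")
```

Under those assumptions a 128-bit bus has to run twice as fast per pin as the 4870's memory, which is where the 3.6GHz figure comes from.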

Sinistar · Jul 19, 2008

The 4870 memory is running at 900 MHz. Samsung is transitioning to the K4G52324FG-HC05 memory chip; the last two digits give the speed, 0.5ns, which would be 2 GHz.

http://www.samsung.com/global/business/semiconductor/productInfo.do?fmly_id=675&partnum=K4G52324FG

Mintmaster · Jul 19, 2008

Pete said:
If ROPs are tied to memory bus width, then they'd stay at 16.

I could see them move to 32 if GDDR5 got fast enough and they couldn't increase the GPU clock enough. If the core is 800 MHz and GDDR5 reaches 3 GHz, then 8 ROPs per 64-bit memory channel is actually 18% more BW per ROP than what the 4850 (only 4 ROPs per 64-bit channel) gets.

AnarchX said:
RV770 is 10 SIMDs with one Quad-TMU each.

This rumour suggests RV870 is 12 SIMDs, and one Quad-TMU each would equal Pete's claim.
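The 18% figure can be checked by comparing memory bytes available per ROP per core clock. A sketch under stated assumptions that are not in the post itself: the HD 4850 runs a 625 MHz core with GDDR3 at 993 MHz (1.986 Gbps per pin), and "3 GHz" GDDR5 is read as 6 Gbps per pin:

```python
def bytes_per_rop_clock(channel_bits: int, gbps_per_pin: float,
                        rops_per_channel: int, core_ghz: float) -> float:
    """Memory bytes available to each ROP on every core clock."""
    channel_gb_s = channel_bits / 8 * gbps_per_pin
    return channel_gb_s / rops_per_channel / core_ghz


# Assumed HD 4850 figures: 4 ROPs per 64-bit channel, 625 MHz core, 1.986 Gbps.
hd4850 = bytes_per_rop_clock(64, 1.986, 4, 0.625)

# Hypothetical part from the post: 8 ROPs per channel, 800 MHz core, 6 Gbps.
rumoured = bytes_per_rop_clock(64, 6.0, 8, 0.800)

gain = rumoured / hd4850 - 1
print(f"4850: {hd4850:.2f} B/ROP/clk, rumoured: {rumoured:.2f} B/ROP/clk")
print(f"Gain: {gain:.0%}")
```

With those inputs the result lands at roughly 18%, so doubling the ROPs per channel would not starve them relative to the 4850.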

mczak · Jul 20, 2008

Sinistar said:
The 4870 memory is running at 900 MHz. Samsung is transitioning to the K4G52324FG-HC05 memory chip; the last two digits give the speed, 0.5ns, which would be 2 GHz.

http://www.samsung.com/global/business/semiconductor/productInfo.do?fmly_id=675&partnum=K4G52324FG

No, that's not the same clock. The 05 part would be a base clock of 1Ghz, and the 06 part would be even too slow for the HD4870...
I don't think anything faster than 2.5Ghz will be available early next year. Eventually faster parts may appear...

Sinistar · Jul 20, 2008

http://www.samsung.com/global/system/business/semiconductor/family/2008/5/26/511176Graphics_code.pdf

14~15. Speed ( Wafer/Chip Biz/BGD : 00 )

04 : 0.4ns (2500MHz)
4A : 0.416ns (2400MHz)
4B : 0.434ns (2300MHz)
4C : 0.454ns (2200MHz)
4D : 0.476ns (2100MHz)
05 : 0.5ns (2000MHz)
5B : 0.526ns (1900MHz)
5C : 0.555ns (1800MHz)
5D : 0.588ns (1700MHz)

Or are they comparing it to GDDR3?

mczak · Jul 20, 2008

One is command clock, the other data clock.
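The two sets of numbers reconcile once the clocks are separated. A small sketch, assuming the speed code's cycle time refers to the GDDR5 data clock and that the data clock runs at twice the command clock (which is what a 1GHz base clock for the 05 part implies); the 0.6ns value for an "06" grade is an assumption following the code pattern, not a figure from the Samsung list:

```python
# Cycle times in ns, copied from the Samsung speed-code list quoted above.
SPEED_CODES_NS = {
    "04": 0.4, "4A": 0.416, "4B": 0.434, "4C": 0.454, "4D": 0.476,
    "05": 0.5, "5B": 0.526, "5C": 0.555, "5D": 0.588,
}


def data_clock_mhz(cycle_ns: float) -> float:
    """Frequency in MHz for a given cycle time in ns."""
    return 1000.0 / cycle_ns


def command_clock_mhz(cycle_ns: float) -> float:
    """Command clock, assumed to be half the data clock on GDDR5."""
    return data_clock_mhz(cycle_ns) / 2


for code, ns in SPEED_CODES_NS.items():
    print(f"{code}: data {data_clock_mhz(ns):6.0f} MHz, "
          f"command {command_clock_mhz(ns):6.0f} MHz")

# Hypothetical "06" grade at 0.6ns, compared with the 4870's 900 MHz.
print(f"06: command {command_clock_mhz(0.6):.0f} MHz")
```

Read this way, the 05 part is only about 11% faster than what the 4870 already runs, not more than twice as fast.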

turtle · Jul 20, 2008

Is it conceivable that the "crossfire link" on each chip could be used on an MCM package to share a memory controller given they would be on the same package using the same pins?

If so, would this result in a pseudo-256-bit bus for both cores that would use load balancing, or rather memory sharing between both exclusive perhaps 128-bit buses, or both?

Either/or would be a boon to the mgpu market.

INKster · Jul 20, 2008

MCM core packaging on a graphics card seems like an illogical choice, from both a cost perspective and a technical point of view, IMHO.
A GPU doesn't work like a CPU, no matter how much anyone tries to spin it...
An MCM would only make sense if the package with the two GPUs and the memory controller cost less to make than building them all into one die, and if the shared memory controller performed substantially better and consumed less die space than a built-in one.

Yet even future Intel Atom CPUs (not to mention "Nehalem" and Athlon64/Phenom) are moving away from that, by integrating memory controllers directly into the core.
That should tell us something...

Shtal · Jul 20, 2008

INKster said:
MCM core packaging on a graphics card seems like an illogical choice, IMHO.
A GPU doesn't work like a CPU, no matter how much anyone tries to spin it...

You mean 2x performance!

INKster · Jul 21, 2008

Shtal said:
You mean 2x performance!

Oh, really? How much L1/L2 cache would you have to build into a GPU core to compensate for the lousy communication with an off-core die where the memory controller would reside?
Now multiply the resulting GPU by 2, add the external memory controller and the decidedly more complex packaging of all three chips and do the math...

Even Nvidia eventually moved NV45's BR02 (that was an MCM package, after all) and G80's NVIO functionality back into the GPU core...
In a special-purpose chip design, it is especially useful if you can keep off-chip communication to and from co-processors down to a minimum.
Look at how simply switching to a PCIe 2.0 bridge made the current Crossfire-on-a-Stick solution scale performance that much better than its predecessor (PCIe 1.1 bridge).

Sound_Card · Jul 21, 2008

INKster said:
Look at how simply switching to a PCIe 2.0 bridge made the current Crossfire-on-a-Stick solution scale performance that much better than its predecessor (PCIe 1.1 bridge).

I did not realize that there were any x2 boards with 2.0 PLX chips on them in the market yet...

Shtal · Jul 21, 2008

INKster said:
Oh, really? How much L1/L2 cache would you have to build into a GPU core to compensate for the lousy communication with an off-core die where the memory controller would reside?
Now multiply the resulting GPU by 2, add the external memory controller and the decidedly more complex packaging of all three chips and do the math...

Even Nvidia eventually moved NV45's BR02 (that was an MCM package, after all) and G80's NVIO functionality back into the GPU core...
In a special-purpose chip design, it is especially useful if you can keep off-chip communication to and from co-processors down to a minimum.
Look at how simply switching to a PCIe 2.0 bridge made the current Crossfire-on-a-Stick solution scale performance that much better than its predecessor (PCIe 1.1 bridge).

Maybe ATI/AMD has some kind of idea that we don't know about.... Maybe using 512-bit GDDR5 instead of 256-bit GDDR5.
EDIT: I still believe that ATI's R600 with its 512-bit bus was an experimental project for the future (because everyone knows that R600's available 512-bit bandwidth was a waste). Unless I'm underrating ATI by mistake.


