ATI RV740 review/preview

Silent_Buddha · Mar 31, 2009

Jawed said:
Only 40% of RV770 is clusters, about 390M transistors. Of that, I estimate 64% is ALUs, excluding the redundant ALU lanes, which I like to lump in with the TU when looking at a die shot and which makes scaling estimates a bit simpler.

So RV770 has about 250M transistors for its 800 ALU lanes and 140M for the rest. RV740's 640 lanes would be ~200M transistors, subject to new functionality and lack of double precision. The clusters, as a whole, would be about 312M, leaving, as you observe, a hell of a lot of transistors, 514M, for MCs, RBEs, the hub, PCI Express etc.

This compares with ~566M in RV770.

Analogue stuff isn't supposed to shrink particularly well - dunno how to account for that.

Jawed

Aren't the transistors for the ALUs significantly more densely packed than the rest of the chip? If so, isn't it possible that it has more transistors than the die area alone would suggest?

Regards,
SB
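The transistor budget in the quoted estimate can be reproduced with a few lines of arithmetic. This is only a sketch of that back-of-the-envelope scaling: the constants are the figures given in the quote, the chip totals are simply the sum of the quoted parts, and the helper name `scale_by_lanes` is made up for illustration.

```python
# Figures from the quoted estimate, in millions of transistors.
RV770_CLUSTERS_M = 390   # "about 390M" for the clusters
RV770_REST_M = 566       # MCs, RBEs, hub, PCI Express etc. on RV770
RV740_REST_M = 514       # the same category on RV740
ALU_SHARE = 0.64         # estimated ALU share of the cluster transistors
RV770_LANES = 800
RV740_LANES = 640


def scale_by_lanes(transistors_m, lanes_from, lanes_to):
    """Linear scaling with lane count (hypothetical helper, naive by design)."""
    return transistors_m * lanes_to / lanes_from


rv770_alus = RV770_CLUSTERS_M * ALU_SHARE                 # ~250M
rv770_cluster_rest = RV770_CLUSTERS_M - rv770_alus        # ~140M
rv740_alus = scale_by_lanes(rv770_alus, RV770_LANES, RV740_LANES)          # ~200M
rv740_clusters = scale_by_lanes(RV770_CLUSTERS_M, RV770_LANES, RV740_LANES)  # 312M

# Totals implied by adding the quoted parts together.
rv770_total = RV770_CLUSTERS_M + RV770_REST_M
rv740_total = rv740_clusters + RV740_REST_M

print(f"RV770 ALUs ~{rv770_alus:.0f}M, other cluster logic ~{rv770_cluster_rest:.0f}M")
print(f"RV740 ALUs ~{rv740_alus:.0f}M, clusters ~{rv740_clusters:.0f}M")
print(f"Implied totals: RV770 ~{rv770_total:.0f}M, RV740 ~{rv740_total:.0f}M")
```

The linear scaling ignores the caveats in the quote itself (new functionality, no double precision), so treat the results as rough.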

Jawed · Mar 31, 2009

Transistor density will vary all over - typically memory is much denser. e.g. Cell has 4x the density in its local store memory than for general logic within each SPE.

With the clusters there's a lot of memory - out of those 389M transistors there's 2.5MB of register file (that's counting 16 of the 17 "pixels" - the 17th is for redundancy) and a load of L1 cache and some LDS and GDS.

Outside of the clusters there's L2 cache and various buffers, including those associated with the RBEs.

A quick check with a die photo indicates that what looks like register file is ~32% of the area of a "pixel". Without knowing the real ratio between the density of the memory and the rest we're stuck.

So the transistor density guessing game is pretty naive and really just for entertainment.

Jawed
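To show why the unknown density ratio matters so much, here is a minimal sketch that converts an area share into a transistor share. It assumes a region is split into just two kinds of circuitry (memory and everything else) with a single density ratio between them; the 32% area figure is from the post above, while the ratios tried, including the 4x borrowed from the Cell example, are purely illustrative and not measurements of any ATI chip.

```python
def transistor_share(area_fraction, density_ratio):
    """Share of transistors held by a block covering `area_fraction` of the
    area, if it is `density_ratio` times denser than the remaining logic."""
    dense = area_fraction * density_ratio
    rest = 1.0 - area_fraction
    return dense / (dense + rest)


REGISTER_FILE_AREA = 0.32  # "~32% of the area of a pixel", from the post

# Hypothetical density ratios; the real one is unknown.
for ratio in (1.0, 2.0, 4.0):
    share = transistor_share(REGISTER_FILE_AREA, ratio)
    print(f"memory {ratio:.0f}x denser -> {share:.0%} of the transistors")
```

Depending on the ratio assumed, the same die photo supports very different transistor splits, which is the point about the guessing game being naive.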

Arty · Mar 31, 2009

And we've got the GeForce 9600 GT. Just a little more performance in some games, maybe a little less in others, with roughly the same cost. But if you want any more than that, you'll want to wait about a month.

For our ~$100 price point (plus or minus a bit) we are going to strongly recommend that people wait for about a month. This price point will be shaken up a bit in about that time and we really aren't comfortable recommending anyone purchase something in this market until sometime in early May. This may or may not further compress the sub $100 market, but there really isn't much more room down there, so we don't expect much change except at right around $100.

~$100 Recommendation: IT'S A TRAP!!! (wait about a month)

It just so happens that this price point is also the highest volume price point. Certainly neither AMD nor NVIDIA will be happy that we recommend waiting, but this is all about the consumer. If you are going to spend about $100 on a video card, just try really hard to wait a little bit longer.

Derek Wilson from Anandtech's Video Card Buyer's Guide - Spring 2009

Seems like a strong premature recommendation for the RV740 unless they are talking about the 9800GT Tipexx edition.

keritto · Apr 3, 2009

hkultala said:
8800GT is MUCH more expensive to manufacture than RV740, so it really cannot compete.

(G92 is 334 mm^2 with a 256-bit bus; RV740 is about 125 mm^2 (my estimate based on RV770 die size, mfg process and number of processing units) with a 128-bit bus)

This makes you sound like some ATI marketing executive.

I don't see why I would be too happy about a die shrink if I don't get anything for it. Power consumption is pretty much the same, and what bothers me is that RV740 seems to do only single-precision operations?! How can they call it DX10.1?? envidia wins another match again, like with D3c, which made its way out after the R200/R300 generation. :???:

And 256mm2 divided by 2 (in the best circumstances, not even counting the obvious leakage-power problem on 40nm they mentioned) isn't as small as 125mm2. And shaders weigh a lot less in the overall die size than the TMUs and RBEs, which are roughly the same (40 vs 32 TMUs?). Well, after all, that's what makes ATi's architecture win over nV.

AlexV · Apr 3, 2009

keritto said:
This makes you sound like some ATI marketing executive. I don't see why I would be too happy about a die shrink if I don't get anything for it. Power consumption is pretty much the same, and what bothers me is that RV740 seems to do only single-precision operations?! How can they call it DX10.1?? envidia wins another match again, like with D3c, which made its way out after the R200/R300 generation.

So you know how the RV740 looks, that's good. The bolded part makes no sense, please elaborate.

AnarchX · Apr 3, 2009

hkultala said:
(G92 is 334 mm^2 with a 256-bit bus; RV740 is about 125 mm^2 (my estimate based on RV770 die size, mfg process and number of processing units) with a 128-bit bus)

G92b (55nm) is 270mm² and RV740 136mm² according to AMD.

Of course that's still a big difference, but I think we can assume that 40nm wafer prices are higher than 55nm ones and that yields should also be lower.

On the other hand, folks at Chiphell reported that HD4770 needs an 8-layer PCB, while the 9800 GT is now 6-layer at most partners.
Both cards use 8 memory chips; for HD4770 I think that's because 512Mbit 0.55ns chips are cheaper than 0.5ns 1Gbit ones.

And in the end, GT215/214 (NV's 40nm competitor) should not be so far away.

CarstenS · Apr 3, 2009

keritto said:
envidia wins another match again, like with D3c, which made its way out after the R200/R300 generation.

What you mean is 3Dc, and it was introduced in the R4xx generation. And it did not disappear, but was integrated into DirectX, because it was found to be quite useful.

keritto · Apr 3, 2009

Jawed said:
http://www.istartedsomething.com/20081029/windows-7-dwm-cuts-memory-consumption-by-50/

As long as NVidia continues to market D3D10 GPUs once W7 arrives, which could be quite a long time, they'll be fighting this. Unless the newest GT2xx GPUs have 10.1 support, of course.

Even their presenter calls that dce a hog.

Anyway, it'll be pretty interesting when microcrap tries to explain to us why they use dce on obsolete engines, when they released dx9.0b on Vista, which introduced dx10, and dx10.1 on NT7, which will presumably have the new dx11.

On the other hand, they have a "compatibility desktop mode" in Vista for pre-dx9.0b cards, so they'll have one in their new bloat too,

and they won't even try to fix it, because as we know dx10 is not as prehistoric next to dx10.1 as dx9.0b might be, for MS's sake, next to dx7+.

And on top of that, rv740 loses the double precision of its older r700 sisters. Doesn't that make it only dx10 compatible, since dx10.1 requires double-precision shader operations?

AlexV said:
So you know how the RV740 looks, that's good. The bolded part makes no sense, please elaborate.

Unfortunately I read that moderately long thread, and I gave you the reference above. And it seems you've locked me out of posting, so I don't see how you expect me to elaborate.

AnarchX said:
On the other hand, folks at Chiphell reported that HD4770 needs an 8-layer PCB, while the 9800 GT is now 6-layer at most partners.
Both cards use 8 memory chips; for HD4770 I think that's because 512Mbit 0.55ns chips are cheaper than 0.5ns 1Gbit ones.

Hm, it doesn't sound reasonable to use an 8-layer PCB with a 128-bit wide memory bus. Shouldn't they jump over to 128-bit just to reduce the PCB layers from 6 to 4 or something?

Dave Baumann · Apr 3, 2009

keritto said:
And on top of that, rv740 loses the double precision of its older r700 sisters. Doesn't that make it only dx10 compatible, since dx10.1 requires double-precision shader operations?

Double precision is FP64 - that's not a DX requirement yet. DPFP is just something that's being optionally implemented for GPGPU purposes.

Kaotik · Apr 3, 2009

keritto said:
Even their presenter calls that dce a hog. Anyway, it'll be pretty interesting when microcrap tries to explain to us why they use dce on obsolete engines, when they released dx9.0b on Vista, which introduced dx10, and dx10.1 on NT7, which will presumably have the new dx11.

Win7 is NT6.1, not NT7, and DX11 is coming to both it and Vista.

On the other hand, they have a "compatibility desktop mode" in Vista for pre-dx9.0b cards, so they'll have one in their new bloat too, and they won't even try to fix it, because as we know dx10 is not as prehistoric next to dx10.1 as dx9.0b might be, for MS's sake, next to dx7+.

Where did you pull DX9.0b from? Basic DX9 PS2.0 support is the only thing required (along with enough memory) for Aero / DCE.

And on top of that, rv740 loses the double precision of its older r700 sisters. Doesn't that make it only dx10 compatible, since dx10.1 requires double-precision shader operations?

Unfortunately I read that moderately long thread, and I gave you the reference above. And it seems you've locked me out of posting, so I don't see how you expect me to elaborate.

Hm, it doesn't sound reasonable to use an 8-layer PCB with a 128-bit wide memory bus. Shouldn't they jump over to 128-bit just to reduce the PCB layers from 6 to 4 or something?

As already said, FP64 isn't required by DX10.1.
Also, these ARE 128-bit, so no, that doesn't necessarily reduce the layers to 6 or 4 or something.

v_rr · Apr 3, 2009

AnarchX said:
And in the end, GT215/214 (NV's 40nm competitor) should not be so far away.

Glad to see that there are still people with faith and hope.

RV740 is going into both desktops and notebooks. According to the news, AMD's wafer orders at TSMC will jump over the next few months, coinciding with the RV790 and RV740 releases.

mczak · Apr 3, 2009

AnarchX said:
G92b (55nm) is 270mm² and RV740 136mm² according to AMD.

Of course that's still a big difference, but I think we can assume that 40nm wafer prices are higher than 55nm ones and that yields should also be lower.

I'm not so sure about yields. Even if defects per area are higher on 40nm, the size difference could make that difference disappear (though that's assuming defects per area aren't sky-high, of course). And even if yields are somewhat lower and the wafer price is higher, a factor of 2 in size is quite something.
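The trade-off between die size, defect density and wafer price can be sketched with a simple Poisson yield model. Only the two die areas (270mm² for G92b, 136mm² for RV740) come from this thread. The defect densities and relative wafer prices below are hypothetical placeholders, the dies-per-wafer formula is a common textbook approximation for a 300mm wafer, and redundancy of any kind is ignored.

```python
import math


def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
    """Common approximation: gross dies minus an edge-loss term."""
    radius = wafer_diameter_mm / 2.0
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(gross - edge_loss)


def poisson_yield(die_area_mm2, defects_per_cm2):
    """Poisson model: probability that a die has zero defects."""
    return math.exp(-(die_area_mm2 / 100.0) * defects_per_cm2)


def cost_per_good_die(die_area_mm2, defects_per_cm2, wafer_price):
    good_dies = dies_per_wafer(die_area_mm2) * poisson_yield(die_area_mm2, defects_per_cm2)
    return wafer_price / good_dies


# Hypothetical inputs: 40nm has 2.5x the defect density and a 1.5x wafer price.
g92b = cost_per_good_die(270.0, defects_per_cm2=0.2, wafer_price=1.0)
rv740 = cost_per_good_die(136.0, defects_per_cm2=0.5, wafer_price=1.5)

print(f"G92b  yield {poisson_yield(270.0, 0.2):.0%}, relative cost per good die {g92b:.5f}")
print(f"RV740 yield {poisson_yield(136.0, 0.5):.0%}, relative cost per good die {rv740:.5f}")
```

With these made-up inputs the smaller die still comes out cheaper per good die, but the result flips if the assumed 40nm defect density or wafer price is pushed high enough.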

On the other hand, folks at Chiphell reported that HD4770 needs an 8-layer PCB, while the 9800 GT is now 6-layer at most partners.

This is interesting. With only a 128-bit memory bus you'd think routing would be easier (sure, the memory bus runs at high frequency, but GDDR5 should also help with that, with its compensation for unequal trace lengths).

Both cards use 8 memory chips; for HD4770 I think that's because 512Mbit 0.55ns chips are cheaper than 0.5ns 1Gbit ones.

Both Hynix and Samsung also offer slower 1Gbit chips - no idea about prices...
Using 8 chips would also make it possible to use the same PCB layout for 1GB cards. Dunno if we'd see anything like that; maybe the cost of GDDR5 is too high (that's really the big unknown factor here, I guess).
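The capacity arithmetic behind the "same PCB layout for 1GB cards" remark is easy to check. A minimal sketch, with made-up function names, using only the chip counts, chip densities and bus width mentioned in this thread:

```python
def card_memory_mb(chips, chip_density_mbit):
    """Total card memory in megabytes (8 bits per byte)."""
    return chips * chip_density_mbit // 8


def bus_bits_per_chip(bus_width_bits, chips):
    """Data bits of the memory bus that each chip has to serve."""
    return bus_width_bits // chips


CHIPS = 8
BUS_WIDTH_BITS = 128

print(card_memory_mb(CHIPS, 512), "MB with 512Mbit chips")
print(card_memory_mb(CHIPS, 1024), "MB with 1Gbit chips")
print(bus_bits_per_chip(BUS_WIDTH_BITS, CHIPS), "bus bits per chip")
```

Swapping 512Mbit chips for 1Gbit ones on the same eight pads doubles the capacity without touching the bus width, which is what makes a shared layout plausible.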

And in the end, GT215/214 (NV's 40nm competitor) should not be so far away.

It has been really quiet lately about GT212/214/216, I wonder what's up with that...

Vincent · Apr 3, 2009

mczak said:
I'm not so sure about yields. Even if defects per area are higher on 40nm, the size difference could make that difference disappear (though that's assuming defects per area aren't sky-high, of course). And even if yields are somewhat lower and the wafer price is higher, a factor of 2 in size is quite something.

This is interesting. With only a 128-bit memory bus you'd think routing would be easier (sure, the memory bus runs at high frequency, but GDDR5 should also help with that, with its compensation for unequal trace lengths).

Both Hynix and Samsung also offer slower 1Gbit chips - no idea about prices...
Using 8 chips would also make it possible to use the same PCB layout for 1GB cards. Dunno if we'd see anything like that; maybe the cost of GDDR5 is too high (that's really the big unknown factor here, I guess).

It has been really quiet lately about GT212/214/216, I wonder what's up with that...

I thought that this time both Nvidia and ATI would come up with multi-chip solutions from mid-range to high end. The cost/performance case is demonstrated by the GTX280 with its 512-bit bus, which only outperforms RV770XT by a small margin, not only in theoretical benchmarks but also in real game tests.

My 2010 mainstream prediction: a GTX280-level GPU with 128-bit GDDR5 at 7GHz

Jawed · Apr 3, 2009

mczak said:
I'm not so sure about yields. Even if defects are higher per area on 40nm, the size difference could make that difference disappear (though that's assuming that defects per area aren't sky-high of course). And even if yields are somewhat lower and waferprice is higher, a factor 2 in size is quite something.

ATI's ALUs are 16-in-17 redundant, so defects in that portion of the die have to be very severe to kill it.

Jawed

CarstenS · Apr 4, 2009

Jawed said:
ATI's ALUs are 16-in-17 redundant, [...]

Is that a known fact?

Jawed · Apr 4, 2009

Not if the relevant patent documents and die photos aren't enough evidence.

Jawed

CarstenS · Apr 4, 2009

Jawed said:
Not if the relevant patent documents and die photos aren't enough evidence.

Jawed

Thanks - must've missed all that.

rjc · Apr 6, 2009

mczak said:
Both Hynix and Samsung also offer slower 1Gbit chips - no idea about prices...
Using 8 chips would also make it possible to use the same PCB layout for 1GB cards. Dunno if we'd see anything like that; maybe the cost of GDDR5 is too high (that's really the big unknown factor here, I guess).

A few pages back AnarchX posted an it168 link showing the current and proposed AMD lineup. There were 512MB and 1GB variants of the 4850 (a $20 difference) and the 4870 (a $30 difference), so very, very roughly GDDR5 is 50% more expensive. Last year GPU Cafe reckoned it carried a 20-40% premium. In February they were saying Samsung and Hynix are in mass production; I'm still looking for any cards actually carrying this memory. A couple of weeks ago Samsung announced they are shipping their 50nm DDR3 parts, so I suppose the previously announced 50nm GDDR5 shouldn't be too far behind.
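The "very very roughly 50%" figure is just the ratio of the two price steps. A small sketch of that estimate, assuming the price step on each card is entirely due to the extra 512MB of memory (GDDR3 on the 4850, GDDR5 on the 4870), which is a simplification since these are lineup prices and not memory costs:

```python
def relative_premium(baseline_step_usd, other_step_usd):
    """How much larger one price step is than another, as a fraction."""
    return other_step_usd / baseline_step_usd - 1.0


GDDR3_STEP_USD = 20.0  # 4850: 512MB -> 1GB
GDDR5_STEP_USD = 30.0  # 4870: 512MB -> 1GB

premium = relative_premium(GDDR3_STEP_USD, GDDR5_STEP_USD)
print(f"Implied GDDR5 premium over GDDR3: {premium:.0%}")
```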

It has been really quiet lately about GT212/214/216, I wonder what's up with that...

The GT218 is supposed to debut next month, with the GT216 shortly after; I think these are mainly intended as OEM parts. The GT215 (a 192-bit G92) is delayed until they can clear the G92/G94 stock. It costs approximately the same to produce as current 55nm parts, so Nvidia is not going to hurry with it.

Jawed · Apr 6, 2009

rjc said:
The GT215 (a 192-bit G92) is delayed until they can clear the G92/G94 stock. It costs approximately the same to produce as current 55nm parts, so Nvidia is not going to hurry with it.

But current 55nm parts cost nothing to produce, since they're inventory.

Jawed

rjc · Apr 6, 2009

Jawed said:
But current 55nm parts cost nothing to produce, since they're inventory.

Unless the inventory is written off, the production and maybe storage costs are still attached. There are 2 problems intertwined: 1) maybe 100 days' worth of inventory, and 2) the unit cost of the replacement part is roughly the same.
This means the normal strategy of producing a new part to reduce the overall average unit cost of the inventory won't work. I.e. old part $30, new part $25: if you can produce enough of the new part, the average unit cost approaches $25, so the old stock will sell faster because it can be sold at the new lower price.
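The averaging argument can be made concrete with a blended unit cost. In this sketch the $30 and $25 unit costs are the example figures from the post, while the unit counts are invented purely to show the trend:

```python
def blended_unit_cost(old_units, old_cost, new_units, new_cost):
    """Average unit cost of a stock that mixes old and new parts."""
    total_cost = old_units * old_cost + new_units * new_cost
    return total_cost / (old_units + new_units)


OLD_UNITS = 100  # hypothetical inventory size

# Cheaper replacement part: the average drifts towards $25 as volume grows.
for new_units in (0, 100, 900):
    avg = blended_unit_cost(OLD_UNITS, 30.0, new_units, 25.0)
    print(f"{new_units:4d} new units at $25 -> average ${avg:.2f}")

# Replacement that costs the same to make: the average never moves.
print(blended_unit_cost(OLD_UNITS, 30.0, 900, 30.0))
```

When the new part costs about the same as the old one, no amount of new production lowers the average, which is the bind described above.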

I think they are hoping 40nm yields will improve enough with time to get the above situation happening. It is a warning to others not to try a large chip on 40nm.

Back on topic - according to the other thread posted previously, AMD is ramping its wafers quite a bit over the next couple of months... what are they producing? Only a small proportion will be RV790; is the rest RV740 or something else?
