AMD: R7xx Speculation

mczak · Jun 24, 2008

AlphaWolf said:
The ATi slide was flops/watt. There's a definite advantage there, but ATi does need to get the idle consumption down to where it should be.

Right, though flops/watt isn't really a very good metric for the main purpose of these chips (which are, after all, mostly still intended for 3D graphics cards).

AlphaWolf · Jun 24, 2008

mczak said:
Right, though flops/watt isn't really a very good metric for the main purpose of these chips (which are, after all, mostly still intended for 3D graphics cards).

Well, flops and watts probably mean nothing to 95% of the people who are going to buy these things. So perhaps they are thinking of building a box for Pixar or something.

There are some oddities in the power consumption part of the review we've been looking at, though (the GX2 numbers seem off from what's expected), so I'd like to see some corroboration before jumping to any conclusions.

compres · Jun 24, 2008

I think we should wait for official reviews with release drivers before assuming anything with regards to idle power consumption.

And now that RV770 has been dealt with, what can we really expect from R700? Is nVidia in even more trouble than we thought? Interesting indeed.

I would pick up a 4870 if I had some free time this summer (which I don't, sadly).

revan · Jun 24, 2008

mczak said:
I'll answer instead. Well, these diagrams were already leaked before, so there's nothing new to see here.
I don't even know where to start describing all the differences - I don't think I'm alone not having expected that many changes.
1) TMU organization. No longer shared across arrays; each array has its own quad TMU. I'll bet the sampler thread arbiter had to change with that as well.
2) The TMUs themselves changed. While they always had separate L1 caches (I think - the picture is misleading), the separate 4 TA and point sampling fetch units are now gone (thus one TMU is 4 TA, 16 fetch, 4 TF instead of 8 TA, 20 fetch, 4 TF). Also, early tests indicate they are no longer capable of sampling fp16 at full speed (dropping to half speed, and one quarter at fp32 IIRC).
3) ROPs. They now have 4xZ fill capability (at least in some situations) instead of just 2x. The R600 picture indicates a fog/alpha unit which is now gone, though I doubt it really was there in the first place (it doesn't make sense; that should be handled in the shader ALU). The picture also indicates a shared color cache for R600; I don't know if this was true, however. Could be though (see next item).
4) No more ring bus. Clearly with RV770 the ROPs are tied to memory channels (just like NVIDIA G80 and up), and there are per-memory-channel L2 texture caches. Instead of one ring bus it seems there are now different "busses" or crossbars or whatever for different data (it's got a "hub", it's got some path for texture data, etc.)
5) Other stuff like the local data store, read/write cache etc.

That seems like the most important to me, architecture-wise (of course it got 10 shader arrays instead of 4 too...)
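For a rough feel of what the TMU numbers above imply, here is a back-of-the-envelope sketch in Python. It assumes the peak filtered rate is simply arrays x filter units per array x core clock, borrows the 625 MHz 3D clock quoted for the HD 4850 elsewhere in this thread, and applies the half-rate fp16 and quarter-rate fp32 factors from the early tests mentioned above, which may well not hold up. The function name and the format table are made up for illustration.

```python
# Hypothetical helper: peak filtered texel rate in Gtexels/s.
# Assumption: every texture filter (TF) unit delivers one texel per clock
# at full rate, scaled by a per-format rate factor.
def peak_texel_rate(arrays, filter_units_per_array, clock_mhz, rate_factor=1.0):
    texels_per_clock = arrays * filter_units_per_array * rate_factor
    return texels_per_clock * clock_mhz / 1000.0

# Rate factors taken from the "early tests" claim in the post (unconfirmed).
FORMAT_RATE = {"int8": 1.0, "fp16": 0.5, "fp32": 0.25}

# 10 shader arrays, one quad TMU (4 TF) each, 625 MHz core clock.
rates = {fmt: peak_texel_rate(10, 4, 625, factor)
         for fmt, factor in FORMAT_RATE.items()}

for fmt, rate in rates.items():
    print(f"{fmt}: {rate:.2f} Gtexels/s")
```

These are theoretical peaks only; measured numbers depend on bandwidth, cache behaviour and the test used.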

Thanks, mczak. Actually, I've been trying to understand what went wrong in R600 (I hope this evil never comes back in future ATI, or for that matter NVIDIA, chips). What you said is a start; I hope more investigation will come in the next few days.

pjbliverpool · Jun 24, 2008

mczak said:
Die size will still be almost twice as large

Why so big? I thought GT200 only had about 45% more transistors than the 4800?

AlphaWolf · Jun 24, 2008

pjbliverpool said:
Why so big? I thought GT200 only had about 45% more transistors than the 4800?

And doesn't G92b have 200 million fewer than RV770? Perhaps it's just a difference in counting methods, or perhaps it's ATi showing their experience at 55nm.

fellix · Jun 24, 2008

pjbliverpool said:
Why so big? I thought GT200 only had about 45% more transistors than the 4800?

1 - It's still on the 65nm node.
2 - There is more "fillrate" hardware (hard to pack into dense structures), like ROPs and TMUs, than in any GPU to date.
3 - Beefy memory interface -- the prime enemy of any effective die shrink.

Mintmaster · Jun 24, 2008

pjbliverpool said:
Why so big? I thought GT200 only had about 45% more transistors than the 4800?

There will always be differences between what ATI and NVidia do when determining transistor count. Usually, density of one company's chips at a particular process node stays the same, though, because they're generally pretty consistent.

G92 has similar transistors/mm2 to GT200, and the same is true for RV670/RV770. Looking at G92b for scaling of GT200b makes more sense than looking at RV770.

Silent_Buddha · Jun 24, 2008

pjbliverpool said:
Why so big? I thought GT200 only had about 45% more transistors than the 4800?

They don't seem to be quite so densely packed in comparison to RV770. If someone had mentioned a couple of months ago that RV770 would be approaching 1 billion transistors and still be 271mm^2 or smaller (I think the consensus is that it's much smaller than 271mm^2), most people would have laughed and dismissed it as a pipe dream.

Regards,
SB
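To put the density argument into numbers, a minimal Python sketch using only the rough figures thrown around in this thread: about 45% more transistors, a die "almost twice as large" (taken here as 2.0x, so an upper bound), and RV770 at roughly 1 billion transistors on at most 271 mm^2. The function names are hypothetical, and since the two vendors may count transistors differently, treat the result as an illustration rather than a measurement.

```python
# Hypothetical helpers for comparing packing density between two chips.
def relative_density(transistor_ratio, area_ratio):
    """Density of chip A relative to chip B, given A/B ratios."""
    return transistor_ratio / area_ratio

def density_mtransistors_per_mm2(transistors_millions, area_mm2):
    """Absolute density in millions of transistors per mm^2."""
    return transistors_millions / area_mm2

# Thread figures: ~1.45x the transistors on ~2.0x the area (assumed bound).
gt200_vs_rv770 = relative_density(1.45, 2.0)

# Thread figures: ~1000M transistors, 271 mm^2 as the pessimistic area.
rv770_density = density_mtransistors_per_mm2(1000, 271)

print(f"Larger chip packs about {gt200_vs_rv770:.0%} as densely")
print(f"RV770 lower bound: {rv770_density:.2f} Mtransistors/mm^2")
```

If the RV770 die really is smaller than 271 mm^2, its density goes up and the gap widens further.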

Wirmish · Jun 24, 2008

VoltMod: HD 4850 OC @ 800 MHz core / 1100 MHz memory -> screenshot

AlphaWolf · Jun 24, 2008

Wirmish said:
VoltMod: HD 4850 OC @ 800 MHz core / 1100 MHz memory -> screenshot

Any idea what cooling was used? It's clearly not Ln2, but I'm wondering if they went to a beefier air cooler or water or something. Temps certainly don't seem bad at that speed.

ChronoReverse · Jun 24, 2008

What did they use to overclock past 700MHz?

mczak · Jun 24, 2008

Wirmish said:
VoltMod: HD 4850 OC @ 800 MHz core / 1100 MHz memory -> screenshot

What was the stock and what the modded voltage?

mhouston · Jun 24, 2008

That's actually from toTOW in the Folding@Home forums. I'll see if I can get him over here to explain how he did this. It's on air. ;-) This totally voids the warranty.

Wirmish · Jun 24, 2008

Default : 2D = 500MHz @ 1.08V, 3D = 625MHz @ 1.19V.
toTOW Vmod -> (UP6201BQ) 100 KOhms, +0.025V.

Heatsink = Zalman VF700CU

HD 4850 + Zalman Heatsink -> Photo1, Photo2.

725 MHz @ 1.200V -> Screenshot
775 MHz @ 1.265V -> Screenshot
800 MHz @ 1.???V -> Screenshot
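For anyone wondering what those voltage bumps do to power draw, here is a rough Python sketch. It assumes the textbook CMOS approximation that dynamic power scales with frequency times voltage squared, and ignores leakage, memory and board power entirely, so it says nothing definite about the actual card. The baseline is the default 3D state listed above (625 MHz @ 1.19 V); the function name is made up for illustration.

```python
# Hypothetical helper: relative dynamic power under the P ~ f * V^2
# approximation (assumption: switched capacitance stays constant,
# leakage is ignored).
def relative_dynamic_power(freq_mhz, volts, base_freq_mhz=625.0, base_volts=1.19):
    return (freq_mhz / base_freq_mhz) * (volts / base_volts) ** 2

# Overclock steps from the post; the 800 MHz voltage is unknown, so it is left out.
steps = [(725, 1.200), (775, 1.265)]
scaling = {freq: relative_dynamic_power(freq, volts) for freq, volts in steps}

for freq, factor in scaling.items():
    print(f"{freq} MHz: about {factor:.2f}x the stock 3D dynamic power")
```

By this estimate the 775 MHz setting costs roughly 1.4x the stock dynamic power for a 24% clock increase, which is consistent with needing a beefier heatsink.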

New Trojan Horse

MfA · Jun 24, 2008

It's slightly funny ... but I don't think that shop of that cartoon falls under fair use ...

trinibwoy · Jun 24, 2008

mczak said:
That seems like the most important to me, architecture-wise (of course it got 10 shader arrays instead of 4 too...)

That's interesting. A lot of those changes make the general setup a lot more similar to G8x. Makes for easier comparisons, I guess.

I don't think FP16 filtering is half speed in RV770 though. Measured numbers seem to indicate that it's pushing more texels than G92 and it gets up pretty close to GT200 too.

jimmyjames123 · Jun 24, 2008

Mintmaster said:
You gotta be crazy and/or a fanboy to want to spend over twice as much for a card that's maybe 25% faster on average, and the GTX 260 is DOA.

While I agree that GTX 280 is way overpriced, let's not get carried away and claim that GTX 260 is DOA. I'm sure when all is said and done there will end up being several games where the GTX 260 has higher playable settings than the 4870 512MB, not to mention possibly lower idle power and also PhysX/CUDA support. That's not to take anything away from the HD 4870, which I'm sure is a great card and great value. But simply looking at max/average framerate like the Chinese review does really doesn't tell us the full story anyway. Also note that any review that compares GeForce GT200 vs Radeon 48xx with 8x MSAA enabled will almost always show advantages for the Radeon. As mentioned before, it doesn't make sense to run the GeForce cards with 8x MSAA, as performance is much, much better with 16x CSAA, with minimal tradeoff in image quality.

A.L.M. · Jun 24, 2008

AlphaWolf said:
Well, flops and watts probably mean nothing to 95% of the people who are going to buy these things. So perhaps they are thinking of building a box for Pixar or something.

There are some oddities in the power consumption part of the review we've been looking at, though (the GX2 numbers seem off from what's expected), so I'd like to see some corroboration before jumping to any conclusions.

I think so.
Look at the consumption test made by this very reliable Italian site.

How in the world can the HD 4870 consume more than an HD 3870 X2? This diagram clearly shows that the problem is idle consumption, not full load. But the idle power draw can be fixed with a simple driver update.

There's something very fishy in that test...

jimmyjames123 · Jun 24, 2008

thatdude90210 said:
Heh, this back and forth between the two competitors is kinda fun to watch. TGDaily is reporting that AMD will allow partners to overclock their 4850's to combat the GTX9800+, and these are set to start rolling out 2nd week of July.
"AMD is preparing an answer to Nvidia's recently released GeForce 9800GTX+ card. Overclocked Radeon 4850 cards are set for an introduction in the second week of July."

This cycle will never end. Considering that 9800 GTX+ is said to have very good headroom for overclocking, maybe vendors will come out with 9800 GTX+ OC versions too. That will certainly be a very good value compared to GT200.
