G70 Vs X1800 Efficiency

/*Start sarcasm

Joe DeFuria said:
And R520 has an 8 TMU and 8 ROP disadvantage. Point?
It all depends on what you're rendering.

Rendering games, maybe? :devilish:

Joe DeFuria said:
(In case you haven't noticed, the design goals for the two architectures are different.)

I think the design goal of ATI X1xx is PPU, no?! http://techreport.com/onearticle.x/8887 :devilish::devilish:

*/ End Sarcasm

To me, this test is very interesting because it can tell you whether you should buy R520 or wait for G70 at 90nm - essentially, if R580 and G70 end up with the same number of pipes and clocks (please don't tell me NVIDIA will not achieve R5xx clocks, because we don't know a thing - I don't know).

To me, efficiency is directly connected to IPC - that is, efficiency at rendering games (faster for the same clock and pipe count). I don't care how much heat or noise it produces, or how much power it consumes, because if you care about that, then SLI and CrossFire don't make sense either.
 
russo121 said:
To me, this test is very interesting because it can tell you whether you should buy R520 or wait for G70 at 90nm - essentially, if R580 and G70 end up with the same number of pipes and clocks (please don't tell me NVIDIA will not achieve R5xx clocks, because we don't know a thing - I don't know).

But you should not get that message.

If you do, then let me tell you a secret: the G80 will be better than the G70 at 90nm, or the R520. Of course, so will the R580, so you should just wait until then at least...
 
A "32 pipe G70" at 600MHz versus a 48 shader-pipe/16-texture pipe R580 at 600MHz, both with 750MHz+ memory should be interesting...

Jawed
 
Jawed said:
A "32 pipe G70" at 600MHz versus a 48 shader-pipe/16-texture pipe R580 at 600MHz, both with 750MHz+ memory should be interesting...

Jawed

G70 wins in OpenGL and older titles that still rely heavily on texturing. R580 destroys it in anything with moderate to heavy use of dynamic branching. So it looks just like 6800 vs X1600 XT today.

But who's to say the G70 refresh in the spring won't have decoupled texture units as well? There was a post with a recent patent around here somewhere.
 
JoshMST said:
I for one think that this was a fun exercise in testing, but it probably should have been worded a bit differently. As it is, I agree that many will come away thinking "ATI sucks" when it is not the case. The whole exercise though brings up new questions, and I think that is a good thing.

The first thing that should come up in anyone's mind is how much of a clockspeed gain NVIDIA will get with 90 nm. Well, as you all know, 110 nm does not utilize low-K, and that has a significant effect on overall clockspeeds (10% to 15% over FSG). So, not only will NV be going down to 90 nm, but they will be producing their first low-K parts (ok, they are already producing 6100/6150... but let's not muddy the waters too much). So if minimal changes were made to G70, and they recompiled the design using the 90 nm data libraries, what kind of clock improvement will we see? I would guess that 550 MHz would be pretty standard for such a design, and the faster speed bin products would hit around 600 MHz. I would of course be very curious as to how the power draw and heat production for such a part would increase over a lower clocked 110 nm product. I think it would be VERY interesting to see how the differences in design between NV and ATI would affect power and heat. If NVIDIA can produce a 24 pipe design running at 600 MHz at the faster bin, all the while creating as much heat as a 7800 GTX and eating as much power, then ATI is going to have a hard sell to OEMs on their hands.

So, that was basically one line of thought that popped up after the article. There are others. But I don't think that we can stress enough that each design is unique, and each one accepts different tradeoffs to achieve their results. I think ATI's product has a stronger feature set than NV, but NV has a more proven track record with the 7800 GTX due to its availability and power features.

Still, good work on the article, it was very enjoyable.

I doubt that is going to happen. There is a benefit to moving to 90nm, but the stages in the chip must be engineered to run at the intended clock speed. I doubt that they are going to add pipeline stages (not to be confused with pixel pipelines) to the architecture.

The other issue is power leakage, which gets worse at smaller transistor sizes.
 
trinibwoy said:
But who's to say the G70 refresh in the spring won't have decoupled texture units as well? There was a post with a recent patent around here somewhere.

G70 refresh is just going to be a 90nm part, I reckon (sooner than spring). I actually doubt it'll have more pipes. I was being a little facetious.

That decoupled texture pipes patent (and the scheduling patent) would, I guess, arrive together in G80 around May/June time in the DX10 part, or maybe later.

Jawed
 
I wonder why it hasn't occurred to any of you that, for the same number of pipelines, R520 uses far more transistors than G70.


PS!!
I am no expert on the subject; most of the calculations are really simple and are probably not 100% correct. In particular, I doubt I understood (2) correctly. Don't take my words as facts, just as something to think about. I don't guarantee that those two links have correct numbers, but I believe they do.


The 24-pipe GTX has about 302M transistors whereas the 16-pipe R520 has 321M (1). If one disables 8 pixel pipelines on the GTX, that would leave it with even fewer working transistors - I think somewhere around 220-230M. I was amazed to see that even then the GTX could keep up with R520 quite well.

Next, let's see what could happen when NV moves from 110 to 90nm. If we believe what ATI says (2) about the subject and my calculations are correct*, then about 600M transistors can be put on the same die area as G70. Assuming that pipeline transistor counts scale linearly, that would mean about 16 vertex pipelines and 48 pixel pipelines running at around 515MHz. Considering how well G70 overclocks even without SOI, I wouldn't be surprised if it ran at around 600+MHz.
*) If they are not, please tell me what I did wrong. If anyone is interested in my calculation formulas, just let me know and I'll give the "formulas" of how to multiply a few numbers together :)

ATI can't have such a massive increase in pipeline counts because it has already moved to a smaller manufacturing process. It could increase its die size from 263mm2 to about the same size as G70's (333mm2), but that would mean it could only fit about 9 vertex pipes and 20 pixel pipelines on a die the size of G70's.

1)transistor counts and die areas:
http://techreport.com/reviews/2005q4/radeon-x1000/index.x?pg=3

2)what happens when new technologies are taken to use:
http://www.hardocp.com/image.html?image=MTEyODI4MDE0MEFCVGlYSnBoRUNfM18xX2wuanBn


There is no doubt that R520 is more efficient when comparing speed at the same clock and pipeline count. And why shouldn't it be, if it has about 40% more transistors per pipeline? When comparing transistor counts it's a whole different story, though, and it would take a massive clockspeed to match the speed of G70.
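
For anyone curious about the "formulas" mentioned above, here is a minimal sketch of that multiplication in Python. The inputs are the rough numbers already quoted in this post, the per-pipeline scaling is assumed to be linear as above, and the 110nm-to-90nm density gain is the big guess, so treat the outputs as ballpark only (the rounding also comes out slightly differently from the 9 vertex pipes quoted above).

Code:
# Back-of-envelope sketch of the scaling arithmetic above.
# All inputs are the rough figures quoted in this post; the 110nm -> 90nm
# density gain is the big unknown (an ideal optical shrink gives roughly
# (110/90)^2 ~ 1.5x, while the ~600M figure above implies closer to 2x).

G70_TRANSISTORS_M = 302   # 24 pixel / 8 vertex pipes, 110nm, ~333 mm^2 die (1)
R520_TRANSISTORS_M = 321  # 16 pixel / 8 vertex pipes, 90nm,  ~263 mm^2 die (1)
G70_DIE_MM2 = 333
R520_DIE_MM2 = 263

def g70_shrunk_to_90nm(density_gain):
    """G70 rebuilt at 90nm on the same die area, pipes scaled linearly."""
    budget_m = G70_TRANSISTORS_M * density_gain
    return budget_m, round(24 * density_gain), round(8 * density_gain)

for gain in ((110 / 90) ** 2, 2.0):  # ideal shrink vs. the ~2x implied by (2)
    budget_m, pixel, vertex = g70_shrunk_to_90nm(gain)
    print(f"density x{gain:.2f}: ~{budget_m:.0f}M transistors, "
          f"~{pixel} pixel / ~{vertex} vertex pipes")

# R520 grown from its 263 mm^2 die to G70's 333 mm^2 on the same 90nm process:
area_scale = G70_DIE_MM2 / R520_DIE_MM2
print(f"R520 at G70's die size: ~{R520_TRANSISTORS_M * area_scale:.0f}M transistors, "
      f"~{round(16 * area_scale)} pixel / ~{round(8 * area_scale)} vertex pipes")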
 
But what if ATI spent some extra transistors to "decouple" the TMUs from the pixel pipes in order to make adding extra pixel shader units cheaper? The whole "more, dumber pipes beat fewer, more capable ones" argument from NV30 -> NV40. I wonder how the transistor counts would compare between a 32-pipe G70 and a 48PS/16TMU R580, as Jawed (facetiously or not) suggested. Assuming, of course, that the two can clock similarly. Comparable power draws would make things easier, too. ;)

I'm not sure why it's so surprising that a pipeline-reduced G70 at close to (or slightly higher than - has Veridian verified the GPU's individual domain clocks yet?) stock speeds can keep pace with a clock-reduced R520, especially in the few tests conducted. They each trade off some of their advantages. If Digit-Life's die size comparison shots are correct, though, I don't think nV will want to put up something the size of G70 against something the size of R520.
 
If NV adds better dynamic branching performance and HDR+AA to compete with ATI, won't that eat a lot of the extra die space? I can't see NV adding any extra pipes or having a 24-pipe G70 at 90nm running at 550 or 600 MHz.
 
Nite_Hawk said:
Yeah, I found that link right before you edited your post. :) It looks like it is recommended to set 600MHz GDDR3 to CAS 9 and 750MHz memory to CAS 11. Having said that, how important is memory latency when dealing with video cards? It seems that a lot of the data (textures, etc.) is quite large and hopefully somewhat sequential?

Well, at least the 7800 GT and the X1800 XL seem to have similar memory configurations...

Nite_Hawk
Memory latency is important. Graphics chips are designed to hide memory latency, but it can't always be hidden. Caches are sized to hide this latency, and increasing latency beyond what is expected can increase the penalty. On top of this, the ideal cache size can be a difficult thing to determine until a lot of performance modeling or real world testing has been done.
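
To put rough numbers on that, here is a minimal latency-hiding sketch with made-up but plausible figures (none of these are real R520 or G70 numbers):

Code:
# Minimal latency-hiding sketch: to keep the shader units busy while a
# texture fetch is outstanding, the chip needs enough independent pixels
# in flight to cover the memory latency with useful work.

def pixels_in_flight_needed(mem_latency_ns, core_clock_mhz, alu_cycles_per_pixel):
    """Little's-law style estimate: latency in core cycles divided by the
    independent ALU work each pixel can overlap with its own fetch."""
    latency_cycles = mem_latency_ns * core_clock_mhz / 1000.0
    return latency_cycles / alu_cycles_per_pixel

# Hypothetical figures: ~200ns effective memory latency, 600MHz core clock,
# 4 ALU cycles of useful work per pixel between texture fetches.
print(pixels_in_flight_needed(200, 600, 4))  # -> 30.0 pixels in flight per pipe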
 
Jawed said:
A "32 pipe G70" at 600MHz versus a 48 shader-pipe/16-texture pipe R580 at 600MHz, both with 750MHz+ memory should be interesting...

Jawed

Depends how you count "pipes" in either case. Besides, it could very well be 32*8 vs. 48*4...
 
geo said:
You never know about scuttlebutt, and how much is "liar's dice" being passed on. Having said that, going back over the rumor mill on 16 pipes (and you'll all recall that was a minority view right to the end), it seems to me the prevalent belief was 650MHz and north for the tippy-top part. I had PMs from some typically sensible types over the course of that period that suggested maybe quite a bit north. Which is just to say that I'm not convinced that ATI saw 625MHz as "sky is the limit" for this part.

It'll be very interesting to see what R580 is clocked at. Any hints there yet?

Actually the clocks started going up after the G70 was released; devs were getting cards from 450MHz to 550MHz, always climbing since the G70 launch. Well, at 625MHz this thing produces quite a bit of heat, so it's not really a choice for ATI to go higher, since they can't. The reference cooler design was copper all the way back at E3, so at that time ATI knew this thing was a furnace. The only way ATI would have expected higher clocks on air is if they thought there would be no leakage, or that the leakage problem would be fixed.
 
Hellbinder said:
That's another thing that most people seem to be overlooking.

R520 has almost a 10GB/s bandwidth advantage and a 625MHz core clock, yet most of the time it *barely* outperforms or loses to the 7800 GTX.

A 10GB/s bandwidth advantage just to break even. That is somehow a more efficient architecture? Hardly.

Which is why DriverHeaven's efficiency test is valid. It shows what is really going on. The R520 is not even close to an efficient architecture; it uses extremely high clocks and gross overkill in bandwidth just to get a 3 FPS lead...

Huh?
Not even close to efficient?
I'm actually wondering why it's often almost on par, and sometimes faster, in shader benchies then. I think the R520 did very well in this test if you take into account that it was run far below its designed operating range.
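
For what it's worth, the "almost 10GB" figure in the quote does work out from the stock memory clocks, assuming the usual 256-bit buses and DDR signalling:

Code:
# Where the "almost 10GB" bandwidth advantage comes from, assuming the
# stock memory clocks of the two cards and 256-bit DDR memory buses.

def peak_bandwidth_gb_s(mem_clock_mhz, bus_width_bits=256):
    """Peak memory bandwidth in GB/s for a DDR interface."""
    return mem_clock_mhz * 2 * (bus_width_bits / 8) / 1000.0

gtx = peak_bandwidth_gb_s(600)  # 7800 GTX: 600MHz GDDR3 -> 38.4 GB/s
xt = peak_bandwidth_gb_s(750)   # X1800 XT: 750MHz GDDR3 -> 48.0 GB/s
print(f"7800 GTX: {gtx:.1f} GB/s, X1800 XT: {xt:.1f} GB/s, "
      f"difference: {xt - gtx:.1f} GB/s")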
 
Razor1 said:
Actually the clocks started going up after the G70 was released; devs were getting cards from 450MHz to 550MHz, always climbing since the G70 launch.
They were always going to be higher than that - the issue prevented them from yielding much beyond those speeds though. If it wasn't architected to reach beyond 500MHz then they wouldn't have spent 6 months chasing their tail trying to fix it.
 
I don't really see why so many people expect the 90nm iteration of G70 (whatever it may be) to have a much lower heat dissipation/power draw than R520. It seems likely to me that TSMC's 90nm process suffers considerable current leakage (seen before with Intel's 90nm process). As NV will be using exactly the same TSMC process as ATI, why should their chip have lower power draw? I do realise that chip design also affects power draw, but I think we'll have to wait and see how "G7X" fares rather than making assumptions.
 
Mariner said:
I don't really see why so many people expect the 90nm iteration of G70 (whatever it may be) to have a much lower heat dissipation/power draw than R520. It seems likely to me that TSMC's 90nm process suffers considerable current leakage (seen before with Intel's 90nm process). As NV will be using exactly the same TSMC process as ATI, why should their chip have lower power draw? I do realise that chip design also affects power draw, but I think we'll have to wait and see how "G7X" fares rather than making assumptions.

I would think this depends on which route NVIDIA takes. Lower frequencies should help, and it would make sense to use 90nm to put more transistors in there instead of cranking clocks. However, it seems most rumors point to the higher clock route rather than increasing pixel pipelines to 32.
 
3dcgi said:
Memory latency is important. Graphics chips are designed to hide memory latency, but it can't always be hidden. Caches are sized to hide this latency, and increasing latency beyond what is expected can increase the penalty. On top of this, the ideal cache size can be a difficult thing to determine until a lot of performance modeling or real world testing has been done.
I was under the impression that caches are used to optimize bandwidth usage, i.e. not fetching the same texels twice. Although caches can also be used to store prefetched data, it would seem to me that it's smarter to use resources for keeping more threads in flight to work around the latency than to prefetch speculatively and add lots of caches so that the prefetches don't waste a lot of bandwidth.
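
The texel-reuse point is easy to put a number on. A minimal sketch with purely illustrative figures (the ~75% hit rate is just the textbook bilinear case at roughly 1:1 texture-to-pixel scale, not a measured value):

Code:
# Why even a small texture cache saves bandwidth: bilinear filtering reads
# 4 texels per pixel, but neighbouring pixels share most of those texels.

def offchip_reads_per_pixel(bilinear_taps=4, cache_hit_rate=0.0):
    """Texel reads that actually go to memory once cache hits are removed."""
    return bilinear_taps * (1.0 - cache_hit_rate)

print(offchip_reads_per_pixel())                     # no cache: 4 reads/pixel
print(offchip_reads_per_pixel(cache_hit_rate=0.75))  # ~75% reuse: 1 read/pixel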
 
Dave Baumann said:
They were always going to be higher than that - the issue prevented them from yielding much beyond those speeds though. If it wasn't architected to reach beyond 500MHz then they wouldn't have spent 6 months chasing their tail trying to fix it.

MuFu, for instance, was quoting 650 in public as early as March.
 
Mariner said:
I don't really see why so many people expect the 90nm iteration of G70 (whatever it may be) to have a much lower heat dissipation/power draw than R520. It seems likely to me that TSMC's 90nm process suffers considerable current leakage (seen before with Intel's 90nm process). As NV will be using exactly the same TSMC process as ATI, why should their chip have lower power draw? I do realise that chip design also affects power draw, but I think we'll have to wait and see how "G7X" fares rather than making assumptions.
AFAIK, the X1800 XL has twice the transistor count of the X800, yet roughly the same power draw.
Why should we expect less from NV?
 