So, do we know anything about RV670 yet?

LoL. Thank you so very much for illustrating my point. This complacency leads to idleness and a catastrophic failure of innovation. I hope you're not one of those engineers!
What complacency?
All transistors leak.
All devices that have different voltage levels will leak.
You can make a device out of anything and the materials it is made of will always have non-ideal behavior.

There is no magical "frequency" that makes a transistor stop leaking, save the case where no power is fed to it at all.

You've just plainly stated "don't use it, and it won't leak". :rolleyes: You've also said I'm not using the right term because it might confuse people. lol.

I was restating your argument, or at the very least the implications of an argument I do not believe you have fully thought through.

Your use of words in ways they are not used when discussing this topic leads to confusion because nobody can tell which meaning you are using.
I couldn't read your mind if we were talking in person, and I certainly can't read it when it's text over the internet.

If we don't know what terms the other is using, we can't have a meaningful conversation.

You seem pretty defensive over this. I don't get it. Have a great day.
I've given you examples, I've pointed out that what you want is the removal of the very mechanism that makes semiconductor devices work.

I've pointed out that your reasoning about the usefulness of going multichip to combat leakage is incorrect.


To get this back on the topic of multichip design, I'm going to put out a list of some pros and cons of a multichip design:

Pro:
1) Die size is no longer a hard limit on the amount of transistors that can be used in a design.

2) As a result, a design can contain more features and more units while the cost of making the product remains a more linear multiple of that of a chip produced at a manufacturing sweet spot. Larger chips scale superlinearly in manufacturing cost (see the yield sketch after this list).

3) This approach is more flexible when it comes to amortizing design costs of the chip over multiple price segments.

4) Device variation and defects can be selected around with more granularity.
Less overall silicon can be tossed out due to one component of the die falling out of spec.
This is the one argument where multichip can help mitigate the impact of variation in device leakage on the binning of processors. Individual chips with poor leakage characteristics can be selected out or mixed and matched, instead of having larger monolithic cores that tend to either do very well or very badly.
This does not solve leakage, it just reduces the number of individual products that cannot reach some minimum spec.

5) Simpler chips can be less picky about power management than a single monolithic chip. Instead of complex arbiter logic for individual sectors, a multichip card can just throttle or gate an entire chip. This may or may not be a win, depending on just how far down on the complexity scale each die is.
The rise in complexity in other areas can counteract this advantage.
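
To put rough numbers on points 2 and 4 above, here's a quick Python sketch of the standard Poisson defect-density yield model. The wafer cost, defect density, and die sizes are made-up illustrative figures, not any foundry's actual numbers:

```python
import math

def die_yield(area_mm2, defects_per_mm2):
    # Poisson defect model: chance a die of this area has zero fatal defects.
    return math.exp(-area_mm2 * defects_per_mm2)

def cost_per_good_die(area_mm2, defects_per_mm2, wafer_cost, wafer_area_mm2=70685.0):
    # Spread the wafer cost over the working dies only.
    # (300 mm wafer ~= 70,685 mm^2; ignores edge loss and test costs.)
    dies_per_wafer = wafer_area_mm2 / area_mm2
    return wafer_cost / (dies_per_wafer * die_yield(area_mm2, defects_per_mm2))

# Illustrative numbers only: one 400 mm^2 die vs. two 200 mm^2 dies.
D0, WAFER = 0.005, 5000.0   # defects per mm^2 and $ per wafer -- assumptions
big = cost_per_good_die(400, D0, WAFER)
two_small = 2 * cost_per_good_die(200, D0, WAFER)
print(f"one 400 mm^2 die: ${big:.0f}, two 200 mm^2 dies: ${two_small:.0f}")
# Yield falls exponentially with area, so the single big die costs far
# more than twice the small one -- the superlinear scaling from pro #2.
```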

Con:
1) Multichip is more complex to manufacture. There is a balance between the defect rates of individual chips and the defect rates from packaging them together.
The primary tipping point will be based on the overall cost of discarded products and the distribution of dies amongst higher margin price segments.
Large dies scale worse than linearly in cost, while multichip modules have a more complex set of parameters, including the number of chips, the amount of interconnect on the package, and the pinout of the package.

2) Multichip, with all else being equal, is not a performance win per transistor. Any off-chip communication, even if on package, is longer latency and possibly more prone to error. Additional buffers, signal drivers, and control logic are necessary, whereas a single monolithic core would simply not need them.
This also has the side-effect of possibly worsening per-die yield rates, as control and communications logic must take up a larger portion of the die, and this is the hardest to keep redundant.

3) Multichip, with all else being equal, will likely scale better than multi-board GPUs. However, since GPUs are already so well distributed internally, the overall factor of improvement from going multichip is not going to be quite as high as multichip or multicore is for CPUs; GPUs have already gathered a lot of low-hanging fruit in this arena. On the other hand, GPUs scale so well on current workloads anyway that it's more of a wash (see the scaling sketch after this list).

4) Multichip, aside from keeping some units from being discarded due to binning, does not improve power consumption or leakage. The hefty IO requirements for an increasing number of chips and a more complex package will worsen performance per watt.
At small numbers of chips, this should be a small amount. It will become more significant the more complex the package and pinout.

5) Software-wise multichip is more complex. Naively treating multichip as a single core is a fast way to run into performance problems. Communications overhead will be less than that for multi-board, but more than that of a single chip.
How access to video memory will be handled as chip counts rise will be an interesting thing to watch.
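
To make cons 2 and 3 concrete, here's a toy scaling sketch where every inter-chip link eats a fixed fraction of the work. The overhead figure is invented purely for illustration, not a measured GPU number:

```python
def multichip_speedup(n_chips, link_overhead=0.05):
    # Toy model: every chip-to-chip link costs a fixed fraction of the work.
    # The 5% figure is invented for illustration.
    links = n_chips * (n_chips - 1) / 2
    return n_chips / (1.0 + link_overhead * links)

for n in (1, 2, 4, 8):
    print(f"{n} chip(s) -> {multichip_speedup(n):.2f}x")
# 2 chips -> 1.90x, 4 -> 3.08x, 8 -> 3.33x: the payoff shrinks as the
# package interconnect grows, per cons #2 and #3.
```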


The moral of the story, as it is for CPUs, is that monolithic is the way to go unless it becomes impractical.
It seems that this point is approaching rapidly for both graphics IHVs.
One question I have is whether they will both hit this point at the same time and at the same segments.

Whichever IHV is able to keep monolithic GPUs longer and is able to keep them at higher market segments will likely have a much simpler time with performance and software issues.
 
LoL. You've put together some great points. Thanks very much for taking the time to do so.

One question, tho... just how good are yields currently, such that monolithic designs are really viable?

R600 seems to be too big, following your reasoning. A hot chip that consumes a lot of power and does only a decent amount of work is a leaky chip.

Because of this, leakage is an important factor for me.

Now, I never said it was the only reason. But I do see it as a way to bring final product yields higher, as you mentioned, and this is paramount, IMHO, for a company with late releases and poor performers (R520, R600).


Going by your list, we've got 5 pros and 5 cons. Middle ground, it seems. Obviously you side with monolithic designs, but if you yourself were FORCED to go multichip, how would you do it, and why?
 
What constitutes viable yields is dependent on how much you can sell the product for.

For commodity products like DRAM, I've seen quotes of yield percentages in the high 90s, and with such tight margins even a few percentage points lower are considered unacceptable.

For higher margin products like CPUs, greater failure rates have been tolerated.
AMD at one point was rumored to have yield rates around 20% for its later K6 processors. Granted, that was considered to be very bad.


As for current designs, I haven't seen any numbers concerning yields. I don't think anyone wants to give that out.

At best, we're told yields are "good" or "very good", though compared to what is not disclosed.

R600 is said to have very good yields, as it is designed with a good amount of redundancy, though exactly how much is not disclosed.

G80 has redundancy built in as well, though the existence of the partially disabled lower-end SKU indicates it may not be as much.

That doesn't mean R600 doesn't have defective cores or similar failure rates. Without numbers, all we know is that AMD decided not to release a cut-down variant for a long time.


Whether R600 is too big is a question of how much it can be sold for. Given the graphics division's less than stellar financials, it can be assumed that R600 isn't too small. I haven't seen any data on R600's margins, though I expect some profit is possible, if perhaps not enough to recoup other costs.

It's hard to say because the foundries tend to only charge for working chips, so they take a lot of the risk the GPU maker would otherwise have to factor into costs.

Whether a chip is leaky is somewhat subjective.
In power-constrained embedded situations, something like a fraction of a watt may be considered leaky.
Whether R600 is leaky is independent of how much performance it provides.
Leakage, especially static leakage, can be measured without regard to the work the chip does.
If a chip burns a lot of power not doing anything, it is leaky.
For dynamic power, saying something is leaky depends on how much of the power consumption can be traced back to transistor activity and how much can be traced back to power lost due to undesired current flow.

R600 does expend a good fraction of its power on leakage. Its designers admitted they knew the process it was on was leaky.

Much of what makes a chip leaky is the process it is on and the number and nature of the transistors in the design. It is possible for a chip to perform very well, but still be considered leaky.
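
For reference, a sketch of the first-order CMOS power split that this definition of "leaky" appeals to. Every input below is an illustrative guess, not a measured R600 value:

```python
def chip_power(v_dd, f_hz, c_switched, activity, i_leak):
    # First-order CMOS split:
    #   dynamic = activity * C * V^2 * f   (power spent switching, i.e. work)
    #   static  = V * I_leak               (burned even when idle -- leakage)
    dynamic = activity * c_switched * v_dd ** 2 * f_hz
    static = v_dd * i_leak
    return dynamic, static

# All inputs below are illustrative guesses, not measured R600 values.
dyn, stat = chip_power(v_dd=1.2, f_hz=740e6, c_switched=700e-9,
                       activity=0.2, i_leak=25.0)
print(f"dynamic: {dyn:.0f} W, static: {stat:.0f} W, "
      f"leakage share: {stat / (dyn + stat):.0%}")
```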


As for going multichip, it seems the best performance option would be an MCM.
If a future iteration is based on R600, which has a lot of surplus bandwidth, two chips, each with half the memory bus width, that devote some of the available ring stops and pinout to a fast inter-chip bus could do well.

If transparency is desired, as with multi-board CrossFire, the chips would be allowed to issue memory accesses through each other's memory controllers. I'm not up enough on the low-level details to know how this is handled currently.

With a little coordination between the chips and the driver, they could be expected to perform close to what a doubled R600 could perform. If the pairing is transparent, there would be no need for profiles or other workarounds to get decent performance gains.

If higher chip counts happen in the future, more links and more cache will be needed. I don't see chip counts getting too high per board, as the cost and demands on power regulation are going to rise rapidly and memory capacity would be limited per board.

Perhaps a quad-GPU board could be done, though the routing sounds nasty. Anything more than that and there will be chips on the package that are likely to have no direct access to any memory.
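
Purely as a hypothetical sketch of how that transparent pairing could look: interleave the physical address space across the two chips' half-width memory controllers, so each chip knows whether a request is local or has to hop the inter-chip link. The granularity and function names here are my own assumptions, not how R600's ring bus actually works:

```python
INTERLEAVE_BITS = 8   # assumed 256-byte granularity -- purely illustrative

def home_chip(phys_addr: int) -> int:
    # Which of the two chips' memory controllers owns this address:
    # alternating 256-byte chunks spread load over both half-width buses.
    return (phys_addr >> INTERLEAVE_BITS) & 1

def route_read(requesting_chip: int, phys_addr: int) -> str:
    # Local addresses hit the chip's own controller; remote ones pay
    # the inter-chip hop described above.
    if home_chip(phys_addr) == requesting_chip:
        return "local DRAM access"
    return "forward over inter-chip link"

print(route_read(0, 0x0000))   # local DRAM access
print(route_read(0, 0x0100))   # forward over inter-chip link
```

Fine-grained interleaving keeps both half-width buses evenly loaded, at the cost of roughly half of all accesses paying the inter-chip hop.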
 
Posted today on a Dutch forum. There will be 2 versions of the RV670, Gladiator and Revival, both featuring a 256-bit memory interface. Both versions will be made on TSMC's 55nm process and feature 320 stream processors. Samples are expected in September with mass production in December. Launch will be at the end of 2007 or the beginning of 2008. It's gonna be a single-slot solution, be fully DX10.1 compliant (and thus feature SM4.1) and PCIe 2.0. It will have HDMI/HDCP/Audio/UVD and of course next-gen CrossFire. Gladiator will run at 825MHz and 1.2GHz GDDR4 (512MB) while Revival will run at 750MHz core and 900MHz mem (256/512).

Supposedly the HD2900GT and Pro will still be released within a month or two, but suggested pricing for the Pro will be $299, which is way overpriced for its projected 3DM06 score of 8000-9000. A GF8800-320 will probably be a better deal unless AMD drops the price to the original $199 that was floating around as the suggested price.
 
Interesting. I read similar things today at Chiphell, but they mentioned in addition that RV670 will use an 8-layer PCB and have power consumption below the X1950 (XT, I would guess).
 
Gladiator will run at 825MHz and 1.2GHz GDDR4 (512MB) while Revival will run at 750MHz core and 900MHz mem (256/512).
Do we know how much my Gladiator will cost me? :|
 
There will be 2 versions of the RV670, Gladiator and Revival... Launch will be at the end of 2007 or the beginning of 2008. Gladiator will run at 825MHz and 1.2GHz GDDR4 (512MB) while Revival will run at 750MHz core and 900MHz mem (256/512).

Late; they miss an on-time release again :cry:
TMU/ROP count info would be much more interesting than the SP count ;)
Grrrr, GDDR4. When will AMD learn the lesson?

Suggested pricing for the Pro will be $299, which is way overpriced for its projected 3DM06 score of 8000-9000.

No surprise; all my fears about R600-based performance cards look like they're coming true.
I have no idea who dreams of a full R600-based HD2900Pro, with only lower clock speeds, for $199; it's not possible in the holiday season.
 
TMU/ROP count info would be much more interesting than the SP count ;)

With 320 SPs it could IMO only be 16 TMUs (maybe enhanced), because if you cut the R600 design down to 12 TMUs (i.e. 12-wide SIMDs at 60 SPs each), you'd have only 240 SPs with 4 SIMDs, 300 with 5, and 360 with 6; never exactly 320.

But the ROPs are still a question mark (waiting on R600Pro numbers).
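
Spelling out the arithmetic behind that inference (assuming, as in R600, 5-wide units and a TMU count tied to the SIMD width):

```python
# SP count = SIMD width x 5 ALUs x number of SIMD arrays.
for width in (12, 16):          # candidate TMU counts / SIMD widths
    for simds in (4, 5, 6):     # plausible numbers of SIMD arrays
        print(f"{width} TMUs, {simds} SIMDs -> {width * 5 * simds} SPs")
# 12-wide gives 240/300/360 SPs; only 16-wide with 4 SIMDs hits exactly 320.
```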
 
Do we know how much my Gladiator will cost me? :|

It's the official replacement of the X1950Pro, so the HD2950Pro will probably cost around $199-$249 at launch (closer to $199 than $249).

HD2600 will be replaced by RV635 in January. It's also a 55nm DX10.1 card with improved CrossFire. And the HD2400 will be replaced by the RV620, which is also a DX10.1 card at 55nm and unfortunately still has the 64-bit memory interface.

Furthermore, AMD is really putting some work into CrossFire, with Single Connector CF support coming soon, plus Software CF, CrossFire Overdrive, and Tripod and QuadFire (8.44 DX10 driver, 8.46 DX9 driver); the last feature will come in the November/December timeframe.
 
With 320 SPs it could IMO only be 16 TMUs (maybe enhanced), because if you cut the R600 design down to 12 TMUs (i.e. 12-wide SIMDs at 60 SPs each), you'd have only 240 SPs with 4 SIMDs, 300 with 5, and 360 with 6; never exactly 320.

But the ROPs are still a question mark (waiting on R600Pro numbers).

Slowly I'm starting to think that RV670 is an R600 at 55nm with a 256-bit bus, much lower power consumption, a slightly higher GPU clock speed, a ~$220 MSRP, and DX10.1 support. But the targeted GPU clock speed for RV670 (Gladiator at 825MHz) looks very low; at 55nm, a 10% GPU clock speed increase over the 80nm R600 doesn't sound exciting. I expected at least ~920MHz.
 
Just how likely is it that everything works perfectly at 55nm?

The samples are supposed to be made at 65nm, so 55nm in Q4/Q1 will not be a piece of cake. I wonder that ATI still plays this dangerous game with half-nodes, after burning their hands with it several times in the past.

So we should hope that they have more luck this time, because the specs sound promising. ;)
 
It's clearly visible why they take the risk.

The only reason is the smaller die size; a half-node doesn't automatically mean lower consumption and higher clocks. That can happen, but it doesn't have to. 55nm in 2007 is, in my eyes, a little bit risky, but we will see.

btw.
This must be the reason for the flow of information about RV670:
[image]

:smile:
 
The only reason is the smaller die size; a half-node doesn't automatically mean lower consumption and higher clocks. That can happen, but it doesn't have to. 55nm in 2007 is, in my eyes, a little bit risky, but we will see.

In this situation AMD needs to take risks time and time again to catch up and earn some $; they have no other options left. Pressure is high from the competition, and from the user side too ;)
 