G71/G73: Summary & Pre-Launch Speculation Frenzy

Jawed · Mar 7, 2006

silent_guy said:
The parameters that *do* count are:
- die area
- % of area used for RAM. (On one hand more is better, since it allows more redundancy. On the other hand less is better, since density is higher, which makes it hard to produce error-free when there's no redundancy, which is the case when you're dealing with a lot of small RAMs.)
- Amount of logic redundancy (typically, this is rather small.)
- cell density. This is mostly determined by the amount of wiring that interconnects the cells. An elegant design uses less wiring to do the same kind of stuff.

Excellent info, thanks.

I have a pet theory about R580 redundancy, which comes in two parts:

- Out-of-order scheduling in R5xx seems to require a larger register file than previous generations, to support the vastly increased number of fragments in flight - although I have no hard data for the previous GPUs (only a guess that there are about 1024 fragments in flight in R420, say - perhaps eight FP24s per fragment). R580 supports 512 threads, each of 48 fragments, each with two FP32 registers, each of 4 bytes = 192KB. On top of that there'll be extra memory relating to shader state for each thread plus shader constants and the memory required to hold the shader programs themselves (oh and then there's all the vertex shading hardware to count, too). Not a huge amount of memory, compared with a CPU, but I imagine there's considerably more than in R420. Regardless of the actual quantity, this RAM will have redundancy, and the effect of redundancy on absolute die size will be more noticeable with a more RAM-intensive architecture.
- "three" is a pretty funny number for a computing architecture. I wouldn't be surprised if each shader unit in R580 (and RV530) is actually composed of four quad-pipelines, with one dropped for the sake of redundancy. That's 25% redundancy, on a total of approximately 128M transistors (64 pipelines, 16 dropped for redundancy). If that's the case, then we prolly won't see a "36 pipeline" variant of R580.

Obviously, all guesses... The point being that R580's "size" might look like a huge disadvantage, but there are hierarchies of fine-grained redundancy at play, and I expect that redundancy in R580 is significantly more advanced than R420. I also suspect that the "3:1" architecture of RV530 and R580 adds a significant layer of redundancy. The end-result being, perhaps, that practically every core comes out as a fully functional R580 (or RV530).
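
For anyone who wants to check the sums, here is a quick sketch of the arithmetic in the two points above. Every figure is one of the guesses from this post (thread count, fragments per thread, registers, the "four built, three used" idea), not a confirmed hardware spec, and the variable names are made up for illustration.

```python
# Back-of-envelope check of the guesses above; none of this is confirmed spec.
THREADS = 512
FRAGMENTS_PER_THREAD = 48
REGISTERS_PER_FRAGMENT = 2
BYTES_PER_REGISTER = 4  # as stated in the post

fragments_in_flight = THREADS * FRAGMENTS_PER_THREAD
register_file_bytes = (
    fragments_in_flight * REGISTERS_PER_FRAGMENT * BYTES_PER_REGISTER
)
register_file_kib = register_file_bytes / 1024

# Hypothetical layout: 64 pipelines built, 16 held back as spares.
PIPELINES_BUILT = 64
PIPELINES_SPARE = 16
pipelines_active = PIPELINES_BUILT - PIPELINES_SPARE
spare_fraction = PIPELINES_SPARE / PIPELINES_BUILT

print(f"fragments in flight: {fragments_in_flight}")
print(f"register file: {register_file_kib:.0f} KiB")
print(f"active pipelines: {pipelines_active}, spare fraction: {spare_fraction:.0%}")
```

The 192KB figure only covers the per-fragment registers; shader state, constants and program storage would come on top, as noted above.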

Jawed

mikechai · Mar 7, 2006

dizietsma said:
Anyone kept an eye on the Far Eastern Web sites ? Galaxy have a promo banner on Gzeasy for 7900 and it mentions 3dmark but I cannot read the rest.

Nah nothing there. Galaxy is organizing an Overclocking contest, and 7900 is one of the prizes.

* 200 posts OMFG! *

CJ · Mar 7, 2006

Uttar said:
Supposed Facts, 7600GT (G73)
- Unknown MSRP at this point, probably at RV530 levels & above.

nVidia lowered the MSRP for the 7600GT from $249 to $199 in an attempt to compete better against the X1800GTO.

GF7300GT has been delayed until the end of Q2 for the retail channel. Only the OEM channel will see the 7300GT at launch.

Oh and only the 7900GTX512 will be launched on the 8/9th of March (depends on where you live on the planet). The rest will follow on the 20th of March.

trinibwoy · Mar 7, 2006

Speaking of MSRP - http://www.mwave.com/mwave/DeepSearch.hmx?scriteria=7900+GTX&ALL=y&TP=2

EVGA 512-P2-N570-AX GF7900GTX EGS 512MB PCI-E W/HDTV & DUAL DVI

$765.00

EVGA 512-P2-N575-AX GF7900GTX SUPERCLOCK 512MB PCI-E W/HDTV & DUAL DVI

$795.00

_xxx_ · Mar 7, 2006

Jawed said:
- "three" is a pretty funny number for a computing architecture. I wouldn't be surprised if each shader unit in R580 (and RV530) is actually composed of four quad-pipelines, with one dropped for the sake of redundancy. That's 25% redundancy, on a total of approximately 128M transistors (64 pipelines, 16 dropped for redundancy).

That's an awful lot of die space wasted. I really can't imagine such a huge waste for redundancy, though I have no substantial facts to support the claim.

chavvdarrr · Mar 7, 2006

http://www.theinquirer.net/?article=30100

It seems to use the 256-bit memory interface.

what makes'em think so !?

Mariner · Mar 7, 2006

chavvdarrr said:
http://www.theinquirer.net/?article=30100

what makes'em think so !?

Erm, ignorance? :smile:

Sxotty · Mar 7, 2006

According to inq

It seems to use the 256-bit memory interface.

In regards to 7600... Well now

EDIT:
BTW do I believe it? Well I was under the impression they were going to be compatible with 6600 boards, so it seems doubtful, but what the heck it sounds nice anyway.

Arun · Mar 7, 2006

CJ said:
Oh and only the 7900GTX512 will be launched on the 8/9th of March. [...] The rest will follow on the 20th of March.

Are you sure of that? I was under the impression GTX & GT were launching at the same time. On the other hand... 7600GT 256-bit @ 125mm²? Hmm, no.

Jawed · Mar 7, 2006

_xxx_ said:
That's an awful lot of die space wasted. I really can't imagine such a huge waste for redundancy, though I have no substantial facts to support the claim.

Suppose the yield model is based on every die having defects. Then, instead of throwing away entire dies when the defects are large or there are too many within one die, individual dies are made so tolerant that almost no dies are ever thrown away.

It's just a curve, and I'm guessing that ATI has chosen a "higher" position on the curve. I'm assuming that it's possible to trade defect rates against failed dies on such a scale by judicious use of redundancy.

Instead of "failing" one entire quad, which is how we got X800Pro and 7800GT (etc.), I'm guessing ATI has designed failure at the "sub-quad" level, so that practically every die comes out with the full count of desired pipelines.

This could be associated with ATI's four tiers of dies (value, mainstream, performance mainstream and enthusiast): rather than using quad-level redundancy to cover the mainstream through enthusiast sectors with 8 or 10 SKUs (there were X800s with 2 quads, 3 quads and 4 quads functional, as well as core/memory speed options), surely it's more cost-effective to build one die per tier, with no need to pipeline-bin the resulting dies for the AIBs to consume.

So, instead of the AIBs having a catalogue of 10+ dies (varying pipeline counts and speeds) to choose from across all four tiers, there's 6 or 7 (4 basic dies, some with two speeds), and the AIBs can create multiple SKUs from a given die with a range of coolers (increasing choice of core clocks) and memory speeds.

Jawed

_xxx_ · Mar 7, 2006

Did you try to calculate the losses with (for example) 25% for redundancy versus losses of 25% in yields without such huge redundancies built in (EDIT: meaning, say, a 20% smaller die)? That would be an interesting comparison. But we would need some numbers on yields in order to do that.
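
Just to frame what such a comparison could look like, here is a toy sketch. It assumes a simple Poisson defect model (the chance that an area is defect-free falls off exponentially with area times defect density), a die split into four equal blocks of which three are needed, and a spare block worth 20% of the bigger die. The wafer size, die area and defect densities are placeholders I made up, not real numbers from anyone.

```python
import math

# Every number here is a made-up placeholder, not real ATI or foundry data.
WAFER_MM2 = math.pi * 150.0 ** 2   # assumes a 300 mm wafer, edge loss ignored
BIG_DIE_MM2 = 350.0                # hypothetical die carrying one spare block
BLOCKS = 4                         # four equal blocks built, three needed
BLOCK_SHARE = 0.20                 # each block is 20% of the big die
SMALL_DIE_MM2 = BIG_DIE_MM2 * (1 - BLOCK_SHARE)  # "say, a 20% smaller die"


def clean(area_mm2, defects_per_mm2):
    """Poisson probability that an area contains zero defects."""
    return math.exp(-area_mm2 * defects_per_mm2)


def yield_plain(defects_per_mm2):
    """Small die, no spare: any defect anywhere kills it."""
    return clean(SMALL_DIE_MM2, defects_per_mm2)


def yield_with_spare(defects_per_mm2):
    """Big die: non-block area must be clean, at most one block may be bad."""
    block_area = BIG_DIE_MM2 * BLOCK_SHARE
    fixed_area = BIG_DIE_MM2 - BLOCKS * block_area
    p = clean(block_area, defects_per_mm2)
    blocks_ok = p ** BLOCKS + BLOCKS * (1 - p) * p ** (BLOCKS - 1)
    return clean(fixed_area, defects_per_mm2) * blocks_ok


def good_dies_per_wafer(die_mm2, die_yield):
    """Crude count: wafer area over die area, times yield."""
    return WAFER_MM2 / die_mm2 * die_yield


for d in (0.0, 0.001, 0.005):  # hypothetical defects per mm^2
    plain = good_dies_per_wafer(SMALL_DIE_MM2, yield_plain(d))
    spare = good_dies_per_wafer(BIG_DIE_MM2, yield_with_spare(d))
    print(f"D={d}: plain {plain:.0f}, with spare {spare:.0f}")
```

With these placeholder inputs the smaller die comes out ahead at the two lower defect densities and the spare-block die at the highest one, so the answer hinges on exactly the yield numbers we don't have.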

satein · Mar 7, 2006

Jawed said:
Suppose the yield model is based on every die having defects. Then, instead of throwing away entire dies when the defects are large or there are too many within one die, individual dies are made so tolerant that almost no dies are ever thrown away.
...

Jawed

It sounds like a self fail-proof design for chip manufacturing. I wonder whether, during the past six-month delay of R520, ATi came up with this idea as a way to get satisfactory yield (with the added condition that the final chip should work at a high clock rate, counted against the total quantity passed per wafer). It may sound cumbersome at first, but it should get better with the next design.

dizietsma · Mar 7, 2006

Well if Uttar's figures are that 7900GT has 6 quads also then what is soaking up the nvidia failed cores ? Should the 7900GT be 5 quads as per 7800GT or has nvidia got the perfect wafer or is there going to be a 7900GS with 5 quads soon ?

I tend to favour it being 5 quads I think, just from the hysterical perspective

Jawed · Mar 7, 2006

_xxx_ said:
Did you try to calculate the losses with (for example) 25% for redundancy versus losses of 25% in yields without such huge redundancies built in (EDIT: meaning, say, a 20% smaller die)? That would be an interesting comparison. But we would need some numbers on yields in order to do that.

My eyes glaze over when I see people "calculating" yield rates, so no, I leave that to the statisticians :!:

Far too many variables in this. For example, what's the relationship between defect density/type/size and clock rate?

I'm just putting stuff out there to see what others think.

Jawed

Arun · Mar 7, 2006

dizietsma: Geo had made the interesting comment a while back that for a given wafer with a process with roughly similar defect rates, the number of defects over the wafer is constant, yet there are more potential chips on it. As such, the number of chips with defects will be statistically lower. Since G70 was 334mm² and G71 is 196mm²... Well, you can guess the rest.
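
To put a rough shape on that argument, here is a small sketch using the two die sizes above. The Poisson defect model, the 300 mm wafer and the defect density are all assumptions picked for illustration (the function names are made up too), and edge loss is ignored, so only the direction of the result means anything.

```python
import math

G70_MM2 = 334.0  # die sizes as quoted in this post
G71_MM2 = 196.0


def defect_free_fraction(die_mm2, defects_per_mm2):
    """Poisson estimate of the share of dies with no defect at all."""
    return math.exp(-die_mm2 * defects_per_mm2)


def candidates_per_wafer(die_mm2, wafer_diameter_mm=300.0):
    """Crude die count: wafer area over die area, ignoring edge loss."""
    wafer_mm2 = math.pi * (wafer_diameter_mm / 2) ** 2
    return int(wafer_mm2 // die_mm2)


ASSUMED_DEFECTS_PER_MM2 = 0.003  # hypothetical, same process for both chips

for name, area in (("G70", G70_MM2), ("G71", G71_MM2)):
    share = defect_free_fraction(area, ASSUMED_DEFECTS_PER_MM2)
    count = candidates_per_wafer(area)
    print(f"{name}: {count} candidates per wafer, {share:.0%} defect-free")
```

Same defect density, more and smaller candidates per wafer, and a larger share of them fully working, which would leave fewer partly broken dies to soak up in a cut-down SKU.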

Geo · Mar 7, 2006

dizietsma said:
Well if Uttar's figures are that 7900GT has 6 quads also then what is soaking up the nvidia failed cores ? Should the 7900GT be 5 quads as per 7800GT or has nvidia got the perfect wafer or is there going to be a 7900GS with 5 quads soon ?

I tend to favour it being 5 quads I think, just from the hysterical perspective

This has been bothering me as well, tho keep in mind that 7800GT was not announced at the same time as 7800GTX. About two months later, IIRC. Given the paean to "configurability" (and really 7800GT) that their CFO recently engaged in as a significant contributor to their margins success story --and the fact that he suggested that the new parts *increase* their "configurability"-- I've got to believe they have something in mind. How it fits into their pricing structure isn't readily obvious to me yet, however.

Geo · Mar 7, 2006

Uttar said:
dizietsma: Geo had made the interesting comment a while back that for a given wafer with a process with roughly similar defect rates, the number of defects over the wafer is constant, yet there are more potential chips on it. As such, the number of chips with defects will be statistically lower. Since G70 was 334mm² and G71 is 196mm²... Well, you can guess the rest.

Yup, that's true too. Even so, I don't see them giving up on the "configurability" thing entirely, even if it will be relatively less important to them with much smaller dies this time.

satein · Mar 7, 2006

geo said:
This has been bothering me as well, tho keep in mind that 7800GT was not announced at the same time as 7800GTX. About two months later, IIRC. Given the paean to "configurability" (and really 7800GT) that their CFO recently engaged in as a significant contributor to their margins success story --and the fact that he suggested that the new parts *increase* their "configurability"-- I've got to believe they have something in mind. How it fits into their pricing structure isn't readily obvious to me yet, however.

I suspect that, without competition at that moment, NV expected to sell as many 7800GTXs as possible by offering the only choice at the time, and I think the charm worked really well, since a lot of people ended up buying a 7800GTX as an upgrade instead of waiting longer. But the situation now is different: with so many choices on the market, I believe the 7900GT would reasonably appear at the same time as the 7900GTX. If not, there would be something under this smoke...

Jawed · Mar 7, 2006

geo said:
Yup, that's true too. Even so, I don't see them giving up on the "configurability" thing entirely, even if it will be relatively less important to them with much smaller dies this time.

Configurability is presumably being able to produce at least 3 SKUs based on only one die and two PCBs: GTX, GT, GS.

Jawed

Geo · Mar 7, 2006

silent_guy said:
Just a question from a noob on this forum (though not a noob in chip design): I've been following this thread for a long time now and every now and then this discussion about the amount of transistors comes up.

Lovely "Hello" post there! Welcome to B3D; don't be a stranger, y'hear?

I would agree unreservedly that die size is *more* important. Why transistor count is *interesting* has to do with several factors, some of which we have no control over.

We are largely a speculative forward-looking lot here. Many of us are already turning our eyes to the next family of gpus, even tho we haven't even seen the reviews for the current version. And from a forward-looking pov, obviously the data points are thin on the ground. Transistor count seems to be something that does, on occasion, get out into the public domain significantly in advance --die size almost never does.

Now, to some degree is this PR manipulating us? Oh yeah, no doubt. But as the fellow found in the crooked casino noted, "It's the only game in town."

And it's not useless info. It can tell you something about process nodes that might be under consideration. It can give you an idea of die size, yields, clock speeds.

And, given the parallelization (I'm too lazy to look up the spelling --I always slaughter that one!) and variety of functional units in a GPU, it can be funsie and interesting to try to figure out what they are actually doing with a given transistor budget. . .

But, yeah, where we can get it we'd probably rather have die size.

Edit: Tho, come to think of it, if you're doing a delta between prev gen chip A and next gen chip B. . .and you do know the process node for A, but not for B. . .what would you rather have, die size of both or transistor count of both?
