ELSA hints GT206 and GT212

Novum · Jun 30, 2009

CarstenS said:
Possible?

http://home.arcor.de/quasat/GT2z.jpg

Nope. The TMUs are right next to the ALUs and should be much larger. The layout of the TMUs of this chip must be irregular.

GT200:

red = Vec8
green = octo TMU.

3xVec8 + octo TMU = TPC.

What you have marked as ROPs+TMUs is most likely the GDDR5 interface.

Jawed · Jun 30, 2009

Jawed said:
GPU-Z/VR-Zone's reporting the wrong shader count, 24 instead of 16, on GT218.

Hmm, I've been told that NVidia's specifications page is wrong and it's 24.

So that would appear to indicate the entire line-up is based upon 3 multiprocessors with a pair of quad TMUs per cluster.

So TMUs appear to be:

GT218 - 8
GT214 - 16
GT215 - 32

Jawed
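A quick sanity check of the counts above, as a minimal sketch. It assumes each multiprocessor is 8 lanes wide (the Vec8 mentioned earlier for GT200, carried over here as an assumption) and that the three chips have 1, 2 and 4 clusters respectively, which is inferred from the TMU figures rather than confirmed by anyone. The names `cluster_totals` and the constants are made up for illustration.

```python
# Assumptions (not confirmed in the thread): Vec8 multiprocessors,
# 3 multiprocessors per cluster, and a pair of quad TMUs per cluster.
LANES_PER_MULTIPROCESSOR = 8
MULTIPROCESSORS_PER_CLUSTER = 3
TMUS_PER_CLUSTER = 2 * 4  # a pair of quad TMUs


def cluster_totals(clusters):
    """Return (shader count, TMU count) for a chip with this many clusters."""
    shaders = clusters * MULTIPROCESSORS_PER_CLUSTER * LANES_PER_MULTIPROCESSOR
    tmus = clusters * TMUS_PER_CLUSTER
    return shaders, tmus


# Cluster counts 1, 2, 4 are inferred from the 8 / 16 / 32 TMU figures.
for clusters in (1, 2, 4):
    shaders, tmus = cluster_totals(clusters)
    print(f"{clusters} cluster(s): {shaders} shaders, {tmus} TMUs")
```

With one cluster this gives 24 shaders and 8 TMUs, which lines up with the corrected GT218 shader count.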

CarstenS · Jun 30, 2009

Novum said:
Nope. The TMUs are right next to the ALUs and should be much larger. The layout of the TMUs of this chip must be irregular.

GT200:

red = Vec8
green = octo TMU.

3xVec8 + octo TMU = TPC.

What you have marked as ROPs+TMUs is most likely the GDDR5 interface.

I know it's this way with GT200 and older chips. GT21x are a new breed, and so far I've failed to identify the TMU area(s) on those GPUs.

Jawed · Jun 30, 2009

Novum said:
3xVec8 + octo TMU = TPC.

What you've marked doesn't add up to an entire cluster. Could be general control or it could be TMU. Dunno.

Also, what's interesting is that in the GT215 die shot the clusters appear to contain much less logic than GT200 (the ratio of area for "ALUs" to "TMUs" is wildly different comparing the two) - implying that the layout of GT215 doesn't have clusters as single contiguous units.

Either that or there are far fewer TMUs. Or scaling to 40nm has been wildly non-linear depending upon unit :???:

The scaling of the ALUs, for what it's worth, appears to be ~2x, from 65nm GT200 to 40nm GT215. One "ALU" in GT200 is 0.654mm² and the same unit in GT215 is 0.323mm².

Jawed
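For reference, the measured ratio can be set against an idealised shrink. This is only a rough sketch: it assumes area scales with the square of the node name, which is an optimistic upper bound rather than how real processes behave, and `ideal_area_scale` is a hypothetical helper name.

```python
def ideal_area_scale(old_nm, new_nm):
    """Idealised area shrink factor if every dimension scaled with the node name."""
    return (old_nm / new_nm) ** 2


# Areas quoted in the post above, in mm^2.
GT200_ALU_AREA = 0.654
GT215_ALU_AREA = 0.323

measured = GT200_ALU_AREA / GT215_ALU_AREA
ideal = ideal_area_scale(65, 40)

print(f"measured shrink: {measured:.2f}x")  # 2.02x
print(f"ideal shrink:    {ideal:.2f}x")     # 2.64x
```

So the ~2x measured for the ALUs falls somewhat short of the idealised figure under that assumption.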

fellix · Jun 30, 2009

There are four similarly structured rectangular blocks, situated between the pairs of TPCs distinguishable in the die shot -- those could be texturing hardware, being just the samplers, mapping units or even both (too small for eight TMU quads, anyway... duh!). :???:

trinibwoy · Jun 30, 2009

Jawed said:
So TMUs appear to be:

GT218 - 8

GT214 - 16

GT215 - 32

What's GT214? Isn't it GT216?

RussSchultz · Jun 30, 2009

Jawed said:
What you've marked doesn't add up to an entire cluster. Could be general control or it could be TMU. Dunno.

It's oddly marked, that's for sure, but there are obviously 10x(3x+1) instances.

For the die shot linked to, I don't believe the areas marked 'SIMD' should cover the area that they do. Each SIMD block does seem to represent 3x of something, but the piece attached to it (which I'm saying shouldn't be part of it) isn't a duplicate on each of the different blocks. It might be a routing issue that's making them look different (and they're only instanced on lower metal layers), but I kinda doubt that.

I don't think the areas that person has labeled as the same thing on the lower and left-hand edges are actually the same thing.

What I see is
4x(3x)--what's marked SIMD
8x --what's marked octo-dunnos
8x --what's marked QTU on the left
4x --what's marked QROP on the left
8x --what's marked QTU on the bottom
4x --what's marked QROP on the bottom

I'd gather that there are 4 functional units, each composed of:
3x something (SIMD)
2x something (QROP on the left)
2x something (QROP on the bottom)
2x something (OCTO on the top)
1x something (QTU on the left)
1x something (QTU on the bottom)

Jawed · Jun 30, 2009

trinibwoy said:
What's GT214? Isn't it GT216?

yep!

Jawed

Jawed · Jun 30, 2009

RussSchultz said:
It's oddly marked, that's for sure, but there are obviously 10x(3x+1) instances.

For the die shot linked to, I don't believe the areas marked 'SIMD' should cover the area that they do.

Agreed.

Each SIMD block does seem to represent 3x of something, but the piece attached to it (which I'm saying shouldn't be part of it) isn't a duplicate on each of the different blocks. It might be a routing issue that's making them look different (and they're only instanced on lower metal layers), but I kinda doubt that.

I don't think the areas that person has labeled as the same thing on the lower and left-hand edges are actually the same thing.

What I see is
4x(3x)--what's marked SIMD
8x --what's marked octo-dunnos

Appears to be PCI Express

8x --what's marked QTU on the left
4x --what's marked QROP on the left
8x --what's marked QTU on the bottom
4x --what's marked QROP on the bottom

IO connections for GDDR, with what's labelled QROP probably corresponding to the command bus and the remainder being the data bus.

Jawed

fellix · Jun 30, 2009

RussSchultz · Jun 30, 2009

How sure are you about that?

Those seem awfully busy and large to be pads and drivers for 4 pins for each square.

fellix · Jun 30, 2009

LOL, you obviously haven't seen what a truly large pad array looks like.

Jawed · Jun 30, 2009

RussSchultz said:
How sure are you about that?

Compare with the ATI die, where exceedingly similar regions of the die (which are also of corresponding size) are explicitly labelled as memory interface:

http://www.techreport.com/articles.x/14990/2

Oh and look at the layout of signal pads and the plethora of ground and power lines in figure 26:

Under Bump Routing Layer Method and Apparatus

I have to admit I haven't read it through.

Jawed

CarstenS · Jul 1, 2009

Jawed said:
I think fellix's ideas are on the right track, though the "stacked" stuff is quite a puzzler.

Carsten if you compare with the annotated GT200 here (even if there are some who doubt its accuracy):

http://www.techreport.com/articles.x/14934/2

you should see that TMUs and ROPs take up acres of space.

Jawed

That shot doesn't distinguish at all between SIMD-Control- and TU-logic - it's all "Texture", whereas the GT21x-shot at least shows some additional logic besides the actual ALUs in the SIMD-parts of the die - but just not enough to make me believe that TUs are still incorporated.

I'm not saying fellix is wrong and I am right here, but I've yet to see convincing evidence for either position.

And take into account that for DX10.1 Nvidia would have to overhaul their TMUs either way - and maybe they tried to get away with less space, maybe combining some of the stuff for accessing memory, which is replicated in both TMUs and ROPs.

I could imagine you can get away with less space when routing a dual lane (1 for ROP-use, 1 for TU-use) to memory compared to having to do the individual routing from two far away parts of the die (I guess that's the principle of highways or autobahns also).

fellix · Jul 1, 2009

fellix said:
There are four similarly structured rectangular blocks, situated between the pairs of TPCs distinguishable in the die shot -- those could be texturing hardware, being just the samplers, mapping units or even both (too small for eight TMU quads, anyway... duh!).

Just to make my statement more figurative (the red outline):

Jawed · Jul 1, 2009

CarstenS said:
That shot doesn't distinguish at all between SIMD-Control- and TU-logic - it's all "Texture", whereas the GT21x-shot at least shows some additional logic besides the actual ALUs in the SIMD-parts of the die - but just not enough to make me believe that TUs are still incorporated.

Try this, too:

http://pc.watch.impress.co.jp/docs/2008/0617/kaigai_16l.gif

Even though it, too, doesn't make the distinctions you require.

I agree there should be some control stuff per cluster and I've no idea of the extent of the SIMD-specific stuff (i.e. 3x MAD-8, MI-2 and DP-1).

And take into account that for DX10.1 Nvidia would have to overhaul their TMUs either way - and maybe they tried to get away with less space, maybe combining some of the stuff for accessing memory, which is replicated in both TMUs and ROPs.

Yes, they definitely have to do extra things (e.g. gather). We still don't even know how many TMUs there are. For all we know there are only 16 of them.

I could imagine you can get away with less space when routing a dual lane (1 for ROP-use, 1 for TU-use) to memory compared to having to do the individual routing from two far away parts of the die (I guess that's the principle of highways or autobahns also).

Yes, to a degree "repeater islands" across the die imply that routing will agglomerate. The routes themselves don't take space since they are in metal layers under the logic layer.

Jawed

Novum · Jul 1, 2009

Jawed said:
What you've marked doesn't add up to an entire cluster. Could be general control or it could be TMU. Dunno.

What's missing? NVIDIA marks it the same.

fellix said:
Just to make my statement more figurative (the red outline):

Hrm that could explain it. But then that NVIDIA picture is wrong:

There should be more "random" logic that is not texture that belongs to the ALUs.

RussSchultz · Jul 1, 2009

fellix said:
Just to make my statement more figurative (the red outline):

The red squares don't look to be instances. The contents look similar, but they don't look like instances.

And the blue squares seem to be too big. (the areas closer to the center line do not match across instances)

fellix · Jul 2, 2009

I think we've already concluded here that irregularities between similar block instances are due to employing fully automatic design and tuning for the selected logic circuits.

RussSchultz · Jul 2, 2009

What you're pointing to doesn't look like a sea of gates, either (i.e. the product of automatic place and route).

I mean, I guess it just doesn't make sense to me to 'halfway instance' something in a way that looks close to the same, but not quite.

Usually instancing is either plopping hard macros down, or just letting the auto place and route do its thing and ending up with a sea of gates.

Of course, I only tangentially work in the back end of chip design, so I just might not be familiar with the technique.
