DM's 65 nm transistor density analysis based on Sony EE data

How many logic transistors does PPC 970 have?

Anyway, while everyone is yelling at Deadmeat, how about answering this question:

What transistor count do YOU expect from PS3's incarnation of CELL?
 
They are expecting 500 million to 1 billion. I expect around 166 million.

Okay, I don't know much about this. In fact, I know next to nothing about what has been said here about SRAM gates and whatnot. But do you think Sony and IBM would be that far off their own figures? Not likely.

I mean, the AMD Opteron supposedly has just over 100 million transistors. Only going up by 50 million over the next 2-3 years seems a little small now, doesn't it?
 
But CELL is not one of these. CELL packs in more FPUs at the expense of SRAM gates, so its gate density is much lower than any of the above designs.

They do plan to put SRAM in there (4 MB worth), and eDRAM too (64 MB worth). Whether they succeed or not is another story.

But if they do succeed, the logic gate count would be low in comparison to the memory gate count.
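
As a rough sanity check on the memory-versus-logic point, here is a minimal sketch that counts storage-cell transistors for the 4 MB of SRAM and 64 MB of eDRAM mentioned above. The 6T SRAM cell and the 1T1C eDRAM cell are assumptions (nothing here says which cells would actually be used), and tags, decoders and sense amps are ignored.

```python
# Rough storage-cell transistor counts for the memory sizes quoted above.
# Assumptions (not from the thread): 6 transistors per SRAM bit,
# 1 transistor (plus a capacitor) per eDRAM bit, no tag or control overhead.

BITS_PER_BYTE = 8
MIB = 1024 * 1024


def cell_transistors(size_mib: int, transistors_per_bit: int) -> int:
    """Transistors in the storage cells alone for a memory of size_mib MiB."""
    return size_mib * MIB * BITS_PER_BYTE * transistors_per_bit


sram = cell_transistors(4, 6)    # 4 MB SRAM, assumed 6T cells
edram = cell_transistors(64, 1)  # 64 MB eDRAM, assumed 1T1C cells

print(f"SRAM cells:  {sram / 1e6:.1f} M transistors")
print(f"eDRAM cells: {edram / 1e6:.1f} M transistors")
print(f"Total:       {(sram + edram) / 1e6:.1f} M transistors")
```

Under these assumptions the memory cells alone come to roughly 738 million transistors, which sits inside the 500 million to 1 billion range quoted earlier, before a single logic gate is counted.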

Besides, even looking at EE real estate, the VU memory took up more area than the VUs themselves. But still, both VUs and their memory together took up only a little more than 1/4 of the EE real estate.
 
DeadmeatGA said:
How many logic transistors does PPC 970 have?
52 million.

But then:

52 million/118 mm2 = 0.441 mt/mm2 @ .13 um

At .065 um 1+ mt/mm2 would seem within reach. (280 million transistors for a 280 mm2 chip).

I would expect PS3's CPU to weigh in at ~300 million transistors at the very most (probably between 250 and 300 million). I don't see them going anywhere over 300 million, however.
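
The density arithmetic above can be sketched as follows. The 52 million and 118 mm2 figures come from the post; the ideal-scaling rule (density grows with the square of the feature-size ratio) is an assumption, and real processes usually fall short of it.

```python
# Transistor density of the PPC 970 and an idealized shrink from 130 nm to 65 nm.
# The quadratic scaling rule is an assumed best case, not a process datasheet value.

def density_mt_per_mm2(transistors_millions: float, die_area_mm2: float) -> float:
    """Average density in millions of transistors per square millimetre."""
    return transistors_millions / die_area_mm2


def ideal_scaled_density(density: float, from_nm: float, to_nm: float) -> float:
    """Scale a density assuming area per transistor shrinks with feature size squared."""
    return density * (from_nm / to_nm) ** 2


ppc970 = density_mt_per_mm2(52, 118)            # whole chip, cache included
at_65nm = ideal_scaled_density(ppc970, 130, 65)  # ideal 4x density gain

print(f"PPC 970 @ 130 nm: {ppc970:.3f} mt/mm2")
print(f"Ideal   @  65 nm: {at_65nm:.3f} mt/mm2")
print(f"280 mm2 die at 1 mt/mm2: {280 * 1.0:.0f} M transistors")
```

Ideal scaling would give well above 1 mt/mm2, so the 1+ mt/mm2 figure reads as a deliberately conservative guess rather than a best case.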
 
PS3 is using some rather high density eDRAM, so that should boost the transistor count dramatically. Not to mention we have no idea how many layers it's going to be.
 
52 million/118 mm2 = 0.441 mt/mm2 @ .13 um

At .065 um 1+ mt/mm2 would seem within reach. (280 million transistors for a 280 mm2 chip).
PPC 970 has a 0.5 MB L2 cache worth 20 million transistors in addition to 96 KB of L1 cache worth 5~6 million. The pure logic transistor count is about half of 52 million. And with these 26 million transistors' worth of logic gates, IBM was able to cram in only 1 PPC core and 1 AltiVec unit. Do the math and things look bleak for Sony indeed.
 
...

kaigai09.jpg

See for yourself. 20 million worth of L2 SRAM transistors densely packed into an area less than one fifth of the overall die size, while the remaining 26 million logic transistors take up 4/5 of the die.
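
To put a number on that claim, here is a minimal sketch comparing the two densities. The transistor counts and the one-fifth area bound are taken from the post as given; the 118 mm2 die area is the figure quoted earlier in the thread, and the 5~6 million L1 transistors are simply left out, as they are in the post's 20/26 split.

```python
# SRAM versus logic density on the PPC 970, using the post's own figures.
# All inputs are claims from the thread, not measured values.

DIE_AREA_MM2 = 118.0        # die area quoted earlier in the thread
L2_TRANSISTORS_M = 20.0     # L2 SRAM transistors, millions
LOGIC_TRANSISTORS_M = 26.0  # "pure logic" transistors, millions
L2_AREA_FRACTION = 1 / 5    # upper bound on the L2 area claimed in the post

l2_area = DIE_AREA_MM2 * L2_AREA_FRACTION
logic_area = DIE_AREA_MM2 - l2_area

l2_density = L2_TRANSISTORS_M / l2_area
logic_density = LOGIC_TRANSISTORS_M / logic_area
ratio = l2_density / logic_density

print(f"L2 SRAM: {l2_density:.3f} mt/mm2 over {l2_area:.1f} mm2")
print(f"Logic:   {logic_density:.3f} mt/mm2 over {logic_area:.1f} mm2")
print(f"SRAM is about {ratio:.1f}x denser than logic")
```

If the L2 really occupies less than one fifth of the die, the ratio only goes up; if buffers and other SRAM arrays are scattered through the "logic" area, the true logic density is lower still.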
 
I don't know about you, but they seem to have a LOT of SRAM arrays there, not all cache of course. There is a lot of buffering that goes on in various areas, irrespective. I think it takes up a bit more room than that.

BTW, do you know what the L2 and L1 SRAM cells are exactly -- 6T or 4T?
 
Re: ...

You could use a frickin' MIPS core for the PE. You don't need a full Power4 derivative. This is becoming insane. And beside that, you're wrong.

How many more articles do I need to post that explicitly state the upper bound on logic counts on an SoC for the 90nm process (a true 90nm process) is approaching 100M? This is fact; I've already posted links to this from respectable sources in this thread. I mean, your so-called "estimate" will be met (or on the same level) by what nVidia and ATI do on a 130nm process.

You are dead wrong.
 
...

You could use a frickin' MIPS core for the PE.
I doubt IBM would be happy with that idea... The thing was designed at IBM's Austin center, of course IBM engineers are going to use what they are familiar with, a PPC core...

And beside that, you're wrong.
All talk and no proof.

How many more articles do I need to post that explicitly state the upper bound on logic counts on an SoC for the 90nm process (a true 90nm process) is approaching 100M?
Yeah, if you are fabricating 100 million SRAM gates. 100 million logic gates? Hell will freeze over before that happens.

I mean, your so-called "estimate" will be met (or on the same level) by what nVidia and ATI do on a 130nm process.
A CPU is not a GPU.

You are dead wrong.
We will see about that.
 
DeadmeatGA said:
PPC 970 has a 0.5 MB L2 cache worth 20 million transistors in addition to 96 KB of L1 cache worth 5~6 million. The pure logic transistor count is about half of 52 million. And with these 26 million transistors' worth of logic gates, IBM was able to cram in only 1 PPC core and 1 AltiVec unit. Do the math and things look bleak for Sony indeed.

Ahem...

DeadmeatGA said:
How many logic transistors does PPC 970 have?
52 million.

I was just working with the number you gave me. BTW, why do you think .5 MB of L2 is worth 20 million transistors?
 
Deadmeat said:
A CPU is not a GPU.
So what you're saying is that identical piece of circuit (say, a VU) magically takes less die space if you put it on a chip and call it "GPU" as opposed to naming the same chip "CPU" ? :LOL:
 
Assuming 4T SRAM...

KB to B to b conversion

512 * 1024 * 8 = 4194304

That's how many cells you'll have, so that * 4 gives you 16,777,216 transistors. Not including any of the control logic.

With 6T SRAM that number grows to 25,165,824.

The latter is faster, but that would likely blow the power budget. The former should be plenty.
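
The cell arithmetic above can be reproduced with a short sketch. The function name is made up for illustration, and like the post it counts storage cells only, with no tags or control logic.

```python
# Storage-cell transistor count for a 512 KB cache with 4T or 6T SRAM cells.
# sram_cell_transistors is a hypothetical helper, named for this example only.

def sram_cell_transistors(size_kib: int, transistors_per_cell: int) -> int:
    """KB -> bytes -> bits, then one cell per bit."""
    bits = size_kib * 1024 * 8
    return bits * transistors_per_cell


four_t = sram_cell_transistors(512, 4)
six_t = sram_cell_transistors(512, 6)

print(f"4T: {four_t:,} transistors")  # 16,777,216
print(f"6T: {six_t:,} transistors")   # 25,165,824
```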
 
Panajev2001a said:
Fafalada said:
Deadmeat said:
A CPU is not a GPU.
So what you're saying is that identical piece of circuit (say, a VU) magically takes less die space if you put it on a chip and call it "GPU" as opposed to naming the same chip "CPU" ? :LOL:

Yeah, didn't "you" know that ? ;)

:lol


guys!! where have u been for the last 10 years?!?!?! :LOL: :LOL: :LOL:
 
...

Faf

So what you're saying is that identical piece of circuit (say, a VU) magically takes less die space if you put it on a chip and call it "GPU" as opposed to naming the same chip "CPU" ?
Vertex shaders were simpler because they didn't have branches (don't know about VS3s). The VU does have branches and in fact has a 16-bit CPU controlling a 128-bit vector FPU.

Saem

Assuming 4T SRAM... KB to B to b conversion

512 * 1024 * 8 = 4194304 That's how many cells you'll have, so that * 4 gives you 16,777,216 transistors. Not including any of the control logic.
You have to add the cache line tags (4 bytes per 128-byte cache line?) and control logic. The "tax" is about 10%~15% on top of that number.
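
A minimal sketch of that "tax", for anyone who wants to check it. The 4-byte tag and 128-byte line are the guesses from the post (both carry a question mark there), and applying the 10%~15% as a flat multiplier on the cell count is an assumption about how the estimate was meant.

```python
# Tag and control overhead on top of the raw 512 KB cell count.
# LINE_BYTES and TAG_BYTES are the post's guesses, not datasheet values.

CACHE_BYTES = 512 * 1024
LINE_BYTES = 128
TAG_BYTES = 4

lines = CACHE_BYTES // LINE_BYTES    # number of cache lines
tag_bits = lines * TAG_BYTES * 8     # extra bits needed for the tags
data_bits = CACHE_BYTES * 8
tag_fraction = tag_bits / data_bits  # share of overhead that is tags alone


def with_tax(cell_transistors: int, tax: float) -> int:
    """Apply a flat overhead (for example 0.10 for 10%) to a cell count."""
    return round(cell_transistors * (1 + tax))


cells_4t = data_bits * 4  # 4T cells, as assumed earlier in the thread
low, high = with_tax(cells_4t, 0.10), with_tax(cells_4t, 0.15)

print(f"{lines} lines, tags add {tag_fraction:.1%} more bits")
print(f"4T with 10%-15% tax: {low / 1e6:.1f} M to {high / 1e6:.1f} M transistors")
```

With these guesses the tags account for only about 3% of the extra bits, so most of the 10%~15% would have to be control logic. The taxed 4T figure lands a little under the 20 million quoted for the L2.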
 
Yeah, I didn't bother, because I figured the margin of error wouldn't be that big and really, it isn't, IMO.

The cache lines are 128B?
 
Deadmeat said:
Vertex shaders were simpler because they didn't have branches (don't know about VS3s). The VU does have branches and in fact has a 16-bit CPU controlling a 128-bit vector FPU.
While the latter is true, I wouldn't be so quick to dismiss VS as being lower on transistor budget. AFAIK the little things have pipelines over 100 stages long (as opposed to 4 stages in the VU), coupled with the necessary logic that abstracts any scheduling issues from the programmer - something that is done entirely by the compiler (or the programmer) on the VU.

But that's really not important anyway - we are talking about graphics processors of the future which surely won't be taking a step back ;) So my point from the previous post stands either way.
 
I saw some die shots of GF3/4 in the preview articles many sites published a couple years back. The VS(es) took up a pretty hefty chunk of space, that's for sure, and those were VS1.1 shaders, thus rather limited compared to EE VUs.


*G*
 