Panajev2001a
Veteran
Re: ...
DeadmeatGA said:
Well, I got the die size data of CS301.
For CELL fans, don't cheer yet, because CS301 is a SIMD processor: only one instruction decoder, one control unit and one instruction cache shared among 64 FPUs, and it is not heavily pipelined to support higher clocks. EE3, on the other hand, is an 18-way MIMD (2 PPC cores + 16 active VUs + 2 spare VUs) plus 2 MB of SRAM cache, so it will no doubt be massive in die size and give a poor yield, plus a programming model that makes CS301 programming look like kiddy stuff.
41 million transistors take up 72 square mm using an IBM 0.13 silicon-on-insulator process.
Sorry, Kutaragi's dream of 1 TFLOPS per chip still has to wait until 2007 or 2008 and will still cost a bucketload of cash to afford....
Even keeping their current implementation, but going for a 280 mm^2 die size and a 65 nm manufacturing process ( which would produce a total die shrink to about 1/4th of the total chip size in 180 nm: the EE and GS in 180 nm took ~4.35x more space than the EE and GS combined using 90 nm technology, so I am not going for the best case scenario ), they should be able to fit 15-16 of those chips in a single die.
This would mean ~400 GFLOPS at 200 MHz.
Bump the clock-speed of the chip to 500 MHz and you get 1 TFLOPS.
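To make the "15-16 chips" figure easy to check, here is a rough sketch of the area arithmetic. It takes the 72 mm^2 CS301 die from the quoted post and the 280 mm^2 budget from above; the function name and the idea of ignoring pads, interconnect and yield are my own simplifications, not anything from ClearSpeed or Sony.

```python
# Back-of-the-envelope area estimate (illustrative only).
CS301_AREA_MM2 = 72.0   # from the quoted post: 41M transistors on IBM 0.13 SOI
TARGET_DIE_MM2 = 280.0  # die budget assumed in this post


def chips_per_die(area_shrink: float) -> float:
    """Number of shrunk CS301-sized chips that fit in the target die.

    area_shrink is the assumed area reduction factor (4.0 means each chip
    takes 1/4th of its original area). Pads, interconnect and yield are
    ignored, which is a deliberate simplification.
    """
    shrunk_area = CS301_AREA_MM2 / area_shrink
    return TARGET_DIE_MM2 / shrunk_area


# 4x area shrink: lands between 15 and 16 chips.
print(round(chips_per_die(4.0), 1))
```

The count scales linearly with the assumed shrink factor, so a more pessimistic shrink simply cuts the number of chips in proportion.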
You use F.U.D.-dy math, but that allows me to do the same.
You do not think 65 nm can make the processor around 4x smaller?
Fine, let's assume it makes it only 2x smaller.
We could then fit only 8 of those CPUs in a single 280 mm^2 die, obtaining 200 GFLOPS at 200 MHz.
Clock the puppy at 1 GHz and again you obtain 1 TFLOPS:
1 GHz = 5 * 200 MHz
5 * 200 GFLOPS = 1 TFLOPS
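Both scenarios can be run through the same quick sketch. The per-chip figure of 25 GFLOPS at 200 MHz is an assumption: it is simply what the numbers in this post imply ( ~400 GFLOPS / 16 chips, or 200 GFLOPS / 8 chips ), and peak throughput is assumed to scale linearly with clock speed. The names below are made up for illustration.

```python
# Peak-throughput scaling under the assumptions used in this post.
BASE_CLOCK_MHZ = 200.0
# Assumption: implied by the post's own figures (200 GFLOPS / 8 chips).
GFLOPS_PER_CHIP_AT_BASE = 25.0


def peak_gflops(chips: int, clock_mhz: float) -> float:
    """Aggregate peak GFLOPS, assuming linear scaling with chip count and clock."""
    return chips * GFLOPS_PER_CHIP_AT_BASE * (clock_mhz / BASE_CLOCK_MHZ)


scenarios = [
    (16, 200.0),   # 4x shrink, stock clock
    (16, 500.0),   # 4x shrink, 2.5x clock
    (8, 200.0),    # 2x shrink, stock clock
    (8, 1000.0),   # 2x shrink, 5x clock
]
for chips, clock in scenarios:
    print(f"{chips} chips at {clock:.0f} MHz: {peak_gflops(chips, clock):.0f} GFLOPS")
```

Either way you reach roughly 1 TFLOPS of peak throughput; the only thing that changes is how much of it comes from the chip count and how much from the clock.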
Brimstone, the IRAM idea also reminds me a bit of Mitsubishi's 3DRAM, but doesn't it remind you of CELL as well ? ( no pun intended with the rhyme ).
Embedding DRAM and logic ( CPU + embedded DRAM ) is something that has been present in IBM's R&D labs for quite a while ( Prof. Nair's papers are a good indication of this ).