IBM, Sony and Toshiba extend the chip alliance toward 32nm

xbdestroya said:
That's a pretty good way of looking at it, didn't think of that. But PS3 was targeting 65nm a couple of years ago before they realized they'd have to ship at 90nm, so still possible we might see PS4 at 45nm.

I would think so too. They're doing it now, right at the end of its crossover period.
As for the topic, I think it would be corporate suicide not to follow the Cell "way", but the first things that come to mind are ISA additions.
 
london-boy said:
What are the manufacturers going to do when they reach silicon's smallest possible process (can't remember what size it actually is)?

Still haven't heard of any commercially viable alternatives, but they must be preparing themselves for the jump 'cause it's not TOO far away... Unless I got my maths completely wrong, which is highly possible.

The alternatives that come to mind so far are nanotubes (a nanotube transistor has already been made) and entanglement in a quantum computer. I'm not sure how long it will take for these to be viable, though. Could take 50 years.

Latest quantum computer development I've heard of: http://www.newscientist.com/article.ns?id=mg18424733.100

Latest on the carbon nanotube transistor: http://www.newscientist.com/channel/mech-tech/dn7847
 
Not true. As has been said by MS, they'd have chosen a 10 GHz single core over XeCPU any day. The reason for going multicore is clockspeeds have hit a limit on current complexity, so the only way to get more power is to add more cores (assuming you're already efficient with your single-core architecture).

The problem is you can't make something both complex and fast; that's what went wrong with the P4, which tried to do both. The problem has hit everyone now, though.
The only people now really pushing clock speed are IBM: POWER6 is expected at over 4GHz, but it's not as complex as POWER5.

Thus multicore has no limits in that you can keep increasing the number of cores, whereas single core is limited by fabrication techniques and clockspeeds and hits a peak-performance bottleneck that multicore avoids.

Multicore does have limits, just different ones.
To double the number of cores you have to more than halve the power consumption per core. That's not easy.
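
To put rough numbers on the power point, here is a minimal sketch. It assumes a fixed chip power budget split evenly between identical cores, dynamic power proportional to C * V^2 * f, and (optionally) voltage that can be lowered in step with the clock. The 80 W budget and both function names are made up for illustration, and interconnect overhead is ignored, which is part of why the real cut has to be more than half.

```python
# Rough sketch: a fixed chip power budget shared equally by identical cores.
# Assumption: dynamic power ~ C * V^2 * f; leakage and interconnect ignored.

def per_core_budget(chip_watts, cores):
    """Power available to each core if the budget is split evenly."""
    return chip_watts / cores

def clock_factor_for_power(power_factor, voltage_tracks_clock=True):
    """Clock multiplier needed to reach a given per-core power multiplier.

    If voltage can drop in step with the clock (an assumption), power goes
    roughly with f^3; at fixed voltage it goes with f.
    """
    exponent = 3 if voltage_tracks_clock else 1
    return power_factor ** (1 / exponent)

budget = 80.0  # watts, hypothetical figure
for cores in (2, 4, 8):
    print(cores, "cores:", per_core_budget(budget, cores), "W each")

# Clock multiplier needed to halve per-core power, with and without
# voltage scaling.
print(clock_factor_for_power(0.5))
print(clock_factor_for_power(0.5, voltage_tracks_clock=False))
```

Under these assumptions, halving per-core power costs about a fifth of the clock if voltage can come down too, and half the clock if it cannot.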

Then there's connecting all the cores together and keeping the caches consistent; this adds complexity, which limits the clock speed. STI have been very clever here in that the SPEs' local stores do not need to be consistent with each other, which makes them much easier to scale.

A huge problem is software: very little is written for multiple cores.
Best of all is Amdahl's law, which poses fundamental limits on the speedups you can get by going parallel...
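
For anyone who hasn't run the numbers on Amdahl's law, a quick sketch follows. The 90% parallel fraction is just an example value, not a measurement of any real workload, and the function names are mine.

```python
# Amdahl's law: speedup = 1 / ((1 - p) + p / n)
# p = fraction of the work that can run in parallel, n = number of cores.

def amdahl_speedup(parallel_fraction, cores):
    """Speedup over one core when only part of the work parallelises."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

def amdahl_limit(parallel_fraction):
    """Upper bound on speedup as the core count goes to infinity."""
    return 1.0 / (1.0 - parallel_fraction)

p = 0.9  # example value only
for n in (2, 8, 64, 4000):
    print(n, "cores:", round(amdahl_speedup(p, n), 2))
print("limit:", round(amdahl_limit(p), 2))
```

With 90% of the work parallel, the speedup can never pass 10x however many cores you add, because the serial 10% sets the floor on run time.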

That said, someone did a chip with 4000 cores a couple of years back...

--

Out of personal interest - where would you like to see the current design go, assuming some 5 times greater transistor budget?

At 32nm that's 8X more transistors.
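
The 8X figure works out if you assume ideal area scaling, where transistor density goes with the inverse square of the feature size, and take 90nm as the baseline. Both of those are assumptions for this sketch; real processes rarely scale that cleanly, and the function name is made up.

```python
# Ideal-scaling sketch: same die area, density ~ 1 / (feature size)^2.

def density_gain(old_nm, new_nm):
    """Transistor-count multiplier for the same die area under ideal scaling."""
    return (old_nm / new_nm) ** 2

for old in (90, 65, 45):
    print(f"{old}nm -> 32nm: {density_gain(old, 32):.2f}x")
```

From 90nm that gives a little under 8x, from 65nm about 4x, and from 45nm about 2x.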

I reckon Cell will break into branches with high-end and low-end versions. I'd like to see a high-end version with better double precision support and a POWER6 core instead of the PPE. A bigger LS would always be useful as well.

They could add things like branch prediction and hardware threading, but these may slow it down in some areas. They already have SPE software threading in the works (8 threads per SPE); I'm very curious to see how well it does on the stuff Niagara does well on.
 
ADEX said:
I reckon Cell will break into branches with high-end and low-end versions. I'd like to see a high-end version with better double precision support and a POWER6 core instead of the PPE. A bigger LS would always be useful as well.

They could add things like branch prediction and hardware threading, but these may slow it down in some areas. They already have SPE software threading in the works (8 threads per SPE); I'm very curious to see how well it does on the stuff Niagara does well on.

Would it be of benefit, or even needed/possible, to add L1 cache on top of the LS and have the theoretical beefier L2 cache on the "PPE2" with lockable blocks to the "SPEs2" cache/LS?
 