CELL architect speaks on future chip design and fabrication

Deadmeat

Banned
As scaling slows, designers must pick up slack

By David Lammers

EE Times
February 2, 2004 (4:51 p.m. ET)

AUSTIN, Texas — As performance gains become harder to wring out of CMOS scaling, the chip industry increasingly will need to create more powerful design techniques to keep the chip industry on a growth path, said IBM fellow Jim Kahle during a keynote speech here at the 2004 Tau Workshop Monday (Feb. 2nd).

Kahle, who is working on the "Cell" processor design team that includes engineers from IBM Corp.'s microelectronics division, Sony Corp., and Toshiba Corp., said "CMOS is running out of steam," with interconnect delays and leakage power becoming more challenging.

"The biggest limit going forward is the human mind. How do we exploit the parallelism available to us?" Kahle said at the workshop, which focuses on timing issues. The two-day meeting, with about 75 attendees this year, is organized by the IEEE and the Association for Computing Machinery as the International Workshop on Timing Issues in the Specification and Synthesis of Digital Systems.

Process technology still has some tricks up its sleeve, such as silicon-on-insulator, strained silicon, multi-gate devices, and other technology advances. But overall, the 70-percent shrinks that bulk CMOS devices undergo every two or three years no longer deliver commensurate performance improvements. Metal interconnects pose a particular challenge, as copper and low-k dielectrics will not improve fast enough to keep wiring delays balanced with transistor delays, Kahle said.

To keep the chip industry healthy, companies must come up with new ways to add value that pick up the slack in bulk CMOS scaling. Design tools must be developed that are able to handle variability stemming from multiple cores, multiple threshold voltages, more complex clock distribution and wire delay models, as well as shifts in temperature and droops in the supply voltage.

IBM is working the challenge presented by these variables, which become more impactful as devices shrink. Chandu Visweswariah, a member of the technical staff at IBM's T.J. Watson Research Center, and Kerim Kalafala, an EDA engineer at IBM Microelectronics, presented work on "EinsStat," a tool under development that will work with IBM's timing analysis tool "EinsTimer."

EinsStat uses statistical techniques, rather than the traditional corner-based static timing methodology, to measure on-chip variabilities. While statistical analysis has been under study for many years, previous attempts have resulted in overly long simulation times. IBM was able to test variables on a 2.1-million-gate ASIC design in one hour and 10 minutes, plus setup time. A 3,000-gate design required only 18 minutes of CPU time, said Visweswariah.
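
To make the corner-based versus statistical distinction concrete, here is a minimal sketch. It is not EinsStat's algorithm (the article gives no details of that); it only assumes a single path of identical gates whose delays are independent Gaussians, which is a simplification since real on-chip variation is partly correlated. The function names and the numbers are made up for illustration.

```python
import math

def corner_delay(n_gates, mean_ps, sigma_ps, k=3.0):
    """Corner-based bound: every gate is assumed k sigma slow at the same time."""
    return n_gates * (mean_ps + k * sigma_ps)

def statistical_delay(n_gates, mean_ps, sigma_ps, k=3.0):
    """k-sigma bound on the path total, assuming independent Gaussian gate delays.

    Means add linearly, but standard deviations add in quadrature,
    so the spread of the sum grows with sqrt(n_gates), not n_gates.
    """
    path_mean = n_gates * mean_ps
    path_sigma = sigma_ps * math.sqrt(n_gates)
    return path_mean + k * path_sigma

if __name__ == "__main__":
    # Hypothetical path: 16 gates, 50 ps nominal delay, 5 ps sigma per gate.
    n, mean, sigma = 16, 50.0, 5.0
    print("corner bound (ps):     ", corner_delay(n, mean, sigma))
    print("statistical bound (ps):", statistical_delay(n, mean, sigma))
```

Under these assumptions the corner bound is noticeably more pessimistic than the statistical one, which is the kind of margin a statistical timer would try to recover.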

Kalafala said the project requires cooperation among process, EDA, and circuit modeling engineers, who must create a design methodology that makes sense to circuit and chip designers.

"This kind of project is not an area where you can snap your fingers and make it happen. It requires a lot of different groups to work together, which is something that IBM can do because we have all of these different kinds of expertise under one roof," he said.

Last week, IBM announced that it would merge its microelectronics and server divisions. Visweswariah said the "EinsStat" tool will prove useful initially in developing microprocessors used by the IBM server division. He declined to speculate when EinsStat would come into use within IBM.
Kahle isn't directly speaking about CELL here, but about the design and fabrication technology to be used on CELL. What he implies here is that the days of automatic clockspeed improvement and power consumption reduction from moving to smaller geometry are over, as CMOS is running out of steam.

CELL@65 nm won't be particularly better than 90 nm devices in either clockspeed or power consumption, only smaller. So the forthcoming 90 nm devices will give us a good idea of CELL@65 nm's clockspeed and power consumption level.

Edit. I changed the title because Kahle is the chief architect of CELL, so what he is saying right now about sub 90 nm fab technology has direct implications for CELL....
 
...

What Kahle said about CELL in the past

IBM's Cell is a great example. In Austin, Tex. 250 IBM engineers and 50 engineers from Toshiba and Sony are finishing the blueprints for a chip designed to handle floods of multimedia data in a wide variety of devices. The first intended use is Sony's next generation PlayStation, which executives describe as a "broadband game network."

The project got its start in early 2000, when Sony Chairman Nobuyuki Idei called IBM Chairman Louis V. Gerstner Jr. Sony wanted the PlayStation to be the gateway into a world of broadband multimedia, Idei told Gerstner. Would IBM design a new chip for it? Make it happen, Gerstner told his team.

The project has been run in deep secrecy for a year. James Kahle, IBM's lead architect for the chip, says the Cell design borrowed from IBM's Power4 chip (used in high-power servers) but evolved differently. "We started with a blank sheet of paper and asked, 'How do people interact with a machine?'" A hint: They want to talk to it rather than pound on a keyboard, and they demand undivided attention.

Although Kelly and Kahle are coy about the details, Cell will be a parallel processor on a chip, with multiple components that can be reprogrammed--on the fly--to handle any task. Make a wireless call and that component swings into action; switch to a videogame and the graphics piece takes over. By contrast, in Intel's system-on-a-chip approach, the XScale core acts like an orchestra conductor, surrounded by different components set to handle pre-defined tasks.

Both approaches have tradeoffs. Intel's components may handle their jobs more efficiently but also spend more time idle. IBM's Kahle is betting that Cell will better adapt to surges of activity and could be "self-healing" by rerouting tasks if a part of the system gets overloaded.
Cell based on Power4??? XCPU2 based on Power5, which itself is an enhanced Power4? Perhaps PSX3 and XBox Next aren't all that different after all...
 
Re: CELL architect speaks on future chip design and fabricat

Deadmeat said:
Kahle isn't directly speaking about CELL here, but about the design and fabrication technology to be used on CELL. What he implies here is that the days of automatic clockspeed improvement and power consumption reduction from moving to smaller geometry are over, as CMOS is running out of steam.

CELL@65 nm won't be particularly better than 90 nm devices in either clockspeed or power consumption, only smaller. So the forthcoming 90 nm devices will give us a good idea of CELL@65 nm's clockspeed and power consumption level.

Edit. I changed the title because Kahle is the chief architect of CELL, so what he is saying right now about sub 90 nm fab technology has direct implications for CELL....

WTF! This is becoming out of control. You're a little troll who posts out of sheer bias and seemingly lives his life to do nothing but bash a company in every way possible, including those which aren't even implied.

Just for your information, what he's talking about wrt CMOS scaling is that you can't just die shrink an architecture and keep getting the performance yields you once did; instead you need to be intelligent and turn to alternative architectures - like Cell, which is massively parallel. This then requires more intelligent ways to actually lay out the physical IC - which needs to be automated and concurrently processed; ergo EinsTimer.

To state that Cell@65 is no better than Cell@90nm is beyond retarded; in fact, I can't think of anything to describe how pathetic this has become.

Dave, I know we've had our differences and misunderstandings, but when is enough enough?
 
Re: CELL architect speaks on future chip design and fabricat

Deadmeat said:
CELL@65 nm won't be particularly better than 90 nm devices in either clockspeed or power consumption, only smaller.

I think being smaller is the important part here, as Cell will need lots of transistors, and presumably would be too big @ 90nm (otherwise, why would they go with 65nm, if 90nm would work just as well?).
 
"Dave, I know we've had our differences and misunderstandings, but when is enough enough?"

Lol......pot, meet kettle. Dave should have banned your ass a long time ago too. Guess he's too nice a chap though.
 
Re: ...

Deadmeat said:
What Kahle said about CELL in the past

IBM's Cell is a great example. In Austin, Tex. 250 IBM engineers and 50 engineers from Toshiba and Sony are finishing the blueprints for a chip designed to handle floods of multimedia data in a wide variety of devices. The first intended use is Sony's next generation PlayStation, which executives describe as a "broadband game network."

The project got its start in early 2000, when Sony Chairman Nobuyuki Idei called IBM Chairman Louis V. Gerstner Jr. Sony wanted the PlayStation to be the gateway into a world of broadband multimedia, Idei told Gerstner. Would IBM design a new chip for it? Make it happen, Gerstner told his team.

The project has been run in deep secrecy for a year. James Kahle, IBM's lead architect for the chip, says the Cell design borrowed from IBM's Power4 chip (used in high-power servers) but evolved differently. "We started with a blank sheet of paper and asked, 'How do people interact with a machine?'" A hint: They want to talk to it rather than pound on a keyboard, and they demand undivided attention.

Although Kelly and Kahle are coy about the details, Cell will be a parallel processor on a chip, with multiple components that can be reprogrammed--on the fly--to handle any task. Make a wireless call and that component swings into action; switch to a videogame and the graphics piece takes over. By contrast, in Intel's system-on-a-chip approach, the XScale core acts like an orchestra conductor, surrounded by different components set to handle pre-defined tasks.

Both approaches have tradeoffs. Intel's components may handle their jobs more efficiently but also spend more time idle. IBM's Kahle is betting that Cell will better adapt to surges of activity and could be "self-healing" by rerouting tasks if a part of the system gets overloaded.
Cell based on Power4??? XCPU2 based on Power5, which itself is an enhanced Power4? Perhaps PSX3 and XBox Next aren't all that different after all...

Selective reading, my friend...

"We started with a blank sheet of paper and asked, 'How do people interact with a machine?'"
 
Re: CELL architect speaks on future chip design and fabricat

Vince said:
Deadmeat said:
Kahle isn't directly speaking about CELL here, but about the design and fabrication technology to be used on CELL. What he implies here is that the days of automatic clockspeed improvement and power consumption reduction from moving to smaller geometry are over, as CMOS is running out of steam.

CELL@65 nm won't be particularly better than 90 nm devices in either clockspeed or power consumption, only smaller. So the forthcoming 90 nm devices will give us a good idea of CELL@65 nm's clockspeed and power consumption level.

Edit. I changed the title because Kahle is the chief architect of CELL, so what he is saying right now about sub 90 nm fab technology has direct implications for CELL....

WTF! This is becoming out of control. You're a little troll who posts out of sheer bias and seemingly lives his life to do nothing but bash a company in every way possible, including those which aren't even implied.

Just for your information, what he's talking about wrt CMOS scaling is that you can't just die shrink an architecture and keep getting the performance yields you once did; instead you need to be intelligent and turn to alternative architectures - like Cell, which is massively parallel. This then requires more intelligent ways to actually lay out the physical IC - which needs to be automated and concurrently processed; ergo EinsTimer.

To state that Cell@65 is no better than Cell@90nm is beyond retarded; in fact, I can't think of anything to describe how pathetic this has become.

Dave, I know we've had our differences and misunderstandings, but when is enough enough?

Even if ( we cannot assume that their 65 nm and 45 nm processes are not also faster ) CELL@65 nm is only smaller ( and not much faster in clock-speed ), I would call it a win in a parallel design such as CELL.

More Transistors = more APUs, more DRAM, etc...
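
As a rough illustration of the "more transistors" point, here is the idealized arithmetic, assuming transistor density scales with the inverse square of the feature size. That is a textbook simplification, not a figure from IBM or Sony; real processes rarely hit the ideal, and the function names are made up for this sketch.

```python
def ideal_density_gain(old_nm, new_nm):
    """Ideal gain in transistors per unit area when shrinking old_nm -> new_nm."""
    return (old_nm / new_nm) ** 2

def ideal_area_factor(linear_shrink):
    """Ideal die-area factor for a given linear shrink (0.7 means a 70% shrink)."""
    return linear_shrink ** 2

if __name__ == "__main__":
    # 90 nm -> 65 nm: roughly 1.9x the transistors in the same area, at best.
    print(f"90 -> 65 nm density gain: {ideal_density_gain(90, 65):.2f}x")
    # The article's "70-percent shrink": the same design in about half the area.
    print(f"0.7x linear shrink area:  {ideal_area_factor(0.7):.2f}x")
```

So even with zero clockspeed gain, the same die area could in principle hold nearly twice the logic or memory.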
 
Even if ( we cannot assume that their 65 nm and 45 nm processes are not also faster ) CELL@65 nm is only smaller ( and not much faster in clock-speed ), I would call it a win in a parallel design such as CELL.

Sure, from the vendor's POV, smaller == cheaper to churn out == good cost/performance ratio. But from a performance standpoint I can't see how this is inherently 'good'.
 
...

Just for your information, what he's talking about wrt CMOS scaling is that you can't just die shrink an architecture and keep getting the performance yields you once did;
Exactly. Kahle is stating that the days of automatic performance improvement from overclocking are coming to an end. Why won't the new generation of chips overclock? Because power consumption won't go down with each die shrink anymore.

Expect CELL@65 nm to have similar power consumption and clockspeed properties to present 90 nm devices. In other words, 65 nm is no magic warp drive that ends all engineering troubles and takes us to lightspeed and beyond; it is bound by the same restrictions as the current generation of 90 nm devices, meaning stalled clockspeed improvement, 50~100 W power consumption levels, etc.
 
Hmm, I know Intel for one is still aiming to aggressively ramp clockspeed. We haven't hit the "wall" yet...
 
Expect CELL@65 nm to have similar power consumption and clockspeed properties to present 90 nm devices.

And how the hell did you make that leap without any figures relating to your argument? The man is clearly talking about the trend and limits of the technology. NOWHERE DOES HE EXPLICITLY STATE OTHERWISE.

Are you really expecting people to look past the gaping holes in your reasoning, or are you just being a cheese humper?
 
Re: ...

Deadmeat said:
Just for your information, what he's talking about wrt CMOS scaling is that you can't just die shrink an architecture and keep getting the performance yields you once did;
Exactly. Kahle is stating that the days of automatic performance improvement from overclocking are coming to an end. Why won't the new generation of chips overclock? Because power consumption won't go down with each die shrink anymore.

Expect CELL@65 nm to have similar power consumption and clockspeed properties to present 90 nm devices. In other words, 65 nm is no magic warp drive that ends all engineering troubles and takes us to lightspeed and beyond; it is bound by the same restrictions as the current generation of 90 nm devices, meaning stalled clockspeed improvement, 50~100 W power consumption levels, etc.

From current SOI devices at 90 nm, maybe... neither Prescott nor products based on CMOS4 will show you anything useful.

IMHO, I cannot wait until CMOS6 hits with capacitor-less e-DRAM.
 
notAFanB said:
Even if ( we cannot assume that their 65 nm and 45 nm processes are not also faster ) CELL@65 nm is only smaller ( and not much faster in clock-speed ), I would call it a win in a parallel design such as CELL.

Sure, from the vendor's POV, smaller == cheaper to churn out == good cost/performance ratio. But from a performance standpoint I can't see how this is inherently 'good'.

Smaller or same size with many more transistors.

CELL plays the brainiac game, in some ways related to the same bet Intel made with IPF.

Look at Prescott: so many transistors ( SMT/Hyper-Threading, ultra-advanced branch prediction, ultra-complex out-of-order scheduling and instruction re-ordering [lots of instructions in flight], etc... ), such high power consumption, a 31-stage pipeline... all to keep fed a design that takes the narrow-and-deep philosophy of CPU design as its source of inspiration.

Memory is also not keeping up very well so you need more and more humongous caches ( and you have to make them as low latency as possible ) to keep the CPU fed.

The current CPUs are designs optimized for low-ILP code that spend a gazillion transistors trying to evolve and encompass more and more scenarios ( including ones like 3D graphics, physics, etc... which show a great deal of parallelism ).

The problem is that they are going after those "extra" scenarios with the same philosophy that carried them this far, optimizing even harder for "extract ILP from code that is not very friendly to parallel processing".

This has also left us with processors that normal desktop PC users ( who do not play video games ) do not need.

IPF will make a big comeback in the PC space: with the IA-32 EL, the IA-64 cores can shed all the x86 hardware compatibility logic and spend the extra budget on things such as multiple cores.
 
...

From current SOI devices at 90 nm, maybe... neither Prescott nor products based on CMOS4 will show you anything useful.
1. CMOS4 is not a real 90 nm process.
2. All the fab leaders with "real" 90 nm processes are having trouble with yield and power consumption. Traditionally, the power consumption level went down with each fab generation, but the opposite is happening with the 90 nm process generation. (Desktop CPUs now threaten to exceed 100 W.)

Hmm, I know Intel for one is still aiming to aggressively ramp clockspeed. We haven't hit the "wall" yet...
Unlike Intel, SCEI is bound to a single clockspeed, the one chosen at the time of PSX3's launch, and will stay with that clockspeed until the death of PSX3. So whatever clockspeed SCEI obtains from the first 65 nm devices has great implications.
 
2. All the fab leaders with "real" 90 nm processes are having trouble with yield and power consumption. Traditionally, the power consumption level went down with each fab generation, but the opposite is happening with the 90 nm process generation. (Desktop CPUs now threaten to exceed 100 W.)

Depends where you look.

While Intel continues to have problems with the power consumed by its 90nm 'Prescott' processor - 100W at around 3.2GHz - IBM's own documentation claims the 90nm 970 eats 24.5W at 2GHz. By comparison, the 130nm 970, currently used by Apple in its Power Mac G5 desktop line, consumes 51W at 1.8GHz.
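
For a quick side-by-side, here are the figures quoted above worked out as watts per GHz. This is a crude metric that ignores architecture, voltage, and workload differences, so treat it as a sanity check on the quoted numbers rather than a real efficiency comparison. The dictionary labels and function name are just for this sketch.

```python
# (power in watts, clock in GHz), taken from the figures quoted above
chips = {
    "Prescott, 90 nm": (100.0, 3.2),
    "970, 90 nm": (24.5, 2.0),
    "970, 130 nm": (51.0, 1.8),
}

def watts_per_ghz(power_w, clock_ghz):
    """Crude power-per-clock figure; ignores architecture and voltage."""
    return power_w / clock_ghz

if __name__ == "__main__":
    for name, (power_w, clock_ghz) in chips.items():
        print(f"{name}: {watts_per_ghz(power_w, clock_ghz):.2f} W/GHz")
```

By these numbers the 970 drops from about 28 W/GHz at 130 nm to about 12 W/GHz at 90 nm, while Prescott sits at about 31 W/GHz.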
 