Clearspeed announces CELL-like processor

Paul said:
That must be a design fallacy of STI's.

Must be.

At least from a purely marketing point of view. I can imagine people like Chap running around a year from now going,

"only 1ghz clock? xbox2 is 3ghz... :oops: :oops: :oops:"

Although I am truly doubting a 1 GHz clock...

never cared for clkspeed today, why should i tomorrow? ;)
neither do i care for floppies toppies, but more for the end results.




So after all the FLOP talk, are we supposed to expect Cell to render audio, video and entertainment beyond competing hardware at that time? Any guesses? Any visualisation of what a Cell console can do? The type of games it can create? The extra activities that it can power?

Anyone :?: :idea: :?:
 
Panajev2001a said:
I see them opting for this kind of configuration: 1 GHz for the buses ( or 500 MHz DDR signaling ), 1 GHz for the e-DRAM ( or 500 MHz DDR signaling ), 4 GHz for the Register file, 4 GHz for the FP/FX Units, 2-4 GHz for the SRAM-based LS ( 2 GHz would mean a bit more latency involved with LS reads and writes from the Register file: 1 LS clock would be 2 FP/FX Units clocks, so there would be an extra latency cycle for LOAD/STORE instructions which touch the LS, which is acceptable ) and 1-2 GHz for the PUs.

This way you see that we do not need the whole chip running at 4 GHz, and this has a direct impact on the chip's overall power consumption.

Having multiple clock domains on the chip is not impossible ( Intel has been doing it for years and they are not the only ones ) and is one of the tricks I can see the STI guys pulling off ( plus tons of other tricks I am not even thinking about at the moment ).

If you look at the current Pentium 4 clocked at 3.2 GHz, the fast ALUs, the fast AGUs, the Register file, etc. are already running with a physical clock of 6.4 GHz ( double-pumping is real and not a DDR trick ).

Only a certain portion of the CELL chip will run at 4 GHz in such a configuration, as I explained before.
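
A quick sketch of the half-speed-LS arithmetic described above, with purely illustrative numbers (nothing here is from an actual STI spec):

```c
/* Illustrative sketch of the clock-domain latency math described above.
 * All numbers are assumptions for the example, not STI specifications. */
#include <stdio.h>

int main(void)
{
    double fp_clock_ghz = 4.0;      /* assumed FP/FX execution-unit clock  */
    double ls_clock_ghz = 2.0;      /* assumed SRAM Local Storage clock    */
    int ls_access_in_ls_cycles = 1; /* assumed LS access cost, in LS cycles */

    /* One LS cycle spans this many execution-unit cycles (2 here). */
    double ratio = fp_clock_ghz / ls_clock_ghz;

    /* So a LOAD/STORE costing 1 LS cycle costs 2 FP/FX cycles: one extra
     * cycle of latency as seen by the execution units.                   */
    double cost_in_fp_cycles = ls_access_in_ls_cycles * ratio;

    printf("1 LS cycle = %.0f FP/FX cycles\n", ratio);
    printf("LS access as seen by the execution units: %.0f cycles (%.0f extra)\n",
           cost_in_fp_cycles, cost_in_fp_cycles - ls_access_in_ls_cycles);
    return 0;
}
```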

I read what you wrote, and I agreed w/ the tiered memory architecture, with different speeds for different "layers" of memory. I was referring to the core speed. I don't expect STI to get 1:1 with Intel on clock speed, even core clock speed.

Also, I dunno what the read times for eDRAM are, but 1 GHz seems awfully fast - 1 ns read time? Do you have links to this?

(BTW, Transmeta announced their new processor today - busy news day. I'll read up on that tomorrow)
 
chaphack said:

never cared for clkspeed today, why should i tomorrow? ;)
neither do i care for floppies toppies, but more for the end results.

erm...... yeah sure.


So after all the FLOP talk, are we supposed to expect Cell to render audio, video and entertainment beyond competing hardware at that time? Any guesses? Any visualisation of what a Cell console can do? The type of games it can create? The extra activities that it can power?

Anyone :?: :idea: :?:


yeah, let me pull out my future-goggles so i can show u what PS3 games will look like...

we'll have to wait chappy, i know it's hard, but u can start finding other things to do to avoid the thought....
 
...

Yes, why did it die?
PixelFusion couldn't make the damned thing perform. The hardware was completed, but the software couldn't be.

To be honest, the PixelFusion approach is fundamentally different from CELL; it is an SIMD machine with only a single instruction stream, while CELL is a 16-way (PSX3 version) MIMD vector machine. The PixelFusion architecture was many, many times simpler than CELL, yet its developers couldn't make it work. Now imagine the horror on developers' faces when they are shown the PSX3 development kit for the first time.
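
Roughly, the difference looks like this. A toy sketch in C, only an illustration of SIMD vs. MIMD in general, not PixelFusion's or CELL's actual programming model:

```c
/* Toy illustration of SIMD vs. MIMD, not either chip's real model. */
#include <pthread.h>
#include <stdio.h>

#define LANES 8

/* SIMD: one instruction stream, applied to every data lane in lockstep. */
static void simd_scale(float *data, int n, float k)
{
    for (int i = 0; i < n; i++)   /* conceptually, all lanes do this at once */
        data[i] *= k;
}

/* MIMD: each processing element runs its own, independent instruction stream. */
static void *pe_physics(void *arg)  { puts("PE 0: integrating physics");   return arg; }
static void *pe_audio(void *arg)    { puts("PE 1: mixing audio");          return arg; }
static void *pe_geometry(void *arg) { puts("PE 2: transforming geometry"); return arg; }

int main(void)
{
    float lanes[LANES] = {1, 2, 3, 4, 5, 6, 7, 8};
    simd_scale(lanes, LANES, 0.5f);                  /* SIMD: same op, many data */

    pthread_t t[3];
    pthread_create(&t[0], NULL, pe_physics,  NULL);  /* MIMD: different programs */
    pthread_create(&t[1], NULL, pe_audio,    NULL);  /* running concurrently on  */
    pthread_create(&t[2], NULL, pe_geometry, NULL);  /* separate elements        */
    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    return 0;
}
```

The pain point is that many independent instruction streams have to be fed, scheduled and synchronized, where a single SIMD stream does not.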

Making CELL work requires nothing less than SCEI delivering fully debugged and compilable engine source code on a CD, so that the developers don't have to touch the ugly hardware and can concentrate only on creating the art to feed into the engine.
 
Re: ...

DeadmeatGA said:
Making CELL work requires nothing less than SCEI delivering fully debugged and compilable engine source code on a CD, so that the developers don't have to touch the ugly hardware and can concentrate only on creating the art to feed into the engine.


yeah we knew THAT, Deadmeat..... :rolleyes:

of course, a "fully debugged" source code will never happen, at least AT LAUNCH, but maybe that's just me....

and i'm amazed at your choice of words... i mean "ugly" hardware.... meh... from all points of view, apart from "being a b|tch to code for", CELL seems to crap over everything else in the same price band and then some... of course, we will have to wait and see the final implementation, it's easy to talk about something that DOESN'T FRIKKING EXIST...
 
There are a lot of ways to tier the memory; exactly which permutation they'll choose, I dunno. You guys who have backgrounds in computer graphics would have a better idea of the memory/bandwidth requirements. Multiple clock domains aren't impossible, I agree.

The interesting thing about this Clearspeed chip is that it has the same kind of tiered memory scheme CELL has: 64-byte registers, 4 KB of memory per PE, a 128 SRAM scratchpad. And the fact that it is designed for massively parallel number crunching - just exciting stuff. I wouldn't mind having THAT for a coprocessor sitting next to my CPU on my motherboard...
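
To make the tiering concrete, a little C sketch of that kind of hierarchy - the 64-byte register file and 4 KB per-PE memory follow the description above; the PE count and scratchpad size are made-up placeholders:

```c
/* Sketch of a Clearspeed/CELL-style tiered memory layout.
 * NUM_PES and SCRATCHPAD_BYTES are made-up placeholders, not real figures. */
#include <stdint.h>

#define NUM_PES          64            /* hypothetical PE count            */
#define SCRATCHPAD_BYTES (128 * 1024)  /* hypothetical scratchpad size     */

struct processing_element {
    uint8_t registers[64];             /* fastest tier: register file      */
    uint8_t local_mem[4 * 1024];       /* next tier: private per-PE memory */
};

struct array_processor {
    struct processing_element pe[NUM_PES];
    uint8_t scratchpad[SCRATCHPAD_BYTES]; /* shared on-chip SRAM scratchpad */
    /* external DRAM sits one more tier out, off-chip */
};
```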


Or how about a Micron Yukon processor?




Data Bandwidth: Memory Should Take 'Pride of Place'
By Peter Coffee
October 28, 2002





What looks like an insurmountable problem may just be a well-disguised assumption. The problem of data bandwidth across the border of a microprocessor chip may just be a symptom of the assumption—that the so-called central processor has to be at the center of the machine, with memory as a peripheral. What would happen if memory took "pride of place," with processing power arrayed around its edges?

Micron Technology Inc.'s Yukon device, soon to cross the line from concept to prototype, answers this question—not with a replacement for the CPU but with an additional system resource of distributed processing power that can take full advantage of the 200G-bps bandwidth inside a synchronous dynamic RAM chip.

In applications such as image processing and data mining, a relatively small number of operations are executed against a huge volume of data. Simple processing elements, placed as close as possible to the data, can receive their instructions from a more general-purpose device and then turn their power loose without constantly fighting cross-chip bottlenecks.

At 200MHz, the Yukon prototype should deliver peak processing rates exceeding 50 billion eight-bit operations per second or sustained processing rates of more than 200 million double-precision floating-point operations per second, according to the Microprocessor Forum presentation by Micron's Graham Hirsch, chief architect of the company's Active Memory Program.

"Memory is not the problem," said Hirsch, at the forum in San Jose, Calif. "The bus is the problem. Picking up data, moving it and putting it down is the problem. The bandwidth inside a memory chip is very large. There should be some way of using that."

http://www.eweek.com/article2/0,3959,652912,00.asp



Graham Kirsch of Micron then described his work on massive parallelism on a single chip. His chip, code named Yukon, is essentially a memory chip with logic attached. He argued that the vast bandwidth of SDRAM facilitates parallel processing. The Yukon chip is a .15 micron, 128 Mbit eDRAM combined with .18 micron logic. The memory is mapped onto the microprocessor bus, and the chip operates at 200MHz. Yukon is capable of a peak 51.2 billion 8-bit operations per second, and a sustained rate of 190 MB/second input/output. Yukon will transition to .11 micron in 2003, and will scale up to large arrays of chips. It should be ideal for image and video processing, speech, and datamining

http://www.geek.com/news/geeknews/2002Oct/bch20021016016851.htm

The Yukon chip boasts 256 tiny, eight-bit processors operating in parallel, with 16 megabytes of memory all on one chip. And while it's not powerful enough to replace the main microprocessor in a computer, adding the ability to crunch numbers on the memory chip itself can save power and time on computing tasks that typically involve moving the data from memory, to the main processor and back again.

"Yukon is intended to be an enhancement to a system. We're not trying to replace the main processor in a system," Kirsch says. He adds that the chip would be particularly useful in voice-over-Internet Protocol hardware and also in video processing applications.

http://www.forbes.com/2002/10/17/1017micron.html
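
Those Yukon numbers check out with simple arithmetic, assuming one 8-bit op per PE per clock (an assumption - the articles only quote the totals):

```c
/* Back-of-the-envelope check of Yukon's quoted peak rate, assuming
 * one 8-bit operation per PE per clock (an assumption, not Micron's wording). */
#include <stdio.h>

int main(void)
{
    double pes      = 256;    /* eight-bit processing elements on the chip */
    double clock_hz = 200e6;  /* 200 MHz                                   */

    double peak_ops = pes * clock_hz;
    printf("Peak: %.1f billion 8-bit ops/s\n", peak_ops / 1e9); /* 51.2 */
    return 0;
}
```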

Micron, one of the world's largest DRAM companies, announced the "Yukon", an array of memory units designed to simulate compute processes.

Since microprocessors must pull data from a cache or local memory to act on it, minimizing memory latency and bandwidth have become key design targets. The thinking, executives said, was that if you can't bring the memory to the processor, bring the processor to the memory. Like the company's other efforts, Yukon's goal is to spur customers to buy more DRAM.

"Now you can use the memory for more than just storing (data)," said Graham Hirsch, chief architect of the active memory program at the company's U.K. design center. "Memory's not just a bucket any more. "It's a resource which has processing potential."



http://www.tomshardware.com/technews/20021017_062400.html



By the way, does anyone besides me think Micron is making fun of Rambus technology by calling its chip Yukon? Rambus named their technology Yellowstone, and Micron counters by calling their extreme-bandwidth technology solution Yukon. Micron's engineers seem to have a good sense of humor.
 
well i guess the technology for VERY VERY high bandwidth is already here in some form. i'm soooooo looking forward to seeing it implemented in the next gen hardware... forget for a minute all this Tflop and polygons crap; with all this bandwidth, and whatever performance next gen chipsets will deliver, devs will be able to make our brains melt down, be it graphics, sound or physics... i left gameplay out because i don't think bandwidth will necessarily improve gameplay, although i'm sure some crazy head will find some time to innovate a rather stagnant part of videogames of the last 10 years
 
I wish there was a way of putting 32 ATI R3XXs in a single system the size of a console, or at most the size of a desktop computer. That would provide more graphics power than I could ever want. You could pretty much do midrange CGI on such a machine, sans things like raytracing.


Oh yeah, I'll just wait until SGI brings out their highest-end Onyx4 UltimateVision system :)
 
The transistor count can't be too high, since it's only 3W.

The reason I think CELL can do 1 TFLOPS is that just by taking this Clearspeed chip and clocking it at 2 GHz (10x the clock speed), you get 250 GFLOPS. That's already pretty close. Surely STI can make a chip that does four times more ops per clock cycle than a startup out of the UK... and that brings us to 1 TFLOPS.
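
Spelled out, with rough numbers (the 4x ops-per-clock factor is just an assumption, as above):

```c
/* The scaling argument above, spelled out. Rough numbers, not a spec. */
#include <stdio.h>

int main(void)
{
    double base_gflops = 25.0;  /* implied by 250 GFLOPS at 10x the clock  */
    double clock_scale = 10.0;  /* 200 MHz -> 2 GHz                        */
    double width_scale = 4.0;   /* assumed: STI does 4x the ops per clock  */

    printf("%.0f GFLOPS -> %.0f GFLOPS (10x clock) -> %.0f GFLOPS (4x width)\n",
           base_gflops,
           base_gflops * clock_scale,
           base_gflops * clock_scale * width_scale); /* 25 -> 250 -> 1000 */
    return 0;
}
```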

Obviously you don't believe it, so I'm not trying to convince you, just showing you where I'm coming from for these claims.


I see what you are saying here, and you make an excellent point.

hey,
I wonder which chip has the highest FLOPS-per-transistor ratio....
 
By the way, does anyone besides me think Micron is making fun of Rambus technology by calling its chip Yukon? Rambus named their technology Yellowstone, and Micron counters by calling their extreme-bandwidth technology solution Yukon. Micron's engineers seem to have a good sense of humor.

Probably :LOL:
 
Any guesses? Any visualisation of what a Cell console can do? The type of games it can create? The extra activities that it can power?

I've had estimates and diagrams of what a PS3 based on Cell would be able to do, taking into account the leap in graphics quality for the same type of game from PSone to PS2 to PS3.
 
I wish there was a way of putting 32 ATI R3XXs in a single system the size of a console, or at most the size of a desktop computer. That would provide more graphics power than I could ever want. You could pretty much do midrange CGI on such a machine, sans things like raytracing.

Hmmm... we can still hope next gen consoles... can deliver... such perf... or more...
 
Micron's feelings towards RAMBUS can probably be summed up with this Moby Dick quote.

"To the last I grapple with thee; from hell's heart I stab at thee; for hates sake I spit my last breath at thee."
 
Re: ...

Josiah said:
The unification of shaders has to do with sharing of hardware resources, something Cell does not do.

:?: How does Cell not "share resources"? You can run anything from physics to basically any shader program on it. It's the exact embodiment of what, at a high level, we'll one day see PC ICs doing.

To say that a Unified Shader is unlike what you can do with Cell is like saying that Icing doesn't work on Cake.

Cell can be compared to a giant FPGA controlled by software. The chip itself does very little inherently, it's just a fast platform for the applets ("cells") to run on.

Ok, now this is a very obtuse comment IMHO. So, by this very logic, the EmotionEngine (which is clearly a "father" to Cell) is also like a giant FPGA. As are the TCL front-ends in the NV3x and R3x0, which have vector processors arranged in a "loose" construct. As will be future IHV hardware with a Unified Shading Model running on it.

A DirectX-style GPU OTOH has a clear graphics pipeline defined by silicon. With NV3X the chip is the architecture; with Cell the software is the architecture.

Ok, now this is so very wrong. DirectX is an abstraction, totally irrelevant to what's running below it, obscured by the driver. The NV3x is the shining example of my case, IMHO, that clearly shouts Bullshit! at your argument. And you can see the effect better "software" has on an architecture like it by looking at the latest benchmarks using their new driver, I think it's 52.xx. There was a massive, almost 50%+, improvement in performance just by having the driver better arrange and mediate register use and data flow between DX and the underlying hardware. Which just demonstrates the entire argument I'm making - I mean, GPUs have their own, basically complete, instruction set with DX9! The concept of a pipeline for non-'dumb' functions is dying; in with the concurrent micro-programs, down with the pipes.

Your argument was valid 3 years ago, but it's totally inadequate today.
 
Katsuaki Tsurushima said (http://eetimes.com/semi/news/OEG20031015S0027):
The Japanese consumer electronics giant announced earlier this year that it will invest a total of $4 billion over the next three years in semiconductor-manufacturing facilities, and another $4 billion in R&D for key devices, including semiconductors, displays and batteries. The total includes an investment plan for 65-nanometer process technology on 300-mm wafers, which Sony considers critical to the Cell processor it is designing jointly with IBM Corp. and Toshiba Corp.

The Cell microprocessor, expected to be the main product at a new 300-mm wafer fab to be constructed by Sony, is targeted to provide teraflops performance and consume relatively little power. The processor will be used in future versions of the company's Playstation game console, as well as in various broadband network nodes, according to Sony.
 
Vince said:
Katsuaki Tsurushima said (http://eetimes.com/semi/news/OEG20031015S0027):
The Japanese consumer electronics giant announced earlier this year that it will invest a total of $4 billion over the next three years in semiconductor-manufacturing facilities, and another $4 billion in R&D for key devices, including semiconductors, displays and batteries. The total includes an investment plan for 65-nanometer process technology on 300-mm wafers, which Sony considers critical to the Cell processor it is designing jointly with IBM Corp. and Toshiba Corp.

The Cell microprocessor, expected to be the main product at a new 300-mm wafer fab to be constructed by Sony, is targeted to provide teraflops performance and consume relatively little power. The processor will be used in future versions of the company's Playstation game console, as well as in various broadband network nodes, according to Sony.
Dang, that's a lot of money if the PS3 tanks. Let's hope there are no acts of God that destroy those plants and cause the PS3 to fail.


Oh, and clock speed doesn't matter for the end result. It only matters when comparing the same chips - higher MHz is always better there. But chip to chip, a lot more plays into it. Look at the Athlon and the P4.

I doubt the Cell chip will be 1 TFLOPS. It will be high for the time, but not 1 TFLOPS. I would put it around 500 GFLOPS at the most, and most likely 250 GFLOPS sustained, and it will still be extremely impressive.
 
I don't expect STI to get 1:1 with Intel on clock speed, even core clock speed.

Intel had nice portions of their chips running at 4 GHz in 2001-2002 when the Pentium 4 at 2 GHz was announced: the two fast ALUs, the two fast AGUs, the Register file, etc... were running at the physical frequency of 4 GHz ( the clock was doubled locally, there is such a thing as a fast clock generated on the Pentium 4 ).

Intel in 2003 will have a 3.2+ GHz processor in which you will have significant circuitry running at 6.4+ GHz ( ALUs, AGUs, Register file, etc... ) and I see them pulling off 4 GHz in 2004 if x86-64 gives them a hard time.

For late 2005, Intel would be almost approaching 5 GHz or more, depending on how much AMD pushes them.
 
jvd said:
Vince said:
Katsuaki Tsurushima said (http://eetimes.com/semi/news/OEG20031015S0027):
The Japanese consumer electronics giant announced earlier this year that it will invest a total of $4 billion over the next three years in semiconductor-manufacturing facilities, and another $4 billion in R&D for key devices, including semiconductors, displays and batteries. The total includes an investment plan for 65-nanometer process technology on 300-mm wafers, which Sony considers critical to the Cell processor it is designing jointly with IBM Corp. and Toshiba Corp.

The Cell microprocessor, expected to be the main product at a new 300-mm wafer fab to be constructed by Sony, is targeted to provide teraflops performance and consume relatively little power. The processor will be used in future versions of the company's Playstation game console, as well as in various broadband network nodes, according to Sony.
Dang, that's a lot of money if the PS3 tanks. Let's hope there are no acts of God that destroy those plants and cause the PS3 to fail.


Oh, and clock speed doesn't matter for the end result. It only matters when comparing the same chips - higher MHz is always better there. But chip to chip, a lot more plays into it. Look at the Athlon and the P4.

I doubt the Cell chip will be 1 TFLOPS. It will be high for the time, but not 1 TFLOPS. I would put it around 500 GFLOPS at the most, and most likely 250 GFLOPS sustained, and it will still be extremely impressive.

I posted the same kind of news ( only with links to Sony's own site ) a while ago and jvd notices it now?

This was all present on Sony's IR website for a good while, and I have repeated over and over the $4 billion invested in CELL R&D and the $4 billion invested in semiconductor R&D, all over the course of the next three years.

:( no recognition for the good ol' Panajev :(

I like Vince's link as it puts, one more time, CELL together with PlayStation 3 even though I would think that by now there would be no doubt about that :p

CELL will not only be used for PlayStation 3, and that investment is needed to help Sony save money in the future and maintain their competitive edge technology-wise: if CELL and the other chips produced thanks to these investments allow Sony to produce most of the ICs it now buys from third parties ( $2+ billion worth of ICs each year ), the whole R&D investment will kinda pay for itself if you think about it.
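
Back-of-the-envelope with those figures, assuming purely for illustration that most of that third-party IC spending could eventually be brought in-house:

```c
/* Rough payback sketch from the figures above; treating the full external
 * IC spend as recoverable in-house is an illustrative assumption only. */
#include <stdio.h>

int main(void)
{
    double total_investment  = 8e9; /* $4B fabs + $4B R&D over three years      */
    double external_ic_spend = 2e9; /* ~$2B+ of ICs bought from others per year */

    printf("Investment offset after roughly %.0f years of in-house production\n",
           total_investment / external_ic_spend); /* ~4 years */
    return 0;
}
```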
 
Panajev2001a said:
For late 2005 Intel would be almost approaching 5 GHz or more depending how much AMD pushes them.

Intel's 65nm processor is destined to start at 10 GHz and run from there, I thought. Time is irrelevant; by this I mean Intel could delay its introduction due to market conditions (e.g. AMD sucking), but the fact that the technology can yield a 10+ GHz IC is what's important.

Panajev said:
posted the same kind of news ( only with links to Sony's own site ) a while ago and jvd notices it now?

Yeah... but I posted this one.. < runs > ;)
 