|
|
#1 |
|
Registered
Join Date: Nov 2006
Posts: 3
|
I occasionally hear hype about the Cell processor, but I wonder if it really has a chance. It seems to have fatal flaws.
The upside of the Cell is that it has 200 GFLOPS peak performance per chip. This number comes from each SPU running at 3.2 GHz and performing 4 multiplies and 4 adds simultaneously, which is about 25 GFLOPS per SPU, times the 8 SPUs on the chip. I wonder if you can really get to 50% of peak performance. A single modern Intel core running at 3 GHz can do 4 multiplies or adds at the same time, which is 12 GFLOPS. It's not that hard to get to peak performance on an x86-64 CPU, and you can put 8 of those cores in a cheap box. That's roughly 100 GFLOPS too.
A major problem with the Cell is that it uses expensive XDR memory, and you can only put 2 GB on a node. That is very limiting. A Cell blade is very expensive, ~$10,000. And Cell isn't improving as fast as Intel/AMD are. So the Cell doesn't look that great on price/performance, it has limited memory, it has little software infrastructure, and its future is uncertain. Does it have a chance? |
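For anyone who wants to check the arithmetic in this post, here is a minimal sketch in Python. It only restates the figures given above; the function name `peak_gflops` is made up for illustration, and reading "8 of them" as 8 cores is an assumption.

```python
def peak_gflops(clock_ghz: float, flops_per_cycle: int, units: int) -> float:
    """Theoretical peak = clock (GHz) x flops per cycle per unit x number of units."""
    return clock_ghz * flops_per_cycle * units


# Cell, as described in the post: 3.2 GHz, 4 multiplies + 4 adds per cycle, 8 SPUs.
cell_spu = peak_gflops(3.2, 4 + 4, 1)
cell_chip = peak_gflops(3.2, 4 + 4, 8)

# x86, as described in the post: 3 GHz, 4 flops per cycle per core.
# Assumption: "8 of them in a cheap box" means 8 cores.
x86_core = peak_gflops(3.0, 4, 1)
x86_box = peak_gflops(3.0, 4, 8)

print(f"Cell SPU:  {cell_spu:.1f} GFLOPS")
print(f"Cell chip: {cell_chip:.1f} GFLOPS")
print(f"x86 core:  {x86_core:.1f} GFLOPS")
print(f"x86 box:   {x86_box:.1f} GFLOPS")
print(f"50% of Cell peak: {0.5 * cell_chip:.1f} GFLOPS")
```

On these numbers the 8-core box lands close to half of the Cell's peak, which is the comparison the post is making. These are peak figures only and say nothing about sustained performance.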
|
|
|
|
|
#2 | |
|
Member
Join Date: Jul 2007
Location: Russia
Posts: 96
|
It is possible to get 99% of peak performance on certain tasks like matrix multiply.
Quote:
x86 is not the most popular processor in the world, anyway. |
|
|
|
|
|
|
#3 | ||||
|
Senior Member
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,071
|
Quote:
Quote:
Since memory latency and bandwidth have become so important, the greater control Cell has over memory access is in many areas far superior to the current broadcast coherency schemes of x86 chips. It should also be noted that the x86 system that can hit 100 GFLOPS does so with two chips with TDPs of 120W. That's several times the TDP of one Cell. Power concerns are going to be dominant from now on, as power now constrains clock speeds, system footprint, and operating costs for a system.
Quote:
I'll cover the volume and price considerations at the end of this.
Quote:
The desktop? Basically none. Future game consoles? Maybe one of them. HPC? Probably the best chance it has for creating a niche, much in the way Blue Gene's processors have their own small space. Other fields? Maybe something here or there, but the support isn't all that enthusiastic. The primary reason for doubt is that Cell so far has not realized the volume that commodity x86 has attained. Given market trends and costs, this may prove telling. The more likely outcome is that future x86 chips are going to copy most of what makes Cell perform so well, leaving Cell with little to offer.
__________________
Dreaming of a .065 micron etch-a-sketch. |
||||
|
|
|
|
|
#4 |
|
B3D Scallywag
|
I thought Core2 could perform 4 double precision but 8 single precision operations per cycle (per core that is)?
__________________
PowerVR PCX1 4MB --> Voodoo Banshee 16MB --> GeForce2 MX200 32MB --> GeForce2 Ti 64MB --> GeForce4 Ti 4200 128MB --> 9800Pro 128MB --> 8800GTS 640MB --> Radeon HD 4890 1GB --> GeForce GTX 670 DirectCU II TOP 2GB |
|
|
|
|
|
#5 |
|
Senior Member
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,071
|
I was going by the DP throughput of a two-socket Yorkfield system which is roughly 100 GFLOPS, while the HPC Cell with enhanced DP throughput also tops out at ~100 DP GFLOPS.
edit: Cell would also have double the SP throughput over DP for the HPC version.
|
|
|
|
|
#6 | |
|
B3D Scallywag
|
Quote:
I wonder if we'll see a new, beefier Cell before Nehalem arrives. I expect so but it would be strange to see a single socket x86 matching or exceeding Cell in peak floating point.
|
|
|
|
|
|
#7 |
|
Registered
Join Date: Nov 2006
Posts: 3
|
Where can I find more information on Cell boards with DDR2? The IBM web site doesn't have any.
The Core2 can do only 4 single precision operations per cycle and 2 double precision. It can't do simultaneous multiply & add, like the SPE on the Cell. But, it's hard to keep simultaneous multiply & adds busy. |
|
|
|
|
|
#8 | |
|
Friends call me xbd
Join Date: Feb 2005
Posts: 6,293
|
Quote:
Anyway this is the thread that would probably be your best introduction to the DDR2/HPC Cell: http://forum.beyond3d.com/showthread.php?t=40661 I'll mention also that it's this version of Cell that's going to go into Roadrunner. It's not available to the 'general' public right now, but as time goes on I'm sure you'll see it pop up. As to the original point of the thread, frankly I think Cell has done very well for itself considering it's a new architecture.
__________________
Somebody set up us the bomb. |
|
|
|
|
|
|
#9 | |
|
B3D Scallywag
|
Quote:
http://www.behardware.com/articles/6...-duo-test.html "Core uses two floating point calculation units, one dedicated to addition and the other to multiplication and division. Theoretical calculation capacity is 2 x87 instructions per cycle and 2 SSE 128 bit floating point instructions per cycle (that is 8 operations on 32 bit simple precision floating points, or 4 operations for double precision 64 bit floating points). Core is, in theory, two times faster for this type of instruction than Mobile, Netburst and K8." That would result in a theoretical peak of 96 GFLOPs for the fastest single socket CPU.
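The 96 GFLOPS figure can be reconstructed from the quoted article, as a rough sketch. The per-cycle numbers follow from the quote (2 SSE 128-bit instructions per cycle). The core count and clock are not stated in the post, so the quad-core 3.0 GHz values below are assumptions chosen because they reproduce 96. All names are hypothetical.

```python
SSE_REGISTER_BITS = 128  # width of an SSE register, per the quoted article


def flops_per_cycle(sse_instructions_per_cycle: int, element_bits: int) -> int:
    """Floating point operations per cycle per core for packed SSE instructions."""
    lanes = SSE_REGISTER_BITS // element_bits  # 4 lanes for 32-bit, 2 for 64-bit
    return sse_instructions_per_cycle * lanes


sp_per_cycle = flops_per_cycle(2, 32)  # single precision
dp_per_cycle = flops_per_cycle(2, 64)  # double precision

# Assumptions, not stated in the post: a quad-core part at 3.0 GHz.
ASSUMED_CORES = 4
ASSUMED_CLOCK_GHZ = 3.0

peak_sp = sp_per_cycle * ASSUMED_CORES * ASSUMED_CLOCK_GHZ
peak_dp = dp_per_cycle * ASSUMED_CORES * ASSUMED_CLOCK_GHZ

print(f"SP: {sp_per_cycle} flops/cycle/core, peak {peak_sp:.0f} GFLOPS")
print(f"DP: {dp_per_cycle} flops/cycle/core, peak {peak_dp:.0f} GFLOPS")
```

Under those assumptions, the single precision peak matches the 96 GFLOPS quoted above and the double precision peak is half of that.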
|
|
|
|
|
|
#10 | |
|
Registered
Join Date: Nov 2006
Posts: 3
|
I am an HPC software developer. My software is used on about $100 million of hardware. It's pretty important for new hardware to recruit HPC software developers.
I need to be careful about what I invest my time in. With the high cost of Cell boards, the limited memory, and the limited install base, I don't have confidence Cell will become mainstream for commercial HPC ($500K-$10 million clusters). I agree with the above comments that Quote:
And while I don't want to divert this discussion to Intel hardware, the above Intel information is misleading. Yes, the Intel chips can work on a SIMD multiply and add at the same time, but they take more than a clock cycle. You can submit an SSE multiply, but it takes 5 cycles to complete. One clock cycle after the submit, you can submit another SSE instruction, such as an SSE add, and they will work at the same time, but you won't get 8 flop throughput per cycle. You can only submit one SSE instruction at a time. |
|
|
|
|
|
|
#11 |
|
Regular
Join Date: Jun 2005
Posts: 24,905
|
stof, what kind of HPC software? Is it media related, or scientific computing?
|
|
|
|
|
|
#12 | |
|
Senior Member
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,071
|
Quote:
Core2 also has SSE units on 3 issue ports, 1 port for FADD, 1 port for FMUL, and 1 port for other ops. The peak number would seem to hold unless you can't find any non-dependent multiplies.
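The disagreement here is really latency versus throughput, and a toy model may help. This is only a sketch: it assumes a single execution unit that is fully pipelined and accepts one instruction per cycle, and it takes the 5-cycle multiply latency from post #10. It does not model a real Core2, and the function name is made up.

```python
def cycles_to_finish(n_ops: int, latency: int, dependent: bool) -> int:
    """Cycles to complete n_ops on one fully pipelined unit (toy model).

    dependent=False: operations are independent, so one issues every cycle
                     and the last one completes `latency` cycles after it issues.
    dependent=True:  each operation needs the previous result, so it cannot
                     issue until the previous one has completed.
    """
    if n_ops <= 0:
        return 0
    if dependent:
        return n_ops * latency
    return latency + (n_ops - 1)


MUL_LATENCY = 5  # cycles, the figure quoted in post #10
N = 1000         # number of SSE multiplies in the toy run

independent = cycles_to_finish(N, MUL_LATENCY, dependent=False)
chained = cycles_to_finish(N, MUL_LATENCY, dependent=True)

print(f"independent: {independent} cycles, {N / independent:.3f} instructions/cycle")
print(f"dependent:   {chained} cycles, {N / chained:.3f} instructions/cycle")
```

In this model, a long run of independent multiplies approaches one instruction per cycle despite the 5-cycle latency, while a fully dependent chain is limited to one every five cycles. Whether real code gets near the first case depends on how much independent work the compiler or programmer can expose.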
|
|
|
|
|
|
#13 | |
|
Senior Member
|
Quote:
I'm a developer at the Minnesota Supercomputing Institute. Similar feelings about Cell. I really wish they would make development hardware cheaper to attract more attention. $10k isn't that much in the grand scheme of things, but it's not exactly throwaway money either. Cell is a popular topic around here (MSI) mostly because it's neat and exotic. There are few people here who are actually doing any real work on them.
Nite_Hawk |
|
|
|
|
|
|
#14 |
|
Grumpy Mod
Join Date: Dec 2004
Location: In a pretty pink padded cell
Posts: 25,988
|
For the sake of experimentation, isn't PS3 a suitable introduction to try things out and gauge performance? IBM's libraries support distributed processing over networked PS3s, right? So you could get 2 or 3 and try out some algorithms and see how well you think it manages for a grand or so. Less if you know a few PS3-owning mates who wouldn't mind lending you their PS3s to run a bit of Linux code on!
__________________
Shifty Geezer ... Tolerance for internet moronism is exhausted. Anyone talking about people's attitudes in the Console fora, rather than games and technology, will feel my wrath. Read the FAQ to remind yourself how to behave and avoid unsightly incidents. |
|
|
|
|
|
#15 |
|
Member
Join Date: Jul 2005
Posts: 627
|
There's little interest at RAL, given its commodity-focused HPC efforts. DP performance of the PS3 is just not worth it, and the eventual HPC Cell products will be out of reach. And that's setting aside the problems with how RAL operates in terms of library and application support. In fact it's probably the latter that has more influence. Neat and exotic just isn't in the language.
|
|
|
|
|
|
#16 | |
|
Now Officially a Top 10 Poster
Join Date: May 2006
Location: Maastricht, The Netherlands
Posts: 12,879
|
Quote:
|
|
|
|
|
|
|
#17 |
|
Junior Member
Join Date: Nov 2007
Location: Minnesota
Posts: 44
|
It's a good testbed, I think. I did a bunch of stuff on Cell simulators early on, and the PS3's faster, even if it's not quite the same.
I was going to get one of the actual dev systems, but I never got so much as a call back when I tried to contact the nice folks at Mercury. Apparently, they're WAY too busy with important things to even bother to tell me that they don't want my business. |
|
|
|
|
|
#18 |
|
Now Officially a Top 10 Poster
Join Date: May 2006
Location: Maastricht, The Netherlands
Posts: 12,879
|
|
|
|
|
|
|
#19 |
|
Junior Member
Join Date: Nov 2007
Location: Minnesota
Posts: 44
|
Well, to be fair, I'm just some guy. I wasn't even affiliated with a company -- I just wanted a Cell blade system because I do a lot of technical writing, and I could have taken it as a deductible expense, and PROBABLY paid for it with work eventually.
But I'm just one guy, there's no company involved, so I assume they just figured there wasn't enough business there to justify the effort. It's not as though, if I wrote a lot of articles about it, I'd come back and buy fifty or a hundred more. |
|
|
|
|
|
#20 |
|
Now Officially a Top 10 Poster
Join Date: May 2006
Location: Maastricht, The Netherlands
Posts: 12,879
|
Probably not, but if they were genuinely bored (i.e. not at 100%+ work capacity), my guess would have been that they'd have gladly sold you one, precisely because you do write articles about it. That's just speculation on my part though.
|
|
|
|
|
|
#21 |
|
Junior Member
Join Date: Nov 2007
Location: Minnesota
Posts: 44
|
It might be. One of my coworkers dealt with them in another capacity once, and apparently they tend to blow off anyone who isn't likely to directly buy a LOT of hardware. I figure there's no reason for them to check that, out of a hundred people who said "I want to write about this", one particular guy might be a moderately successful writer whose articles might get read, when most of them are just dead blogs.
Still, it's sort of a shame. I really want one of those to mess around with. What Cell programming I've done has been neat, but I'd rather have a blade with real memory than a PS3 with 6 available SPEs and barely over 200MB to play with. |
|
|
|
|
|
#22 | |
|
Senior Member
|
Quote:
Nite_Hawk |
|
|
|
|
|
|
#23 |
|
Regular
Join Date: Jun 2005
Posts: 24,905
|
Where are you guys located? I know of institutions with donated Cell Blades to encourage R&D activities.
EDIT: Oh... in Minnesota. Have you approached the schools about some value exchange (writing about their programs in exchange for use of a Cell and access to whoever is working on it)? I also know of an overseas location that allows companies to use their grid network and Cell blades for free (some strings attached). |
|
|
|
|
|
#24 |
|
Junior Member
Join Date: Nov 2007
Location: Minnesota
Posts: 44
|
I'm in Minnesota, just like it says in the post.
The thing is, I'm not an "institution". I'm some guy. If I got a Cell system, it'd probably be in the basement about five or ten feet from the dryer. This is not an environment conducive to sales people drooling over the future sales prospects. |
|
|
|
|
|
#25 |
|
Member
Join Date: Jul 2007
Location: Russia
Posts: 96
|
seebs
"it'd probably be in the basement"
A garage is probably a better place. Some great things started their lives in a garage. |
|
|
|