Does Cell Have Any Other Advantages Over XCPU Other Than FLOPS?

dodo3 said:
Where did ya'll learn this stuff from? I'm quite interested but I am not sure where to start. I've actually read a summary of the Cell processor but barely understand how it will work with the RSX and XDR RAM and the possibilities of bottlenecks... can someone please point me in the right direction!? :cry:

I'd say you have to start with the basics:

howstuffworks.com
offers simple information (they sometimes err a bit, but for the most part they get things right) and links to further reading.

http://arstechnica.com/articles.ars
The articles here offer more information.

http://www.beyond3d.com/articles/index.php
This very site has some articles that could clear things up, especially with regard to graphics.

There's also:
wikipedia.org
One of my fave sites

Amazon.com or a Borders bookstore if you'd like to get a book on the subject... and of course let's not forget college/university.

Many many other sites offer more info, even very detailed info, but all of this requires time...
 
PeterT said:
Most of them didn't learn anything. They just repeat the random garbage spouted by thousands of other monkeys like them on hundreds of forums. You can identify bullshit posts by some factors: using a lot of generalizations, absolutes like "can't" or "doesn't" in reference to the capabilities of a CPU to run some software, and talking about "performance", providing percentage figures without clarifying what kind of processing they are talking about.

I have a BS in CS and am currently working on my Master's, have designed a (very simple) CPU and love hardware details and low-level programming in general and 3D in particular and wouldn't dare making even half of the assertions thrown around here. So, either all the people here are in fact EEs in chip design or hardcore game devs (I know that a few actually are), or most of them are just bullshitting. Take your pick.

Well, if you only want to join the console wars then you're already overqualified. If you really want to learn about this stuff then I can't really tell you how or where short of a university ;), but I can tell you that a console forum, even this one, is probably the wrong place to start.

Sorry for the rant, but some posts in this thread are truly inane.

Great post.
 
Edge said:
What a bunch of nonsense. CELL has no problems at all running 8 "general purpose" threads, that are far more effective than the 360's CPU secondary threads running on each CPU.

Xbox 360 CPU: 3 main threads, and 3 secondary threads.

CELL: 8 main threads, and only 1 secondary thread running on the PPE.

Secondary threads only add about a 10 to 20 percent performance advantage over primary threads. CELL has 8 computing cores, and Xbox 360 CPU has only three cores!

Exercise caution...the 10-20% comes from PC applications that are not specifically designed with SMT in mind. The performance increase of SMT from intelligently designed and complementary threads can exceed 50% easily.

CELL is SUPERIOR to the Xbox 360 CPU in many ways, especially in floating point performance (twice as powerful).

Downplaying the SPE's by claiming they cannot run general purpose threads is FALSE!

...

Just a perfect sign, you don't understand the technology at all, and a misleading statement.
The last line of your post summarizes your own post. You took an assertion regarding "general purpose" computing (which I take to mean integer and logic processing) and avoided the issue completely, instead sounding like a marketing droid while producing nonsense numbers such as aggregate memory bandwidth.

You need to do some serious research into the integer and logic capabilities of the SPUs (or lack thereof) to address the poster's claim, something you clearly have no understanding of. Please spare us with the strawmen and marketing speak.
 
PeterT said:
I have a BS in CS and am currently working on my Master's, have designed a (very simple) CPU and love hardware details and low-level programming in general and 3D in particular and wouldn't dare making even half of the assertions thrown around here. So, either all the people here are in fact EEs in chip design or hardcore game devs (I know that a few actually are), or most of them are just bullshitting. Take your pick.

Actually there were a few threads here a while back talking about CPU physics and offloading geometry on the SPE's. I learned most of what I know from there. Which admittedly isn't much.
 
PeterT said:
Most of them didn't learn anything. They just repeat the random garbage spouted by thousands of other monkeys like them on hundreds of forums. You can identify bullshit posts by some factors: using a lot of generalizations, absolutes like "can't" or "doesn't" in reference to the capabilities of a CPU to run some software, and talking about "performance", providing percentage figures without clarifying what kind of processing they are talking about.

I have a BS in CS and am currently working on my Master's, have designed a (very simple) CPU and love hardware details and low-level programming in general and 3D in particular and wouldn't dare making even half of the assertions thrown around here. So, either all the people here are in fact EEs in chip design or hardcore game devs (I know that a few actually are), or most of them are just bullshitting. Take your pick.

Well, if you only want to join the console wars then you're already overqualified. If you really want to learn about this stuff then I can't really tell you how or where short of a university ;), but I can tell you that a console forum, even this one, is probably the wrong place to start.

Sorry for the rant, but some posts in this thread are truly inane.

Questionable post, as you're hardly contributing to the discussion now, are you?
 
Asher said:
The last line of your post summarizes your own post. You took an assertion regarding "general purpose" computing (which I take to mean integer and logic processing) and avoided the issue completely, instead sounding like a marketing droid while producing nonsense numbers such as aggregate memory bandwidth.

You need to do some serious research into the integer and logic capabilities of the SPUs (or lack thereof) to address the poster's claim, something you clearly have no understanding of. Please spare us with the strawmen and marketing speak.

You could not be more wrong. I am well aware that all integer code has to be vectorized to run on the SPEs, but just because it has to be vectorized does not mean the SPEs cannot run "general purpose" code. The conversion to vector code can occur at compile time, and it's only when an integer result is needed that inefficiencies creep in.

I provided some FACTS about the CELL capabilities, but if you want to call that marketing talk, fine by me.

Do you have anything to contribute here?
 
Asher said:
Exercise caution...the 10-20% comes from PC applications that are not specifically designed with SMT in mind. The performance increase of SMT from intelligently designed and complementary threads can exceed 50% easily.

That seems highly unlikely. The dual threading of the Xbox 360 cores uses separate register banks to allow fast context switches. That's great, but I can't imagine a 50 percent increase. Can you provide any evidence of this?
 
Don't most CPUs (including Xenon, with its VMX unit having separate units for dot product, vector float, scalar float, and vector simple) have multiple "execution units"? In that case, multithreading might just be switching threads in and out of context to hide latencies (correct?), but if you programmed for it, you might be able to keep the two threads from contending for the same execution unit most of the time. That could give you gains greater than just achieving full efficiency, for example.
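
To make the execution-unit argument a bit more concrete, here is a toy Python sketch. It assumes a made-up core that can issue one operation per thread per cycle, with a single unit of each kind, and it ignores latencies, dependencies and caches entirely. So it only brackets the best and worst cases and says nothing about what Xenon actually achieves. The names `run_serial`, `run_smt`, `speedup` and the unit labels are hypothetical.

```python
def run_serial(thread_a, thread_b):
    """Cycles needed if the two threads run one after the other."""
    return len(thread_a) + len(thread_b)


def run_smt(thread_a, thread_b):
    """Cycles needed if both threads share one core (toy model).

    Each thread is a list of execution-unit names, one per operation.
    Per cycle each thread may issue its next operation, unless both
    threads want the same unit, in which case only one of them issues.
    """
    ia = ib = cycles = 0
    while ia < len(thread_a) or ib < len(thread_b):
        a_op = thread_a[ia] if ia < len(thread_a) else None
        b_op = thread_b[ib] if ib < len(thread_b) else None
        if a_op is not None and a_op == b_op:
            # Conflict on the same unit: alternate which thread wins.
            if cycles % 2 == 0:
                ia += 1
            else:
                ib += 1
        else:
            # No conflict: every thread that still has work issues.
            if a_op is not None:
                ia += 1
            if b_op is not None:
                ib += 1
        cycles += 1
    return cycles


def speedup(thread_a, thread_b):
    """Throughput of the shared core relative to running serially."""
    return run_serial(thread_a, thread_b) / run_smt(thread_a, thread_b)


if __name__ == "__main__":
    complementary = speedup(["fp"] * 100, ["int"] * 100)
    conflicting = speedup(["fp"] * 100, ["fp"] * 100)
    print("complementary threads:", complementary)  # 2.0 in this toy model
    print("conflicting threads:  ", conflicting)    # 1.0 in this toy model
```

Real gains land somewhere between those two extremes, and where exactly depends on the instruction mix and on stalls that this sketch leaves out.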
 
scooby_dooby said:
Some might say CELL is FLOP overkill, and XeCPU is a much more balanced solution, providing a traditional CPU/Multi-threading approach with very beefed up FLOP capabilities.

People act like XeCPU has no FLOP capabilities, in fact it's very high for a conventional CPU, it's just not as high as the theoretical peaks of the yet untested CELL.

I don't think that's correct at all. I don't think you can ever have enough FLOPs power, but there's always a limit to how much general purpose performance you need. AI, for example, will only slightly grow from generation to generation. AI is currently not limited by what our hardware can do, at least in its current form (script and event based). Physics and Graphics, on the other hand, are absolutely limited by what a CPU can do with floating point calculations.

In other words, more general purpose performance really isn't going to help a videogame application that much, if at all. More floating point performance, on the other hand, will help it no end. Microsoft knows this, and they did sacrifice GP for FLOPS in their CPU design, but were not able to get as much FLOPS power out of their CPU as with the Cell. That's the long and short of it.

Any game developers care to comment on what I've said? I am an IT student, aiming to get into videogame development, but I've not yet developed any commercial quality games.
 
CELL:

220 Gflops
450 Gints
64 threads
6 cycle MADD
6 cycle LocalStorage

XENON :

75 Gflops
150 Gints
6 threads
10 cycle MADD
5-40 cycle cache
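
For anyone wondering where peak figures like these come from, here is a rough Python sketch of the usual back-of-the-envelope arithmetic. The inputs are my assumptions, not taken from the list above: a 3.2 GHz clock on both chips, 4-wide single-precision SIMD, and one multiply-add per lane per cycle counted as 2 flops. It counts only the 8 SPEs for Cell (no PPE) and only the vector units for Xenon, which is one reason the results do not exactly match the rounder numbers quoted. The function name `peak_gflops` is hypothetical.

```python
def peak_gflops(units, simd_lanes, flops_per_lane_per_cycle, clock_ghz):
    """Theoretical peak in GFLOPS: every lane busy on every cycle."""
    return units * simd_lanes * flops_per_lane_per_cycle * clock_ghz


if __name__ == "__main__":
    # Assumed inputs: 8 SPEs, 4 lanes, multiply-add = 2 flops, 3.2 GHz.
    cell_spes = peak_gflops(units=8, simd_lanes=4,
                            flops_per_lane_per_cycle=2, clock_ghz=3.2)
    # Assumed inputs: 3 cores, 4 lanes, multiply-add = 2 flops, 3.2 GHz.
    xenon_vmx = peak_gflops(units=3, simd_lanes=4,
                            flops_per_lane_per_cycle=2, clock_ghz=3.2)
    print("Cell SPEs only:", round(cell_spes, 1), "GFLOPS")
    print("Xenon VMX only:", round(xenon_vmx, 1), "GFLOPS")
    print("ratio:", round(cell_spes / xenon_vmx, 2))
```

These are ceilings that assume nothing ever stalls, which is exactly why the thread keeps arguing about how much of the peak is reachable in practice.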
 
Barry_Minor :

IBM's SPE XLC compiler is adding the function to compile to register ranges which would enable the threading model I talked about. We have coded examples of this in SPE asm to validate the concept.

The multi-threading example you cited is another way to cover up DMA latency (the most common being multi-buffering). This can be implemented in software on the SPEs by segmenting the large register file into smaller ranges, compiling different threads for each register range, and switching (branching) to a different thread after each thread issues a DMA read. The threads stay resident in local store (no context switching), thread switching is lightweight (1 cycle branch), and with some clever programming you can even defer the switch based on the DMA tag being ready (BISLED). If you're memory bound and can't predict your memory references ahead of time, this is a good solution, as you could write your code for size instead of speed and pack 4-8 threads in each SPE local store.
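
The latency-hiding idea can be sketched with a small Python timing model. This is not SPE code and uses no real SPE API. It assumes each software thread alternates between a fixed chunk of compute and a DMA read with a fixed latency, and that switching threads costs one cycle, as in the "1 cycle branch" above. The function `total_cycles`, its parameters and the cycle counts in the demo are all hypothetical.

```python
def total_cycles(num_threads, iterations, compute_cycles, dma_latency,
                 switch_cost=1):
    """Cycles until every thread has finished all its compute and DMA reads.

    Each iteration of a thread is: compute, then issue a DMA read whose
    data arrives dma_latency cycles later. After issuing the DMA the
    processor branches to another resident thread instead of waiting.
    """
    ready_at = [0] * num_threads          # cycle when each thread's data arrives
    remaining = [iterations] * num_threads
    now = 0
    while any(remaining):
        # Among threads with work left, pick the one whose data arrives first.
        thread = min((t for t in range(num_threads) if remaining[t]),
                     key=lambda t: (ready_at[t], t))
        now = max(now, ready_at[thread])  # stall only if nothing is ready yet
        now += compute_cycles
        remaining[thread] -= 1
        ready_at[thread] = now + dma_latency  # issue the next DMA read
        if num_threads > 1:
            now += switch_cost                # branch to another thread
    return max(now, max(ready_at))


if __name__ == "__main__":
    # Same total work (40 iterations) split across 1, 4 and 8 threads.
    for threads in (1, 4, 8):
        cycles = total_cycles(threads, 40 // threads,
                              compute_cycles=50, dma_latency=200)
        print(threads, "thread(s):", cycles, "cycles")
```

With these made-up numbers a single thread spends most of its time waiting on DMA, and adding resident threads fills those gaps until the work becomes compute bound.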
 
scooby_dooby said:
People act like XeCPU has no FLOP capabilities, in fact it's very high for a conventional CPU, it's just not as high as the theoretical peaks of the yet untested CELL.

...funny how you're forgetting to mention that you are comparing them with theoretical peaks of the yet untested XeCPU as well... oh wait! Untested? Both CELL and XeCPU have been in the hands of developers at this point. They're both equally "tested" or "untested" - we just don't know how well they are or can be utilized at this point - but we'll never know, as it is in each developer's hands to extract the performance and the architecture's potential. Having said that, comparing two architectures on their potential performance is the only thing we as outsiders can do. Since this is Beyond3D, a forum that aims to discuss 3D architectures and their potential advantages and disadvantages, I suggest you start taking part in the discussion at hand instead of downplaying each and every potential advantage CELL has to offer.
 
scooby_dooby said:
Not taking IBM's word for it? Who wrote the paper that was presented at ISSCC? IBM. Who wrote the white-papers that these articles are based on? IBM. Everything we know about CELL is from IBM, of course we are taking their word for it.
I thought you meant we couldn't trust info like flops figures. Are you trying to tell me we can't trust chip architectures either? Are you suggesting Cell doesn't really have 256KB of local storage per SPE, or at least that we shouldn't believe it until we've cut up a chip ourselves? Just because IBM gives us such details, and ring bus speeds, and all that jazz, none of it is believable? You also think they provided the ISSCC with a load of bunkum specs and imaginary numbers??

People act like XeCPU has no FLOP capabilities, in fact it's very high for a conventional CPU, it's just not as high as the theoretical peaks of the yet untested CELL.
Hang on. How do you know XeCPU has a very high FP capability? Don't want to worry you, but the info we have on XeCPU comes from the same source as the info on Cell. You only believe XeCPU has a high flop capability because IBM has told us so. If you're not willing to take IBM's word on Cell's performance, why take their word on XeCPU's?

"Don't believe anything you hear about Cell. It'll never reach the performance claims IBM have made for it. IBM just make up numbers to push their hardware. But XeCPU, that's an impressive chip, that'll achieve great FP performance I'm sure even though it's untested, because IBM have supplied the specs and it can do 100+ GFLops, but Cell's 200+GFlops from the same source is imaginary numbers we can't believe."

That strikes me as a pretty hypocritical attitude. Plus you go on and on about Cell being untested, but that's nonsense. We have real-world programs running on it, showing how the chip's architecture facilitates a high feed to the ALUs and keeps it churning out the numbers. The fact that it can process lots of floats means that in games, it will, because developers will write software that uses lots of floats. You don't need to see a working game to believe that'll happen. Software developers aren't going to be writing PC code for the PS3 and watching it crawl along and leave that floating point capability untouched, any more than they will with XB360 or any other system. You write to the hardware.

The worrying thing though is that if you're not willing to look at hardware specs and details and try to work out how good or bad it is, what are you doing on this forum?! That's exactly what this forum is supposed to be about! Looking at white papers and patents and discussing how hardware tries to solve problems and where we think it's taking a good approach or a bad approach. If you feel that's impossible and won't accept any such discussion as valid, surely you shouldn't be here? What do you hope to gain from being here, and what do you hope to contribute?
 
version said:
Barry_Minor :

IBM's SPE XLC compiler is adding the function to compile to register ranges which would enable the threading model I talked about. We have coded examples of this in SPE asm to validate the concept.

The multi-threading example you cited is another way to cover up DMA latency (the most common being multi-buffering). This can be implemented in software on the SPEs by segmenting the large register file into smaller ranges, compiling different threads for each register range, and switching (branching) to a different thread after each thread issues a DMA read. The threads stay resident in local store (no context switching), thread switching is lightweight (1 cycle branch), and with some clever programming you can even defer the switch based on the DMA tag being ready (BISLED). If you're memory bound and can't predict your memory references ahead of time, this is a good solution, as you could write your code for size instead of speed and pack 4-8 threads in each SPE local store.

Since this is a pissing contest (and nothing else) why stop at 8 threads ? Why not make 64 two-register contexts/threads per SPE ? Then the SPEs could provide 448 threads!!!!

Oh and by the way, the same mechanism can be used on both the PPE and the XCPU, and probably will be (vertical threading anyone?)

Cheers
Gubbi
 
SynapticSignal said:
Cell takes a 90% hit on integer and general purpose code
Cell can't do 9 general purpose threads
Cell can do 2 general purpose threads
SPUs are not general purpose cores
Cell has only one PPE with little cache; this, together with in-order execution, makes Cell a low-performance CPU for gaming and general PC use, and a high-performance CPU for multimedia streaming tasks
the SPUs can do some work such as physics, but not other work such as game code or AI, which is full of branches
developing for Cell is a "pain in the ass" (courtesy of Carmack, Newell)

the CPU of the PS2 has many more FLOPS than the Celeron of the Xbox 1, but the Celeron leaves it in the dust, so I'm not jumping on Sony's hype chariot of "fantaflops"


This is just nonsense. Define 'general purpose'? Again if it's integer, then Cell has the higher theoretical* max. Your regurgitated Major Nelson-esque information comes from the school of thought that calls SPEs DSPs, discounts them immediately, and focuses solely on the abilities of the Power cores.

*Scooby, very true, I should have qualified my previous post to add 'theoretical.' XeCPU's integer power should be more readily accessible, but the SPEs will be brought to bear slowly but surely as the gen progresses.
 
PeterT said:
Most of them didn't learn anything. They just repeat the random garbage spouted by thousands of other monkeys like them on hundreds of forums. You can identify bullshit posts by some factors: using a lot of generalizations, absolutes like "can't" or "doesn't" in reference to the capabilities of a CPU to run some software, and talking about "performance", providing percentage figures without clarifying what kind of processing they are talking about.

I have a BS in CS and am currently working on my Master's, have designed a (very simple) CPU and love hardware details and low-level programming in general and 3D in particular and wouldn't dare making even half of the assertions thrown around here. So, either all the people here are in fact EEs in chip design or hardcore game devs (I know that a few actually are), or most of them are just bullshitting. Take your pick.

Well, if you only want to join the console wars then you're already overqualified. If you really want to learn about this stuff then I can't really tell you how or where short of a university ;), but I can tell you that a console forum, even this one, is probably the wrong place to start.

Sorry for the rant, but some posts in this thread are truly inane.

Peter, I sort of have a problem with this post. It's not that I take issue with the premise you based it on per se, but you're giving rather short shrift to everyone who doesn't have a degree in electrical engineering or comp-sci. Do you know anything about politics? Probably you do. Do you have a degree in political science? Well, I'm thinking you don't. What about history, or biology? Or cars (as many on this forum are wont to discuss)? A genuine interest in a certain facet of knowledge and the desire to learn can lead to quite a bit of independent exploration and gain, and I think that it would be wrong to limit your 'credibility' criteria simply to what degrees or formal training one holds. The message is more important than the messenger, and not just in technical matters. In fact, your own conclusion to your post implies the same; you take a post from a poster with little technical expertise, and try to encourage his desire to learn. Granted, you favor he go to a university should he truly wish to pursue it, but I think you grant that he might learn some even outside of that environment. ;)
 
Titanio said:
Define "general purpose"?
I define general purpose processing as not geared toward a particular type of processing (i.e. streaming) but well-suited for a wide array of processing styles. A jack-of-all-trades and a master of none.

For example, there will be A LOT of multiplatform games next generation. These games will not necessarily be fine-tuned to one hardware implementation over another. The engines will be hardware agnostic, taking the common denominator into account, which across all next gen consoles seems to be a single Power Processor core with some cache. Any additional hardware, like full cores or SPEs, has rather menial duties. But other than that we have one PPC doing the heavy lifting.
 
So only when used properly, with extensive effort and budget, will devs get the most out of Cell? I don't see how that is good when devs are trying to keep costs to a minimum in all but the most elite of games, such as GT5. If the 360 is going to be the main platform for many multiconsole games, then it doesn't bode too well for Cell. Just look at this gen: most of the devs that used the PS2 as their main development platform barely tapped the GC's and Xbox's more advanced features.
 