A glimpse inside the CELL processor

Shifty Geezer said:
Just write the data to an address in RAM from one core, and read it from that address in the other.

Sounds very producer/consumer-y to me ;) If you have two different producer/consumer threads working at the same time, you may want communication mechanisms between the two to signal the status of the buffer etc.

My point is, thread communication mechanisms existed long before SPEs did, and for good reason.
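To make the producer/consumer idea concrete, here is a minimal sketch in standard C++ (nothing PS3- or 360-specific). One thread writes into a shared buffer and signals its status through a condition variable, and the other waits on that signal. `SharedBuffer` and `run_producer_consumer` are made-up names for illustration, not anything from either console's SDK.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical shared buffer; the names are illustrative only.
struct SharedBuffer {
    std::mutex m;
    std::condition_variable ready;  // signals "buffer status changed"
    std::queue<int> items;
    bool done = false;              // set once the producer has finished
};

// The producer pushes 0..count-1, the consumer sums whatever arrives.
// Returns the sum the consumer saw.
long long run_producer_consumer(int count) {
    assert(count >= 0);
    SharedBuffer buf;
    long long sum = 0;

    std::thread producer([&] {
        for (int i = 0; i < count; ++i) {
            {
                std::lock_guard<std::mutex> lock(buf.m);
                buf.items.push(i);
            }
            buf.ready.notify_one();  // tell the consumer there is data
        }
        {
            std::lock_guard<std::mutex> lock(buf.m);
            buf.done = true;
        }
        buf.ready.notify_one();      // tell the consumer we are finished
    });

    std::thread consumer([&] {
        std::unique_lock<std::mutex> lock(buf.m);
        for (;;) {
            // Sleep until there is something to read or the producer is done.
            buf.ready.wait(lock, [&] { return !buf.items.empty() || buf.done; });
            while (!buf.items.empty()) {
                sum += buf.items.front();
                buf.items.pop();
            }
            if (buf.done) return;
        }
    });

    producer.join();
    consumer.join();
    return sum;
}
```

Calling `run_producer_consumer(100)` should hand back the sum of 0 through 99. The data itself only ever lives in ordinary shared memory; the condition variable is just the "status of the buffer" signal.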

Shifty Geezer said:
How can physics on one core and AI on another work better passing data between rather than working on their own? Or whatever other tasks there are.

It depends how you design your code. You could have everything preprocessed before doing your AI, for example, so it has all the data it needs before it starts working. Or you may have your physics threads feeding data to your AI threads as they work stuff out. Either way, there are certainly 'shared concerns' between something like AI and physics (and pretty much everything really). But how much you would have threads talking to one another directly is really up to your design and requirements, I guess.

My points, btw, have nothing specifically to do with PS3 or 360. I don't know how they compare as far as support for thread communication. I know it's desirable to have certain things supported by hardware, but I also couldn't say how much is really necessary versus "nice".
 
Titanio said:
My points, btw, have nothing specifically to do with PS3 or 360. I don't know how they compare as far as support for thread communication. I know it's desirable to have certain things supported by hardware, but I also couldn't say how much is really necessary versus "nice".

Afaik there simply is no such thing in the Xenon. I think it was MS's idea to split up tasks on independent cores and let them each do their job. I guess the reason behind this is that the cores are just derivatives of the CELL PPU and thus have no communication hardware to handle this task (in contrast to CELL).

The only way I could think of is by locking the cache for a period of time and communicating that way (if it's possible at all; at least it is for Xenos to retrieve vertex data this way). But this is a pretty nasty workaround and inefficient as hell :)
 
You're telling me threads on one core can't signal threads on others or whatever..? Is hardware support required for that? I know it's good to have hardware atomic operations like CAS etc. but I'm not sure about signalling or whatever.
 
Titanio said:
You're telling me threads on one core can't signal threads on others or whatever..? Is hardware support required for that? I know it's good to have hardware atomic operations like CAS etc. but I'm not sure about signalling or whatever.

There is no interconnection bus in the Xenon. Threads cannot share any data and have no access to each other, they don't even "see" each other afaik.

The only way to handle this is by going through the main RAM (which is very slow) or by cache locking (if that works, which I'm not sure of).

cmiaw :)
 
elementOfpower said:
As of now, developers are offered a console (the 360) which is already popular and extremely easy to program for compared to what the Cell will be.

We'll need some multi-platform devs to provide some feedback to validate this assumption.

elementOfpower said:
When the PS2 came out, it only had to compete with Dreamcast. When the PS3 comes out, it will have to compete with the 360 & Gears of War, with Halo 3 on the horizon.

You're giving me the impression that the PS2 didn't have much competition from the DC. I see the X360 as still very focused on FPS and PC-centric games (due to the current games), whereas the PS3 covers quite a wider range, similar to the PS2.

elementOfpower said:
I think Sony really has their work cut out for them but I hope they pull it off again this gen. They had better get a few more titles on the release list that look worth buying, though. Right now, Fall of Man is the only one worth picking up.

That depends on the gamer... but personally MGS4, FFXIII and a few others are worthy system sellers...
 
Getting good performance from the X360 (as in, not just higher res Xbox games we seem to get a lot these days, but properly next gen games that take advantage of the architecture) is hardly "easy", or there wouldn't be such a gap in what some games show compared to the higher budget ones.

It's obvious that both consoles will be hard to fully exploit, so please don't go around telling people how the X360 is "so much easier" to develop for, cause really it isn't.

Budgets will be high for either console regardless of Cell. Heck, PS3 uses OpenGL which is hardly "difficult" for people who have been using that for years.

Like it will be hard to get to grips with the highly parallel architecture of Cell, it will be hard to get amazing performance out of the XCPU due to the 3 cores.

GPU-wise, they're both rather "easy" in that a lot of devs are used to PC GPUs.
 
Jesus2006 said:
The only way to handle this is by going through the main RAM (which is very slow) or by cache locking (if that works, which I'm not sure of).
Only if you're accessing the different memory pools such that the L2 cache doesn't do its job effectively. The fact you're using main memory to communicate means you're sharing an address space for that data, which means it'll be cached. Only if you're working on the data from the same memory location at very different times is it not going to be resident in cache. And if you're doing that, chances are you're doing it for a reason. E.g. you need to run through the physics completely to determine entities' positions before you can run AI on the situation to decide responses - in such a case you can't share object position data directly from the physics thread with the AI thread, and the relevant data won't be in cache when the AI thread goes looking for it.
Titanio said:
But how much you would have threads talking to one another directly is really up to your design and requirements I guess.
True. From where I'm sitting though, I can't see many situations with conventional game engines where you'd want that. Most tasks are dependent on previous tasks being run to completion to give the world view. You have to have done all the user input before you work on animation. You'll have to have all the animations done before you calculate collisions. You'll need all physics complete before you evaluate the situation in your AI. And you'll need the AI events completed before you do sound. If you don't do the tasks serially, you could miss out on events, or calculate collisions incorrectly because you're comparing animated meshes from the previous frame with an updated animation this frame of the current object.

Still, I'm not writing game engines, so maybe I'm missing the trick!
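The "run each task to completion before the next" ordering can be sketched roughly like this in standard C++. Work inside a stage is split across threads, but the join at the end of the stage means the next stage only ever sees finished data. `physics_stage`, `ai_stage` and `run_frame` are invented stand-ins (the "physics" just moves every entity by one unit), so treat it as an illustration of the ordering and not a real engine loop.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical "physics": moves every entity one unit, split across workers.
void physics_stage(std::vector<int>& positions, unsigned workers) {
    assert(workers > 0);
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&positions, w, workers] {
            // Each worker touches its own interleaved subset of entities.
            for (std::size_t i = w; i < positions.size(); i += workers)
                positions[i] += 1;
        });
    }
    for (auto& t : pool) t.join();  // stage boundary: physics is now complete
}

// Hypothetical "AI": counts entities past a threshold.
// It needs the complete, up-to-date positions to give the right answer.
int ai_stage(const std::vector<int>& positions, int threshold) {
    int alerts = 0;
    for (int p : positions)
        if (p > threshold) ++alerts;
    return alerts;
}

// One frame: physics runs to completion, then AI evaluates the result.
int run_frame(std::vector<int>& positions, unsigned workers, int threshold) {
    physics_stage(positions, workers);
    return ai_stage(positions, threshold);
}
```

With positions `{0, 1, 2, 3}` and a threshold of 2, the AI stage sees `{1, 2, 3, 4}` and reports two entities. If it ran before the join it could see a mix of old and new positions, which is the stale-data problem described above.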
 
london-boy said:
Getting good performance from the X360 (as in, not just higher res Xbox games we seem to get a lot these days, but properly next gen games that take advantage of the architecture) is hardly "easy", or there wouldn't be such a gap in what some games show compared to the higher budget ones.

It's obvious that both consoles will be hard to fully exploit, so please don't go around telling people how the X360 is "so much easier" to develop for, cause really it isn't.

Budgets will be high for either console regardless of Cell. Heck, PS3 uses OpenGL which is hardly "difficult" for people who have been using that for years.

Like it will be hard to get to grips with the highly parallel architecture of Cell, it will be hard to get amazing performance out of the XCPU due to the 3 cores.

GPU-wise, they're both rather "easy" in that a lot of devs are used to PC GPUs.

No doubt, getting really good performance from either console will be hard, if nothing else because of multithreading, especially as there will be quite a few threads to keep managing. Maybe the symmetrical nature of Xenon and maybe better tools from MS might make it less hard to develop for it compared to Cell, but in no way easy, if you are out to take the performance to a maximum.

As for the GPU, there I will give the upper hand to RSX for ease of use. Sure, the Xenos can be used as a "normal" GPU, but if you really want to take advantage of its strengths then you will have to work harder compared to RSX...
 
Shifty Geezer said:
Only if you're accessing the different memory pools such that the L2 cache doesn't do its job effectively. The fact you're using main memory to communicate means you're sharing an address space for that data, which means it'll be cached. Only if you're working on the data from the same memory location at very different times is it not going to be resident in cache. And if you're doing that, chances are you're doing it for a reason. E.g. you need to run through the physics completely to determine entities' positions before you can run AI on the situation to decide responses - in such a case you can't share object position data directly from the physics thread with the AI thread, and the relevant data won't be in cache when the AI thread goes looking for it.

So you can never rely on it, and you might run into a situation where cache runs out (not too hard to achieve on a 6 thread machine with only 1 meg cache) and the games will stutter in the worst case (maybe that's a reason for the bad framerates of almost all 360 games?)
 
Jesus2006 said:
So you can never rely on it, and you might run into a situation where cache runs out (not too hard to achieve on a 6 thread machine with only 1 meg cache) and the games will stutter in the worst case (maybe that's a reason for the bad framerates of almost all 360 games?)

Or maybe because it is a new console, with mostly ports and hardly any games that were built from the ground up for it? And I didn't know that more or less all the games on it have frame rate problems...
 
Cache is reliable if you work within its limits. There was a thread yonks back where someone measured cache hits/misses, and it was very good indeed. I know that if I write a vector to #0036FF5BA or whatever address, and then the next cycle in a different core read from #0036FF5BA, it'll still be in cache. It won't get removed unless the cache logic deems it's not important, based on patterns of access and recent use. Now if your other 4 threads are hammering the cache so everything is being flushed out and refreshed, yeah it'll have issues. In such a case sharing data across threads will likely be the least of your worries!
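As a rough sketch of "write to an address from one core, read it from the other", here is what that could look like in portable C++. One thread writes a vector into plain shared memory and sets an atomic flag, and the other spins on the flag before reading. `Vec3` and `pass_vector` are made-up names, and the code says nothing about whether the data actually stays in L2 on a given chip; the release/acquire pair only guarantees that the reader sees the completed write.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Vec3 { float x, y, z; };  // hypothetical payload

// Writer stores `input` at a shared location, reader picks it up from there.
Vec3 pass_vector(Vec3 input) {
    Vec3 shared{};                        // plain memory both threads can address
    std::atomic<bool> published{false};   // "the data is ready" flag
    Vec3 seen{};

    std::thread writer([&] {
        shared = input;                                     // ordinary write
        published.store(true, std::memory_order_release);   // publish it
    });

    std::thread reader([&] {
        // Spin until the writer has published, then read the same address.
        while (!published.load(std::memory_order_acquire))
            std::this_thread::yield();
        seen = shared;
    });

    writer.join();
    reader.join();
    assert(published.load());  // sanity check: the flag must be set by now
    return seen;
}
```

Something like `pass_vector({1.0f, 2.0f, 3.0f})` should return the same three components. No special interconnect is involved, just a shared address space and an atomic flag.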
 
Jesus2006 said:
Sorry, but the multithreading on the 360 aint much different than on the PS3. There are even some drawbacks on the 360 which have to be solved by complicated means (e.g. communication between threads is hardly doable, whereas the same is supported "in hardware" on the Ps3).

I'm no technical guru, but in every article I've read so far "the challenge of programming for cell" has been discussed. It seems to me that for that to be mentioned there must be some glaring challenges. It's a brand new architecture. Multi-core processors, like those in the 360, have been out in the PC world for quite a while; the Cell hasn't.

The compiler is a Metrowerks one, same as on the 360 (afair).
And OGL aint much different from DX. So it's not really fair to say 360 is "extremely easy" to program for, but if it's the case, the same would apply to the PS3 :)

Go back and read; I didn't say that. I said "compared to the PS3" it's "extremely easy to program for."

I could be wrong about that, but at least quote me in context.

And Gears aint that technical wonder that one might expect (speaking of framerate and AA).

Uhhh, ok.

When I bought my 360 last year in November, all of the launch games had some technical drawbacks, like bad framerates or no AA.

I seriously disagree with you here. Kameo has a steady framerate, plenty of AA, and HDR to boot, and it looks amazing. Top Spin 2 is the same. I will agree with you that several launch games had their problems (Perfect Dark, GRAW, etc.) but to say "ALL" is being overdramatic.

Currently I cannot see this happening on the PS3, as titles seem to be much more advanced and technical superior over what the 360 is to offer, so I'm really looking forward to it :)

You must have access to pictures and video clips that I don't, because I've yet to see a title coming out for the PS3 that looks as good as Dead Rising, Lost Planet, or Gears of War. In fact, I've yet to see a title that looks even as good as Kameo, which was a launch title. I don't see this "technical superior[ity]" you're talking about.
 
elementOfpower said:
You must have access to pictures and video clips that I don't, because I've yet to see a title coming out for the PS3 that looks as good as Dead Rising, Lost Planet, or Gears of War. In fact, I've yet to see a title that looks even as good as Kameo, which was a launch title. I don't see this "technical superior[ity]" you're talking about.

:rolleyes: How about you check out IGN’s HD footage of Metal Gear Solid 4, then Naughty Dog's new game along with Factor 5's Lair and of course Heavenly Sword.
 
elementOfpower said:
I'm no technical guru, but in every article I've read so far "the challenge of programming for cell" has been discussed. It seems to me that for that to be mentioned there must be some glaring challenges. It's a brand new architecture. Multi-core processors, like those in the 360, have been out in the PC world for quite a while; the Cell hasn't.
I think this is more a slant of the coverage of these processors. Cell is a potentially wide-reaching platform, represents a shift in technology that will be finding similarities in other upcoming designs, and thus is garnering attention for investigation. XeCPU has mostly been overlooked in contrast. It's sufficient for whoever's interested (gaming press only) to say it's a PowerPC variant with 3 cores, without delving into the particulars of writing for XeCPU. To put it another way, where you have stories about Cell being hard to program for, you don't have many saying how easy XeCPU is to program for. Devs have commented on the ease of porting code between symmetrical cores, but I haven't heard any talk (good or bad) about how effectively the cache works, how RAM is shared, dealing with IO execution, dual-threading per core, feeding the vector units to keep them churning out those Flops, and so forth - the silence regarding XeCPU isn't so much that it's easy, but that it's overlooked.

Looking at the hardware, and leaked docs from MS, there are common issues shared across both architectures. When one of these is a problem for Cell, it's also a problem for XeCPU. If more people talked about XeCPU you'd probably be hearing as much, I think. The problems it doesn't share with Cell are memory micromanagement (automatic cache on XeCPU) and a new instruction architecture (PPC instructions on XeCPU). Cell adds a complexity in writing for it. Both show similarities in concerns regarding using the processors efficiently. I think another factor for Cell is how open-ended it is. There are different memory access models, the possibility of using software caching, and the like. More choices add to complexity, but they're to do with efficiency. The development process could be kept at its simplest, but then you're letting a whole load of potential go to waste.
 
Darkon said:
:rolleyes: How about you check out IGN’s HD footage of Metal Gear Solid 4, then Naughty Dog's new game along with Factor 5's Lair and of course Heavenly Sword.

And not everyone has access to IGN's HD content. ;)

I would check out Gametrailers.com. They're still free.
 
Alstrong said:
And not everyone has access to IGN's HD content. ;)

I would check out Gametrailers.com. They're still free.

Gametrailers hasn't got the same quality trailers as IGN’s, and neither does Gamespot from what I have seen.
 
Shifty Geezer said:
... dual-threading per core...

I remember an article/interview with CryTek talking about this issue, in which they compare the Xenon dual threading to a P4 HT and say it's a little worse, whereas the PS3 dual threading is a little better than the P4's (although only one core of course).
 
Darkon said:
:rolleyes: How about you check out IGN’s HD footage of Metal Gear Solid 4, then Naughty Dog's new game along with Factor 5's Lair and of course Heavenly Sword.
minus Heavenly Sword, you're pointing to movies not games as those on 360 referenced above.

let's just wait until November and see what we get from PS3 before we start comparing.
 
Tap In said:
minus Heavenly Sword, you're pointing to movies not games as those on 360 referenced above.

let's just wait until November and see what we get from PS3 before we start comparing.

Déjà Vu :cry:
 
Xenon programming for games isn't necessarily going to be easier than Cell - at least if you want to make use of all the cores. Running multiple threads on a triple core SMP machine is easy enough for a desktop PC running many applications at the same time. The execution threads are automatically distributed between the cores to share the load - which is very easy.

In game programming, which is time critical and tied into the video frame, you need to keep different threads synchronised with each other and with the video frame, and ensure code is deterministic as far as possible, which can get quite complicated. The easiest way to do this is to have one master core acting as a controller handing off work to other slave cores - the way the Cell architecture is organised. This can be done in Xenon as well, but in Cell this is easier because the concept is built into the hardware and software tools, and more cores are available (so you are less likely to need to share a core between multiple processes). What is likely to be more time consuming on Cell is that SPE code and architecture is different to PPE code and architecture, so moving code initially written for the PPE over to the SPE to accelerate it requires a rewrite (probably including the algorithm).
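The master/worker arrangement can be sketched in ordinary C++ threads as below. A controller fills a job list, workers pull jobs through a shared atomic index, and the controller waits for all of them before using the results. This is only an illustration of the pattern on a generic SMP machine: `dispatch_jobs` is a made-up name, squaring a number stands in for real work, and nothing here is actual SPE code or a Cell SDK call.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// The caller plays the master: it owns the job list and collects the results.
std::vector<int> dispatch_jobs(const std::vector<int>& jobs, unsigned workers) {
    assert(workers > 0);
    std::vector<int> results(jobs.size());
    std::atomic<std::size_t> next{0};   // index of the next unclaimed job

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&] {
            for (;;) {
                // Claim one job; fetch_add guarantees no two workers share it.
                std::size_t i = next.fetch_add(1);
                if (i >= jobs.size()) return;
                results[i] = jobs[i] * jobs[i];  // stand-in for real work
            }
        });
    }

    // The master blocks here until every worker has run out of jobs.
    for (auto& t : pool) t.join();
    return results;
}
```

For example, `dispatch_jobs({1, 2, 3, 4}, 3)` should give `{1, 4, 9, 16}` no matter which worker picked up which job, which is the kind of determinism mentioned above. On Cell the "worker" side would have to be rewritten as SPE code, which is the extra cost the post describes.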
 