PS3's Cell implementation is gimped?

I thought the PPU can set up the entire SPU environment at will ? They may not need to fix the SPU #... just their relative position within the pictured framework above.

I don't know for sure, not having a PS3 dev kit. On Linux, you don't seem to have any way to know for sure exactly which SPEs are physically located where.

I suppose you could do it by benchmarking madly. :)
 
I guess the worst part of all of this is that PS3 developers won't be able to take advantage of some algorithms (i.e. high-performing ones that rely on affinity) developed for "full" Cell processors. Effectively, you split the Cell development community in two: PS3/Cell developers and other/Cell developers.
 
Maybe. You could present people with a consistent "3 up, 3 down" view of the Cell.

Consoles, though, are in a way hard realtime; it is perfectly fine to have every frame ready .2ms before you need it, but dropping frames gets you dinged points in reviews, so if performance is even SLIGHTLY unpredictable, you have to leave larger margins.

Yap... I was told Resistance catered for worst case scenarios. That's why the game runs so smoothly and responsively even when action becomes heavy. Beyond a certain point, naturally we will run out of resources.

I don't know for sure, not having a PS3 dev kit. On Linux, you don't seem to have any way to know for sure exactly which SPEs are physically located where.

I suppose you could do it by benchmarking madly. :)

*chuckle*

I am thinking if affinity is a feature that can be turned on and off based on the platform, then it is probably a system software thing (instead of a CPU thing). So it's up to Sony what they think is best.
 
As I said before, PS3 Linux is open, and it's a trivial problem to test all possible combinations of 6 source to 6 drain DMAs. I don't think it's actually possible to reach the peak EIB transfer rate using 6 SPEs only, so there should be headroom left for the evil Sony hypervisor :)
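To give a feel for how small that search space is, here is a minimal sketch that only enumerates the pairings. It assumes one reading of "6 source to 6 drain": each SPE sends to exactly one other SPE and receives from exactly one. The function name `dma_pairings` is hypothetical, and the actual DMA timing would still have to be measured on the hardware, which this does not attempt.

```python
from itertools import permutations


def dma_pairings(n_spes=6):
    """Yield every source-to-drain assignment for n_spes SPEs.

    Assumption: each SPE is the source of exactly one transfer and the
    drain of exactly one transfer, and never sends to itself.
    """
    spes = range(n_spes)
    for drains in permutations(spes):
        # Skip assignments where any SPE would DMA to itself.
        if all(src != dst for src, dst in zip(spes, drains)):
            yield list(zip(spes, drains))


if __name__ == "__main__":
    pairings = list(dma_pairings())
    print(len(pairings), "pairings to benchmark")
    print("first pairing:", pairings[0])
```

Under that assumption there are only a few hundred cases, so timing each one in turn would be a short benchmark run rather than a research project.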
 
As I said before, PS3 Linux is open, and it's a trivial problem to test all possible combinations of 6 source to 6 drain DMAs. I don't think it's actually possible to reach the peak EIB transfer rate using 6 SPEs only, so there should be headroom left for the evil Sony hypervisor :)

That's right. I forgot about that.
 
I would like to know how much unpredictability this adds in reality. Given that a lot of today's games don't have a fixed allocation of SPUs for certain tasks (not including the hypervisor SPU), I'd say it is close to nothing. If they start to have tasks assigned to fixed SPUs then it might have some effect, but probably still a very small one.

It should also be noted that the internal bus is completely configurable through firmware, so at startup they could likely assign any number to any SPU, helping to ensure you have the supervisor on a fixed position in the ring and that consecutively numbered SPUs are always physically consecutive if that is what you want them to be. The only uncertainty would be whether two consecutively numbered SPUs have the I/O interface (BIF/IOF) placed between them. That uncertainty is only there for two SPUs, so if you design a program using two consecutive SPUs that must not have the I/O interface placed between them for some performance reason, then you should just avoid using those two positions when assigning SPUs for those tasks.

Please also keep in mind that the bus consists of four rings moving data, two moving data clockwise and two moving data counterclockwise. Pretty flexible, I'd say.
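As a rough illustration of why rings in both directions soften the effect of position, here is a small sketch that counts hops between two positions on a ring when a transfer may go either way. The ring size of 12 and the plain integer positions are assumptions for illustration only; the thread does not give the physical layout, and `hops` is a hypothetical helper, not part of any SDK.

```python
RING_SIZE = 12  # assumption: number of elements attached to the bus


def hops(src, dst, ring_size=RING_SIZE):
    """Return the shortest hop count between two ring positions.

    A transfer can travel clockwise or counterclockwise, so the cost
    is the smaller of the two distances around the ring.
    """
    clockwise = (dst - src) % ring_size
    counterclockwise = (src - dst) % ring_size
    return min(clockwise, counterclockwise)


if __name__ == "__main__":
    for src, dst in [(0, 1), (0, 11), (0, 6), (2, 5)]:
        print(f"position {src} -> position {dst}: {hops(src, dst)} hop(s)")
```

With both directions available, no transfer in this model needs more than half the ring, and an extra element sitting between two otherwise adjacent units adds a single hop.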

I seriously think that the blog writer is barking up the wrong tree if he thinks this anomaly would have the significant impact he is implying.
 
I suspect the HV does a lot less work during a purely computational load under Linux than it does, say, in a game that may be streaming data from blu-ray.

BTW, yes, I did make an obvious error in the diagram above; it's SPE 4-7, not 5-8, on the bottom. Sorry!

I am not sure that you would want the GameOS/HV to abstract data streaming from either HDD or Blu-Ray disc (besides the natural I/O tasks an OS would want to run)... you would handle it in your own code I'd think.
 
The Cell SDK simply documents that affinity isn't available on the PS3 platform.

This could mean that it is not exposed to the Other OS, it does not say much unfortunately (either way I mean) about the Game OS which runs concurrently to the HV...
 
AFAICR, it was done (although this is a single example and covers DP FP heavy processing):

http://www.cs.berkeley.edu/~samw/research/papers/sc07.pdf

It shows benchmarks run on the blade with 1 Cell BE (8 SPEs) vs PS3 (6 SPEs).

I can't say I have too much faith in a paper that tries to draw a meaningful conclusion about real-world performance from a test that results in a 2.2 GHz dual-core AX2 significantly outperforming a 2.33 GHz quad-core Clovertown.

Are we to believe that Clovertown only achieving 3% of its peak throughput is a "normal" situation?
 
I am not sure that you would want the GameOS/HV to abstract data streaming from either HDD or Blu-Ray disc (besides the natural I/O tasks an OS would want to run)... you would handle it in your own code I'd think.

That's the point! You can't, because you don't have access to the hardware, only to the hypervisor's abstraction layer. The whole purpose of the hypervisor is to require you to use it.

This could mean that it is not exposed to the Other OS, it does not say much unfortunately (either way I mean) about the Game OS which runs concurrently to the HV...

They both run concurrently; seems likely there's a lot of similarity in what the hypervisor gives them.
 
Duff information and scaremongering. They go together like nuts & gum!

Duff information
Reserved SPU runs the hypervisor? No, it doesn't.. the hypervisor runs on the PPU. The reserved SPU runs security stuff.

Scaremongering
A system update could cause the reserved SPU to eat all bandwidth? Yes.. that would be technically possible, but it would be retarded. It's also equally possible (and equally unlikely) that a future Xbox 360 firmware update ends up stealing an entire core. But I've yet to see anyone mention that.

In reality, SPU positioning on the EIB isn't (in my view) likely to be something that affects game performance. Developers are way more likely to look for algorithmic optimisations (that involve better data structure, better SPU layout etc) than they are likely to look at optimising based on the position of SPUs across the EIB. Which, for a number of reasons, is gonna be a real bitch to do.

Still, it all makes for a good and scary blog/forum post, doesn't it?

Dean
 
And that settles that really. I don't think there's anything more to be gained in this discussion -> Thread Locked. PM me (or use site feedback if you are unable) to make a case for re-opening this thread.
 
I'm just wondering if any PS3 games ever made full use of the 6th SPU which was supposedly shared between the OS and games, with one reserved full time for the hypervisor and the other 5 free for use?
 
I think it is an interesting post-mortem topic worth revisiting. Now that we can see the results, we can have a better discussion of its implementation.
 