The SPE as general purpose processor

Frank

In the other thread about this, it was asked how far an SPE can function as a general-purpose CPU. I have a very simple answer to that (which I cannot post there, as that thread is now locked):

You cannot get more general purpose than running an OS. And AFAIK, the OS on the PS3 runs on one of the SPEs. So an SPE is definitely capable of doing just about anything you would want it to, with a response time fast enough to service I/O and timing. So, it's a good general-purpose processor.

Is it as good at that as the PPE? No. But good enough.
 
One of the SPEs is reserved for the use of the OS to provide certain services, but an SPE is not capable of running a modern OS on its own. SPEs do not support different privilege modes for code (on a modern OS the kernel runs at a higher privilege level than user code), they do not support any kind of virtual memory or memory protection, they have very limited support for interrupts, and they do not have full access to hardware for I/O.
 
I'm by no means an expert on the topic, but it seems a few of the characteristics you mention are really a feature of the OS rather than indications of hardware support.
 
Certainly, they are useful features for the latest modern processor. Additionally, general processors have been on the scene and done their job for, what, nearly 3 decades w/o the latest and greatest in hardware threading/authentication support.

Suffice to say, I don't think any of that affects the lowest bar on what is considered a "general processor". Does an ARM or a MIPS or a 603 count as a general processor? Hell, why not a PPC440? I think that is the region we are looking at to determine if an SPE can be in the same genre or not.
 
Mr. Hanky said:
Certainly, they are useful features for the latest modern processor. Additionally, general processors have been on the scene and done their job for, what, nearly 3 decades w/o the latest and greatest in hardware threading/authentication support.

Suffice to say, I don't think any of that affects the lowest bar on what is considered a "general processor". Does an ARM or a MIPS or a 603 count as a general processor? Hell, why not a PPC440? I think that is the region we are looking at to determine if an SPE can be in the same genre or not.
Agreed.
 
The OP suggested that running an OS is a criterion for a general purpose processor. He then said that since the SPEs run the OS on the PS3 they should be counted as general purpose processors. The SPEs do not run the OS on the PS3 (one SPE is used by the OS to accelerate some OS services) and a modern OS like Linux, Windows XP, Mac OS X or indeed the Cell OS cannot run on an SPE because an SPE does not offer the necessary hardware features. On that basis his argument is majorly flawed.

None of that means that SPEs can't be called general purpose processors. They are capable of running pretty much any standard C++ code, subject to memory restrictions, so they are certainly more general purpose than VUs or GPU shader units.
 
Any OS that wants to be considered "modern" must necessarily have the capability to do preemptive multitasking. OSes that are "modern" by this definition include Windows from Win95 onwards, pretty much all versions of Unix/Linux, MacOSX, BeOS etc. A standalone SPE does not have the hardware capabilities needed to run preemptive multitasking on its own, much less pre-empt any of the other SPEs; as such, it is less "general" than even a late-80s 386SX.
 
SPEs do not support different privilege modes for code (on a modern OS the kernel runs at a higher privilege level than user code), they do not support any kind of virtual memory or memory protection, they have very limited support for interrupts and they do not have full access to hardware for I/O.

I believe you are correct on privilege modes and interrupts but when DMA'ing they have full virtual memory / memory protection support (there's an MMU in each SPE) and access to I/O. There's no memory protection in the local store but you don't need it there.

The full OS would run on the PPE because it has the facilities (privilege modes) for it. These could be put into an SPE but there's not much point as that's what the PPE is there for.

The points made in the other thread were in answer to a poster who said the SPE couldn't run general purpose code. You should be able to run pretty much anything you like but they're more designed for compute intensive tasks rather than control intensive tasks. There are certain types of algorithms / data structures you should avoid as they will run slowly. The PPE will be better at these sorts of tasks (and do less well on compute intensive tasks).

The SPEs don't have OOO, rename registers or branch predictors, but that's because they don't need them. There are a large number of registers, branch hints and branch replacement instructions in their place.
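To illustrate one common reading of "branch replacement": a data-dependent branch can often be rewritten as a compare that produces a mask, followed by a bitwise select. The sketch below is plain, portable C++ and not SPU intrinsics; `select_bits` and `clamp_max` are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise select: take bits from b where mask is 1, from a where mask is 0.
// Plain C++ stand-in for a hardware "select" operation (hypothetical helper).
inline std::uint32_t select_bits(std::uint32_t a, std::uint32_t b, std::uint32_t mask) {
    return (a & ~mask) | (b & mask);
}

// Clamp every element to at most `limit` without a data-dependent branch
// inside the loop body.
inline void clamp_max(std::uint32_t* data, std::size_t count, std::uint32_t limit) {
    assert(data != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        // All ones when data[i] > limit, all zeros otherwise.
        std::uint32_t mask = 0u - static_cast<std::uint32_t>(data[i] > limit);
        data[i] = select_bits(data[i], limit, mask);
    }
}
```

The loop does the same work on every iteration regardless of the data, which is the property that matters on a core without a branch predictor.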

That said there's no point individually comparing SPEs or PPEs to other processors because in Cell they work together.


BTW, Manic Miner was a Spectrum game, wasn't it? An SPE should run it like a bat out of hell!

Besides, the SPEs are much more sophisticated CPUs than any of the 8 bit chips.
 
ADEX said:
I believe you are correct on privilege modes and interrupts but when DMA'ing they have full virtual memory / memory protection support (there's an MMU in each SPE) and access to I/O. There's no memory protection in the local store but you don't need it there.
The SPE isn't able to control the actual memory mappings used for virtual memory/memory protection (unless the PPE actually maps the page tables themselves into the memory area that the SPE is allowed to access).

However, other than the OS stuff, the SPE should be perfectly capable of running any kind of program just fine, albeit with a substantially different performance profile than you would expect from a more 'standard' CPU.
 
Are SPE accesses through the DMA engine atomic? I think there are a few OS duties that really get funky if they can be interrupted halfway through by another operation.

I know that general purpose multiprocessors have atomic memory operations for synchronization.
 
3dilettante said:
Are SPE accesses through the DMA engine atomic? I think there are a few OS duties that really get funky if they can be interrupted halfway through by another operation.

I know that general purpose multiprocessors have atomic memory operations for synchronization.

This might be it?

The DMA control unit processes queues of DMA commands. It consists of two DMA queues, one each for PPE-initiated DMA and SPE-initiated DMA. The MFC also contains an atomic unit (ATO), which performs atomic DMA updates for synchronization between software running on various SPEs and the PPE. Atomic DMA commands are similar to PowerPC locking primitives (lwarx/stwcx).

http://www-128.ibm.com/developerworks/power/library/pa-celldmas/
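For anyone unfamiliar with the lwarx/stwcx. pattern the quote compares this to: you read a value, compute a new one, and the store only succeeds if nobody else touched the location in between; otherwise you retry. Here is a rough analogue in portable C++ using `std::atomic`, not the actual MFC atomic commands. `add_capped` is a made-up example function.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Add `delta` to a shared counter, saturating at `cap`.
// Returns the value that was actually stored.
inline std::uint32_t add_capped(std::atomic<std::uint32_t>& counter,
                                std::uint32_t delta, std::uint32_t cap) {
    assert(delta <= cap);
    // Roughly the "load and reserve" step.
    std::uint32_t seen = counter.load(std::memory_order_relaxed);
    for (;;) {
        std::uint32_t want = (seen > cap - delta) ? cap : seen + delta;
        // Roughly the "store conditional" step: succeeds only if the counter
        // still holds `seen`. On failure `seen` is refreshed and we retry.
        if (counter.compare_exchange_weak(seen, want,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed)) {
            return want;
        }
    }
}
```

The retry loop is what makes the read-modify-write behave as a single indivisible step from the point of view of the other cores.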
 
That's where it is. I knew the PPE had that ability, but I wasn't sure if the SPEs were granted similar authority.
 
arjan de lumens said:
Any OS that wants to be considered "modern" must necessarily have the capability to do preemptive multitasking. OSes that are "modern" by this definition include Windows from Win95 onwards, pretty much all versions of Unix/Linux, MacOSX, BeOS etc. A standalone SPE does not have the hardware capabilities needed to run preemptive multitasking on its own, much less pre-empt any of the other SPEs; as such, it is less "general" than even a late-80s 386SX.
Yes. But is the PPE there to run all that OS code, or just intercept requests and delegate them? As ADEX said, what would be the use in replicating the same circuitry when there is no need and you only need a central place to "catch" and distribute the requests?

The PPE is there for coordination. So, when an interrupt or a task switch timer event happens, you have a lookup table that jumps you to a very small piece of code that determines where to jump next, to handle (process) the task. And when you have an SPE actually run the show, the PPE only has to add the state to the global event table. That's less than ten instructions. And then the SPE can spend as much time processing all those things as it needs to do.

It's essentially just a vector table and writing some lookup values to a global table, at most interrupting the SPE that runs the OS if it gets an NMI. And that SPE is what does all of the work in running the OS.

Sure, they could have used a vector table chained to one of the SPEs directly, but does it matter? At least this way you have a single, hard entry point that can be handled completely in software, with minimal overhead, which is useful as you might want to use the PPE for other things. And that SPE is more than capable of doing everything else.

Looks like good engineering to me.
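To make the scheme described in this post concrete, here is a toy, single-threaded sketch of it: a vector table, a global event table, a "coordinator" side that only records events, and a "worker" side that does the processing. Everything in it (`Event`, `EventTable`, `raise`, `drain`, the two handlers) is hypothetical and says nothing about how the PS3 OS is actually structured.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical event record: which vector fired, plus one word of payload.
struct Event {
    std::uint32_t vector_no;
    std::uint32_t payload;
};

// Hypothetical global event table: a fixed-size ring buffer.
class EventTable {
public:
    bool push(Event e) {
        if (count_ == slots_.size()) return false;  // full: event is dropped
        slots_[(head_ + count_) % slots_.size()] = e;
        ++count_;
        return true;
    }
    bool pop(Event& out) {
        if (count_ == 0) return false;
        out = slots_[head_];
        head_ = (head_ + 1) % slots_.size();
        --count_;
        return true;
    }
private:
    std::array<Event, 16> slots_{};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};

using Handler = std::uint32_t (*)(std::uint32_t payload);

inline std::uint32_t on_timer(std::uint32_t p) { return p + 1; }
inline std::uint32_t on_io(std::uint32_t p) { return p * 2; }

// Vector table: vector number -> handler.
constexpr std::array<Handler, 2> kVectorTable = {on_timer, on_io};

// Coordinator side: a few instructions that only record the event.
inline bool raise(EventTable& table, std::uint32_t vector_no, std::uint32_t payload) {
    assert(vector_no < kVectorTable.size());
    return table.push(Event{vector_no, payload});
}

// Worker side: drains the table and does the actual processing.
// Returns the sum of the handler results so the effect is observable.
inline std::uint32_t drain(EventTable& table) {
    std::uint32_t total = 0;
    Event e{};
    while (table.pop(e)) total += kVectorTable[e.vector_no](e.payload);
    return total;
}
```

A real version would need the table to be safe for concurrent access from both sides; that is left out here to keep the shape of the idea visible.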
 
IBM sum it up pretty well in the Cell Broadband Engine Architecture document:

Synergistic Processor Unit
The intent of the SPU is to fill a void between general-purpose processors and special-purpose hardware. Where general-purpose processors aim to achieve the best average performance on a broad set of applications, and special-purpose hardware aims to achieve the best performance on a single application, the SPU aims to achieve leadership performance on critical workloads for game, media, and broadband systems. The intent of the SPU and the CBEA is to provide a high degree of control to expert (real-time) programmers while maintaining ease of programming.

...

The SPU has the following restrictions:
• No direct access to main storage (access to main storage using MFC facilities only)
• No distinction between user mode and privileged state
• No access to critical system control such as page-table entries (this restriction should be enforced by PPE privileged software).
• No synchronization facilities for shared local storage access
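The first restriction shapes how SPU code is written: data is pulled into local storage, worked on there, and pushed back out. Below is a host-side simulation of that pattern in ordinary C++. `dma_get` and `dma_put` are stand-ins implemented with `memcpy`, not the real MFC commands (which are asynchronous), and `kChunk` is an arbitrary buffer size picked for the demo.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Size of the simulated local buffer, in floats (arbitrary demo value).
constexpr std::size_t kChunk = 1024;

// Stand-ins for DMA transfers between "main storage" and "local storage".
inline void dma_get(float* local, const float* main_mem, std::size_t n) {
    std::memcpy(local, main_mem, n * sizeof(float));
}
inline void dma_put(float* main_mem, const float* local, std::size_t n) {
    std::memcpy(main_mem, local, n * sizeof(float));
}

// Scale a large array that lives in "main storage". The compute loop only
// ever reads and writes the local buffer; main storage is touched solely
// through the two transfer helpers.
inline void scale_in_chunks(std::vector<float>& main_mem, float factor) {
    float local[kChunk];
    for (std::size_t off = 0; off < main_mem.size(); off += kChunk) {
        std::size_t n = std::min(kChunk, main_mem.size() - off);
        assert(n <= kChunk);
        dma_get(local, main_mem.data() + off, n);
        for (std::size_t i = 0; i < n; ++i) local[i] *= factor;
        dma_put(main_mem.data() + off, local, n);
    }
}
```

On real hardware you would typically keep two local buffers and overlap the transfer of one chunk with the computation on the other, since the transfers do not block the core.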
 
IBM said:
The SPU has the following restrictions:
• No direct access to main storage (access to main storage using MFC facilities only)
• No distinction between user mode and privileged state
• No access to critical system control such as page-table entries (this restriction should be enforced by PPE privileged software).
• No synchronization facilities for shared local storage access

Ok.

- No direct access to main storage: doesn't matter at all.
- Switching between user and privileged mode: this only matters if you want to prevent other processes from accessing hardware directly. If you want your program to run: don't do that, or delegate it.
- Page tables: Ok, so you need the PPE to write a value if you want it to. Can an SPE interrupt the PPE?
- Synchronization for the local storage: no need.
 
DiGuru said:
Ok.

- No direct access to main storage: doesn't matter at all.
- Switching between user and privileged mode: this only matters if you want to prevent other processes from accessing hardware directly. If you want your program to run: don't do that, or delegate it.
- Page tables: Ok, so you need the PPE to write a value if you want it to. Can an SPE interrupt the PPE?
- Synchronization for the local storage: no need.
Yeah, it can do most general purpose computing tasks except running a modern OS. Which was my point.
 
heliosphere said:
Yeah, it can do most general purpose computing tasks except running a modern OS. Which was my point.
Well, I can make a long list of things it cannot do either. Like directly flipping a bit on your hard disk, or changing the state of one of the LEDs on your keyboard. But neither can the PPE, or the CPU in a PC. All of them could initiate the action to make that happen, though. And some of those actions are executed by other processors, and others by dedicated hardware.

Some of those other processors could even run their own OS. Most of them do, actually. That's what firmware is all about.
 
Can the SPUs run general purpose C++ code? Yes.

Could some hypothetical system that had only SPUs and no PPE run a modern OS? No.

Does that matter? No, because that's why the Cell has a PPE - to run the OS and manage the SPUs.

Being able to run the OS or not is not really relevant to the question of how 'general purpose' the SPUs are - they were never designed to do that. As IBM say, the SPUs are designed to fall somewhere between a traditional general purpose processor and traditional special-purpose hardware. Arguments over whether they are general purpose or not are somewhat pointless in that light - they were designed to fall in the grey-area between special and general purpose.
 
Hardware support for memory protection isn't really needed as long as virtualization exists, and virtualization doesn't necessarily imply hardware support either, if you are running a runtime-compiled language (e.g. CLR/Java VM).
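As a rough illustration of what software-enforced protection means: if the only way guest code can touch memory is through checked accessors, it cannot reach outside its own region, with no MMU involved. The `GuestMemory` class below is a made-up minimal example of that idea, not how the CLR or a Java VM is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical guest address space. Every access is bounds-checked in
// software, so an out-of-range address is rejected instead of touching
// memory that belongs to someone else.
class GuestMemory {
public:
    explicit GuestMemory(std::size_t bytes) : bytes_(bytes, 0) {
        assert(bytes > 0);
    }
    // Reads one byte into `out`; returns false if `addr` is out of range.
    bool load(std::size_t addr, std::uint8_t& out) const {
        if (addr >= bytes_.size()) return false;
        out = bytes_[addr];
        return true;
    }
    // Writes one byte; returns false if `addr` is out of range.
    bool store(std::size_t addr, std::uint8_t value) {
        if (addr >= bytes_.size()) return false;
        bytes_[addr] = value;
        return true;
    }
private:
    std::vector<std::uint8_t> bytes_;
};
```

The cost is a check on every access, which a runtime compiler can often hoist or eliminate when it can prove the address is in range.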
 