How many developers are using all 3 cores for X360?

onetimeposter said:
Unlike PS3, where you have to use some if not all of the SPEs instead of only the core PPC, Xbox 360 games can be made on one core only, so which ones are using all 3 of them?

Epic have said their UE3 demo on PS3 was only running on one core.
The Heavenly Sword devs said they're only on one core.
 
scificube said:
I'm thinking of Deano Calver's comments as to not thinking of the PPE as a single 3.2GHz core but two 1.6GHz cores. If you would only use one thread on the PPE you would be effectively using a 1.6GHz CPU.
Multithreading doesn't work like that per se. It's about sharing resources between two threads to make the most use of the resources on chip. E.g., if one thread has missed the cache and is waiting for a 500 cycle memory fetch, the other thread can jump in and use the chip to push itself along until the memory fetch is completed.

With nothing more to go on than a single sentence on a slide I'm not gonna try to guess what Deano's explanation is. If anyone has the video or transcript of his talk please link to it! But Deano was talking about 'thinking' of the PPE as two half-speed cores, not that it's built as such. And that's just one developer's take. Other devs may well have a different POV.
 
I don't imagine anybody is really using the SPEs as yet... I'd say they're all still figuring out just how to code for them, and what they're going to be able to handle.

By the end of the generation, I expect almost all of the real computations will be done in the SPEs, and the PPE used primarily as a controlling core. It's going to be very exciting when developers get a real stranglehold over the SPEs.
 
onetimeposter said:
PS3 has only 1 core

If you mean only the core and not the SPEs, you're wrong.

No, he's not wrong, neither HS nor UE3 were using the SPUs. A few others weren't either; the Gundam game (demoed live) at the PlayStation Meeting wasn't using SPUs either.

As others have said, I doubt much work was being done with SPUs up to E3.
 
Shifty Geezer said:
Why do people still persist with this notion that the PPE is just there to control the SPU's?


well you saw what happened with the Gundam demo in Japan PS meeting where a 500 meter map and 4 mechs were shooting with no physics being done and even then it was stuttering, the graphics were controlled by a 7800 GTX.
 
Titanio said:
No, he's not wrong, neither HS nor UE3 were using the SPUs. A few others weren't either; the Gundam game (demoed live) at the PlayStation Meeting wasn't using SPUs either.

As others have said, I doubt much work was being done with SPUs up to E3.

Well, you saw in the video that even the slightest side movement caused stuttering, and there was no building damage or physics-enabled smoke effects when the mech guns shot at buildings.
 
onetimeposter said:
well you saw what happened with the Gundam demo in Japan PS meeting where a 500 meter map and 4 mechs were shooting with no physics being done and even then it was stuttering, the graphics were controlled by a 7800 GTX.

That's beside the point - the PPE wasn't doing any "control work" for SPUs at that point since the SPUs were doing nothing. And regardless, the SPUs can run their little legs off and the PPE may still not be doing any "control work". The SPUs require no PPE control or management, they can look after themselves, and if the dev wishes, you can use the PPE solely for other application code.
 
For the record, as has been repeated several times on this board, the SPUs are NOT just co-processors or DSPs or enhanced vector units. They are cores, just as the PPE is a core and the XeCPU core is a core. They have specialised enhancements, but they basically operate independently of each other and the PPE. You can even run a full OS on an SPE. The amount of management needed is just a couple of DMA requests, I believe: loading the Apulet code into an SPE and letting it go off and do its own thing. The amount of overhead on the PPE is negligible, and it's there to take up the burden of other functions that the SPUs aren't so hot at.

Anyone claiming the SPUs are anything other than full and proper cores is either mistaken or deliberately spreading FUD.
 
Shifty Geezer said:
You can even run a full OS on a SPE.

Can you? I thought this was one thing that was specifically marked out as something it isn't able to do.


Shifty Geezer said:
The amount of management needed is just a couple of DMA requests, I believe: loading the Apulet code into an SPE and letting it go off and do its own thing.

I think it's even less than that if you want. The SPUs have "mini-kernels" that can pull tasks out of memory by themselves (or even co-operating with other SPUs on the distribution of tasks). So whatever, if anything, is required on the PPE's part to simply get the kernel going in the first instance would be all that was required in that regard - it could from then on just do its own thing completely. Of course the PPE could be maintaining a queue of tasks in memory or whatever, but even that's not necessary.
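The "mini-kernel" idea above can be sketched in ordinary threaded code. This is only an illustration of the self-scheduling pattern, not Cell SDK code: `spu_kernel`, `run_self_scheduled`, `square` and the worker count are all hypothetical names and values chosen for the example. Each worker pulls its own tasks from a shared queue, and the launching thread does nothing beyond starting them.

```python
import queue
import threading


def square(x):
    """Stand-in for a unit of number crunching (hypothetical workload)."""
    return x * x


def spu_kernel(tasks, results, lock):
    """Hypothetical 'mini-kernel': keep pulling tasks until none are left."""
    while True:
        try:
            fn, arg = tasks.get_nowait()
        except queue.Empty:
            return  # queue drained, this worker stops on its own
        value = fn(arg)
        with lock:
            results.append(value)


def run_self_scheduled(jobs, workers=4):
    """Fill a shared queue, start the workers, and let them schedule themselves."""
    tasks = queue.Queue()
    for job in jobs:
        tasks.put(job)
    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=spu_kernel, args=(tasks, results, lock))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    # The launching thread is free to do unrelated work here.
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(sorted(run_self_scheduled([(square, n) for n in range(10)])))
```

The launching thread never hands out individual tasks, which is the point being made about the PPE: once the kernels are running, distribution of work is the workers' own business.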
 
Shifty Geezer said:
Multithreading doesn't work like that per se. It's about sharing resources between two threads to make the most use of the resources on chip. E.g., if one thread has missed the cache and is waiting for a 500 cycle memory fetch, the other thread can jump in and use the chip to push itself along until the memory fetch is completed.

With nothing more to go on than a single sentence on a slide I'm not gonna try to guess what Deano's explanation is. If anyone has the video or transcript of his talk please link to it! But Deano was talking about 'thinking' of the PPE as two half-speed cores, not that it's built as such. And that's just one developer's take. Other devs may well have a different POV.

I understand that. I'm talking about the case where both threads are executing...are there enough resources to go around or not? I would imagine there should be or what's the point other than getting work done when another thread cannot continue to be executed for some reason?
 
scificube said:
I would imagine there should be or what's the point other than getting work done when another thread cannot continue to be executed for some reason?
That is the point! It's an optimisation to make use of idle resources. That's why hyperthreaded P4s don't run 2x faster despite having two threads. Now you could possibly jiggle it to run two threads and share the resource evenly, which would have the same relative performance as two cores at half the speed. That's what Deano seemed to be getting at. And there may be some extra enhancements to make the most of the second thread in XeCPU or PPE - though I've read the details I haven't taken on more than just a gist.
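A toy model makes both halves of that argument concrete. It is a sketch under heavy assumptions, not a description of the real PPE or XeCPU pipeline: one issue slot per cycle, round-robin between hardware threads, and a hypothetical `simulate` function where each instruction is either plain compute or a load that stalls its own thread for a fixed number of cycles.

```python
def simulate(traces):
    """Return the cycles needed to finish every trace on one shared issue slot.

    Each trace is a list of ints, one per instruction: 0 is plain compute,
    n > 0 is a load that misses cache and blocks its thread for n extra cycles.
    Hypothetical model only; real cores are far more complicated.
    """
    n = len(traces)
    pos = [0] * n        # index of the next instruction for each thread
    ready_at = [0] * n   # first cycle at which each thread may issue again
    cycle = 0
    last = -1            # thread that issued most recently (round-robin)
    while any(pos[i] < len(traces[i]) for i in range(n)):
        for k in range(1, n + 1):
            i = (last + k) % n
            if pos[i] < len(traces[i]) and ready_at[i] <= cycle:
                stall = traces[i][pos[i]]
                pos[i] += 1
                ready_at[i] = cycle + 1 + stall
                last = i
                break
        cycle += 1
    # Count any stall still outstanding after the last instruction issued.
    return max([cycle] + ready_at)


if __name__ == "__main__":
    compute = [0] * 100          # never stalls
    missy = [0, 0, 0, 20] * 5    # stalls 20 cycles after every fourth instruction
    print(simulate([compute]), simulate([compute, compute]))
    print(simulate([missy]), simulate([missy, missy]))
```

In this model two threads that never stall take exactly twice as long as one, so each behaves like a half-speed core. Two threads that miss cache often finish in less than the sum of their solo times, because one issues while the other waits.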
 
Shifty Geezer said:
That is the point! It's an optimisation to make use of idle resources. That's why hyperthreaded P4s don't run 2x faster despite having two threads. Now you could possibly jiggle it to run two threads and share the resource evenly, which would have the same relative performance as two cores at half the speed. That's what Deano seemed to be getting at. And there may be some extra enhancements to make the most of the second thread in XeCPU or PPE - though I've read the details I haven't taken on more than just a gist.



The PPE is based on the POWER Architecture, which is the basis of IBM's line of POWER and PowerPC offerings. The PPE is not intended as the primary processor for the system, but acts as a controller for the other eight SPEs, which handle most of the computational workload. However, the PPE is used to run conventional OSes due to its similarity to other 64-bit PowerPC processors, and because the SPEs are designed for vectorized floating point code execution. The PPE contains a 32 KiB instruction and data Level 1 cache and a 512 KiB Level 2 cache. Additionally, IBM has included a VMX (AltiVec) unit in the Cell PPE. [4]

http://en.wikipedia.org/wiki/Cell_processor
 
onetimeposter said:
The PPE is based on the POWER Architecture, which is the basis of IBM's line of POWER and PowerPC offerings. The PPE is not intended as the primary processor for the system, but acts as a controller for the other eight SPEs, which handle most of the computational workload.


The "handles most of the computational workload" bit is correct, undoubtedly, but it's worth repeating (and seemingly needs to be repeated again and again) that the PPE need not assume much or any "controlling" role depending on the programming model used. The PPE overhead associated with tasking the SPUs can be negligible or none, and the PPE can be free to occupy itself elsewhere.

This presentation from GDC is informative:

http://research.scea.com/research/html/CellGDC05/24.html

Note the involvement (or lack thereof) of the PPE in SPU business in the 3 programming models presented there.

It's also worth noting Hofstee's comments on regarding the SPUs to be "subordinate" to the PPE.
 
onetimeposter said:
The PPE contains a 32 KiB instruction and data Level 1 cache and a 512 KiB Level 2 cache. Additionally, IBM has included a VMX (AltiVec) unit in the Cell PPE. [4]

http://en.wikipedia.org/wiki/Cell_processor

Almost right, though the latest CELL revision contains two full hardware VMX units, which can be seen from the die shots going around on the internet. This means the CELL PPE can run 2 independent threads at full speed. The only limitation that matters is the shared L2 cache.

But the situation is even worse on Xenon. Its cores contain only one VMX unit each, enabling the Xbox 360 to do 3 "real" threads at once, only one more than the CELL PPE can handle. (Of course the Xenon VMX unit can do hyperthreading, which gives a little performance boost, but not much.) But what is much worse is that all these threads on Xenon are blocked by each other because they all share one quite small (for so many threads) 1 MB L2 cache.
 
onetimeposter said:
The PPE is based on the POWER Architecture, which is the basis of IBM's line of POWER and PowerPC offerings. The PPE is not intended as the primary processor for the system, but acts as a controller for the other eight SPEs, which handle most of the computational workload. However, the PPE is used to run conventional OSes due to its similarity to other 64-bit PowerPC processors, and because the SPEs are designed for vectorized floating point code execution. The PPE contains a 32 KiB instruction and data Level 1 cache and a 512 KiB Level 2 cache. Additionally, IBM has included a VMX (AltiVec) unit in the Cell PPE. [4]

http://en.wikipedia.org/wiki/Cell_processor
Yes, it's not the primary processor. If you've got your workload running on the PPE you're wasting Cell. You want to keep the SPEs doing as much work as possible and the PPE acts as a controller for this. But that isn't its only function. It's PPC compliant - in essence a Power core but without OOOE. You can if you want run all your code on the PPE and leave the SPEs sitting idle. As nothing but a controller chip the PPE is total overkill. It's got a reasonable cache and VMX units and enough processing power to handle a hefty workload. Getting the SPEs to do something useful only requires a bit of signalling (AFAIK, I don't claim to be the world's most authoritative expert!) to get the SPE to load its Apulet and get to work.

In the case of PS3 the PPE could be thought of as controlling the main loop, player interactions and things, conscripting the SPEs to do intensive number crunching for physics and whatnot. It functions both as a scheduler and a useful unit in its own right, in the same way a sergeant both commands his troops to perform actions and performs actions of his own. A primary role of a sergeant is the control of his squad, but the majority of the squad's activities are undertaken by the squaddies. But ordering the squaddies isn't the sergeant's only function and he mucks in with the rest of them.
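As a rough sketch of that sergeant-and-squaddies split (generic thread-pool code, not anything from a PS3 SDK; `integrate`, `handle_input`, `run_frames` and the toy physics are made up for illustration): the main loop hands the number crunching to a pool each frame, does its own work while the pool is busy, then collects the results.

```python
from concurrent.futures import ThreadPoolExecutor


def integrate(body, dt):
    """Hypothetical physics step: advance a (position, velocity) pair by dt."""
    pos, vel = body
    return (pos + vel * dt, vel)


def handle_input(frame):
    """Stand-in for the main thread's own per-frame work."""
    return f"input handled for frame {frame}"


def run_frames(bodies, frames, dt=1.0, workers=4):
    """Main loop: farm out the physics, muck in with its own work, then sync."""
    log = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for frame in range(frames):
            # Conscript the workers for the number crunching.
            futures = [pool.submit(integrate, body, dt) for body in bodies]
            # Meanwhile the main thread gets on with its own job.
            log.append(handle_input(frame))
            # Collect the results before starting the next frame.
            bodies = [f.result() for f in futures]
    return bodies, log


if __name__ == "__main__":
    final, log = run_frames([(0.0, 1.0), (10.0, -2.0)], frames=3)
    print(final)
    print(log)
```

Here the controlling thread's scheduling cost is one `submit` per job, and it still does useful work of its own each frame, which is the dual role described above.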
 
The PPE is needed to manage the SPEs, plus it's got different strengths and weaknesses to the SPEs. If you just had SPEs there'd be some things Cell would be REALLY bad at, such as heavily branched control code. That'd be the compromise reached between Toshiba's 'only SPEs' idea and IBM's 'only PPEs' idea.
 