The role of the PPE?

Titanio said:
However, NONE of this determines what code can be run where, in and of itself. As I read it, your argument
Instead it determines what code actually WILL be run, which was my point. Why else would I state developer experience and skill as a major factor?

If it's not practical to move something off of the main core because it creates too many problems, it will not get done regardless of whether or not it could potentially 'run' on the extra core. Of course, what's 'practical' depends on the programmer's skill, experience and ingenuity.

For example, with Xenon, everything that can run on the primary core could 'run' on the secondary or tertiary core, but that doesn't mean developers can suddenly split the game engine across all 3 cores and utilize them effectively. They still struggle with taking a game engine and splitting it over 3 cores despite having completely homogeneous cores.

Why do you expect these problems to just go away when a developer looks at moving some code from the PPE to an SPE? Doesn't he face the exact same high-level issues a programmer faces on 360 when he tries to move something from core 0 to core 1?
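To make that concrete, here is a minimal C++ sketch (std::thread standing in for a second core, and updatePhysics/updateAI being invented system names) of why 'just run it on the other core' isn't free: as soon as two parts of the frame touch the same data, the programmer has to add synchronisation, and that cost is the same whether the second core is another Xenon core or an SPE.

    #include <mutex>
    #include <thread>
    #include <vector>

    // Hypothetical shared game state that both systems below need each frame.
    struct World {
        std::vector<float> positions;
        std::mutex lock;   // needed as soon as two cores touch the same data
    };

    // Stays on the 'main' core (core 0 on Xenon, the PPE on Cell).
    void updatePhysics(World& w) {
        std::lock_guard<std::mutex> g(w.lock);
        for (float& p : w.positions) p += 0.016f;
    }

    // Moved to a second core. The code runs fine there, but because it reads
    // the same data it still has to wait on the lock; the dependency, not the
    // kind of core, is what makes the split hard.
    void updateAI(World& w, float& out) {
        std::lock_guard<std::mutex> g(w.lock);
        out = 0.0f;
        for (float p : w.positions) out += p;
    }

    int main() {
        World w;
        w.positions.resize(1024, 0.0f);
        float aiResult = 0.0f;
        std::thread second(updateAI, std::ref(w), std::ref(aiResult));
        updatePhysics(w);
        second.join();
        return 0;
    }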
 
scooby_dooby said:
If it's not practical to move something off of the main core because it creates too many problems, it will not get done regardless of whether or not it could potentially 'run' on the extra core.

Synchronisation or parallelisation of said task would not prevent it being moved onto a SPE, and I believe that was your argument.

If you're saying, "how does the developer tease out these "tasks" in the first place?" that's a completely different question. Games are not monolithic programs; they are sequences of distinct tasks. That's an issue facing any architecture, not Cell specifically. You're basically asking "how parallel is a game?". There's parallelism on multiple levels. Obviously if you can't break up your game at all, you've got problems, but I can't imagine that ever being the case.

Your previous argument sounded like "how do I make audio parallel, or how do I make hair simulation parallel?" or whatever, vs "how do I make a game parallel". Once you have a distinct task that can be run in parallel to others, it doesn't matter whether it itself can be parallelised or not, as far as where you execute it (PPE or SPE) is concerned.
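As a small illustration of that distinction (std::async standing in for handing work to an SPE, and mixAudioBlock being an invented, strictly serial task): the function below cannot be parallelised internally, yet nothing stops it being shipped off to another core while the rest of the frame continues.

    #include <cstddef>
    #include <future>
    #include <vector>

    // An invented, strictly serial task: each sample depends on the previous
    // one, so the work *inside* it cannot be split up.
    std::vector<float> mixAudioBlock(std::vector<float> in) {
        for (std::size_t i = 1; i < in.size(); ++i)
            in[i] += 0.5f * in[i - 1];   // feedback filter, inherently sequential
        return in;
    }

    int main() {
        std::vector<float> block(512, 1.0f);

        // The task can still run on another core while the caller does other
        // work; its internal serialism says nothing about *where* it runs.
        auto pending = std::async(std::launch::async, mixAudioBlock, block);

        // ... the rest of the frame runs here on the main core ...

        std::vector<float> mixed = pending.get();   // collect the result
        return mixed.empty() ? 1 : 0;
    }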
 
Titanio said:
Synchronisation or parallelisation of said task would not prevent it being moved onto a SPE, and I believe that was your argument.
I was never talking about distinct tasks at all; that's something you introduced here. I was merely referring to the inherent problems in multi-threading a game engine on multiple cores.

I only used the word 'synchronize' because I was assuming that was the main issue, but 'multi-thread' would have been a more accurate description. Any task that cannot be effectively multi-threaded, or that is interdependent on other tasks/data, may have to remain on the main PPE, even if it's perfectly suited to a stream processor like the SPE.
 
scooby_dooby said:
If it's not practical to move something off of the main core because it creates too many problems, it will not get done regardless of whether or not it could potentially 'run' on the extra core. Of course, what's 'practical' depends on the programmer's skill, experience and ingenuity.
That's not saying the same as...
I said any code that can't be synchronized across multiple cores will have to be run on the PPE; it would seem to me this would be predominantly inter-dependent tasks.
This statement says if a task can't be synchronized, it has to be done on the PPE. Audio 'can't' be synchronized as it doesn't need to be, but it can be (and is) run on an SPE. I'm not sure what you mean by synchronized and maybe you just picked the wrong wording? As Titanio says, synchronising means things like sharing data and keeping in time. If you mean a task that can't be isolated from the other code has to stay on the PPE as in a conventional program, I guess that's true. That'd perhaps be better explained as 'where code is dependent on other code and cannot be isolated onto a separate processor, it will be confined to the PPE'.

Even then, there's no law that says the majority of code would have to be kept on the PPE. Given 15 tasks of a main loop, what's to stop 14 of them being executed on an SPE and 1 being executed on the PPE? I don't think anything much has to be run on the PPE save kickstarting, but obviously there are some things you're going to want to do on the PPE as it's better at them.
 
It's simply a bad choice of wording. By 'code that can't be synchronized across multiple cores' I mean code that isn't independent; it can't just be removed from the main core and stuck on a 2nd core. It is interdependent with other tasks, and this dependency must be synchronized/solved.

I didn't mean that for a task to be moved to an SPE it has to run on 2 or more SPEs, that's just ridiculous. It has to be able to be safely removed from the MAIN core, without causing too much work for the developer given their timelines/skill, or it won't be feasible to do it.
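The '15 tasks of a main loop' example above is easy to sketch. In the C++ below (task names invented, std::thread standing in for SPE threads), most of the frame's tasks are farmed out and only one is kept on the main core:

    #include <functional>
    #include <thread>
    #include <vector>

    // Invented per-frame tasks; in practice each would take its own inputs.
    void animate()   {}
    void cullScene() {}
    void mixAudio()  {}
    void runScript() {}   // suppose this one really does prefer the main core

    int main() {
        // Three of the four tasks are farmed out (think: SPE threads)...
        std::vector<std::function<void()>> offloaded = {animate, cullScene, mixAudio};

        std::vector<std::thread> workers;
        for (auto& task : offloaded)
            workers.emplace_back(task);   // kick each one off on another core

        runScript();                      // ...and one task stays on the main core

        for (auto& w : workers)
            w.join();                     // the frame ends when everything is done
        return 0;
    }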
 
scooby_dooby said:
Any task that cannot be effectively multi-threaded, or that is interdependent on other tasks/data, may have to remain on the main PPE, even if it's perfectly suited to a stream processor like the SPE.

See, you say you're not talking about "tasks" but then you say this! Sorry, but this statement, as I comprehend it and the meaning of "tasks" (or "code" as you were referring to earlier), is simply false.

If I have a task, for example simulating some cloth or whatever, whether I can parallelise that or not has no bearing on whether it can run on an SPE or not.

If you said "any game that can not be effectively multithreaded" then I might agree. But if you can't break up your game, you're up crap creek on any multi-core architecture... but you can break up games.
 
It's just because I hate to use the word 'code' so much; for lack of a better word I use 'tasks'. Substitute functions, subroutines, objects, systems, game engine... whatever...

If you have a task that can run on an SPE, but is dependent on other data/tasks/functions/whatever on other cores, you first have to solve that dependency issue before you can move it to the SPE. So it has to be parallelisable in the sense that it can be moved from the main core to a secondary core.
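One common way of 'solving' that kind of dependency (all names below are invented, and std::thread again stands in for an SPE) is to copy the data the task needs out of the shared engine state, let the task work on its private copy, and merge the results back at a defined point in the frame. On Cell the copy-in/copy-out steps would be DMA to and from the SPE's local store.

    #include <thread>
    #include <vector>

    struct Engine {
        std::vector<float> cloth;      // state the offloaded task depends on
        std::vector<float> clothOut;   // results merged back later
    };

    // Works only on its private copy: no locks, no peeking at live engine state.
    std::vector<float> simulateCloth(std::vector<float> snapshot) {
        for (float& v : snapshot) v *= 0.99f;   // placeholder 'simulation'
        return snapshot;
    }

    int main() {
        Engine e;
        e.cloth.assign(256, 1.0f);

        std::vector<float> input = e.cloth;     // 1. copy the inputs out ('DMA in')
        std::vector<float> result;
        std::thread spe([&] { result = simulateCloth(input); });

        // ... the main core keeps working on other things here ...

        spe.join();                             // 2. wait at a defined sync point
        e.clothOut = result;                    // 3. merge the results back ('DMA out')
        return 0;
    }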
 
scooby_dooby said:
It's just because I hate to use the word 'code' so much; for lack of a better word I use 'tasks'. Substitute functions, subroutines, objects, systems, game engine... whatever...

Statements take on very different meanings and truth status depending on which of these words you use :) They all have different meanings; you need to be specific as to what exactly you're addressing.
 
Well, you seem to have your own little internal definition for the word 'task', which apparently involves it being completely independent from all other parts of the game engine. I just consider it to be a broad term for anything the CPU has to accomplish.
 
scooby_dooby said:
Well, you seem to have your own little internal definition for the word 'task', which apparently involves it being completely independent from all other tasks. I just consider it to be a broad term for anything the CPU has to accomplish.

If you want to use a broad definition, then use "game". "Task" could be anything from a whole game to a tiny part of that game - making statements like those above incorrect.

But even if you want to address the whole of a game, whether or not you could run that single monolithic game on an SPE would have nothing to do with whether you could parallelise it (or not, in this case). It may sound stupid and pointless, and it is, but technically you could fire off an SPU to process it instead of doing it on the PPE, as long as the code ran reasonably on the SPE (which again would have nothing to do with whether it was parallelisable or not, since we're talking about a single process here). So regardless of whether you're referring to a task as the whole game, or as some small element of it, that still would not be a correct statement to make.

The point you're making is irrelevant as far as pointing out specific characteristics of Cell anyway. It's something any multi-core architecture would be faced with if you could not break up the code at all.
 
You have to use scooby_logic on 'em before he gets it! ;) IBM would not have put 7/8 SPEs on a chip if a "task" was not possible on them to some extent. :D
 
Shifty Geezer said:
Now, if you want to change your phrasing and say 'SPE can't run general purpose code quickly enough to match a conventional CPU' just go ahead and say so. That's a very different thing to 'is only a Single Precision Floating Point unit' as you first argued. It would also show a degree of intelligence if, when corrected on points like 'SPE's being unable to access memory without going through PPE', you acknowledge the correction.

Well, neither can the XeCPU or the PPE without modifying and optimizing the code, if I'm not mistaken.

Speaking of non-linear code and integers, SPEs have 4 integer/fixed units, but just how many integer/fixed units does the PPE have? 2 like the XeCPU, or more?

And Pakpassion/Griffith/Xbot/RedBlackDevil, just how many more accounts do you have on this forum?
 
Shifty Geezer said:
Cool ;). It's not Manic Miner though :p

How would you class the IO though? Is the GPU still needing CPU intervention to provide it with keyboard inputs? If so, the SPE differs in having direct access to functionality without needing other processors to format and feed it data, as it were. If not, SM2.0 is more capable than I thought!
I'd hazard a guess that Cell's OS will be written such that the PPE will handle inputs and the SPEs won't be able to do it. Like with GPUs, many of the limitations are with software and a desire to farm out tasks to the hardware that's best at executing the task. So of course the GPU won't handle IO like keyboard inputs.

nelg, I haven't programmed for Cell, but IMO we have a PPE because SPEs, with their Local Store, are geared towards running certain types of compute-intensive algorithms very fast, but some things will just run better on a more traditional core like the PPE. It's also easier on the programmer, as the PPE manages the cache instead of the programmer managing the LS.
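For anyone curious what 'the programmer managing the LS' looks like in practice, here is a minimal SPU-side sketch. It assumes the standard spu_mfcio.h DMA intrinsics from the Cell SDK; the buffer size is arbitrary, and treating argp as the effective address of the data is an assumption made just for this example.

    #include <spu_mfcio.h>

    #define CHUNK 4096   // bytes per DMA transfer (arbitrary for this sketch)

    static volatile char ls_buffer[CHUNK] __attribute__((aligned(128)));

    // argp is whatever pointer the PPE passed when starting this SPE program;
    // here it is assumed to be the effective address of the data to fetch.
    int main(unsigned long long speid, unsigned long long argp,
             unsigned long long envp)
    {
        (void)speid; (void)envp;
        unsigned int tag = 1;

        // Pull a chunk of main memory into the local store. The SPU does this
        // explicitly, where the PPE would simply rely on its cache.
        mfc_get(ls_buffer, argp, CHUNK, tag, 0, 0);
        mfc_write_tag_mask(1 << tag);    // block until that tag's transfers
        mfc_read_tag_status_all();       // have completed

        // ... work on ls_buffer entirely out of the 256KB local store ...

        // Push the results back out to main memory the same way.
        mfc_put(ls_buffer, argp, CHUNK, tag, 0, 0);
        mfc_write_tag_mask(1 << tag);
        mfc_read_tag_status_all();

        return 0;
    }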
 
one said:
I assume there are many PS2 developers who are used to developing games and engines on a CPU with 3 asymmetric processing units.
PS2 developers will likely have a leg up, but vertex processing and clipping can now be done on the graphics chip so that takes away one of the tasks, leaving a lot of parallelization effort in order to get the most out of Cell.
 
3dcgi said:
PS2 developers will likely have a leg up, but vertex processing and clipping can now be done on the graphics chip so that takes away one of the tasks, leaving a lot of parallelization effort in order to get the most out of Cell.
Vertex processing and clipping were among the most trivial uses of VUs though. And for the record, having to keep the stupid clip subroutine in VU1 memory at all times was quite cumbersome and actually limited things that could have been done if that was handled more gracefully - and I don't mean we needed a clip circuit on the GS; it was a much more natural fit for the GIF arbitration part of the EE.
While the SCE guys would tell me otherwise, I believe that replacing a certain other fixed circuit on the EE (which no one ever used in games anyway, save for those few of us that are clinically insane) with a nice clipper (6 planes please, not another PSP) would have made the machine better overall.
 
I had a feeling you would comment when I mentioned clipping with the VUs. :D

Any idea what percentage of your VU time was spent vertex processing and clipping?
 
VU1 is doing various kinds of vertex processing 99.9% of the time - basically the only other thing it does is part of the render state setup (texture state, blend modes etc.) for the vertex blocks being rendered.
Clipping is the "amusing" one - we typically spend less than 1% of the time clipping vertices.
Meanwhile, the clip subroutine code takes up nearly 40% of VU1 instruction memory (and needs a reserved chunk of data memory to operate safely with other programs as well, though that is not quite as significant).

I don't have a good number for VU0 time because there's no way to measure macro-mode utilization in hardware, and what we run in microcode has huge variations. The major contributors to the latter are collision detection and some other physics-related things, which can take anywhere from zero to the majority of CPU time depending on the situation.
 
3dcgi said:
PS2 developers will likely have a leg up, but vertex processing and clipping can now be done on the graphics chip so that takes away one of the tasks, leaving a lot of parallelization effort in order to get the most out of Cell.
Well, then, AGEIA and Havok guys will come to the rescue for SPU utilization! :smile:
 