PlayStation III Architecture

Hello again Phil,
The reason I ask is that we had a Phil at PVC for a while.
I thought you might be him. :-?
I am pleasantly surprised you took the time to learn of my Holy Quest.
Have you a shrubbery? NEpe!! Haha
(Monty Python and the Holy Grail.)

Zidane1strife,
I know the IBM paper your signature comes from. It's a pleasure to meet you. I went on for a good bit about how The Microprocessor Report interpreted it wrong, or at least very poorly, in a CNET.com news report about 'the' Cell.
 
"ctoclok ctoclok" (coconut shells ?)

that's not me. i find pcvsconsol a bit too childish to register there :)

some great lectures in these posts of yours, thanks :)
 
Paraphrased by DS.

A: What did you say?
B: I have ridden quite far on my faithful steed. I ask you to request your lord's hospitality so that I might rest.
A: Ridden? on what?
B: My Faithful steed.
A: All you have is a pair of coconuts.

------
OK, enough with me and my crusades. :p

Again it's nice meeting you and I look forward to more when there's something to share.
 
To David:
Ne! Ne! You must bring us a shrubbery! lol, I love Monty Python :LOL:
Good to see you over here too!

About Sony showing their cards too early and Xbox 2 beating PS3:
Microsoft has already stated several times that the Xbox 2 (or whatever name it's going to have) will launch earlier than PS3. This time around maybe Sony will have the superior hardware and Microsoft will have the advantage of a "kick-start".
 
Transistor counts allowed Sony more leeway last time ... what is different this time is that they have a semiconductor process to work with which is the equal of Intel's and AMD's, and they have some help from IBM to ensure their architecture is a little less ad-hoc ... or at the very least to ensure that this time they will have an understandable English manual.
 
this patent defines the PE but it also goes much farther than that... talking about how you inter-operate different machines based on this technology and how a standard ISA and other tricks help you in doing so...


[0068] FIG. 4 illustrates the structure of an APU. APU 402 includes local memory 406, registers 410, four floating point units 412 and four integer units 414. Again, however, depending upon the processing power required, a greater or lesser number of floating points units 512 and integer units 414 can be employed. In a preferred embodiment, local memory 406 contains 128 kilobytes of storage, and the capacity of registers 410 is 128 x 128 bits. Floating point units 412 preferably operate at a speed of 32 billion floating point operations per second (32 GFLOPS), and integer units 414 preferably operate at a speed of 32 billion operations per second (32 GOPS).


I think that this is the nicest part so far of the whole thing IMHO... it pretty much sanctions that the number of functional units in each APU ( Integer Units ), the clock speed of the beast, the number of APUs, the number of PEs... it all doesn't matter...

what does matter is that the ISA is constant, the same in all APUs and that the PEs interact with the APUs using standardized packets that can be processed by ANY APUs on any CELL technology based device
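To illustrate what I mean, here is a toy sketch. Everything in it is hypothetical: the `Packet` layout, the two-opcode "ISA", and the `ToyAPU` cycle model are my own inventions and not taken from the patent. It only shows the idea that the same packet gives the same result on APUs with different numbers of FP units, and only the time taken changes.

```python
from dataclasses import dataclass

# Hypothetical two-instruction "ISA", shared by every toy APU.
OPS = {
    "add": lambda x, k: x + k,
    "mul": lambda x, k: x * k,
}


@dataclass(frozen=True)
class Packet:
    program: tuple  # sequence of (opcode, operand) pairs
    data: tuple     # values the program is applied to


class ToyAPU:
    def __init__(self, fp_units):
        self.fp_units = fp_units

    def run(self, packet):
        """Execute the packet; return (results, steps taken)."""
        values = list(packet.data)
        steps = 0
        for opcode, operand in packet.program:
            values = [OPS[opcode](v, operand) for v in values]
            # Each step handles up to fp_units values (ceiling division).
            steps += -(-len(values) // self.fp_units)
        return values, steps


packet = Packet(program=(("mul", 2), ("add", 1)), data=tuple(range(1, 9)))
small_result, small_steps = ToyAPU(fp_units=2).run(packet)
big_result, big_steps = ToyAPU(fp_units=4).run(packet)
```

Both APUs return the same values; the one with fewer units just needs more steps. That is all the "constant ISA" claim has to guarantee.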

btw, here is the most recent patent ( filed the same day as the other one but with minor differences it seems )...

http://makeashorterlink.com/?K5C122D03




as far as maybe a picture of EE3 look at this... http://makeashorterlink.com/?W2D162D03

Another interesting claim is the following

[0020] In another aspect, the present invention provides an absolute timer for the processing of tasks. This absolute timer is independent of the frequency of the clocks employed by the APUs for the processing of applications and data. Applications are written based upon the time period for tasks defined by the absolute timer. If the frequency of the clocks employed by the APUs increases because of, e.g., enhancements to the APUs, the time period for a given task as defined by the absolute timer remains the same. This scheme enables the implementation of enhanced processing times by newer versions of the APUs without disabling these newer APUs from processing older applications written for the slower processing times of older APUs.

[0021] The present invention also provides an alternative scheme to permit newer APUs having faster processing speeds to process older applications written for the slower processing speeds of older APUs. In this alternative scheme, the particular instructions or microcode employed by the APUs in processing these older applications are analyzed during processing for problems in the coordination of the APUs' parallel processing created by the enhanced speeds. "No operation" ("NOOP") instructions are inserted into the instructions executed by some of these APUs to maintain the sequential completion of processing by the APUs expected by the program. By inserting these NOOPs into these instructions, the correct timing for the APUs' execution of all instructions are maintained.


again this highlights the fact that speed and execution units can vary, but the applications (most of them, naturally not all CELL devices should have the Visualizer... I wonder if the ISA of the Visualizer is the same in all the CELL based devices ) should still run... a program written following closely the generic CELL specs should run fine on a CELL PDA ( how to fit it ? fewer execution units, fewer APUs, fewer PEs, slower clock frequency than say PS3's Broadband Engine ), on a CELL equipped TV or on a CELL based microwave oven... :)
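To make the timing part concrete, here is a toy sketch of the NOOP scheme from [0021] under my own simplifying assumptions: a task has a fixed time budget set by the absolute timer, and a faster APU pads with NOOPs up to that budget. The function name, the nanosecond budget and the cycle counts are made up for illustration; the patent does not give this formula.

```python
def noops_to_insert(work_cycles, budget_ns, clock_ghz):
    """Return the NOOP cycles needed so the task ends exactly at its budget.

    budget_ns * clock_ghz is the number of clock cycles that fit in the
    budget (1 GHz = 1 cycle per nanosecond). Integers keep the math exact.
    """
    budget_cycles = budget_ns * clock_ghz
    if work_cycles > budget_cycles:
        raise ValueError("APU is too slow to meet the time budget")
    return budget_cycles - work_cycles


# Hypothetical task: 1000 cycles of work, budgeted 1000 ns by the absolute timer.
old_apu = noops_to_insert(1000, budget_ns=1000, clock_ghz=1)  # fills the budget
new_apu = noops_to_insert(1000, budget_ns=1000, clock_ghz=4)  # finishes early, pads
```

In this model the task finishes at the same absolute time on both APUs, so anything that depends on the order of completion across APUs still sees what the original program expected.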

Here is a picture that should help to understand this idea...

http://makeashorterlink.com/?M2E131D03






[Attached image: ps3.jpg]





Btw, the description of the local memory of the APU makes it seem like it is not a regular cache but more like a Scratch-pad SRAM like the SPRAM in the EE's RISC core or the Micro-memories in the VUs...

it could be a simple L1 cache after all (they do talk about coherency protocols), but it might be used like the e-DRAM is, as a local buffer and not a cache...

Advantage? you could read as well as directly WRITE into it... local RAM gives you more flexibility than a cache... you can do caching in software with a local RAM pool ( done on the VUs, on the GS, etc... ), but you can also use it while the e-DRAM is being used by another APU to do some work locally...
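For anyone who has not seen caching done in software, here is a minimal sketch of the idea: a direct-mapped cache kept in a small local RAM pool in front of a bigger, slower memory. The class name, the line sizes and the list standing in for main memory are my assumptions for illustration, not anything the patent describes.

```python
class ScratchpadCache:
    """Direct-mapped software cache stored in a small local RAM pool."""

    def __init__(self, main_memory, num_lines, line_size):
        self.main = main_memory
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # which block each slot holds
        self.lines = [[0] * line_size for _ in range(num_lines)]
        self.misses = 0

    def read(self, addr):
        block = addr // self.line_size
        slot = block % self.num_lines
        if self.tags[slot] != block:
            # Miss: copy the whole block from main memory into local RAM.
            start = block * self.line_size
            self.lines[slot] = list(self.main[start:start + self.line_size])
            self.tags[slot] = block
            self.misses += 1
        return self.lines[slot][addr % self.line_size]


main_memory = list(range(64))  # stand-in for the shared e-DRAM
cache = ScratchpadCache(main_memory, num_lines=4, line_size=4)
first = cache.read(5)   # miss, loads block 1
second = cache.read(6)  # hit, same block
third = cache.read(21)  # miss, block 5 maps to the same slot and evicts block 1
```

The point is that this policy is just code: the same local RAM could instead be used as a plain buffer you write into directly, which a hardware cache would not let you do.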
 
Obviously Sony will have a large advantage in transistor count and architectural uniqueness with PS3. How much so is unknown, but it's fairly clear that PS3 will be over 1 billion transistors when including all of the eDRAM it's sure to have. Can we say the same about XB2? Unknown at this point. We'd have to look at the roadmaps of Intel/AMD plus Nvidia (the likely GPU provider).

The only way that I see XB2 beating PS3 architecturally is by using parallel CPUs & GPUs, or at least GPUs. The only advantage XB2 could have otherwise is a newer GPU/feature set. But I am mainly talking about transistor count/architecture: Sony will have the advantage this time, compared to a single PC-based CPU/GPU.
 
btw... 4 GHz * 4 FPUs per APU * 2 ops per FMAC * 8 APUs * 4 PEs = 1024 GFLOPS = ~1 TFLOPS

if 4 GHz is too high you can increase the number of FP units in the APUs or if it is too low you can decrease the FP Units in the APU and increase the clockspeed more ( to save transistors )...
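A quick sanity check of that arithmetic, as a sketch. The function and parameter names are mine, not from the patent; the figures are the ones quoted in this thread (4 FPUs per APU, 8 APUs per PE, 4 PEs) plus the assumed 4 GHz clock, with each FMAC counted as 2 floating point operations.

```python
def peak_gflops(clock_ghz, fpus_per_apu, ops_per_fmac, apus_per_pe, pes):
    """Theoretical peak in GFLOPS: every FPU issues one FMAC per clock."""
    return clock_ghz * fpus_per_apu * ops_per_fmac * apus_per_pe * pes


# One APU on its own: matches the 32 GFLOPS figure in [0068].
per_apu = peak_gflops(4, 4, 2, apus_per_pe=1, pes=1)

# Whole chip as discussed here: 8 APUs per PE, 4 PEs.
whole_chip = peak_gflops(4, 4, 2, apus_per_pe=8, pes=4)

# Trade-off: half the clock, twice the FP units per APU, same peak.
slower_wider = peak_gflops(2, 8, 2, apus_per_pe=8, pes=4)
```

This is peak throughput only; it says nothing about how much of it real code could sustain.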

Also you called the APUs Vector Units... I have to parse the patent better, but IMHO they can work in parallel but also separately... as there is no mention of the APUs basically containing a 128-bit Integer VU and a 128-bit FP VU, except for the registers, which are 128 bits... I'd rather have them being able to also work independently as that would increase efficiency, but even this way it would not be too bad...
 
Can it be said that each APU, 32 in all, is PS3's equivalent of PS2's Vector Units? dumb question I know.

Also, each of the 4 PUs is a separate PowerPC CPU core?


making a fun observation,
It seems like the PUs are each the commander of several divisions in PS3's army of processors. Each of the 8 APUs in every PE is a division, and the 4 FPUs in each APU are a battalion. The PS3 currently has 32 divisions in its army :)
 
each of the PUs should be, yes, a separate core... there is packet routing logic that seems to decode in part the operations that have to be performed and route the packet to the right PE...
 
Panajev2001a said:
I think that this is the nicest part so far of the whole thing IMHO... it pretty much sanctions that the number of functional units in each APU ( Integer Units ), the clock speed of the beast, the number of APUs, the number of PEs... it all doesn't matter...

what does matter is that the ISA is constant, the same in all APUs and that the PEs interact with the APUs using standardized packets that can be processed by ANY APUs on any CELL technology based device

Yes, PVMs are a nice concept, aren't they.
 
marconelly! said:
Quantum Redshift and MotoGP have nice rain.
Please... Rain effect in those two games is so much simpler compared to one in MGS2.
In what way?

The problem with the MGS2 rain scene is that he implemented it in a way that worked great on the PS2, but wouldn't work great on the Xbox. In an interview he talked about having trouble porting to the Xbox because it's so different and his engine was designed heavily around PS2's nuances.

He knew the sales wouldn't be too high on it, so he didn't spend an excessive amount of time porting it; he just implemented it the way the PS2 did it and cut back until it ran at a "decent" framerate.

That's why it's shoddy.

And the 500 plus missions are a joke, does anyone seriously play VR-missions? No one I know does...
 
I thought the Xbox could do "anything"? It's so easy to program for, ports should be a nonissue (regardless of the "inferior" parent platform), no?
 
MfA said:
Panajev2001a said:
I think that this is the nicest part so far of the whole thing IMHO... it pretty much sanctions that the number of functional units in each APU ( Integer Units ), the clock speed of the beast, the number of APUs, the number of PEs... it all doesn't matter...

what does matter is that the ISA is constant, the same in all APUs and that the PEs interact with the APUs using standardized packets that can be processed by ANY APUs on any CELL technology based device

Yes, PVMs are a nice concept, aren't they.

Of course the real work is going to be in the tool chain ;)

Turning a sequential piece of C code into packets that can be effectively executed on these APUs is going to be a challenge for the compiler writers.
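As a rough picture of the easy case, here is a sketch of cutting a simple loop into chunks small enough to fit in an APU's local memory. The 128 KB figure is from the patent paragraph quoted earlier; the half-for-data budget, the 4-byte item size and all function names are my own assumptions. Real code with dependencies between iterations is where it gets hard, and this sketch does not attempt that.

```python
LOCAL_STORE_BYTES = 128 * 1024  # local memory size quoted from the patent


def make_packets(data, item_bytes=4, budget_bytes=LOCAL_STORE_BYTES // 2):
    """Split data into chunks that fit the assumed per-packet data budget."""
    per_packet = budget_bytes // item_bytes
    return [data[i:i + per_packet] for i in range(0, len(data), per_packet)]


def kernel(chunk):
    """Work done on one packet: a partial sum of squares."""
    return sum(x * x for x in chunk)


def run_packetized(data):
    """Each packet could go to a different APU; combine the partial results."""
    return sum(kernel(packet) for packet in make_packets(data))


data = list(range(100000))
sequential = sum(x * x for x in data)  # the original "sequential" loop
packetized = run_packetized(data)
```

The two results match because the iterations are independent, which is exactly the property a compiler would have to prove before doing this automatically.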

Cheers
Gubbi
 
In what way?
Just watch both. MGS2's rain is really in its own class in visual complexity and physics. In those two racing games the most prominent part of the rain effect is not the actual rain particles, but the water drops on the windshield. The actual rain in them looks quite simple.
 
randycat99 said:
I thought the Xbox could do "anything"? It's so easy to program for, ports should be a nonissue (regardless of the "inferior" parent platform), no?
Whoever said the Xbox could do "anything" obviously didn't know what they were talking about. :)

It is easy to program for, in comparison to platforms like PS2, but it's still not a painlessly easy task.

MGS2 was a shoddy port. The engine was as direct a port as they could make it. Even though it was designed around lots of PS2-specific features like massive fillrate, that wasn't changed for the most part -- just converted to run on the Xbox hardware. The Xbox doesn't have the same massive fillrate as the PS2, so lo and behold, choppiness in the rain scene...
 
As for the 500 Levels coming to the PS2, I don't think they will easily match the Xbox.
Those levels were built from the start to show off the Xbox.
One thing I will say though.
This will make for one heck of a good comparison.

What do you mean "coming"? :?: I've already got it for the PS2...

btw, here is the most recent patent ( filed the same day as the other one but with minor differences it seems )...

What's up with their site? None of the image files are showing... Are they for you?
 