Cell's PPE and Xenon compared.

Jawed said:
Programmers don't solve trivial problems - they just copy the code from somewhere else.

Jawed

I classify based on the problem's algorithmic complexity, or how much other code in the codebase is potentially touched by the solution.

So something could be conceptually simple, and still be classed as "hard" if there is a lot of application code that depends on the changed implementation...... But again that's just me.
 
This thread is being reopened because it does have good information in it. The posts that pretty much derailed it are now deleted. If it gets out of hand again, the person or persons who derail it are going to be banned for one week.
 
derailed information......i posted IBM's article.....which said cell has 128*128 bit registers...the interviews of spyrtek and climax that proved the supremacy of ps3 over x360...damn u undercover MS agents...nothing can save x360

again x360 has 32 bit registers not 128 bit and ps3 has 128*128 bit registers...here is the proof from IBM

http://researchweb.watson.ibm.com/p...e reports/All_About_Cell_Cool_Chips_Final.pdf

all these are garbage posts by xbox fans....
 
nasim said:
derailed information......i posted IBM's article.....which said cell has 128*128 bit registers...the interviews of spyrtek and climax that proved the supremacy of ps3 over x360...damn u undercover MS agents...nothing can save x360

again x360 has 32 bit registers not 128 bit and ps3 has 128*128 bit registers...here is the proof from IBM

http://researchweb.watson.ibm.com/people/a/ashwini/E3%202005%20Cell%20Blade%20reports/All_About_Cell_Cool_Chips_Final.pdf

all these are garbage posts by xbox fans....

WTF?

Each of the 7 SPUs has 128 128-bit registers; each PPU thread (it has 2) has 32 GPR (64 bit), 32 FPR (64 bit), 32 VMX (128 bit).

Each of the 2 threads on each of the 3 XeCPU cores has 32 GPR (64 bit), 32 FPR (64 bit), 128 VMX (128 bit).

Neither has any 32 bit registers in the main register file.
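
To put those counts side by side, here is a rough back-of-the-envelope sketch that just multiplies out the register counts and widths quoted above. It assumes only the architected registers listed in the post matter (no special-purpose or rename registers), and the names `REGISTER_FILES` and `file_bytes` are made up for illustration.

```python
# Register counts and widths exactly as quoted in the post above.
# Assumption: only the listed architected registers are counted;
# special-purpose and rename registers are ignored.
REGISTER_FILES = {
    "SPU": [(128, 128)],                               # 128 x 128-bit
    "PPU thread": [(32, 64), (32, 64), (32, 128)],     # GPR, FPR, VMX
    "Xenon thread": [(32, 64), (32, 64), (128, 128)],  # GPR, FPR, VMX
}


def file_bytes(files):
    """Total size in bytes of a list of (register_count, bits_per_register)."""
    return sum(count * bits // 8 for count, bits in files)


sizes = {name: file_bytes(files) for name, files in REGISTER_FILES.items()}

for name, size in sizes.items():
    print(f"{name}: {size} bytes of architected registers")
```

By this count an SPU and a Xenon thread each carry 2048 bytes of 128-bit vector registers, against 512 bytes for a PPU thread.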
 
DeanoC said:
WTF?

Each of the 7 SPUs has 128 128-bit registers; each PPU thread (it has 2) has 32 GPR (64 bit), 32 FPR (64 bit), 32 VMX (128 bit).

Each of the 2 threads on each of the 3 XeCPU cores has 32 GPR (64 bit), 32 FPR (64 bit), 128 VMX (128 bit).

Neither has any 32 bit registers in the main register file.

Thank you for explaining that to him. :)

Apparently, nasim doesn't "get it" and posts a lot of flamebait here at the Console forum.
 
What is the distinction between general and special purpose cpu?
Holmdahl (from MS) said: "We believe we have selected the right balance between floating point and integer instruction performance [for X2]. About 80% of game code is integer, general purpose, code. So it makes sense to set the CPU balance towards [that]"
Prattnaik (maker of Pow5) said: "...bulk of the improvement [of Pow5] had to do with the interaction between the floating-point and the fixed-point..." (I guess he is talking about fixed-point integer).
So, why is there a difference in performance between these two types of numbers?
 
I don't think there's any official definition. I think the key concept is memory access patterns and branching. All code is branches, memory access and calculations, either on integers or floats (or integers representing other datasets). A stream processor has less branching and more structured memory access. A general purpose program will be doing a variety of different memory access patterns based on conditional branches and performing a variety of calculations. That's how I understand it.

Regarding Holmdahl's statement, AFAIK that 80% figure was never explained. 80% of execution time? 80% of instructions? I certainly notice that whatever Holmdahl and others might have said, MS sacrificed GP advantages in their processor to make room for powerful streaming floating point units. MS, along with STI and everyone else working on CPUs, feels the need for advances in vector float execution more than anything else.
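
As a loose illustration of that distinction (a sketch of the idea only, not anything from Cell or Xenon tooling), compare a kernel that does the same arithmetic over sequential data with a walk whose next memory access and branch both depend on the data just loaded. The function names and the node layout are hypothetical.

```python
def stream_kernel(xs, scale, bias):
    """Stream-style work: the same arithmetic on every element,
    sequential access, no data-dependent branches."""
    return [x * scale + bias for x in xs]


def general_purpose_walk(nodes, start):
    """General-purpose-style work: the next index is only known after
    loading the current node, and each node triggers a conditional branch."""
    total = 0
    index = start
    while index is not None:
        value, next_index = nodes[index]  # access pattern depends on the data
        if value % 2 == 0:                # data-dependent branch
            total += value
        else:
            total -= 1
        index = next_index
    return total


# Hypothetical linked structure: index -> (value, next index or None)
nodes = {0: (4, 2), 2: (3, 1), 1: (10, None)}

print(stream_kernel([1, 2, 3], 2, 1))
print(general_purpose_walk(nodes, 0))
```

The first function is the kind of loop that maps well onto wide vector units; the second depends on caches and branch prediction, which is where a general purpose core earns its keep.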
 
Yeah, it's interesting that "Sony" (STI) chose not to go for a DP instruction in the SPEs.

I would guess that was to keep the FP pipeline to the same length as the integer pipeline.

Although Xenon's DP pipeline is apparently only 2 stages longer than the vector pipeline.

Jawed
 
I think the main reason would be that just adding a DP wouldn't do a whole lot to help; DP or not, it would still be inherently an SoA SIMD.
You need vector component access and broadcast ops before we can start talking about useful horizontal processing, and making permutes and some other ops part of FPU execution pipeline would help as well.

If they included a DP and nothing else I'd only feel like someone is dangling a carrot on the stick but not really letting me get to it. :p
With current design, once I got over the initial periods of anger and denial I could move on to acceptance, if there was DP I'd still be in denial stage right now.
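
For anyone not used to the SoA/AoS terminology, here is a minimal sketch of the difference, reading DP as a dot product instruction and using plain Python lists to stand in for 4-wide vector registers. The helpers `vmul`, `vadd`, `dot_aos`, `dot_soa` and `aos_to_soa` are hypothetical names, not real intrinsics for either chip.

```python
def vmul(a, b):
    """Vertical (lane-by-lane) multiply of two 4-wide 'registers'."""
    return [x * y for x, y in zip(a, b)]


def vadd(a, b):
    """Vertical (lane-by-lane) add of two 4-wide 'registers'."""
    return [x + y for x, y in zip(a, b)]


def dot_aos(a, b):
    """AoS: one xyzw vector per register. A single dot product needs a
    horizontal sum across lanes, which is the step a DP instruction covers."""
    return sum(vmul(a, b))


def dot_soa(ax, ay, az, aw, bx, by, bz, bw):
    """SoA: one register holds the x (or y, z, w) of four different vectors.
    Four dot products come out of vertical multiplies and adds only."""
    acc = vmul(ax, bx)
    acc = vadd(acc, vmul(ay, by))
    acc = vadd(acc, vmul(az, bz))
    acc = vadd(acc, vmul(aw, bw))
    return acc


def aos_to_soa(vectors):
    """Transpose four xyzw vectors into x, y, z, w registers."""
    return [list(lane) for lane in zip(*vectors)]


a_vecs = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
b_vecs = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [1, 1, 1, 1]]

aos_results = [dot_aos(a, b) for a, b in zip(a_vecs, b_vecs)]
soa_results = dot_soa(*aos_to_soa(a_vecs), *aos_to_soa(b_vecs))
print(aos_results, soa_results)
```

Both paths give the same four results; the difference is that the SoA path never needs to combine lanes within a register, while the AoS path needs either a horizontal op or the transpose in `aos_to_soa`.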
 
ArsTechnica quotes the patent which talks about not needing to convert between SoA and AoS on Xenon, and I've been assuming that's part of the reason why the FP pipeline is seemingly so long.

http://arstechnica.com/articles/paedia/cpu/xbox360-2.ars/4

The patent excerpt in red text is the meat of it.

Naturally, the problem I have is that the last time I wrote any vector maths code was on a BBC B, way back when ... so I'm stumbling around in the dark here.

Jawed
 