Xenon= Modified G5 Triple Core Theory

Urian

The G5 can manage 5 instructions per cycle and has the following units:

-2 Branch Units
-2 Load/Store Units
-2 Integer Units
-2 FPU
-2 VMX

Xenon can manage 2 instructions per cycle and has the following units:

-1 Branch Unit
-1 Load/Store Unit
-1 Integer Unit
-1 FPU
-1 VMX-128

The FPU + VMX-128 part has been revamped, but I don't see how it is possible that a new CPU with half the units could have more transistors than three copies of a more complex CPU with the same shared L2 cache.

G5 logic without L2 cache = 34 million transistors.
G5 triple-core logic = 102 million transistors.
G5 triple core with shared L2 cache = 151 million transistors.
What the Xenon documents say = 165 million transistors.
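
As a rough check of the arithmetic behind these figures, here is a minimal Python sketch. The function name `transistor_budget` and the constant names are made up for illustration, and the numbers are simply the ones quoted in this post, not independently verified values.

```python
# Figures quoted in the post, in millions of transistors (not verified here).
G5_CORE_LOGIC = 34         # one G5 core, logic only, no L2 cache
TRIPLE_CORE_TOTAL = 151    # the post's figure for three G5 cores plus shared cache
XENON_DOCUMENTED = 165     # what the Xenon documents are said to state


def transistor_budget(core_logic, cores, total_with_cache, documented):
    """Return (logic, implied_cache, gap), all in millions of transistors."""
    logic = core_logic * cores                # logic for all cores together
    implied_cache = total_with_cache - logic  # what the post attributes to cache
    gap = documented - total_with_cache       # documented count minus the estimate
    return logic, implied_cache, gap


logic, implied_cache, gap = transistor_budget(
    G5_CORE_LOGIC, 3, TRIPLE_CORE_TOTAL, XENON_DOCUMENTED
)
print(logic)          # 102
print(implied_cache)  # 49
print(gap)            # 14
```

So, taking the post's own numbers at face value, the documented Xenon count is about 14 million transistors above the hypothetical triple G5 with shared cache.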

Obviously a Xenon core is less powerful and obviously it needs fewer transistors, but the documentation says that it has more transistors than a hypothetical G5 triple core with shared L2 cache.

I am sure that we have 3 G5s inside the Xenon and that, thanks to the disabled OOE, the system can only do 2 instructions per clock, and this is when having 2 Integer Units, 2 FPUs, 2 VMX... becomes useless. And this explains how Microsoft only needed 14 months to finish the project.

PS: The same can be applied to the PPE.
 
You are completely wrong.

The G5 (970) is derived from Power 4.

The XCPU is completely original (well, it shares a lot with PPE)

Cheers
 
Gubbi said:
You are completely wrong.

The G5 (970) is derived from Power 4.

The XCPU is completely original (well, it shares a lot with PPE)

Cheers

Actually it's totally derived, say "garbage" product, leftover from the STI deal...
 
Urian said:
I am sure that we have 3 G5s inside the Xenon and that, thanks to the disabled OOE, the system can only do 2 instructions per clock, and this is when having 2 Integer Units, 2 FPUs, 2 VMX... becomes useless.
It's not possible to 'disable the OOE'. It's a fundamental and integral part of the instruction pipeline. You can as little 'disable the OOE' as you can 'disable' (remove) the chassis of a car and still have a driveable vehicle.

So what you propose is pure nonsense really.
 
Guden Oden said:
It's not possible to 'disable the OOE'. It's a fundamental and integral part of the instruction pipeline. You can as little 'disable the OOE' as you can 'disable' (remove) the chassis of a car and still have a driveable vehicle.

So what you propose is pure nonsense really.


Well, Xenon/PPE has no OOOE; that's a fact, no?
 
Nemo80 said:
Actually it's totally derived, say "garbage" product, leftover from the STI deal...
Actually, both the PPE and Xenon cores were based on a previous unreleased PowerPC design. Consider it a fork.
 
Asher said:
Actually, both the PPE and Xenon cores were based on a previous unreleased PowerPC design. Consider it a fork.

True, but it's like

"unknown" PPC Design -> CELL PPU DD1.0 -> Xenon (1xVMX, 32kb L1 ...)

but the big difference is: CELL DD1.0 -> CELL DD2.0 -> CELL DD3.1 (PS3: 2xVMX, 64kb L1) ...
 
I expect a very significant upgrade to the PPE cores in Xbox2010 and PlayStation4, assuming they both use similar architectures, or else full-blown PowerPC/Power core variants rather than these stripped-down cores called PPEs.

Perhaps 9-12 cores for the Xbox2010 CPU, and maybe 4 cores for PS4 plus dozens of next-gen SPEs.


Think I'm totally wrong?
 
Nemo80 said:
True, but it's like

"unknown" PPC Design -> CELL PPU DD1.0 -> Xenon (1xVMX, 32kb L1 ...)

but the big difference is: CELL DD1.0 -> CELL DD2.0 -> CELL DD3.1 (PS3: 2xVMX, 64kb L1) ...

No.

IBM Research PPC Design -> Cell PPE
IBM Research PPC Design -> Xenon core

Xenon cores are not based on PPEs; PPEs and Xenon cores are both based on the same core, though. There's a difference.

I also don't understand your "2x VMX" comment. The VMX unit in Cell in PS3 is virtually identical to the one in the PowerPC 970, while the VMX-128 unit in Xenon cores is significantly improved with new instructions and more registers (among other tweaks).

You are also incorrect with the L1 cache sizes. The Xenon cores have 32KB L1 instruction cache, 32KB L1 Data cache -- just like the PPE cores.
 
Mmmkay said:
I think he's trying to refer to this information shown at GDC:
http://www.gaming-age.com/specials/gdc_2006/ps3/19.jpg

It indicates that CELL's PPE has 32KB of L1 Instruction cache but 64KB of L1 Data cache, whereas Xenon has 32KB for each. Could just be a typo though ;)
My guess is that's a typo, unless there's been a new revision of the chip since September that ups the cache for some reason...

The Cells out in the wild right now all have 32/32 IIRC.

I really don't trust people/sites that confuse Kb with KB...according to that presentation, Cell has 64 kilobits of L1 data cache, or 8KB. ;)
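
For anyone who wants the unit conversion spelled out, here is a minimal Python sketch. The function name `kilobits_to_kilobytes` is just an illustrative choice; the only fact it relies on is 8 bits per byte.

```python
BITS_PER_BYTE = 8


def kilobits_to_kilobytes(kilobits):
    """Convert a size in kilobits (Kb) to kilobytes (KB)."""
    return kilobits / BITS_PER_BYTE


# "64Kb" read literally as kilobits:
print(kilobits_to_kilobytes(64))  # 8.0
```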
 
Asher said:
I really don't trust people/sites that confuse Kb with KB...according to that presentation, Cell has 64 kilobits of L1 data cache, or 8KB. ;)

True, the Kb/KB thing is amusing, though this is a slide Sony produced, not a website.
 
nonamer said:
It got it right on other slides, so it must be a genuine typo.

Were those slides within the same presentation, or a later one? If so, then that would be the most likely explanation indeed.
 
Nemo80 said:
Actually it's totally derived, say "garbage" product, leftover from the STI deal...
How are you not banned yet? Surely there's a better board for you somewhere.

The Xenon (and PPE) cores are designed for SMT, so they duplicate a lot of resources (though not functional units) compared to the single-threaded design of the 970, which means a lot of extra transistors. Xenon also has twice the cache of a 970 and a fairly large test unit, and since it's meant for a consumer device it has to have a lot more redundancy built in to improve yields in spite of flaws. The 970 had relatively poor yields and never got a sniff of 3 GHz, let alone 3.2.

If you knew anything at all about the 970 you'd see that there's very little similarity between it and either the Xenon or PPE, and that's a good thing, as what it excels at (DP FP performance) is not very relevant to the console world. It does have OOOE, which would be nice, but that is more something that would give developers an easier life than something whose absence can't usually be worked around by using the performance profiling tools and seeing where your code is stalling. It'd be nice to think the compilers would eventually be smart enough to take care of most of that for you, but they've always promised this and have yet to really deliver. The same goes for auto-vectorization: a nice idea in theory, but more often than not it does so little for you that you may as well not bother.
 
The ADC documentation that I have says that the G5 can manage 5 instructions per cycle, but that this depends on the compiler.

My theory is simple. I know that you cannot take out the OOOE unit any more than you can take out the chassis of a car, but the mere 12-14 months of development and the transistor count being similar to that of a 970 core made me think that perhaps IBM has put an artificial penalty on the G5 core and has reworked it to create the first generation of PPE.

Sorry for saying this, but the PPE/Xenon seems too large for its technical specs as a processor.
 
I can tell you, it's a fact that Xenon cores are not based on the G5.

As for its size, there are a lot of transistors in Xenon cores used for performance monitoring/debugging and redundancy that would not be there in desktop parts.
 