Does Cell Have Any Other Advantages Over XCPU Other Than FLOPS?

nAo said:
Time has passed, seasons change... have you recently benchmarked the supernoisy thing? :)

What? What is this supernoisy thing you guys are talking about? Which one is that? Can you guys talk in everyday gamer's talk please?
 
mckmas8808 said:
Aaaarrrrrggggghhhhh!!!:mad:

God I can't wait until Sony drops the darn N D freaking As. It will be Playstation land for me and I can't wait.


When is that?

I'd guess basically never. Why would they?

I doubt MS dropped any NDAs.
 
MrWibble said:
The question, I believe, was whether the SPUs are independent of the PPE or not. Yes, they have to get data to and from RAM or other SPUs using DMA; however, DMA controllers are built into every SPU. So in what sense does the PPE have to be involved?

For one, without using the PPE, the SPUs have no capability for actually executing a single instruction. Instructions must be loaded into the SPU's LS via a DMA operation set up by software running on the PPE. The SPUs do have the ability to set up descriptors for the DMA engines, but their abilities to do so are more limited than the PPE's. In addition, the PPE has access to a much larger amount of the machine state space than the SPUs.


I'd consider the ability to DMA data from place to place as fairly "direct".

The SPUs can only DMA from place to place if the memory aliasing of the SPU has been correctly set by the PPE. The PPE must set up the EA aliasing of the SPU LS in order for an SPU to DMA to/from another SPU.

The SPUs are really attached media/stream processors with the emphasis on attached. They are not architected nor micro-architected to function on their own.

Aaron Spink
speaking for myself inc.
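
A toy sketch of the two dependencies described above, for anyone who wants to see them spelled out. This is not Cell code and uses no real SDK calls: `ToySPU`, `ToyPPE`, `load_program` and `map_local_store` are hypothetical names invented for illustration, and the only hardware fact assumed is the 256 KB local store per SPE. It just models "nothing runs until the PPE loads code" and "no SPU-to-SPU DMA until the PPE has aliased the target LS".

```python
LS_SIZE = 256 * 1024  # each SPE has a 256 KB local store


class ToySPU:
    """Hypothetical stand-in for one SPU plus its local store (LS)."""

    def __init__(self, name):
        self.name = name
        self.local_store = bytearray(LS_SIZE)
        self.program_loaded = False  # set once the PPE has loaded code
        self.ls_mapped = False       # set once the PPE has aliased the LS

    def run(self):
        if not self.program_loaded:
            raise RuntimeError(f"{self.name}: no code in LS, nothing to run")
        return f"{self.name} running"

    def dma_put(self, target, data, offset=0):
        # SPU-initiated transfer into another SPU's LS.
        if not target.ls_mapped:
            raise PermissionError(f"{target.name}: LS not mapped by the PPE")
        target.local_store[offset:offset + len(data)] = data


class ToyPPE:
    """Hypothetical stand-in for the setup work done on the PPE."""

    def load_program(self, spu, image):
        spu.local_store[:len(image)] = image
        spu.program_loaded = True

    def map_local_store(self, spu):
        spu.ls_mapped = True


ppe, spu0, spu1 = ToyPPE(), ToySPU("spu0"), ToySPU("spu1")
ppe.load_program(spu0, b"\x00" * 16)  # placeholder bytes, not real SPU code
ppe.map_local_store(spu1)
print(spu0.run())
spu0.dma_put(spu1, b"hello")
print(bytes(spu1.local_store[:5]))
```

Skip either PPE call and the matching SPU operation raises, which is roughly the sense in which the SPUs are "attached" in this argument.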
 
DeanoC said:
maybe once they shared a common ancestor
Oh let's not be so hasty - the worst flaws of their common ancestry are still very much present in both. And for the record I'm not talking about trivial things such as in-order execution.

ERP said:
that I do not believe can be explained by the compiler difference.
From my understanding the explanation is much simpler than that - and not exactly hw based either.
Though I couldn't speak for DD1 of course.
 
ERP said:
I've benchmarked both and in many tasks there is a significant per core clock for clock performance difference, that I do not believe can be explained by the compiler difference.
Is this isolating cache dependencies? I don't think it's been stated whether the PPE has the same cache setup as the XCPU, that could be one fairly significant advantage for Cell.

About compilers, have you actually looked to see what kind of asm code the compiler generated to see if one is doing a better job than the other? I suppose it doesn't matter much for actual work you'll be doing, but the state of compilers on the new processors is likely to be a pretty quickly moving target, from sucks -> sucks, but not quite as badly.

About the noisy one comment, the Xbox devkits have been described as fairly quiet and have been static for some time now. I think that's more likely to be the PS3 devkit, which would be undergoing fairly large changes as it gets closer to a final hardware state.
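
For readers unsure what "clock for clock" means in the quoted claim: the usual approach is to convert a timing into cycles per unit of work, so that cores at different clock speeds can be compared. A minimal sketch follows; the function names are made up for illustration and the numbers are invented placeholders, not measurements of any real chip.

```python
def cycles_per_item(seconds, clock_hz, items):
    """Cycles spent per work item: elapsed time times clock, over item count."""
    return seconds * clock_hz / items


def clock_for_clock_ratio(core_a, core_b):
    """How many times more cycles per item core_a needs than core_b.

    Each argument is a (seconds, clock_hz, items) tuple.
    A result above 1.0 means core_b does more work per clock.
    """
    return cycles_per_item(*core_a) / cycles_per_item(*core_b)


# Invented numbers purely for illustration, not benchmark results.
core_a = (0.50, 3_200_000_000, 1_000_000)  # 0.50 s at 3.2 GHz for 1M items
core_b = (0.60, 2_000_000_000, 1_000_000)  # 0.60 s at 2.0 GHz for 1M items

print(cycles_per_item(*core_a))
print(cycles_per_item(*core_b))
print(clock_for_clock_ratio(core_a, core_b))
```

In this made-up case core B takes longer on the wall clock but needs fewer cycles per item, which is exactly the distinction a clock-for-clock comparison is meant to expose. It says nothing about why, so cache setup and compiler output remain open questions.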
 
aaronspink said:
For one, without using the PPE, the SPUs have no capability for actually executing a single instruction. Instructions must be loaded into the SPU's LS via a DMA operation set up by software running on the PPE.
Isn't it typically only once, if you put a manager kernel in an SPE that fetches/switches all subsequent task streams at runtime by itself after the initial kick and memory aliasing by the PPE?
http://www.research.scea.com/research/html/CellGDC05/38.html
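
The "manager kernel" idea in the question above can be sketched as a resident loop that is started once and then pulls its own work. This is only a toy model: `ToyManagerKernel` and `kick` are hypothetical names, the queue stands in for task descriptors held in main memory, and each `popleft()` stands in for an SPE-initiated DMA fetch. It is not taken from the linked slides.

```python
from collections import deque


class ToyManagerKernel:
    """Stands in for a small resident loop living in one SPE's local store."""

    def __init__(self):
        self.started = False
        self.results = []

    def kick(self):
        # The single PPE-side step: load the kernel and start it.
        self.started = True

    def run(self, task_queue):
        if not self.started:
            raise RuntimeError("kernel was never kicked by the PPE")
        # From here on the loop fetches its own work; each popleft() stands
        # in for an SPE-initiated DMA get of the next task descriptor.
        while task_queue:
            func, arg = task_queue.popleft()
            self.results.append(func(arg))
        return self.results


tasks = deque([(abs, -3), (len, "spe"), (sum, [1, 2, 3])])
kernel = ToyManagerKernel()
kernel.kick()             # one-time involvement of the "PPE"
print(kernel.run(tasks))  # prints [3, 3, 6]
```

The PPE appears exactly once here, at `kick()`. Whether that counts as the SPE being independent is the point the thread is arguing about.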
 
One: Yes, like any other processor out there you need some kind of kick to start (amiga anyone? :) ), after that SPEs can mostly run without any PPE help.
 
chachi said:
Is this isolating cache dependencies? I don't think it's been stated whether the PPE has the same cache setup as the XCPU, that could be one fairly significant advantage for Cell.
Was ERP's implication that Xenon's cores are clock-for-clock faster than the PPE, or vice versa?
 
nAo said:
One: Yes, like any other processor out there you need some kind of kick to start

Yup, someone has to be designated driver to start with. Probably seemed like a reasonable idea to have that be the PPE in the current design. However there's no real reason this needs to be the case, they could just as well have put one of the SPUs in charge of configuring the system and telling the PPE what to do.

This is all just flannel, certain people trying to convince themselves that the SPUs are somehow crippled or inadequate. The reality is that although they are certainly optimised for vectorised operation (float *or* integer), streaming, and low-memory footprint operations, they are more than capable of pretty much any kind of "general purpose" code you want.

(amiga anyone? :) )

Thanks but I already have 4... :)
 
Asher said:
Was ERP's implication that Xenon's cores are clock-for-clock faster than the PPE, or vice versa?

I think he had enough presence of mind not to actually say. Unless maybe he doesn't *like* working in the games industry any more :)
 
ERP said:
Baiting forum members is a sport for the devs on the board :p

Oh yes, until I finish training these hound-dogs to nAo's scent, then the hunt begins and let's see how much of a sport that is... imagine a fox hunt with the dev in the place of the poor fox ;).
 
ERP said:
I've benchmarked both and in many tasks there is a significant per core clock for clock performance difference, that I do not believe can be explained by the compiler difference.

AFAIK DD2 is closer to the Xenon cores than DD1.

Well, to be fair, have you tried locking part of the cache (and not using it) to leave only 512 KB of L2, and then benchmarking a single core on XeCPU versus Cell's PPE?
 