Hironobu Sakaguchi's opinion of cell and PS3

Status
Not open for further replies.
Hi, Joker!
I'm quite happy to see you here again.
During your first appearance on this forum some members were quite tough.
I was afraid we had lost one of the few devs here who have experience with both systems (360 and PS3).

You said that in your game you and your team found a way to do on the Xenos some calculations that, on the PS3, you would want done on the SPUs.

I started a thread on this matter some time ago; if you're allowed to talk a little about it, your comments would be welcome.
The link:
http://forum.beyond3d.com/showthread.php?t=39096

In the FKATCT thread (lol) you said that you'd be working on PS3 during 2007. We're here now; could you sum up your pro/con comments about the Cell as far as your experience is concerned?

About this thread, I think there is a lot more interesting stuff in this interview than the DSP stuff.
It was a quick answer, and HS seems to have some problems with KK; add in some MS PR and there you are. I'm not sure that part is worth discussing...

There are interesting comments about the fact that the Square Enix White Engine could have been made from scratch as a multi-platform engine, like Capcom's Framework engine.
 
Yes, it would be good to hear some pros and cons from a cross-platform developer's perspective. Based on his responses so far, I was wondering whether he works only on rendering (and animation?), or whether he also participates in AI and physics implementation. I am especially interested in the local store's latency win/loss in real world applications.
 
The differences are in the details: DSPs often contain accelerators for communications applications (e.g., viterbi engines) whereas the SPUs are tailored for graphics (e.g., wider vector operations).

The SPUs are capable of (and great at) much more than graphics-related tasks; in addition, they can handle anything with varying levels of success. So, no, the DSP analogy is a terrible one. Just because the SPE can mimic the behaviour of a DSP does not mean that the converse is true, and that is a key point of differentiation.
 
I don't feel that such specific case examples are as relevant to game performance as people would think. It's like someone building a test app on PS3 or 360, and using it to extrapolate real world game performance to compare both machines. It just doesn't work that way, there is so much more going on. In a real world example, all chunks of code are fighting for the same cycles, memory, DMA, cache, etc. The only way to find out real world game performance is to code, optimize and run a real game on the actual hardware, then compare.

Just to make one clarification, I do think that in its lifetime the PS3 will eclipse the 360 in CPU power. What I meant by the 'leapfrog' comment is that I don't think people will see the 10x-20x leap in performance that it seems people are expecting.

I don't think anyone is expecting a 10x-20x leap in performance other than total fanboys and 10 year old kids :) It's not a matter of eclipsing the 360's power but more a matter of whether the PS3 will surpass the 360 at all... and if it will, by how much and in which areas. Do you believe that the PS3 will not surpass the 360? It seems to me that Sony is getting their act together with Edge.
 
I see, so we can't do dot products on SPUs, right? Let me ask you this question: how many cycles does a VMX unit need to perform a dot product? And also... how many cycles does that VMX unit need to perform a 3-component or 2-component dot product?

Of course you can do dot products on SPUs, never said you couldn't! Likewise, there's no normalize instruction on the 360's VMX; naturally it doesn't mean you can't do a normalize ;) But in this case, there just happens to be a dedicated instruction for dot product on the 360's VMX, which is nice. I'd have to check the docs to get the cycle count on a 4-component dot product, I *think* it's 12 or 14 offhand.
 
I see, so we can't do dot products on SPUs, right? Let me ask you this question: how many cycles does a VMX unit need to perform a dot product?
And also... how many cycles does that VMX unit need to perform a 3-component or 2-component dot product?

Well, actually that's totally irrelevant since you guys haven't made any progress anyway, because you have no tools and support is rubbish. In fact all those nifty screenshots and videos are just the work of some savvy CG artists, and the playable demos @E3 were just a cunning deception running on über PCs designed to placate audiences until you can jump to the 360 and be saved from the development quagmire that is the PS3. I mean c'mon, why else would Deano port Lua to XNA!? It's all in the tea leaves... ;)
 
I don't think anyone is expecting a 10x-20x leap in performance other than total fanboys and 10 year old kids :) It's not a matter of eclipsing the 360's power but more a matter of whether the PS3 will surpass the 360 at all... and if it will, by how much and in which areas. Do you believe that the PS3 will not surpass the 360? It seems to me that Sony is getting their act together with Edge.

CPU side, the PS3 is easily superior to the 360 as far as performance goes, I don't think anyone can really dispute that. What I'm curious to know is by how much in a real world example. We're finding on our game that we can leverage the GPU on the 360 more than we can on the PS3. In other words, there are some things I moved to the CPU side on the PS3 that I didn't have to move on the 360 because its GPU eats them up just fine. I'll know more soon enough; I'm mostly doing PS3 work this time around, largely because I'm curious as hell to see what Cell can really do.

As far as PS3 eclipsing 360 as a whole? I still don't really know. The PS3 has things wrapped up CPU side, but its less effective GPU and having tens of megabytes less available memory mean this one is not easy to call.
 
While we are at it... joker454, did you see the rest of PlayStation Edge? As you're coding on SPU today, it would be interesting to know your opinion of the recent tools. How do they help in terms of performance and ease of integration with respect to your current cross-platform title (or a future revision)? :D
 
joker454 said:
The VMX units on the 360/PS3 PPUs are not quite the same; the ones on the 360 have some extra instructions.
Contrary to popular belief (where things are always added to non-Sony hardware without any compromises), 360 VMX is also missing some instructions.

As for DP - I stand by my statements from long ago - it's basically irrelevant in code where you can use optimal data layouts, and in code where layouts are ill suited to Vertical SIMD, we would benefit from different extensions much more (unfortunately certain IBM engineers believe such things are heresy, and compilers and programmers alike are there to bite the bullet).
 
CPU side, the PS3 is easily superior to the 360 as far as performance goes, I don't think anyone can really dispute that. What I'm curious to know is by how much in a real world example. We're finding on our game that we can leverage the GPU on the 360 more than we can on the PS3. In other words, there are some things I moved to the CPU side on the PS3 that I didn't have to move on the 360 because its GPU eats them up just fine. I'll know more soon enough; I'm mostly doing PS3 work this time around, largely because I'm curious as hell to see what Cell can really do.

As far as PS3 eclipsing 360 as a whole? I still don't really know. The PS3 has things wrapped up CPU side, but its less effective GPU and having tens of megabytes less available memory mean this one is not easy to call.

Correct me if I'm wrong but doesn't RSX have the pixel shader power advantage?
If Cell can be leveraged to match (or exceed) the vertex processing abilities I would think PS3 should win out overall.

Of course, I guess it all depends on developer focus, time and familiarity. I remember the Itagaki quote where he says the PS3 is more powerful but "obstinate". I wasn't quite sure what he meant by "obstinate", but perhaps he was referring to some of the issues brought up as far as trying to get the most out of the system.
 
He's a producer in a company that outsources all of its programming.
Did Kutaragi sleep with his wife? Run over his cat? Someone seems to have a personal axe to grind.
lol that might explain it
This reminds me of a John Carmack thread.
Didn't see any bashing (I actually saw positive remarks regarding MS in general) but that's irrelevant
He actually bashes their marketing quite heavily.
 
This reminds me of a John Carmack thread.

John Carmack is a programmer. When he talks about hardware, his words have weight.
Hironobu Sakaguchi is, as I said, a producer/director in a company that does no actual programming. The areas in which he can talk with authority are different from those for John Carmack.
Do you see the difference?
 
John Carmack is a programmer. When he talks about hardware, his words have weight.
Hironobu Sakaguchi is, as I said, a producer/director in a company that does no actual programming. The areas in which he can talk with authority are different from those for John Carmack.
Do you see the difference?
Not in terms of the ad hominem comments, no. And actually, I do think it possible that a head of a game company might have technical insight into what his teams work on, regardless of hands on or not. But regardless, I was purely speaking to the posts that immediately tried to discredit the person making the statement as opposed to addressing the statement itself.
 
A guy who literally sold out to MS taking pot shots at his old boss and now main competitor isn't exactly unbiased.
 
There's really nothing about the statement to discuss. It's a single sentence fragment of flippant over-simplification.
He doesn't like the PS3's architecture? On what basis, and so what? Neither he nor anybody in his company is dealing with the architecture directly (if they're keeping with their mission statement, anyway), so what possible reason could he have for voicing like or dislike?
 
But in this case, there just happens to be a dedicated instruction for dot product on the 360's vmx which is nice.
If you want to write subpar code... yes, it's quite nice.
I'd have to check the docs to get the cycle count on a 4-component dot product, I *think* it's 12 or 14 offhand.
Here we go, now you know why SPEs don't have such an instruction (and can still outperform it by a factor of 8 or more... in fact I don't think you want to use that instruction on Xenon either if you want to write fast code)
 
Here we go, now you know why SPEs don't have such an instruction (and can still outperform it by a factor of 8 or more... in fact I don't think you want to use that instruction on Xenon either if you want to write fast code)

Enlighten me then, I'm always open to learning new things ;) I just replaced a branch that was in a tight loop with a few instructions, one of which was dot product. I didn't actually need a dot product per se; what I needed, literally, was the sum of the absolute values of 4 numbers that are each -1, 0 or 1. In other words, if my 4 numbers are { 0, -1, 0, 1 }, I should get 2 as my value.

So, I have these four numbers that are sitting in a vmx register and I need to know the above value to pass to the next step. How could I do this faster on xenon without dot product?
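
To make the question concrete, here is a minimal scalar sketch of how a dot product can produce that value. It assumes the vector is dotted with itself, which the post does not actually state; `dot4` and `abs_sum_via_dot` are hypothetical names, and the plain loop only stands in for a hardware dot-product instruction rather than modelling it.

```c
#include <assert.h>

/* Plain scalar 4-component dot product. This only stands in for a
   hardware dot-product instruction; it does not model its latency. */
static float dot4(const float a[4], const float b[4])
{
    float sum = 0.0f;
    for (int i = 0; i < 4; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* Hypothetical helper: sum of |v[i]| when every component is -1, 0 or 1.
   For those three values v[i] * v[i] equals |v[i]|, so dot(v, v) is the
   absolute-value sum. Assumption: the vector is dotted with itself. */
static float abs_sum_via_dot(const float v[4])
{
    for (int i = 0; i < 4; ++i)
        assert(v[i] == -1.0f || v[i] == 0.0f || v[i] == 1.0f);
    return dot4(v, v);
}
```

The trick only holds because the inputs are restricted to -1, 0 and 1; for general values dot(v, v) is the squared length, not the absolute-value sum.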
 
Enlighten me then, I'm always open to learning new things ;) I just replaced a branch that was in a tight loop with a few instructions, one of which was dot product. I didn't actually need a dot product per se; what I needed, literally, was the sum of the absolute values of 4 numbers that are each -1, 0 or 1. In other words, if my 4 numbers are { 0, -1, 0, 1 }, I should get 2 as my value.

So, I have these four numbers that are sitting in a vmx register and I need to know the above value to pass to the next step. How could I do this faster on xenon without dot product?
Noob reply:

Looking at the SPU ISA, if you are using an integer vector:

Multiply the vector by itself to get the absolute value of the vector components
Use the Count Ones in Bytes instruction

I don't know what you need the result for, but the dot product is a very high-latency operation, so it may cause stalls on an in-order CPU.
http://pc.watch.impress.co.jp/docs/2005/1028/kaigai02l.gif
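
The two steps above can be sketched in portable C as follows. This is only one guess at how they might combine: the squares (each 0 or 1) are packed into one bit per component and the set bits are then counted. `count_ones_u8` and `abs_sum4` are hypothetical names, and the bit-counting loop stands in for the Count Ones in Bytes instruction rather than reproducing its exact semantics.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for a per-byte population count. */
static unsigned count_ones_u8(uint8_t b)
{
    unsigned n = 0;
    while (b) {
        n += b & 1u;
        b >>= 1;
    }
    return n;
}

/* Hypothetical helper: sum of |v[i]| for components in {-1, 0, 1}.
   Squaring maps -1 and 1 to 1 and leaves 0 as 0; packing one bit per
   component (an assumption of this sketch) lets a single popcount
   produce the sum. */
static unsigned abs_sum4(const int8_t v[4])
{
    uint8_t mask = 0;
    for (int i = 0; i < 4; ++i) {
        assert(v[i] >= -1 && v[i] <= 1);
        int sq = v[i] * v[i];          /* 0 or 1 */
        mask |= (uint8_t)(sq << i);    /* one bit per component */
    }
    return count_ones_u8(mask);
}
```

For { 0, -1, 0, 1 } the mask has two bits set, so the function returns 2.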
 
Enlighten me then, I'm always open to learning new things ;) I just replaced a branch that was in a tight loop with a few instructions, one of which was dot product. I didn't actually need a dot product per se; what I needed, literally, was the sum of the absolute values of 4 numbers that are each -1, 0 or 1. In other words, if my 4 numbers are { 0, -1, 0, 1 }, I should get 2 as my value.

So, I have these four numbers that are sitting in a vmx register and I need to know the above value to pass to the next step. How could I do this faster on xenon without dot product?


Since your numbers are only -1, 0 or 1, why waste a whole 128-bit register on a vector that you only really need 4 bytes for?

You could arrange your data like this, so that each vector is stored vertically across 4 registers, each holding 16 byte-sized values.

R1 = {A1, B1, C1, D1, ... P1}
R2 = {A2, B2, C2, D2, ... P2}
R3 = {A3, B3, C3, D3, ... P3}
R4 = {A4, B4, C4, D4, ... P4}

Then...

R1 = R1 * R1
R2 = R2 * R2
R3 = R3 * R3
R4 = R4 * R4

D = R1 + R2 + R3 + R4

With just 7 instructions you can compute 16 results.

The way you were suggesting you would use 4 instructions to compute 4 results.
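
The vertical layout above can be sketched in plain C. Each array plays the role of one 128-bit register holding 16 byte lanes, and each arithmetic line in the loop corresponds to one of the 7 vector operations (4 multiplies, 3 adds). `abs_sum_soa` and `LANES` are hypothetical names, and whether a given SIMD unit offers a per-byte multiply is not checked here; this shows the data arrangement, not tuned VMX or SPU code.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 16  /* 16 byte lanes in one 128-bit register */

/* r1..r4 each hold one component of 16 different vectors (A..P).
   d[i] receives the absolute-value sum for vector i, valid only
   when every component is -1, 0 or 1. */
static void abs_sum_soa(const int8_t r1[LANES], const int8_t r2[LANES],
                        const int8_t r3[LANES], const int8_t r4[LANES],
                        int8_t d[LANES])
{
    for (int i = 0; i < LANES; ++i) {
        assert(r1[i] >= -1 && r1[i] <= 1 && r2[i] >= -1 && r2[i] <= 1);
        assert(r3[i] >= -1 && r3[i] <= 1 && r4[i] >= -1 && r4[i] <= 1);
        int8_t s1 = (int8_t)(r1[i] * r1[i]);  /* R1 = R1 * R1 */
        int8_t s2 = (int8_t)(r2[i] * r2[i]);  /* R2 = R2 * R2 */
        int8_t s3 = (int8_t)(r3[i] * r3[i]);  /* R3 = R3 * R3 */
        int8_t s4 = (int8_t)(r4[i] * r4[i]);  /* R4 = R4 * R4 */
        d[i] = (int8_t)(s1 + s2 + s3 + s4);   /* D = R1 + R2 + R3 + R4 */
    }
}
```

On real SIMD hardware the loop over `i` disappears, since each line operates on all 16 lanes at once; that is where the 16 results from 7 instructions come from.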
 