Some speculation on Next gen consoles :) Very Verbose!!!

pc999 said:
Read the links and you will know what I meant by "work".
BTW, there is a topic on the subject:
http://www.beyond3d.com/forum/viewtopic.php?t=21085

Good Read

I read that thanks :)

Basically, The GameMaster asserts that a specialized unit will take procedural content, such as procedural geometry, to the next level, and that the concept is a forward-looking one. I do not disagree with either point.

It was noted that as processing power is outpacing the growth in memory space, procedural solutions will become more and more important.
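
To make the compute-for-memory trade concrete, here is a minimal Python sketch. It is only an illustration of the general idea, not how any console or the unit discussed in that thread works; `procedural_height` and `generate_patch` are made-up names, and the hash-based noise is an assumption chosen for simplicity.

```python
import hashlib


def procedural_height(x, y, seed=0):
    """Return a repeatable pseudo-random height in [0.0, 1.0] for grid cell (x, y)."""
    # Hash the seed and coordinates; the same inputs always give the same bytes.
    digest = hashlib.sha256(f"{seed}:{x}:{y}".encode("ascii")).digest()
    # Map the first 4 bytes of the hash to a float between 0 and 1.
    return int.from_bytes(digest[:4], "big") / 0xFFFFFFFF


def generate_patch(x0, y0, size, seed=0):
    """Recompute a size x size block of heights on demand instead of storing it."""
    return [
        [procedural_height(x0 + dx, y0 + dy, seed) for dx in range(size)]
        for dy in range(size)
    ]


# Only the seed and the coordinates need to be kept in memory;
# the patch itself can be thrown away and regenerated later.
patch = generate_patch(100, 200, 4, seed=42)
```

The point is that the data costs processing time every time it is needed, but almost no memory, which is the trade the thread is talking about.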

However, this specialized part is not general purpose, if I understand correctly. It is a part that sits between the CPU and GPU and makes it all possible. The program that controls it runs on the general purpose CPU.

Basically this math processor is to procedural synthesis as a PPU is to accelerated physics interactions.

This procedural synthesis is quite interesting for a lot of reasons and could also explain why the Xbox2 will only use 256MB of RAM...possibly. As noted in that thread, there are some caveats to using procedural everything everywhere.

One thing The GameMaster mentioned that did strike me as odd was the elimination of loading. This would seem to be something Unreal Engine 3.0 has already taken care of for the most part with its seamless worlds capability. Then again, not every dev will license this engine :(
 
GwymWeepa said:
scificube said:
I do think the Cell should be able to handle physics well from a computational standpoint. It is the logic portion of the task that is at all in question. It may not be as difficult as I presumed back then now that I've come across a few ideas here.

Well, considering how it's designed and how powerful it is, and with Sony saying they want physics based graphics, I'm willing to bet it can handle physics with as much flair as that PPU, if not better. Just a hunch.

A hunch based on what? Personally I would caution such expectations.

Remember, the CELL processor in the PS3 will need to not only do physics but also advanced AI, possibly Vertex Shading, Audio, and other game tasks. All on 256MB of memory.

We have to remember that with a physics chip like the one Ageia has created (125M transistors), the complete transistor budget has been designed and allocated for a single type of processing task. Basically it will be extremely efficient at what it does, more so than even the math-centric CELL SPEs. We do not know by how much, but even if we conceded that a 230M transistor CELL chip could outpace a 125M transistor dedicated physics chip, I surely doubt the same CELL chip doing physics + AI, vertex shading, audio, and other game logic would be able to do physics better.

I think this is the hype speaking. If the PS3 has 4 8:1 CELLs we can start getting big ideas, but expecting the PS3 to outdo every chip, even specialized ones, is building unfair expectations. Yes, the CELL is extremely powerful, but expecting it to outdo a dedicated physics chip as in the above scenario is like the early suggestions that the CELL could possibly replace GPU pixel shading. There are just certain limitations. And while I expect the PS3 to do physics very well, I would definitely think that in real world situations, all things being equal, a PC or X2 with a physics chip would allow better physics in a normal game which also has demanding AI, vertex shading, and so forth.

That said, I have my doubts the X2 will have one. It would be cool, but I doubt it. Like others have said: "Why?" X2 already is a processing monster. Maybe not quite the FP power the CELL has, but definitely not a wimp.
 
Cell can supposedly run at 4GHz; I doubt the Ageia chip comes anywhere close to that. So let's imagine a developer decides to dedicate half the Cell muscle to physics: you're dealing with nearly as many transistors, though perhaps not as efficient as the Ageia system since they are not created only for physics, but they are running at the multi-GHz level.
 
A PPU wouldn't need to run at 4GHz just like a GPU wouldn't need to run at 4GHz. A Cell chip OTOH would need to run at 4GHz to do the same things as a lower clocked PPU or GPU.
 
PC-Engine said:
A PPU wouldn't need to run at 4GHz just like a GPU wouldn't need to run at 4GHz. A Cell chip OTOH would need to run at 4GHz to do the same things as a lower clocked PPU or GPU.

I understand that, but considering the sheer amount of horsepower of Cell, and the fact that there will be a GPU...what's the few hundred GFLOPs of power, if that means anything, going to go towards? Modeling clay? lol

I think Cell was created to be a physics beast, hence the slides about physics based graphics. Ageia is probably a great solution for MS or Nintendo if they choose to compete with Cell, but even then I think Cell could have it beat.
 
But we still don't know if Cell will be doing geometry duties or not. If the GPU is only doing PS then a lot of that power in Cell will have to be used for VS leaving little left for physics on the scale of a PPU.
 
PC-Engine said:
But we still don't know if Cell will be doing geometry duties or not. If the GPU is only doing PS then a lot of that power in Cell will have to be used for VS leaving little left for physics on the scale of a PPU.

That is indeed true; as I mentioned earlier, Cell is a flexible design while the PPU is not. If Cell has to do a lot of the geometry or anything else, then indeed a dedicated PPU will do a better job simulating physics. But just how much of the Cell's power will an acceptable level of geometry detail take? Also, I imagine there will be plenty of games that choose to use the power of Cell mostly for physics and don't concern themselves with geometry. I mean, it remains to be seen, but I get a hunch that Cell will be a physics monster that's probably second to none in the mass market when it comes to said physics.
 
The thread on the PPE most often will be the one that's telling the SPEs what to do or the one that's handling branching and other things the SPEs aren't good at.
Please explain to me how a PPE thread can handle branch prediction for an SPE :?
A PPE thread could be assigned to handle SPE coordination, but it's not mandatory since SPEs run a microkernel too and can work without PPE intervention at all.
Make note of this and I'll explain why this is important later. The SPEs' threads will mostly be things that are burdened with intense computation and things that have predictable code. (as a branch misprediction is quite costly for an SPE and SPEs are not good predictors from what I've gathered)
Branch prediction is not quite costly on SPEs..branch prediction is not possible at all, since there is no hw support for it.
It's not that SPEs are not good predictors..SPEs don't predict branches at all.
SPEs just have a branching hint mechanism: compilers will insert branch hints in order for the SPE to prefetch instructions that are likely to be executed.
That's a completely static thing, AFAIK.
A task like rendering can be highly parallelized and is computationally intensive. That is to say, rendering is a task that can be split up into stages where each SPE can take some stage and pass its result to another SPE working on another stage of the task, and you can set up a pipeline or 'stream'...sort of like a factory's assembly line. This leads me to believe Cell will have a hand in rendering as it's just begging to get used to that end.
Since Sony guys on (public) dev lists are hinting that CELL will be used mostly for non-graphics stuff since there is an NVIDIA GPU on board, this makes me believe the NVIDIA GPU will support pixel and vertex shading. SPEs, graphics-wise, could be used for complex tasks such as tessellation/procedural geometry generation.
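
For readers who want to picture the 'assembly line' idea quoted above, here is a rough Python sketch using generators. It only illustrates the streaming structure (each stage consumes the previous stage's output); it is not Cell code, and the stage names, the z-range cull and the scale factor are all assumptions made up for the example.

```python
def transform(vertices, scale):
    """Stage 1: scale each (x, y, z) vertex."""
    for x, y, z in vertices:
        yield (x * scale, y * scale, z * scale)


def cull(vertices, near, far):
    """Stage 2: drop vertices whose depth lies outside [near, far]."""
    for x, y, z in vertices:
        if near <= z <= far:
            yield (x, y, z)


def project(vertices):
    """Stage 3: perspective divide; z is non-zero because cull ran first with near > 0."""
    for x, y, z in vertices:
        yield (x / z, y / z)


def run_pipeline(vertices):
    """Chain the stages so each vertex streams from one stage to the next."""
    return list(project(cull(transform(vertices, 2.0), near=1.0, far=10.0)))


# The second vertex ends up at z = 40 after scaling and is culled.
result = run_pipeline([(1.0, 1.0, 2.0), (0.0, 0.0, 20.0)])
```

On real hardware each stage would presumably live on its own core and the hand-off would be a memory transfer rather than a Python `yield`, but the dependency structure is the same.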

About your general refrain of SPEs not being able to efficiently run general purpose code..well..you're obviously right!
But If you believe Xenon PPC cores will be vastly better..I think you're going to be disappointed ;)
AI, Physics, graphics..almost everything will need to run batched and in multiple instances on NG CPUs, Xenon will not be different in this department if maximum efficiency is needed.

ciao,
Marco
 
Since Sony guys on (public) dev lists are hinting that CELL will be used mostly for non-graphics stuff since there is an NVIDIA GPU on board
You know, I think that deserves a thread of its own. :oops:
 
nAo, can't you do predication on SPEs (execute both paths of an if-then-else branch and keep only the good result)?

So, no branch prediction, but at least if-then-else conversion (even CMOV could be used to do that).
 
AFAIK there's no hw support for predication. Predication can still be used effectively in some situations, but it would be only a 'software' thing.
I already used predication in some tight clipping loop on the PS2 so it would not be much of a problem to make use of it on SPEs ;)
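
A minimal sketch of what 'software' predication means, written in Python with integer masks purely for illustration. Both results are computed and a mask picks one, so there is no branch to predict. The function names and the 32-bit width are assumptions for the example; this is not SPE or VU code.

```python
BITS = 32
FULL_MASK = (1 << BITS) - 1  # 0xFFFFFFFF


def select_branching(cond, a, b):
    """Ordinary if-then-else: control flow depends on cond."""
    if cond:
        return a
    return b


def select_predicated(cond, a, b):
    """Branch-free select for unsigned 32-bit values: pick a or b with a mask."""
    # mask is all ones when cond is true, all zeros when it is false.
    mask = -int(bool(cond)) & FULL_MASK
    # Keep the bits of a where mask is set and the bits of b where it is clear.
    return (a & mask) | (b & ~mask & FULL_MASK)


def clamp_to_limit(value, limit):
    """Example use: compute both candidates, then select, as in a clipping loop."""
    return select_predicated(value > limit, limit, value)
```

Usage: `clamp_to_limit(300, 255)` keeps the limit and `clamp_to_limit(12, 255)` keeps the value, with the same straight-line code path in both cases (the comparison still happens, but nothing jumps on its result).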
 
I already used predication in some tight clipping loop on the PS2 so it would not be much of a problem to make use of it on SPEs
I use something like that all over my shadow volume code - it's cool, no speed loss on loop optimization and I still get the benefits of vertex rejection (unlike what you do on other hardware implementations of volumes) :)
 
Fafalada said:
I use something like that all over my shadow volume code - it's cool, no speed loss on loop optimization and I still get the benefits of vertex rejection (unlike what you do on other hardware implementations of volumes) :)
Bingo! In fact I also used that in my silhouette extraction/volume extrusion code!
When I coded that I thought: woow..those VUs are really cool..way more than xbox VS ;)
Just to compensate all the times I thought: those VUs are crap..Xbox VS are so simple and fast..LOL ;)
 
I believe that the PPE and the XeCPU cores are the same, with some variations in the memory access design that make the two different, but with equal ALUs.

And they won't be PPC 970 based; the reason is that the PPC 970 is a 5 instructions per cycle CPU and the PPE is a 2 instructions per cycle CPU.
 
Josh378 said:
Here's something you might enjoy

http://boards.ign.com/PS3_General_Board_/b8267/82042020/p1/


Biggest collection of PS3 info with links from all over the net (and the biggest one of them all). All of Peter's links and slides are there, as well as 50+ links to every reliable source of PS3 info....(must have IGN Insider...or I could make a post here with the same info...it's not 56k friendly though)

-Josh378

Thank you Josh378 :)

edit:

HOLEY MOLEY!!! That is all.

end edit:
 
nAo said:
The thread on the PPE most often will be the one that's telling the SPEs what to do or the one that's handling branching and other things the SPEs aren't good at.
Please explain to me how a PPE thread can handle branch prediction for an SPE :?
A PPE thread could be assigned to handle SPE coordination, but it's not mandatory since SPEs run a microkernel too and can work without PPE intervention at all.
Make note of this and I'll explain why this is important later. The SPEs' threads will mostly be things that are burdened with intense computation and things that have predictable code. (as a branch misprediction is quite costly for an SPE and SPEs are not good predictors from what I've gathered)
Branch prediction is not quite costly on SPEs..branch prediction is not possible at all, since there is no hw support for it.
It's not that SPEs are not good predictors..SPEs don't predict branches at all.
SPEs just have a branching hint mechanism: compilers will insert branch hints in order for the SPE to prefetch instructions that are likely to be executed.
That's a completely static thing, AFAIK.
A task like rendering can be highly parallelized and is computationally intensive. That is to say, rendering is a task that can be split up into stages where each SPE can take some stage and pass its result to another SPE working on another stage of the task, and you can set up a pipeline or 'stream'...sort of like a factory's assembly line. This leads me to believe Cell will have a hand in rendering as it's just begging to get used to that end.
Since Sony guys on (public) dev lists are hinting that CELL will be used mostly for non-graphics stuff since there is an NVIDIA GPU on board, this makes me believe the NVIDIA GPU will support pixel and vertex shading. SPEs, graphics-wise, could be used for complex tasks such as tessellation/procedural geometry generation.

About your general refrain of SPEs not being able to efficiently run general purpose code..well..you're obviously right!
But If you believe Xenon PPC cores will be vastly better..I think you're going to be disappointed ;)
AI, Physics, graphics..almost everything will need to run batched and in multiple instances on NG CPUs, Xenon will not be different in this department if maximum efficiency is needed.

ciao,
Marco

nAo:

Sorry. I formalized this theory back in February. Back then I simply wasn't as apprised of things as I am now. I apologize for such mistakes, but since I was speaking on the macro scale I didn't think it would be too important anyway.

When I said the PPE would handle branch prediction for the SPE, I was thinking of two possibilities: one that makes sense, the other pure conjecture. One being that it would handle tasks with intense branching such as AI. The other being that it would unroll loops etc. before it gave the SPEs the APULETs to work on. I did not mean the SPE would query the PPE to handle a branch...that would be quite inefficient and frankly nonsensical, even on my behalf :)

edit:

Actually this would make a heck of a lot more sense to do during compile time but then it would be static like you inform me. Thanks.

end edit:

I do apologize about not knowing about Sony's recent statements. I did cover the other possibilities to a degree, though. I simply picked what I thought at that time made the most sense, as the Cell was repeatedly being referred to as a multi-media/rendering beast. I don't mind being wrong if I am.

As far as how I viewed the SPEs...wait, I did say coding for NG CPUs would be difficult regardless, as leveraging all the cores of an NG CPU is not a trivial task. OK, with respect to the SPE I was referring to a greater level of difficulty given how it's lacking in HW capability compared to the PPE and the Xenon's PPE-like cores. Furthermore, I thought I hinted at the degradation of abstraction that also seems to be a problem with coding for SPEs. I did not mean to imply next gen coding would be simple. I meant to imply it would be easier to construct code on the Xenon CPU, and I made a prediction that a significant portion of the code on the Cell may end up replacing HW capabilities present on the Xbox2 CPU. It primarily was an argument that ease of use and lower overhead in code may reduce the performance delta some see between the Cell and the Xenon CPU. I don't think I went out of bounds or anything, but of course I could be wrong...but otherwise none of this would be speculation, and it would be as boring as reading white papers!

I beg thy forgiveness :p
 
Urian said:
I believe that the PPE and the XeCPU cores are the same, with some variations in the memory access design that make the two different, but with equal ALUs.

And they won't be PPC 970 based; the reason is that the PPC 970 is a 5 instructions per cycle CPU and the PPE is a 2 instructions per cycle CPU.

I think I suggested something close to that.

....so Urian is my friend :D
 
GwymWeepa said:
Cell can supposedly run at 4GHz; I doubt the Ageia chip comes anywhere close to that. So let's imagine a developer decides to dedicate half the Cell muscle to physics: you're dealing with nearly as many transistors, though perhaps not as efficient as the Ageia system since they are not created only for physics, but they are running at the multi-GHz level.

Perhaps here is a good analogy to look at that shows that things are not so simple.
Deano Calver said:
I'm close to screaming (partly because I can't say a lot about next-gen GPUs) but hopefully this will explain enough to show GPUs are good at maths as well.

GPUs (not surprisingly) are good at what they do, so why do it on a CPU that isn't designed for the job? On their home turf (doing maths on lots of separate bits of data) they are very good.

GPUs are seriously SIMD. They have a number of units (quads are an old name for them but that distinction will go away), each working on lots of data.

So let's compare an SPU SIMD to an imaginary near future GPU SIMD unit (this is meant as an order of magnitude thought experiment, so read nothing into the numbers).

1 instruction FMAC in a SPU will operate on 4 floats per cycle
1 instruction FMAC in a GPU unit will operate on 48 floats per cycle

If we take 4GHz for the SPU and 500MHz for a GPU, then an SPU can perform 8 times as many instructions.
So in 1 GPU cycle
8 FMAC instructions in a SPU will operate on 32 floats
1 FMAC instruction in a GPU unit will operate on 48 floats

So even this crude back-of-the-envelope calculation shows that a GPU isn't exactly outclassed...

Just in case there is any doubt, GPUs will ship with multiple 'units' as illustrated here, just as Cell ships with multiple SPUs...

And we are not even thinking about all the 'free' data conversion, low latency memory reads and lerps that are the part of the fixed hardware...
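
The back-of-the-envelope comparison in the quoted post can be reproduced in a few lines of Python. The clock speeds and floats-per-instruction figures are the thought-experiment numbers from the quote, not real hardware specs, and the function name is made up for this sketch.

```python
def floats_per_reference_cycle(clock_hz, floats_per_instruction, reference_clock_hz):
    """Floats processed by one unit during one cycle of the reference clock."""
    # How many instructions this unit issues while the reference clock ticks once.
    instructions = clock_hz // reference_clock_hz
    return instructions * floats_per_instruction


SPU_CLOCK_HZ = 4_000_000_000  # 4GHz, from the quote
GPU_CLOCK_HZ = 500_000_000    # 500MHz, from the quote

# Measure both units against one GPU cycle.
spu_floats = floats_per_reference_cycle(SPU_CLOCK_HZ, 4, GPU_CLOCK_HZ)
gpu_floats = floats_per_reference_cycle(GPU_CLOCK_HZ, 48, GPU_CLOCK_HZ)

print(f"SPU: {spu_floats} floats per GPU cycle")
print(f"GPU unit: {gpu_floats} floats per GPU cycle")
```

So per unit, the 8x clock advantage does not make up for the 12x difference in SIMD width in this thought experiment, which is the point of the quote.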
 
Acert93 said:
GwymWeepa said:
scificube said:
I do think the Cell should be able to handle physics well from a computational standpoint. It is the logic portion of the task that is at all in question. It may not be as difficult as I presumed back then now that I've come across a few ideas here.

Well, considering how it's designed and how powerful it is, and with Sony saying they want physics based graphics, I'm willing to bet it can handle physics with as much flair as that PPU, if not better. Just a hunch.

A hunch based on what? Personally I would caution such expectations.

Remember, the CELL processor in the PS3 will need to not only do physics but also advanced AI, possibly Vertex Shading, Audio, and other game tasks. All on 256MB of memory.

We have to remember that with a physics chip like the one Ageia has created (125M transistors), the complete transistor budget has been designed and allocated for a single type of processing task. Basically it will be extremely efficient at what it does, more so than even the math-centric CELL SPEs. We do not know by how much, but even if we conceded that a 230M transistor CELL chip could outpace a 125M transistor dedicated physics chip, I surely doubt the same CELL chip doing physics + AI, vertex shading, audio, and other game logic would be able to do physics better.

I think this is the hype speaking. If the PS3 has 4 8:1 CELLs we can start getting big ideas, but expecting the PS3 to outdo every chip, even specialized ones, is building unfair expectations. Yes, the CELL is extremely powerful, but expecting it to outdo a dedicated physics chip as in the above scenario is like the early suggestions that the CELL could possibly replace GPU pixel shading. There are just certain limitations. And while I expect the PS3 to do physics very well, I would definitely think that in real world situations, all things being equal, a PC or X2 with a physics chip would allow better physics in a normal game which also has demanding AI, vertex shading, and so forth.


Also remember that many of CELL's transistors are committed to high speed SRAM, so comparing transistor counts is misleading, just like comparing a GPU's clock speed to a CPU's. Ageia doesn't need to make any tradeoffs for running other tasks, while CELL has to remain more flexible and general purpose. The Ageia PhysX PPU is going to be way more efficient than CELL when it comes to physics processing. I'm confident that an 8-SPU CELL CPU would be much slower than a PPU in physics.
 