psurge said:
The patent describes an exemplary embodiment of the FPE inside the PPU as consisting of an integer unit, 4 scalar FPUs, and 4 4-way SIMD FPUs (AFAICT they are controlled with a single VLIW instruction stream). In terms of floating point power, that's roughly the equivalent of 4 "vertex shader units". IMO the notable difference is that each of the functional units in the FPE can exchange data with each other as well as with various memory banks. The programmer also has explicit control over what data gets sent where (via the DME), making it far more flexible than the vertex shader model. I would actually expect a PPU to be quite good at vertex work.
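To put rough numbers on that comparison (the per-unit widths are read off the patent text; the 5 ops/clock per vertex shader unit is my own assumption for a vec4 MAD + scalar co-issue design):

```cpp
// Back-of-envelope FLOP count; all figures here are assumptions, not measurements.
constexpr int scalar_fpus     = 4;  // 1 FP op each per clock
constexpr int simd_fpus       = 4;  // 4-wide each
constexpr int simd_width      = 4;
constexpr int fpe_ops_per_clk = scalar_fpus + simd_fpus * simd_width;  // = 20
constexpr int vs_ops_per_clk  = 5;  // assumed: vec4 MAD + scalar co-issue per VS unit
static_assert(fpe_ops_per_clk / vs_ops_per_clk == 4, "~4 vertex shader units");
```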
Static geometry can very well be identical between GPU and PPU, but any kind of dynamic stuff (skinned characters) is probably sent as rigid bodies { position, orientation, velocity, bounding volume, constraints (i.e. joint types), links to other rigid bodies } or particles.
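Just to make that record concrete, here's a minimal sketch of what I'd imagine the per-body layout looking like (field names and types are mine, not the patent's):

```cpp
struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; };
struct Aabb { Vec3 min, max; };

enum class JointType { None, Fixed, Hinge, BallSocket };  // "constraints (i.e. joint types)"

struct RigidBody {
    Vec3      position;
    Quat      orientation;
    Vec3      velocity;
    Aabb      bounds;     // bounding volume for broad-phase collision
    JointType joint;      // simplified here to one constraint per body
    int       links[4];   // indices of connected rigid bodies; the count of 4 is arbitrary
    int       linkCount;
};
```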
The way I envision things is that the driver tracks the complete set of active and inactive rigid bodies and decides which ones need to be sent to the PPU for processing, recieving position/velocity/orientation updates in return. In the case of soft bodies (water surfaces, cloth) though, it could make sense for the PPU to just send a GPU compatible mesh back to the driver.
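Something like this driver-side loop, reusing the RigidBody sketch above (ppuSubmit/ppuReceive are placeholders for whatever the real driver interface turns out to be):

```cpp
#include <cstddef>
#include <vector>

struct PoseUpdate { int bodyId; Vec3 position; Quat orientation; Vec3 velocity; };

// Hypothetical driver entry points; stubbed out since no real API is public.
void ppuSubmit(const std::vector<RigidBody>&) { /* DMA the active set to the PPU */ }
std::vector<PoseUpdate> ppuReceive()          { return {}; /* poll results back */ }

void stepFrame(std::vector<RigidBody>& bodies, const std::vector<bool>& awake) {
    std::vector<RigidBody> active;
    for (std::size_t i = 0; i < bodies.size(); ++i)
        if (awake[i]) active.push_back(bodies[i]);  // inactive bodies never leave the CPU
    ppuSubmit(active);
    // I'm assuming the PPU echoes back an id the driver assigned at submit time.
    for (const PoseUpdate& u : ppuReceive()) {
        bodies[u.bodyId].position    = u.position;
        bodies[u.bodyId].orientation = u.orientation;
        bodies[u.bodyId].velocity    = u.velocity;
    }
}
```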
Yes, I agree with that (and most of the other posts as well). That would make the most sense.
I think there are three different considerations:
1. Is the PPU only as good as the current software implementation, or is it able to do more and better calculations? In the latter case, you need the objects to match the "original" ones more closely, so very rough bounding boxes might work less well, and it might be better to just send the complete meshes over to it.
2. Is the PPU able to do more advanced stuff, like destructible objects, cloth and bouncing boobies? I assume so, as those would be the main selling points. If it cannot do those, what's the use of it? And if it can do that, it must be able to calculate and return meshes.
3. If the CPU has to do too much work and/or transfer too much data (calculate bounding boxes -> send to PPU -> wait -> get data back -> perform transforms according to new positions and vectors -> send to GPU), it might only make sense if the PPU can do it much better than the CPU could in that time; see the toy model after this list.
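A toy model of that third point (all the times are made-up parameters, nothing measured):

```cpp
// Offloading only pays off when the PPU's compute win beats the extra
// transfer and synchronization overhead of the round trip.
bool worthOffloading(double cpuStepMs, double ppuStepMs,
                     double uploadMs, double downloadMs, double syncMs) {
    double ppuTotalMs = uploadMs + ppuStepMs + downloadMs + syncMs;
    return ppuTotalMs < cpuStepMs;  // otherwise the CPU should just do it itself
}
```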
So, it would actually make sense to send over the whole scene as-is up front, and have the PPU do the transformations and send it to the GPU. Or at least do as much of the work as possible and return (transformed parts of) the updated scene to the CPU. Although the first possibility would restrict the extra effects the CPU can do, you could substitute most of them with shaders, and it is much faster. The only things that might be hard to do this way are complex sub-object skinning (when the PPU sends it to the GPU) and alpha-textured 2D meshes for the collision detection.
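That first possibility would look something like this (function names are placeholders; none of this comes from a real PPU/GPU interface):

```cpp
struct Scene { /* complete static + dynamic geometry */ };

// Placeholder hooks for the flow described above.
void ppuUploadScene(const Scene&) {}  // whole scene sent as-is, once
void ppuStep()                    {}  // PPU integrates the physics
void ppuForwardToGpu()            {}  // PPU hands transformed meshes straight to the GPU

void initScene(const Scene& scene) { ppuUploadScene(scene); }

void frame() {
    ppuStep();
    ppuForwardToGpu();
    // The CPU never touches the bulk geometry per frame; the extra per-object
    // effects it would normally apply have to move into shaders instead.
}
```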
But then again, we don't know and it is all just speculation at this time.