Using the CPU to post process GPU work

Acert93 · May 22, 2005

Jawed said:
Physics and AI are not a good fit for stream processors. It will work solely because there's a brutish amount of power on Cell sat doing not much else.

Jawed

It has been argued on this forum that CELL is well designed for Physics. Why would it not be?

randycat99 · May 22, 2005

...and a "hyperthread" constitutes the functionality of +1 core, now? I think we need to rein a few things in here, before they become "fact" in Internet-world, eh?

DemoCoder · May 22, 2005

SPEs are quite a bit more general than pixel shaders. They have random access to their own RAM, which is considerably more efficient than the mechanism by which pixel shaders can access the results of previous computations (render to texture).

Secondly, while AI is scalar/integer (except for some exotic techniques), Physics is very vector oriented and efficiently implemented as streaming matrix computations. It can be broken down into three types of operations: handling collisions, integration, and solving. Integration and solving are very stream oriented and represent the majority workload in any physics engine.
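To illustrate why integration counts as stream oriented, here is a minimal sketch of one time step over a batch of bodies. It assumes a structure-of-arrays layout and semi-implicit Euler in one dimension; the names `Bodies` and `integrate` are made up for this example and do not come from any particular engine.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bodies:
    # Structure-of-arrays layout: each field is one contiguous stream.
    pos: List[float]
    vel: List[float]
    force: List[float]
    inv_mass: List[float]


def integrate(b: Bodies, dt: float) -> None:
    """Semi-implicit Euler: velocity from force, then position from the new velocity."""
    for i in range(len(b.pos)):
        # Each body reads and writes only its own slots, so the loop can be
        # cut into independent chunks and streamed through a vector unit.
        b.vel[i] += b.force[i] * b.inv_mass[i] * dt
        b.pos[i] += b.vel[i] * dt


bodies = Bodies(pos=[0.0, 1.0], vel=[0.0, 0.0], force=[2.0, 0.0], inv_mass=[1.0, 1.0])
integrate(bodies, 0.5)
```

No iteration depends on another, which is the property that makes this kind of work fit a streaming processor.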

For example, Baraff's solver is efficiently implemented as a stream solver on a PS/2 VU. This is what you get if you buy the MathEngine middleware for PS/2.

Jawed · May 22, 2005

How many objects is that? 50?

This makes it sound like it only barely works:

http://www.q12.org/pipermail/ode/2003-January/002759.html

Perhaps you've got some more edifying resources?

Jawed

nAo · May 22, 2005

Jawed said:
You've just explained why doing various kinds of blending and geometry interpolation on Cell is a huge waste of resources, and why this functionality should be undertaken by RSX.

I just explained why, in my opinion, that comparison was flawed. Do you care to explain what geometry interpolation is and why CELL handling it is a waste of resources?

In case you haven't noticed, ATI has gone to the next level on this kind of functionality with R500.

I really don't know what you're talking about here, care to explain?

Clearly there are algorithms that are ill-suited to RSX's 200-odd GFLOPS of programmable shaders and they should be run on Cell

what are odd gigaflop/s?

- though it's worth pointing out that Cell is a streaming processor, which means that the SPEs are only marginally more general purpose than the programmable streaming floating point functionality of RSX

Umh? SPEs are much more general purpose than SM3.0 shaders.
With a SPE you can gather and scatter data from wherever you want.

This goes back to the argument that XB360's CPU is "6 cores" of general purpose computing power with small amounts of vector processing tacked on the side.

I'm really wondering how you are deducing all these arguments from something I wrote that has nothing in common with what you are writing here.

Physics and AI are not a good fit for stream processors. It will work solely because there's a brutish amount of power on Cell sat doing not much else.
Jawed

If physics is not well suited to run on CELL then the CELL designers have failed, because physics was one of the applications they tried to address with the CELL design.
I believe they know better than you and me; in fact the CELL architecture seems well suited for physics calculations.

Jawed · May 22, 2005

Sony/NVidia are suggesting that higher-order surfaces are computed as a shared workload between Cell and RSX.

ATI is saying that HOS will run entirely on R500. Additionally raster output on R500 would appear to be dramatically more efficient. These two things are what I'm referring to as R500's significant advantages.

Not all physics is connected bodies. Streamed physics processing (i.e. SPEs) would seem to be good at connected bodies, because streaming is suited to dealing with small amounts of data on gazillions of objects, one after the other in a stream.

Collision testing of thousands of separate objects requires searching through huge amounts of memory (compared to the memory size of an SPE). Random access. A streaming processor is comparatively bad at random access, particularly as there's no caching in the SPEs (i.e. there's no look ahead - you have to program your own).
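"Program your own look ahead" usually means double buffering: request the next chunk of data while working on the current one. The sketch below only shows the control flow, under loud assumptions: `fetch` is a hypothetical stand-in for a DMA transfer into local store, and in plain Python it runs synchronously, whereas real hardware would overlap the transfer with the computation.

```python
from typing import Callable, List, Sequence


def fetch(main_memory: Sequence[float], start: int, size: int) -> List[float]:
    """Hypothetical stand-in for a DMA transfer from main memory to a local buffer."""
    return list(main_memory[start:start + size])


def stream_process(main_memory: Sequence[float], chunk_size: int,
                   kernel: Callable[[float], float]) -> List[float]:
    results: List[float] = []
    # Two local buffers: one being computed on, one being filled.
    buffers = [fetch(main_memory, 0, chunk_size), []]
    current, offset = 0, 0
    while buffers[current]:
        # Issue the fetch for the next chunk before touching the current one.
        next_offset = offset + chunk_size
        buffers[1 - current] = fetch(main_memory, next_offset, chunk_size)
        # Compute on the chunk that is already local.
        results.extend(kernel(x) for x in buffers[current])
        # Swap roles of the two buffers.
        current, offset = 1 - current, next_offset
    return results


doubled = stream_process(list(range(10)), 4, lambda x: x * 2)
```

This works well when the access pattern is known in advance; it does not help when the next address depends on the data just read, which is the random-access case described above.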

Of course if you want to make pretty exploding objects where none of the thousands of bits interacts with anything but the ground and other static objects then yes, that'll work trivially nicely on Cell.

We'll just have to wait and see eh? 8)

Jawed

DemoCoder · May 23, 2005

The R500's HOS AFAIK is essentially a fixed-function tessellation unit combined with a geometry shader. This is considerably less flexible than an SPE, and to use its geometry shaders, the R500 has to borrow them from pixel shading, whereas with CELL, it runs in parallel.

As for collision detection, it's just not true. Most collision detection algorithms use a hierarchical approach. At any given time, you do not need to check every object against every other, that would be insanely inefficient, even on the XB.
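As a rough sketch of the hierarchical idea, here is a tiny bounding volume hierarchy over 2D axis-aligned boxes. It is one possible hierarchy among many, not what any specific engine does; the median split on x and all names (`Node`, `build`, `query`, `candidate_pairs`) are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def overlaps(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(a: Box, b: Box) -> Box:
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


@dataclass
class Node:
    box: Box
    index: Optional[int] = None  # set on leaves only
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(boxes: List[Box], ids: List[int]) -> Node:
    if len(ids) == 1:
        return Node(box=boxes[ids[0]], index=ids[0])
    # Split at the median of the box centres along x.
    ids = sorted(ids, key=lambda i: boxes[i][0] + boxes[i][2])
    mid = len(ids) // 2
    left, right = build(boxes, ids[:mid]), build(boxes, ids[mid:])
    return Node(box=union(left.box, right.box), left=left, right=right)


def query(node: Node, box: Box, out: List[int]) -> None:
    if not overlaps(node.box, box):
        return  # prune the whole subtree with one test
    if node.index is not None:
        out.append(node.index)
        return
    query(node.left, box, out)
    query(node.right, box, out)


def candidate_pairs(boxes: List[Box]) -> List[Tuple[int, int]]:
    if len(boxes) < 2:
        return []
    root = build(boxes, list(range(len(boxes))))
    pairs = set()
    for i, box in enumerate(boxes):
        hits: List[int] = []
        query(root, box, hits)
        pairs.update((min(i, j), max(i, j)) for j in hits if j != i)
    return sorted(pairs)


boxes = [(0, 0, 1, 1), (0.5, 0.5, 2, 2), (10, 10, 11, 11), (10.5, 10.5, 12, 12), (5, 5, 6, 6)]
pairs = candidate_pairs(boxes)
```

Only the surviving candidate pairs need a precise test, and those tests are independent of each other, so they can be batched and handed out in parallel.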

Moreover, it is likely that the SPEs would be tasked to handle multiple time-step integrations and constraint solutions in parallel, while the main PPE assists with scanning a hierarchical collision structure. Once a rough partition of the problem set is decided, the individual "high precision" collision checks can be offloaded to the SPEs.

MfA · May 23, 2005

The world at large only has local interaction, so obviously physics can be well parallelized.

DemoCoder · May 23, 2005

Only if you believe Bell's Inequality hasn't been violated. If it has, and your physics engine uses hidden variables, then you need non-locality.

see colon · May 23, 2005

From a previous thread, Deno said the slide is rubbish as consoles have been doing "two way rendering" since PS1. I agree with Jaws/Jawed (can't remember which one) that the slide was talking more about one way PC architectures.

not to derail the conversation, but i'm pretty sure dreamcast was only one way. as far as i know it's the only one in recent memory.
