Deano's multi-processing post and Sweeney's comments

Reverend

First of all, Deano's post.

I brought this to Tim Sweeney's attention and he had the following comments to make:

I definitely see the CPU as The Right Place for performing tessellation and other geometry operations, and have always lobbied against these features being hardcoded on future GPUs. After all, the interesting tessellation algorithms (such as subdivision surfaces) require lots of branchy code with random memory access, something that is better suited to a general CPU than a GPU.

Regarding vertex shading, it's trending to be such a small part of the workload compared to pixel shading that long-term it won't matter very much where this is performed.

Just thought this may be of interest.
 
And this is a nice argument for having a unified shader architecture. If you want to do much of your geometry processing via the CPU, then all of the ALUs of a unified shader graphics processor can be dedicated to pixel shading; however, you can still call on the graphics processor to do some geometry processing if the CPU is bogged down or you have a task the CPU wouldn't be good at (such as a vertex task that requires textures).
 
I'd really rather we work toward having the GPU do all of the graphics work, so that the CPU can do other tasks. No matter how powerful the CPU is, there are plenty of ways to exploit that power in games other than graphics.
 
In my mind the tessellation of landscapes should be done on the CPU because, of course, there are a lot of calculations, mainly collision detection, that need the geometric data anyway.
 
bloodbob said:
In my mind the tessellation of landscapes should be done on the CPU because, of course, there are a lot of calculations, mainly collision detection, that need the geometric data anyway.
AFAICS that's not necessarily the case.
If you are only testing for a small number of collisions (relative to the number of polygons in the geometry), then it's probably much cheaper to only tessellate the regions local to each likely collision. You can also then take advantage of a natural hierarchy to get nested bounding boxes to avoid some tessellation as well.
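
To make Simon F's suggestion concrete, here is a rough C++ sketch (all the structures and the TessellateAndCollide hook are invented for the illustration): patches carry conservative bounding boxes, the hierarchy prunes whole regions, and only the leaf patches near a potential collision ever get tessellated.

Code:
#include <vector>

struct AABB { float min[3], max[3]; };

struct Sphere { float center[3]; float radius; };

struct Patch {
    AABB bounds;                  // conservative box around the untessellated patch
    std::vector<Patch> children;  // natural hierarchy of sub-patches
    // ... control points / surface data would live here ...
};

// Squared distance from the sphere centre to the box, axis by axis.
bool Intersects(const AABB& box, const Sphere& s)
{
    float d2 = 0.0f;
    for (int i = 0; i < 3; ++i) {
        if (s.center[i] < box.min[i]) { float d = box.min[i] - s.center[i]; d2 += d * d; }
        if (s.center[i] > box.max[i]) { float d = s.center[i] - box.max[i]; d2 += d * d; }
    }
    return d2 <= s.radius * s.radius;
}

// Hypothetical hook: tessellate just this leaf patch and test the resulting
// triangles against the collider.
void TessellateAndCollide(const Patch& p, const Sphere& collider);

void CollideLocally(const Patch& p, const Sphere& collider)
{
    if (!Intersects(p.bounds, collider))
        return;                               // whole region rejected, nothing tessellated
    if (p.children.empty())
        TessellateAndCollide(p, collider);    // only leaves near the collider get tessellated
    else
        for (const Patch& c : p.children)
            CollideLocally(c, collider);      // nested bounding boxes prune the recursion
}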
 
bloodbob said:
In my mind the tessellation of landscapes should be done on the CPU because, of course, there are a lot of calculations, mainly collision detection, that need the geometric data anyway.
I don't think collision detection should be done with high-res triangle meshes anyway. I would typically expect that either lower-resolution meshes or analytical representations of meshes would be much more efficient.
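
As a toy illustration of what an analytical representation buys you (the types and function here are made up for the example): a ground plane stored as a single equation can be tested against a sphere with one dot product, with no triangle mesh involved.

Code:
#include <cmath>

struct Vec3 { float x, y, z; };

float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Plane in the form dot(n, p) + d = 0, with unit-length normal n.
struct Plane { Vec3 n; float d; };

struct Sphere { Vec3 center; float radius; };

// Collision if the sphere centre is within one radius of the plane:
// a single signed-distance evaluation instead of many triangle tests.
bool SpherePlaneCollision(const Sphere& s, const Plane& p)
{
    float dist = Dot(p.n, s.center) + p.d;
    return std::fabs(dist) <= s.radius;
}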
 
Simon F said:
bloodbob said:
In my mind the tessellation of landscapes should be done on the CPU because, of course, there are a lot of calculations, mainly collision detection, that need the geometric data anyway.
AFAICS that's not necessarily the case.
If you are only testing for a small number of collisions (relative to the number of polygons in the geometry), then it's probably much cheaper to only tessellate the regions local to each likely collision. You can also then take advantage of a natural hierarchy to get nested bounding boxes to avoid some tessellation as well.

Ahhh too true.

Chalnoth said:
I don't think collision detection should be done with high-res triangle meshes anyway. I would typically expect that either lower-resolution meshes or analytical representations of meshes would be much more efficient.
Analytical methods might be nice but could take a fair bit of rework of existing engines. The sort of thing I was thinking of is stuff like NURBS for outdoor areas.
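
For a rough idea of what that could look like, here is a simplified stand-in for NURBS (a single bicubic Bezier height patch, with everything invented for the example): the surface height at (u, v) is evaluated analytically, so a collision query never needs tessellated triangles.

Code:
// De Casteljau evaluation of a cubic in one parameter.
static float Cubic(const float c[4], float t)
{
    float a = c[0] + t * (c[1] - c[0]);
    float b = c[1] + t * (c[2] - c[1]);
    float d = c[2] + t * (c[3] - c[2]);
    float e = a + t * (b - a);
    float f = b + t * (d - b);
    return e + t * (f - e);
}

// Height of the patch at (u, v), from a 4x4 grid of control heights.
float PatchHeight(const float ctrl[4][4], float u, float v)
{
    float rows[4];
    for (int i = 0; i < 4; ++i)
        rows[i] = Cubic(ctrl[i], v);   // reduce each row of controls in v
    return Cubic(rows, u);             // then reduce across the rows in u
}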
 
After all, the interesting tessellation algorithms (such as subdivision surfaces) require lots of branchy code with random memory access, something that is better suited to a general CPU than a GPU.
Any (non-pathological) subdivision surface algorithm can be expressed as a set of linear operators in a common vector space, that is, a set of matrices applied to a vector.
Once vertex valence is bounded it's possible to fully tessellate a mesh via subdivision surfaces without taking a single branch, so these kinds of subdivision schemes are very well suited to GPU-like stream processing.

Addendum:
Regarding random accesses to memory, they can be reduced to fewer than 10 per subdivided mesh ;) so even random memory accesses are not a problem.

ciao,
Marco
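
To illustrate the kind of branch-free, matrix-like subdivision Marco is describing, here is a minimal sketch, using a curve scheme rather than a surface scheme to keep it short (everything in it is invented for the example): one step of uniform cubic B-spline subdivision on a closed control polygon. Every new point is a fixed weighted average of old points, i.e. a sparse matrix applied to the vertex vector, with no branches in the inner loop.

Code:
#include <vector>

struct Vec3 { float x, y, z; };

Vec3 operator*(float s, const Vec3& v) { return { s * v.x, s * v.y, s * v.z }; }
Vec3 operator+(const Vec3& a, const Vec3& b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }

// One subdivision step: n old control points -> 2n new points.
std::vector<Vec3> Subdivide(const std::vector<Vec3>& p)
{
    const size_t n = p.size();
    std::vector<Vec3> q(2 * n);
    for (size_t i = 0; i < n; ++i) {
        const Vec3& prev = p[(i + n - 1) % n];
        const Vec3& cur  = p[i];
        const Vec3& next = p[(i + 1) % n];
        q[2 * i]     = 0.125f * prev + 0.75f * cur + 0.125f * next; // "vertex" point
        q[2 * i + 1] = 0.5f * cur + 0.5f * next;                    // "edge" point
    }
    return q;
}

The surface case (e.g. Loop or Catmull-Clark on a mesh with bounded valence) works the same way, just with a larger, still fixed, set of weights per vertex.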
 