I just caught this from the updated CineFX whitepaper about how the NV3x can perform skinning:
Under DirectX 9, this improves somewhat. One shader can now be written to represent the example involving up to four bones, but since DirectX 9 only supports a very limited notion of branching, it can only be performed per object, which means that the model must still be broken up and drawn separately.
The NVIDIA “CineFX” architecture, with its fully generalized loops and branches that can be data-dependent, has a much more straightforward programming methodology. One shader is written to encompass all the skinning methods and operations, and since the shader can branch on a per-vertex basis, it is not required to break up the model. By performing the loop conditionally on a per-vertex basis, segmentation of the model is not necessary, dramatically improving both application performance as well as developer productivity.
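To make the per-vertex loop idea concrete, here is a rough CPU-side sketch in Python of linear blend skinning where each vertex carries its own list of bone influences. This is not shader code and not taken from the whitepaper; the names (`skin_vertex`, `skin_mesh`, the 3x4 matrix layout) are my own assumptions for illustration. The point is only that the loop length comes from the vertex data, so a mesh mixing one-bone and two-bone vertices goes through in a single pass.

```python
# Hypothetical sketch of per-vertex linear blend skinning.
# All names and the matrix layout are illustrative assumptions.

def transform_point(matrix, point):
    """Apply a 3x4 row-major affine matrix (rotation + translation) to a 3D point."""
    x, y, z = point
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix)


def skin_vertex(position, influences, bone_matrices):
    """Blend a vertex by however many bones influence it.

    influences: list of (bone_index, weight) pairs, weights summing to 1.
    The loop length depends on this vertex's own data, which mirrors the
    per-vertex, data-dependent loop described in the quoted passage.
    """
    out = [0.0, 0.0, 0.0]
    for bone_index, weight in influences:
        moved = transform_point(bone_matrices[bone_index], position)
        for axis in range(3):
            out[axis] += weight * moved[axis]
    return tuple(out)


def skin_mesh(vertices, bone_matrices):
    """Skin a whole mesh in one pass, without splitting it by bone count."""
    return [skin_vertex(pos, infl, bone_matrices) for pos, infl in vertices]


# Two example bones: one leaves points alone, one shifts them 2 units along x.
IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
SHIFT_X2 = [[1, 0, 0, 2], [0, 1, 0, 0], [0, 0, 1, 0]]

mesh = [
    ((1.0, 0.0, 0.0), [(0, 1.0)]),            # vertex bound to one bone
    ((1.0, 0.0, 0.0), [(0, 0.5), (1, 0.5)]),  # vertex bound to two bones
]
skinned = skin_mesh(mesh, [IDENTITY, SHIFT_X2])
```

Without a data-dependent loop, the one-bone and two-bone vertices above would have to go into separate batches, each drawn with its own fixed-count shader, which is the model segmentation the whitepaper is talking about.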
Hmmm, will it be practical to do skinning with shaders in games? I thought that this was done on the CPU by the game application...