GeForceFX and displacement mapping?

Another problem with combining matrix palette skinning with adaptive tessellation (or any tessellation, for that matter) is how to "interpolate" the matrix indices.

It's possible to solve this with an abundance of indices, but that's inefficient.
 
Basic said:
Another problem with combining matrix palette skinning with adaptive tessellation (or any tessellation, for that matter) is how to "interpolate" the matrix indices.

It's possible to solve this with an abundance of indices, but that's inefficient.

One way: if you know that the indices are to be used for matrix palette skinning only, you could, for the polygon, collect all the distinct indices needed for all of its vertices and then linearly interpolate the weights of each unique index. In the worst case, if you allow N unique indices for each vertex, the vertex shader needs to look at 3N unique {index, weight} pairs within a triangle. I suspect, though, that very few models require more than, say, 4 unique indices for any given polygon, so restricting artists to 4 unique indices per polygon seems doable.
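To make that collection step concrete, here is a rough CPU-side sketch, assuming at most 4 {index, weight} pairs per vertex; the SkinVertex/TrianglePalette names are made up for illustration and aren't part of any API:

[code]
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical per-vertex skinning data: up to 4 {matrix index, weight} pairs.
struct SkinVertex {
    std::array<uint8_t, 4> index;   // palette indices
    std::array<float, 4>   weight;  // 0.0f marks an unused slot
};

// For one triangle, gather the distinct palette indices used by its three
// vertices and rewrite each vertex's weights into slots that line up with
// that shared list. Generated (tessellated) vertices can then interpolate
// the weights linearly without ever interpolating an index.
struct TrianglePalette {
    std::vector<uint8_t>              indices;        // distinct indices, worst case 3*4
    std::array<std::vector<float>, 3> alignedWeights; // one weight per distinct index, per corner
};

TrianglePalette buildTrianglePalette(const SkinVertex v[3]) {
    TrianglePalette out;
    for (int c = 0; c < 3; ++c)
        for (int s = 0; s < 4; ++s)
            if (v[c].weight[s] > 0.0f) {
                bool known = false;
                for (uint8_t idx : out.indices) known |= (idx == v[c].index[s]);
                if (!known) out.indices.push_back(v[c].index[s]);
            }
    for (int c = 0; c < 3; ++c) {
        // Weight 0 for any matrix this corner doesn't reference.
        out.alignedWeights[c].assign(out.indices.size(), 0.0f);
        for (int s = 0; s < 4; ++s)
            for (std::size_t k = 0; k < out.indices.size(); ++k)
                if (v[c].weight[s] > 0.0f && v[c].index[s] == out.indices[k])
                    out.alignedWeights[c][k] = v[c].weight[s];
    }
    return out;
}
[/code]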
 
arjan de lumens said:
Also, a question (to all): Barring the case of amplified/on-the-fly generated geometry, is there anything that can be done with VS3.0 that cannot be done with, say, PS3.0 and render-to-vertex-array?

Could you define "render to vertex array" and how it differs from other mechanisms like texture reading in the vertex shader (possibly linked to render-to-texture using the pixel shader in a previous pass), or output to memory from the vertex shader?

Does Render to Vertex Array come with the disadvantage of a one-to-one mapping between the vertex array you rendered to and the pass that uses the result of the render to vertex array?

In short, just define render to vertex array :)

K-
 
ERP said:
At the risk of offending every Matrox fan on the board:

Matrox-style displacement mapping is really nothing more than a stopgap solution; the right way to do it is texture reads in the vertex shader (as in VS3.0).

Have you seen any others besides me? ;)
As a matter of fact (going a bit off topic, but I'll continue), have you seen hardcore Matrox fans lately? :)

And before you start with MURC: I don't think you can call people who are publicly planning to sue Matrox over the banding artifacts on Parhelia very hardcore fans. (That might just be my opinion, though, but I haven't seen that kind of action on nVnews or Rage3D.)

So I doubt there's anyone on the board whom your comment might send down the trail of HellBinder (when talking about ATI) or Chalnoth (when talking about nVidia), or me (unfortunately I have, or at least had, a weak point too, but it isn't Matrox).


And in the end, ever since the first DirectX came out, all versions and their supported features have been more or less stopgaps; at least I haven't seen many things that stayed the same as in previous versions. ;)
 
Kristof said:
Could you define "render to vertex array" and how it differs from other mechanisms like texture reading in the vertex shader (possibly linked to render-to-texture using the pixel shader in a previous pass), or output to memory from the vertex shader?

Does Render to Vertex Array come with the disadvantage of a one-to-one mapping between the vertex array you rendered to and the pass that uses the result of the render to vertex array?

In short, just define render to vertex array :)

K-
OK... you render to a standard pbuffer/offscreen color buffer using the usual rendering pipelines/pixel shaders. Then you point the vertex shader at the buffer and tell it to interpret the data in the buffer as a vertex data stream. It's conceptually rather similar to 'render-to-texture', although the outcome is obviously very different. This technique obviously comes with the issue/disadvantage that there must be a 1:1 correspondence between the 'pixels' in the buffer and the vertices that are sent to the vertex shader. It is, of course, an advantage if the pixel shader supports a floating-point pixel format.

It seems from the discussion here that P10 and NV30 (no word about R300) are both just programmable enough to use render-to-vertex-array to do displacement mapping.
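None of the hardware mentioned here exposed this directly at the time, but as a sketch of the data flow, this is roughly how the idea was later expressed with GL buffer objects; treat it as an illustration of the concept, not of what any of these drivers actually do:

[code]
#include <GL/glew.h>

// Sketch: copy a float color buffer (already rendered by the pixel shader)
// into a buffer object, then rebind that same buffer as a vertex stream.
// One RGBA texel per vertex, i.e. the 1:1 mapping mentioned above.
// 'buf' is assumed to be a buffer object created earlier with glGenBuffers.
void renderToVertexArray(GLuint buf, int width, int height)
{
    // 1. Read the rendered pixels straight into the buffer object.
    glBindBuffer(GL_PIXEL_PACK_BUFFER, buf);
    glBufferData(GL_PIXEL_PACK_BUFFER,
                 width * height * 4 * sizeof(GLfloat), nullptr, GL_DYNAMIC_COPY);
    glReadPixels(0, 0, width, height, GL_RGBA, GL_FLOAT, nullptr); // offset 0 into buf
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);

    // 2. Reinterpret the same memory as per-vertex positions and draw.
    glBindBuffer(GL_ARRAY_BUFFER, buf);
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(4, GL_FLOAT, 0, nullptr);
    glDrawArrays(GL_POINTS, 0, width * height);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
}
[/code]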
 
Well, the main problem I have with such a buffer is how you tell the pixel shader to act this way. You need to move the pixel shader from processing pixels to processing what are, in essence, vertices. This means you need a special mode, or you need to generate "triangles" (since that is what the PS traditionally processes) that match up with the positions in the pbuffer, and that most likely means generating one-pixel triangles, which are very inefficient... unless, as said, you have a special mode, but even then it gets tricky, since the PS pipelines tend to depend on processing 2x2 areas of a triangle...

I am also not fully convinced about using pixel shader resources to do things the vertex shader should do. We are moving towards similar capabilities for the PS and VS: they have the same capabilities, but one processes pixels and the other vertices. What you seem to propose is to link the PS and VS for "some" processing, so that texture accesses can be handled by the PS, which for a moment becomes part of the VS?

I guess I prefer the concept of sampling the data, where a texture can be generated (any way you see fit) and this can then be sampled by the vertex shader. Essentially a form of data store that can be read by the vertex shader and read/written by the pixel shader. The difference with a vertex array is that a vertex array gets stuck with the odd 1:1 mapping, while a texture doesn't necessarily (although some techniques might still require this).

K-
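For what it's worth, a minimal CPU-side sketch of the "sample a texture in the vertex shader" idea applied to displacement mapping; the DispMap/float3 types, the bilinear filter, and the displace-along-normal step are illustrative assumptions, not any particular hardware's behaviour:

[code]
#include <algorithm>
#include <cmath>

// CPU-side stand-in for what a vertex shader with texture access would do:
// bilinearly sample a displacement map and push the vertex along its normal.
struct float3 { float x, y, z; };

struct DispMap {
    const float* texels;   // one scalar displacement per texel
    int width, height;
    float fetch(int x, int y) const {
        x = std::clamp(x, 0, width - 1);
        y = std::clamp(y, 0, height - 1);
        return texels[y * width + x];
    }
};

float sampleBilinear(const DispMap& map, float u, float v) {
    float fx = u * (map.width - 1),  fy = v * (map.height - 1);
    int   x0 = (int)fx,              y0 = (int)fy;
    float tx = fx - x0,              ty = fy - y0;
    float a = map.fetch(x0, y0),     b = map.fetch(x0 + 1, y0);
    float c = map.fetch(x0, y0 + 1), d = map.fetch(x0 + 1, y0 + 1);
    return (a * (1 - tx) + b * tx) * (1 - ty) + (c * (1 - tx) + d * tx) * ty;
}

float3 displaceVertex(float3 pos, float3 normal, float u, float v,
                      const DispMap& map, float scale) {
    float h = sampleBilinear(map, u, v) * scale;
    return { pos.x + normal.x * h, pos.y + normal.y * h, pos.z + normal.z * h };
}
[/code]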
 
If it can render to an offscreen float buffer it can do render to vertex array. It's just a matter of getting the driver to accept the rendered texture as a vertex buffer. So it shouldn't be a problem for the R300.
 
Humus said:
If it can render to an offscreen float buffer it can do render to vertex array. It's just a matter of getting the driver to accept the rendered texture as a vertex buffer. So it shouldn't be a problem for the R300.

One issue, though, might be the data issue rate, assuming this is still linked to N-Patch tessellation. What happens to a secondary data stream if tessellation is enabled? Does the hardware try to tessellate it as well?

I mean, in essence we have one stream, which contains triangles, but this stream is transformed into a tessellated stream of triangles. So one N-Patch triangle goes in and X triangles come out of the tessellator. Now what happens to the secondary stream, which comes from the render to vertex array? If you cannot link the tessellated input data with the non-tessellated data stream from the array, then things don't work? In short, you don't want the array stream to get interpolated by the HOS... I think... can that be done?

If you're just looking at this as some kind of "store" to video memory of the vertex shader output, with no tessellation (e.g. for faster multipass rendering), then it should be possible...

K-
 
arjan:
I think what you described was the "abundance of indices" I was thinking of. But if the base triangle is part of a mesh, then you'd need to add empty index slots for the indices that the neighbouring triangles use too. The empty slots could often be reused, but not always.
So it would need even more than 3N slots.

As a simple example, think of a tetrahedron with max 1 matrix per vertex. (Not much need for matrix palette skinning here, but I'm just trying to keep it simple.) It would need 4 indices per vertex, just to make the position of the indices in the interpolated vector right.

Kristof:
I don't know the answer, but I have a strong feeling about what it might be.
I assume that it will be like a normal render to (high precision) texture. And then when you're finished you simply say "No no, this ain't a texture, it's a vertex array". You'd of course need to know in what order the "texels" are stored in the texture, so you can index them.
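The ordering contract can be as simple as a row-major mapping that both the pixel pass and the vertex fetch agree on; a tiny sketch, with purely illustrative names:

[code]
// Vertex i lives at texel (i % width, i / width), assuming a simple
// row-major layout. The pixel shader pass must write vertex i's data at
// exactly this texel for the later vertex fetch to pick it up.
struct Texel { int x, y; };

inline Texel vertexToTexel(int vertexIndex, int width) {
    return { vertexIndex % width, vertexIndex / width };
}

inline int texelToVertex(int x, int y, int width) {
    return y * width + x;
}
[/code]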
 
Kristof said:
Well, the main problem I have with such a buffer is how you tell the pixel shader to act this way. You need to move the pixel shader from processing pixels to processing what are, in essence, vertices. This means you need a special mode, or you need to generate "triangles" (since that is what the PS traditionally processes) that match up with the positions in the pbuffer, and that most likely means generating one-pixel triangles, which are very inefficient... unless, as said, you have a special mode, but even then it gets tricky, since the PS pipelines tend to depend on processing 2x2 areas of a triangle...

I am also not fully convinced about using pixel shader resources to do things the vertex shader should do. We are moving towards similar capabilities for the PS and VS: they have the same capabilities, but one processes pixels and the other vertices. What you seem to propose is to link the PS and VS for "some" processing, so that texture accesses can be handled by the PS, which for a moment becomes part of the VS?

K-

If you have a triangle mesh corresponding to, say, an N-patch or Bezier patch, you should be able to line the vertices up in the pbuffer in such a way that the pixel shader can render to all of them just as it would render to any old triangle or quad, with no particular need for 1-pixel triangles to get the work done. This would be a rather efficient way to interpolate data across the mesh or apply a texture for, say, displacement mapping. The only special feature needed is that the pbuffer must have a linear storage layout, or else accessing it as a vertex array gets a bit goofy.

Of course, if you want to generate multiple pieces of data per vertex, you will need to set up one pbuffer for each piece of vertex data; you can still use 'multiple render targets' to fill multiple pbuffers at the same time.
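The "one pbuffer per piece of vertex data" part maps naturally onto multiple render targets. Framebuffer objects and glDrawBuffers postdate this discussion, so take this purely as a sketch of the idea (texture creation omitted, names illustrative):

[code]
#include <GL/glew.h>

// Attach one float texture per vertex attribute (e.g. position and normal)
// and let a single pass write both via multiple render targets.
// 'posTex' and 'nrmTex' are assumed to be float RGBA textures of the
// pbuffer's size; creating them is omitted here.
void bindVertexDataTargets(GLuint fbo, GLuint posTex, GLuint nrmTex) {
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, posTex, 0);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT1,
                           GL_TEXTURE_2D, nrmTex, 0);
    const GLenum targets[] = { GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1 };
    glDrawBuffers(2, targets);   // the fragment shader writes to both outputs
}
[/code]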

And as for using pixel shaders as vertex shaders in general: I do believe that their hardware will converge to the point that the functional units are shared between them, with only the support logic operating differently between the vertex and pixel shading modes.
 
Basic said:
arjan:
I think what you described was the "abundance of indices" I was thinking of. But if the base triangle is part of a mesh, then you'd need to add empty index slots for the indices that the neighbouring triangles use too. The empty slots could often be reused, but not always.
So it would need even more than 3N slots.
I don't see why: on vertices that lie on the edge between two control vertices you would only need to fill slots corresponding to the 2 vertices, as the weights of the other vertices in the 2 original triangles would compute to exactly 0 on that edge - so you would need 2N slots on the edge and 3N slots elsewhere. You do, of course, get a problem if the edge is a crease (that is, the normal vector is discontinuous on the edge), but that is a well-known problem with e.g. N-patches in any case.
As a simple example, think of a tetrahedron with max 1 matrix per vertex. (Not much need for matrix palette skinning here, but I'm just trying to keep it simple.) It would need 4 indices per vertex, just to make the position of the indices in the interpolated vector right.
Umm, no - wouldn't you use normal vectors for that, to control the tessellation? Ummm - OK, now I see the problem: before you tessellate, you need to transform the normal vector using the matrix palette, which means you need to do a lot of work both before and after tessellation. Still nothing unsolvable, but we're moving fast towards the shade-tessellate-shade-again scheme that gking suggested here.
 
Maybe I see the difference in how we're thinking.

I wanted the base mesh (control points) to be designed so it had good T&L and cache efficiency even when not using it for N-patches. So each vertex should be one entity, and that includes the matrix indices and weights. And each vertex must carry matrix indices and weights for all vertices of all triangles that use that vertex.

You can't reuse a slot in vertex A so that it's one index in triangle ABC and another index in triangle ADE, even if the weight at vertex A is 0, because the index is also interpolated and will have some messed-up value in the middle of the base triangle.


But maybe you didn't think of it that way at all.
It will work as you said if you make the base mesh a triangle list and duplicate vertices around the areas where you switch between different matrices. The duplicated vertices would have the same position/normal/... but different matrix_index_and_weight vectors.


The "transform normal before tessellation" issue is indeed a problem (that I didn't think of). But it should already be solved, since it's a problem even without matrix palette skinning. It's probably solved by first T&L'ing the control points, tessellating, and then T&L'ing the tessellated values.
 
"Presampled displacement mapping" basically applies geometry decimation to a high-res mesh, and creates a low-res N-patch mesh (with an assumed static tesselation level in the app) and a stream of vertex displacement values. There should be a tool that ships in D3DX that will generate these presampled DM meshes.

Kristof:
IIRC this is why Matrox is actually proposing to do the skinning on the CPU, on the low-tessellation model

That's effectively what I proposed, only I was recommending that the skinning be done on the GPU, rather than the CPU.
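Roughly what such a presampling tool has to compute per sample point, sketched under the assumption that a projection step onto the high-res mesh is available (projectToHighRes below is hypothetical, standing in for the real decimation/ray-cast logic):

[code]
#include <functional>
#include <vector>

struct float3 { float x, y, z; };

// One scalar displacement per (statically tessellated) sample point on the
// low-res surface: the signed distance along the interpolated normal to the
// corresponding point on the high-res mesh.
std::vector<float> presampleDisplacements(
    const std::vector<float3>& samplePos,
    const std::vector<float3>& sampleNrm,   // assumed unit length
    const std::function<float3(const float3&, const float3&)>& projectToHighRes) {
    std::vector<float> disp(samplePos.size());
    for (std::size_t i = 0; i < samplePos.size(); ++i) {
        float3 hit = projectToHighRes(samplePos[i], sampleNrm[i]);
        float3 d = { hit.x - samplePos[i].x,
                     hit.y - samplePos[i].y,
                     hit.z - samplePos[i].z };
        // Positive displacement pushes the surface outward along the normal.
        disp[i] = d.x * sampleNrm[i].x + d.y * sampleNrm[i].y + d.z * sampleNrm[i].z;
    }
    return disp;
}
[/code]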
 
Hmm... indeed, maybe a transform-tessellate-light pipeline would be the most universal solution?
The light stage and fragment shaders could even be unified, given that the tessellator is guaranteed to output fragment-sized polygons... er, micropolygons. Now, where is this all heading? 8)
 
no_way said:
Hmm... indeed, maybe a transform-tessellate-light pipeline would be the most universal solution?
The light stage and fragment shaders could even be unified, given that the tessellator is guaranteed to output fragment-sized polygons... er, micropolygons. Now, where is this all heading? 8)

No way: does "säteen seuranta" ring a bell? ;)

To the others: it's ray tracing in English.
 
I don't think that matrix skinning will be good enough by the time we move to displacement-mapped HOS characters. Some kind of muscle simulation will be a must to get characters that deform correctly. After all, skin does not rotate with the bones in real life; it's pushed and pulled and slides over the muscles and fat under it.
I'd say that it is impossible, no matter how long an artist tweaks the weighting, to get realistic deformations without better tools...
 
That's why movie studios use blend shapes and displacement maps to handle character animation, in addition to conventional skinning.

Nappe1 said:
No way: does "säteen seuranta" ring a bell?

To the others: it's ray tracing in English.

Actually, transform, tessellate, light, and micropolygons sounds like REYES and RenderMan, which could be ray traced (a la BMRT), scan-line rendered (a la PRMan 3), or use a hybrid approach (a la Entropy and newer versions of PRMan).
 
Basic said:
Maybe I see the difference in how we're thinking.

I wanted the base mesh (control points) to be designed so it had good T&L and cache efficiency even when not using it for N-patches. So each vertex should be one entity, and that includes the matrix indices and weights. And each vertex must carry matrix indices and weights for all vertices of all triangles that use that vertex.

You can't reuse a slot in vertex A so that it's one index in triangle ABC and another index in triangle ADE, even if the weight at vertex A is 0, because the index is also interpolated and will have some messed-up value in the middle of the base triangle.
Ummm, my whole idea was a scheme to avoid interpolating the indices at all. Consider, for example, a triangle/N-patch where the matrix indices at each vertex/control point are, say {M0, M1, M2}, {M2, M3}, {M1, M3, M4}. In this case, you would have 5 distinct matrix indices for the triangle/N-patch - {M0, M1, M2, M3, M4}, and thus, for a generated vertex within the N-patch, you need 5 slots to collect the 5 distinct matrix indices. For their weights, you determine the weight at each control point of each of the 5 matrices (inserting weight = 0 if a matrix isn't listed for a control point) and interpolate them linearly across the N-patch. This scheme should work just the same regardless of the order in which the vertices/control points are laid out in/fetched from memory.
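A tiny numeric sketch of exactly that scheme, using the example index sets above; the concrete weight values at the control points are made up for illustration:

[code]
#include <cstdio>

// Weights aligned to the shared index list {M0, M1, M2, M3, M4}; a zero
// means that matrix isn't referenced at that control point.
const int   kNumSlots = 5;
const float cpWeights[3][kNumSlots] = {
    { 0.2f, 0.5f, 0.3f, 0.0f, 0.0f },   // control point 0: {M0, M1, M2}
    { 0.0f, 0.0f, 0.6f, 0.4f, 0.0f },   // control point 1: {M2, M3}
    { 0.0f, 0.3f, 0.0f, 0.3f, 0.4f },   // control point 2: {M1, M3, M4}
};

// For a generated vertex at barycentric coordinates (b0, b1, b2), only the
// weights are interpolated; the index list stays constant across the patch.
void interpolateWeights(float b0, float b1, float b2, float out[kNumSlots]) {
    for (int k = 0; k < kNumSlots; ++k)
        out[k] = b0 * cpWeights[0][k] + b1 * cpWeights[1][k] + b2 * cpWeights[2][k];
}

int main() {
    float w[kNumSlots];
    interpolateWeights(1.0f / 3, 1.0f / 3, 1.0f / 3, w);   // patch centre
    for (int k = 0; k < kNumSlots; ++k) printf("M%d: %.3f\n", k, w[k]);
}
[/code]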

The "transform normal before tessellation" issue is indeed a problem (that I didn't think of). But it should already be solved, since it's a problem even without matrix palette skinning. It's probably solved by first T&L'ing the control points, tessellating, and then T&L'ing the tessellated values.
Actually, when I think of it, I suspect that nothing at all should be transformed before (N-patch) tessellation, even with matrix palette skinning.
 
Reading through the discussion and thinking about this some more...

Once you have matrix/texture access in the vertex shader, the only restriction you need to remove to be able to do tessellation in the vertex shader is the 1-input -> 1-output restriction. The cost of doing this is that it can make exploiting the natural parallelism much more complex (you end up with an almost general-purpose CPU optimised for vector ops). And perhaps that will be enough to prevent solutions of this nature.

The more I think about it the more I think that adding yet another programmable unit into the pipeline moves us further away from a general solution and does more to limit flexibility.

Hmmm...

Edit -- Thinking about this, there is nothing stopping you from implementing parallelism on blocks of vertices using this approach; however, you'd have the same issue as using both VUs on the PS2: it's difficult to guarantee primitive submission order -- /Edit
 
gking said:
That's why movie studios use blend shapes and displacement maps to handle character animation, in addition to conventional skinning.

That's kind of a last resort, as it is very time-consuming to create all the blend shapes/textures and it is also hideously hard to get right. For example, the shoulder and hip areas have a very large freedom of rotation around most of their axes. Also, the extra blend shapes and textures can consume a lot of memory that should be spent on the characters.

Smaller studios stuck with off-the-shelf software generally use matrix skinning combined with other deformers (lattices, clusters, etc.) to get skinning right; however, it is also common to have 2 or even more versions of a character rig for different motions. Simple blending has several big problems: it does not preserve volume and cannot imitate skin sliding over the underlying tissue.

Movie studios have developed muscle-based skinning tools to solve this problem. ILM did it for The Mummy, Weta did it for LOTR, and DreamWorks did it for Shrek. There's already a commercial tool called ACT available as well, for 3ds max (www.cgcharacter.com). It also takes some time to set up properly (unfortunately I don't yet have any experience with such tools), but the results usually do not require any further patching and work in almost every case.
The idea is actually quite simple: you need a system of muscle volumes that are attached to the skeleton and deform with it (you can also add dynamics if you want to). Then you 'wrap' the skin around it and you're ready (theoretically ;).

As real-time game art grows in complexity, I'm quite sure that such advanced skinning methods will have to be implemented. You can already see the problems I've mentioned in games like Silent Hill 3 or even DOA3...
 