The appealing thing about this model is how data-driven and freeform it is. The mesh shader pipeline has very relaxed expectations about the shape of your data and the kinds of things you’re going to do. Everything’s up to the programmer: you can pull the vertex and index data from buffers, generate them algorithmically, or any combination.
At the same time, the mesh shader model sidesteps the issues that hampered geometry shaders, by explicitly embracing SIMD execution (in the form of the compute “work group” abstraction). Instead of each shader thread generating geometry on its own, which leads to divergence and large input/output data sizes, we have the whole work group outputting a meshlet cooperatively. This means we can use compute-style tricks, like: first do some work on the vertices in parallel, then have a barrier, then work on the triangles in parallel. It also means the input/output bandwidth needs are a lot more reasonable. And, because meshlets are indexed triangle lists, they don’t break vertex reuse, as geometry shaders often did.
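To make the "vertices in parallel, barrier, triangles in parallel" pattern concrete, here is a minimal CPU-side sketch in C++. It is not shader code: each loop iteration stands in for one lane of the work group, and the gap between the two loops stands in for the barrier. The `Meshlet` layout, the 2D positions, and the function names are all assumptions made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec2 { float x, y; };
using Triangle = std::array<std::uint8_t, 3>;  // meshlet-local vertex indices

// Hypothetical meshlet layout, for illustration only (2D positions for brevity).
struct Meshlet {
    std::vector<Vec2> vertices;
    std::vector<Triangle> triangles;
};

// Twice the signed area of triangle abc; positive means counter-clockwise.
float signedArea2(Vec2 a, Vec2 b, Vec2 c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// CPU stand-in for one work group cooperatively processing one meshlet.
Meshlet processMeshlet(const Meshlet& in, float scale, Vec2 offset) {
    Meshlet out;
    out.vertices.resize(in.vertices.size());

    // Phase 1: one lane per vertex. Every lane writes only its own slot,
    // so these iterations are independent of each other.
    for (std::size_t lane = 0; lane < in.vertices.size(); ++lane) {
        const Vec2 v = in.vertices[lane];
        out.vertices[lane] = {v.x * scale + offset.x, v.y * scale + offset.y};
    }

    // On a GPU, a work group barrier would go here, so that every
    // transformed vertex is visible before any lane starts on triangles.

    // Phase 2: one lane per triangle. Each lane reads the shared vertex
    // results and keeps only counter-clockwise (front-facing) triangles.
    // A real shader would compact the survivors with an atomic counter or
    // a prefix sum rather than a sequential push_back.
    for (std::size_t lane = 0; lane < in.triangles.size(); ++lane) {
        const Triangle t = in.triangles[lane];
        if (signedArea2(out.vertices[t[0]], out.vertices[t[1]],
                        out.vertices[t[2]]) > 0.0f) {
            out.triangles.push_back(t);
        }
    }
    return out;
}
```

Note that the triangles keep their indices into the shared vertex list, which is the vertex-reuse property mentioned above: each vertex is transformed once no matter how many triangles reference it.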
....
It’s great that mesh shaders can subsume our current geometry tasks, and in some cases make them more efficient. But mesh shaders also open up possibilities for new kinds of geometry processing that wouldn’t have been feasible on the GPU before, or would have required expensive compute pre-passes storing data out to memory and then reading it back in through the traditional geometry pipeline.
With our meshes already in meshlet form, we can do finer-grained culling at the meshlet level, and even at the triangle level within each meshlet. With task shaders, we can potentially do mesh LOD selection on the GPU, and if we want to get fancy we could even try dynamically packing together very small draws (from coarse LODs) to get better meshlet utilization.
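As a rough illustration of meshlet-level culling, the following C++ sketch tests a bounding sphere per meshlet against a set of frustum planes and returns the indices of the meshlets that survive. It runs on the CPU; in a real pipeline this is the kind of test a task shader might run before launching mesh shader work groups. The bounding-sphere representation, the plane convention, and all names here are assumptions for the sake of the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Plane with unit normal n; a point p is on the inside when dot(n, p) + d >= 0.
struct Plane { Vec3 n; float d; };

// Hypothetical per-meshlet bounds, assumed to be precomputed offline.
struct MeshletBounds { Vec3 center; float radius; };

// True when the sphere lies entirely on the outside of the plane.
bool sphereOutsidePlane(const MeshletBounds& b, const Plane& p) {
    return dot(p.n, b.center) + p.d < -b.radius;
}

// Returns the indices of meshlets whose bounding sphere is not fully
// outside any frustum plane. The test is conservative: a sphere near a
// frustum corner can pass even though it is not actually visible.
std::vector<std::uint32_t> cullMeshlets(const std::vector<MeshletBounds>& bounds,
                                        const std::vector<Plane>& frustum) {
    std::vector<std::uint32_t> visible;
    for (std::size_t i = 0; i < bounds.size(); ++i) {
        bool culled = false;
        for (const Plane& p : frustum) {
            if (sphereOutsidePlane(bounds[i], p)) {
                culled = true;
                break;
            }
        }
        if (!culled) {
            visible.push_back(static_cast<std::uint32_t>(i));
        }
    }
    return visible;
}
```

The same structure extends to other per-meshlet tests, such as a normal cone for back-face culling or an occlusion test, by adding further rejection checks inside the loop.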
In place of tile-based forward lighting, or as an extension to it, it might be useful to cull lights (and projected decals, etc.) per meshlet, assuming there’s a good way to pass the variable-size light list from a mesh shader down to the fragment shader. (This suggestion comes from Seb Aaltonen.)
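Here is one way the per-meshlet light lists could be laid out, sketched on the CPU in C++. It borrows the offset-plus-count scheme commonly used for tiled and clustered light lists: every meshlet gets a range into one flat index buffer. Whether that is a good way to hand the list from a mesh shader to the fragment shader is exactly the open question above, so treat the layout, the sphere-versus-sphere test, and all the names as assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Used both for meshlet bounds and for a point light's radius of influence.
struct Sphere { Vec3 center; float radius; };

// Range of entries in the flat light index buffer for one meshlet.
struct LightRange { std::uint32_t offset; std::uint32_t count; };

struct MeshletLightLists {
    std::vector<LightRange> ranges;         // one entry per meshlet
    std::vector<std::uint32_t> lightIndices;  // all lists, back to back
};

// Two spheres overlap when the distance between their centers is at most
// the sum of their radii (compared in squared form to avoid a sqrt).
bool spheresOverlap(const Sphere& a, const Sphere& b) {
    const float dx = a.center.x - b.center.x;
    const float dy = a.center.y - b.center.y;
    const float dz = a.center.z - b.center.z;
    const float r = a.radius + b.radius;
    return dx * dx + dy * dy + dz * dz <= r * r;
}

// Builds a variable-size light list for every meshlet.
MeshletLightLists buildLightLists(const std::vector<Sphere>& meshlets,
                                  const std::vector<Sphere>& lights) {
    MeshletLightLists out;
    out.ranges.reserve(meshlets.size());
    for (const Sphere& m : meshlets) {
        const std::uint32_t offset =
            static_cast<std::uint32_t>(out.lightIndices.size());
        for (std::size_t i = 0; i < lights.size(); ++i) {
            if (spheresOverlap(m, lights[i])) {
                out.lightIndices.push_back(static_cast<std::uint32_t>(i));
            }
        }
        const std::uint32_t count =
            static_cast<std::uint32_t>(out.lightIndices.size()) - offset;
        out.ranges.push_back(LightRange{offset, count});
    }
    return out;
}
```

With this layout, only the fixed-size `offset` and `count` pair would need to travel with each meshlet; the fragment shader would then loop over that range of `lightIndices` in a buffer.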