I thought strips were the "naive" approach to vertex ordering? I still don't understand why vertex count matters, though. Isn't it the ordering, rather than the count, that dictates cache efficiency? At least that's what I got from Forsyth's paper:
http://home.comcast.net/~tom_forsyth/papers/fast_vert_cache_opt.html
What is the order of the vertices coming out of TS? Seems likely to be strips ("rasterised" as it were).
If the PTVC (post-transform vertex cache) holds 32 vertices but a patch produces hundreds of vertices, what then are the chances of having to re-process the patch multiple times?
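To put a rough number on that question, here is a minimal sketch in Python. It assumes the tessellator emits a quad patch as a regular grid, row by row (strip-like), and that the PTVC is a plain 32-entry FIFO. Both are assumptions for illustration, not documented hardware behaviour, and `grid_strip_triangles` and `acmr` are hypothetical helper names.

```python
from collections import deque

def grid_strip_triangles(n):
    """Triangles of an n x n quad grid, emitted row by row (strip-like order).
    ASSUMPTION: stands in for the tessellator's real output order."""
    tris = []
    for row in range(n):
        for col in range(n):
            a = row * (n + 1) + col   # top-left vertex of the quad
            b = a + 1                 # top-right
            c = a + (n + 1)           # bottom-left
            d = c + 1                 # bottom-right
            tris.append((a, c, b))
            tris.append((b, c, d))
    return tris

def acmr(tris, cache_size=32):
    """Average cache miss ratio: vertices shaded per triangle.
    ASSUMPTION: the cache is a simple FIFO (a hit does not refresh an entry)."""
    cache = deque(maxlen=cache_size)  # appending to a full deque evicts the oldest
    misses = 0
    for tri in tris:
        for v in tri:
            if v not in cache:
                misses += 1
                cache.append(v)
    return misses / len(tris)

# 9 vertices per row: two rows fit in the cache, every vertex is shaded once.
print(acmr(grid_strip_triangles(8)))    # 0.6328125
# 65 vertices per row: a row is evicted before the next row re-uses it.
print(acmr(grid_strip_triangles(64)))   # 1.015625
```

Under these assumptions, once a row of the patch is wider than the cache, nearly every vertex gets shaded twice rather than once, but not more than twice, since each row only shares vertices with its immediate neighbours.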
An algorithm like his could be built into TS, I guess, with cognisance of the varying tessellation factors that are possible along each edge.
Maybe this doesn't matter because the cost of re-processing a patch is much lower than fetching/shading vertices in a non-tessellated pipeline.
Not to mention that tessellation, used to tackle LOD, should be a massive win due to general bandwidth savings.
Page 22 here:
http://developer.amd.com/gpu_assets/Real-Time_Tessellation_on_GPU.pdf
references PTVC but appears to be talking about cache space being taken up by vertex attribute data (not just vertex index data), I think. Seems to suggest PTVC is working OK regardless.
Other things I've seen vaguely suggest that tessellation, in producing "localised, intense" meshes, will generally have good PTVC performance. I'm not convinced, because it seems the triangles are just going to come out in strips. So, ahem, that's better than lots of disparate patches of triangles in random order, but it's nothing like optimal for a mesh.
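A rough way to see the gap between "random", "strips" and "something cache-aware" is to run the same triangles through a simulated cache in three different orders. This is only a sketch: the 32-entry FIFO cache, the regular 64 x 64 quad grid, and the banded traversal are all assumptions of mine. The banded order is not Forsyth's algorithm, just a simple stand-in for an ordering that knows the cache size. `acmr` and `grid_triangles` are hypothetical helpers.

```python
import random
from collections import deque

def acmr(tris, cache_size=32):
    """Vertices shaded per triangle, ASSUMING a simple FIFO vertex cache."""
    cache = deque(maxlen=cache_size)  # appending to a full deque evicts the oldest
    misses = 0
    for tri in tris:
        for v in tri:
            if v not in cache:
                misses += 1
                cache.append(v)
    return misses / len(tris)

def grid_triangles(n, band=None):
    """Triangles of an n x n quad grid.
    band=None: full-width rows, top to bottom (strip-like order).
    band=w:    vertical bands w quads wide, each swept top to bottom."""
    band = band or n
    tris = []
    for start in range(0, n, band):
        for row in range(n):
            for col in range(start, min(start + band, n)):
                a = row * (n + 1) + col   # top-left vertex of the quad
                c = a + n + 1             # bottom-left vertex
                tris += [(a, c, a + 1), (a + 1, c, c + 1)]
    return tris

strips = grid_triangles(64)            # rows of 65 vertices, wider than the cache
banded = grid_triangles(64, band=8)    # rows of 9 vertices, two rows fit in cache
shuffled = strips[:]
random.Random(0).shuffle(shuffled)     # same triangles, random order

print("shuffled:", acmr(shuffled))     # far worse than either ordered case
print("strips:  ", acmr(strips))       # 1.015625
print("banded:  ", acmr(banded))       # 0.5712890625
```

For this grid the floor is 4225 vertices / 8192 triangles, about 0.52, if every vertex were shaded exactly once. So in this toy setup wide strips cost roughly twice the ideal, while an ordering that respects the cache size gets close to it.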
So I really don't know, one way or another, which is why I'm pondering it.
Jawed