O.K. Deep breath, start explaining NPatches and displacement mapping.
This might get a bit long...
There are 2 distinct stages to surface evaluators (of which NPatches and displacement maps are both forms):
1) Tessellation
2) Perturbation
The second stage is doable on ALL vertex shader hardware: want NPatches, want displacement maps, want subdivision surfaces? Just run it through a vertex shader. You take some input, run it through a basis function (which defines how you perturb the vertex) and voila, any surface you like.
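To make that concrete, here's a rough CPU-side sketch of what the perturbation stage boils down to for NPatches (PN triangles use a cubic Bezier triangle basis). All the names here (Vec3, PatchControl, evalNPatch) are made up for illustration, not any real API, and I'm skipping how the control points get built from the corner positions and normals:

[code]
// Rough CPU sketch of stage 2 for NPatches, assuming the cubic Bezier
// triangle basis PN triangles use. All names are illustrative.

struct Vec3 { float x, y, z; };

Vec3 operator*(float s, const Vec3& a) { return { s * a.x, s * a.y, s * a.z }; }
Vec3 operator+(const Vec3& a, const Vec3& b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }

// The ten control points of a cubic Bezier triangle, derived elsewhere
// from the base triangle's corner positions and normals.
struct PatchControl
{
    Vec3 b300, b030, b003;                    // corners
    Vec3 b210, b120, b021, b012, b102, b201;  // edge points
    Vec3 b111;                                // centre point
};

// The "basis function": weight each control point by the cubic
// Bernstein polynomial for barycentric (u,v,w), where u + v + w = 1.
// This is exactly the kind of per-vertex work a vertex shader can do.
Vec3 evalNPatch(const PatchControl& c, float u, float v, float w)
{
    return (u * u * u) * c.b300 + (v * v * v) * c.b030 + (w * w * w) * c.b003
         + (3 * u * u * v) * c.b210 + (3 * u * v * v) * c.b120
         + (3 * v * v * w) * c.b021 + (3 * v * w * w) * c.b012
         + (3 * u * w * w) * c.b102 + (3 * u * u * w) * c.b201
         + (6 * u * v * w) * c.b111;
}
[/code]

In a real vertex shader the barycentrics come in as per-vertex data and the control points as constants, but the maths is the same.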
But to do this you have to tessellate the base triangle first, and that isn't programmable (yet), so you either do it on the CPU (which could be expensive) or the hardware has a fixed function tessellator.
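For reference, the CPU fallback for stage 1 is nothing magic, it's just walking a barycentric grid. Something like this (again, all names invented for the sketch):

[code]
// CPU fallback for stage 1: uniformly tessellate one base triangle
// into level*level sub-triangles on a barycentric grid (level >= 1).
// The (u,v,w) output feeds straight into evalNPatch above.

#include <vector>

struct Bary { float u, v, w; };

void tessellate(int level,
                std::vector<Bary>& verts,
                std::vector<int>&  tris)    // 3 indices per sub-triangle
{
    // Row i holds i+1 vertices; vertex (i,j) lands at index i*(i+1)/2 + j.
    for (int i = 0; i <= level; ++i)
        for (int j = 0; j <= i; ++j)
            verts.push_back({ 1.0f - float(i) / level,
                              float(i - j) / level,
                              float(j) / level });

    // Stitch rows i and i+1 together with upward and downward triangles.
    for (int i = 0; i < level; ++i)
    {
        int row  = i * (i + 1) / 2;
        int next = (i + 1) * (i + 2) / 2;
        for (int j = 0; j <= i; ++j)
        {
            tris.insert(tris.end(), { row + j, next + j, next + j + 1 });
            if (j < i)
                tris.insert(tris.end(), { row + j, next + j + 1, row + j + 1 });
        }
    }
}
[/code]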
The fixed function tessellator on the ATI 9700 was meant to be better than the one on the ATI 8500: the 8500 can only do integer steps (i.e. level 1 might give you 5 triangles and level 2 might give you 15), whereas the 9700 was meant to have floating point levels (i.e. level 1 might be 5 triangles, level 2 might still be 15, but level 1.5 would give you 10).
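My guess at how the fractional levels work (and it is a guess) is the usual morphing trick: build the mesh at the next integer level up and slide the new vertices in as the fraction increases, so the surface changes smoothly instead of popping. Reusing the Vec3 bits from the first sketch:

[code]
// Guess at the fractional-level trick, purely illustrative: each vertex
// that doesn't exist at floor(level) morphs from the nearest position
// it had at the lower level toward its true position at ceil(level).

#include <cmath>

Vec3 lerp(const Vec3& a, const Vec3& b, float t)
{
    return (1.0f - t) * a + t * b;
}

Vec3 morphVertex(const Vec3& posLow,   // where the vertex sits at floor(level)
                 const Vec3& posHigh,  // where it sits at ceil(level)
                 float level)
{
    float frac = level - std::floor(level);   // 0.5 for level 1.5
    return lerp(posLow, posHigh, frac);
}
[/code]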
That appears to be the only difference between Truform v1 and Truform v2. The fact that lots of people think it's much slower on the 9700 might indicate a hardware bug and a fallback to software, I honestly don't know. The other possibility for why the 9700 seems slower is that AFAIK ATI had dedicated NPatch perturbation hardware in the 8500; it seems likely that they removed this and moved its function into the vertex shader (in the same way they moved the entire fixed function vertex pipe into vertex shaders).
Now a clever tessellator would alter the tessellation based on the data (i.e. if your surface is flat, don't add more triangles), which of course makes the 2 stages dependent on each other. This is what the Matrox card does: it's fixed function only, but it tessellates the mesh based on the basis function and the displacement map.
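A sketch of what that flatness test might look like, picking a per-triangle level from how far the surface bends away from the base triangle (the metric and tolerance here are invented for illustration, I have no idea what Matrox actually uses):

[code]
// Sketch of a flatness-driven level pick. A real tessellator might
// measure how far the NPatch control points or displacement heights
// stray from the base plane; this just takes that deviation as input.

int chooseLevel(float maxDeviation,   // how far the surface bends from flat
                float tolerance,      // deviation we're happy to ignore
                int   maxLevel)
{
    // Flat (or nearly flat) surface: the base triangle is enough.
    if (maxDeviation < tolerance)
        return 1;

    // Otherwise scale the level with how curved the patch is,
    // clamped to whatever the hardware allows.
    int level = int(maxDeviation / tolerance);
    return level > maxLevel ? maxLevel : level;
}
[/code]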
Now what's likely to happen one day is that we will have a tessellation shader that generates vertices (and triangles) that are passed to a vertex shader that perturbs them. This, combined with vertex texturing, would give us generalised programmable surface evaluators. We can currently do a crude form of vertex texturing (the badly titled pre-sampled displacement maps) and have fairly capable vertex perturbation shaders (/plug check out my ShaderX2 article, I'm hoping to put it up here on Beyond3D soon /end plug), but only fixed function or CPU tessellation. Vertex shader 3.0 gives us proper vertex texturing, and I predict DX10 will give us tessellation shaders.
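Gluing the two stages together, the displacement half of that future pipeline is trivial once you can read a texture per vertex. Again reusing Vec3 from the first sketch; sampleHeight() stands in for the vertex texture fetch and isn't any real API:

[code]
// Sketch of displacement mapping once vertex texturing is real: for
// each vertex the tessellator emits, fetch a height and push the
// vertex along its normal.

float sampleHeight(float u, float v);  // hypothetical displacement map read

Vec3 displace(const Vec3& basePos, const Vec3& normal,
              float u, float v, float scale)
{
    float height = sampleHeight(u, v);          // the vertex texture fetch
    return basePos + (height * scale) * normal;
}
[/code]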
BTW some people seem to think that we are moving to a Renderman type system with vertex and pixel shaders becoming one; I happen to disagree. While some hardware may be shared, the problem with a single shader approach is the removal of the concept of frequency. For real-time, shaders will need to evaluate different things at different rates, and I'm personally hoping that DX10 shaders will have advanced frequency concepts (i.e. run this shader every base triangle, this one every vertex, this one every pixel, this one every sub pixel and this one every write to the framebuffer). A system like that would give us (with shared memory) tessellation shaders, alpha blending shaders, anti-aliasing shaders etc.