It seems there is some confusion about how DX11 tessellation works.
The fixed-function part of the tessellator only calculates the domain location of each point, based on the configuration that the hull shader calculates for each patch. The whole interpolation of the vertex attributes needs to be done in the domain shader, which is called once for every calculated domain location.
The "fixed function" tessellator is computing the location of the new vertex (interpolating multiple vertices based on control points and tessellation factors). The debate we're having is over the use of programmable ALUs (as Marco indicated) to perform the interpolation required by TS.
After TS, DS is used to interpolate the original control points' attributes (e.g. normal) for the new naked vertices (which are points only). DS is converting points into vertices by giving the vertices more properties than merely location. DS can also be used to perform displacement mapping, which then alters the computed location of the vertices - but not necessarily by using any kind of interpolation.
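To make the DS step concrete, here is a minimal sketch in Python (standing in for HLSL) of what a domain shader does for a triangle patch. It assumes plain barycentric weighting of three control points and a scalar displacement along the interpolated normal; the names `ControlPoint`, `weighted_sum` and `domain_shader` are made up for illustration and are not part of any API.

```python
from dataclasses import dataclass


@dataclass
class ControlPoint:
    position: tuple  # (x, y, z)
    normal: tuple    # (x, y, z)


def weighted_sum(weights, vectors):
    """Blend 3-component vectors using one weight per vector."""
    return tuple(sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(3))


def normalize(v):
    """Scale a vector to unit length."""
    length = sum(c * c for c in v) ** 0.5
    return tuple(c / length for c in v)


def domain_shader(patch, domain_location, displacement=0.0):
    """Turn one domain location (u, v, w) from TS into a full vertex.

    patch: the three control points of a triangle patch (assumption).
    domain_location: barycentric weights, expected to sum to 1.
    displacement: hypothetical height, e.g. sampled from a displacement map.
    """
    # Interpolate the control points' attributes for the naked point.
    position = weighted_sum(domain_location, [cp.position for cp in patch])
    normal = normalize(weighted_sum(domain_location, [cp.normal for cp in patch]))
    # Displacement moves the vertex along the normal; no interpolation involved.
    position = tuple(p + displacement * n for p, n in zip(position, normal))
    return position, normal
```

Calling `domain_shader(patch, (0.25, 0.25, 0.5), displacement=2.0)` on a flat patch with all normals pointing along +z gives the blended position pushed two units up the z axis.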
Rasterisation then requires a final interpolation of attributes, per pixel, derived solely from the per-vertex attributes.
So, apparently with R800, these three kinds of interpolation are all running on the ALUs. DS is programmable interpolation, but TS and RS require fixed-function interpolation.
So, the question is: by what mechanism does ATI's TS access the ALUs to obtain interpolated points for the new vertices? I'm merely suggesting that TS generates two input streams for an IN shader to consume and spit out vertex locations. This is similar to the way that NVidia's MI consumes two streams: the plane equation constants A, B and C (one set of these per primitive), plus attributes, to spit out an interpolated attribute for a pixel.
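For anyone unfamiliar with the plane equation approach, a rough sketch in Python follows. It assumes the attribute varies linearly in screen space (perspective correction is ignored), so a pixel at (x, y) gets A*x + B*y + C. The function names are invented for illustration and say nothing about how either vendor's hardware actually lays this out.

```python
def plane_equation(v0, v1, v2):
    """Derive per-primitive constants (A, B, C) for one attribute.

    Each vertex is (x, y, a): screen position plus the attribute value there.
    """
    (x0, y0, a0), (x1, y1, a1), (x2, y2, a2) = v0, v1, v2
    # Twice the signed area of the triangle; zero means it is degenerate.
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if det == 0:
        raise ValueError("degenerate triangle")
    # Solve the 2x2 system for the attribute gradients using Cramer's rule.
    A = ((a1 - a0) * (y2 - y0) - (a2 - a0) * (y1 - y0)) / det
    B = ((x1 - x0) * (a2 - a0) - (x2 - x0) * (a1 - a0)) / det
    C = a0 - A * x0 - B * y0
    return A, B, C


def interpolate(plane, x, y):
    """Evaluate the attribute at pixel (x, y): one multiply-add per axis."""
    A, B, C = plane
    return A * x + B * y + C
```

The constants are computed once per primitive, after which each pixel only costs a couple of multiply-adds, which is why this kind of interpolation is cheap to run on demand.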
Similarly, during PS an attribute (e.g. normal) is interpolated on demand, just as on NVidia. NVidia seemingly does this by generating extra instructions in the compiled kernel. ATI might do this, or might have an IN kernel that takes the place of the SPI unit in older architectures, where SPI generates a full set of all required attributes before PS commences.
IN would then have an output buffer which it is allowed to fill. This output buffer would be a primary parameter for scheduling IN, basically presenting PS with a set of on-demand attributes.
Obviously, just guessing here.
Jawed