The real question about how GF100 supports tessellation is probably whether the architecture has a fixed-function tessellation unit or not. Chances are very high that there isn't one, but that doesn't mean the tasks such a TS unit normally handles are done in software per se, unless someone suddenly considers programmable hw to be software.
Unless I've understood the data flow in DX11 wrong, the diagrams showed an oversimplified: hull shader -> (ff) tessellation unit -> domain shader. So the question is how and where the TS tasks are performed and, by extension, with what efficiency.
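To make that oversimplified flow concrete, here's a toy sketch of the three stages in Python. All function names are made up for illustration, and it assumes a quad patch with one uniform integer tessellation factor; the real DX11 tessellator also handles per-edge factors, fractional partitioning and other domains, and none of this says anything about how GF100 actually does it. The point is only that the middle stage sees a factor and emits (u, v) coordinates, nothing else, whether it sits in ff or programmable hw.

```python
def hull_stage(corners, tess_factor):
    # Programmable stage: passes the control points through and
    # decides how finely the patch should be subdivided.
    return corners, max(1, int(tess_factor))


def tessellator_stage(factor):
    # The stage under discussion: it never sees vertex positions,
    # it only generates (u, v) domain coordinates from the factor.
    return [(i / factor, j / factor)
            for j in range(factor + 1)
            for i in range(factor + 1)]


def domain_stage(corners, uv):
    # Programmable stage: turns one (u, v) coordinate into a position,
    # here by bilinear interpolation of the four patch corners.
    p00, p10, p01, p11 = corners
    u, v = uv
    return tuple(
        (1 - u) * (1 - v) * a + u * (1 - v) * b + (1 - u) * v * c + u * v * d
        for a, b, c, d in zip(p00, p10, p01, p11)
    )


def tessellate_patch(corners, tess_factor):
    # hull shader -> tessellator -> domain shader
    corners, factor = hull_stage(corners, tess_factor)
    return [domain_stage(corners, uv) for uv in tessellator_stage(factor)]


if __name__ == "__main__":
    unit_quad = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                 (0.0, 1.0, 0.0), (1.0, 1.0, 0.0))
    for vertex in tessellate_patch(unit_quad, 2):
        print(vertex)
```

In this toy version a factor of 2 gives a 3x3 grid of vertices; the middle stage's workload grows with the factor no matter where it physically runs.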
I also agree with nao that implementations can differ vastly; hell, you can have two ff units from different architectures with large differences in both implementation and efficiency.
As for Rys' comment on it, I'd figure he'll step up and elaborate on what exactly he meant, but I have the feeling it has been taken vastly out of context. He specifically said 'software' tessellator, in quotes, and not software tessellator.
Anyway, a call to any experts reading here: the way I've understood it so far is that of the 4 basic tasks a ff tessellation unit is meant to do, some are better suited for fixed function and some for programmable hw, and yes, that stands open for correction.
Cliff notes: it's hard to say the implementation rocks or sucks unless you have precise details on it, which I severely doubt anyone has at this point, and even then it would take a real-time experiment to show the efficiency of the implementation.
The 5870 takes about a 40% performance hit in the Unigine benchmark when tessellation is enabled. It will be interesting to see if Fermi/GF100/GT300 takes a larger or smaller performance hit. That is the only thing that should matter.
No, an IHV-specific techdemo/benchmark should not be the defining point; real-time usage in future games should be.