NVIDIA Fermi: Architecture discussion

Just tessellation or tessellation + additional effects (like displacement mapping and the need to shade the new geometry)?
 
The real question about how GF100 supports tessellation is probably whether the architecture has a fixed-function tessellation unit or not. Chances are very high that there isn't one, but that doesn't mean the usual tasks such a TS unit is meant for are done in software per se, unless someone suddenly considers programmable hw to be software.

Unless I've understood the DX11 data flow wrong, the diagrams showed an oversimplified hull shader -> (ff) tessellation unit -> domain shader. So the question is how and where the TS tasks are performed and, by extension, with what efficiency.
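For anyone who wants that data flow spelled out, here is a minimal Python sketch of the three stages. It is purely illustrative: the function names (`hull_stage`, `tessellator_stage`, `domain_stage`, `run_pipeline`) are made up, and the uniform barycentric grid is a simplification, not the partitioning pattern a real DX11 tessellator generates. The only point is that the middle stage sees nothing but a factor and emits domain coordinates, wherever it happens to run.

```python
from fractions import Fraction


def hull_stage(patch, factor):
    """Stand-in for the hull shader: pass the control points through
    and choose a tessellation factor for the patch."""
    return {"control_points": patch, "factor": factor}


def tessellator_stage(factor):
    """Stand-in for the tessellator: emit barycentric domain points.
    It never looks at the control points, only at the factor.
    (Assumption: a plain uniform grid, not the real D3D11 pattern.)"""
    n = factor
    return [
        (Fraction(i, n), Fraction(j, n), Fraction(n - i - j, n))
        for i in range(n + 1)
        for j in range(n + 1 - i)
    ]


def domain_stage(hull_out, uvw):
    """Stand-in for the domain shader: turn one domain point into a
    vertex position by barycentric interpolation of the patch corners."""
    a, b, c = hull_out["control_points"]
    u, v, w = uvw
    return tuple(u * a[k] + v * b[k] + w * c[k] for k in range(3))


def run_pipeline(patch, factor):
    """hull shader -> tessellator -> domain shader, in that order."""
    hull_out = hull_stage(patch, factor)
    return [domain_stage(hull_out, p) for p in tessellator_stage(hull_out["factor"])]


# Hypothetical triangle patch, tessellated with factor 4.
patch = ((0, 0, 0), (4, 0, 0), (0, 4, 0))
vertices = run_pipeline(patch, 4)
print(len(vertices))  # 15 domain points for factor 4
```

Whether `tessellator_stage` lives in a fixed-function block or on the shader cores changes nothing in this flow, which is why the efficiency question can't be settled from the diagram alone.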

I also agree with nao that implementations can differ vastly; hell, you can have two ff units from different architectures that differ greatly in both implementation and efficiency.

As for Rys' comment on it, I'd figure he'll step up and elaborate on what exactly he meant, but I have the feeling it has been taken vastly out of context. He specifically said 'software' tessellator (in quotes) and not software tessellator.

Anyway, a call to any experts reading here: the way I've understood it so far is that of the 4 basic tasks a ff tessellation unit is meant to do, some are better suited to fixed function and some are better suited to programmable hw, and yes, that stands open for correction.

Cliff notes: it's hard to say whether the implementation rocks or sucks unless you have precise details on it, which I severely doubt anyone has at this point, and even then only a real-time experiment would show the efficiency of the implementation.

The 5870 takes about a 40% performance hit in the Unigine benchmark when tessellation is enabled. It will be interesting to see if Fermi/GF100/GT300 takes a larger or smaller performance hit. That is the only thing that should matter.

No, an IHV-specific techdemo/benchmark should not be the defining point; real-time usage in future games should be.
 
Just tessellation or tessellation + additional effects (like displacement mapping and the need to shade the new geometry)?

Tessellation won't do anything by itself, I believe. Take the simple example of a cube, each face of which is made of two triangles.
Apply tessellation and you will end up with a cube, each face being a square made of, say, 128 triangles.
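The 128 figure happens to fall out of three rounds of 1:4 midpoint subdivision (2 x 4^3 = 128). Here is a minimal sketch of that, assuming plain midpoint subdivision rather than whatever a DX11 tessellator actually does; `midpoint`, `subdivide` and `tessellate` are hypothetical helpers. It shows the face ending up with 128 triangles while every vertex stays on the original plane, so the shape does not change at all:

```python
def midpoint(p, q):
    """Midpoint of two 3D points."""
    return tuple((a + b) / 2 for a, b in zip(p, q))


def subdivide(tri):
    """Split one triangle into four by connecting its edge midpoints."""
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]


def tessellate(tris, levels):
    """Apply 1:4 subdivision `levels` times to a list of triangles."""
    for _ in range(levels):
        tris = [small for tri in tris for small in subdivide(tri)]
    return tris


# One cube face lying in the plane z = 1, made of two triangles.
face = [
    ((0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0)),
    ((0.0, 0.0, 1.0), (1.0, 1.0, 1.0), (0.0, 1.0, 1.0)),
]

fine = tessellate(face, 3)
print(len(fine))                                       # 128
print(all(v[2] == 1.0 for tri in fine for v in tri))   # True: still flat
```

Only when something moves those new vertices (a displacement map, a curved patch evaluation) does the extra geometry buy anything visible.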


But I realise you are probably aware of that, and that raises a good question. If only totally dumb tessellation is applied all over the scene, might it end up slower than doing the additional processing, including applying a LOD model to avoid drawing millions of useless triangles?
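To make the LOD idea concrete, here is a rough sketch of picking a tessellation factor from camera distance instead of using one factor everywhere. Everything here is an assumption for illustration: `tess_factor` is a made-up helper, the `near` and `far` distances are arbitrary, and the linear falloff is just the simplest choice. The default maximum of 64 matches the D3D11 upper limit on tessellation factors.

```python
def tess_factor(distance, near=5.0, far=100.0, max_factor=64.0):
    """Hypothetical distance-based LOD: full tessellation up close,
    factor 1 (no extra triangles) beyond `far`, linear in between."""
    if distance <= near:
        return max_factor
    if distance >= far:
        return 1.0
    t = (distance - near) / (far - near)   # 0 at near, 1 at far
    return max_factor + t * (1.0 - max_factor)


for d in (0.0, 52.5, 200.0):
    print(d, tess_factor(d))
# 0.0 64.0
# 52.5 32.5
# 200.0 1.0
```

In a real pipeline this decision would sit in the hull shader, per patch or per edge, which is exactly the "additional processing" being weighed against drawing triangles nobody can see.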
 
Just to be clear, this was never about calling Rys out. They put him in the message as kind of a tongue-in-cheek joke, to quote the original source of the rumors. If it had been my choice he wouldn't have been included in the response, but I don't edit the responses either.

Ailuros is absolutely right that it was blown completely out of context, but that's the way the internet is. I know some of you guys only visit this forum, or one or two others, but around the web you'd be surprised how what was originally quoted became something else entirely, to the point of "Nvidia is Emulating DirectX 11 Tessellation". The question actually appeared at Geforce Zone, and it was decided to just answer it rather than let it get any further out of context than it already is.

The real question about how GF100 supports tessellation is probably whether the architecture has a fixed-function tessellation unit or not. Chances are very high that there isn't one, but that doesn't mean the usual tasks such a TS unit is meant for are done in software per se, unless someone suddenly considers programmable hw to be software.

Nvidia has been very clear to me in this context at least: they are saying Fermi has a dedicated ASIC for tessellation.
 
To be honest, I was surprised nvidia let the cat out of the bag about their tessellation unit - they love their surprises, as history has shown. But then again, more and more I get the feeling it's not the same nvidia we've seen in past years - some mentalities are (slowly) changing.
No, an IHV-specific techdemo/benchmark should not be the defining point; real-time usage in future games should be.
Bingo.
 
One thing I do have to wonder is whether it actually matters that Fermi isn't out presently so long as the competing hardware is sold out/barely available? Yes?/No?
 
To be honest, I was surprised nvidia let the cat out of the bag about their tessellation unit - they love their surprises, as history has shown. But then again, more and more I get the feeling it's not the same nvidia we've seen in past years - some mentalities are (slowly) changing.

Not necessarily always for the better.


Do you want your 20 bucks now or later? :LOL: ;)
 
I doubt nVidia has fully dedicated fixed-function hardware for tessellation. More likely they have a little bit of dedicated hardware and the majority of the tessellation math is done in the general-purpose shader cores. Radeon DX11 chips also moved some math from the texture filtering units to the general-purpose shader cores. There is no point in building dedicated fixed-function hardware for every new chip feature anymore.
 
Radeon DX11 chips also moved some math from the texture filtering units to the general-purpose shader cores.
Attribute interpolation (which is what rv8xx moved into the shader core) certainly isn't part of texture filtering; in fact it's not part of texturing at all, so I take issue with that statement...

There is no point in building dedicated fixed-function hardware for every new chip feature anymore.
This is certainly true.
 
Wasn't unified hardware the way to go - especially remembering some presentations in late summer of 2006? Why is this suddenly not true anymore for tessellation?
 
Wasn't unified hardware the way to go - especially remembering some presentations in late summer of 2006? Why is this suddenly not true anymore for tessellation?

Unified is the way to go for programmable hardware. But for fixed-function stuff there's still the opportunity to have highly optimized dedicated hardware that would justify its existence with much higher performance.
 
Wasn't unified hardware the way to go - especially remembering some presentations in late summer of 2006? Why is this suddenly not true anymore for tessellation?
Well, for the same reason there is still dedicated texturing, setup, colour/blend/depth etc. hardware. Until recently, ATi still had a HW attribute interpolation unit, which is now gone in the Evergreen arch. Some things are still better left on their own transistors, performance and die area wise.
AFAIK, tessellation could be performed with GS instancing (in DX10.0). There was a presentation or a white paper on the matter.
 
Wasn't unified hardware the way to go - especially remembering some presentations in late summer of 2006? Why is this suddenly not true anymore for tessellation?

Unified was the way to go because pixel and vertex shaders were a lot alike in their functionality anyway, so it made sense to use a single piece of hw for both of them instead of two different shader units, increasing overall efficiency as you now have fewer hw elements in your pipeline.

The tessellator does something that has nothing in common with anything else. Likewise, ROPs, rasterizer and primitive assembly are all very different, so it is not possible to use a single piece of ff hw for all of them. Larrabee tries doing it all in its ALUs, but whether the flexibility of programming and the dynamic load balancing it can do are worth the area and power cost remains to be seen.

In fact, of all the ff hw out there, it seems (at least from 30k feet) that the two things most suitable (that does not imply a match made in heaven) for unification are the tessellator and the rasterizer. If anything, they are closer to each other than to any other piece of ff hw out there.
 
The 5870 takes about a 40% performance hit in the Unigine benchmark when tessellation is enabled. It will be interesting to see if Fermi/GF100/GT300 takes a larger or smaller performance hit. That is the only thing that should matter.

Just tessellation or tessellation + additional effects (like displacement mapping and the need to shade the new geometry)?

To clarify, when you mention a 40% hit in Unigine when applying tessellation, are you implying that tessellation alone incurs a 40% performance hit, or that applying tessellation and displacement maps (with the new geometry casting and receiving shadows) has a 40% hit? There is a difference, as you can use tessellation for purposes other than displacement mapping.
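To put the 40% figure in frame-time terms, here is a small arithmetic sketch. The 60 fps baseline is a made-up number purely for illustration (it is not a Unigine result), and the helper names are hypothetical. Note that the extra milliseconds cover everything the benchmark toggle switches on, so the number by itself cannot say how much is the tessellator and how much is displacement, shading and shadowing of the new geometry.

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def fps_after_hit(fps, hit):
    """Frame rate after a fractional performance hit (0.4 means 40%)."""
    return fps * (1.0 - hit)


def extra_frame_time_ms(fps, hit):
    """Additional milliseconds per frame caused by the hit."""
    return frame_time_ms(fps_after_hit(fps, hit)) - frame_time_ms(fps)


baseline_fps = 60.0   # hypothetical baseline, not a measured value
hit = 0.40            # the 40% hit quoted in the thread

print(round(fps_after_hit(baseline_fps, hit), 1))        # 36.0
print(round(extra_frame_time_ms(baseline_fps, hit), 2))  # 11.11
print(round(1.0 / (1.0 - hit), 3))                       # 1.667x frame time
```

So a 40% drop in frame rate means each frame takes about two thirds longer, whatever the baseline happens to be.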
 
Charlie's latest piece is up, saying A2 has taped out and Nvidia's internal roadmap has Fermi in H1 of next year... Fermi variants are nowhere to be seen.
 