NVIDIA GF100 & Friends speculation

Of course. 40% was an average, that should be understood.




DX11 != tessellation. Also, the only game I've seen getting much DX11 publicity so far is DiRT 2, to little effect; it looks about the same as the DX10 version.

I'm just wondering which games will make a big tessellation push. To do that, a developer would have to deviate in a major way from the console version, and I don't see any current examples of that (e.g. Assassin's Creed 2, Batman: Arkham Asylum, Modern Warfare 2, etc.).

The Xbox 360 has a tessellator. I've seen it in some games; Star Wars: The Force Unleashed appears to use it (poorly, but still). Of the games I can think of right now, I believe the new AvP game coming out soon uses it to nice effect.
 
There are a lot of interesting things that can be done with tessellation with a little creativity. The hair demo you've seen is actually PhysX + tessellation.

The hair in that demo is impressive, but a dummy head on a blank screen is less so. Tell them to make a demo of it in a scene with some people doing something. :smile:
 
There are a lot of interesting things that can be done with tessellation with a little creativity. The hair demo you've seen is actually PhysX + tessellation.

There was a very interesting paper on using a fast radix sort to help efficiently render many layers of hair. It's possible that they've integrated that technique into PhysX, as that's one algorithm Nvidia claims gets a boost from Fermi's caches.
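For the curious, a depth-keyed radix sort is easy to sketch on the CPU. The paper isn't named above, so this is just the generic LSD counting-sort idea that a GPU would run in parallel per tile; all names and data here are invented for illustration:

```python
def radix_sort_by_depth(fragments, key_bits=16, radix_bits=8):
    """LSD radix sort of (depth_key, label) fragments by integer depth.

    Stable counting-sort passes over radix_bits-wide digits, least
    significant digit first; stability makes the passes compose."""
    radix = 1 << radix_bits
    for shift in range(0, key_bits, radix_bits):
        buckets = [[] for _ in range(radix)]
        for frag in fragments:
            buckets[(frag[0] >> shift) & (radix - 1)].append(frag)
        fragments = [f for bucket in buckets for f in bucket]
    return fragments

# Hair fragments as (quantized depth, label) pairs; sorting gives
# near-to-far order, which is what layered hair blending needs.
frags = [(0x2F10, "strand A"), (0x0A03, "strand B"), (0x1BBB, "strand C")]
print(radix_sort_by_depth(frags))  # strand B, then C, then A
```

Radix sort fits GPUs well because each counting pass is a histogram plus a prefix sum, both highly parallel, which is presumably where the cache benefit would show up.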

Rys, you're joking, right? I was hoping to read B3D's piece before I so much as glimpsed at another website :(

I'm just wondering which games will make a big tessellation push. To do that, a developer would have to deviate in a major way from the console version, and I don't see any current examples of that (e.g. Assassin's Creed 2, Batman: Arkham Asylum, Modern Warfare 2, etc.).

Actually, Batman is a good example of a significant deviation. So although many people don't like their tactics, we're probably much better off if Nvidia has an advantage with tessellation, as they are more aggressive in pushing developers to adopt things that favor their hardware. It should be a bit easier too, since tessellation isn't a proprietary feature that will enrage the populace.
 
I suspect he's more right than wrong on this issue. I bet they have a little hardware for the most fixed-function, specific parts of the tessellator, and the rest is just some kind of shader emulation.
You do realize that, other than the fixed-function tessellator itself, the spec requires everything to be shader-based, right?

So it's a bit of a mystery to me why you're expressing such disdain for what seems to be the correct way to implement it.

I'm sure you can elaborate...
 
Oh puhleeze :D Never, ever, ever put Rys on the same level as Chuck. Sometimes it's not what one says but the way one says it that is important :)

Since when is it a bad thing for someone with an industry insider source to speculate? That's what Rys has been doing. I thought the article was great given the information available on the web at the time. Not calling you out here, just reaffirming what you've said. Rys wasn't trying to spread FUD; he was trying to write a technical, but speculative, piece on Fermi.
 
You do realize that, other than the fixed-function tessellator itself, the spec requires everything to be shader-based, right?

So it's a bit of a mystery to me why you're expressing such disdain for what seems to be the correct way to implement it.

I'm sure you can elaborate...

So in effect Charlie was right :rolleyes: Ok just making sure. /wink

Not meant as a "Love Charlie" post, btw... for those unaware of sarcasm.
 
Please point out the specific parts of the below statements that are accurate?

Zero cost tessellation on RV870?
Inefficient tessellation on Fermi?
Cypress owning Fermi in DX11?

In the R870, if you compare the time it takes to render 1 Million triangles from 250K using the tessellator, it will take a bit longer than running those same 1 Million triangles through without the tessellator. Tessellation takes no shader time, so other than latency and bandwidth, there is essentially zero cost. If ATI implemented things right, and remember, this is generation four of the technology, things should be almost transparent.

Contrast that with the GT300 approach. There is no dedicated tessellator, and if you use that DX11 feature, it will take large amounts of shader time, used inefficiently as is the case with general purpose hardware. You will then need the same shaders again to render the triangles. 250K to 1 Million triangles on the GT300 should be notably slower than straight 1 Million triangles.

The same should hold true for all DX11 features, ATI has dedicated hardware where applicable, Nvidia has general purpose shaders roped into doing things far less efficiently. When you turn on DX11 features, the GT300 will take a performance nosedive, the R870 won't.
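Taken at face value, the quoted argument is just a cost comparison between "free" fixed-function tessellation and shader-emulated tessellation. A toy model makes the shape of the claim explicit; every number below is invented purely to illustrate the argument, and it deliberately ignores that domain shaders make even "dedicated" tessellation non-free:

```python
# Toy model of the quoted claim, NOT real hardware data. It compares
# rendering 1M triangles natively vs. amplifying 250K patches to 1M
# with either a dedicated tessellator (no shader cost in this model)
# or shader emulation (extra shader cycles per output triangle).

def frame_cost(tris_out, shade_per_tri, tess_shader_per_tri=0.0):
    """Shader cycles to render tris_out triangles; tess_shader_per_tri
    models tessellation work that steals general-purpose shader time."""
    return tris_out * (shade_per_tri + tess_shader_per_tri)

native_1m = frame_cost(1_000_000, shade_per_tri=10)   # straight 1M tris
dedicated = frame_cost(1_000_000, shade_per_tri=10)   # 250K -> 1M, "free" tess
emulated  = frame_cost(1_000_000, shade_per_tri=10,
                       tess_shader_per_tri=3)         # tess eats shader time

print(native_1m, dedicated, emulated)  # 10000000 10000000 13000000
```

The model only shows why the claim, if its premises held, would predict a GT300 slowdown; it says nothing about whether those premises are true, and the rebuttals in this thread dispute both of them.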
 
Please point out the specific parts of the below statements that are accurate?

Zero cost tessellation on RV870?
Inefficient tessellation on Fermi?
Cypress owning Fermi in DX11?

The first sentence isn't even true, since a domain shader is used after tessellation to calculate vertex positions (yes, in theory you can use null hull/domain shaders, but that's not the expected use case). That's why the idea of Charlie producing an analysis of NVidia's tessellation architecture (which he said earlier he is working on) smells suspiciously like "re-post talking points sent privately from an IHV".
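To make the point concrete: the fixed-function tessellator only emits parametric coordinates for the new vertices; it's the programmable domain shader that turns those into positions, and that work costs shader time on any GPU. A rough Python analogue of that stage (not real HLSL; the names and the simple linear-interpolation patch are illustrative only):

```python
def domain_shader(patch, uvw):
    """Minimal 'domain shader' analogue: the fixed-function tessellator
    emits barycentric coords (u, v, w) for each generated vertex, and
    shader code then computes the actual position. Here that's plain
    linear interpolation over a triangle patch; a real shader might
    evaluate a curved (e.g. PN-triangle) surface instead."""
    u, v, w = uvw
    # zip(*patch) groups the three control points per axis (x, y, z).
    return tuple(u * a + v * b + w * c for a, b, c in zip(*patch))

tri = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
# Tessellator output for the triangle's centroid:
print(domain_shader(tri, (1/3, 1/3, 1/3)))
```

Since this runs once per generated vertex, amplifying 250K patches to 1M triangles necessarily consumes shader cycles even on hardware with a dedicated tessellator, which is the hole in the "zero cost" claim.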
 
The first sentence isn't even true, since a domain shader is used after tessellation to calculate vertex positions (yes, in theory you can use null hull/domain shaders, but that's not the expected use case). That's why the idea of Charlie producing an analysis of NVidia's tessellation architecture (which he said earlier he is working on) smells suspiciously like "re-post talking points sent privately from an IHV".

Mr. Demerjian's article is up.
 
Really really can't wait till the NDA expires so these types of posts come to an end.
Me too. :smile:
My 5 GTX 285s running F@H need some big brothers. Of course, if it's a monster gaming card, I might sneak in quite a bit more gaming than I currently do.
I'll just wave at those waving the picket signs as I pay for the new hardware. :p
 
Hmm, so when do those NDAs come to the end of their miserable and hateful lives?

Anyway, I'm looking forward to seeing how the end result vs. Charlie's article pans out. It's a long read, and I guess I'll read it again to be sure I've got everything he said down.
 
If you google GF100 or GF100 architecture, Chuck's hateful "article" is the first thing that comes up, hah! Looks like he achieved at least one goal for himself and whichever IHV he is directly (or indirectly) working for :) Thankfully it won't be too much longer before we can start talking about the real, true, good stuff ;)
 
Hmm, so when do those NDAs come to the end of their miserable and hateful lives?

Anyway, I'm looking forward to seeing how the end result vs. Charlie's article pans out. It's a long read, and I guess I'll read it again to be sure I've got everything he said down.

Charlie's article can't really be disproved, since most of it relies on attacking a supposed feature on which Fermi's performance can't scale but Cypress's can. Plus a lot of other silliness, like still trying to hang onto the idea that NVidia doesn't have a "real tessellator", just something that smells like one plus the SMs. I believe we call that waffling and wiggling when you're wrong.
 
Charlie's article can't really be disproved, since most of it relies on attacking a supposed feature on which Fermi's performance can't scale but Cypress's can. Plus a lot of other silliness, like still trying to hang onto the idea that NVidia doesn't have a "real tessellator", just something that smells like one plus the SMs. I believe we call that waffling and wiggling when you're wrong.

Well there are some specific claims...

What is going to be delivered? As we have said earlier on tapeout, the GF100/Fermi is a 23.x * 23.x mm chip, we hear it is within a hair of 550mm^2.
(It's gonna get measured.)

The raw manufacturing cost of each GF100 to Nvidia is more than double that of ATI's Cypress.

Cost aside, the next problem is power. The demo cards at CES were pulling 280W for a single GPU
(I assume this is the 512 CUDA core version.)

GF100 has almost no fixed function units, not even the tessellator. Most of the units that were fixed in G200 are now distributed

Moving on to tessellation we said last May that Nvidia does not have dedicated hardware tessellators. Nvidia said the GF100 has hardware tessellation, even though our sources were adamant that it did not. You can say that Nvidia is lying or that they are splitting hairs, but there is no tessellator.

Instead, they have what they call a 'polymorph engine', and there is one per SM. Basically they added a few features to a subset of the shaders, and that is now what they are calling a tessellator

Moving along, another interesting bit is that the GF100 has upgraded their ROPs (Rendering Output units) from 32 on G200/GTX280/GTX285 to 48

On a slightly less positive note, the texture unit count has gone from 80 on the G200 line to 64 on the GF100 parts. Again, without numbers on efficiency of the units, they are not necessarily comparable, but smart money from insiders was on 128 texture units last spring

The more direct comparison to an ATI 5970 was curiously neglected at CES, mainly because the GF100 is about on par, best case for Nvidia, with that part.

Nvidia has been telling their AIBs (Add In Board makers) that the initial GF100 chips they will receive are going to be massively cut down and downclocked, likely at the same 448 shaders and rough clocks as the Fermi compute board. There will be a handful of 512 shader chips at 'full clocks' distributed to the press and for PR stunts, but yields will not support this as a real product.

To make matters worse, SemiAccurate's sources in the far east are saying that the current A3 silicon is 'a mess'. Last spring, we were told that the chip was targeted for a top clock of 1500-1600MHz. The current silicon coming out of TSMC, defects aside, is not binning past 1400MHz in anything resembling quantity, with 1200MHz being the 'volume' bin.

GF100 as it stands is exactly what we were told last spring. It is too hot, too big, too compute focused, not graphics focused, and economically unmanufacturable

I figured I would drag out some of the stuff which could be refuted by the truth or not refuted. I'm interested to see how close to the mark he is. As he's one of the only guys sticking his neck out so publicly, I have to give the guy the respect of waiting for the final product before calling it.
 