NVIDIA GF100 & Friends speculation

According to nVidia slides, GF100 is 59% faster than HD5870 in the tested scene.

According to several reviews, the HD5970 is 53-61% faster than the HD5870 in the Heaven benchmark. Even though the test and scene were chosen by nVidia, Fermi doesn't perform faster than the HD5970. Fermi is presented as a tessellation monster, but in the most tessellation-intensive benchmark it doesn't perform better than the HD5970.
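A quick back-of-the-envelope check of what those two figures imply together (a sketch in Python, assuming the quoted percentages are directly comparable):

```python
# Relative performance implied by the quoted figures (assumed comparable).
gf100_vs_5870 = 1.59             # GF100 = 159% of HD5870 (nVidia slide)
hd5970_vs_5870 = (1.53, 1.61)    # HD5970 range from reviews

for label, ratio in zip(("best case", "worst case"), hd5970_vs_5870):
    print(f"GF100 vs HD5970 ({label} for GF100): {gf100_vs_5870 / ratio:.2f}x")
# -> roughly 1.04x down to 0.99x, i.e. parity with the HD5970 in this scene
```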

I think this corresponds pretty well to Dave's words: nVidia spent a lot of research and silicon on a feature which has quite a low impact on real-world performance... Reminds me of R5xx and its dynamic branching...

The 5970 has more transistors than GF100: 4,200 million vs. 3,100 million. So, I guess the 5970 has a lot of useless transistors, too?
 
Sorry, but there has to be something.

It's not ALU bound, not texturing bound, and there are not enough triangles to explain such a low framerate in heavily tessellated scenes (which are even heavier than the one selected by NV).

NV didn't show their settings either, IIRC, and AA shows a big loss on this and other scenes too.


There is really nothing there; you are trying to make something out of nothing ;). They used the Heaven demo because that is what AMD has been using to show off tessellation in a game engine so far (and externally it is easy for the public to compare). And nV actually has a lot more flexibility when it comes to increasing the tessellation amounts.

How is MSAA impacted by an increased polygon count on screen?

After your edit: triangle setup is also frequency bound ;), that's why it's always stated as triangles per clock! But again, we are looking at shifting bottlenecks.
 
LOL. People are criticizing Fermi for not beating a dual-GPU card? Whatever.
As much as I'd love to have a 5970, if they can bring Fermi out at a sub-$500 price point, that's a huge cost saving over the 5970. Plus there's no SLI/Crossfire baloney to mess with.
 
I think what we are looking at is this:

The benchmarked card is clearly faster than the 5870, so much so that a GPU refresh will not give ATI back the single fastest card this series.

Nvidia will need their own dual-GPU card to beat the 5970; however, it is highly likely that ATI will be able to beat that on dual-GPU terms with a refresh. That is assuming neither abandons the 300W limit.

While Nvidia has improved performance more compared to the last series, it is still not enough to beat ATI on price/performance.

A smart move by Nvidia, though: they have given very few answers and left more questions being asked.
 
Razor1: Yes, that's it exactly. The doubling of triangle rate is probably the best thing about AFR. And ATi used it as an alternative solution. Cypress isn't a direct competitor of Fermi, so I see no reason why its triangle rate should be comparable to Fermi's. The competitor is Hemlock.
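To illustrate the point, a minimal sketch with assumed numbers (the clock and triangle counts below are illustrative, not vendor specs):

```python
# Setup-bound frame rate: under AFR each GPU sets up its own frame,
# so a two-GPU pair behaves roughly like a doubled triangle rate.
clock_hz = 850e6         # assumed core clock
tris_per_clock = 1       # classic 1 triangle/clock setup
tris_per_frame = 20e6    # heavily tessellated scene (assumption)

fps_single = clock_hz * tris_per_clock / tris_per_frame
fps_afr = 2 * fps_single  # ideal AFR scaling, ignoring overhead and stutter
print(f"single GPU: {fps_single:.1f} fps, AFR pair: {fps_afr:.1f} fps")
```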

The 5970 has more transistors than GF100: 4,200 million vs. 3,100 million. So, I guess the 5970 has a lot of useless transistors, too?
Are you sure Hemlock isn't faster in other games/benchmarks? Anyway, this isn't about transistors, it's about manufacturing costs vs. performance (two 300mm² GPUs aren't more expensive than a single 450mm² one).
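The cost side is easy to sketch with a standard dies-per-wafer estimate and a simple Poisson yield model (all numbers below are illustrative assumptions, not actual TSMC figures):

```python
import math

WAFER_D = 300.0   # wafer diameter in mm
DEFECTS = 0.004   # defects per mm^2 (~0.4/cm^2), assumed

def dies_per_wafer(area_mm2: float) -> float:
    """Standard approximation, accounting for edge loss."""
    r = WAFER_D / 2
    return math.pi * r * r / area_mm2 - math.pi * WAFER_D / math.sqrt(2 * area_mm2)

def good_dies(area_mm2: float) -> float:
    return dies_per_wafer(area_mm2) * math.exp(-DEFECTS * area_mm2)  # Poisson yield

small, big = good_dies(300), good_dies(450)
print(f"good dies/wafer: {small:.0f} at 300mm2, {big:.0f} at 450mm2")
# Wafer cost cancels out of the ratio: a dual-GPU card needs two small dies.
print(f"two small dies cost ~{2 * big / small:.2f}x of one big die")
```

With these assumptions, two 300mm² dies come out around 30% cheaper than a single 450mm² die, because yield falls off faster than area grows.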
 
LOL. People are criticizing Fermi for not beating a dual-GPU card? Whatever.
As much as I'd love to have a 5970, if they can bring Fermi out at a sub-$500 price point, that's a huge cost saving over the 5970. Plus there's no SLI/Crossfire baloney to mess with.
Not to mention the blasphemy called AFR :mad:
 
How is MSAA impacted by an increased polygon count on screen?
That just shows you never analyzed the benchmark results...

MSAA shows a huge drop with grass, to the point where Cypress's "heavy tessellation" framerate is not that different from its "no tessellation and much grass" framerate.
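One toy way to see why MSAA cost blows up as triangles shrink (my own sketch with a made-up constant, not measured Heaven data): MSAA shades once per pixel in triangle interiors, but edge pixels take the per-sample path, and the edge fraction grows as triangles get smaller.

```python
import math

def edge_pixel_fraction(pixels_per_triangle: float) -> float:
    """Crude estimate: a triangle covering N pixels has a perimeter of
    roughly c*sqrt(N) pixels, so the edge fraction is ~ c/sqrt(N)."""
    c = 4.0  # fudge factor for edge thickness; an assumption, not measured
    return min(1.0, c / math.sqrt(pixels_per_triangle))

for n in (10_000, 1_000, 100, 10):
    print(f"{n:6d} px/tri -> ~{edge_pixel_fraction(n):.0%} of pixels on edges")
# As tessellation or grass shrinks triangles, MSAA degenerates toward
# supersampling: more samples shaded/resolved, bigger relative hit.
```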

And as I said in my previous edit, Heaven is far from being 100% setup bound: something unrelated to setup rate, ALU throughput, memory BW, texturing, or blending comes into play across the whole benchmark when comparing Cypress to Juniper, whose specs all differ except for their almost identical setup rates.
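A toy frame-time model makes the "shifting bottlenecks" point concrete (the workload numbers below are placeholders, not measured Heaven data):

```python
# Frame rate is limited by the slowest stage; doubling everything except
# setup rate (Juniper -> Cypress) only helps until setup becomes the limit.
def fps(setup_rate, alu_rate, bw_rate, work_setup, work_alu, work_bw):
    frame_time = max(work_setup / setup_rate,
                     work_alu / alu_rate,
                     work_bw / bw_rate)
    return 1.0 / frame_time

juniper = dict(setup_rate=1.0, alu_rate=1.0, bw_rate=1.0)
cypress = dict(setup_rate=1.0, alu_rate=2.0, bw_rate=2.0)  # same setup rate

scenes = {"shader-heavy": dict(work_setup=0.002, work_alu=0.02, work_bw=0.02),
          "heavy tess":   dict(work_setup=0.02,  work_alu=0.02, work_bw=0.02)}

for name, w in scenes.items():
    ratio = fps(**cypress, **w) / fps(**juniper, **w)
    print(f"{name}: Cypress is {ratio:.2f}x Juniper")
# shader-heavy -> 2.00x (ALU/BW bound); heavy tess -> 1.00x (setup bound)
```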
 
According to nVidia slides, GF100 is 59% faster than HD5870 in the tested scene.

According to several reviews, the HD5970 is 53-61% faster than the HD5870 in the Heaven benchmark. Even though the test and scene were chosen by nVidia, Fermi doesn't perform faster than the HD5970. Fermi is presented as a tessellation monster, but in the most tessellation-intensive benchmark it doesn't perform better than the HD5970.

Holy moly, are you serious about playing down Fermi, a single GPU, because it doesn't spank a 5970, which is dual-GPU, but "just" comes close!?
 
That just shows you never analyzed the benchmark results...

MSAA shows a huge drop with grass, to the point where Cypress's "heavy tessellation" framerate is not that different from its "no tessellation and much grass" framerate.

And as I said in my previous edit, Heaven is far from being 100% setup bound: something unrelated to setup rate, ALU throughput, memory BW, texturing, or blending comes into play across the whole benchmark when comparing Cypress to Juniper, whose specs all differ except for their almost identical setup rates.


I'm asking you how MSAA is affected... I'm looking for your analysis; just stating it doesn't mean much.

Juniper has less in every department, so the bottlenecks will automatically shift between the GPUs.
 
Razor1: Yes, that's it exactly. The doubling of triangle rate is probably the best thing about AFR. And ATi used it as an alternative solution. Cypress isn't a direct competitor of Fermi, so I see no reason why its triangle rate should be comparable to Fermi's. The competitor is Hemlock.

Are you sure Hemlock isn't faster in other games/benchmarks? Anyway, this isn't about transistors, it's about manufacturing costs vs. performance (two 300mm² GPUs aren't more expensive than a single 450mm² one).


Hmm, well, Fermi will compete in a price range somewhere near or just under Hemlock, so that gives you a good idea of what nV is going for. Marketing slides are marketing slides; we just have to wait until independent reviewers can show what you are looking for.
 
So I just read the summaries and whatnot about Fermi. I'm not the most technical person on the GPU side, more a CPU person, but from looking at the slides, Nvidia have built a quad-core GPU, no? With each core having 128 CUDA cores and a rasterizing engine.

This diagram looks eerily similar to early dual/quad-core CPU designs from AMD and later Intel. So this chip is going to be massively parallel, right? Is this the best path for Nvidia to take? From experience with parallelism in the CPU space, it has taken 6-7 years for some of the more basic elements to take advantage of having >1 core. Then again, I suppose the same questions were raised when AMD went multi-core with the Athlon.

Anyway, that is my limited understanding of what Nvidia are trying to achieve; it's probably misguided and wrong, so please correct me if that's the case. If I'm right, the design paves the way for very easy scalability: they can basically cut the chip in half and get 50% of the performance at 50% of the die size, or increase the number of 'cores' in the next iteration (32nm) to six, along with other general architectural improvements and efficiency gains, and finally stop increasing the die size while getting massive performance increases, similar to what Intel/AMD do in the CPU space.

What is the real thinking, am I completely wrong?
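For reference, the lens I'm applying from the CPU side is Amdahl's law; here's a generic sketch (nothing GPU-specific, the parallel fractions are illustrative):

```python
def amdahl_speedup(cores: int, parallel_fraction: float) -> float:
    """Amdahl's law: speedup from N cores when fraction p of the work parallelizes."""
    p = parallel_fraction
    return 1.0 / ((1.0 - p) + p / cores)

for p in (0.5, 0.9, 0.99):
    print(f"p={p}: 4 cores -> {amdahl_speedup(4, p):.2f}x speedup")
# p=0.5 -> 1.60x, p=0.9 -> 3.08x, p=0.99 -> 3.88x
```

Pixel and vertex work is embarrassingly parallel (p very close to 1 per frame), which is presumably why the software problem CPUs had doesn't really apply here.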
 
So I just read the summaries and whatnot about Fermi. I'm not the most technical person on the GPU side, more a CPU person, but from looking at the slides, Nvidia have built a quad-core GPU, no? With each core having 128 CUDA cores and a rasterizing engine.
I quite agree with that.

As for the "multicore" issue, there's nothing wrong with it, as graphics is an inherently multi-threaded task.
 
Razor1: Yes, that's it exactly. The doubling of triangle rate is probably the best thing about AFR. And ATi used it as an alternative solution. Cypress isn't a direct competitor of Fermi, so I see no reason why its triangle rate should be comparable to Fermi's. The competitor is Hemlock.


Are you sure Hemlock isn't faster in other games/benchmarks? Anyway, this isn't about transistors, it's about manufacturing costs vs. performance (two 300mm² GPUs aren't more expensive than a single 450mm² one).

Why do people insist on comparing a dual-GPU card to a single-GPU card? I'm sorry, it may work on a dollar-for-dollar basis, but beyond that, it holds no water. Most people with a decent IQ will compare it to Cypress, not Hemlock.
 
Holy moly, are you serious about playing down Fermi, a single GPU, because it doesn't spank a 5970, which is dual-GPU, but "just" comes close!?
Hemlock was designed as a direct competitor to nVidia's single-GPU high end. It's already the 3rd-generation product using this approach, so it's completely correct to compare these products. And as we see, Fermi is "just" competitive in some cherry-picked benchmarks (in fact, cherry-picked scenes of cherry-picked benchmarks), which reflect its strongest point: triangle processing.

XMAN26: So you're saying it's perfectly legitimate to compare a 6-month-old product targeting price segment A to a new product targeting price segment B?
 
Why do people insist on comparing a dual-GPU card to a single-GPU card? I'm sorry, it may work on a dollar-for-dollar basis, but beyond that, it holds no water. Most people with a decent IQ will compare it to Cypress, not Hemlock.

Most people with a decent IQ will also realise the whole story: transistors, die size, etc. Unfortunately for Nvidia, most people who don't know the whole story will only see GF100 as the second-quickest card.

Nvidia has a real problem here, because they are going to release a very late graphics card that is not the fastest available. When was the last time that happened?
 
Will this new CSAA work under OpenGL? Until now, TMSAA hasn't worked in OpenGL games. Is this going to change?

If nVidia kills the hybrid modes, I will lose massive IQ in OpenGL.
 
They could, in theory, paper-launch the single-board SLI version at the same time they launch GF100, if they can get some prototypes into reviewers' hands by then.
 