NVIDIA GF100 & Friends speculation

History is fun, eh?

Interesting take overall. :D
And the dark secret of GT200 is that it's the kludge that NVidia used, because G100 (now called GF100) was too ambitious for the end of 2007 :p

So NVidia took G80, tacked-on DP, tweaked the crappy TMUs, increased ALU:TEX, fixed the MCs to coalesce and a few other minor details.

And the CUDAholics loved it.

So GF100 gained some D3D11-loving due to its delayed implementation. Delaying it further, of course. That's all.

Jawed
 
Nvidia white papers say G100 has 512 cores too. And that GT200 has 240 cores. And that G80 has 128 cores. Nvidia white papers generally aren't reliable sources of information, unfortunately.

Oh ok, because NVIDIA calls them "CUDA Cores", that automatically invalidates everything else that was mentioned about well more than 2x improvement in terms of texture filtering performance, high AA performance, compute performance, etc vs GT200? What a convenient way to side-step the question. This is getting Beyond Silly (no pun intended). If a new graphics card has far higher efficiency than an older graphics card in several key areas, then how in the world can one claim that the architecture is "basically" the same? That makes no sense. Common sense would dictate as such, irrespective of what a snobby "computer architect" may think.
 
You don't think that AMD wrote pretty much new RTL when they went from K6 to K7? Or Intel from P5 to P6, or from P6 to P4? Or DEC from EV5 to EV6? Or IBM from Power6 to Power7? Or from Power4/5 to Power6?

Yes, every once in a while one can expect things to be refreshed. But when GPU vendors tell you 'new architecture,' it is hard to believe they have fresh RTL all over. Do you think NVIDIA wrote completely new ROP units going from 7900 to G80?

I also think that some units maintain modified versions of the same RTL base even across boundaries as big as K6->K7. I'll hazard a guess by saying that the Intel x86 decode unit might have stayed on for a few generations of new architectures.
 
Oh ok, because NVIDIA calls them "CUDA Cores", that automatically invalidates everything else that was mentioned about well more than 2x improvement in terms of texture filtering performance, high AA performance, compute performance, etc vs GT200? What a convenient way to side-step the question. This is getting Beyond Silly (no pun intended). If a new graphics card has far higher efficiency than an older graphics card in several key areas, then how in the world can one claim that the architecture is "basically" the same? That makes no sense. Common sense would dictate as such, irrespective of what a snobby "computer architect" may think.

When a company continues to repeat the same lie over and over, it tends to make one less likely to believe anything else they try to tell you.

And as a computer architect, looking at higher efficiencies isn't necessarily a good way to determine if something is a new architecture. It's not uncommon to find significant performance issues or bottlenecks after a product has shipped that can be easily fixed/enhanced with minimal resources on the next evolution of the architecture. Sometimes it's as simple as increasing the number of buffers in one place; other times it's a matter of changing an arbitration scheme. Still the same architecture, but with very different performance (toy example below).

And that's the problem with common sense: it generally isn't very common, and it generally doesn't make a whole lot of sense.
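
To make the buffer point concrete, here's a toy sketch (numbers completely made up, not modelling any real chip): a two-stage pipeline where the only thing that changes between runs is the depth of the FIFO between the stages, yet sustained throughput changes noticeably.

Code:
from collections import deque
import random

def run(fifo_depth, cycles=100_000, seed=0):
    rng = random.Random(seed)
    fifo = deque()
    completed = 0
    for _ in range(cycles):
        # Consumer: drains one item per cycle, but is only free ~70% of cycles.
        if fifo and rng.random() < 0.7:
            fifo.popleft()
            completed += 1
        # Producer: a burst of 4 items arrives on ~20% of cycles; anything
        # that doesn't fit in the FIFO is lost work (an upstream stall).
        if rng.random() < 0.2:
            for _ in range(4):
                if len(fifo) < fifo_depth:
                    fifo.append(1)
    return completed / cycles

for depth in (2, 4, 8, 16):
    print(f"FIFO depth {depth:2d}: {run(depth):.2f} items/cycle sustained")

Same "architecture" every run, just a deeper queue between the same two units, and the sustained rate climbs towards the consumer's limit.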
 
Apparently because you see a similar SIMD structure you think that this is the same architecture. Given that Fermi has the same shader execution mechanism as G80 and a similar mechanism for the shader clustering, do you not see Fermi as the same architecture?

It's curious that you picked only the things that didn't change from G80 to Fermi (in this discussion most of us have already agreed that some things rarely change) and neglected to mention the major changes in cache hierarchy, geometry processing, ECC support, the GPC modules, the SP/DP units, etc...
If you go by only the things that didn't change, then you'll have an even harder time finding the differences from RV670 to RV770...
 
So NVidia took G80, tacked-on DP, tweaked the crappy TMUs, increased ALU:TEX, fixed the MCs to coalesce and a few other minor details.

And the CUDAholics loved it.

Why wouldn't they? The result of those "minor tweaks" was still better than what the competition could produce. Or are we doing this in a vacuum now? :)
 
It's curious that you picked only the things that didn't change from G80 to Fermi (in this discussion most of us have already agreed that some things rarely change) and neglected to mention the major changes in cache hierarchy, geometry processing, ECC support, the GPC modules, the SP/DP units, etc...
If you go by only the things that didn't change, then you'll have an even harder time finding the differences from RV670 to RV770...

I don't know why you even bother asking him. He's from ATi; of course he would do that :devilish:

Disclaimer: I'm not a computer architect, nor an engineer.

My take: I see GF100 as revolutionary because of its cache structure, but mainly because of the rearrangement of the GPU itself. Hell, combining a pack of TMUs with a pack of shader cores, instead of having one big block working with all shaders, is a big overhaul in my humble view. And you have to give nVIDIA engineers kudos for their work (if it works).

In a sort of "equation", before you had ALUs+TMUs+ROPs.
Now you have ((ALUs+TMUs)*X)+ROPs.
The result is exactly the same mathematically speaking, but the form is different, boosting operations in parallel (at least in theory).
If you say it doesn't, you are basically saying, IMO, that a one-core CPU at 3 GHz would have the same performance as a dual-core at 1.5 GHz (ignoring differences in architecture between the two).

I know this is a weird, bad analogy, but I think you can understand what I mean...
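
For what it's worth, the totals really do come out identical either way; here's the trivial check (the unit counts are just the GF100 paper numbers, 16 SMs with 32 ALUs and 4 TMUs each plus 48 ROPs, so treat them as assumptions, not measurements):

Code:
# Quick sanity check of the "same totals, different grouping" idea above.
NUM_SMS, ALUS_PER_SM, TMUS_PER_SM, NUM_ROPS = 16, 32, 4, 48

# Flat view: one pool of ALUs + one pool of TMUs + ROPs
flat = NUM_SMS * ALUS_PER_SM + NUM_SMS * TMUS_PER_SM + NUM_ROPS

# Grouped view: (ALUs + TMUs) per cluster, times the cluster count, plus ROPs
grouped = NUM_SMS * (ALUS_PER_SM + TMUS_PER_SM) + NUM_ROPS

print(flat, grouped)   # 624 624 -- identical on paper
assert flat == grouped
# The difference isn't in the totals but in locality/contention: a single
# shared pool of 64 TMUs would have to be arbitrated between all 16 clusters,
# while 4 TMUs sitting inside each SM are always local to it.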
 
Why wouldn't they? The result of those "minor tweaks" was still better than what the competition could produce. Or are we doing this in a vacuum now? :)

Might be, might not be. We still only have paper specs with no idea how it really performs and whether it is indeed better or not.

And if it takes 3-4 years for it to manifest, we're back to comparing X19xx versus 7900, where arguments raged over whether ATI's higher ALU ratio was better or not. Turns out it probably "was" better, but it didn't really manifest until 4+ years after it was made.

Regards,
SB
 
Chill out, guys. GF100 is not a revolution. But this can be: http://www.marketwatch.com/story/su...et-in-q2-2010-2010-03-10?reflink=MW_news_stmp GF100 is nice on paper. Revolutionary? Blah blah blah... I say it's late, and yields scare every green man on the planet. And speaking of green men... it sucks too much power. GF100 is not revolutionary at all.

It's not just about architecture or performance. You have to take into account all the other important factors, mainly power efficiency and how manufacturable it is.

So don't say it's revolutionary. The whole product is not revolutionary, it's just an evolution, and there are some big problems with it.

OK, let's make a wireless energy transmitter.......... Wow, we've got it! ......... It's a revolution! Nobody will use it coz it's not as wireless as we thought, but it's revolutionary!!! :rolleyes:
 
But is it more efficient? Is it better than 1½ x Cypress?

That would greatly depend on what tests you throw at it. In DP math it could very well be more efficient, whereas in raw bilinear texturing throughput I don't see much chance for Fermi.
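
Just to put rough numbers on that, here's the back-of-envelope math from the paper specs. The GF100 clocks and the half-rate (Tesla-class) DP below are my own assumptions; the Cypress figures are the published ones.

Code:
# Paper-spec throughput sketch; GF100 clocks are assumed, not announced.
def sp_gflops(alus, hot_clock_mhz):
    return alus * 2 * hot_clock_mhz / 1000.0   # MAD/FMA counted as 2 flops

def gtexels_per_s(tmus, core_clock_mhz):
    return tmus * core_clock_mhz / 1000.0      # one bilinear texel per TMU per clock

# Cypress (HD 5870): 1600 ALUs @ 850 MHz, DP at 1/5 of SP, 80 TMUs @ 850 MHz
cypress_dp  = sp_gflops(1600, 850) / 5         # ~544 GFLOPS DP
cypress_tex = gtexels_per_s(80, 850)           # ~68 GTexels/s

# GF100 (full 512-core part): assumed 1400 MHz hot clock / 700 MHz core clock,
# DP at 1/2 of SP (Tesla-class rate), 64 TMUs
fermi_dp  = sp_gflops(512, 1400) / 2           # ~717 GFLOPS DP
fermi_tex = gtexels_per_s(64, 700)             # ~44.8 GTexels/s

print(f"DP : GF100 {fermi_dp:.0f} vs Cypress {cypress_dp:.0f} GFLOPS")
print(f"TEX: GF100 {fermi_tex:.1f} vs Cypress {cypress_tex:.1f} GTexels/s")

So on those (assumed) numbers Fermi could take the DP crown depending on clocks, while the raw bilinear rate stays comfortably with Cypress.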
 

LOL! I wonder why that is revolutionary... Oh! I know! Because it's AMD/ATI.. It has to be! :LOL:

Psst.. Turn the red light off a bit and look here: http://www.nvidia.com/object/realityserver_tesla.html. The green side is doing the same thing. There is nothing revolutionary about it.

It is something that has been promised by a lot of companies for years. Did it materialise? No! You are still constrained by limited internet bandwidth, which you can't control. So is it manufacturable? Yes. Is it marketable? No. Yawn.

Ironically, it has the same fate as the thing you were critical of:

OK, let's make a wireless energy transmitter.......... Wow, we've got it! ......... It's a revolution! Nobody will use it coz it's not as wireless as we thought, but it's revolutionary!!!

Please just stop spewing Charlie's words. He is also on this forum, you know. No need to evangelize.
 
LOL! I wonder why that is revolutionary... Oh! I know! Because it's AMD/ATI.. It has to be! :LOL:

I just wanna be sure... Have you noticed the "can be" in my post? "But this can be: http..."

So LOL to you...

That supercomputer brings an opportunity to do something revolutionary (in the cloud). But some people here said that GF100 actually IS revolutionary.

Read it twice before bashing.

What I said was that GF100 as a whole IS NOT revolutionary. The architecture, maybe. GF100 with all those problems in mind, absolutely not.
 
It's curious that you picked only the things that didn't change from G80 to Fermi (in this discussion most of us have already agreed that some things rarely change) and neglected to mention the major changes in cache hierarchy, geometry processing, ECC support, the GPC modules, the SP/DP units, etc...
If you go by only the things that didn't change, then you'll have an even harder time finding the differences from RV670 to RV770...
No, it's not curious. I did that because that is exactly what you are doing where AMD architectures are concerned. I can give you an equally long list of things that changed from R6xx->RV7xx (and again from RV7xx->Evergreen), so why do you view RV7xx as not a new architecture but Fermi as a new one?
 
That would greatly depend on what tests you throw at it. In DP math it could very well be more efficient, whereas in raw bilinear texturing throughput I don't see much chance for Fermi.

GF100 was designed much more for compute than for graphics. When they announced the Fermi Tesla architecture in a panic after RV870, they could only say something about the L2 cache, DP, ECC and the CUDA cores.
They needed to sacrifice something to fit into the 3+ billion transistors and ended up being limited by the size anyway (clocks, heat).
Without the PR settings (2560x1920 resolution, 8xAA and PhysX on), it could end up being much closer to the GTX 285 than the Radeon 4870 is to the 5870.
We'll need to wait till March 26 to find out.
 