NVIDIA GF100 & Friends speculation

Can AMD get its act together? Will AMD be sending out a Fermi Competitive Reviewer's Guide document on how to test ATI cards?

Jawed

Stay tuned for the next exciting episode of Time-to-Market and The-Rich-Feature, only here, on your favourite GPU soap opera source, B3D :D

SB: I think Fermi definitely has some Rampage tech. GigaPixel too. Oh, and it's pretty clear that the incredibly awesome, never-before-seen, utterly bombastic tessellation performance is due to them using Sage... think of it, that's why it's ~3 billion transistors!
 
Yeah, I really don't know why people keep carrying on about the compute changes when the graphics side got a much bigger overhaul.
FUD?

Honestly, sometimes I get the feeling that those who are playing it up understand neither graphics, nor compute, nor hardware....
 
No, it's not curious. I did that because that is exactly what you are doing where AMD architectures are concerned. I can give you an equally long list of things that changed from R6xx->RV7xx (and again from RV7xx->Evergreen), so why do you view RV7xx as not a new architecture and Fermi a new one?
Maybe because you drew the wrong lines in block diagrams?

But I'd love to see those lists too. :)
 
Fermi may or may not be revolutionary, that remains to be seen. But it is clearly, 100%, no doubt about it, a new architecture.

No one on this thread has mentioned yet that Fermi has a brand new instruction set. In other words: programs for G80/GT200 were binary compatible, but you *must* recompile for Fermi, because it has a completely different ISA. Obviously that has some drawbacks - Nvidia had to create a new compiler for Fermi, and existing CUDA applications must be recompiled. In this regard, Fermi is more of an architectural change than *any* new x86 processor during the past 25 years.

One huge change in the ISA was the move to a load-store architecture with a unified address space. G80/GT200 had separate instructions for loading data from shared memory versus global memory. This had some hardware benefits, but made programming more difficult, since the compiler had to distinguish between pointers to shared memory and pointers to global memory, and issue completely different instruction sequences for loads and stores to those two places. Fermi unifies the address space, which enables its configurable caches and makes compilation of general purpose programs much easier.
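The compiler burden described above can be sketched with a toy example (Python standing in for the real compiler; the opcode names are illustrative, loosely modeled on PTX, not actual NVIDIA tooling):

```python
# Toy illustration: why separate address spaces complicate compilation.
# On G80/GT200 the compiler had to emit a different load opcode depending
# on which memory a pointer referenced; a Fermi-style unified address
# space needs only one generic load.

def compile_load_g80(pointer_space):
    """Pre-Fermi style: the address space must be known at compile time."""
    opcodes = {"shared": "ld.shared", "global": "ld.global"}
    if pointer_space not in opcodes:
        # A pointer whose space can't be proven statically is a problem:
        # no single instruction works for both memories.
        raise ValueError(f"cannot compile load: unknown space {pointer_space!r}")
    return opcodes[pointer_space]

def compile_load_fermi(pointer_space):
    """Fermi style: one generic load; hardware resolves the space at run time."""
    return "ld"

print(compile_load_g80("shared"))    # ld.shared
print(compile_load_fermi("shared"))  # ld
```

In the pre-Fermi model, a generic pointer whose target memory the compiler cannot prove forces awkward workarounds; in the unified model the question simply disappears.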

Those of you that insist Fermi is just a minor rework of GT200 are (perhaps willfully) misinformed. For those of you that are curious, you can read what Dave Patterson (who wrote the book on computer architecture) says about Fermi: Dave Patterson on Fermi
 
No one on this thread has mentioned yet that Fermi has a brand new instruction set. In other words: programs for G80/GT200 were binary compatible, but you *must* recompile for Fermi, because it has a completely different ISA.
Only if you chose to forego forward compatibility.
 
Only if you chose to forego forward compatibility.

Sadly, you're incorrect. If you have written any code that depends on .cubin, you will need to recompile. PTX is an intermediate bytecode that can be automatically recompiled by the driver, but if you don't have it because you've been using .cubin, you must recompile to use Fermi. The details are in this guide: Fermi Compatibility Guide
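The rule can be sketched as a toy decision procedure (hypothetical data structures, not the real CUDA driver API): a fat binary that still embeds PTX can be JIT-compiled by the driver for a new ISA, while a bare .cubin built for an older ISA cannot be reused.

```python
# Toy sketch of the compatibility rule: exact-ISA binary -> load it;
# embedded PTX -> driver JIT-compiles; neither -> developer must recompile.

def jit_compile(ptx, target_isa):
    return f"SASS({target_isa}) from {ptx}"

def load_module(fatbin, target_isa):
    """Return runnable machine code for target_isa, or fail."""
    if target_isa in fatbin.get("cubin", {}):
        return fatbin["cubin"][target_isa]              # exact binary match
    if "ptx" in fatbin:
        return jit_compile(fatbin["ptx"], target_isa)   # driver JIT path
    raise RuntimeError("no PTX embedded: developer must recompile")

old_app_with_ptx   = {"cubin": {"sm_13": "SASS(sm_13)"}, "ptx": "kernel.ptx"}
old_app_cubin_only = {"cubin": {"sm_13": "SASS(sm_13)"}}

print(load_module(old_app_with_ptx, "sm_20"))  # JIT-compiled for the new ISA
# load_module(old_app_cubin_only, "sm_20")     # would raise: recompile needed
```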
 
Sadly, you're incorrect. If you have written any code that depends on .cubin, you will need to recompile. PTX is an intermediate bytecode that can be automatically recompiled by the driver, but if you don't have it because you've been using .cubin, you must recompile to use Fermi. The details are in this guide: Fermi Compatibility Guide
No, you misunderstand me. .cubin is opt-in, and Nvidia makes it perfectly clear that recompiles will be needed.

You might need .cubin for your needs, but it is by no means a hard requirement for the entire world. If PTX works for you, then recompiles are not needed.
 
No, you misunderstand me. .cubin is opt-in, and Nvidia makes it perfectly clear that recompiles will be needed.

You might need .cubin for your needs, but it is by no means a hard requirement for the entire world. If PTX works for you, then recompiles are not needed.

This discussion is about the architecture, is it not? The architecture does not execute PTX. It executes .cubin.

The recompilation must be performed somewhere, either by the developer (my situation, since I use .cubin), or by the driver's PTX->CUBIN JIT compiler.

The point being: Fermi has a completely different instruction set architecture than G80/GT200. That's about as far from a simple evolutionary change as you can get.
 
The recompilation must be performed somewhere, either by the developer (my situation, since I use .cubin), or by the driver's PTX->CUBIN JIT compiler.
Not a big deal.
The point being: Fermi has a completely different instruction set architecture than G80/GT200. That's about as far from a simple evolutionary change as you can get.
Excellent point.
 
No one on this thread has mentioned yet that Fermi has a brand new instruction set. In other words: programs for G80/GT200 were binary compatible, but you *must* recompile for Fermi, because it has a completely different ISA. Obviously that has some drawbacks - Nvidia had to create a new compiler for Fermi, and existing CUDA applications must be recompiled. In this regard, Fermi is more of an architectural change than *any* new x86 processor during the past 25 years.
I'm glad you qualified it with "in this regard", because while the ISA of x86 hasn't been utterly revamped in the last 25 years, there were certainly points where you could say (or rather, must say) the new generation of CPUs was utterly revamped (for instance, Pentium => Pentium Pro).

In any case, why does it matter so much whether a design is "new" or not? Let's just see (for those of us who don't get to know beforehand) what GF100 can do and what ATI's response can do.
 
The point being: Fermi has a completely different instruction set architecture than G80/GT200.
Point taken, but imho that's not really a good point to illustrate that it's a new architecture (not that I dispute it has big changes compared to GT200). The uops of x86 CPUs can also change (though afaik they don't change that much until there's a really radical change). Of course this isn't user-visible, but that's just because it needs to stay compatible; hence the x86->uops translation is done by the CPU itself, whereas the GPU basically hands this task off to the driver.
 
Can AMD get its act together? Will AMD be sending out a Fermi Competitive Reviewer's Guide document on how to test ATI cards?

Jawed

I'm as big an ATI fanboy as they come, but if they aren't prepared for Fermi then serious questions need to be asked as to why.

I'm expecting not only the 2GB 5870 to be tested alongside Fermi, but also their partners' best-effort non-reference cards too. That's 1GHz 5870s and PCI-E-bursting 5970s.

ATI should have this planned to perfection, with a clear message to their partners that they not only want to see Fermi lose, but lose hard.

I want to see AnandTech and Tom's Hardware benchmarks with the GTX 480 coming in 4th or worse, or I won't be happy. :p
 
GF100 was designed much more for compute than for graphics. When, in a panic after RV870, they announced the Fermi Tesla architecture, they could only talk about the L2 cache, DP, ECC and the CUDA cores.
They needed to sacrifice something to fit into the 3+ billion transistors and ended up being limited by the size anyway (clocks, heat).
Without PR settings of 2560x1920 resolution, 8xAA and PhysX on, it could end up much closer to the GTX 285 than the Radeon 4870 is to the 5870.
We need to wait until March 26 to find out.

You do realize that G100 was originally supposed to take the spot of the GT200, right? GT200 was the reactionary part because G100 wasn't going to make it.
 
Depends on how you define it, I guess. If for each instruction the program is only exposed to a scalar unit, then for all intents and purposes it's scalar. It doesn't matter if some other thread is running in parallel on the other scalar thingamajig next to it. The way I look at it, if there is no requirement for a single warp to occupy both the ALU and SFU units in a given cycle to achieve maximum occupation, then it's scalar.

If you're talking about the fact that it's SIMD, well, is that really a useful argument when comparing GPU architectures? They're all SIMD, so it's sorta irrelevant.

Probably a more interesting way to talk about things is as AoS vs. SoA from a hardware-execution perspective, rather than scalar/parallel/VLIW.
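For anyone unfamiliar with the terms, here is a minimal sketch of the AoS vs. SoA distinction (illustrative only): the same three 2D points laid out both ways in memory.

```python
# Array of Structures vs. Structure of Arrays: identical data,
# different memory order. SIMD hardware running one element per lane
# prefers the SoA order, since each lane streams the same field.

points = [(1.0, 4.0), (2.0, 5.0), (3.0, 6.0)]

# AoS: the fields of one element are adjacent -> x0 y0 x1 y1 x2 y2
aos = [coord for point in points for coord in point]

# SoA: the same field across all elements is adjacent -> x0 x1 x2 y0 y1 y2
soa = [p[0] for p in points] + [p[1] for p in points]

print(aos)  # [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
print(soa)  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```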
 
Point taken, but imho that's not really a good point to illustrate that it's a new architecture (not that I dispute it has big changes compared to GT200). The uops of x86 CPUs can also change (though afaik they don't change that much until there's a really radical change). Of course this isn't user-visible, but that's just because it needs to stay compatible; hence the x86->uops translation is done by the CPU itself, whereas the GPU basically hands this task off to the driver.

Yes, x86 micro-ops can be changed without changing the ISA. But changing the micro-ops substantially requires a new microarchitecture, which Fermi is. The changes in Fermi's ISA were not done for cosmetic reasons. Nvidia didn't commit itself to writing a new compiler for Fermi just for fun, instead it was required due to a deep rethinking of what the processor should do. The claims that Fermi is just a warmed over GT200 are ludicrous.
 
:LOL:Offtopic: poor ATI, they can't claim even their tessellator is new - they have had it in silicon for generations! And if it is indeed new: were previous tessellators so bad they had to start from scratch? /OT
DX11 has a different tessellation pattern so the tessellator is new and the reason had nothing to do with performance. This was discussed in another thread at some point.
 
You do realize that G100 was originally supposed to take the spot of the GT200, right? GT200 was the reactionary part because G100 wasn't going to make it.
When G80 was introduced in Nov '06, some Nvidia CUDA guy announced, in a rare act of revealing something about future roadmaps, that they would break the teraflop barrier by end of '07. (I'm sure they've regretted that ever since.)

Do tell: what kind of chip was G100 supposed to be?

Either they were expecting to release GF100 by the end of '07, which would mean that the GF100 schedule has slipped by an astonishing 27 months.

Or this was a pre-announcement of a GT200 that was already in the making, which slipped by a fairly pedestrian 6 months instead. Which basically puts your brilliant realization to rest.

The whole premise that they intended to release a Fermi-like chip within a year of G80 doesn't pass the first smell test. There is nothing that points to some intermediate architecture refresh that's not GT200.
 