NVIDIA Fermi: Architecture discussion

L2 is coherent with itself; there is a single L2 for each memory bus ... there can only ever be one copy.

That's not a cache coherency scheme; that is simply caching.
I've just finished reading RWT's preview. I previously thought that they had fully coherent L1 caches per SM. Wishful thinking...
 
I've just finished reading RWT's preview. I previously thought that they had fully coherent L1 caches per SM. Wishful thinking...
What would it really get you anyway? Fine-grained communication should be done with messages, and coarse-grained communication can go through the L2 (latency is not an issue for coarse-grained communication, and it has plenty of bandwidth). Full coherency is a waste of transistors :p
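
For what it's worth, here's a minimal CUDA sketch of the kind of coarse-grained, through-L2 communication I mean (the kernel and all the names are my own, not anything NVIDIA has shown): each block does its fine-grained work in shared memory, then publishes a single value to global memory, which the Fermi L2 backs. The float atomicAdd on global memory it relies on is itself a compute capability 2.0 (i.e. Fermi) feature.

```cuda
#include <cstdio>

// Hypothetical sketch of coarse-grained communication through global
// memory (served by the L2 on Fermi): each block reduces its chunk in
// shared memory, then publishes one value with a single atomic add.
__global__ void blockSum(const float* in, float* out, int n)
{
    __shared__ float partial[256];          // assumes 256 threads per block
    int tid = threadIdx.x;
    int i   = blockIdx.x * blockDim.x + tid;

    partial[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();

    // Tree reduction within the block (fine-grained, on-chip).
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) partial[tid] += partial[tid + s];
        __syncthreads();
    }

    // One atomic per block: the coarse-grained part.
    if (tid == 0) atomicAdd(out, partial[0]);
}

int main()
{
    const int n = 1 << 20;
    float* h_in = new float[n];
    for (int i = 0; i < n; ++i) h_in[i] = 1.0f;

    float *d_in, *d_out;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, sizeof(float));

    blockSum<<<(n + 255) / 256, 256>>>(d_in, d_out, n);

    float sum = 0.0f;
    cudaMemcpy(&sum, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    printf("sum = %.0f (expected %d)\n", sum, n);

    cudaFree(d_in);
    cudaFree(d_out);
    delete[] h_in;
    return 0;
}
```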
 
In layman's terms:

R600 - 320SP - 720 million - 2.25 MT per SP (Added D3D10)
RV670 - 320SP - 666 million - 2.08 MT per SP (Added D3D10.1 & lowered memory bus width)
RV770 - 800SP - 956 million - 1.19 MT per SP
Cypress - 1600SP - 2.15 billion - 1.34 MT per SP (Added D3D11)

G80 - 128SP - 686 million - 5.36 MT per SP (Added D3D10)
G200 - 240SP - 1.4 billion - 5.83 MT per SP (Added double precision)
Fermi - 512SP - 3 billion - 5.85 MT per SP (Added more CUDA + D3D11 & lowered memory bus width)

Since we don't have any performance numbers, purely from this perspective it doesn't look like Fermi is Nvidia's RV770, but their jump still looks more efficient.
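
(For reference, the "MT per SP" column is just the transistor count in millions divided by the SP count; a trivial host-side sketch reproducing the figures, with the numbers hard-coded from this post:)

```cuda
#include <cstdio>

// Host-only arithmetic behind the "MT per SP" column above:
// transistors (in millions) divided by shader processor count.
struct Chip { const char* name; float mtrans; int sps; };

int main()
{
    Chip chips[] = {
        {"R600",     720.0f,  320}, {"RV670",   666.0f,  320},
        {"RV770",    956.0f,  800}, {"Cypress", 2150.0f, 1600},
        {"G80",      686.0f,  128}, {"G200",   1400.0f,  240},
        {"Fermi",   3000.0f,  512},
    };
    for (int i = 0; i < 7; ++i)
        printf("%-8s %.2f MT per SP\n",
               chips[i].name, chips[i].mtrans / chips[i].sps);
    return 0;
}
```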
 
Has NVIDIA posted any information on GRAPHICS-related improvements?

DX11 is an improvement, no? ;)

And this is a "graphics" card; it has a DVI port (unlike their Tesla GPGPU counterparts):

[image: nvidia_fermi_tesla_card_gputc.jpg]


Since this was never meant to be a product "launch", not even a "paper-launch", talking about that three months before the actual unveiling would sort of kill the "buzz" in the media.
I'm surprised even an official die photo got in there, since not even AMD treated us to one from the "Cypress"/RV870/HD5870 core.
 
DX11 is an improvement, no? ;)

And this is a "graphics" card; it has a DVI port:

Since this was never meant to be a product "launch", not even a "paper-launch", talking about that three months before the actual unveiling would sort of kill the "buzz" in the media.
I'm surprised even an official die photo got in there, since not even AMD treated us to one from the "Cypress"/HD5870 core.

DX11 is an improvement, but one that everyone expected.

BTW, I'm not suggesting that they reveal frame rates or even images/videos from games. I'm just wondering if the hardware has been modified or any hardware has been added that will benefit graphics. For example, there is the rumor that it does not have a hardware tessellation unit. That is something that could impact graphics.
 
At this point, I think fixed-function graphics features are essentially maxed out. Yes, there are marginal improvements to be made to texture filtering or antialiasing, but it's a high cost for a small gain. As things have become more general, the future of GPU graphics improvement is essentially just upping performance and eliminating pathological use cases that might choke it.

They might improve AF or AA slightly, but those just eliminate annoying artifacts from fewer and fewer edge cases; they won't improve lighting.

We've kind of reached a 'boring' stage in GPU evolution where the introduction of new fixed-function blocks may recede. I'm not sure what the long-term prospects are for fixed-function tessellation, for example (despite Charlie's aggressive shilling for it). Geometry amplification is a wide-open field that encompasses compression and procedural synthesis.

From a developer standpoint, this card looks sweet: exceptions, vtable-dispatch support, debugging in a real IDE while stepping through C++ on the GPU. But it comes at a cost, which is density. NVidia may be following SGI into eventual decline, losing the consumer/workstation market on cost and targeting HPC, which is a niche.
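
To make the vtable-dispatch point concrete, here's a minimal sketch of device-side virtual calls as they work from compute capability 2.0 onward (class and kernel names are mine, purely illustrative). The key restriction is that objects whose virtual methods are called on the GPU have to be constructed in device code, so their vtable pointers refer to device addresses.

```cuda
#include <cstdio>

// Illustrative only: device-side virtual dispatch, available from
// compute capability 2.0 (Fermi) onward.
struct Shape {
    __device__ virtual float area() const = 0;
};

struct Circle : public Shape {
    float r;
    __device__ Circle(float r_) : r(r_) {}
    __device__ virtual float area() const { return 3.14159f * r * r; }
};

struct Square : public Shape {
    float s;
    __device__ Square(float s_) : s(s_) {}
    __device__ virtual float area() const { return s * s; }
};

__global__ void areas(float* out)
{
    // Objects are constructed on the device so their vtables point at
    // device code; calls through the base-class pointer use virtual dispatch.
    Circle c(2.0f);
    Square q(3.0f);
    const Shape* shapes[2] = { &c, &q };
    for (int i = 0; i < 2; ++i)
        out[i] = shapes[i]->area();
}

int main()
{
    float* d_out;
    float  h_out[2];
    cudaMalloc(&d_out, 2 * sizeof(float));
    areas<<<1, 1>>>(d_out);
    cudaMemcpy(h_out, d_out, 2 * sizeof(float), cudaMemcpyDeviceToHost);
    printf("circle %.2f, square %.2f\n", h_out[0], h_out[1]);
    cudaFree(d_out);
    return 0;
}
```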
 
For example, there is the rumor that it does not have a hardware tessellation unit. That is something that could impact graphics.

Whether tessellation is done in hardware or software won't affect graphics quality, only performance, and that's only if you're bottlenecked by tessellation and if your desired tessellation fits the hardware.
 
At this point, I think fixed-function graphics features are essentially maxed out. Yes, there are marginal improvements to be made to texture filtering or antialiasing, but it's a high cost for a small gain. As things have become more general, the future of GPU graphics improvement is essentially just upping performance and eliminating pathological use cases that might choke it.

From a developer standpoint, this card looks sweet: exceptions, vtable-dispatch support, debugging in a real IDE while stepping through C++ on the GPU. But it comes at a cost, which is density. NVidia may be following SGI into eventual decline, losing the consumer/workstation market on cost and targeting HPC, which is a niche.

How could exceptions, vtable dispatch support, debugging in a real IDE, and so forth impact graphics?

Also, I read somewhere that ray tracing could be improved in this GPU. Do you see anything that could indicate this?
 
For example, there is the rumor that it does not have a hardware tessellation unit. That is something that could impact graphics.

Pure software emulation is certainly not on their minds if it significantly impacts actual 3D performance, but it is conceivable that this is one area where there will be compromises in the level of hardware implementation, much like the ones Nvidia made with G80, when Geometry Shader performance (part of the DX10.x spec) took a back seat to the rest.
This time, with this much programmability on-chip (assuming for a moment the scenario described above), the quality of the OS driver and compiler is paramount, and could change the tide for a particular game pretty dramatically.
 
So, to those who understand: this new chip, is it
Yay, Meh, or Wtf?

I don't understand, but I think it's :oops: for GPGPU computing... and :cry: for gaming (in terms of performance/mm^2 against Cypress).
But we will see...
For sure, it's a nice step forward from GT200's architecture... but where is Nvidia stepping? :D
 
This seems to be a new FUD meme, that this card won't run games fast. With the exception of special hardware for ECC and double precision, practically everything else will help games, if you accept the messaging that AMD/ATI has been selling since the R600, which is that ALU:TEX ratios are going up and shader power is therefore a big factor in gaming.

What may not be as important is the stuff for C++, since you can always write non-OO code and avoid indirect dispatch (at a cost). That's extra transistors for development convenience.
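
To illustrate that non-OO alternative (again just a sketch with made-up names): carry a type tag and switch on it instead of calling through a vtable. Plain structs like this can also be filled in on the host and copied over, which objects relying on device-side vtables can't be; the cost is maintaining the tag and the switch by hand.

```cuda
#include <cstdio>

enum ShapeKind { kCircle, kSquare };

struct ShapeRec {
    ShapeKind kind;
    float     dim;   // radius for circles, side length for squares
};

// Tag-based dispatch: a direct call plus a switch, no vtable lookup.
__device__ float area(const ShapeRec& s)
{
    switch (s.kind) {
        case kCircle: return 3.14159f * s.dim * s.dim;
        case kSquare: return s.dim * s.dim;
    }
    return 0.0f;
}

__global__ void areasByTag(const ShapeRec* shapes, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = area(shapes[i]);
}

int main()
{
    // Plain-old-data records can be built on the host and copied over.
    ShapeRec h_shapes[2] = { {kCircle, 2.0f}, {kSquare, 3.0f} };
    ShapeRec* d_shapes;
    float*    d_out;
    float     h_out[2];

    cudaMalloc(&d_shapes, sizeof(h_shapes));
    cudaMalloc(&d_out, 2 * sizeof(float));
    cudaMemcpy(d_shapes, h_shapes, sizeof(h_shapes), cudaMemcpyHostToDevice);

    areasByTag<<<1, 2>>>(d_shapes, d_out, 2);

    cudaMemcpy(h_out, d_out, 2 * sizeof(float), cudaMemcpyDeviceToHost);
    printf("circle %.2f, square %.2f\n", h_out[0], h_out[1]);

    cudaFree(d_shapes);
    cudaFree(d_out);
    return 0;
}
```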

The lack of information about TMUs/ROPs doesn't mean they suddenly gimped the card. They wouldn't have included a 384-bit memory bus if they were going to gimp on texturing or fillrate. I'd wait and see. Then again, AMD had a massive 512-bit bus on the R600, but totally skimped out on TMUs.
 
Yeah, based on the information available, I'd estimate it will be at least ~45% faster than the HD5870, which seems reasonable for a ~40% larger die.
 
That was a tech day, not aimed at the gaming consumer. For those with an interest in what they showed, it was an awesome day; for the others ... well, wait till the gamer launch comes.
 