NVIDIA Fermi: Architecture discussion

According to this interview they can: http://www.pcgameshardware.de/aid,6...h-modularen-Aufbau-moeglich/Grafikkarte/News/

There is speculation that half of the multiply logic could be saved by cutting DP.

That seems to be a big issue for maintaining compatibility amongst cards. If there are cards with no DP, then it makes things a hell of a lot harder for developers - you end up with all sorts of ugly checks to find feature sets...ironically just like a CPU.

What happens if code is written assuming DP and then runs on a card with no DP? Does the JIT just emit an error?

David
 
To be honest I'd be surprised if that's the case. In every example of yields I can think of regarding Nvidia, it's traditionally been the ROPs that came up defective. As the GT200 architecture evolved, we still got parts with disabled ROPs/memory but a fully fledged GT200 from the TMU/shader PoV.

The fact that Nvidia has fully functioning shader cores and hasn't yet shown the graphics portion of its tech makes me believe we'll see a similar trend.

Defects are generally randomly distributed and impact all parts of the design. The only way that some areas of the chip would be more defect prone is if they were designed without following design rules. Usually you test and verify the hell out of those blocks and work with the foundry and EDA vendor to make sure it's going to be OK, but there's still some elevated risk.

I think what you are trying to say is that it's easier to handle/hide problems in TMU/Shaders, which is a function of the SW interfaces, microcode, etc. etc.

David
 
While we're just throwing out random numbers, I'd say 640 ALUs, 160 TMUs & 64 ROPs. Of course, they would then have to get rid of the other silicon dedicated to GPGPU and highly optimize their ALUs/TMUs. However, as DemoCoder mentioned, this was probably not the best time for them to do it, so I'll still be pinning my hopes on such an optimization in the refresh.

We're basically saying the same thing then; it wouldn't fit the 3+B transistor budget unless they excluded the added computing functionality. But how much of that would they actually remove when you build, from the get-go, ALUs that are capable of both SP and DP, and not just at a mediocre ratio but at 2:1? While it's just one example, I have a hard time imagining that things like that come for free. Instead of developing two different architectures for two different target markets (which would add quite a bit in resources, and the R&D cost for HPC would hardly ever get amortized), they obviously tried to get the best possible for "both worlds".

If they managed to reach up to 2x GT200 gaming performance (which remains to be seen), they haven't really missed the typical doubling of performance IHVs target with each generation. If they haven't, then of course the picture changes dramatically; and while we're at it, GT200 wasn't really a new generation now, was it?

Not sure what you are implying there.
I should have been clearer then; there was a news blurb from Fudo where someone from AMD admitted that you can't change much on short notice. One of the messages many read out of that was that NV went for deeper architectural changes with GF100 compared to AMD, and the latter couldn't do anything about it.

What I'm implying is that by the time NVIDIA realised AMD was not going to miss its target despite the 40nm problems, and that they would inevitably arrive later than AMD, it was just too damn late to pull a performance part ahead of the high-end part.

The usual kind? Hard launch.
If it's a hard launch and the difference between the two is in the two-months-or-more range, then it's truly not such a dramatic delay. But it has to be a real hard launch, which remains to be seen.

Fermi needs to be better for graphics than the competition, not just Nvidia's previous generation.
Although it wasn't in answer to my post, allow me to bounce back to the first paragraph (still following my reasoning in my earlier posts): in order to achieve that, do they absolutely have to have twice the number of TMUs or ROPs, as an example, or would it have been wiser to increase efficiency in the existing ones? Because if it's the latter, they wouldn't need e.g. 160 TMUs as you're suggesting while revamping them at the same time.

To avoid misunderstandings, look at RV770's ROPs vs. RV670/600 as an example: 16 in all of them, and the "but" I leave for you to fill in.

NVIDIA still has to fill in the blanks: the question marks as to what exactly they've done in terms of 3D. Frankly, when I see an IHV going to so much trouble to revamp (let me call it) every computing transistor, it sounds pretty idiotic that each and every 3D-related transistor would have remained unchanged. I'm not saying or implying that it's one way or the other; for the time being that space has been left blank in my mind, that's all.
 
http://www.brightsideofnews.com/news/2009/10/2/colfax-intl-shows-worlds-first-8gpu-box-8tflops!.aspx

When it comes to the GPUs, eight 55nm Tesla C1060 boards will get you 8.8 TFLOPS [the 55nm Teslas are clocked a bit higher than the old 65nm C1060s, so you get 1.05-1.1 TFLOPS per board]. When it comes to the computing power of future Tesla cards, the worst-case scenario is 4.99 TFLOPS of IEEE 754-2008 spec double-precision power, i.e. 9.98 TFLOPS of single-precision power [note: we got this info ourselves, not from Mike].

Assuming he's referring to Fermi, that puts clocks at ~1200MHz. Though he could simply be pulling that number out of his nether regions.
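For what it's worth, here's the back-of-the-envelope math behind that clock estimate. It assumes the 4.99 TFLOPS figure covers the whole 8-GPU box, that the Fermi board ships with all 512 SP cores active, and that DP FMA runs at half the SP rate:

\[
\frac{4.99\ \text{TFLOPS}}{8\ \text{GPUs}} \approx 0.62\ \text{TFLOPS DP per GPU}, \qquad
\text{shader clock} \approx \frac{0.62 \times 10^{12}\ \text{FLOPS}}{256\ \text{DP FMA/clk} \times 2\ \text{FLOP/FMA}} \approx 1.2\ \text{GHz}
\]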
 
If it's not late, then whoever planned it to miss the Windows 7 launch and miss the holiday shopping season should be fired.
I don't know why everybody expects Win7 to have an impact on graphics card sales. I still think that impact will be closer to zero.
As for the holiday season: high-end video cards aren't bought by season, they're bought to play games, and that's what matters most. Most people will wait for DX11 games, for DX9/10 games that won't run well enough on current hardware, and for NV's DX11 solution before deciding who's better. They don't give a damn about holiday seasons.

From Anandtech: "I asked two people at NVIDIA why Fermi is late; NVIDIA's VP of Product Marketing, Ujesh Desai and NVIDIA's VP of GPU Engineering, Jonah Alben. Ujesh responded: because designing GPUs this big is "fucking hard"."
It all depends on what he asked. He might have asked why it's latER than the competition. The answer seems to imply that question, because a GPU can only be "this big" in comparison to another GPU.

Nvidia acknowledged it's late, end of story. Now unless you know more about Nvidia's internal deadlines than Nvidia themselves, you can't say otherwise.
See above. Anand's words are Anand's words, not NVIDIA's.

I think it's quite clear who feeds Fuad, so yes, I'll trust that bit of news.
I think it's quite clear that Fuad doesn't understand half of what he's being fed with.

Fermi needs to be better for graphics than the competition, not just Nvidia's previous generation.
Is it really that hard to assume that it will be better for graphics than the competition? It may be bad for margins (again), but it looks to be faster and more future-proof than anything the competition has to offer. So how is that bad for you as a consumer?
 
That seems to be a big issue for maintaining compatibility amongst cards. If there are cards with no DP, then it makes things a hell of a lot harder for developers - you end up with all sorts of ugly checks to find feature sets...ironically just like a CPU.

Well, GPUs are hardly better here. Games already check for DX version, etc. And there is a reason why Pixomatic was written.

What happens if code is written assuming DP and then runs on a card with no DP? Does the JIT just emit an error?

The JIT demotes those double computations to SP FP, just like on G80.
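As an aside, this is roughly what the "ugly checks" mentioned above look like in practice today; a minimal host-side sketch using the CUDA runtime API, assuming the usual rule of thumb that native DP requires compute capability 1.3 or higher:

Code:
// Minimal sketch: enumerate CUDA devices and report whether each one
// supports native double precision (compute capability >= 1.3).
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        bool hasDP = (prop.major > 1) || (prop.major == 1 && prop.minor >= 3);
        std::printf("Device %d: %s (compute %d.%d), native DP: %s\n",
                    i, prop.name, prop.major, prop.minor, hasDP ? "yes" : "no");
    }
    return 0;
}

Without a check like that, code containing doubles that gets built for a pre-1.3 target doesn't fail; the doubles are just silently demoted to single precision, which is exactly the behaviour being described.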
 
I am thinking that DP in Fermi is closely tied to the INT and SP FP operations, which is why DP is unlikely to be deleted in gaming parts. They'll probably disable the exceptions, rounding modes, denormals, etc. (or some of them) in mid-range parts, but retain some DP capability.

WHY? you ask.

The INT mul is 32 bits, and the SP mul is 23 bits. If you use both of them, you get 55 bits of multiplication. DP needs 52 bits. Just about right for doing DP FMA. :)

This could be a reason to have so much INT power, and how they are managing to put so much DP in gaming parts (albeit high-end) without going bankrupt. It could also be why you can dual-issue SP FP and INTs, but DP doesn't dual-issue with anything else.
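Spelling out the bit-width accounting behind that speculation (and counting the implicit leading significand bit that the IEEE formats don't store):

\[
\underbrace{32}_{\text{INT mul}} \;+\; \underbrace{24}_{\text{SP significand}\,(1+23)} \;=\; 56 \;\geq\; \underbrace{53}_{\text{DP significand}\,(1+52)}
\]

so, purely in terms of datapath width, the combined INT and SP multipliers would be just wide enough to cover a DP mantissa multiply.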
 
What I'm implying is that by the time NVIDIA realised AMD was not going to miss its target despite the 40nm problems, and that they would inevitably arrive later than AMD, it was just too damn late to pull a performance part ahead of the high-end part.


Isn't that what they might have done when they cancelled GT212? They might have put the resources into getting the D12U performance part out faster than planned.
 
I don't know why everybody expects Win7 to have an impact on graphics card sales. I still think that impact will be closer to zero.
As for the holiday season: high-end video cards aren't bought by season, they're bought to play games, and that's what matters most. Most people will wait for DX11 games, for DX9/10 games that won't run well enough on current hardware, and for NV's DX11 solution before deciding who's better. They don't give a damn about holiday seasons.

You are correct that Windows 7 in and of itself won't all of a sudden drive sales of graphics cards. However, missing the holiday season with brand new cards would be unfortunate for NVIDIA and its partners, because a huge amount of merchandise is purchased during the holiday season.

It's not the end of the world though. If GF100 is not ready for purchase in quantity this holiday season, then NVIDIA's partners can still generate significant sales volume by offering large sales/discounts on existing GT200-based products (such as the GTX 295, GTX 285, etc.).

Another strategy that NVIDIA's partners can use for high end price points is to take pre-orders for GF100 in November and December, provided that GF100 will be available in quantity by early next year.


It all depends on what he asked. He might have asked why it's latER than the competition. The answer seems to imply that question, because a GPU can only be "this big" in comparison to another GPU.


See above. Anand's words are Anand's words, not NVIDIA's.

This is a circuitous and never-ending argument. Being late is often a matter of perspective. If GF100 were available in large quantities by November of this year, and RV870 didn't show up until early next year, then most likely we wouldn't be having a debate about GF100 being "late". Since AMD/ATI executed well with RV870, and since they have a less complex chip to make, GF100 is considered to be "late". But in reality that label is something of a misnomer, as being late depends not just on one's internal roadmap but also on the competition, and on what goals the company is trying to accomplish with the product. I would argue that GF100 is not "late" in terms of being a viable solution for the HPC market.
 
I don't know why everybody expects Win7 to have an impact on graphics card sales. I still think that impact will be closer to zero.
As for the holiday season: high-end video cards aren't bought by season, they're bought to play games, and that's what matters most. Most people will wait for DX11 games, for DX9/10 games that won't run well enough on current hardware, and for NV's DX11 solution before deciding who's better. They don't give a damn about holiday seasons.

:/

People don't give a damn about holiday seasons?

I would like to see which poll or data you have looked at that would prove your point.
 
:/

People don't give a damn about holiday seasons?

I would like to see which poll or data you have looked at that would prove your point.
Can you provide a reason for buying a high-end video card during the holidays besides new games? And what if all of those games work just fine on your current video card? Wouldn't you go and buy, I don't know, a new TV or something instead?
Sales go up during the holidays, but that doesn't mean they go up for everything without any reason. What I'm saying is that this holiday season is hardly a defining one for DX11 video cards. And if NV provides benchmarks and discloses prices of GF100 cards before the holidays, and those benchmarks and prices turn out to be good in comparison to AMD's cards, then most potential buyers will wait for GF100 to become available. Because there will hardly be a reason to buy a new DX11 video card before the holidays rather than, let's say, in January or February.
But that's of course assuming that those benchmarks and prices will be good, because if not, many buyers may go with EG cards even before GF100 cards become available. It's all in NV's own hands now.
 
Isn't that what they might have done when they cancelled GT212? They might have put the resources into getting the D12U performance part out faster than planned.

No one knows where they sent those human resources. A performance project would logically be D12P, following the current codename mess.
 
High-end video cards as Christmas presents? Wouldn't a PS3 be a better Christmas present from any PoV? :p

It depends on whether someone has a PC and already has, or does not want, a PS3.
Moreover, Christmas is one of the best periods worldwide for complete PC sales (even high-end gaming ones).
 
Can you provide a reason for buying a high-end video card during the holidays besides new games?

Some people, like me, buy those cards not necessarily for gaming but for developing software. Sure, that may be only 1 in 100 potential buyers, but for Nvidia and ATI it is important that those developers get their next-gen card first, as they tend to stick with it if it's good enough, and software may as a result be optimized more for one brand or the other.
 
Some people, like me, buy those cards not necessarily for gaming but for developing software. Sure, that may be only 1 in 100 potential buyers, but for Nvidia and ATI it is important that those developers get their next-gen card first, as they tend to stick with it if it's good enough, and software may as a result be optimized more for one brand or the other.
Isn't Fermi more interesting to you as a developer? Shouldn't you, as a developer, have hardware from all key vendors? (Not to mention that vendors usually provide h/w to key developers for free, and some time before that h/w becomes available to consumers.)
 