NVIDIA GF100 & Friends speculation

What I meant is that the GF100 architecture lends itself more to scaling down to lower end designs in a timely, cost-effective, and efficient manner once the high end GPU is ready compared to GT200. After all, each GPC in GF100 is reportedly nearly a full GPU in and of itself, right? Wasn't it ATI/AMD who decided in recent years to move away from monolithic GPUs in part because time to market for lower end derivatives was very poor compared to introduction of new GPUs at the high end? With GF100, it seems (at least on the surface) that once the high end GPU is ready, then time to market for the lower end derivatives stands to be significantly better than before. Also, if NVIDIA can make a balanced high end GPU, then by definition the lower end derivatives should be balanced too. Is it really balanced to have lower end derivatives with the same geometry throughput as the higher end models? Of course, ATI/AMD's strategy will always have some merit. NVIDIA cannot easily get around the fact that monolithic GPUs take a long time to come to market and are very difficult to engineer. That said, the proof is in the pudding, and the results later this year will speak for themselves. I guess we'll learn a lot more in the coming months in seeing how everything plays out.
Not trying to sound like an @ss, but really!?? No shit, Sherlock. You're comparing against the GT200, which has yet to deliver any semblance of a real derivative; at best the G80/G92 was NV's best/last successful full generational part. The GT200 has got to be one of the worst parts to compare a future part against. In the meantime ATI is on its 3rd generation of top-to-bottom derivatives.
 
Coz that's an opportunity to set up a 181 mm² chip against a ~330 mm² chip.

Anyway, enough of this hole/pricing/placement talk. Back to architecture. :cool:

Sounds like it will be another tape-out, another board design and whatnot, to fill a gap they already have a card to fit in.
 
Why would they do that?

I figure their whole lineup will shift. The 5830 will fill the $200 gap, not a 5790. The 5850 will drop to the $260 price range the 5830 is in. The 5870 will slot into the $350 price tag and the E6 5870 2GB will slot into the $400 price point. They can put out a 5890 in the $400+ range if they want to take the performance crown. I think it's obvious that ATI has high-priced cards because there was no reason not to reap as much money from them as possible, and I think due to the large gaps in pricing we will see them drop down.

The best thing is the 5870 at $350 is only a $30 drop from its original $379 MSRP. The 5850 at $260 is at its original MSRP.

ATI shifting pricing like that (which shouldn't be very hard for them when you look at the original MSRPs) will really screw with NVIDIA's rumored $350/$500 price tags. Why buy a 470 when the 5870 is faster at the same price? Why buy a 480 for $500 when the 5870 is 20% slower but $150 less?
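
To put a number on that last comparison, here is a rough Python sketch of the performance-per-dollar arithmetic. The prices and the "20% slower" figure are the rumours and guesses from this post, not measurements, and the names `cards`, `perf_per_dollar` and `value_advantage` are made up for the illustration.

```python
# Relative performance is normalised so the rumoured 480 = 1.00.
# All figures are the speculative ones from the post, not benchmark results.
cards = {
    "5870 (repriced)": {"price_usd": 350, "relative_perf": 0.80},
    "480 (rumoured)": {"price_usd": 500, "relative_perf": 1.00},
}


def perf_per_dollar(card):
    """Relative performance bought by each dollar spent."""
    return card["relative_perf"] / card["price_usd"]


def value_advantage(a, b):
    """How many times more performance per dollar card a offers than card b."""
    return perf_per_dollar(a) / perf_per_dollar(b)


ratio = value_advantage(cards["5870 (repriced)"], cards["480 (rumoured)"])
print(f"5870 offers {ratio:.2f}x the performance per dollar of the 480")
```

Under those assumptions the 5870 comes out roughly 14% ahead per dollar, so the 480 would have to justify its price on absolute performance alone.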

With pricing like that the 5850 will become a really big card and most likely the sweet spot for gamers for the next 3-6 months.
The worst thing is I heard 5850 and 5870 prices are going up by US$15 and US$10 respectively.
 
Test silicon is not equal to final silicon, however. Yes, work can be done much more rapidly than before they have any silicon, but if the test silicon were really ready, they would have moved straight to production.
Sorry, Chalnoth, but this is just not true.

Things have to be really broken if A1 silicon can't be used for most, if not all development. Since so much has already been verified before tape-out, most metal spins are used to fix relatively obscure corner case bugs. Annoying things that can take a lot of time to track down and that can make the chip hang every hour or so, but that don't prevent SW from doing their job.

In GPU land, time-to-market pressure is very high, but in most other businesses, time-to-first-silicon is more important: you need to win the design slot with the customer. Time-to-market, on the other hand, is entirely dependent on the customer's schedule, and final production silicon is often not on the critical path at that point.

In those cases, A1 silicon with bugs is usually sufficient to let SW work, and it's not unusual to wait months before doing an A2 spin, to make sure that all issues have been found. There is no reason why things would be different for GPUs.
 
What I meant is that the GF100 architecture lends itself more to scaling down to lower end designs in a timely, cost-effective, and efficient manner once the high end GPU is ready compared to GT200.
And I disagree. Splitting up four intricately communicating parts is more difficult than scaling down a part with a common base and varying numbers of independent SIMDs. There are a lot more loose ends to take care of in Fermi. The low number of texture units could be a problem, too, and it affects the low end more because lower game settings use less shader math.

Not trying to sound like an @ss, but really!?? No shit, Sherlock. You're comparing against the GT200, which has yet to deliver any semblance of a real derivative; at best the G80/G92 was NV's best/last successful full generational part.
That has nothing to do with GT200's architecture. The lack of GT200 derivatives was due to NVidia's troubles at 40nm and the better cost effectiveness of G92 with virtually the same features.
 
Well if it's random then cache doesn't really make a difference ;)
Haha sure; I meant more a case where there was lots of data-dependent coherence, but not predictable enough to preload the LDS. The extreme case though is you could just write something like a convolution without even bothering to load it into LDS and trust the cache on Fermi, while on ATI it would be dog slow. Whether or not NVIDIA (or someone else) will lower itself to that level of cheese remains to be seen ;)

But what I was trying to say was why couldn't ATI use its L2 for R/O buffers? And is L1 only for textures?
Yeah, and for all I know they do. Still their L2 is far enough away that it can't replace LDS for reuse, whereas on Fermi by my understanding LDS/L1 is the same memory... you can even partition it differently per kernel in CUDA.

Cypress is very fast at global atomics, but *much* faster at local atomics. I am surprised you don't get any benefit from using the LDS on the 5850.
If he's doing one atomic per load of the input data, you're almost certainly bound on just the bandwidth of reading in the data :) Indeed Cypress' local atomics are screaming fast, especially when uncontended. It'll also blow people's minds how fast their global append/consume implementation is (hint: nearly the speed of local atomics!!). Will be interesting to see how well Fermi does in these increasingly important benchmarks. Come on B3D - you can come through for me tomorrow with a technical review like only you guys can do! :p
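
For anyone following along, here is a small CPU-side Python model of why staging atomics in local memory helps at all. It is only a sketch: it counts how many global atomic operations a histogram would issue under each scheme, and says nothing about how fast Cypress or Fermi actually execute them. The function names, the 16-bin histogram and the group size of 256 are all assumptions made up for the illustration.

```python
from collections import Counter


def histogram_global_atomics(data, num_bins):
    """Naive scheme: every input element issues one global atomic add."""
    bins = [0] * num_bins
    global_atomics = 0
    for value in data:
        bins[value % num_bins] += 1  # stands in for a global atomic add
        global_atomics += 1
    return bins, global_atomics


def histogram_local_then_merge(data, num_bins, group_size):
    """Each work-group counts into a private (LDS-like) histogram first,
    then issues one global atomic add per non-empty bin to merge it."""
    bins = [0] * num_bins
    global_atomics = 0
    for start in range(0, len(data), group_size):
        group = data[start:start + group_size]
        local = Counter(value % num_bins for value in group)  # local atomics
        for bin_index, count in local.items():
            bins[bin_index] += count  # stands in for a global atomic add
            global_atomics += 1
    return bins, global_atomics


data = [(i * 7) % 16 for i in range(1024)]
naive_bins, naive_ops = histogram_global_atomics(data, 16)
merged_bins, merged_ops = histogram_local_then_merge(data, 16, 256)
print(naive_ops, merged_ops)
```

Both schemes produce the same histogram, but the second sends far fewer operations to global memory. As the post says, if the kernel is already limited by reading the input, that saving may not show up in the runtime.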

LDS is definitely more power efficient (and prolly area efficient too) over an equal capacity block of gp cache. There is a definite use case for existence of these things.
Yeah for sure, but Fermi seems to have already paid the "cache" cost since it's configurable as a cache, LDS or a mix.
 
Come on B3D - you can come through for me tomorrow with a technical review like only you guys can do! :p

lol someone has high expectations. I think B3D would need access to a gt300 first before that can happen. :p
 
Yeah, and for all I know they do. Still their L2 is far enough away that it can't replace LDS for reuse, whereas on Fermi by my understanding LDS/L1 is the same memory... you can even partition it differently per kernel in CUDA.
That's not globally coherent though.

LDS/L1 might be the same memory, but I doubt they have the same performance characteristics (assuming the working set fits in either).
 
I kid because I care ;) B3D reviews save me from having to skim through dozens of other reviews to get half the (useful) content for 10x the time investment.
++

I ordered the 8800 GTS the day B3D's review came out, demonstrating just how good the 8800's AF angle-independence was.

I do not expect the same thing to happen w/ Fermi, however ;)
 
I know it's nitpicking, but G8x/9x/GT2x0 AF isn't 100% angle independent; it's just way less angle dependent than, for example, G7x.

As another sidenote 45 degree angle dependency like for instance on RV7x0 doesn't annoy me personally. You have to really twist yourself in some weird corner cases to see a difference.

What personally ticks me off with AF in recent years are AF optimizations. Wherever those lead to any form of texture noise I find it highly annoying. I suppose that "high quality" AF as a setting should still exist on GF100; one question (which we'll hopefully find out some time soon after the launch) is whether the performance drop between HQ AF and driver default is as relatively small as on G8x up to GT2x0.

Before I see performance measurements though, I'd love to read from those with a similar sensitivity to AF optimizations exactly what filtering quality with HQ AF looks like compared to Q AF and, of course, existing GF GPUs.
 
From the information that has been released so far, it certainly seems to be the case that nVidia thinks of AF as a "solved problem", which would be an indication that the AF has changed very little, or perhaps not at all, from the G8x.
 
What I meant is that the GF100 architecture lends itself more to scaling down to lower end designs in a timely, cost-effective, and efficient manner once the high end GPU is ready compared to GT200.

If this was the case, then why haven't we seen any Fermi derivatives by now? Heck, we've hardly had any news of derivatives at all. They had planned to release Fermi in Q4/09, i.e. November or December. I don't think they would have planned to release derivatives only a year later.

And anyway as others have stated, with Fermi's design, it is more likely to be harder to design derivatives compared to the old gen.

G80 came out in November 06 and we saw G84/86 in April 07. ATI has been a lot better at delivering derivatives recently. With RV670 we saw the entire lineup out within three to four months (November-Feb). Same thing with RV770 (June-September). With Evergreen we saw it increase to five months, but this time ATI taped out 4 chips compared to 3 as they had done earlier. And according to rumours, Redwood and Cedar were actually good to go in time for Christmas but were delayed because of yield and volume availability concerns.
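
For reference, the gaps quoted above work out as follows. This is just a quick Python sketch of the month arithmetic, using only the launch months given in the post and assuming every gap is shorter than a year; `MONTHS`, `months_between` and `launches` are names made up for the illustration.

```python
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def months_between(first, last):
    """Whole calendar months from `first` to `last`.

    Assumes the gap is under 12 months, so a wrap past December
    is handled with modulo arithmetic.
    """
    return (MONTHS.index(last) - MONTHS.index(first)) % 12


# (month the first chip launched, month the last derivative arrived)
launches = {
    "G80 to G84/G86": ("nov", "apr"),
    "RV670 lineup": ("nov", "feb"),
    "RV770 lineup": ("jun", "sep"),
}

lags = {name: months_between(*span) for name, span in launches.items()}
for name, lag in lags.items():
    print(f"{name}: {lag} months")
```

That gives five months for G80 and three each for RV670 and RV770, next to the five months quoted for Evergreen.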
 
If this was the case, then why haven't we seen any Fermi derivatives by now? Heck, we've hardly had any news of derivatives at all. They had planned to release Fermi in Q4/09, i.e. November or December. I don't think they would have planned to release derivatives only a year later.

And anyway as others have stated, with Fermi's design, it is more likely to be harder to design derivatives compared to the old gen.

And where did you get that idea?
Did you even look at the architecture specs? Did you see the GPC structure?

Fermi is highly scalable and also, there have been many rumors about derivatives (GF104 and GF108 specifically). It seems you either missed or avoided them...

Erinyes said:
G80 came out in November 06 and we saw G84/86 in April 07. ATI has been a lot better at delivering derivatives recently. With RV670 we saw the entire lineup out within three to four months (November-Feb). Same thing with RV770 (June-September). With Evergreen we saw it increase to five months, but this time ATI taped out 4 chips compared to 3 as they had done earlier. And according to rumours, Redwood and Cedar were actually good to go in time for Christmas but were delayed because of yield and volume availability concerns.

And what does that have to do with GF100? Some G80 derivatives came before ATI even launched the HD 2900 XT, and that's exactly what ATI did this time as well. They launched their derivatives before NVIDIA even launched any DX11 product.
As for GT200, they didn't need derivatives, because 1) the design wasn't really that scalable and 2) G92 was happily taking the mid and low range, so spending R&D money on a GT200-based chip that would just rival an existing one would be... well... a waste. As for Fermi, if the latest rumors are true, GF104 will be out 3 months after the release of GF100.
 