NVIDIA GF100 & Friends speculation

Why would one run out of memory faster than another?
Less efficient compression? Any other reasons?
 
How much money does Nvidia make off discrete graphics anyway? Maybe the future is to use discrete as an R&D proving ground and loss leader, while making a killing on workstation, HPC, mobile, and consoles. It seems like PC gaming on discrete is dying slowly. Budgets to develop such games are going up, the marketing needed to recoup those costs must scale up as well, piracy is worse, and it's a declining market compared to everything else. On mobile, you can work with much smaller teams, over shorter periods, with easier marketing (viral/mobile downloads). Hell, Zynga made $280 million on FarmVille, which is a crappy Flash game.

Even if margins were super good on high-end discretes, it doesn't look like a great place to scale up earnings.
 
Why do people insist on comparing a dual-GPU card to a single-GPU card? I'm sorry, it may work on a dollar-for-dollar basis, but beyond that it holds no water. Most people with a decent IQ will compare it to Cypress, not Hemlock.

You are totally right. If NV PR thought they could get away with it, they would be comparing Fermi to Juniper. I don't think they had a Redwood sample on hand that they could admit to owning, so that comparison is right out. Then again, if they were as cunning as they think they are, they would have waited until Cedar came out and slapped that part silly with Fermi. Fermi is at least 10x faster than Cedar you know.

Then again, most reviewers are a little smarter than that. Well, looking at the writeups today, I may have to rethink that, but I will give them the benefit of the doubt for now.

Back to the point at hand, why bother with price, power, or card count comparisons? What is the point of that if they dent something that your ego is staked upon?

-Charlie
 
They could in theory paper launch the single-board SLI version at the same time they launch GF100, if they can get some prototypes into reviewers' hands by that time.

Yes, but it will either be slower than a Hemlock or not be a PCIe board. Then again, as long as we are talking theoretical parts, they could launch GF100 tomorrow.

-Charlie
 
Loads of them, really. Surface management schemes (how often/early/late to evict or replace a texture), how surfaces are laid out in memory (swizzles + bit ordering + encoding) and the space they take, and the same for meshes, index buffers and vertex attributes. Render targets probably have their own management schemes keeping them in memory longer or shorter between chips/vendors/drivers. Then there's whether the board is doing anything else at the time, like interleaving a compute kernel or trying to do some Folding, and how to handle situations like that. All of these can and will affect how long something is kept in that relatively small space, and how.
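
To make the eviction side of that concrete, here is a minimal sketch of the sort of residency bookkeeping a driver might do. The class name and the plain LRU policy are illustrative assumptions, not a description of any vendor's actual scheme; two drivers with different budgets or eviction policies will drop surfaces at different times, which is one reason reported memory use can differ for the same scene:

```python
from collections import OrderedDict

class ResidencyManager:
    """Toy VRAM residency tracker: plain LRU eviction against a byte budget.
    Real drivers also weigh surface type (texture vs. render target), usage
    frequency, and whatever else (e.g. a compute kernel) shares the board."""

    def __init__(self, budget_bytes):
        self.budget = budget_bytes
        self.resident = OrderedDict()  # surface_id -> size in bytes
        self.used = 0

    def touch(self, surface_id, size_bytes):
        # Already resident: just mark it most-recently-used.
        if surface_id in self.resident:
            self.resident.move_to_end(surface_id)
            return []
        # Otherwise evict least-recently-used surfaces until the new one fits.
        evicted = []
        while self.used + size_bytes > self.budget and self.resident:
            victim, victim_size = self.resident.popitem(last=False)
            self.used -= victim_size
            evicted.append(victim)
        self.resident[surface_id] = size_bytes
        self.used += size_bytes
        return evicted

# Hypothetical usage: same scene, different budget -> different eviction points.
mgr = ResidencyManager(budget_bytes=768 * 1024 * 1024)
mgr.touch("diffuse_atlas", 128 * 1024 * 1024)
mgr.touch("shadow_map", 64 * 1024 * 1024)
```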
 
Development of casual games is where development of PC games was, what, 10 years ago. They won't be content sitting still there forever.
 
You are totally right. If NV PR thought they could get away with it, they would be comparing Fermi to Juniper. I don't think they had a Redwood sample on hand that they could admit to owning, so that comparison is right out. Then again, if they were as cunning as they think they are, they would have waited until Cedar came out and slapped that part silly with Fermi. Fermi is at least 10x faster than Cedar you know.

Then again, most reviewers are a little smarter than that. Well, looking at the writeups today, I may have to rethink that, but I will give them the benefit of the doubt for now.
Enough now :smile: I'm tired of your trolling, and while you may be the most beautiful troll I've ever had the pleasure to meet, it grates now. You've got your own forums to play games on, so go do this stuff there.

I've tolerated it for ages because I like you, but this thread has seen my patience dry up like the Gobi. Tidy your posts up or bugger off.
 
@Trini, all the other performance aspects bar memory could be just as valid when not measured at 2560, so why use those figures?

Not sure what you're asking. I didn't see any mention of resolution in the white paper. neliz was showing a 100% advantage at 2560x1600 compared to a 50% advantage at 1920x1200. Are you asking if Nvidia would have also used the former to overstate Fermi's advantage over GT200?
 
2560x1600 is my idea of high resolution; 1920x1200-ish is something consoles from 2005/2006 pretend to render at.

here is the slide: http://hardocp.com/image.html?image=MTI2MzYwODIxNHh4VHN0ekRuc2RfMV81MF9sLmdpZg==

Now I can already see this turning into a semantics war, sigh, "noes! high resolution is 1680x1050" etc. For all we know, NV could be talking about 1024x768 for that matter. It's their job: claim superb improvements, just don't say exactly what improved and don't mention the circumstances... keeps the fans happy until they actually launch a part.

And indeed, if Fermi were faster than Cypress at 25x16 with 8xAA, wouldn't we see claims of three, four, or five times the AA improvement over GT200 at "high" resolution? It's not their habit to claim just about 2x and be done with it; they need larger awesomebars.
 
Costs include double the amount of video RAM, a PCI Express bridge chip, PCB costs, and the contracts negotiated with TSMC.

According to the analyses I have seen, yes it does. If you go by raw silicon costs, Fermi is ~$150 (104 die candidates per $5000 wafer at 30% yield (I am being overly generous here, far less for a 512-shader part) gets you ~$160). Add in $10 for packaging and testing, $20 for a PCB (lots of routing for a 384-bit memory bus, but cheaper than the GTX 280's), $20 for board components, $25 for the HSF, $10 for assembly, and $39 for 12 GDDR5 chips ($3.25 apiece for low bins; if they up the bin, add $6 or so). That gets you to $274 raw cost: no FOB, no profit, no assembly failures, etc.

Then look at Hemlock. Silicon costs about $45 per Cypress die ($5000 wafer with 160 die candidates (it is actually higher) at 70% yield (a conservative estimate) ~= $44.6; real numbers are better than that), so $90 for the pair. Add $7.50 each, $15 total, for packaging and testing (no lid, much smaller die, far fewer traces, better dot pitch, less power, much lower package layer count, etc.), $25 for the PCB (probably cheaper than GF100's, but let's be negative), $30 for board components, $25 for the HSF, and $15 for assembly. Add 16 GDDR5 chips (at $3.50 apiece, one price tier up from Fermi's) for $56 and you are at $256.

I have a lot more precise numbers than that, but let's just say I am giving NV every benefit of the doubt and putting a very negative spin on ATI's parts. Also bear in mind that these numbers are before any profit is taken. Nvidia tends to mark up silicon by 100% over cost, so they would sell GF100 kits for $360 or so; ATI tends to take a bit less, but if they did 100%, it would be $266 for the same kit. At that point, you are arguing PCB layers, board components, and HSFs.

I can make a good case for ATI having the cheaper of all of the above even after you take into account both chips. Part of that is the absolutely moronic choice NV made with regard to voltage vs. amperage (did they go flat-out insane? No, don't answer that), but it blows out their packaging costs and will likely lead to some serious problems with chip longevity, especially in enthusiast parts where voltages get played with.

So, to answer the question, yes, ATI can make a Hemlock for far cheaper than NV can make a GF100. In fact, they can price Hemlock at the point where NV is in the red and ATI is making money, albeit a very slim amount. If you add in margins for the AIBs, disti's, shipping companies, and (r)etailers, things get much uglier for NV.

-Charlie
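
For what it's worth, the per-die silicon arithmetic above is just the wafer price divided by (die candidates × yield). A quick sanity-check sketch using the figures quoted in the post, all of which are the poster's estimates rather than confirmed numbers:

```python
def die_cost(wafer_price, candidates, yield_rate):
    """Raw silicon cost per good die: wafer price spread over the good dies."""
    return wafer_price / (candidates * yield_rate)

# Figures as quoted above -- estimates, not confirmed numbers.
gf100   = die_cost(5000, 104, 0.30)   # ~ $160 per GF100 die
cypress = die_cost(5000, 160, 0.70)   # ~ $45 per Cypress die
print(f"GF100:   ${gf100:.0f}")
print(f"Cypress: ${cypress:.0f}  (x2 for Hemlock = ${2 * cypress:.0f})")
```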
 
And indeed, if Fermi were faster than Cypress at 25x16 with 8xAA, wouldn't we see claims of three, four, or five times the AA improvement over GT200 at "high" resolution? It's not their habit to claim just about 2x and be done with it; they need larger awesomebars.

Heh, things must be getting really bad when Nvidia is deemed to not be spinning hard enough. Is there anything they can do that won't be twisted in a negative manner? It doesn't seem that way.

@Tahir, they did mention something about improving compression with 8xAA. I took that to mean a smaller bandwidth hit but presumably that would result in lower memory usage as well.
 
Heh, things must be getting really bad when Nvidia is deemed to not be spinning hard enough.

You're the analytical guy: what would you do with "yeah, it's about two times as fast in certain situations"? Take it way too positively, or look at how they've spun things in the past? From my point of view, GF100 can only get better at launch.

Is there anything they can do that won't be twisted in a negative manner? It doesn't seem that way.

I can tell you I love their idle mode and so would you... but then again.. nobody is supposed to know that.
 
@Tahir, they did mention something about improving compression with 8xAA. I took that to mean a smaller bandwidth hit but presumably that would result in lower memory usage as well.
L2s aren't just for textures, now. So with the big L2s that could make for quite a big improvement.

Jawed
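
As a rough illustration of why compression at 8xAA matters, here is some back-of-the-envelope arithmetic. The numbers are illustrative only (nothing here is from the whitepaper), and whether real lossless color compression shrinks the actual allocation or just the bandwidth depends on the scheme:

```python
def color_buffer_bytes(width, height, samples, bytes_per_sample=4):
    """Uncompressed color storage for a multisampled render target."""
    return width * height * samples * bytes_per_sample

MiB = 1024 * 1024
full = color_buffer_bytes(2560, 1600, samples=8)  # every sample stored explicitly
flat = color_buffer_bytes(2560, 1600, samples=1)  # floor if every pixel compressed ideally
print(f"8xAA color buffer, uncompressed: {full / MiB:.0f} MiB")   # ~125 MiB
print(f"best-case compressed floor:      {flat / MiB:.1f} MiB + metadata")  # ~15.6 MiB
```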
 
This little head-to-head sums up what bugs me about this entire thread. The fanboys aren't scared of just unzipping and plonking it on the table, despite only having a tiny part of the big picture to hand and only their preset personal feelings about a hardware vendor to fill in the rest.
Whats with the awesome one-liners Rys? Seems like you were holding back for some time and are just letting a few rip today. :p
 
I can tell you I love their idle mode and so would you... but then again.. nobody is supposed to know that.
Ooh, turn off 3 GPCs?

Which reminds me, rather than deleting GPCs to make smaller GPUs, wouldn't it make sense to reduce the count of SMs per GPC? This way you keep the balance of geometry performance on large and small triangles. It would also mean that power-saving by turning off GPCs works the same across the range.

Jawed
 
Thanks to Rys, Trini and Jawed for some insight into memory usage differences and the reasons they may occur.

@Jawed - pure speculation here: I doubt they'll change the SM count per GPC so soon unless it really is a "trivial" matter. I'm speculating because the problems NV has had getting to market, and the engineering hurdles they have encountered so far, may have kept them too busy to re-engineer the parts with such a fine scalpel this soon, unless GF100 was inherently designed to be that modular.

On second thought, they did just go from 512 CUDA cores to 448 CUDA cores - I wonder what that entailed and what else was chopped.
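
For reference, the whitepaper's full GF100 layout (4 GPCs × 4 SMs × 32 cores) makes the arithmetic of that cut easy to see. The sketch below just enumerates the counts; which GPCs the disabled SMs sit in on a 448-core part is not something Nvidia has detailed, so that side is guesswork:

```python
CORES_PER_SM = 32
SMS_PER_GPC = 4
GPCS = 4

full_cores = GPCS * SMS_PER_GPC * CORES_PER_SM   # 512 cores across 16 SMs
cut_sms = 448 // CORES_PER_SM                    # 14 SMs -> two SMs disabled

print(f"full GF100:    {full_cores} cores ({GPCS * SMS_PER_GPC} SMs)")
print(f"448-core part: {cut_sms} SMs, i.e. {GPCS * SMS_PER_GPC - cut_sms} SMs fused off")
# Jawed's point: dropping whole GPCs scales cores in steps of 128 and removes a
# raster engine each time, while dropping SMs within GPCs scales in steps of 32
# and keeps the per-GPC geometry balance intact.
```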
 
I'm worried for NV based on their FC2 numbers, for two reasons.

1. All the cards demoed at CES had one 6-pin and one 8-pin connector, not two 6-pins, which leads me to think the cards were the full-powered 512 SP variant, not the 448 SP little brother.

2. The NV-released FC2 benchmark indicates Fermi is ~60% faster than a GTX 285 in Small Ranch at 4xAA. Most reviews put the 5870 at ~40% faster than a GTX 285; however, they use actual gameplay FRAPS runs or different maps. Some use Small Ranch, and there the 5870 scores around 80 fps at that resolution with 4xAA. Now, let's just ignore the latter, worse-case scenario of Fermi being merely similar to the 5870, and go with the first: Fermi is roughly 15% faster than the 5870 (1.6/1.4) in the best-case FC2 scenario. This is likely to be the 512 SP variant as per #1, so that puts the 448 SP variant at roughly equal to the 5870.

We don't know how it will perform in DX11 games, where this matters most. But for current-gen games, and FC2 is a TWIMTBP title, it's not much better. I want NV to be competitive; as a consumer I'm not a fanboy, I go for the best bang for the buck. ATI is pricing their cards a bit higher than they should at the moment. Their cheap production costs leave room for prices to drop, and I want them to drop... going to be upgrading soon. :)
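
The relative-performance step in #2 is just the ratio of the two speed-ups over the GTX 285. A quick check with the figures quoted in the post, all of which are rough, second-hand numbers:

```python
fermi_vs_gtx285  = 1.60   # NV's Small Ranch claim, per the post
hd5870_vs_gtx285 = 1.40   # typical review figure, per the post

fermi_vs_5870 = fermi_vs_gtx285 / hd5870_vs_gtx285 - 1
print(f"implied Fermi lead over HD 5870: {fermi_vs_5870:.0%}")   # ~14%

# Against the ~80 fps Small Ranch figure for the 5870, that would put the
# 512 SP card in the low 90s fps; a 448 SP part presumably lands near parity.
```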
 
ATI is pricing their cards a bit higher than they should at the moment.

This sticks out like a sore thumb. The HD 5850, for instance, is cheaper than a GTX 285 yet offers 20-30% more performance, and you want AMD to lower prices? Amazing logic!
 
You're the analytical guy: what would you do with "yeah, it's about two times as fast in certain situations"? Take it way too positively, or look at how they've spun things in the past? From my point of view, GF100 can only get better at launch.

I would take it as an example that demonstrates one of their claims. Would you not do the same? "Best case scenario" is par for the course in marketing, it's not their job to highlight the shortcomings of their product. And you shouldn't expect them to either.

Ooh, turn off 3 GPCs?

That was my first thought too. They're pretty much self-contained, so a single GPC at lower clocks/voltage would make for a nifty idle mode.

By the way, why was there no mention of video processing capabilities? Unchanged from GT21x presumably?
 
No they haven't. Where's my clocks!

As soon as NV can figure out where they have bins, they will, umm, not disclose that aws0mness for a bit longer.

Seriously though, they don't know yet, clocks are a mess, not binning in nice ways, and defects are rife. Add in that they have very coarse controls via voltage, and are bound hard by overall wattage and burning traces, and you have a mess. Right now, the best they can do is prey... err prAy and FUD.

Then again, if you look back in the thread to about 24 hours ago, according to some, we were supposed to have all questions answered by now. I wonder what happened?

-Charlie
 