NVIDIA GF100 & Friends speculation

I wasn't sure which thread would be appropriate, but seeing as this one has gone a bit off course every now and then, I suppose it's as good as any; I don't think this is worth its own thread.

Finally, just when nVidia's naming started to make sense again after all the renames, leaving 2xx as DX10 adapters, 3xx as DX10.1 adapters and 4xx as DX11 adapters, nVidia pulls another great one and re-re-re-(re-re?)releases G92 as the GT 330:
http://www.nvidia.com/object/product_geforce_gt_330_us.html

This is incorrect.

GTS 250 was essentially a rebranded 9800 GTX+, no question about it.

GT 330 is quite different from GTS 250. The closest comparison to GT 330 is GT 240, but even then, GT 330 can potentially differ from GT 240 in the following areas: number of CUDA "cores", graphics clock frequency, processor clock frequency, memory clock frequency, memory configuration, memory interface, DirectX version, etc.

GT 330 is an OEM-only card. No OEM is going to give consumers the choice between a GT 240 and a GT 330, unless they are out of their mind.
 
You might want to actually follow the discussion more closely. The whole discussion started because people were insisting that Quadro/Tesla were separate products from a silicon design standpoint, which isn't true.

You can spin it however you want, but the implication from your words is that NVIDIA is screwing Quadro/Tesla customers due to the fact that Geforce customers can purchase graphics hardware at lower prices. This completely ignores the reality of the situation, in which NVIDIA has to spend an incredible amount of time and resources directly supporting individual Quadro/Tesla customers.

And yes, I have little faith that Tesla will be a savior for Nvidia if they are uncompetitive in the gaming market, and the same goes for Quadro, as neither SKU can exist without its R&D being paid for by the much larger gaming market.

Who said that Tesla needs to be a "savior" for NVIDIA? This is just a case of NVIDIA expanding into new markets. You can paint a doom and gloom picture about "what if" scenarios involving the gaming market, but that's just spreading FUD.

And if I'm crapping on hundreds of engineers by saying that the only reason Quadro outperforms Geforce in professional graphics benchmarks is that Nvidia purposely cripples Geforce performance, then so be it. Sometimes the truth hurts.

This is again just spin. NVIDIA has many engineers that spend time optimizing Quadro drivers for professional apps, just as they have many engineers that spend time optimizing Geforce drivers for gaming apps. Why in the world should they optimize Geforce drivers for professional apps? That makes absolutely no sense.

As far as Tesla goes, the number of engineers required to support that SKU is minimal: a handful of additional board designs, big deal. All the software is paid for by Geforce marketing efforts.

Again, a trivialization of the time and effort involved on both the hardware and software side to support Tesla, what else is new :)
 
This is incorrect.

GTS 250 was essentially a rebranded 9800 GTX+, no question about it.

GT 330 is quite different from GTS 250. The closest comparison to GT 330 is GT 240, but even then, GT 330 can potentially differ from GT 240 in the following areas: number of CUDA "cores", graphics clock frequency, processor clock frequency, memory clock frequency, memory configuration, memory interface, DirectX version, etc.

GT 330 is an OEM-only card. No OEM is going to give consumers the choice between a GT 240 and a GT 330, unless they are out of their mind.

I think you've got things wrong again. The GT330 is using a G92 core, whereas the GT240 uses the GT215. The fact that the 330 offers DirectX 10 only (and not the 10.1 the GT21x GPUs offer) would seem to indicate that the 330 is yet another salvage part from the G92 core. Its specifications seem to range wildly: 128, 192 or 256 bit, 96 or 112 "CUDA cores", and only 500 MHz DDR2 or 800 MHz GDDR3, whereas GT21x parts are capable of running DDR3, GDDR3 or GDDR5. If anything the GT330 looks closer to a rebadged 9800GT (112 cores, 256-bit, GDDR3) or 9600GSO (96 cores, 128/192-bit, DDR2/GDDR3). I'd wager that with such a wide range of possibilities it's simply meant to act as a dumping ground for any remaining G92s that couldn't be positioned elsewhere, with the final specification left up to the OEMs. So while the particular specification, and the performance, may vary, the GT330 is closer to a GTS250 in construction than to a GT240, though performance-wise it will depend on the configuration.

Honestly it's sad that such rehashing/rebadging continues from any party. You could have two OEM systems, both with GT330s, one a 128-bit DDR2 @ 500 MHz and the other 256-bit GDDR3 @ 800 MHz... confusing to say the least.

http://www.xtremesystems.org/forums/showthread.php?t=244604 MSI V186 GeForce GT330 768MB GDDR3 192bit
 
If you look at Fermi, there are a number of improvements that are more focused on programmability than graphics performance - these do increase die area and power.

The coherency and weakly consistent caches with a unified memory address space are one example. That definitely adds quite a bit of overhead in terms of TLBs, etc.

The caches help for graphics too, and replace the inter-stage FIFOs altogether in Fermi, so they are not exactly compute-specific.

Double precision support.

With SP:INT at 1:1, I think 2:1 SP:DP was pretty much free.

The fast denorm support is another - and FWIW, the reason to keep it around is not PR, but users. You don't want code to suddenly get 2X slower.

If you are running into denormals, the 2x perf hit should be the last of your worries. Denormals popping up means (possibly huge) errors in data/code/algorithm. To be sure, they'll turn up occasionally, and then and there it doesn't really matter if they are handled in software. When they start turning up all over the place is when you have a problem.

ECC for all on-die SRAM is also an overhead which doesn't help graphics.

That is an overhead, sure.

Some of the synchronization tweaks may not really be all that helpful for graphics either.

I am not aware of any new synchronisation tweaks beyond those in GT200. Could you give me some pointers on this one?

Given where NV makes most of its profit (professional), it's quite reasonable for them to add all these things. But we should be clear, they will pay a price at the high end. That may not be a big issue, since the truly high-end parts are hardly high volume. A more interesting question is how much extra they are paying for low-end and mid-range cards...
I am expecting all this overhead to be kicked out for all parts cheaper than GTX470.
 
My understanding is that without denormal support, two different ways of arriving at the number zero aren't guaranteed to produce the exact same representation, which means a comparison between them can sometimes come out "false".
If you are running into denormals, the 2x perf hit should be the last of your worries. Denormals popping up means (possibly huge) errors in data/code/algorithm. To be sure, they'll turn up occasionally, and then and there it doesn't really matter if they are handled in software. When they start turning up all over the place is when you have a problem.
Maybe you should read up on denormals e.g. in Wikipedia. Their point is to avoid underflow and e.g. divide by zero errors. They mean some loss of precision, but not usually bugs or errors in code.

Handling them in software wouldn't be a 2x performance hit, but more like 100x, I'm guessing. (Orders of magnitude in any case.) Although I'm not saying that would be a huge problem either.
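As an aside, here's a small self-contained C sketch (mine, not from anyone in the thread) of the comparison guarantee that gradual underflow preserves: with subnormals, x != y implies x - y != 0, whereas a flush-to-zero unit (simulated below with a hypothetical helper) can report the difference of two distinct values as exactly zero.

Code:
/* A standalone sketch (mine, not from the thread) of what gradual underflow
 * buys you: with subnormals, x != y guarantees x - y != 0.  If subnormal
 * results get flushed to zero, two distinct values can look "equal"
 * through their difference. */
#include <stdio.h>
#include <float.h>
#include <math.h>

/* Crude stand-in for a flush-to-zero unit: subnormal results become 0.0. */
static double flush_to_zero(double v)
{
    return (v != 0.0 && fabs(v) < DBL_MIN) ? 0.0 : v;
}

int main(void)
{
    double x = DBL_MIN;        /* smallest normal double, ~2.2e-308 */
    double y = DBL_MIN * 0.5;  /* subnormal, but still nonzero      */

    printf("x != y        : %d\n", x != y);               /* 1: they differ     */
    printf("x - y         : %g\n", x - y);                /* subnormal, nonzero */
    printf("flushed x - y : %g\n", flush_to_zero(x - y)); /* 0: looks "equal"   */
    return 0;
}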
 
Maybe you should read up on denormals e.g. in Wikipedia. Their point is to avoid underflow and e.g. divide by zero errors. They mean some loss of precision, but not usually bugs or errors in code.

Well, any series of computations that leads to O(10^-317) numbers is not just underflow, it is catastrophic.

Handling them in software wouldn't be a 2x performance hit, but more like 100x, I'm guessing. (Orders of magnitude in any case.) Although I'm not saying that would be a huge problem either.

He probably meant 2x overall; the denormal handling itself is orders of magnitude slower.
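Just to put the O(10^-317) figure in context, here is a tiny C check (again mine, and assuming a C11 <float.h> for DBL_TRUE_MIN) showing that values of that magnitude sit below the normal double range, i.e. well inside subnormal territory:

Code:
#include <stdio.h>
#include <float.h>
#include <math.h>

int main(void)
{
    printf("smallest normal double    : %g\n", DBL_MIN);      /* ~2.2e-308        */
    printf("smallest subnormal double : %g\n", DBL_TRUE_MIN); /* ~4.9e-324 (C11)  */

    double d = 1e-317;  /* the sort of magnitude mentioned above */
    printf("1e-317 is subnormal       : %d\n", fpclassify(d) == FP_SUBNORMAL);
    return 0;
}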
 
Question:

Were the rumored release dates of the GF100 already posted here? In the 3DCenter forum, two guys said that the GF100 will be released on the 27th or 29th of March 2010.
 
I like all the armchair MBAs here who just know (from their guts) how NVIDIA should run its business.
If you're already annoyed by the armchair MBAs, I can't imagine the agony you must experience over the armchair silicon experts... ;)

If even Jawed is starting to spout this kind of nonsense...
Charlie's argument that the architecture (G80-derived, essentially) is unmanufacturable appears to hold some water, because G80 is the only large chip that appeared in a timely fashion.
 
It's funny that Charlie thinks using the "gpgpu shader" for tessellation (second stage) is a good strategy after this:

In the R870, if you compare the time it takes to render 1 Million triangles from 250K using the tesselator, it will take a bit longer than running those same 1 Million triangles through without the tesselator. Tesselation takes no shader time, so other than latency and bandwidth, there is essentially zero cost. If ATI implemented things right, and remember, this is generation four of the technology, things should be almost transparent.
Contrast that with the GT300 approach. There is no dedicated tesselator, and if you use that DX11 feature, it will take large amounts of shader time, used inefficiently as is the case with general purpose hardware. You will then need the same shaders again to render the triangles. 250K to 1 Million triangles on the GT300 should be notably slower than straight 1 Million triangles.
The same should hold true for all DX11 features, ATI has dedicated hardware where applicable, Nvidia has general purpose shaders roped into doing things far less efficiently. When you turn on DX11 features, the GT300 will take a performance nosedive, the R870 won't.
http://www.theinquirer.net/inquirer/news/1137331/a-look-nvidia-gt300-architecture
 
Without making this about Charlie... did you actually read all of what he said, or did you get halfway into the sentence and go AHUH!!?!?!?

I read it...

GF100/GTX480 was not meant to be a GPU, it was a GPGPU chip pulled into service for graphics when the other plans at Nvidia failed. It is far too math/DP FP heavy to be a good graphics chip, but roping shaders into doing tessellation is the one place where there is synergy.
So, this is really funny, no?

The same should hold true for all DX11 features, ATI has dedicated hardware where applicable. Nvidia has general purpose shaders roped into doing things far less efficiently. When you turn on DX11 features, the GT300 will take a performance nosedive, the R870 won't.
 
That doesn't sound very positive from Charlie. If that's the case then AMD ought to have the ability to release a faster version of the HD 5870 (5890??) to reclaim the single-card performance crown, though it'll be tough to deal with the noise and thermals without exotic cooling solutions. Maybe the late and great Eyefinity edition is the card that's tipped to do it?
 
It's funny that Charlie thinks using the "gpgpu shader" for tessellation (second stage) is a good strategy after this:


http://www.theinquirer.net/inquirer/news/1137331/a-look-nvidia-gt300-architecture
How many times are you going to keep going back to that well... ??
Well, I suppose if we went back through your post history and found all the things you posted that turned out to be incorrect, it would be rather easy to paint you into a corner and make you look rather comical, to say the least. How about, instead of dragging this thread back into a discussion about the ramblings of ChuckyD, you either post something that doesn't degrade into yet another spill-over, or you go to S|A and take up your issues there (unless you've been banned there as well for being unable to follow posting rules).

As far as his latest postings go, I personally think the numbers don't add up to be as bad as some are saying, but if we are to compare products equally then the GF100 HAS to be at least 30% faster than the (IIRC) 5800 series for some here. After all, weren't some here saying the HD5800 isn't that great, as it's hardly faster than the competition (5850 being equal and 5870 < 30%)? The 70% fan speed at idle seems more about BIOS profiling than anything else, but the idle temps could suggest that power reduction schemes are not yet in place, with performance a (the) main concern.

Talk about pot vs. kettle yet again. I've reported the posts as off-topic, and I see the mods have seen fit to let the thread continue, so I'll add my three cents anywho.
 
That doesn't sound very positive from Charlie. If that's the case then AMD ought to have the ability to release a faster version of the HD 5870 (5890??) to reclaim the single-card performance crown, though it'll be tough to deal with the noise and thermals without exotic cooling solutions. Maybe the late and great Eyefinity edition is the card that's tipped to do it?

They're at 180W (and, going by gaming power consumption, much less).
IIRC the 5850 uses less than a GTS250 when gaming... even though the paper specs are tied (nV TDP vs AMD max board power).

What I'd really like to see AMD build, besides the eventual 5890, is a 5870 with reduced length and power draw via a minor clock cut and voltage drop, and of course at a lower price.

"Look you're using ~2x the power to be slightly faster!" might be even more damning than winning 15% on the halo.

Which brings us back to GF100. This had better not all be true, especially the performance metrics, otherwise the half-cut card wouldn't manage to do much even when pushed to screaming clocks.

Gonna make the same mistake I did with R600: assume it's the drivers for now :p
 