NVIDIA GF100 & Friends speculation

Well, if these performance rumors are true, so much for my theory that AMD struck out because the 5870 was only 40% faster than the 4890 despite doubling most of its functional units.

It would appear Nvidia landed in the same boat, so apparently AMD did nothing wrong.

And this after some hinted Nvidia was going to shatter this barrier due to its doubled geometry setup or whatever...
 
The same thing happened back when the 2900 XT launched. In 3DMark06 it was quite a bit faster than the 8800 GTS, but in games it was equal to or slower.

If the new tape-out is just B1 (and we all know they will need B2 for mass production), then this is bad. It should have been another product in the DX11 range.

I am sure Charlie will have the details soon and be the first to report it.
 
And this after some hinted Nvidia was going to shatter this barrier due to its doubled geometry setup or whatever...
I think it's pretty obvious that the massive increase in geometry performance with GF100 won't show up in most current benchmarks, but it should allow GF100 to pull ahead in upcoming games.
 
If the new tape-out is just B1 (and we all know they will need B2 for mass production), then this is bad. It should have been another product in the DX11 range.

I am sure Charlie will have the details soon and be the first to report it.
Why? If they want to release the rest of the family this fall, say in September, then they don't need to tape out for another 1-3 months (assuming the tapeout works).
 
I think it's pretty obvious that the massive increase in geometry performance with GF100 won't show up in most current benchmarks, but it should allow GF100 to pull ahead in upcoming games.

Will it actually come true this time? I remember hearing how ATI's vast shader power edge would enable it to pull ahead in "future games" in the X1950 era. Never really happened to any great degree.
 
It appears that Nvidia has decided that the "announcement of an announcement" idea didn't go over quite as well as they'd hoped and decided to simply let the cat out of the bag:

NVIDIA We want to apologize for the confusion around our most recent GF100 update. To clarify, the launch date for GeForce GTX 480 and GTX 470 is March 26, 2010. This date also coincides with the GeForce LAN event NVIDIA is hosting at PAX 2010. Hope you can attend the show. For more info, please visit:
www.nvidia.com/paxeast

http://www.facebook.com/NVIDIA
 
Will it actually come true this time? I remember hearing how ATI's vast shader power edge would enable it to pull ahead in "future games" in the X1950 era. Never really happened to any great degree.
Well, in this case there's a relatively easy test: form a plot of performance vs. release date for recent games, and see if there's a correlation. Performance could be normalized, for example, to one of the ATI cards' performance in a given benchmark.

One would tend to expect that if the GF100 is doing better in more recent games, then future games should continue this trend.

If there is no such trend, then we shouldn't expect much of any improvement in newer games except those that make use of tessellation.
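
To make that concrete, here is a minimal C sketch of such a test (the data points are invented placeholders, not real benchmark results): it computes the Pearson correlation between release year and a GF100 performance ratio normalized to an ATI card, as suggested above. A clearly positive coefficient would support the "pulls ahead in newer games" theory.

/* Sketch of the proposed test, with made-up data points. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    /* Release year of each game (x) and hypothetical GF100/HD5870
       performance ratio in that game's benchmark (y). */
    double year[]  = { 2007, 2008, 2008, 2009, 2009, 2010 };
    double ratio[] = { 0.95, 1.00, 1.02, 1.05, 1.08, 1.15 };
    int n = sizeof(year) / sizeof(year[0]);

    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    for (int i = 0; i < n; i++) {
        sx  += year[i];
        sy  += ratio[i];
        sxx += year[i] * year[i];
        syy += ratio[i] * ratio[i];
        sxy += year[i] * ratio[i];
    }

    /* Pearson correlation coefficient of year vs. ratio. */
    double r = (n * sxy - sx * sy) /
               sqrt((n * sxx - sx * sx) * (n * syy - sy * sy));
    printf("correlation(release year, perf ratio) = %.3f\n", r);
    return 0;
}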
 
The translation is correct, but enet is a reliable Chinese site, so it must be based on anonymous information, not pure guesswork.

How can the translation be correct when nobody can make any sense out of the translated sentence?

OT: Is Chinese->English translation usually this bad?
 
How can the translation be correct when nobody can make any sense out of the translated sentence?

OT: Is Chinese->English translation usually this bad?
Yes. The languages are quite different, making any sort of automatic translation hard.
 
As far as I understand Vol. 6 of AMD's manuals and the discussion on various sites and pages, if both the "left" and "right" thread of a module use FP calculations, the FPU behaves as two 128-bit-wide SIMD FPUs, each taking two passes over the YMM registers (like the K7's 64-bit FPUs took two passes over the XMM registers). If only one side of the module uses FP calculations in the current time slice, it can occupy the entire FPU, doing a full 256-bit SIMD operation on a YMM register.

How flexible that load balancing really is remains to be seen, but with some effort I can imagine they could load-balance at instruction granularity.

Intel introduced a really strange mode switch between 128-bit and 256-bit SIMD instructions, similar to the EMMS switch between the "regular" FPU and the SIMD FPU. Basically they did that to avoid storing the entire YMM register state on task switches, or something like that. I can't really follow the reasoning, but hey, they're supposed to be the smart guys, so let them be smart.

I can't find any mention of AMD following that lead, but such an instruction would basically allow compiler/OS-emitted explicit FPU split/merge (besides all the other nastiness, which doesn't make it worthwhile IMHO).
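
To make the 128-bit/256-bit business concrete, here is a rough C sketch with AVX intrinsics (my own illustration of the ISA mechanics, not anything from AMD's or Intel's docs): the same eight-float add done as one full-width YMM operation or as two XMM halves, which is essentially how a shared 2x128-bit FPU would chop it up anyway, plus the VZEROUPPER-style mode switch mentioned above.

#include <immintrin.h>

/* One full-width 256-bit add over a YMM register. */
void add8_ymm(const float *a, const float *b, float *out)
{
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    _mm256_storeu_ps(out, _mm256_add_ps(va, vb));
    _mm256_zeroupper(); /* clear the upper YMM halves before any
                           legacy SSE code runs, avoiding the
                           state-transition penalty discussed above */
}

/* The same work as two 128-bit halves -- two passes over the data,
   analogous to how the module's paired 128-bit FPUs would split it. */
void add8_2x_xmm(const float *a, const float *b, float *out)
{
    __m128 lo = _mm_add_ps(_mm_loadu_ps(a),     _mm_loadu_ps(b));
    __m128 hi = _mm_add_ps(_mm_loadu_ps(a + 4), _mm_loadu_ps(b + 4));
    _mm_storeu_ps(out,     lo);
    _mm_storeu_ps(out + 4, hi);
}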

The SSE unit is not shared in the sense that each core had its private 128-bit ALU in Barcelona. In BD, the two ALUs from neighboring cores are taken and put together in the center of the module. So while the FP instruction streams from both cores are muxed into a single 256-bit ALU, it is not sharing in the sense of two cores having to make do with less, since the per-core FP throughput is the same.

Our differences are over the semantics of "sharing" I guess.
 
Will it actually come true this time? I remember hearing how ATI's vast shader power edge would enable it to pull ahead in "future games" in the X1950 era. Never really happened to any great degree.

Don't mix up "future" with "near future".
One only needs to check the DiRT 2 benchmarks to see the X1950 XT stomp all over the 7900 GTX.

Any architectural advantages one may have should be displayed immediately and not three or four generations down the road.
 
Perhaps NV learnt something from GT200. It makes no sense to launch at a high price just to be forced down by the competition days after the launch.

Maybe the card simply sucks.

I'm guessing that the common-sense option, that the rumored price isn't accurate, doesn't really appeal to you?
 
Well, if these performance rumors are true, so much for my theory that AMD struck out because the 5870 was only 40% faster than the 4890 despite doubling most of its functional units.

It would appear Nvidia landed in the same boat, so apparently AMD did nothing wrong.

And this after some hinted Nvidia was going to shatter this barrier due to its doubled geometry setup or whatever...

Which performance rumors? The Vantage scores?
 
Yes. The languages are quite different, making any sort of automatic translation hard.

fixed :smile:

The gaming performance of the GTX480 and 470 should be made available on 22/2 according to Nvidia's plan. However, due to the difference in time zones, that is still two days away, so won't you just let me predict the performance numbers of the GTX470, based on my pure imagination? If the following predictions ever come true, it would only be coincidence. I will not be responsible for whatever I am going to say, and will not be answering any questions.

The GTX470 will have scores of 167xx and 73xx under the Performance preset (1280x1024, 1xAA) and the Extreme preset (1920x1200, 4xAA) respectively. These are 13% and 20% faster than the GTX285, and quite competitive with the 5870's results under the Performance preset.

The GTX470 will have a wide lead in benchmarks of the DX10 game Far Cry 2, with 80 fps at 1920x1200, 4xAA, 16xAF. However, as there are different Far Cry 2 benchmark runs, the data is not ideal for making comparisons with other video cards.

The GTX470 can achieve a frame rate of around 30 fps in the DX11 Unigine benchmark at 1920x1200 with 4xAA and 16xAF, which is apparently better than the 5870.



I tried my best to maintain the spirit and tone of the original article.
Hope I can help you guys out :cool:
 
The die photo shows something that looks very much like a 17th, redundant x,y,z,w,t ALU set. I think the gains there probably had a lot to do with the library being used. Additionally there may be a small benefit from having the TU dedicated, unlike in R6xx GPUs - I'm guessing this reduces some intra-SIMD complexity in routing data into/out of the distributed register files (the register files are located close to the ALUs, as register files in ATI are dedicated to the x,y,z,w,t ALU set).

Maybe the 17th is there to compensate for die defects.
Too bad I didn't link those really good sources; I think it was Dirk talking about how they were going through all the blocks and removing redundant logic. Wild shot: it was ~15%, I think.


Interesting, the DOT logic is reduced to another FMA there. I wonder if the DOT logic can be split into independent MUL, MUL, ADD operations. If (I don't know) x, y, z, and w can all do a DOT in one clock, there is far more FLOP logic available in the chip than is utilizable.
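
For illustration, here is a toy C sketch of that split (purely my own, not taken from any schematic): a 4-component DOT expressed as a dependent chain of fused multiply-adds, which is roughly what a DOT folded into FMA hardware amounts to, versus four independent MULs feeding an ADD tree.

#include <math.h> /* fmaf */

/* DOT as a serial FMA chain: four dependent ops through one unit. */
static float dot4_fma(const float a[4], const float b[4])
{
    float acc = a[0] * b[0];
    acc = fmaf(a[1], b[1], acc);
    acc = fmaf(a[2], b[2], acc);
    return fmaf(a[3], b[3], acc);
}

/* The split form: four independent MULs (which could run in parallel
   on the x,y,z,w lanes) followed by an ADD tree. */
static float dot4_split(const float a[4], const float b[4])
{
    float p0 = a[0] * b[0], p1 = a[1] * b[1];
    float p2 = a[2] * b[2], p3 = a[3] * b[3];
    return (p0 + p1) + (p2 + p3);
}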

I can't remember the page where I saw the schematics, so I just made a custom one. I haven't made one in 14 years, so be gentle. ;)

FMA-wires.png


I don't think there are flags - the programmer has to test the result of the instruction that might have generated the exception.

Uh, so how do you know an "exception" occurred in the first place?
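
Presumably "testing the result" amounts to something like this minimal C sketch (my reading of the idea, not how any particular GPU actually does it): with no sticky status flags, the code inspects the values an operation produced for NaN or Inf to detect that something exceptional happened.

#include <math.h>
#include <stdio.h>

int main(void)
{
    float x = 1.0f, y = 0.0f;
    float q = x / y; /* divide by zero -> +Inf under IEEE 754 */
    float n = y / y; /* invalid operation -> NaN */

    if (isinf(q)) printf("q: divide-by-zero happened\n");
    if (isnan(n)) printf("n: invalid operation happened\n");
    return 0;
}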
 
That's impossible to say; the ball is in Nvidia's court at the moment. AMD's response will depend on the price, performance and availability of the GTX470 and GTX480.

And why would Nvidia wait for a response from AMD?
Did you forget what happened with GT200? It would be really stupid to ignore the fact that AMD could, and I think will, reduce the price of their cards. A GTX470 for $399 with the performance of a 5870 doesn't look interesting after a nearly $100 price cut on the 5870...
 