NVIDIA Fermi: Architecture discussion

(It was discussed here when it came)
And again 4890->5870 is +100%/100%/100%/23% (alu/tex/rop/bw) and becomes like +50% IRL.
While 285->380 is (from the article) +140%/60%/50%/21% and should be +100% IRL? That would require some SERIOUS graphics-relevant architectural improvements (especially bw saving) that the article doesn't mention. (Granted, Rys' claim is a bit more theoretical and therefore not as bold as Silus'.)

Unless 285's bottleneck was shader throughput, which I think is unlikely (to that extent on average, at least)...
 
I was scratching my head while reading Psycho's 21% bandwidth increase, but then I went back to the techreport article and saw a 1300MHz memory clock for the GTX285. It is actually clocked at 1242MHz if memory serves, which works out to ~159GB/s.
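For anyone who wants to check the bandwidth figure, here is a minimal sketch of the arithmetic. The 512-bit bus width and the double data rate of GDDR3 are not stated in the post and are filled in as assumptions; the function name is just illustrative.

```python
def bandwidth_gb_s(mem_clock_mhz, bus_width_bits, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock=2 assumes double-data-rate memory such as GDDR3.
    """
    bytes_per_transfer = bus_width_bits / 8
    return mem_clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


# GTX285 with an assumed 512-bit bus, at the two clocks mentioned in the post.
gtx285_actual = bandwidth_gb_s(1242, 512)   # clock recalled in the post
gtx285_article = bandwidth_gb_s(1300, 512)  # clock listed in the article

print(f"1242MHz: {gtx285_actual:.1f} GB/s")
print(f"1300MHz: {gtx285_article:.1f} GB/s")
```

With the 1242MHz clock this lands at roughly 159GB/s, so a percentage increase computed against the 1300MHz baseline would understate the real gain.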

I don't know where Rys got his frequency estimates from, but his memory frequency for GF100 seems quite pessimistic and his hot clock estimate way too optimistic considering his core frequency estimate. I continue to expect a roughly 2.2x ALU:TMU ratio and a memory frequency >=1100MHz.
 
The only problem with that theory, though, is that 2x 9600GTs are almost as fast as a GTX280 in shader intensive games: Crysis, Stalker and Oblivion. 2x 9600GTs are within 10-15% of a GTX280 at several resolutions. Something was borked in the SPs of GT200 chips, as no way in hell should 2x 9600GTs or 1 9800GTX be within 50% of 1 GTX280, but they are in every single shader intensive game bench I've seen. I'll be doing some testing next week and I'll share my results here.

I'm looking forward to seeing those... but in the meantime: G94 was the most balanced chip I've seen in years, when taking into account the whole package.
 
I'll be doing some testing next week and I'll share my results here.

Yeah, some "clock reconfiguration" benchmarking for gtx285 would be interesting :)
For instance, test at 325/750/620 (half 285) vs 500/1800/750 (half 380) if your shaders can go that high. In some realistic modern games, and maybe also with and without some CPU downclocking.
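One way to read those two clock sets is as GTX285 stock clocks scaled by the throughput gains quoted earlier in the thread (+140% ALU, +50% ROP, +21% bandwidth) and then halved. A rough sketch of that reading follows. The stock clocks of 648/1476 for core and shader are an assumption not given in the thread, and mapping the core clock to the ROP figure rather than the +60% texture figure is a guess.

```python
# GTX285 stock clocks in MHz (core/shader assumed; memory is the 1242 quoted above).
GTX285_CLOCKS = {"core": 648, "shader": 1476, "memory": 1242}

# Gains from the article as quoted in the thread: ROP +50%, ALU +140%, bandwidth +21%.
FERMI_GAIN = {"core": 1.5, "shader": 2.4, "memory": 1.21}
NO_GAIN = {"core": 1.0, "shader": 1.0, "memory": 1.0}


def scaled_clocks(base, gain, fraction=0.5):
    """Scale each clock domain by its gain, then take the given fraction of it."""
    return {domain: round(base[domain] * gain[domain] * fraction) for domain in base}


half_285 = scaled_clocks(GTX285_CLOCKS, NO_GAIN)
half_380 = scaled_clocks(GTX285_CLOCKS, FERMI_GAIN)

print("half 285:", half_285)
print("half 380:", half_380)
```

Under those assumptions the results come out close to the rounded 325/750/620 and 500/1800/750 figures, which suggests the test is meant to emulate the proposed scaling on GT200 hardware at clocks the card can actually reach.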

I don't know where Rys got his frequency estimates from, but his memory frequency for GF100 seems quite pessimistic and his hot clock estimate way too optimistic considering his core frequency estimate. I continue to expect a roughly 2.2x ALU:TMU ratio and a memory frequency >=1100MHz.

Also puzzled me a bit, as it would seem pretty memory limited (except that 5770/5870 isn't that badly limited after all), and faster memory is an easy option.
 
This is my plan of action.

I'll load up Crysis, Stalker and Oblivion, the 3 most shader intensive games I have. I will use 1920x1080, my monitor's native res, and I'll post results of testing using those 3 games. Each game shall have multiple results. The first result will be a single GTX260 216 at 650/1512, then at 325/1512 and then at 650/756. Then, after some reconfiguring, I'll post 3 more results of the same games with my pair of XFX 9600GT Alpha Dogs in SLI. Their default clocks are 700/1700, which I will reduce to 650/1512, the same clocks at which my GTX260 will be running. AA will not be used, as that is fillrate/bandwidth intensive and would severely affect the scores.

My rig
Q6600@3.0 (I will try and tweak it this week for 3.6 with my water cooling setup)
8GB DDR2 1066 ram
160GB HDD
780i motherboard
Win7U 64bit
Cosmos 1000

Cards to be used
2x XFX Alpha Dog 9600GTs 700/1700 GPU clocks
1 of my 2 eVGA GTX260 216 OCs

Games:
Crysis
Stalker:SoC
Oblivion using a saved game file from an area that stresses the shaders very hard.

I wish I had a GTX280 or GTX285 for a better comparison, hell a 9800GTX would be nice too.

Now, given the games selected are shader intensive, the 9600GTs in SLI shouldn't be within 40% of the GTX260 216. But I can tell you now, they will be within 10-15%. The 9600GTs in SLI have SLI scaling, 128SPs and 16 ROPs; the GTX260 216 has 216SPs and I think 24 or 28 ROPs, don't remember which.
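The "shouldn't be within 40%" expectation can be checked on paper with a quick sketch. It assumes throughput scales with SP count times shader clock, that both setups run at the 1512MHz shader clock from the test plan, and that SLI scales perfectly, none of which holds exactly in practice.

```python
def relative_alu_throughput(sps_a, shader_mhz_a, sps_b, shader_mhz_b):
    """Ratio of paper ALU throughput of setup A to setup B (SPs x shader clock)."""
    return (sps_a * shader_mhz_a) / (sps_b * shader_mhz_b)


# 2x 9600GT in SLI (128 SPs total) vs GTX260 216, both at 1512MHz shader clock.
ratio = relative_alu_throughput(128, 1512, 216, 1512)
deficit = 1 - ratio

print(f"SLI pair has {ratio:.1%} of the GTX260's paper ALU throughput")
print(f"paper deficit: {deficit:.1%}")
```

On these assumptions the pair sits about 41% behind on paper, so a measured gap of only 10-15% would point to something other than raw shader throughput limiting the GTX260 in those games.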
 
I consider Crysis pretty bandwidth dependent, and considering its age Oblivion could be the same. Some bandwidth scaling would be nice too. But of course within this thread I would especially like to see the proposed "fermi scaling" compared to the gt200.
The dual 9600s would also have dual rasterizers and more bandwidth (similar with the dual 4890s clearly outperforming the 5870)
 
Just because a game makes heavy use of shaders doesn't mean it's entirely dependent upon shader performance. The bottleneck can change within areas of an individual frame, let alone from frame to frame or game to game.
 
So what about your theory about a screwed up memory controller? See, your information just doesn't make sense, since it's in a hundred different places.

When did GT240 ship? Why did it take that long? GDDR5 was a problem, and it looks to be fixed. What's the problem?

A1 wasn't a respin, so how do you expect 6-7 weeks?

Blink. What is your question? Hot lot time for TSMC 40nm is 6-8 weeks.

If there were any silicon changes, minimum it's 2 months to get chips back, and that is being very optimistic. Wasn't this already gone over in this very thread?

See above, 8 weeks ~= 2 months.

-Charlie
 
Because of your history with this kind of prediction: There is no problem. Tape-out was late, but since then they have been within the timeframe. And a launch with A3 in February would mean they were very fast.

ATI took 6-7 months from Evergreen tapeout to cards on sale. _IF_ GF100 goes on sale in Feb, that is 8 months. That is not very fast.

-Charlie
 
maybe you are guessing?

Oh no, I was talking about Win 7 parts, that's why I only selected portions of your post. And since you had no idea when nV pushed their launch out at the time of your post, I don't believe you knew anything.

Wow, and I thought Sherbin was delusional, but you seem to be on quite another level. You are well suited to marketing, but I will give that spin 3/10 for effort.

-Charlie
 
ATI took 6-7 months from Evergreen tapeout to cards on sale. _IF_ GF100 goes on sale in Feb, that is 8 months. That is not very fast.

-Charlie

7 months, because your tape-out date was mid-to-end July:

Nvidia's GT300 has finally taped out.
Our sources tell us that the GT300 is beginning it's 7 or 8 week long process through the very expensive machinery at TSMC. We said the target was a mid-July tapeout, and if you are very charitable with 'mid-', Nvidia hit it. A hair under a week ago, the chip had not taped out.
http://www.semiaccurate.com/2009/07/29/miracles-happen-gt300-tapes-out/
 
When did GT240 ship? Why did it take that long? GDDR5 was a problem, and it looks to be fixed. What's the problem?

You were talking about Fermi if I remember correctly and it was somewhere in this thread, so I don't know where you are coming from then.


Blink. What is your question? Hot lot time for TSMC 40nm is 6-8 weeks.

You do know hot lots aren't the same as tape out? :D

See above, 8 weeks ~= 2 months.


I don't think you looked at what I quoted.


Wow, and I thought Sherbin was delusional, but you seem to be on quite another level. You are well suited to marketing, but I will give that spin 3/10 for effort.

-Charlie

Says the blind man with the blind dog ;)
 
You were talking about Fermi if I remember correctly and it was somewhere in this thread, so I don't know where you are coming from then.

When he talked about the memory controller, I recall it was only in relation to GT2xx GDDR5 parts. He possibly could have speculated that Fermi may be affected as well, but that's it.
 
When he talked about the memory controller, I recall it was only in relation to GT2xx GDDR5 parts. He possibly could have speculated that Fermi may be affected as well, but that's it.

Yup, and if you notice other people comparing the GDDR5 controllers pixel for pixel in the 2xx and Fermi pics, you might see that they are very likely the same, or very similar, controllers.

-Charlie
 

Yet another event about computing with nothing about gaming... yay!

Oh and since we were on the topic of GF100's die size, Dudler from the SemiAccurate forums dug this up: http://www.fudzilla.com/content/view/15782/34/

Between Charlie's 23.x mm * 23.y mm (where x may be 8), Neliz's 24 mm * 24 mm, Fellix's die-shot matching and scaling and Fudo's... whatever that was, everything seems to indicate something between 550 and 576mm².
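As a sanity check on that range, a small sketch of the area arithmetic. It takes x = 8 as suggested above and lets the unknown digit y run from 0 to 9; the variable names are only illustrative.

```python
def die_area_mm2(width_mm, height_mm):
    """Area of a rectangular die in square millimetres."""
    return width_mm * height_mm


# Neliz's estimate: 24 mm x 24 mm.
neliz_area = die_area_mm2(24.0, 24.0)

# Charlie's estimate: 23.x mm x 23.y mm, assuming x = 8 and trying every digit for y.
charlie_areas = [die_area_mm2(23.8, 23.0 + y / 10) for y in range(10)]

print(f"Neliz:   {neliz_area:.0f} mm^2")
print(f"Charlie: {min(charlie_areas):.1f} to {max(charlie_areas):.1f} mm^2")
```

With x = 8 the Charlie figure spans roughly 547 to 569 mm², and the 24 mm square gives 576 mm², which is consistent with the 550 to 576mm² window.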
 