NVIDIA: Beyond G80...

Which recent games are even close to being called (purely) fillrate limited today? Those games that used to be fillrate limited in the past are CPU limited today.

By the way, the Anand results show up like that for the simple reason that the shader clock only changes in specific increments while overclocking, which also explains why occasional "flat lines" appear at several frequency levels. In those cases the shader domain simply has the same frequency, despite the core clock being higher.
 
Well, Anandtech is the only site to independently overclock the core and shader clocks, so that's the only link I can use at the moment. I'm referring to situations that are GPU limited (high res with AA/AF). I still think their article points to being fillrate limited. I think you'd find similar results in virtually every game.

I never made any comments with massive overconfidence; even in my original post I said "I think".
 
Which recent games are even close to being called (purely) fillrate limited today? Those games that used to be fillrate limited in the past are CPU limited today.

By the way, the Anand results show up like that for the simple reason that the shader clock only changes in specific increments while overclocking, which also explains why occasional "flat lines" appear at several frequency levels. In those cases the shader domain simply has the same frequency, despite the core clock being higher.

I think most games that hit a GPU limit do so because of fillrate.
 
Yeah, the G80 is a bitch for the 3D Tables :( They really weren't designed with non-traditional architectures in mind, heh. I'm not sure what we have in the pipeline (no pun intended!) to fix that.

And yeah, G80 is certainly no slouch. However, consider what happens in a stencil pass: assuming you have implemented even just a basic compression algorithm, your bandwidth requirements are likely quite low. So, what's your bottleneck? heh! Well, it could be triangle setup limited too (that certainly wouldn't be the case for Doom3 or Quake4 :p), but I'm not sure about F.E.A.R...

Anyway, not all of the frame is bottlenecked by stencil, only one part of it. But for that specific part of the rendering process, I would assume (I can't really say this reliably sadly, though...) that performance would scale pretty much linearly with core clock for the G80. I'm curious what a z-only rendering pass would be bottlenecked by on G80; I'd assume triangle setup, but I really don't know. Maybe I'm overestimating how much of a bottleneck that is.
 
Well, Anandtech is the only site to independently overclock the core and shader clocks, so that's the only link I can use at the moment. I'm referring to situations that are GPU limited (high res with AA/AF). I still think their article points to being fillrate limited. I think you'd find similar results in virtually every game.

I never made any comments with massive overconfidence; even in my original post I said "I think".
Yeah, well, Anandtech's benchmarks don't come close to backing up your statements, because they show gains from both shader overclocking and core overclocking. Slightly more for core overclocking, to be sure, but not that much more. Furthermore, I have seen no reason here to believe that the G80 isn't texture rate limited, at least to some degree, in FEAR. Rather, I suspect that the limitation is not simple and one-sided in any modern game on the G80.
 
Yeah, well, Anandtech's benchmarks don't come close to backing up your statements, because they show gains from both shader overclocking and core overclocking. Slightly more for core overclocking, to be sure, but not that much more. Furthermore, I have seen no reason here to believe that the G80 isn't texture rate limited, at least to some degree, in FEAR. Rather, I suspect that the limitation is not simple and one-sided in any modern game on the G80.

Is there even a driver available that allows independent shader frequency overclocking?

Anand used BIOS modifications to change the shader domain frequencies.
 
why do you think overclocking the core clock gives such better gains?
1. They are only slightly better. So there is no possible way to claim that the architecture is solely limited by those units that are governed by the core clock, let alone a single one of those units.
2. Even if the game is shader limited, the shaders may become thread dispatch limited when one attempts to overclock the shaders without overclocking the core.

In short, the differences are small enough that it appears to me that the G80 is a pretty balanced architecture in these games, becoming limited by different things at different parts of the frame.
 
Is there even a driver available that allows independent shader frequency overclocking?

Anand used BIOS modifications to change the shader domain frequencies.
Hmm, sorry, I haven't even looked into the subject. I haven't overclocked my machine in ages.
 
Hmm, sorry, I haven't even looked into the subject. I haven't overclocked my machine in ages.

NV has hinted between the lines that future drivers will allow increasing the shader clock independently. Anand's experiment was done through BIOS changes, which they then flashed the GPUs with.
 
2. Even if the game is shader limited, the shaders may become thread dispatch limited when one attempts to overclock the shaders without overclocking the core.

This means that overclocking the shader domain alone would not be that useful?
 
I always did find it funny that the G80 shader clock seemed to be dependent on the memory clock.

- 1800MHz (GTX memory) / 1350MHz (GTX scalar ALU's) = 1.33333...
- 1600MHz (GTS memory) / 1200MHz (GTS scalar ALU's) = 1.33333...
 
Has NV ever changed process without changing model number at the high end?

Not sure, I might have to check, but I'm pretty sure it has never happened from NV30 onwards.

I was thinking more along the lines of the 7800GTX 512MB, that is, cherry-picked G80s with the latest Samsung 1.0ns GDDR3 RAM. Also remember those rumours of the G80 shader clock being scalable to 1500MHz, i.e. 1.5GHz? (I think it was the Taiwanese slides.) 1024MB of VRAM would be nice too.

So, using INkster's findings:
Ultra - 2000MHz (1.0ns GDDR3 memory) / 1500MHz (128 scalar ALU's) = 1.33333..
GTX - 1800MHz (1.1ns GDDR3 memory) / 1350MHz (128 scalar ALU's) = 1.33333...
GTS - 1600MHz (1.25ns GDDR3 memory) / 1200MHz (96 scalar ALU's) = 1.33333...
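If anyone wants to play with that, here's a throwaway Python sketch of the ratio check; the Ultra clocks are purely the speculation above, not confirmed specs.

```python
# Sanity-check the speculated 4:3 memory-to-shader clock ratio on G80 parts.
# The "Ultra" figures are this thread's speculation, not confirmed specs.
clocks = {
    "GTS":   (1600, 1200),   # (memory MHz, shader MHz) - known
    "GTX":   (1800, 1350),   # known
    "Ultra": (2000, 1500),   # speculative
}

for name, (mem_mhz, shader_mhz) in clocks.items():
    print(f"{name}: {mem_mhz} / {shader_mhz} = {mem_mhz / shader_mhz:.5f}")
# All three come out to 1.33333..., i.e. a 4:3 ratio.
```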

Take this with a grain of salt though :LOL:

However, like Uttar said, if the supposed 8800 Ultra had a die shrink to the 80nm process, then I would think they would consider changing the GPU code name. And why name it "8800 Ultra" when 8900GTX would have been the more logical choice?

edit - does G80 need a respin/tweaking in order to use GDDR4 memory? Or can they pull off what ATI did with R580 and GDDR4 memory? Not enough info on G80's memory controller to confirm this.
 
Useless speculation:

1350MHz without the missing MUL = 345 GFLOPS
1350MHz with the missing MUL utilized 50% of the time = 432 GFLOPS
1500MHz with the missing MUL utilized 50% of the time = 480 GFLOPS
1500MHz with the missing MUL utilized 66% of the time = 512 GFLOPS

So, an R600 spoiler (given peakalu figures) is a cherry-picked G80 clocked at 1500MHz on the SPs, and a new driver that enables the missing MUL to be used 66% of the time, with its own "peakalu" benchmark that shows a cascade of MADs/MULs.
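For reference, the arithmetic behind those figures is just 128 SPs x flops per clock x shader clock; here's a quick Python sketch of it, where the MUL utilisation percentages are the guesses above, not measured data.

```python
# Peak-FLOPS arithmetic for G80's 128 scalar ALUs: one MAD (2 flops) per clock,
# plus the "missing" MUL (1 flop) whenever it can actually be co-issued.
# Clocks above 1350MHz and the MUL utilisation figures are speculation, not data.
NUM_SPS = 128

def peak_gflops(shader_mhz, mul_utilisation=0.0):
    flops_per_clock = 2.0 + 1.0 * mul_utilisation
    return NUM_SPS * flops_per_clock * shader_mhz / 1000.0

print(peak_gflops(1350))          # ~345.6 GFLOPS (MAD only)
print(peak_gflops(1350, 0.5))     # 432.0
print(peak_gflops(1500, 0.5))     # 480.0
print(peak_gflops(1500, 2 / 3))   # 512.0 (MUL ~66% of the time)
```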
 
Hi guys,

Since we have started talking about 8800 Ultra, and possible speed increases, Digit-life.com has a nice test comparing standard 8800GTX at 575/1350/1800 and EVGA's overclocked model at 625/1450/2000. You can find the review here: www.digit-life.com/articles2/video/g80-6.html

DemoCoder, it seems that nVidia could easily bump the shader clock up to 1500MHz (they can clock the shader domain independently after all) with a little binning.

BTW, apropos of icecold's and Chalnoth's discussion:
In two games, Fear (with AA and AF) and Call of Duty (regardless of the settings), the 8800GTX is usually up to 50% faster than the X1950XTX. Out of all popular games, these show the least difference between the two. Any guess why that is? NB, the difference between both cards in Fear without AA and AF is much bigger (up to 100%).
 
NB, the difference between both cards in Fear without AA and AF is much bigger (up to 100%).
It might be wise to isolate AF and AA there. I would tend to believe the reason you're seeing this is that G80 has a pretty big hit for AA in FEAR. What's interesting, iirc, is that it also has a huge performance hit between 8x MSAA and 16xQ CSAA, which primarily affects the ROPs' performance. As such, I suspect the ROPs are bottlenecking a significant part of the frame's rendering with high levels of AA in FEAR.

Useless speculation #2: we won't see an Ultra card, but a new core some time around September-October.
Maybe not-so-useless hint: We might actually see both. And before anyone points out the obvious, do consider that there would be a 4-7 month gap between the two, which fits the so-called "6 months product cycle".
 