The LAST R600 Rumours & Speculation Thread

Status
Not open for further replies.
These arguments are quickly forgetting "The Basics (TM)", so here goes:

Res = very high ALU cost, high ROP cost, medium TMU/bandwidth cost.
AF = high TMU cost, low bandwidth cost, no ROP/ALU cost.
AA = high ROP/Bandwidth cost, very low TMU/ALU cost.

This is only approximate, but let me insist that you can't get a proper argument going based on the vague notion of "fillrate" nowadays! :)
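
For anyone who wants to play with the rule of thumb above, here is a minimal sketch that encodes the three lines as a lookup table and adds up the pressure per unit for a given combination of settings. The numeric weights (0 to 5) are purely an assumption made for illustration; the post only gives qualitative levels, so treat the totals as a ranking aid and nothing more.

```python
# Qualitative levels from the post mapped to arbitrary weights (assumption).
LEVEL = {"none": 0, "very low": 1, "low": 2, "medium": 3, "high": 4, "very high": 5}

# Cost table transcribed from the post: setting -> unit -> qualitative cost.
COST = {
    "res": {"alu": "very high", "rop": "high", "tmu": "medium", "bandwidth": "medium"},
    "af": {"alu": "none", "rop": "none", "tmu": "high", "bandwidth": "low"},
    "aa": {"alu": "very low", "rop": "high", "tmu": "very low", "bandwidth": "high"},
}


def pressure(settings):
    """Sum the weighted cost per unit for the settings being raised."""
    totals = {"alu": 0, "rop": 0, "tmu": 0, "bandwidth": 0}
    for setting in settings:
        for unit, level in COST[setting].items():
            totals[unit] += LEVEL[level]
    return totals


def most_stressed(settings):
    """Return the unit with the highest summed weight."""
    totals = pressure(settings)
    return max(totals, key=totals.get)


if __name__ == "__main__":
    print(pressure(["aa", "af"]))
    print(most_stressed(["aa", "af"]))
```

With these made-up weights, turning on AA and AF together leans hardest on bandwidth and the TMUs, while raising resolution alone leans on the ALUs, which is the point of not lumping everything under "fillrate".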
 
I'm confused. I look at the F.E.A.R. numbers for 1600x1200 (at Rage3D) and I see with no AA and no AF that 8800GTS completely roughs up X1950 XTX. And then when 4xAA/16af is turned on, they are suddenly within the margin of error (1fps). And that's not a giant blinking "BW limitation bitch-slapped you right here" sign? What am I missing?
 
One thing to bear in mind with the G80 architecture is that it's unified. I haven't put a lot of thought into how this will affect performance, but it seems to me to be worth thinking about when examining resolution scaling.
 
Fillrates for the GTS and X1950 XTX are also very close: 10,400 for the X1950 XTX vs. 10,000 for the GTS.
 
And yet 8800 has several advantages that clearly come into play with no aa and no af in the numbers giving it a 20%+ advantage that completely disappears when 4xaa/16xaf is turned on. . .even tho it has other advantages that *should* apply to such a scenario as well. For instance it has single cycle 4x AA whereas X1950 xtx has to take two cycles to do 4x. . . .and yet they are in a dead heat with 4xaa/16xaf turned on where GTS was kicking dirt in X1950 XTX's face without it. Doesn't that pretty strongly suggest that they both hit a bw wall there when AA was turned on?

Further, if you compare GTX to GTS on those FEAR numbers, what you see is that the 4xaa/16xaf results at 1600x1200 scale nearly exactly with BW between the two of them. . .whereas without aa/af GTS gets much closer to GTX, showing that GTS is not fillrate bound at 1600x1200 4xaa/16xaf, but rather bw bound.
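
A rough sketch of the theoretical numbers behind this argument, assuming the commonly published launch specs for the three cards. The ROP counts, clocks and bus widths below are not from this thread and should be checked against a spec sheet; only the 10,400 and 10,000 fillrate figures are quoted above.

```python
# Launch specs as commonly published (assumption, not taken from the thread).
CARDS = {
    "X1950 XTX": {"rops": 16, "core_mhz": 650, "bus_bits": 256, "effective_mhz": 2000},
    "8800 GTS": {"rops": 20, "core_mhz": 500, "bus_bits": 320, "effective_mhz": 1600},
    "8800 GTX": {"rops": 24, "core_mhz": 575, "bus_bits": 384, "effective_mhz": 1800},
}


def pixel_fillrate_mpix(rops, core_mhz):
    """Theoretical pixel fillrate in Mpixels/s: one pixel per ROP per clock."""
    return rops * core_mhz


def bandwidth_gbs(bus_bits, effective_mhz):
    """Theoretical memory bandwidth in GB/s: bytes per transfer times data rate."""
    return bus_bits / 8 * effective_mhz / 1000


if __name__ == "__main__":
    for name, c in CARDS.items():
        fill = pixel_fillrate_mpix(c["rops"], c["core_mhz"])
        bw = bandwidth_gbs(c["bus_bits"], c["effective_mhz"])
        print(f"{name}: {fill} Mpix/s, {bw:.1f} GB/s")
    gtx = bandwidth_gbs(384, 1800)
    gts = bandwidth_gbs(320, 1600)
    print(f"GTX/GTS bandwidth ratio: {gtx / gts:.2f}")
```

If those specs are right, the GTS and the X1950 XTX end up with the same theoretical bandwidth, and the GTX has about 1.35x the bandwidth of the GTS. That is the ratio to compare against the 4xAA/16xAF frame rates if you want to check the "scales with BW" claim yourself.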
 
But isn't 4xAA free in terms of ROP cost on G80? And when you say very low TMU/ALU cost - are those components involved in the AA process at all?
Yes, but not too much. They do extra work in pixels that are covered by more than one triangle.

As for bandwidth being a concern, I'm just not sure. After all, we are talking compressed framebuffers here. I doubt that the bandwidth increase from enabling AA is all that significant.
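
For scale, here is a back-of-the-envelope sketch of the uncompressed multisampled framebuffer footprint. It assumes 32-bit color and 32-bit depth/stencil per sample, which is an assumption and not something stated in the thread, and it deliberately ignores compression, so it is a worst-case bound rather than an estimate of real traffic.

```python
def msaa_buffer_mb(width, height, samples, bytes_color=4, bytes_depth=4):
    """Uncompressed color + depth footprint in MB (10^6 bytes).

    bytes_color and bytes_depth are assumed per-sample sizes.
    """
    return width * height * samples * (bytes_color + bytes_depth) / 1e6


if __name__ == "__main__":
    no_aa = msaa_buffer_mb(1600, 1200, samples=1)
    aa_4x = msaa_buffer_mb(1600, 1200, samples=4)
    print(f"1600x1200 no AA: {no_aa:.2f} MB")
    print(f"1600x1200 4xAA:  {aa_4x:.2f} MB")
```

The raw footprint grows linearly with the sample count; how much of that turns into actual bus traffic depends on how well compression works on the scene, which is exactly the open question here.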
 
OK, yes, I see what you are saying; I was reading the Rage3D chart incorrectly. But you still have resolution to play with here. If we look at the FiringSquad numbers as resolution goes up, bandwidth isn't what hurts the GTS, especially at 2560x1600, where there is a bottleneck shift. This isn't unusual: as resolution goes up, more fillrate and shader power are needed, and if upcoming games need more of these two anyway, the results will be more extreme.

What is the balance between bandwidth and fillrates and shader power needed for newer games?
 
Yes, but not too much. They do extra work in pixels that are covered by more than one triangle.

Thanks.

But what besides increasing bandwidth requirements could explain the sometimes significant (30%+) drop for applying 4xAA on G80 ?! If R600 comes out with single cycle 4xAA or better and is otherwise fillrate limited it could be a coup for high-res AA results.
 
Link to said results? I don't remember seeing this, so I'm curious.
 
Well, yes, we're going to have to have a playable fps at any given res *before* the AA gets turned on to see any usable advantage from the extra bw, and if that point happens far enough down the res chain then the bw advantage could be squandered when the AA is turned on. But looking at the charts, whatever we might see from new games, I'm still thinking there should be plenty of existing games at 1600x1200 and 1920x1200 w/4xaa and 8xaa where R600 should shine. Assuming it can at least roughly equal G80 *before* the AA gets turned on, and I don't see any particular reason to question that right now. Tho like everyone else, reserving the right to change my mind as new facts roll in. :smile:
 
In short, I don't really know. I'm no longer confident that it's just that simple, however. Today's triangle counts, for instance, may be a significant factor, in that it means there will be many pixels with multiple triangles contributing. Other possible issues are that it may increase pressure on the cache making memory access less efficient, or add in latency for frame buffer accesses that hurt performance for blending.

A decent way of testing this would be to downclock the memory and the core for separate benchmarks and look at how performance changes as a result. I suggest downclocking instead of overclocking because overclocking either one may result in the memory interface becoming the bottleneck. That is, overclocking the memory may, in some instances, lead to a situation where performance doesn't increase, not because memory bandwidth isn't a limit, but rather because the chip itself isn't capable of supplying the memory bus with that much information.
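
The downclocking test described above could be scored along these lines. This is only a sketch: the function names, the 0.25 margin and the sample frame rates are all made up for illustration, and real runs would need several repeats to get above benchmark noise.

```python
def clock_sensitivity(fps_base, fps_down, clock_base, clock_down):
    """Fraction of fps lost per fraction of clock removed.

    Near 1.0 means fps tracks that clock; near 0.0 means it barely matters.
    """
    fps_drop = 1 - fps_down / fps_base
    clock_drop = 1 - clock_down / clock_base
    return fps_drop / clock_drop


def likely_limit(core_sens, mem_sens, margin=0.25):
    """Crude classification; the margin is an arbitrary assumption."""
    if core_sens - mem_sens > margin:
        return "core"
    if mem_sens - core_sens > margin:
        return "memory"
    return "mixed"


if __name__ == "__main__":
    # Hypothetical results: 60 fps at 575/900, then each clock lowered by 10%.
    core = clock_sensitivity(60.0, 59.0, 575.0, 517.5)
    mem = clock_sensitivity(60.0, 55.0, 900.0, 810.0)
    print(f"core sensitivity {core:.2f}, memory sensitivity {mem:.2f}")
    print("likely limit:", likely_limit(core, mem))
```

Running the same pair of downclocks with and without 4xAA and comparing the two classifications would show whether the bottleneck really moves to memory once AA is enabled.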
 
How so? The GTX takes a pretty significant performance hit for enabling AA. Also R600 presumably is balanced in such a way that it has enough fillrate or other features to effectively utilize its enormous bandwidth.

In a way Razor is correct; if you clock a G80 to say 630/900 you'll get a higher performance increase than with 575/1000.

Where exactly is the "significant" performance drop for AA that I have missed?

http://users.otenet.gr/~ailuros/AAPerf.pdf

If you read carefully through the text you might notice that the current performance drop for 8xMSAA is only a tad higher than what the G70 initially used to lose in FEAR with just 4xAA.

You might find that one also slightly relevant:

http://users.otenet.gr/~ailuros/MCwithAA.pdf

NVIDIA needs to get off its collective butt and fine-tune its drivers. I realise it's not easy to maintain that many driver sets (especially now with Vista added to the mix), but it's high time they bent a couple of things into shape.

Although I doubt we'll see any extravagant fillrates on R600, I'll be generous with my speculation and give it a 50% advantage in 2048*1536 with 8xMSAA in FEAR as one example. That would account for roughly 38fps; not really playable if you ask me. Of course that is completely pulled out of thin air, but how good are the odds for higher increases after all?
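
The post doesn't state the G80 baseline it starts from, but it can be backed out from the two numbers given (roughly 38fps after a 50% advantage). A small sketch, with the function name being my own:

```python
def implied_baseline(projected_fps, advantage):
    """Baseline fps implied by a projected fps and a fractional advantage."""
    return projected_fps / (1 + advantage)


if __name__ == "__main__":
    base = implied_baseline(38.0, 0.50)
    print(f"implied G80 baseline: {base:.1f} fps")
    # Advantage that would be needed to reach an arbitrary 60 fps target.
    print(f"advantage needed for 60 fps: {60.0 / base - 1:.0%}")
```

So the speculation assumes G80 sits at about 25fps in that setting, and since "roughly 38fps" is itself a rounded figure, the result is only good to a frame or so.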
 
Something you guys might find interesting: 16x AF causes a 30% performance drop in TrackMania United on the 8800 GTX at 1920x1200.

I've never touched the game in question, but I sure as hell hope it's using at least MIPmapping; it wouldn't be the first racing game without it.
 