Chalnoth said:
The GeForce 6800 also has four times the ROPs, which, I claim, is much more important in this case. Put simply, due to compression, multisampling AA no longer imposes a significant memory bandwidth penalty.
Again, the NV43 is a mainstream card, and as such it's an excellent offering. I doubt anyone would expect to be able to play with 4xAA at high resolutions for ~$200.
MSAA is virtually free with 2x samples, while 4xAA, wherever there aren't any CPU limitations for instance, does take a bit more of a performance hit on today's accelerators. Saying that multisampling doesn't impose a significant memory bandwidth penalty, though, is only really true for 2xAA and is a VAST generalization. If I take a look at ATI's 6x sparse, or at the possibility of higher sample densities on today's cards, the bandwidth penalty will be there, simply because those cards obviously weren't laid out/designed for more.
I doubt AF performance penalties differ that much, though I haven't looked at that specifically.
I've been begging someone to write a good fillrate tester for quite some time now. There's zeckensack's Archmark for instance, but it's somewhat outdated. Just 2xAF, because there's no angle-dependency there (6800/16p):
1,2,3,4 = single, dual, triple, quad texturing
High Quality, noAF:
--Textured fillrate-----------------------------
----Bilinear filter-----------------------------
1 4.831 GPix/s
2 2.813 GPix/s
3 1.876 GPix/s
4 1.407 GPix/s
----Trilinear filter----------------------------
1 2.821 GPix/s
2 1.393 GPix/s
3 940.988 MPix/s
4 705.767 MPix/s
High Quality, 2xAF:
--Textured fillrate-----------------------------
----Bilinear filter-----------------------------
1 2.739 GPix/s
2 1.410 GPix/s
3 940.102 MPix/s
4 705.101 MPix/s
----Trilinear filter----------------------------
1 2.782 GPix/s
2 1.410 GPix/s
3 940.105 MPix/s
4 705.106 MPix/s
Quality, noAF:
--Textured fillrate-----------------------------
----Bilinear filter-----------------------------
1 4.907 GPix/s
2 2.813 GPix/s
3 1.876 GPix/s
4 1.407 GPix/s
----Trilinear filter----------------------------
1 3.321 GPix/s
2 1.661 GPix/s
3 1.103 GPix/s
4 825.448 MPix/s
Quality, 2xAF:
--Textured fillrate-----------------------------
----Bilinear filter-----------------------------
1 4.710 GPix/s
2 2.645 GPix/s
3 1.759 GPix/s
4 1.317 GPix/s
----Trilinear filter----------------------------
1 5.280 GPix/s
2 2.645 GPix/s
3 1.759 GPix/s
4 1.317 GPix/s
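To make the scaling in those numbers easier to read, here is a rough Python sketch (my own helper, not part of Archmark) that multiplies each pixel fillrate by its number of texture layers to get an approximate texel rate. The figures are copied from the High Quality, noAF run above; the name `texel_rate` is just for illustration. For the multitextured cases the product comes out roughly constant, and trilinear lands at about half of bilinear.

```python
# Fillrate figures copied from the "High Quality, noAF" run above, in GPix/s.
# Keys are the number of texture layers (1 = single ... 4 = quad texturing).
bilinear = {1: 4.831, 2: 2.813, 3: 1.876, 4: 1.407}
trilinear = {1: 2.821, 2: 1.393, 3: 0.940988, 4: 0.705767}


def texel_rate(fillrate_by_layers):
    """Approximate texel rate: pixel fillrate times texture layers per pixel."""
    return {layers: rate * layers for layers, rate in fillrate_by_layers.items()}


bi = texel_rate(bilinear)
tri = texel_rate(trilinear)

for layers in sorted(bilinear):
    ratio = tri[layers] / bi[layers]
    print(f"{layers} layer(s): bilinear {bi[layers]:.3f} GTex/s, "
          f"trilinear {tri[layers]:.3f} GTex/s, tri/bi ratio {ratio:.2f}")
```

The same two dictionaries can be swapped for any of the other three runs to compare Quality against High Quality.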
Anyway, the point here is that the 6600 GT has quite a bit less memory bandwidth per pixel rendered when compared to the 6800, but has about the same fillrate. Even with this deficiency, though, the 6600 GT has no problem sustaining high performance levels. Furthermore, I claim that the lack of memory bandwidth of the 6600 GT will mean less and less as games use more long shaders.
As long as you don't use anything that stresses its ROPs and/or bandwidth, yes. It doesn't take a wizard to figure out that a 6800 and a 6600GT will be damn close in performance up to, let's say, 1280 w/o any AA; that's already the case today.
Speaking of ROPs or Z units and since it's not irrelevant to the thread here, KYRO had 16 Z/stencil units per pipeline.
Edit: More telling information comes if you directly compare some of the high-resolution, 4x FSAA benchmarks between the 6800 and the 6600 GT.
Here is one such example. Notice that as the resolution creeps past 1280x1024, the performance of the 6600 GT drops below 71% of that of the 6800. The GT has 71% of the memory bandwidth of the 6800, and the two have the same amount of memory, so I claim that memory bandwidth cannot possibly be the sole differentiator in performance between the two. I further claim that it is likely subdominant when compared to the ROP limitation of the GT with 4x AA enabled.
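For anyone wanting to check the 71% figure, a small Python sketch of the arithmetic follows. The bus widths and effective memory clocks are my assumption of the reference specs (128-bit at 1000 MHz effective for the 6600 GT, 256-bit at 700 MHz effective for the 6800); they are not stated in the post. The helper `bandwidth_alone_explains` is a hypothetical name for the comparison being argued: if the measured performance ratio falls below the bandwidth ratio, bandwidth by itself cannot account for the gap.

```python
def bandwidth_gb_s(bus_width_bits, effective_mhz):
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * effective_mhz * 1e6 / 1e9


# Assumed reference specs, not taken from the post.
bw_6600gt = bandwidth_gb_s(128, 1000)
bw_6800 = bandwidth_gb_s(256, 700)
bandwidth_ratio = bw_6600gt / bw_6800


def bandwidth_alone_explains(perf_ratio, bw_ratio):
    """True if the slower card keeps at least its bandwidth share of performance."""
    return perf_ratio >= bw_ratio


print(f"6600 GT: {bw_6600gt:.1f} GB/s, 6800: {bw_6800:.1f} GB/s, "
      f"ratio {bandwidth_ratio:.2f}")
```

To use it, pass the 6600 GT frame rate divided by the 6800 frame rate from a given benchmark as `perf_ratio`.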
Who on God's green earth claimed otherwise anyway? You're the one who has been preaching for quite some time now that bandwidth is losing its importance; while that's somewhat true within limits for specific cases, it's not an absolute, and you make it sound exaggerated too. Bandwidth is important and it will remain important on future accelerators too. There will come a time in the not so foreseeable future when we'll see 512-bit buses; yeah, what the heck for, hm?
Of course, but if you double the memory bandwidth of, say, the GeForce 6800 Ultra, you'd not gain any significant performance.
Don't pretend you don't understand what I'm trying to say here, or try to twist things in the wrong direction.
The NV40 in general has a pretty good balance between fillrate and bandwidth. If I theoretically doubled either one, the differences wouldn't be "huge"; if I gave it exactly twice the current fillrate and twice the current bandwidth, though, then most likely yes.
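A toy model of that balance argument, as a hedged Python sketch: it assumes frame time is set purely by whichever of the two resources takes longer, which real hardware only approximates. All units are arbitrary, and `frame_rate` and its parameters are made up for illustration. Starting from a perfectly balanced part, doubling only one resource changes nothing in this model, while doubling both doubles the frame rate.

```python
def frame_rate(fill_capacity, bandwidth_capacity, fill_work=1.0, bandwidth_work=1.0):
    """Toy bottleneck model: the slower of the two resources sets the frame time."""
    frame_time = max(fill_work / fill_capacity, bandwidth_work / bandwidth_capacity)
    return 1.0 / frame_time


base = frame_rate(1.0, 1.0)       # balanced baseline
only_fill = frame_rate(2.0, 1.0)  # double fillrate only
only_bw = frame_rate(1.0, 2.0)    # double bandwidth only
both = frame_rate(2.0, 2.0)       # double both

print(base, only_fill, only_bw, both)
```

Changing `fill_work` or `bandwidth_work` shifts the balance point, for example to mimic a shader-heavy game that leans less on bandwidth.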
Hell, for a number of benchmarks, you'd also not lose a huge amount of performance by cutting in half the amount of memory bandwidth of the 6800 Ultra.
That's true, but I'm sure you're not trying to tell me that high-end accelerators don't have a reason for their existence, are you? An enthusiast knows (or should know) how to make good use of that spare memory bandwidth.