ShootMyMonkey said:
There will be no difference in the fill-rate limit between AA and non-AA.
The parent->daughter bandwidth, 32GB/s, is enough for 4GP/s including 4xAA sample data.
You're completely missing what I'm talking about. Unless you're trying to say that everything on chip magically quadruples in performance when enabling 4xAA.
No, the ceiling on the GPU's fragments is set relatively low. There are 48 fragment pipes in Xenos, but it's designed to generate 8 fragments per clock (roughly half the speed of R420, for example).
The EDRAM's ROPs are running at 1/16th capacity, roughly (guess), with AA turned off.
It's only when you turn on AA that the ROPs are able to use all their bandwidth.
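As a rough sanity check on the figures above, here is a minimal back-of-envelope sketch. It assumes decimal units (1 GB = 1e9 bytes) and that the 4GP/s figure corresponds to the 8 fragments per clock rate; the function and variable names are purely illustrative, and the result is a per-pixel budget on the link, not a statement about how the sample data is actually packed.

```python
# Figures quoted in the posts above (decimal units assumed).
LINK_BANDWIDTH_BYTES_PER_S = 32e9   # parent->daughter link, 32GB/s
PEAK_PIXELS_PER_S = 4e9             # 4GP/s
FRAGMENTS_PER_CLOCK = 8             # fragments generated per clock
AA_SAMPLES = 4                      # 4xAA


def link_budget(bandwidth, pixels_per_s, samples):
    """Return (bytes available per pixel, bytes available per AA sample)."""
    bytes_per_pixel = bandwidth / pixels_per_s
    bytes_per_sample = bytes_per_pixel / samples
    return bytes_per_pixel, bytes_per_sample


def implied_clock_hz(pixels_per_s, fragments_per_clock):
    """Clock rate implied by the peak pixel rate (assumes both figures match)."""
    return pixels_per_s / fragments_per_clock


bytes_per_pixel, bytes_per_sample = link_budget(
    LINK_BANDWIDTH_BYTES_PER_S, PEAK_PIXELS_PER_S, AA_SAMPLES
)
clock_hz = implied_clock_hz(PEAK_PIXELS_PER_S, FRAGMENTS_PER_CLOCK)

print(f"link budget: {bytes_per_pixel:.1f} bytes per pixel")
print(f"implied clock: {clock_hz / 1e6:.0f} MHz")
```

The point is only that the link budget is stated per pixel: the pixel rate stays at the same ceiling whether AA is on or off.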
I'm saying that AA means more pixels to fill than no AA. Those extra pixels don't come for free, whether that's a purely fillrate problem or a problem of pixel shading power...
It appears to me that you are confusing fragments, AA samples and screen pixels...
Fill-rate is conventionally expressed in terms of screen pixels rendered - edit: scrub that, there are competing definitions of fill-rate.
The number of AA samples used in multi-sample anti-aliasing is not counted towards fill-rate.
In super-sampling, yes, you would count AA samples. But not in Xenos's AA.
Doesn't really matter -- pixels processed are pixels processed... Rendering more pixels cannot come completely for free if you're approaching some pixel- or texel-related limit, no matter how much bandwidth you have. Say I'm making 25 texture reads in every pixel shader: I'm going to hit texel fillrate limits with AA enabled sooner than I would without it, simply because the shaders will be run that many more times. There's going to be a limit to how far you can say that the hit due to AA will be small. The more render passes you have, the more likely you are to see it.
An AA sample passes through the GPU without a single shader operation being performed on it. AA samples in multi-sampling AA are nothing more than sub-pixel geometry-present flags. Within a single pixel at the edge of a triangle, the edge will cover only, say, 30% of the pixel. Multi-sampling simply determines which of four pre-determined positions within the pixel are covered by the triangle. If one position is covered or 3 positions are covered, it makes no difference to the shader workload, because the GPU doesn't run the shader on the AA samples.
You can't count multi-sampling AA samples as part of the total "fill-rate".
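To make the distinction concrete, here is a toy software sketch of multi-sampling as described above. It is not how Xenos is implemented: the four sample offsets, the function names and the `shade` callback are all hypothetical, chosen only to show that coverage is tested per sample while the shader runs once per covered pixel.

```python
# Hypothetical 4x sample positions inside a unit pixel (a rotated-grid pattern).
SAMPLE_OFFSETS = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]


def edge(a, b, p):
    """Signed area test: which side of edge a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def inside(tri, p):
    """True if p is inside the triangle, for either winding order."""
    a, b, c = tri
    w0, w1, w2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
    return (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0)


def coverage_mask(tri, px, py):
    """4-bit mask: bit i is set if sample i of pixel (px, py) is covered."""
    mask = 0
    for i, (ox, oy) in enumerate(SAMPLE_OFFSETS):
        if inside(tri, (px + ox, py + oy)):
            mask |= 1 << i
    return mask


def rasterize(tri, width, height, shade):
    """Return (shader invocations, AA samples written) for one triangle."""
    shader_runs = 0
    samples_written = 0
    for py in range(height):
        for px in range(width):
            mask = coverage_mask(tri, px, py)
            if mask:
                shade(px + 0.5, py + 0.5)   # one shader run per covered pixel
                shader_runs += 1
                samples_written += bin(mask).count("1")  # colour copied per sample
    return shader_runs, samples_written


full_tri = ((-10, -10), (30, -10), (-10, 30))   # covers the whole 4x4 target
edge_tri = ((0, 0), (4, 0), (0, 4))             # diagonal edge cuts through pixels
print(rasterize(full_tri, 4, 4, lambda x, y: None))
print(rasterize(edge_tri, 4, 4, lambda x, y: None))
```

Whether one sample or all four are covered in an edge pixel, the shader count for that pixel is still one; only the number of samples written changes.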
Jawed