Mintmaster said: Given that 32GB/s is enough for 8 bytes per pixel at 8 pix/clk, I can see where you're coming from. However, AA needs to determine the Z value for each sample. If I were to guess, I'd say it's probably a depth value plus two slopes for the quad, because slopes are needed anyway to interpolate the Z values. You also need an AA coverage mask and position information for the quad, which is likely 36 bits minimum.
i was speaking from blurry memories of an ATI conference, but now that you mention it - yes, that'd be more or less the case at hand - 4 colors + z + 2 z-grads + a coverage mask - hardly any savings at an individual fragment level, but your multisampling is indeed free from the POV of the fragment pipeline. and the stencil part was just thrown in as a bandwidth contributor, not that it goes in the same batch.
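roughly what i'd picture the per-quad payload looking like - the field widths below are guesses on my part based on the breakdown above (4 colors + z + 2 z-grads + coverage + position), not anything ATI published:

```c
/* Sketch of one 2x2 quad as it might travel over the parent-die ->
 * daughter-die interconnect. Field widths are assumptions, not a
 * documented Xenos format. */
#include <stdint.h>

struct quad_export {
    uint32_t color[4];   /* one 8:8:8:8 color per pixel of the quad        */
    uint32_t z_ref;      /* depth at the quad's reference sample           */
    uint32_t dz_dx;      /* depth slope in x - lets the ROPs re-derive     */
    uint32_t dz_dy;      /* depth slope in y - per-sample Z at the eDRAM   */
    uint16_t coverage;   /* 4 pixels x 4 samples = 16 coverage bits        */
    uint32_t xy;         /* quad position in the render target             */
};
/* raw field sum: 4*4 + 3*4 + 2 + 4 = 34 bytes per quad, ~8.5 bytes per
 * pixel - in the same ballpark as the 8 bytes/pixel that 32 GB/s allows
 * at 4 GPix/s, regardless of the multisample count. */
```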
Also, not counting the z/rop read-write multiplicity is not really fair. For any pixel actually written, you must do at least a z read and write per pixel. Moreover, the only reason PS2 had that much bandwidth was so that the worst case was handled fast enough, so that's the case we should look at if we want to compare the figures. The PS2 eDRAM, AFAIK, was 19.2 GB/s read and 19.2 GB/s write for the framebuffer, exactly enough to read and write both z and colour values at 2.4 GPix/s. The remaining 9.6 GB/s was for texture bandwidth.
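A quick back-of-the-envelope check on those GS figures, assuming 32-bit colour and 32-bit Z per pixel (my assumption for the worst case):

```c
/* Sanity check of the PS2 GS framebuffer bandwidth quoted above. */
#include <stdio.h>

int main(void) {
    const double fill_rate   = 2.4e9;  /* pixels per second */
    const double bytes_color = 4.0;    /* RGBA8             */
    const double bytes_z     = 4.0;    /* 32-bit depth      */

    /* Worst case: read and write both colour and Z for every pixel. */
    double read_bw  = fill_rate * (bytes_color + bytes_z);
    double write_bw = fill_rate * (bytes_color + bytes_z);

    printf("read : %.1f GB/s\n", read_bw  / 1e9);  /* -> 19.2 GB/s */
    printf("write: %.1f GB/s\n", write_bw / 1e9);  /* -> 19.2 GB/s */
    /* Out of 48 GB/s total eDRAM bandwidth, that leaves 9.6 GB/s
     * for texture reads, matching the figures above. */
    return 0;
}
```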
well, yes. i know you were originally comparing xenos to the ps2, but i wasn't - i was merely commenting on xenos' interconnect figures. but the ps2 numbers were welcome nevertheless : )
Anyway, it looks like Xenos needs 8 bytes of data transfer per pixel (which is what the interconnect has available at 4 GPix/s), not 4. So the factor is 8, not 16. My mistake. Looking at these PS2 numbers is rather interesting, though. Framebuffer bandwidth for PS3 is less than half that of PS2, yet 1080p has seven times the pixels of 640x448. I think PS2's GS had slightly misplaced priorities.
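For the record, the arithmetic behind both figures (the 8 bytes/pixel and the "seven times" claim) works out as follows:

```c
/* Two quick checks on the numbers in this post. */
#include <stdio.h>

int main(void) {
    /* 32 GB/s interconnect at a 4 GPix/s fill rate -> bytes per pixel. */
    printf("bytes/pixel: %.0f\n", 32e9 / 4e9);            /* -> 8      */

    /* 1080p vs a 640x448 PS2-era framebuffer. */
    printf("pixel ratio: %.2f\n",
           (1920.0 * 1080.0) / (640.0 * 448.0));          /* -> ~7.23  */
    return 0;
}
```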
yep, and that's one of the things whose emulation i'm particularly curious about : )