The GPU's rasterizer determines coverage for a triangle in rectangular chunks of pixels. The goal for good utilization is to make sure the batch of pixels that comes out of this stage has as many pixels as possible inside the triangle, since the part of the rectangle that lies well outside of it may wind up becoming multiple SIMD lanes that are dead for the wavefront. The mechanics of wavefront packing are something of a mystery to me, however.
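Just to make that concrete, here's a toy illustration (an assumed model, not how any actual rasterizer or packer behaves): rasterize a triangle in 2x2 pixel quads and count how many lanes would be spent on pixels that fall outside the triangle.

```python
# Toy illustration (assumed model, not actual hardware behavior):
# rasterize a triangle in 2x2 pixel quads and count how many lanes
# end up covering pixels outside the triangle.

def edge(ax, ay, bx, by, px, py):
    # Signed area test: > 0 means (px, py) is to the left of edge a->b.
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def inside(tri, px, py):
    (ax, ay), (bx, by), (cx, cy) = tri
    return (edge(ax, ay, bx, by, px, py) >= 0 and
            edge(bx, by, cx, cy, px, py) >= 0 and
            edge(cx, cy, ax, ay, px, py) >= 0)

def quad_coverage(tri, width, height):
    covered, wasted = 0, 0
    for qy in range(0, height, 2):
        for qx in range(0, width, 2):
            # Coverage mask for the four pixels of this quad (sampled at centers).
            mask = [inside(tri, qx + dx + 0.5, qy + dy + 0.5)
                    for dy in (0, 1) for dx in (0, 1)]
            if any(mask):
                # The whole quad gets dispatched; uncovered pixels become dead lanes.
                covered += sum(mask)
                wasted += 4 - sum(mask)
    return covered, wasted

tri = [(2.0, 2.0), (60.0, 8.0), (10.0, 50.0)]  # CCW triangle, made-up coordinates
covered, wasted = quad_coverage(tri, 64, 64)
print(f"live lanes: {covered}, dead lanes: {wasted}, "
      f"utilization: {covered / (covered + wasted):.1%}")
```

A long, thin triangle gives a much worse live/dead ratio than a fat one of the same area, which is the usual argument against lots of sliver triangles.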
If the pixels are expensive to shade, the saving from only doing 50% of them probably outweighs the loss of locality easily?
Even if the scene is complex, it's still only shading 50% of the pixels, so the efficiency would scale well. I'd argue it's even more bang for the buck, because you are then reusing the really expensive pixels from the past frame at the roughly constant cost of the reprojection. That should be the common case, as there's going to be a floor of arithmetic work and memory references that is proportional to the screen's size, not its content.
The heavy-load case is when the screen is dominated by complex materials, which hopefully require more ALU and memory accesses than the motion vector check, the motion vector recalculation, and the extra reads and writes of the reprojection.
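As a back-of-the-envelope sketch of that trade-off (all numbers and names here are made up, purely to frame the break-even point): if shading a pixel costs S and reprojecting one costs a roughly constant R, then shading half the pixels and reprojecting the rest wins whenever S > R.

```python
# Back-of-the-envelope cost model (all numbers are made-up placeholders,
# just to frame the break-even point argued above).

def full_rate_cost(pixels, shade_cost):
    # Every pixel is shaded from scratch each frame.
    return pixels * shade_cost

def half_rate_cost(pixels, shade_cost, reproject_cost):
    # Half the pixels are shaded; the other half are reconstructed by
    # reprojection (motion vector check plus extra reads/writes), whose cost
    # is roughly constant per pixel regardless of material complexity.
    return pixels * (0.5 * shade_cost + 0.5 * reproject_cost)

pixels = 3840 * 2160
reproject_cost = 1.0  # fixed per-pixel reprojection overhead (arbitrary units)

for shade_cost in (0.5, 1.0, 2.0, 8.0, 32.0):
    full = full_rate_cost(pixels, shade_cost)
    half = half_rate_cost(pixels, shade_cost, reproject_cost)
    print(f"shade={shade_cost:5.1f}  full/half = {full / half:4.2f}x "
          f"({'wins' if half < full else 'loses'})")
```

The more the per-pixel shading cost dwarfs the reprojection overhead, the bigger the win, which is why the complex-material case is the favourable one.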
I also realized yesterday that the ugly chart I made was attempting to convey the high-level scene (shaded color space) with regard to the effect of resolution, but it seems to have been read as referring to the geometry, hence the disconnect; in that sense the chart is wrong.