Framebuffer content and pixel shaders

Bastion said:
I knew the reason why modern 3D chips don't allow this (performance) but I didn't know how or why performance could be negatively affected. From the replies in this thread and elsewhere, I think the problem is going from memory to the shader units and that some kind of stop-start must be happening. Thanks to Xmas, I am now checking out S3.
The problem is that the hardware can process multiple polygons covering the same pixel simultaneously. Thus the shader processing can reach the point where you want to read from the framebuffer before the previous polygon's write to that pixel has completed. To give you correct behaviour, the hardware would have to detect these overlap situations and stall the read operation until the previous polygon has finished with that pixel.
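
To make the stall concrete, here is a toy model of the overlap situation described above. It is only a sketch under loud assumptions: one fragment is issued per cycle, the framebuffer read happens at issue time, and every write lands a fixed `shade_latency` cycles later. The function name, the parameters and the latency value are all made up for illustration and do not describe any real chip.

```python
def count_stall_cycles(fragments, shade_latency=4):
    """Count cycles lost waiting on in-flight writes to the same pixel.

    fragments: list of (x, y) pixel coordinates in submission order.
    shade_latency: hypothetical number of cycles between issuing a
        fragment and its result landing in the framebuffer.
    """
    write_lands_at = {}  # pixel -> cycle at which its pending write completes
    cycle = 0
    stalls = 0
    for pixel in fragments:
        ready = write_lands_at.get(pixel, 0)
        if ready > cycle:
            # An earlier polygon is still in flight at this pixel, so the
            # framebuffer read must wait until that write has landed.
            stalls += ready - cycle
            cycle = ready
        write_lands_at[pixel] = cycle + shade_latency
        cycle += 1  # issue the next fragment on the following cycle
    return stalls


# Three different pixels: nothing overlaps, so nothing stalls.
print(count_stall_cycles([(0, 0), (1, 0), (2, 0)]))
# Two polygons hitting the same pixel back to back: the second one waits.
print(count_stall_cycles([(0, 0), (0, 0)]))
```

In this model the cost only shows up when overlapping fragments arrive close together, which is exactly the case the hardware would have to detect.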

I don't think you'll have any luck with S3. That functionality is something they advertise, but it isn't exposed anywhere in any public API.
 
Is there really anything stopping company X from implementing a programmable blend? This could actually be very useful... A mini shader with just 'dest' and 'src' inputs :) Compacted HDR formats would be the first use that comes to mind. (or does DX10 allow this? - I should look that up I suppose before posting :p)
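
To illustrate the "mini shader with just 'dest' and 'src' inputs" idea, here is a CPU-side sketch of what such a blend could compute for a compacted HDR format. It assumes a shared-exponent RGBE encoding (in the style of the Radiance picture format) as the example format; the function names are hypothetical and nothing here corresponds to an actual API or piece of hardware.

```python
import math


def rgbe_encode(rgb):
    """Pack linear (r, g, b) floats into four bytes with a shared exponent."""
    r, g, b = rgb
    v = max(r, g, b)
    if v < 1e-32:
        return (0, 0, 0, 0)
    mantissa, exponent = math.frexp(v)  # v == mantissa * 2**exponent
    scale = mantissa * 256.0 / v
    return (int(r * scale), int(g * scale), int(b * scale), exponent + 128)


def rgbe_decode(rgbe):
    """Unpack four RGBE bytes back into linear (r, g, b) floats."""
    r, g, b, e = rgbe
    if e == 0:
        return (0.0, 0.0, 0.0)
    f = math.ldexp(1.0, e - (128 + 8))
    return (r * f, g * f, b * f)


def additive_hdr_blend(src, dest):
    """Hypothetical programmable blend: src is linear RGB from the pixel
    shader, dest is the RGBE value currently stored in the framebuffer."""
    dr, dg, db = rgbe_decode(dest)
    sr, sg, sb = src
    return rgbe_encode((sr + dr, sg + dg, sb + db))


dest = rgbe_encode((1.0, 0.5, 0.25))
print(additive_hdr_blend((1.0, 0.5, 0.25), dest))
```

A fixed-function additive blend would add the four bytes independently, exponent included, which is meaningless for this encoding. The decode, add, re-encode sequence is the part that needs programmability.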
 
Hardware cost is the main party pooper.
Direct3D 10 paper said:
Finally, the fixed-function limitations of the OM are a frequent source of discussion. The OM unit is the only stage where memory read-modify-write operations are supported and this is one of the features frequently requested for the programmable units. One proposal is to merge the OM functionality into the PS. However, the complexities in managing pipeline hazards and maximizing memory system efficiency do not yet lend themselves to a justifiable cost. Multisampling further complicates the structure since PS computations are performed on pixel fragments whereas blending operations are performed on samples. Promoting the PS to execute at sample granularity has significant performance ramifications. Furthermore, notable performance gains are achieved using early depth and stencil rejection optimizations. Success relies on the predictability of the outcome of shading operations, creating an argument against migrating that functionality to the PS.
 
IIRC this (frame buffer contents as input to shaders) was also discussed during the development of OpenGL 2.0. It was in the draft spec for some time, but got nixed by the IHVs.
 