Mordenkainen said:
Xmas said:
UltraShadow is GL_EXT_depth_bounds_test. ATI may be technically capable of doing the same, but only with a new chip.
Thanks for the refresh. Do you know what ATi is currently lacking in implementation?
Schematically, UltraShadow works this way: You provide two reference values (min, max) which form a depth range. For every pixel/sample, the depth value already stored in the depth buffer is checked against this range, and if it's outside, the pixel/sample is discarded.
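To make the test concrete, here is a minimal software sketch of the per-pixel check. In OpenGL the range would be set with glDepthBoundsEXT(zmin, zmax) after enabling GL_DEPTH_BOUNDS_TEST_EXT; the Python function names and the sample depth values below are made up for illustration and say nothing about how the hardware implements it.

```python
def depth_bounds_pass(stored_depth, zmin, zmax):
    """Return True if the pixel survives the depth bounds test.

    Note that the value tested is the depth already in the depth buffer,
    not the depth of the incoming fragment.
    """
    return zmin <= stored_depth <= zmax


def surviving_pixels(depth_buffer, zmin, zmax):
    """Indices of the pixels whose stored depth lies inside [zmin, zmax]."""
    return [i for i, z in enumerate(depth_buffer)
            if depth_bounds_pass(z, zmin, zmax)]


# Hypothetical 4-pixel depth buffer and depth range, for illustration only.
depth_buffer = [0.10, 0.35, 0.50, 0.80]
print(surviving_pixels(depth_buffer, 0.30, 0.60))  # [1, 2]
```

Pixels 0 and 3 hold depths outside the range, so shadow volume fragments landing there are thrown away before any stencil work is done.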
In practice, this is mostly useful for stencil shadows. If you did that per pixel, you would only save bandwidth (best case: depth read only, as opposed to depth/stencil read and stencil write*). But even with 16 samples per clock and combined depth/stencil, you're usually not going to be bandwidth limited on a 256-bit DDR bus (16 * [32 bit read + 32 bit write] = 1024 bit, compression should put that down to a third at least).
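The bandwidth estimate can be checked with a few lines of arithmetic. This is only a sketch: it assumes memory and core run at the same clock (which the post does not state), and it takes the "a third" compression figure at face value.

```python
SAMPLES_PER_CLOCK = 16
READ_BITS = 32            # combined depth/stencil read per sample
WRITE_BITS = 32           # combined depth/stencil write per sample
BUS_WIDTH_BITS = 256
DDR_TRANSFERS_PER_CLOCK = 2
COMPRESSION_FACTOR = 3    # "down to a third at least"

# Worst-case traffic per clock without compression.
raw_bits = SAMPLES_PER_CLOCK * (READ_BITS + WRITE_BITS)

# Traffic per clock once compression is taken into account.
compressed_bits = raw_bits / COMPRESSION_FACTOR

# What the bus can move per clock, assuming memory clock == core clock.
bus_bits = BUS_WIDTH_BITS * DDR_TRANSFERS_PER_CLOCK

print(raw_bits, bus_bits, compressed_bits < bus_bits)
```

Under those assumptions the uncompressed traffic is twice what the bus can deliver, but the compressed traffic fits with room to spare.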
To save fillrate, you have to employ an early Z method that can discard several pixels per clock, like hierarchical Z. And you need a depth range per tile so you can check on both sides of the reference range.
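A rough sketch of why a per-tile range matters: with both min and max stored per tile, a whole tile can be thrown away when it lies entirely in front of or entirely behind the reference range. The tile size, the depth values, and the "max only" variant are assumptions for illustration, not a description of any particular chip.

```python
def build_tile_ranges(depth_buffer, tile_size):
    """Compute (min, max) of the stored depths for each tile."""
    ranges = []
    for start in range(0, len(depth_buffer), tile_size):
        tile = depth_buffer[start:start + tile_size]
        ranges.append((min(tile), max(tile)))
    return ranges


def tile_rejected(tile_min, tile_max, zmin, zmax):
    """True if every depth in the tile is outside [zmin, zmax].

    Conservative: a tile whose depths straddle the range is kept even if
    no single pixel actually falls inside it.
    """
    return tile_max < zmin or tile_min > zmax


def tile_rejected_max_only(tile_max, zmin):
    """Hypothetical variant with only the max stored per tile.

    It can only prove that a tile lies entirely below the range.
    """
    return tile_max < zmin


# Three hypothetical 4-pixel tiles: below, inside, and above the range.
depth = [0.10, 0.20, 0.15, 0.12,
         0.40, 0.50, 0.45, 0.55,
         0.90, 0.95, 0.85, 0.92]
zmin, zmax = 0.30, 0.60

tiles = build_tile_ranges(depth, 4)
print([tile_rejected(lo, hi, zmin, zmax) for lo, hi in tiles])
print([tile_rejected_max_only(hi, zmin) for lo, hi in tiles])
```

With the full range the first and third tiles are rejected at tile level; with only the max stored, the third tile has to fall back to the per-pixel test.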
AFAIK the R3x0 only stores one value per tile in the hierarchical Z buffer, and it doesn't support checking of the depth buffer content against a reference range. So it basically lacks everything UltraShadow needs.
* that's for separate depth and stencil buffers, I'm not sure the current chips handle them separately.