That article is explaining two things at once. The Xbox 360 supported tiled rendering to eDRAM, since the eDRAM was only 10 MB. The same command list was replayed once per tile. The Xbox 360 GPU had some hardware features to make this slightly faster. Most games didn't use this feature.
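To illustrate the replay idea, here's a minimal conceptual sketch. The Gpu/CommandList types and all function names are invented stand-ins for illustration, not the real Xbox 360 Direct3D API:

```cpp
#include <algorithm>
#include <cstdint>

struct Rect { uint32_t x, y, w, h; };
struct CommandList { /* recorded draw calls for the whole frame */ };

// Hypothetical GPU interface -- invented for illustration only.
struct Gpu {
    void setTileViewport(const Rect&) {}         // restrict rendering to one tile
    void replay(const CommandList&) {}           // re-execute the recorded draws
    void resolveTileToMainMemory(const Rect&) {} // copy the finished eDRAM tile out
};

// The full frame's render targets don't fit in 10 MB of eDRAM, so the
// screen is split into tiles that do fit, and the same command list is
// replayed once per tile. Geometry outside the current tile is clipped
// away, so each pass only shades its own tile's pixels.
void renderTiled(Gpu& gpu, const CommandList& frame,
                 uint32_t screenW, uint32_t screenH, uint32_t tileH)
{
    for (uint32_t y = 0; y < screenH; y += tileH) {
        const Rect tile{0, y, screenW, std::min(tileH, screenH - y)};
        gpu.setTileViewport(tile);             // only this tile's pixels survive
        gpu.replay(frame);                     // identical commands every pass
        gpu.resolveTileToMainMemory(tile);     // eDRAM -> system memory
    }
}
```

The cost is obvious from the loop: vertex work is repeated once per tile, which is why dedicated hardware help (and careful tile counts) mattered.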
Just to make sure I'm getting this right: the HiZ data was generated by a Z prepass into the daughter die, then copied out to main memory, and then read back and stored in a dedicated HiZ buffer on the mother die?

Is hierarchical z still used?
Hierarchical Z, on the other hand, is/was a great feature. It was introduced as early as DirectX 7.0 GPUs (http://www.graphicshardware.org/previous/www_2000/presentations/ATIHot3D.pdf) and is still used by all modern AMD/Nvidia/Intel GPUs. The idea is simple. You have an additional lower resolution depth buffer (for example at 8x8 lower resolution, saving 64x bandwidth and memory). This lower resolution depth buffer stores the maximum (furthest away) depth value of each (8x8) tile. As the HiZ buffer is much smaller than the actual Z-buffer, the GPU can either keep it in dedicated fast on-chip memory or cache it efficiently (all modern GPUs have general purpose read & write caches). This cuts down the bandwidth required to read the (full resolution) depth buffer quite a bit, especially in scenes with lots of depth overdraw. It also allows the GPU to cull multiple pixels at once with a single HiZ depth test.
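To make the culling test concrete, here's a minimal CPU-side sketch. All names are invented for illustration; real GPUs implement this in fixed-function hardware and the details vary per vendor. It assumes the usual convention that smaller depth = closer:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

constexpr uint32_t kTile = 8;  // 8x8 pixels per HiZ tile -> 64x smaller buffer

// Build the HiZ buffer: each entry stores the maximum (furthest away)
// depth value found inside its 8x8 tile of the full resolution buffer.
std::vector<float> buildHiZ(const std::vector<float>& depth,
                            uint32_t w, uint32_t h)
{
    const uint32_t tw = w / kTile, th = h / kTile;
    std::vector<float> hiz(tw * th, 0.0f);
    for (uint32_t y = 0; y < h; ++y)
        for (uint32_t x = 0; x < w; ++x) {
            float& tileMax = hiz[(y / kTile) * tw + (x / kTile)];
            tileMax = std::max(tileMax, depth[y * w + x]);
        }
    return hiz;
}

// Conservative tile test: if the nearest depth of an incoming primitive
// is still behind the furthest depth stored for the tile, every pixel in
// the tile would fail the depth test, so all 64 pixels are rejected with
// this one comparison. Otherwise the primitive is "maybe visible" and
// falls through to the normal per-pixel depth test.
bool tileFullyOccluded(const std::vector<float>& hiz, uint32_t tw,
                       uint32_t tileX, uint32_t tileY, float primMinDepth)
{
    return primMinDepth >= hiz[tileY * tw + tileX];
}
```

The key property is that the test is conservative: a "fully occluded" answer is always safe to act on, while a "maybe visible" answer costs nothing beyond the per-pixel testing that would have happened anyway.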