HW accelerated visibility determination?

This thought is basically rooted in the project I'm working on right now. Since virtual worlds in computer games are getting more complex every day, runtime visibility determination is becoming more important as well. There are many software-based algorithms that could easily be accelerated by the right hardware (hierarchical occlusion maps, hierarchical z-buffering, frustum culling, etc.). I'm wondering whether people at the graphics giants are thinking about implementing some of them in next-generation hardware. I don't know much about VLSI design, but the math behind the algorithms listed above isn't that complex (a minimal sketch of the frustum test follows below). The only major change I can think of is that such hardware might need an input buffer holding raw data sent from the application, passing only the primitives that survive the visibility test on to the rendering pipeline. The hardware might also take over space partitioning and bounding volume generation if those are easy to implement, but that isn't strictly needed, since you can do them in a preprocessing stage.
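To make concrete how simple the per-object math is, here is a minimal sphere-vs-frustum test in C++. This is only a sketch of the arithmetic, not anyone's actual hardware or API: the Plane and Sphere types, the normalized-plane assumption, and the function names are all illustrative.

```cpp
#include <cstddef>

struct Plane  { float a, b, c, d; };       // plane equation: ax + by + cz + d = 0
struct Sphere { float x, y, z, radius; };  // bounding volume in world space

// Signed distance from the sphere's center to a plane whose normal is
// assumed to be normalized and to point into the frustum.
static float signedDistance(const Plane& p, const Sphere& s)
{
    return p.a * s.x + p.b * s.y + p.c * s.z + p.d;
}

// Returns false if the sphere lies entirely outside any of the six frustum
// planes. The whole test is six dot products and compares per object,
// which is why it looks attractive for a fixed-function hardware unit.
bool sphereInFrustum(const Plane frustum[6], const Sphere& s)
{
    for (std::size_t i = 0; i < 6; ++i)
        if (signedDistance(frustum[i], s) < -s.radius)
            return false;   // completely outside this plane: cull it
    return true;            // inside or intersecting: must be sent on
}
```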

Now someone please tell me: is this a plausible approach?
 
Hierarchical Z is already present in today's hardware, and IIRC at least Nvidia's drivers have been capable of frustum culling in the past. If you need to get visibility results back to software, there are also occlusion queries.
 
Well, that's a little different from what I mean.
IIRC, all the culling operations in existing hardware are applied after geometry transformation. That is to say, you still have to send the hidden triangles down the pipeline no matter what. It would be better not to send them at all.

As for occlusion queries, they're good in theory, but there are problems if you want to use them in a real-world situation. For example, say you have a world consisting of 10,000 objects, each with a bounding volume, and you want to check every one of them against the occluders in the scene. With occlusion queries, you'd have to draw 10,000 cubes or spheres or whatever you chose as the bounding volume, and that's a LOT of driver overhead (note that you can't draw all of them in one pass; normally just a few per pass). Even if you use a hierarchical structure, it's still easy for the driver overhead to outweigh the savings from culling; the pattern is sketched below. What I'm dreaming of is a self-sufficient GPU: it determines visibility and culls the hidden geometry WITHOUT the involvement of the CPU.
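To illustrate the overhead argument, here is a sketch of that per-object query loop against ARB_occlusion_query. The extension entry points are assumed to be loaded already, the queries are assumed to have been generated with glGenQueriesARB, and drawBoundingBox() and drawObject() are hypothetical application helpers, not part of any library.

```cpp
#include <cstddef>
#include <GL/gl.h>
#include <GL/glext.h>   // ARB_occlusion_query tokens; entry points assumed loaded

void drawBoundingBox(std::size_t i);   // hypothetical: draws object i's bounding volume
void drawObject(std::size_t i);        // hypothetical: draws object i's full mesh

void drawWithOcclusionCulling(const GLuint queries[], std::size_t objectCount)
{
    // Pass 1: issue one query per bounding volume, with writes disabled so
    // the proxy geometry only touches the depth test. Every begin/end pair
    // is a driver call, so 10,000 objects means tens of thousands of trips
    // into the driver before any real geometry is drawn.
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
    glDepthMask(GL_FALSE);
    for (std::size_t i = 0; i < objectCount; ++i) {
        glBeginQueryARB(GL_SAMPLES_PASSED_ARB, queries[i]);
        drawBoundingBox(i);
        glEndQueryARB(GL_SAMPLES_PASSED_ARB);
    }
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glDepthMask(GL_TRUE);

    // Pass 2: read the results back. glGetQueryObjectuivARB may stall the
    // CPU until the GPU reaches each query -- exactly the CPU involvement
    // a self-sufficient GPU would eliminate.
    for (std::size_t i = 0; i < objectCount; ++i) {
        GLuint samplesPassed = 0;
        glGetQueryObjectuivARB(queries[i], GL_QUERY_RESULT_ARB, &samplesPassed);
        if (samplesPassed > 0)
            drawObject(i);   // at least one fragment passed: may be visible
    }
}
```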
 
On the occlusion query front, I have heard talk of an experimental extension in upcoming Nvidia OpenGL drivers that allows conditional rendering on NV4x chips (not sure of its name; I think it was NVX_conditional_render). That is, the hardware itself decides whether to draw a set of primitives based on an occlusion query, with no need to pass query results back through the driver or the application.
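For reference, here is a sketch of what such conditional rendering could look like from the application side. The call names follow the NV_conditional_render interface as it was eventually published; since the exact name of the experimental extension is uncertain above, treat this as an assumption. drawBoundingBox() and drawObject() are the same hypothetical helpers as in the earlier sketch.

```cpp
void drawConditionally(GLuint query, std::size_t i)
{
    // Issue the occlusion query on the cheap bounding volume as before.
    glBeginQueryARB(GL_SAMPLES_PASSED_ARB, query);
    drawBoundingBox(i);
    glEndQueryARB(GL_SAMPLES_PASSED_ARB);

    // Instead of reading the result back over the bus, ask the GPU itself
    // to skip the real draw if the query produced zero samples. No query
    // result ever returns to the driver or the application.
    glBeginConditionalRenderNV(query, GL_QUERY_WAIT_NV);
    drawObject(i);
    glEndConditionalRenderNV();
}
```

The bounding-volume draws still have to be issued by the CPU, but the readback stall disappears, which is most of the overhead complained about above.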

Other than that, the newly added functionality in WGF (the geometry shader, or whatever it ends up being called, plus the ability to read back its results) may or may not be usable for culling purposes.
 