Dynamic branching, again.

Frank

So, how well does it work? Is it actually used in games? How fast is it? Is it faster than using multiple passes?

It was touted as the best thing since sliced bread in SM 3.0, but is it? Is it actually useful at this moment, with the quad-centered architectures? I haven't heard much about it since hardware that can use it became available.
 
So, looking at the views and the lack of responses, I take it that it is not really useful at the moment? I guess we will have to wait for the unified shader model without quads for it to become useful.

Who can tell how the 6800 handles it at the moment? Does it switch to serial processing on a split in the quad every time, or does it have another way to deal with that?
 
Dynamic branching itself is neither fast nor slow. Only an implementation of the feature can have an execution speed.

At the moment we only have the NV4X implementation. It is not perfect because branch decisions are made at the quad-batch level. A batch contains ~1000 pixels. If even one of these pixels needs the other branch path, all pixels have to execute both paths with predication.

This means that if you have large areas that need the same branch path, dynamic branching can be a win. Otherwise, it can be slower.

I heard that Pacific Fighter uses dynamic branching, but I do not own this game and cannot test it myself.
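
The batch behaviour described above can be turned into a toy cost model. This is only a sketch of the description in this post, not a measurement of NV4X hardware: the batch size of 1024 and the two instruction costs are made-up numbers, and `batch_cost` is a hypothetical helper name.

```python
def batch_cost(takes_branch, batch_size=1024, cost_taken=40, cost_skipped=4):
    """Estimate total shader cost when branch decisions are made per batch.

    takes_branch: list of bools, one per pixel, in the order pixels are batched.
    batch_size, cost_taken, cost_skipped: assumed values, for illustration only.
    """
    total = 0
    for start in range(0, len(takes_branch), batch_size):
        batch = takes_branch[start:start + batch_size]
        if all(batch):
            per_pixel = cost_taken                 # whole batch takes the branch
        elif not any(batch):
            per_pixel = cost_skipped               # whole batch skips the branch
        else:
            per_pixel = cost_taken + cost_skipped  # divergent: both paths run
        total += per_pixel * len(batch)
    return total


# Two batches' worth of pixels in three different arrangements.
coherent = [True] * 1024 + [False] * 1024       # each batch agrees internally
one_stray = [False] * 2048
one_stray[0] = True                             # a single pixel diverges
scattered = [i % 2 == 0 for i in range(2048)]   # every batch is divergent

print(batch_cost(coherent))    # 45056
print(batch_cost(one_stray))   # 49152
print(batch_cost(scattered))   # 90112
```

In this model a single stray pixel makes its whole batch pay for both paths, and a fully scattered pattern costs as much as running both paths everywhere.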
 
Demirug said:
It is not perfect because branch decisions are made at the quad-batch level. A batch contains ~1000 pixels. If even one of these pixels needs the other branch path, all pixels have to execute both paths with predication.

This is interesting. Is it confirmed by any reliable source, or did you figure it out yourself through some clever test programs? Either way, I'd like to hear more about the details. :p
 
Thanks, Demirug.

So, it can be useful to combine shaders that cover large surfaces, but it wouldn't be good for detail shaders, like complex procedural textures.

Do you think it might be worth it to try and make a single shader that branches for each light source? I can imagine that being the most desirable one. But I might be wrong.

What would be the best way to use it?
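
To make the per-light idea concrete, here is a CPU-side sketch of what such a single shader would do for one pixel. It is not shader code and says nothing about real performance; the 2D positions, the linear falloff, and the name `shade_pixel` are all assumptions made for illustration.

```python
import math


def shade_pixel(pixel, lights):
    """Return (intensity, lights_evaluated) for one pixel.

    pixel:  (x, y) position.
    lights: list of (x, y, radius, brightness) tuples (hypothetical layout).
    """
    intensity = 0.0
    evaluated = 0
    for lx, ly, radius, brightness in lights:
        dist = math.hypot(pixel[0] - lx, pixel[1] - ly)
        if dist >= radius:
            continue  # the dynamic branch: skip a light that cannot reach us
        evaluated += 1
        # Linear falloff, assumed here only to keep the example short.
        intensity += brightness * (1.0 - dist / radius)
    return intensity, evaluated


lights = [(0.0, 0.0, 10.0, 1.0), (100.0, 0.0, 10.0, 1.0)]
print(shade_pixel((5.0, 0.0), lights))   # (0.5, 1)
print(shade_pixel((50.0, 0.0), lights))  # (0.0, 0)
```

Whether the skipped lights actually save time depends on how many neighbouring pixels skip the same lights, which is the coherence question raised earlier in the thread.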
 
NVIDIA have stated at a dev conference that the pixel units branch in quad units due to their SIMD architecture. Their vertex units, however, are MIMD and suffer no group branching problems.

Other architectures may have radically different performance characteristics, so dynamic branching is on the list of things to do only if you really, really need it.
 
Has anyone done some further testing? Tridam's posts are very interesting, but they leave many questions unanswered.
 
From NVIDIA GPU Programming Guide:

Use dynamic branching when the branches will be fairly coherent. As mentioned in Section 4.1.3, dynamic branching can make code faster and easier to implement. But in order for it to work optimally, branches should be fairly coherent (for example, over regions of roughly 30 x 30 pixels).
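
One rough way to check the guide's rule of thumb against your own content is to take a per-pixel mask of which pixels would take the branch and count how many 30 x 30 tiles are uniform. The sketch below assumes you can dump such a mask from your renderer; `coherent_tile_fraction` is a hypothetical helper, and aligning tiles to a fixed grid is a simplification that need not match how the hardware groups pixels.

```python
def coherent_tile_fraction(mask, tile=30):
    """Fraction of tile x tile blocks in which every pixel agrees.

    mask: list of rows of bools (True means the pixel takes the branch).
    tile: block edge length in pixels; 30 follows the guide's example.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    tiles = 0
    coherent = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            values = {
                mask[y][x]
                for y in range(ty, min(ty + tile, height))
                for x in range(tx, min(tx + tile, width))
            }
            tiles += 1
            if len(values) == 1:
                coherent += 1
    return coherent / tiles if tiles else 1.0


# 60 x 60 test masks: a clean split on a tile edge, a split through the
# middle of a tile column, and a per-pixel checkerboard.
aligned = [[x < 30 for x in range(60)] for y in range(60)]
offset = [[x < 45 for x in range(60)] for y in range(60)]
checker = [[(x + y) % 2 == 0 for x in range(60)] for y in range(60)]

print(coherent_tile_fraction(aligned))  # 1.0
print(coherent_tile_fraction(offset))   # 0.5
print(coherent_tile_fraction(checker))  # 0.0
```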
 