I wonder how much you can optimize RT for RDNA 2 or Ampere on PC, where DirectX Raytracing (DXR) limits what you can do anyway... Invisible RT is not an optimisation...
I don't believe DXR limits optimization too much. The main black box is BVH building/optimization, and that is very hardware-specific anyway. I doubt anyone sane would want to write their own BVH implementation for every piece of hardware when the alternative is that it just works and the GPU vendor does that optimization for you (for free). It's different for console exclusives, though.
For example, check the presentation below and consider how many tricks the developer figured out when implementing RT for shadows only. Something like Metro Exodus Enhanced Edition likely has much more extensive optimization and engine design to support RT efficiently.
Where I think it gets messy is that AMD reuses the TMUs and compute units for RT, whereas Nvidia has an independent accelerator. This leads to a situation where ray tracing on Nvidia hardware can be treated much like async compute, whereas on AMD hardware it cannot. In essence, Nvidia hardware can fill bubbles with RT, whereas on AMD hardware you want to find slots where the TMUs/compute go unused and fit the RT load there.
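To make the "fill bubbles" vs. "find unused slots" distinction concrete, here is a toy model in Python. It is only a sketch of the argument above, not a description of how either vendor's hardware actually schedules work: the frame is split into equal time slices, `raster_util` is a made-up list of how busy the TMU/compute units are with non-RT work in each slice, and `leftover_rt_work` is a hypothetical helper name.

```python
def leftover_rt_work(raster_util, rt_work, shared):
    """Return the RT work still unfinished once the frame's slices are used up.

    raster_util: per-slice fraction (0..1) of TMU/compute busy with non-RT work.
    rt_work:     total RT work, measured in "full slices of one unit".
    shared:      True  -> RT runs on the same units and only gets the idle part
                          of each slice (the AMD-style case described above).
                 False -> RT runs on a separate unit that is free every slice
                          (the Nvidia-style case described above).
    """
    remaining = rt_work
    for util in raster_util:
        # Capacity available to RT in this slice under the chosen assumption.
        capacity = (1.0 - util) if shared else 1.0
        remaining = max(0.0, remaining - capacity)
    return remaining


# Hypothetical utilization numbers, purely for illustration.
frame = [0.5, 0.75, 1.0, 0.25, 0.5]
rt_work = 2.5

print("shared units, leftover RT work:  ", leftover_rt_work(frame, rt_work, shared=True))
print("separate unit, leftover RT work: ", leftover_rt_work(frame, rt_work, shared=False))
```

In this toy setup, any leftover in the shared case is RT work that would have to extend the frame, which is why you would want to line the RT load up with the slices where the TMUs/compute are least busy. Real GPUs are far messier than this, so treat it as an illustration of the reasoning only.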