I've only had a brief chance to look at this, but my initial impression is that this is the community digging into stuff they don't understand and looking for blood. It is extremely common to have indirect execution with zero counts/draws because that's how the APIs and hardware work right now; this is how you do GPU-driven rendering. Similarly, in the Nanite materials (base pass shading) step there are a lot of indirect draws that end up drawing nothing, because the APIs do not allow you to set up sufficient state on the GPU side. You are forced to set up any *possible* rendering that might happen on the CPU, then zero it out on the GPU if you don't actually need it. The same thing happens in GPU instance culling of non-Nanite geometry for virtual shadow maps.
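To make the pattern concrete, here's a rough CPU-side sketch of it in C++. This is a simulation under stated assumptions, not engine or driver code: `DrawArgs` just mirrors the four 32-bit fields of `D3D12_DRAW_ARGUMENTS`, and `recordPossibleDraws`, `cullOnGpu`, `executeIndirect` and `ExecuteStats` are hypothetical names made up for illustration. On real hardware the culling step would be a compute shader writing into the argument buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirrors the field layout of D3D12_DRAW_ARGUMENTS (four 32-bit values).
struct DrawArgs {
    std::uint32_t vertexCountPerInstance;
    std::uint32_t instanceCount;
    std::uint32_t startVertexLocation;
    std::uint32_t startInstanceLocation;
};

// CPU side: one argument slot is recorded for every draw that *might* happen,
// because the state for each draw has to be set up ahead of time.
std::vector<DrawArgs> recordPossibleDraws(const std::vector<std::uint32_t>& vertexCounts) {
    std::vector<DrawArgs> args;
    args.reserve(vertexCounts.size());
    for (std::uint32_t vc : vertexCounts) {
        args.push_back({vc, 1u, 0u, 0u});
    }
    return args;
}

// Hypothetical stand-in for a GPU culling pass: draws that turn out not to be
// needed are not removed, their instance count is simply zeroed in place.
void cullOnGpu(std::vector<DrawArgs>& args, const std::vector<bool>& visible) {
    assert(args.size() == visible.size());
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (!visible[i]) {
            args[i].instanceCount = 0;
        }
    }
}

struct ExecuteStats {
    std::size_t commandsProcessed;   // slots walked, including empty ones
    std::size_t emptyDraws;          // slots that ended up drawing nothing
    std::uint64_t verticesSubmitted; // work that actually reached the pipeline
};

// Simulated indirect execution: every recorded slot is still visited,
// even when its arguments say "draw nothing".
ExecuteStats executeIndirect(const std::vector<DrawArgs>& args) {
    ExecuteStats stats{0, 0, 0};
    for (const DrawArgs& a : args) {
        ++stats.commandsProcessed;
        if (a.instanceCount == 0 || a.vertexCountPerInstance == 0) {
            ++stats.emptyDraws;
            continue;
        }
        stats.verticesSubmitted +=
            static_cast<std::uint64_t>(a.vertexCountPerInstance) * a.instanceCount;
    }
    return stats;
}
```

For example, recording four possible draws with vertex counts 3, 6, 36 and 12 and then culling the middle two leaves four commands processed, two of them empty, and 15 vertices submitted. The empty slots are the "draws that end up drawing nothing": they are expected output of the technique, not a sign that something went wrong.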
- Starfield abuses a DX12 feature called `ExecuteIndirect`. One of the things that this wants is some hints from the game so that the graphics driver knows what to expect. Since Starfield sends in bogus hints, the graphics drivers get caught off guard trying to process the data and end up making bubbles in the command queue. These bubbles mean the GPU has to stop what it's doing, double-check the assumptions it made about the indirect execute, and start over again.
- Starfield creates multiple `ExecuteIndirect` calls back to back instead of batchi
This has been a fundamental limitation of these APIs from day one, and everyone (IHVs, OSVs, ISVs, etc.) understands it thoroughly. On consoles there are some tricks you can play with fixed hardware, but on PC, despite various efforts, the industry has so far been unable to evolve the APIs in a portable way that significantly improves the situation. The linked PR seems to be an attempt to implement vendor-specific paths in Vulkan using extensions to try to reduce some of the impact. These sorts of optimizations are things that drivers are expected to be doing, but they often sit behind various heuristics that may need to be adjusted for Starfield (the sorts of things that day 1 driver updates tend to tweak). Even if a performance improvement is demonstrated (which, as I understand it, has yet to even happen), these are not things you can do portably in DirectX, or even in Vulkan, at the API user level. I suspect that if this turns out to actually be a decent part of the performance issues, NVIDIA will tweak their drivers and life will move on.