Same hardware, same game settings, same graphics driver:
The only difference is that on the left we have Windows 10 and on the right Windows 8.1.
The Windows 10 driver supports WDDM 2.0?
> This feature would have single-handedly increased DX11 draw call performance to be on par with DX12 (in cases where you don't need to change the GPU state between the draws).
Which is one of the main public features they're touting with DX12, and DX12 also requires Windows 10. Hence there's no way they'd want it ending up in DX11.
> There is no ExecuteIndirect. It would have been a super nice feature especially for DX11, as DX11 is so slow on draw calls. This feature would have single-handedly increased DX11 draw call performance to be on par with DX12 (in cases where you don't need to change the GPU state between the draws).
ExecuteIndirect needs root constants/descriptors and resource barriers to be useful and efficient on a variety of hardware.
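For context, a minimal C++ sketch of what the full D3D12 feature looks like; gDevice, gRootSig, gCmdList, argBuffer and maxDraws are placeholders, not anything from this thread. Each record in the argument buffer can set a root constant before its draw, which is the per-draw binding change being referred to here.

```cpp
// Layout of one record in the GPU-visible argument buffer:
// a 32-bit root constant (e.g. an object index) followed by indexed draw arguments.
struct IndirectCommand
{
    UINT                         objectIndex;  // written into root parameter 0
    D3D12_DRAW_INDEXED_ARGUMENTS draw;
};

D3D12_INDIRECT_ARGUMENT_DESC args[2] = {};
args[0].Type                             = D3D12_INDIRECT_ARGUMENT_TYPE_CONSTANT;
args[0].Constant.RootParameterIndex      = 0;
args[0].Constant.DestOffsetIn32BitValues = 0;
args[0].Constant.Num32BitValuesToSet     = 1;
args[1].Type                             = D3D12_INDIRECT_ARGUMENT_TYPE_DRAW_INDEXED;

D3D12_COMMAND_SIGNATURE_DESC sigDesc = {};
sigDesc.ByteStride       = sizeof(IndirectCommand);
sigDesc.NumArgumentDescs = 2;
sigDesc.pArgumentDescs   = args;

// Because the signature modifies root arguments, the root signature must be supplied.
ID3D12CommandSignature* cmdSig = nullptr;
gDevice->CreateCommandSignature(&sigDesc, gRootSig, IID_PPV_ARGS(&cmdSig));

// One call consumes up to maxDraws records from argBuffer, changing the root
// constant between draws without any CPU involvement.
gCmdList->ExecuteIndirect(cmdSig, maxDraws, argBuffer, 0, nullptr, 0);
```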
> ExecuteIndirect needs root constants/descriptors and resource barriers to be useful and efficient on a variety of hardware.
That's true. Full ExecuteIndirect needs root constants/descriptors. However, they could have implemented a limited subset (equal to OpenGL multiDrawIndirect) in DX 11.3. That wouldn't need any API refactoring at all.
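For comparison, the OpenGL 4.3 subset being referred to looks roughly like this (a sketch; cmdBuf and drawCount are assumed to be set up elsewhere): one API call, N packed draw records, and no state changes between them.

```cpp
// GL 4.3 core (ARB_multi_draw_indirect): one call issues drawCount indexed draws.
struct DrawElementsIndirectCommand
{
    GLuint count;          // index count for this draw (0 = draw nothing)
    GLuint instanceCount;
    GLuint firstIndex;
    GLint  baseVertex;
    GLuint baseInstance;
};

// cmdBuf holds a tightly packed array of the records above; it can be filled
// by the CPU or written by a compute shader.
glBindBuffer(GL_DRAW_INDIRECT_BUFFER, cmdBuf);
glMultiDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT,
                            nullptr,    // byte offset into the indirect buffer
                            drawCount,  // number of records to consume
                            0);         // stride 0 = tightly packed records
```

In the shader, gl_DrawID (ARB_shader_draw_parameters, core in GL 4.6) identifies which record is being drawn, which is the role SV_DrawId plays in the posts below.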
> That's true. Full ExecuteIndirect needs root constants/descriptors. However, they could have implemented a limited subset (equal to OpenGL multiDrawIndirect) in DX 11.3. That wouldn't need any API refactoring at all.
I think it would need to at least have support for something like draw parameters to be very useful, though. If you literally just need a sequence of DrawIndirect calls, that's not terribly inefficient to do today; GPUs are pretty efficient at throwing out 0-length draws if you need to cull some of them out.
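What "a sequence of DrawIndirect calls today" amounts to in D3D11 terms, sketched below; argsBuf, maxDraws and the compute pass that zeroes culled records are assumptions for illustration, not anything from the thread.

```cpp
// argsBuf holds maxDraws records of DrawIndexedInstancedIndirect arguments
// (5 UINTs each). A compute pass has already set IndexCountPerInstance to 0
// for every culled object, so the GPU front-end discards those draws cheaply.
const UINT stride = 5 * sizeof(UINT);
for (UINT i = 0; i < maxDraws; ++i)
{
    context->DrawIndexedInstancedIndirect(argsBuf, i * stride);
}
// The CPU still issues maxDraws API calls and the GPU still walks maxDraws
// packets per viewport, even when most of them end up drawing nothing.
```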
> I think it would need to at least have support for something like draw parameters to be very useful, though. If you literally just need a sequence of DrawIndirect calls, that's not terribly inefficient to do today; GPUs are pretty efficient at throwing out 0-length draws if you need to cull some of them out.
We ONLY need the ability to control the draw call count from the GPU side. Pushing a constant number of draw calls (most of them empty) from the CPU side wastes lots of GPU performance (empty draws cost surprisingly much). We don't need binding changes since we use virtual texturing (and all our mesh data is in a single big raw buffer). SV_DrawId would obviously be mandatory.
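In D3D12 that GPU-controlled count maps to ExecuteIndirect's optional count buffer; a sketch under the assumption that countBuf holds a single UINT written by a culling compute shader, and that the command signature (drawOnlySig) contains nothing but the draw arguments, matching the no-binding-changes use case above.

```cpp
// A culling compute shader compacts the surviving draw records into argBuffer
// and writes the number of survivors into countBuf. The command processor then
// consumes exactly that many records - the CPU only supplies an upper bound.
gCmdList->ExecuteIndirect(drawOnlySig,   // command signature: draw args only
                          maxDraws,      // worst-case upper bound
                          argBuffer, 0,  // compacted draw records
                          countBuf, 0);  // actual count, written on the GPU
```

OpenGL later exposed the same idea through ARB_indirect_parameters (glMultiDrawElementsIndirectCount), where the draw count is likewise sourced from a buffer.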
Don't get me wrong, I like the feature but it's really the binding changes that make it cool.
> empty draws cost surprisingly much
With D3D11? What about 12?
> With D3D11? What about 12?
I am talking about the GPU cost. The command processor will be a big bottleneck if you push the maximum worst case (let's say 50k, mostly empty) draws for each viewport (let's say main + 4 shadow cascades + 10 shadow-casting local lights). If you don't know on the CPU side what you are going to render, it is hard to estimate tight (conservative) maximums that are never exceeded, especially when you use fine-grained (sub-object precision) occlusion culling for all viewports (including shadows).
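To put rough numbers on that worst case: 1 main view + 4 cascades + 10 local lights is 15 viewports, and 15 × 50,000 = 750,000 indirect packets for the command processor to walk every frame (45 million per second at 60 Hz), even when most of them draw nothing.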
> A quick question, perhaps not so quick answer. Is it true that D3D12's lower cost to draw calls only helps bad console ports and bad coding in general? Is it true that writing better code and drawing things in batches would overcome every benefit D3D12 has with its low cost to draw calls?
Simple answer: NO.
> A quick question, perhaps not so quick answer. Is it true that D3D12's lower cost to draw calls only helps bad console ports and bad coding in general? Is it true that writing better code and drawing things in batches would overcome every benefit D3D12 has with its low cost to draw calls?
There are tradeoffs when you batch draw calls. Good code should leverage draw calls where necessary and batch where necessary.
> So is this in terms of cost efficiency, as in less time spent optimizing/minimizing draw calls and more time for other things? Or is it also a pure technical limitation of D3D11 that no amount of optimizing can overcome? Either way, is the lower overhead going to be a big step forward in practice?
I can't answer technically; the senior members here can give you a more accurate picture. But my understanding is that there is no way to optimize the API overhead itself away; you can only optimize around it - hence batched draw calls. In D3D11, say you make a call to draw a triangle strip: maybe that unpacks to 50 instructions for the GPU (that the CPU needs to send), whereas with D3D12 maybe it only takes 8 instructions. As the instruction overhead drops, GPU saturation can increase. In this scenario the GPU is waiting for all the commands to come in before it starts doing work, so the fewer instructions it has to wait for, the better. Lower overhead should result in immediate gains, and it also allows better control over the GPU, so there should be less time spent fighting against what the API is doing and more time programming the graphics for the game.
> So is this in terms of cost efficiency, as in less time spent optimizing/minimizing draw calls and more time for other things? Or is it also a pure technical limitation of D3D11 that no amount of optimizing can overcome? Either way, is the lower overhead going to be a big step forward in practice?
I suppose small studios might not have people proficient enough to use a low-level API; that's why MS is updating D3D11 as well. For those able to use the new APIs, the gain will be freed-up CPU & GPU time.