No DX12 Software is Suitable for Benchmarking *spawn*

I keep saying it: DX12 and the whole low-level API thing was the worst "evolution" ever. Even Microsoft can't get it right for its most CPU-limited game ever. What an irony :runaway::runaway::runaway:
It's not Microsoft's renderer though; the game is made by Asobo and this is their first attempt at D3D12. It'll get better, but I'm not holding my breath on them beating the D3D11 driver any time soon.
 
So, Halo Infinite shows wild performance irregularities for all vendors; I'll categorize the results into two groups by testing methodology.

Starting with world traversal scenes, either on foot or by vehicle, the results are as follows:

-RDNA2 GPUs trounce Ampere: the 6900XT is 10% to 20% faster than the RTX 3090 @1440p!
-Turing GPUs trounce RDNA1 GPUs: the RTX 2070 Super is 40% to 50% faster than the 5700XT @1440p!
-Pascal GPUs trounce Vega GPUs: the GTX 1080 is ~20% faster than Vega 64 @1440p!


https://www.sweclockers.com/test/33367-snabbtest-grafikprestanda-i-halo-infinite
https://www.pcgameshardware.de/Halo...als/PC-Benchmarks-Technical-Review-1384820/3/

However, once the screen gets busy with AI and battles, the tables turn:

-Ampere comes out on top: the 3090 becomes 10% to 15% faster than the 6900XT @1440p!
-Turing's lead over RDNA1 shrinks: the 2070 Super becomes just 15% to 20% faster than the 5700XT @1440p!
-Pascal becomes equal to Vega: the GTX 1080 and Vega 64 achieve equal results.

https://www.techspot.com/article/2382-halo-infinite-benchmark/
https://gamegpu.com/action-/-fps-/-tps/halo-infinite-test-gpu

It's a total mess! Though there are some constants, specifically @2160p: NVIDIA GPUs do better in all situations, Ampere either closes the gap to RDNA2 or expands its lead over RDNA2 depending on the scene, Turing expands its lead over RDNA1 in all situations, and Pascal GPUs widen their distance over Vega GPUs in all situations, as AMD cards with 8GB seem to suffer from a VRAM management issue that comparable NVIDIA GPUs don't.

All in all, it's a mess of a game, and the visuals don't justify the performance at all. The DX12 renderer comes with a wide variance in performance across different hardware, which is disappointing to see in such a high-profile game.
 
-Pascal GPUs trounce Vega GPUs: the GTX 1080 is ~20% faster than Vega 64 @1440p!
Pascal only wins over Vega because of some apparent VRAM bug affecting AMD. At 1080p, Vega is around 30% faster.
 
Halo Infinite is a mess on PC. I just set everything to low or off because it still looks OK enough and I get better frame rates without input lag spikes, but there's no avoiding the completely random poor CPU performance that was introduced for me with the launch-day update, even on small multiplayer maps. I'm enjoying the multiplayer, but it's a technical mess. I actually hope they don't fix the 30 fps animations, because I don't want it to impact my CPU performance.
 
I would like to thank @Lurkmass for his tremendous insight into why DX12 behaves the way it does on AMD and NVIDIA GPUs.

The D3D12 binding model causes some grief on Nvidia HW. Microsoft forgot to include STATIC descriptors in Root Signature 1.0, which then got fixed with RS 1.1, but no developers use RS 1.1, so in the end Nvidia likely have app profiles or game-specific hacks in their drivers. Mismatched descriptor types are technically undefined behaviour in D3D12, but there are now cases in games where shaders use sampler descriptors in place of UAV descriptors and somehow it works without crashing! No one has any idea what workaround Nvidia is applying.
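For context, here's a minimal sketch of what opting into RS 1.1's static hints looks like on the C++ side; the helper name BuildStaticSrvTableRootSignature is hypothetical and error handling is omitted:

```cpp
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Hypothetical helper: builds a root signature with one SRV descriptor table
// whose data is promised to be static, letting the driver skip the defensive
// work it has to assume under Root Signature 1.0.
ComPtr<ID3D12RootSignature> BuildStaticSrvTableRootSignature(ID3D12Device* device)
{
    D3D12_DESCRIPTOR_RANGE1 range = {};
    range.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SRV;
    range.NumDescriptors = 16;
    range.BaseShaderRegister = 0;                            // t0..t15
    range.RegisterSpace = 0;
    range.Flags = D3D12_DESCRIPTOR_RANGE_FLAG_DATA_STATIC;   // descriptors are already static by default in 1.1
    range.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND;

    D3D12_ROOT_PARAMETER1 param = {};
    param.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
    param.DescriptorTable.NumDescriptorRanges = 1;
    param.DescriptorTable.pDescriptorRanges = &range;
    param.ShaderVisibility = D3D12_SHADER_VISIBILITY_PIXEL;

    D3D12_VERSIONED_ROOT_SIGNATURE_DESC desc = {};
    desc.Version = D3D_ROOT_SIGNATURE_VERSION_1_1;            // query D3D12_FEATURE_ROOT_SIGNATURE support first
    desc.Desc_1_1.NumParameters = 1;
    desc.Desc_1_1.pParameters = &param;
    desc.Desc_1_1.Flags = D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT;

    ComPtr<ID3DBlob> blob, error;
    D3D12SerializeVersionedRootSignature(&desc, &blob, &error);

    ComPtr<ID3D12RootSignature> rootSig;
    device->CreateRootSignature(0, blob->GetBufferPointer(), blob->GetBufferSize(),
                                IID_PPV_ARGS(&rootSig));
    return rootSig;
}
```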

It's 'how' developers are 'using' it that makes it defective, since in practice that just means the D3D12 binding model isn't all that different from Mantle's binding model, which was pretty much only designed to run on AMD HW, so it becomes annoying for other HW vendors trying to emulate this behaviour to be consistent with their competitor's HW ...
I don't think any of the background discussion about multithreading or software vs hardware scheduler crap is related to why NV sees higher overhead on D3D12 ...


Root-level views in D3D12 exist to cover the use cases of the binding model that would otherwise run badly on their hardware, but nearly no developers use them because they don't have bounds checking, so developers mostly hate the feature! This ties in with the last sentence: instead of using SetGraphicsRootConstantBufferView, some games will spam CreateConstantBufferView just before every draw, which adds even more overhead. It all starts adding up when developers are abusing all these defects in D3D12's binding model.
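As a rough illustration of the two call patterns being contrasted here, a sketch with hypothetical function names, assuming the per-draw GPU virtual address, the descriptor handles, and the root parameter layout come from the caller's own setup:

```cpp
#include <d3d12.h>

// Pattern A: root-level CBV. One pointer-sized root argument per draw,
// no descriptor is created or copied on the CPU.
void RecordDrawRootCbv(ID3D12GraphicsCommandList* cmd,
                       D3D12_GPU_VIRTUAL_ADDRESS perDrawGpuVa,
                       UINT indexCount)
{
    cmd->SetGraphicsRootConstantBufferView(/*RootParameterIndex*/ 1, perDrawGpuVa);
    cmd->DrawIndexedInstanced(indexCount, 1, 0, 0, 0);
}

// Pattern B: creating a fresh CBV descriptor in a shader-visible heap slot
// before every draw, then pointing a descriptor table at it. The extra
// descriptor writes per draw are the overhead being complained about above.
void RecordDrawDescriptorSpam(ID3D12Device* device,
                              ID3D12GraphicsCommandList* cmd,
                              D3D12_GPU_VIRTUAL_ADDRESS perDrawGpuVa,
                              D3D12_CPU_DESCRIPTOR_HANDLE heapCpuSlot,
                              D3D12_GPU_DESCRIPTOR_HANDLE heapGpuSlot,
                              UINT indexCount)
{
    D3D12_CONSTANT_BUFFER_VIEW_DESC cbv = {};
    cbv.BufferLocation = perDrawGpuVa;
    cbv.SizeInBytes = 256;                       // CBV sizes must be 256-byte aligned
    device->CreateConstantBufferView(&cbv, heapCpuSlot);

    cmd->SetGraphicsRootDescriptorTable(/*RootParameterIndex*/ 1, heapGpuSlot);
    cmd->DrawIndexedInstanced(indexCount, 1, 0, 0, 0);
}
```

Both assume a matching root signature layout; the point is only the per-draw CPU cost difference between the two call patterns.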


Bindless on NV (unlike AMD) has idiosyncratic interactions: they can't use constant memory with bindless CBVs, so they load the CBVs from global memory, which is a performance killer (none of this matters on AMD) ...
 
Isn't it ironic? An API designed specifically to reduce dependence on the CPU ends up in a huge CPU limitation on modern GPUs.
It's not ironic for AMD, who have pushed for these APIs to be made this way and have basically reached that goal 99% of the time when compared to their D3D11 driver.
It does raise the question of why Nv didn't care enough about these issues on their h/w in the first place, however. It's like "we don't care" is their general approach to gaming use of their GPUs these days.
 
After all these years, developers aren't changing practices and Nvidia isn't changing their hardware. It would take losing their significant marketshare advantage to get them to care. Are there benefits to Nvidia's approach over AMD's? Reasons other than R&D/SW investment to avoid change?
 
After all these years, developers aren't changing practices and Nvidia isn't changing their hardware. It would take losing their significant marketshare advantage to get them to care. Are there benefits to Nvidia's approach over AMD's? Reasons other than R&D/SW investment to avoid change?

The biggest advantage for them would be good compatibility and performance with legacy software. There are other advantages to Nvidia's HW binding model, like support for D3D11-style bound textures and constant buffers. If you don't need bindless textures, then bound textures might be theoretically faster because it's one less indirection for the HW. It's a well-known quirk by now that using bindless constant buffers means you won't be able to hit the faster constant memory path on Nvidia, so bound constant buffers are ideal in their case. The D3D12 binding model wouldn't be too bad if developers actually used the validation layers to catch mismatched descriptor types, so then vendors wouldn't have had to implement an unintended feature like mutable descriptors in the first place ...
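For what it's worth, enabling the validation layers mentioned above is only a few lines at startup. A minimal sketch (the helper name is hypothetical, and how much of a given descriptor-type mismatch GPU-based validation actually catches varies case by case):

```cpp
#include <d3d12.h>
#include <d3d12sdklayers.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Enable the D3D12 debug layer (and optionally GPU-based validation) before
// creating the device. GPU-based validation can flag incompatible or
// uninitialized descriptors that shaders actually dereference at execution.
void EnableD3D12Validation(bool gpuBasedValidation)
{
    ComPtr<ID3D12Debug> debug;
    if (SUCCEEDED(D3D12GetDebugInterface(IID_PPV_ARGS(&debug))))
    {
        debug->EnableDebugLayer();

        if (gpuBasedValidation)
        {
            ComPtr<ID3D12Debug1> debug1;
            if (SUCCEEDED(debug.As(&debug1)))
                debug1->SetEnableGPUBasedValidation(TRUE);
        }
    }
}
```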

There might be legitimate ways to further reduce CPU overhead on Nvidia HW in D3D12. You could use the ExecuteIndirect API and specify in the command signature that the root arguments and vertex/index buffer bindings should be updated, which lets the GPU change the resource bindings instead of the CPU, but this capability is hardly ever used in current games. ExecuteIndirect is basically a watered-down version of Nvidia's Vulkan device generated commands extension, which is ideal for GPU-driven rendering pipelines ...
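Roughly, such a command signature could look like the following; a sketch under assumed argument layout (the struct and helper names are hypothetical, and argument buffer creation, culling, and the root signature itself are omitted):

```cpp
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// One record per draw in the GPU-visible argument buffer. The layout must
// match the order of the argument descs in the command signature below.
struct IndirectCommand
{
    D3D12_GPU_VIRTUAL_ADDRESS    perDrawCbv;  // root CBV rebound by the GPU
    D3D12_VERTEX_BUFFER_VIEW     vbv;         // vertex buffer rebound by the GPU
    D3D12_INDEX_BUFFER_VIEW      ibv;         // index buffer rebound by the GPU
    D3D12_DRAW_INDEXED_ARGUMENTS draw;
};

ComPtr<ID3D12CommandSignature> CreateBindingChangingSignature(
    ID3D12Device* device, ID3D12RootSignature* rootSig)
{
    D3D12_INDIRECT_ARGUMENT_DESC args[4] = {};
    args[0].Type = D3D12_INDIRECT_ARGUMENT_TYPE_CONSTANT_BUFFER_VIEW;
    args[0].ConstantBufferView.RootParameterIndex = 0;
    args[1].Type = D3D12_INDIRECT_ARGUMENT_TYPE_VERTEX_BUFFER_VIEW;
    args[1].VertexBuffer.Slot = 0;
    args[2].Type = D3D12_INDIRECT_ARGUMENT_TYPE_INDEX_BUFFER_VIEW;
    args[3].Type = D3D12_INDIRECT_ARGUMENT_TYPE_DRAW_INDEXED;

    D3D12_COMMAND_SIGNATURE_DESC desc = {};
    desc.ByteStride       = sizeof(IndirectCommand);
    desc.NumArgumentDescs = 4;
    desc.pArgumentDescs   = args;

    // Because the signature changes root arguments, the root signature must
    // be supplied here (it may only be null for plain draw/dispatch args).
    ComPtr<ID3D12CommandSignature> sig;
    device->CreateCommandSignature(&desc, rootSig, IID_PPV_ARGS(&sig));
    return sig;
}

// Usage (argument/count buffers are filled on the GPU elsewhere):
// cmdList->ExecuteIndirect(sig.Get(), maxDraws, argBuffer, 0, countBuffer, 0);
```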
 
Hot take: I think PSOs were the right idea despite all the recent noise ...


More explicit APIs moved away from state objects/dynamic states and separate shader objects in favour of monolithic pipelines (PSOs), since pre-baking states and shaders was more optimal compared to hidden recompilations. While more dynamic states and separable shader stages can help reduce the number of pipelines, there's a possibility that different combinations of hardware/states/stages generate unique microcode for each permutation. In a simple example where we only have 5 different vertex shaders and 5 different fragment shaders with no other state changes, we would ideally compile a total of only 10 programs if hardware is truly capable of dynamically selectable separate shader stages. The worst case scenario is that on other hardware you end up generating 25(!) different programs, one for each combination of the 2 shader stages. A reduction in the number of pipelines won't necessarily reduce the number of compilations. There are good intentions behind the design of PSOs: not compromising performance on any hardware where it matters. HW vendors did relax some rules on dynamic states, but separate shader objects still remain highly controversial. Virtually all hardware vendors recommend pre-generating PSOs for ray tracing since it is not practical to do it at runtime ...
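To make the 10-vs-25 arithmetic concrete, here's a toy sketch of a monolithic-PSO cache keyed by the shader pair; PipelineHandle and createGraphicsPipeline are hypothetical stand-ins for an engine's wrapper around CreateGraphicsPipelineState:

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical opaque pipeline handle and creation call standing in for
// ID3D12PipelineState / CreateGraphicsPipelineState.
struct PipelineHandle { uint64_t id; };
PipelineHandle createGraphicsPipeline(uint32_t vsIndex, uint32_t psIndex);

// Monolithic PSOs are keyed by the full (VS, PS) combination, so with 5 vertex
// shaders and 5 pixel shaders this cache can grow to 5 * 5 = 25 entries, even
// though only 5 + 5 = 10 shader programs exist. Hardware with truly separable
// stages could in principle stop at 10 compiled programs.
std::unordered_map<uint64_t, PipelineHandle> g_psoCache;

PipelineHandle getOrCreatePso(uint32_t vsIndex, uint32_t psIndex)
{
    const uint64_t key = (uint64_t(vsIndex) << 32) | psIndex;
    auto it = g_psoCache.find(key);
    if (it != g_psoCache.end())
        return it->second;

    PipelineHandle pso = createGraphicsPipeline(vsIndex, psIndex); // potentially slow compile
    g_psoCache.emplace(key, pso);
    return pso;
}
```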

The register-slot-based binding model that we used to see in D3D11 or Metal could get away with automatic barrier generation and tracking. With bindless, explicit barriers are exposed instead, because automatic barriers would cause devastating performance losses: drivers would have to be conservative and insert many more unnecessary barriers if they didn't know how these resources are accessed. Even Metal exposes explicit barriers with bindless. By association, if bindless is a hard requirement for ray tracing then so too are explicit barriers ...
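For readers who haven't touched D3D12, the explicit barrier being described is literally the app declaring the hazard itself. A minimal sketch with a placeholder resource and states:

```cpp
#include <d3d12.h>

// The application, not the driver, declares that a render target is about to
// be read as a shader resource. With bindless the driver can't infer this
// from bindings, so the transition has to be stated explicitly like this.
void TransitionToShaderRead(ID3D12GraphicsCommandList* cmd, ID3D12Resource* texture)
{
    D3D12_RESOURCE_BARRIER barrier = {};
    barrier.Type  = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;
    barrier.Transition.pResource   = texture;
    barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barrier.Transition.StateAfter  = D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE;
    cmd->ResourceBarrier(1, &barrier);
}
```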

I think we can safely discard the idea that we can somehow keep the simplicity of old gfx APIs if we want a feasible implementation of ray tracing at all. Need bindless? Explicit barriers come with that too. Expensive compilation and performance behind ray tracing? Suddenly, PSOs are a good idea there as well. Older gfx APIs with these features wouldn't end up much different in complexity from the newer explicit APIs ...
 
I think PSOs are definitely the right idea (I'm a know-nothing nobody though, haha), but from what little I actually do understand, I believe the way it's set up is best for the future. I feel the mechanisms to deal with a lot of these issues are there; we just need developers to start utilizing them properly and designing around them in the first place.

There's definitely going to be a long transition period, but I do feel that the industry will eventually converge on the right path. There's going to be some bumps and bruises along the way though.

I read quite a bit of developer tweets and back-and-forths, and you begin to appreciate the complexities they face and the constant "struggle to no avail" to get changes implemented. It's a really slow process, and things often lag YEARS behind from when developers first start raising these issues to when, or if, they ever get implemented. I know it's not always possible for them to speak on things which are NDA'd or whatever, but I wish more developers would candidly speak out about changes they wish to see implemented in engines/APIs, what exactly those changes could mean for the future of games, and why they are important... in a way that gets gamers behind their cause. I honestly feel like maybe things would move a bit faster, and maybe the "standards bodies" and API engineers would take more rapid action to implement some of these things, if these ideas were brought to light by the gaming media and gamers asking questions about this kind of stuff.

On the PC side, you constantly see developers complaining about how difficult it is compared to consoles, with tools and APIs being either non-existent or far behind their console counterparts, and developers constantly pushing for features and improvements to come closer to parity... It's disheartening to hear developers say they've been pushing for X or Y for years and being completely ignored... Obviously PC is always inherently going to be more complex, but a lot of what developers ask for isn't a reduction in complexity, just better tools to deal with that complexity. A lot of what they ask for they already KNOW is possible; it's just not being done for some reason.

I know it's extremely complex... you have multiple issues at play which drag this stuff down, usually resulting in nothing being done until there's no choice but to act. I wish the situation was better and that there was a way we could get a bigger push behind developers to help them get the changes they want actually implemented sooner rather than later. I think it starts with developers being extremely candid with gamers as to why things are the way they are, and perhaps trying to get the influence of gamers and media on their side. I think it would be cool if a channel like Digital Foundry did interviews with developers about the challenges they face when developing/porting games and actually spoke about a lot of these issues. Again, I know they probably contractually can't say too much, but these are the damn people who make the games. At the end of the day, everything that APIs, engines, and developers do should be in service of giving gamers the best experience possible and allowing for new, never-before-done types of experiences. So when developers are complaining about not having X and Y features, that should ripple all the way up the stack, and the powers that be should be working towards the goal of servicing the developers.

I dunno, I'm blabbing on now... but I feel things can definitely be better than they are right now and I'd love to see more developer issues get the spotlight they deserve to hopefully invoke change at a quicker pace. As I've said before, it's amazing how quickly things can change when you bring the right kind of attention towards it.
 
On the PC side, you constantly see developers complaining about how difficult it is compared to consoles, with tools and APIs being either non-existent or far behind their console counterparts, and developers constantly pushing for features and improvements to come closer to parity... It's disheartening to hear developers say they've been pushing for X or Y for years and being completely ignored... Obviously PC is always inherently going to be more complex, but a lot of what developers ask for isn't a reduction in complexity, just better tools to deal with that complexity. A lot of what they ask for they already KNOW is possible; it's just not being done for some reason.
Most such features are specific to console h/w and aren't being implemented in PC APIs because they aren't (and won't be) supported by all h/w vendors, or would break something in what is expected to be added in the future.
Developers tend to think that they know better how to design the h/w and work with it but we all see what happens when they are given something like D3D12. Their opinion on these things shouldn't be taken at face value.
 
Developers tend to think that they know better how to design the h/w and work with it but we all see what happens when they are given something like D3D12.
Back in 2015, I remember DICE specifically asking for lower-level APIs; now almost every Frostbite game that uses DX12 has managed to completely screw both AMD and NVIDIA GPUs compared to DX11, the latest such example being Battlefield V. Star Wars Squadrons was even released lacking DX12 completely.
 
Most such features are specific to console h/w and aren't being implemented in PC APIs because they aren't (and won't be) supported by all h/w vendors, or would break something in what is expected to be added in the future.
Developers tend to think that they know better how to design the h/w and work with it but we all see what happens when they are given something like D3D12. Their opinion on these things shouldn't be taken at face value.
Good points. I know many things devs ask for either can't or won't be implemented for very valid reasons, but I also see what I can only assume are valid requests by fairly prominent developers who I believe know full well what they are talking about. Usually the way it goes is a dev makes a tweet about some feature finally being added, and then they link to blog posts from years before detailing the problem and what to do to fix it. Often it's too little, too late, sadly.

But yeah, you're right... just as with anything else, you can't give people ALL the power... they might think they want it, but it can potentially cause a mess in so many other ways. Anyway, I'd just love to see the industry be able to react and pivot quicker in the future. I believe the ability is there; it just lacks the motivation in many aspects. And I'm not trying to be disrespectful either, because I fully understand it's no easy task and people are putting in massive amounts of work every day to inch ahead.
 
Back in 2015, I remember DICE specifically asking for lower-level APIs; now almost every Frostbite game that uses DX12 has managed to completely screw both AMD and NVIDIA GPUs compared to DX11, the latest such example being Battlefield V. Star Wars Squadrons was even released lacking DX12 completely.

Battlefield V was obviously unfinished, in true AAA fashion. As the patching continues DX12 seems to offer lower input lag. Maybe with time even more.
 
Battlefield V was obviously unfinished, in true AAA fashion. As the patching continues DX12 seems to offer lower input lag. Maybe with time even more.
I don't trust the current team at DICE to honestly fix anything. I say that engine is dead in the water.. and EA better hope some team in their stable of developers is up to the challenge to push things forward, because IMO all the great talent there has left and most are with Embark now... which look to be doing some really awesome things.
 
I don't trust the current team at DICE to honestly fix anything. I say that engine is dead in the water.. and EA better hope some team in their stable of developers is up to the challenge to push things forward, because IMO all the great talent there has left and most are with Embark now... which look to be doing some really awesome things.

There are probably great people left at DICE, SEED and the EA Frostbite team. I think the question is whether EA will allow them the time to overhaul the technical debt that's built up in Frostbite. Rewriting a major engine like that could take years. Who knows, maybe they're already doing it, and we're just in a window where people are stuck working on the original branch of the engine, and in a year or two they'll be able to switch over to a new Frostbite, or a new engine, that has the major issues fixed.
 