No DX12 Software is Suitable for Benchmarking *spawn*

Memory Pooling in LDA? Was this public info? Would SFR benefit from it?
If linked they are effectively pooled. SFR would be far better off with some sort of link. Any intermediate resources ultimately will be required on the other adapter for most scenarios. How robust the link is will be another question.
 
Is it the platform or the developer?

Non-UWP games have plenty of problems of their own. The recent Mafia 3 release, for example. Or when Batman: Arkham Knight released, if you want a rather extreme example. Assassin's Creed: Unity didn't have UWP to blame either.

Some of it was obviously UWP for something like Tomb Raider, when UWP was just starting to be used for AAA games. Or Quantum Break at the start, with them releasing their first Dx12 game. It's much better now, and many of the problems some people experience with the recent releases could also be down to these being the first full titles by these developers on PC. It's entirely possible that a non-UWP release would have had similar issues.

Heck, depending on your criteria (frame time consistency or FPS) and your hardware, the UWP version of Quantum Break runs better than the non-UWP version.

Regards,
SB
That is a good question, and tbh I think it is a mixture of both.
Regarding Quantum Break, you would expect the DX12 version to perform better than the DX11 one, but remember how much of a crap storm that game was for the first months and how it required patching of both the game and UWP; it was a nightmare to begin with. And yet again we see problems for other games that look to be a mix of developers getting to grips with UWP and UWP itself.
BTW which site are you going by regarding Quantum Break?
The Computerbase.de review is a bit confusing because it mentions the UWP version 'is now better' with the update that improves VSYNC and the related issues, but their charts clearly show the DX11 version is actually faster than the UWP-DX12 one. This also applies to AMD (apart from Fury X), albeit with a bit more frame variance so far. AMD suffers much more with the 480, but let's see if they can resolve this in under the 2 months it has taken to get QB-UWP stable with most issues resolved, and whether it also applies to Hawaii or just Polaris cards - a shame we did not have info on the other models.
That said, they also mention the UWP-DX12 QB still suffers hiccups when they were evaluating it against the Steam DX11 version.
And this is comparing a UWP-DX12 Quantum Break that has been patched at length to a just-launched DX11 version with no further optimisation patches so far....
Maybe it is a matter of perspective regarding Quantum Break and which manufacturer/models one looks at. If only they had kept DX12 as well for the Steam version, as AMD seems to need it for now, while Nvidia is coping well with the Steam DX11.

Anyway, I guess time will tell how long it takes to iron out those game issues for both FH3 and Gears of War 4, and whether we will see another UWP update as well (and how often that will need to happen for games).
However, a key factor is also curveball settings/hardware setups that may have 'side effects' with the UWP version, and as Computerbase.de mention, Gears of War 4 would be perfect if it were on a different platform to the UWP Microsoft Store :)
According to them, it and FH3 are still held back by issues that seem to stem from UWP rather than the game.

Your first sentence would have merit if Steam were the issue for Mafia 3/Batman/etc; the indications are that the issues I am talking about, which have also been raised by credible review sites, come back to UWP rather than actual game-related flaws and bugs.

This also fits into the context of DX12 testing and reviewing.
If a game is released on multiple PC 'store' platforms, such as on both Steam and the UWP Microsoft Store, which one do reviewers do their testing/analysis on?
There is a likelihood of different behaviour and issues being encountered; in theory the platform should be transparent, and maybe at some point the UWP Microsoft Store will achieve that, but until then it is a consideration with regard to testing and evaluating games.

Cheers
 
DX12 is now out of Beta in Deus Ex Mankind Divided. DX12 mGPU is supported in Beta form.

https://steamcommunity.com/app/337000/discussions/0/343788552537105000

I came home to try the new patch and find my own sweet spot. It seems to be at everything maxed out except for Very High textures and 8x AF:

[Screenshot: in-game graphics settings]


Here are my average results with an E5 2670 v2 ES CPU (10 cores / 20 threads at 2.8GHz), frame pacing enabled at all times, Freesync 40-75Hz + in-game VSync, and triple buffering for the mGPU results:

[Chart: average FPS results]


DX11 Crossfire gives me 43% scaling but DX12 mGPU does a whopping 96%. At first I thought it could be from the FPS falling under the Freesync range and pushing the results down, so I repeated the DX11 Crossfire benchmark without any VSync and got 31.9 FPS. So practically the same performance, but with quite a bit more stuttering. This is probably because, AFAIK, Freesync still allows for more "low" refresh rates (half of 40Hz, half of 50Hz, half of 60Hz) so the average result doesn't suffer that much.
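For reference, this is roughly how those scaling percentages are worked out; a minimal sketch, with placeholder FPS values rather than my exact runs:

```python
# Rough sketch of how the multi-GPU scaling percentages above are derived.
# The FPS values below are illustrative placeholders, not exact benchmark numbers.

def mgpu_scaling(single_gpu_fps: float, multi_gpu_fps: float) -> float:
    """Extra performance gained from the second GPU, as a percentage."""
    return (multi_gpu_fps / single_gpu_fps - 1.0) * 100.0

single = 22.4        # hypothetical single-GPU average FPS
dx11_cf = 32.0       # hypothetical DX11 Crossfire average FPS
dx12_mgpu = 43.9     # hypothetical DX12 mGPU average FPS

print(f"DX11 CF scaling:   {mgpu_scaling(single, dx11_cf):.0f}%")    # ~43%
print(f"DX12 mGPU scaling: {mgpu_scaling(single, dx12_mgpu):.0f}%")  # ~96%
```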


I also took screenshots of the CPU utilization charts for DX11 Crossfire and DX12 mGPU. Although the "total utilization area" seems to be equivalent between APIs, in DX12 each thread shows a much more stable/even utilization, whereas the DX11 run has a lot more spikes.
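If anyone wants to reproduce that comparison without eyeballing screenshots, here's a rough sketch of logging per-core utilization while a run is going (it uses the third-party psutil package; the file name, duration and interval are arbitrary choices of mine):

```python
# Quick sketch: log per-logical-core CPU utilization to a CSV while a benchmark runs.
# Requires the third-party psutil package; file name, duration and interval are arbitrary.
import csv
import time

import psutil

def log_core_usage(path: str, duration_s: int = 60, interval_s: float = 0.5) -> None:
    n_cores = psutil.cpu_count(logical=True)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["time_s"] + [f"core{i}" for i in range(n_cores)])
        start = time.time()
        while time.time() - start < duration_s:
            # cpu_percent blocks for interval_s and returns one value per logical core
            usage = psutil.cpu_percent(interval=interval_s, percpu=True)
            writer.writerow([round(time.time() - start, 2)] + usage)

# Run once per API (DX11 Crossfire vs DX12 mGPU) and compare how spiky each core's trace is.
# log_core_usage("dx12_mgpu_cpu.csv")
```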
 
BTW which site are you going by regarding Quantum Break?

The link was posted earlier in the thread but I'll post it again.

https://www.computerbase.de/2016-09...agramm-frametimes-unter-steam-auf-dem-fx-8370

The Dx12 version has far superior frame pacing, while on Nvidia hardware the Dx11 version has higher framerates.

The tradeoff is higher framerates or more stable frame pacing. Although on Nvidia hardware in Dx12 the frame pacing isn't nearly as good as on AMD hardware, with some notably large anomalies on the FX-8370 which don't manifest on the i7-6700K. What's interesting is that Dx11 frame time variance gets a lot worse when moving to the FX-8370 on the Nvidia GPU than it does on the AMD GPU, compared to their respective frame time graphs for the much faster CPU.

Basically, for Quantum Break the Dx12 UWP version is almost universally better than Dx11 on AMD hardware, while on Nvidia there are some trade-offs.

Regards,
SB
 
The link was posted earlier in the thread but I'll post it again.

https://www.computerbase.de/2016-09...agramm-frametimes-unter-steam-auf-dem-fx-8370

The Dx12 version has far superior frame pacing, while on Nvidia hardware the Dx11 version has higher framerates.
SB
Ah thanks, that was the one I was referencing in my response.

Depends upon the CPU, and being pragmatic it would be fair to say most (emphasising I am not saying all, but the large majority) are not using AMD CPUs for modern gaming as they are so hit and miss.
Well, I would not say far superior frame pacing for Nvidia on DX12; definitely for AMD, yes, but we have only seen the results for the 480.
Also worth noting that the framerates are technically higher for AMD on DX11 as well, when DX11 should be notably behind DX12 considering how long that version has been out, and that is even with a decent Intel CPU.

But again, this is comparing a game and UWP that have been patched multiple times over several months to a DX11 version that initially launched just 6 days ago, and I am surprised that is the only environment/performance issue found on the DX11 version. Even then, Computerbase.de mention that the DX12-UWP version is still having intermittent problems:
Regardless of the graphics card, Quantum Break under DirectX 12 has small hiccups from time to time, which you can also immediately feel when playing.
So something is still up with the frame pacing, or something else is causing stutter/latency/etc now and again on the DX12-UWP version, and Remedy only got to grips with the VSYNC-UWP technicalities and other resolutions in the August patch, many months after release.

Anyway, it's just a shame we did not get to see how Hawaii performs compared to Polaris on DX11 for frame pacing in the computerbase.de tests.
The key is how quickly AMD can get to grips with the frame pacing issue on DX11 and how many models the issue affects, along with what the next set of optimisations (driver and game) bring for the DX11 version of the game, but that is probably for a different topic/thread.
Cheers
 
More Gears 4 gameplay testing:

[Chart: Gears of War 4 gameplay benchmark, 1080p Ultra]

http://pclab.pl/art71543-7.html

Seems we have another Deus Ex moment in Gears of War 4. The built-in test runs a little faster on NVIDIA than actual gameplay, as shown in tests from gamegpu, gamersnexus, pcworld, and wccftech. However, gameplay tests from computerbase, pclab and pcgameshardware reveal AMD closer to NV than NV would like to be (especially at 1440p and 2160p), and depending on the area, Fury X can lead the 980Ti.

It's worth noting, though, that gamersnexus flagged some specific issues on AMD hardware:

AMD's frametime performance seems disproportionately impacted to nVidia's, reflected in our raw data and low values above,

The most critical takeaway here, and this will continue to 1080p, is that Gears of War 4 benchmarks with a fairly high variance in framerates on AMD hardware. This leads to results which, at the low end, can look somewhat inconsistent despite relatively consistent averages. In gameplay, the performance measurements manifest themselves in the form of “stutters” and frame drops. Play can feel choppy at times, and that's when there's a brief drop and hard hit to the low recorded framerates. Not every single test pass exhibits this behavior, but you'll run into the issue at least once every minute (or so) when playing.

GN has reached out to AMD for help in researching this issue. AMD has informed us that the team is investigating.

Note that 16.9.2 and 16.10.1 drivers both exhibit the same performance in Gears of War 4, which we've validated directly with AMD. We will have to look to future driver or Gears of War updates for resolution of these lower frametimes under certain conditions. Our present hypothesis is that, outside of VRAM limitations with some configurations, one of the couple dozen options within the “Ultra” preset is taxing AMD hardware exceptionally hard, because the impact is lessened at “High.” Again, this will require further research.
http://www.gamersnexus.net/game-bench/2630-gears-4-pc-benchmark-updated-with-ultra-high-settings
 
Shame they used the internal Gears of War 4 benchmark; it ruins it a fair bit for me.
You would think more review sites would move away from that, especially when they are making the effort to compile their own data using PresentMon and a Python script.
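For what it's worth, that kind of analysis is not hard to reproduce yourself; a minimal sketch of a pass over a PresentMon capture (it assumes the MsBetweenPresents column name, so adjust it if your PresentMon build logs a different header):

```python
# Minimal sketch: summarise a PresentMon CSV into average FPS, 1% / 0.1% lows
# and frame-time variance. Assumes the 'MsBetweenPresents' column; adjust the
# name if your PresentMon version logs a different header.
import csv
import statistics

def summarise(csv_path: str) -> dict:
    with open(csv_path, newline="") as f:
        frametimes_ms = [float(row["MsBetweenPresents"]) for row in csv.DictReader(f)]
    fps_sorted = sorted(1000.0 / ft for ft in frametimes_ms)

    def low(pct: float) -> float:
        # average FPS of the worst pct% of frames
        n = max(1, int(len(fps_sorted) * pct / 100.0))
        return sum(fps_sorted[:n]) / n

    return {
        "avg_fps": 1000.0 / statistics.mean(frametimes_ms),
        "1%_low_fps": low(1.0),
        "0.1%_low_fps": low(0.1),
        "frametime_stdev_ms": statistics.stdev(frametimes_ms),
    }

# print(summarise("gears4_run1.csv"))
```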
From what I can tell, PCGamesHardware still has the 980 Ti performing slightly better than Fury X even when Async Compute is on for AMD, and with the 1060 and 480 trading blows; context is the average and low fps.
Need to be wary of the earlier frametime/latency charts where they are not using an FHD monitor natively, and I think something is still up with the Ultra HD results as they show near identical behaviour across the various environment settings as when used at FHD.
Computerbase.de show Nvidia trading blows at the levels you would expect, but unfortunately we do not have the lower percentiles or frametimes beyond the 480/1060.
Cheers
 
Hmm. The reports I was seeing had several people whose issues with DX12 were fixed. It is working better for those people.

So, dunno what to tell you there. I guess partial fix might be a better term?
 
Hmm. The reports I was seeing had several people whose issues with DX12 were fixed. It is working better for those people.

So, dunno what to tell you there. I guess partial fix might be a better term?


As far as my experience goes, the rendering engine part seems to be mostly done by now. IIRC, I had one rendering-related CTD in about 20 hours of gameplay, and this was using the DX12 mGPU path which is still in Beta.
Single GPU average gains aren't great, but minimum FPS rise a bit and stuttering practically goes away in DX12. I reckon I probably only got a FPS bump because my CPU has really low clock speeds (up to 2.8GHz).

Multi-GPU gains in DX12 are borderline ridiculous in the good sense. I've shown my own results with a couple of Hawaii cards, but if you go to Steam's game forums you'll see people going from 25 FPS DX11 to 44 FPS DX12 with dual GTX 1080 in 4K (another user claims 37 FPS DX11 vs. 49 FPS DX12 also with dual GTX 1080). If these users were also getting 30-40% scaling from DX11 SLI, they're probably over 90% now, just like my AMD setup.
It's such a shame that no PC gaming website is talking about this, really.


The game has huge issues with its handling of data, though. First, the game's binaries and assets are completely bloated: the game now occupies over 60GB even though the total gameplay area is actually quite small (Witcher 3 is probably 50x larger and occupies half the storage). Loading times are horrendous, going up to more than a minute on a 500MB/s SSD. Some people measured it and it seems the game won't ever pull more than 12MB/s from its mass storage drive.
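On the 12MB/s claim, that is easy enough to sanity-check yourself; a rough sketch of sampling a process's read throughput (it uses the third-party psutil package, and the executable name is just a guess on my part):

```python
# Rough sketch: sample a process's disk read throughput in MB/s.
# Requires the third-party psutil package; the executable name is a guess.
import time

import psutil

def read_throughput(exe_name: str = "DXMD.exe", duration_s: int = 30) -> None:
    proc = next(p for p in psutil.process_iter(["name"])
                if p.info["name"] and p.info["name"].lower() == exe_name.lower())
    last = proc.io_counters().read_bytes
    for _ in range(duration_s):
        time.sleep(1)
        now = proc.io_counters().read_bytes
        print(f"{(now - last) / 1e6:.1f} MB/s")
        last = now

# Kick off a save load in-game, then run this to see how much the game actually pulls from disk.
# read_throughput()
```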
There are quicksave/quickload shortcuts but the "quick"load takes as much time as if you were loading the game for the first time, making those "let's see what happens and go back" experiments really tiresome.

Sometimes the game crashes and screws up all the saves I made for that particular area. I'm not joking: one time the game crashed and the only save file I could then load was from a couple of hours before, just because it was a save made in a previous area. I now actually travel on the metro purposely just to safeguard some important save points.

Many users think this might have to do with the Denuvo DRM, which seems to interfere with all of the game's data management.
Which is ridiculous, because apparently the game was cracked within 2 weeks of release. It would be hilarious if the pirated versions aren't getting these problems.
 
Maxwell cards have Async forcibly disabled; only Pascal cards benefit from it.

Gameplay tests from golem.de, pclab.pl, overclock3d, purepc.pl and pcgameshardware show NV cards are generally faster in this title, while computerbase shows AMD faster @1440p; this is likely due to the choice of test area. The built-in benchmark shows NV faster most of the time.
 
Maxwell cards have Async forcibly disabled; only Pascal cards benefit from it.

Gameplay tests from golem.de, pclab.pl, overclock3d, purepc.pl and pcgameshardware show NV cards are generally faster in this title, while computerbase shows AMD faster @1440p; this is likely due to the choice of test area. The built-in benchmark shows NV faster most of the time.

Right, this was always the case, yet they still lose performance when async is enabled in Ashes of the Singularity, for example. If I'm not mistaken, ext3h said it was some kind of d3d emulation layer which accepted multiple streams and then serialized the tasks for submission to the driver; this incurs a performance penalty in Ashes' case. Wondering what's going on here if you force it on through the game.
 
Well, Maxwell cards don't perform well with Async; they take a performance hit from it, or at best gain nothing. This was pretty obvious even in the 3DMark Time Spy demo. Only Pascal cards can use Async to gain an increase in fps. As such, comparable cards from NV and AMD (RX 480 vs 1060) are able to achieve equal performance in this title.

[Chart: GTX 1060 review benchmark with async compute enabled (Hardware Canucks)]

http://www.hardwarecanucks.com/foru...iews/73040-nvidia-gtx-1060-6gb-review-15.html

That's not the reason the RX 480 and 1060 perform similarly.

Maxwell's performance tracks compute throughput fairly consistently, and Pascal behaves in much the same way as Maxwell.

The RX 480 underperforms relative to Hawaii and Fiji at flop/s parity - anyway, this is beside the point. I know Maxwell doesn't benefit from async compute, and I know it's disabled by default - just like in Ashes.

However, in Ashes you could force-enable async and the engine would switch to a parallel command stream submission model, and something in the stack (the d3d emulation layer I was asking about, or the driver) would then take those parallel streams and serialize them for submission to the graphics queue only. This is what incurred the performance hit.

The benchmark posted above specifies that async compute is enabled; I'm asking if this affects performance at all on Maxwell.

@Ext3h I was wondering about that d3d compatibility layer you mentioned. If that's true and Nvidia does not expose the compute queue whatsoever, then wouldn't that actually make it not D3D12 compliant? It makes more sense for the driver to be doing this when D3D passes it compute work.
 