No DX12 Software is Suitable for Benchmarking *spawn*

There's an entirely different build with embedded NVAPI, and extensions in the engine aren't the same as extensions for the API...

DOOM uses GCN intrinsics, so why are we not up in arms about this too?

It seems that lately, whenever anything uses Nvidia-specific extensions/libraries, GameWorks or whatever, it's bad; yet if AMD does it, it's a huge victory.
The intrinsics being used are the closest thing to compute compression that exists. What's being used should be standardized with SM6.0 for DX12 and eventually an update to Vulkan. The functionality is also supported by all vendors, but primarily used on consoles. In the case of Doom the developer just got ahead of the API a bit and the extensions make porting from console easier. That's a far cry from extensions another IHV can't reasonably use and only exist on one platform.

I like the idea of vendor extensions, but they should be properly integrated into major versions unless novel or very situational. They should also be primarily used for experimental technology.
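To make the extension point concrete, here's a minimal C++/Vulkan sketch of how a renderer typically opts into a vendor extension only when the driver advertises it, keeping a portable fallback otherwise. The use of VK_AMD_shader_ballot is just an illustrative pick for an AMD shader-intrinsics extension, not a claim about what id's Vulkan path actually enables.

```cpp
#include <vulkan/vulkan.h>
#include <cstring>
#include <vector>

// Returns true if the physical device advertises the named extension.
// Sketch only: assumes `gpu` came from vkEnumeratePhysicalDevices.
bool HasDeviceExtension(VkPhysicalDevice gpu, const char* name) {
    uint32_t count = 0;
    vkEnumerateDeviceExtensionProperties(gpu, nullptr, &count, nullptr);
    std::vector<VkExtensionProperties> props(count);
    vkEnumerateDeviceExtensionProperties(gpu, nullptr, &count, props.data());
    for (const auto& p : props) {
        if (std::strcmp(p.extensionName, name) == 0) return true;
    }
    return false;
}

// At device creation, request the vendor extension only when it exists;
// otherwise the renderer keeps its portable SPIR-V path.
std::vector<const char*> PickExtensions(VkPhysicalDevice gpu) {
    std::vector<const char*> exts = { VK_KHR_SWAPCHAIN_EXTENSION_NAME };
    if (HasDeviceExtension(gpu, "VK_AMD_shader_ballot")) {
        exts.push_back("VK_AMD_shader_ballot");  // illustrative intrinsic-style extension
    }
    return exts;
}
```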
 
DOOM uses GCN intrinsics, so why are we not up in arms about this too?

Well you certainly seem to be.

All games end up using code directed at one or the other IHV's implementation. The problem arises when said code exists to cripple performance on the competition (or even on the source IHV's own older hardware, to add insult to injury).
And that problem gets orders of magnitude greater if implemented at the engine level.

GCN intrinsic functions in Doom aren't hurting performance on Nvidia cards beyond an acceptable ~5% margin of error in any benchmark I've seen so far.
The game is just hellishly compute-dependent, which is why it's also the game where Fiji pulls furthest ahead of Hawaii.
 
Well you certainly seem to be.

All games end up using code directed at one or the other IHV's implementation. The problem arises when said code exists to cripple performance on the competition (or even on the source IHV's own older hardware, to add insult to injury).
And that problem gets orders of magnitude greater if implemented at the engine level.

GCN intrinsic functions in Doom aren't hurting performance on Nvidia cards beyond an acceptable ~5% margin of error in any benchmark I've seen so far.
The game is just hellishly compute-dependent, which is why it's also the game where Fiji pulls furthest ahead of Hawaii.


Right, even if those features can be turned off? Intrinsics are a good feature for developers, as they make console-to-PC ports a bit quicker (they're only part of the solution though; ports are still not straightforward). But NV did do driver optimizations that brought performance back to where it should be in later reviews, and according to the developer they're still working with NV on Pascal, so performance may well improve further.
 
Are you sure Khronos has that kind of authority?
When Epic starts including Nvidia extensions in their Vulkan path for UE4, what can Khronos do?


Epic will not use NV-only extensions in their engine; they have stated this many times, and they try to make the engine run well on both IHVs' hardware. Each IHV has the ability to make its own branch and add its own libraries to UE4, which only one IHV has done so far, even though the engine has been available for free for two years now.
 
Well you certainly seem to be.

All games end up using code directed at one or the other IHV's implementation. The problem arises when said code exists to cripple performance on the competition (or even on the source IHV's own older hardware, to add insult to injury).
And that problem gets orders of magnitude greater if implemented at the engine level.

GCN intrinsic functions in Doom aren't hurting performance on Nvidia cards beyond an acceptable ~5% margin of error in any benchmark I've seen so far.
The game is just hellishly compute-dependent, which is why it's also the game where Fiji pulls furthest ahead of Hawaii.

I'm not up in arms about it; it's a good thing that they were used, and I hope other capable developers like id follow suit for both IHVs. Why would the use of GCN intrinsics hurt NV? That code doesn't run on NV hardware at all.

Like @Razor1 said, it's a separate branch. I can't help but feel this is a very one-sided argument; I've **never** argued that it's unfair to use IHV-specific extensions, it's fair game imo.

Intentionally crippling the competitor, though, I agree is not acceptable, but what do you really mean here? Code that serves no purpose but to degrade the experience on the other IHV's hardware? I have no doubt tessellation and GameWorks are about to come up in this conversation, and I'm going to repeat what Razor said: you have the option of disabling those effects. You can argue that jacking up the poly counts is an intentional move by NV to cripple GCN, but you can also argue they're simply exploiting the strengths of their products. I really don't see the issue, just a lot of HOO-WAA and witch hunts.

AMD made a big deal of primitive discard on Polaris, and surprise surprise, it doesn't get crippled like Fiji or Hawaii do...
 
The intrinsics being used are the closest thing to compute compression that exists. What's being used should be standardized with SM6.0 for DX12 and eventually an update to Vulkan. The functionality is also supported by all vendors, but primarily used on consoles. In the case of Doom the developer just got ahead of the API a bit and the extensions make porting from console easier. That's a far cry from extensions another IHV can't reasonably use and only exist on one platform.

I like the idea of vendor extensions, but they should be properly integrated into major versions unless novel or very situational. They should also be primarily used for experimental technology.

I get the gist of what you're saying, but I fail to see the distinction you're drawing between its current form as an extension and future integration into DX with SM6.0. It's just going to be standardized by the API, but on some level it's still going to fork into AMD and NV IL; the only difference is that it's no longer "additional" but rather a standard part of the API.
 
Intentionally crippling the competitor, though, I agree is not acceptable, but what do you really mean here? Code that serves no purpose but to degrade the experience on the other IHV's hardware?
It doesn't even have to be intentional. The issues with OpenGL demonstrate this: the same code can execute with different results on different IHVs because the specification wasn't clear. AMD generally adheres strictly to the standard, while Nvidia cuts some corners for performance or coding reasons. Devs code against the more lenient implementation and suddenly nothing works on the compliant one. That's in line with the extensions mess, where a developer makes a path only one vendor's parts can execute.

I fail to see the distinction you're drawing between its current form as an extension and future integration
The distinction is having an extension that only one vendor can reasonably support on a single platform. Using GCN intrinsics to facilitate porting from another platform is acceptable IMHO. Not ideal, but neither is a port. Bottom line, extensions shouldn't be used beyond experimental code, development, or very narrow use cases unless all vendors plan on adopting the extension.

A bad extension would be "perform this proprietary and unique sampling pattern with purpose-built hardware".
 
Isn't an engine that's built from the ground up for Vulkan or DX12 going to require distinct paths for different architectures, and developers that have extensive knowledge of the low level details of said architectures? So in that case, wouldn't IHV specific extensions actually just make it easier? The abstractions are getting in the way, and the underlying hardware is quite different, so what better way to get in the nitty gritty than to have things exposed directly by the IHVs?
 
Isn't an engine that's built from the ground up for Vulkan or DX12 going to require distinct paths for different architectures, and developers that have extensive knowledge of the low level details of said architectures? So in that case, wouldn't IHV specific extensions actually just make it easier? The abstractions are getting in the way, and the underlying hardware is quite different, so what better way to get in the nitty gritty than to have things exposed directly by the IHVs?
But you see, DX12/Vulkan are not low-level APIs, they are lower-level APIs. They still have abstractions left in, and those get in the way of true to-the-metal coding for each vendor.
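As a rough sketch of what "distinct paths per architecture" can look like in practice under a lower-level API such as Vulkan, here's a minimal C++ example that picks a tuning path from the PCI vendor ID the driver reports. Real engines key off far more than this (device ID, driver version, queried limits), so treat it as an illustration only.

```cpp
#include <vulkan/vulkan.h>

enum class GpuPath { GenericPath, AmdPath, NvidiaPath, IntelPath };

// Pick a tuning/code path from the PCI vendor ID reported by the driver.
// Sketch only: the enum values and the single switch are simplifications.
GpuPath SelectRenderPath(VkPhysicalDevice gpu) {
    VkPhysicalDeviceProperties props{};
    vkGetPhysicalDeviceProperties(gpu, &props);
    switch (props.vendorID) {
        case 0x1002: return GpuPath::AmdPath;     // AMD
        case 0x10DE: return GpuPath::NvidiaPath;  // NVIDIA
        case 0x8086: return GpuPath::IntelPath;   // Intel
        default:     return GpuPath::GenericPath;
    }
}
```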
 
CB shows a very good improvement in frametime variance for the RX 480 on DX12, and to a lesser extent for the 1060.

https://forums.overclockers.co.uk/showthread.php?p=30060384#post30060384

The original link that post is quoting. https://www.computerbase.de/2016-09...agramm-frametimes-unter-steam-auf-dem-fx-8370

Interesting: the frame times for DX11 are hugely variable compared to DX12, especially for the 1060 on the slower CPU. So the 1060 gains performance in DX11 at the expense of more variable frame times, i.e. faster but more inconsistent delivery. The 480, on the other hand, gains nothing by moving to DX11 except more variable frame times and potentially less performance.

Regards,
SB
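For anyone wanting to redo this kind of comparison from a raw frametime capture (e.g. a PresentMon or FRAPS dump), here's a minimal C++ sketch of the basic statistics involved; the numbers in main() are made up purely to show that an identical average can hide very different variance.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Basic statistics over a capture of frametimes in milliseconds.
// Sketch only: reviews usually also report percentiles; assumes non-empty input.
struct FrameStats { double avg_fps; double mean_ms; double stddev_ms; };

FrameStats Analyse(const std::vector<double>& frametimes_ms) {
    double sum = 0.0;
    for (double t : frametimes_ms) sum += t;
    const double mean = sum / frametimes_ms.size();

    double var = 0.0;
    for (double t : frametimes_ms) var += (t - mean) * (t - mean);
    var /= frametimes_ms.size();

    return { 1000.0 / mean, mean, std::sqrt(var) };
}

int main() {
    // Hypothetical captures: both average 25 ms (40 fps), very different variance.
    std::vector<double> steady  = { 25, 24, 26, 25, 25, 24, 26, 25 };
    std::vector<double> erratic = { 15, 40, 18, 35, 16, 38, 17, 21 };
    FrameStats a = Analyse(steady), b = Analyse(erratic);
    std::printf("steady : %.1f fps avg, stddev %.1f ms\n", a.avg_fps, a.stddev_ms);
    std::printf("erratic: %.1f fps avg, stddev %.1f ms\n", b.avg_fps, b.stddev_ms);
    return 0;
}
```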
 
CB shows a very good improvement in frametime variance for the RX 480 on DX12, and to a lesser extent for the 1060.
https://forums.overclockers.co.uk/showthread.php?p=30060384#post30060384
Though more variable, it's well within the tolerable range (for the 1060 at least), enough for the experience not to feel hectic or stuttery and to feel very playable (as evidenced by DF's testing). It's not all smooth sailing under DX12 for either vendor, as there are large spikes happening frequently, not to mention the lower fps (for NV), which is off-putting. IMO significantly higher frame rates with slightly increased variance are preferable in this case.
 
Always interesting to see how people view data.

I remember when frame time analysis first started being used: AMD would get slagged for having a higher average FPS than the Nvidia counterpart at times, but with much more variable frame times.

How times have changed. So now, high frame time variance is good as long as FPS is higher? :)

Regards,
SB
 
Variance isn't the problem as much as the relative peaks. Variable refresh rates, for instance, aren't driven off the average fps or frametime but off a running average of the worst case with a safety margin. The focus really should be on max frametimes (excluding odd peaks), as nobody complains about their FPS being unreasonably high. With VSync enabled and both cards over 60fps, you get effectively the same experience.

For the computerbase results under an FX-8370 I'd say both vendors provide a better experience under DX12 than DX11. For the 1060 I'd say ~30ms (DX11) vs ~25ms (DX12) effective frametimes. Interestingly, the 480's DX12 performance on an 8370 is close to equal to, if not better than, Nvidia's DX11 with an i7-6700. The 480 with the FX-8370 under DX12 might be the best experience shown there; it's just not presented that way, since there's no graph with max frametimes for both CPUs together. I'm eyeballing these numbers so they may be off a bit. My results seem off by a couple of FPS from what the actual FPS graphs show, but that might be down to the frametime-to-fps conversion I used. At the very least there's a 2.3fps difference between the Intel and AMD processors shown here for averages on the 480. Not unreasonable if DX11 is overlapping frames a bit more: higher fps and higher frametimes at the same time.
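A rough C++ sketch of the kind of metrics described above: a high percentile as a stand-in for "max frametime excluding odd peaks", and a rolling worst-case-plus-margin figure loosely resembling how a variable-refresh display plans around recent worst frames. The window size, margin, and percentile are illustrative choices, not values taken from the article.

```cpp
#include <algorithm>
#include <deque>
#include <vector>

// "Max frametime excluding odd peaks": use a high percentile rather than the
// single worst frame. Assumes a non-empty capture; the percentile is illustrative.
double PercentileMs(std::vector<double> frametimes_ms, double pct) {
    std::sort(frametimes_ms.begin(), frametimes_ms.end());
    size_t idx = static_cast<size_t>(pct / 100.0 * (frametimes_ms.size() - 1));
    return frametimes_ms[idx];
}

// Rolling "worst case plus safety margin": the display effectively plans for
// the recent worst frame, not the average one. Window/margin are made-up values.
class WorstCaseTracker {
public:
    WorstCaseTracker(size_t window, double margin) : window_(window), margin_(margin) {}

    double Push(double frametime_ms) {
        recent_.push_back(frametime_ms);
        if (recent_.size() > window_) recent_.pop_front();
        double worst = *std::max_element(recent_.begin(), recent_.end());
        return worst * margin_;  // effective frametime the refresh must cover
    }

private:
    std::deque<double> recent_;
    size_t window_;
    double margin_;
};
```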
 
How times have changed. So now, high frame time variance is good as long as FPS is higher? :)
Your target in the end is a smooth gameplay experience. A GPU outputting consistent but low fps is not good for the consumer; heck, a lowly 650 Ti can manage that when running intensive graphics. High fps with tolerably more variance (i.e., up to a point) is better than low but consistent fps, and enables a far more responsive experience.

If two GPUs deliver nearly equal fps, then the one with more consistency wins; erratic delivery hurts a high frame rate too.
I remember when frame time analysis first started being used: AMD would get slagged for having a higher average FPS than the Nvidia counterpart at times, but with much more variable frame times.
That happened when comparisons between SLI and CF were being investigated thoroughly; CF often exhibited erratic variance, making the perceived experience far worse than the mere fps measurement would imply. That's not the case here; the perceived experience is smoother according to DF's comparisons.
 
Seems pcgameshardware has reproduced the same problem computerbase encountered in Forza Horizon 3: dropped frames occur when a powerful GPU runs at a resolution and frame rate that are CPU-limited, e.g. a GTX 1080 at 1080p Ultra settings with fps in the range of 70. There, dropped frames become very frequent. The more powerful the GPU, the more frames are dropped; for example the GTX 1080 drops more than the 980 Ti, which drops far more than the 970, and the RX 480 drops more than the RX 470, which drops more than the Fury X (which performs badly in this title).

At 4K and max details the problem disappears completely, so they excluded the display refresh rate as the likely cause.
http://www.pcgameshardware.de/Forza.../Specials/Benchmarks-Test-DirextX-12-1208835/

The game has unbelievably high CPU requirements; it stresses only the first core of the CPU, leaving the rest quite idle, a behavior noticed with Forza 6 too, which is quite ironic considering DX12's ability to distribute load across multi-core CPUs. Forza 6 ran very differently though, with GPUs delivering their expected performance range. Maybe CarstenS can chime in and shed some more light on the matter.

http://www.benchmark.pl/testy_i_recenzje/geforce-gtx-1060-test/strona/26424.html
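Since "dropped frames" here is a measured quantity, here's a minimal C++ sketch of one simplified way to count missed refreshes from a frametime capture; it's an illustration of the idea, not how pcgameshardware's tooling (or FCAT/PresentMon) actually classifies drops, and the capture in main() is invented.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Rough model: a frame that takes longer than one refresh interval forces the
// display to repeat the previous image for the refresh slots it missed.
int CountMissedRefreshes(const std::vector<double>& frametimes_ms,
                         double refresh_ms) {
    int missed = 0;
    for (double t : frametimes_ms) {
        int spans = static_cast<int>(std::ceil(t / refresh_ms));  // refresh slots covered
        if (spans > 1) missed += spans - 1;                       // repeated refreshes
    }
    return missed;
}

int main() {
    // Hypothetical ~70 fps capture with a couple of long frames mixed in.
    std::vector<double> capture = { 14, 14, 15, 14, 50, 14, 15, 14, 34, 14 };
    std::printf("missed refresh slots at 60 Hz: %d\n",
                CountMissedRefreshes(capture, 1000.0 / 60.0));
    return 0;
}
```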
 
The game has unbelievably high CPU requirements; it stresses only the first core of the CPU, leaving the rest quite idle, a behavior noticed with Forza 6 too, which is quite ironic considering DX12's ability to distribute load across multi-core CPUs.
And if that's the same execution behavior that happens on an XB1, then that would mean you're throwing everything on a single Jaguar core, which would be very unfortunate.
 
And if that's the same execution behavior that happens on an XB1, then that would mean you're throwing everything on a single Jaguar core, which would be very unfortunate.

I can't believe that's the case. The game wouldn't run at a pretty much locked 30fps if only one weak Jaguar core were running it, IMO.
 
Probably just the case that they didn't get to spend much time porting it properly to Win10, though I don't know how core/thread management differs.
 