DX12 Performance Discussion And Analysis Thread

CarstenS · Mar 2, 2016

What I've seen in earlier OpenCL tests was that Fiji compared to Hawai only pulls apart in large-block copy. Read and write OTOH were even a bit slower comparing Fury X to R9 390X. This was with very early drivers in july last year, so it could have improved since then. Need to re-test at some point. Or it could be some peculiarity of OpenCL or it's driver itself, I don't know.

CSI PC · Mar 2, 2016

Phil Spencer review just released regarding PC plans and was asked about the Universal Windows Application (this is what is causing the debate between ExtremeTech/Guru3d/Pcper as Ashes Singularity aligns with this criteria now and why it goes beyond directflip that ExtremeTech assumed and how Ashes was analysed - context by Ryan at Pcper-Guru3d is important and IMO factual around VSYNC and implications for DX12 game development and also GSYNC/FreeSync).
Here is a snippet of what PCGamer and Phil said, which shows how this is a headache up to now due to no clear practical strategy-leadership.
PCG: Universal Windows Applications currently don't support a lot of the features PC gamers would expect, for instance multi-GPU support, exclusive fullscreen, modding, etc. What are you doing to improve UWA functionality.
Phil: Yeah, well, we obviously have the same list, and maybe even a little longer than what the community has brought up around Rise of the Tomb Raider. Certain things will happen very quickly in terms of, like, mGPU support and stuff where there’s no policy, it’s just us working through the timeline of implementation. VSync lock, kind of the same thing. There’s specific reasons that it’s there, but it’s not something that’s kind of a religion on our side that this has to work.

CSI PC · Mar 3, 2016

Another performance consideration; the VSYNC one when looked into further has repercussions for potentially perceived performance whether visually (PCPER did use a video camera frame-by-frame to double check their investigation), or input as there is a valid reason why many serious gamers like certain games with VSYNC off.
The other consideration is just how beneficial is DX12 multi-GPU compared to either SLI or CrossFire.....
While I appreciate it is not really possible to do a fair comparison, current multi-GPU support of Ashes in DX12 is not near what one can expect from the bespoke driver option provided by either AMD or NVIDIA.
This is further compounded that it seems the faster option offered by AMD/NVIDIA is no longer available for DX12 games, so performance hit and consideration when trying to maintain maximum frame rate refresh of monitor (critical if both GSYNC and FREESYNC are crippled - needs validating to know how/if/limitations works within DX12 games and also those aligned more so with UWA).
Anyway could be a nightmare for those who have top tier multi-GPU with 120+hz refresh monitors or 4k gaming at 60hz.
Cheers

Ethatron · Mar 3, 2016

CSI PC said:
While I appreciate it is not really possible to do a fair comparison, current multi-GPU support of Ashes in DX12 is not near what one can expect from the bespoke driver option provided by either AMD or NVIDIA.

A Multi-GPU implementation has to be written 100% by the devs., there is no driver magic.

Ryan Smith · Mar 3, 2016

Ethatron said:
A Multi-GPU implementation has to be written 100% by the devs., there is no driver magic.

DX12 does allow for implicit multi-adapter. And it inherently works in windowed mode as well. But we've yet to see an example of it.

http://images.anandtech.com/doci/9740/Implicit.PNG

Silent_Buddha · Mar 3, 2016

CSI PC said:
The other consideration is just how beneficial is DX12 multi-GPU compared to either SLI or CrossFire.....
While I appreciate it is not really possible to do a fair comparison, current multi-GPU support of Ashes in DX12 is not near what one can expect from the bespoke driver option provided by either AMD or NVIDIA.

You're correct, it's better than Crossfire or SLI. You can pair same vendor or different vendor cards. Multi-GPU also doesn't have to be limited to crappy AFR. Ashes does have an issue of being CPU limited in their MGPU implementation as their MGPU implementation increases the CPU load of the game, but still offers greater performance with MGPU than single.

But it's still early days, and it'll be interesting to see what other developers do with it. So far Firaxis (with Mantle) is the only other developer that comes to mind that has experimented with MGPU in the game itself. Even better if the engine makers (UE, Unity, etc.) eventually provide MGPU functionality into their engines.

It's the same as AA. AA should be implemented in game, and we shouldn't have to rely on driver hacks to force it.

Regards,
SB

CarstenS · Mar 3, 2016

Silent_Buddha said:
It's the same as AA. AA should be implemented in game, and we shouldn't have to rely on driver hacks to force it.

Given the underwhelming choice in many current titles (either prohibitevly expensive or blurring the whole image), I for one would LOVE to see more options, either directly in the driver, via third party tools or whatever. Experience shows you cannot for the most part count on devs for nice AA implementation. I realize ofc that driver enforced AA is harder and harder to get to work correctly with modern rendering techniques.

Alessio1989 · Mar 3, 2016

I can guess that for most developers, 3rd party tools and driver settings interfering with their application is a pain in the ass, since they are cause of tons of issues and obviously most of people will blame them for that.

CSI PC · Mar 3, 2016

Ryan Smith said:
DX12 does allow for implicit multi-adapter. And it inherently works in windowed mode as well. But we've yet to see an example of it.

http://images.anandtech.com/doci/9740/Implicit.PNG

Ryan,
from your discussions regarding DX12 with various developers and AMD/NVIDIA; do you know if we can expect to see implicit multiadapter in any games?
Would make it interesting from a performance comparison, although still not perfect in terms of comparing to SLI/CrossFire.

Also, any chance asking both NVIDIA and Oxide if they intend to collaborate in adding CR/ROV shading to Ashes of the Singularity?
I ask this because in a Q&A you did, they infer they are independent with regards to why they implemented async compute and are curious of the benefits of using various DX12 solutions, so in that case I would expect them to also look to implement CR/ROV

http://www.anandtech.com/show/10067/ashes-of-the-singularity-revisited-beta/2
Question Anandtech asked was:
"Does Oxide/Stardock have some sort of business deal with any IHV with regards to Async Compute? Is Oxide promoting this feature because of some kind of marketing deal?"

Thanks

CSI PC · Mar 3, 2016

Well interestingly just to add, looks like Just Cause 3 is the first to use both CR and ROV, although ironically seems this collaboration was with Intel rather than NVIDIA - could be a good thing if Intel becomes very active with these features and developers.
http://www.dsogaming.com/news/just-...ing-dx12-pc-exclusive-dx12-features-revealed/

Cheers

Deleted member 13524 · Mar 3, 2016

https://www.reddit.com/r/Amd/commen...ologies_group_qa_is_happening_here_on/d0md4ct

I want to be clear that there is no graphics architecture on the market today that is 100% compliant with everything DX12 or Vulkan have to offer. For example: we support Async Compute, NVIDIA does not. NVIDIA supports conservative raster, we do not.

Shots fired.

Alessio1989 · Mar 3, 2016

CSI PC said:
Well interestingly just to add, looks like Just Cause 3 is the first to use both CR and ROV, although ironically seems this collaboration was with Intel rather than NVIDIA - could be a good thing if Intel becomes very active with these features and developers.
http://www.dsogaming.com/news/just-...ing-dx12-pc-exclusive-dx12-features-revealed/

Cheers

As for rendering features, Intel (with Skylake) has actually the most advanced GPU speaking about of D3D12 rendering features. I also personally find Intel drivers the most stable and complete so far. So it is not a big surprise.

Ryan Smith · Mar 3, 2016

CSI PC said:
Ryan,
from your discussions regarding DX12 with various developers and AMD/NVIDIA; do you know if we can expect to see implicit multiadapter in any games?

Nothing in particular. However I would be shocked if we don't see it happen with a AAA multiplatform game at some point. NVIDIA will want to get SLI support for a major game, and implicit will be the least intrusive method to get there.

Also, any chance asking both NVIDIA and Oxide if they intend to collaborate in adding CR/ROV shading to Ashes of the Singularity?

I'll ask next time I talk to Oxide. But I wouldn't expect anything with the game set to ship in 3 weeks.

CSI PC · Mar 3, 2016

Alessio1989 said:
As for rendering features, Intel (with Skylake) has actually the most advanced GPU speaking about of D3D12 rendering features. I also personally find Intel drivers the most stable and complete so far. So it is not a big surprise.

Yeah not a complete surprise, but it is quite ironic it is not NVIDIA as these features are more associated (rightly or wrongly) with them.
Still a positive for gamers IMO as we need to see what all features of DX12 bring to the table in real games, and glad Intel is stepping up to make this more than just a fight between AMD and NVIDIA on who controls the performance of gaming.

Also thanks Ryan, yeah I have my doubts they will implement CR/ROV even if they said they were interested in all DX12 features that can improve gaming and not just tied to async compute.
If Intel worked with Avalanche on Just Cause 3 for CR/ROV, maybe they also reached out to Oxide (maybe an angle you can also check).

Cheers and thanks

Alessio1989 · Mar 3, 2016

CSI PC said:
Yeah not a complete surprise, but it is quite ironic it is not NVIDIA as these features are more associated (rightly or wrongly) with them.
Still a positive for gamers IMO as we need to see what all features of DX12 bring to the table in real games, and glad Intel is stepping up to make this more than just a fight between AMD and NVIDIA on who controls the performance of gaming.

Currently Skylake is the only GPU supporting tier 3 features of CR, while Maxwell 2.0 is left to tier 1 (though it should have a 1/256 pixel precision). With CR, things become really interesting with Tier 2.

CSI PC · Mar 4, 2016

Alessio1989 said:
Currently Skylake is the only GPU supporting tier 3 features of CR, while Maxwell 2.0 is left to tier 1 (though it should have a 1/256 pixel precision). With CR, things become really interesting with Tier 2.

Agreed which is why I said rightly or wrongly, the other aspect is that NVIDIA seemed to be more active in presenting the benefits of both CR/ROV going back awhile - this is where it is probably perceived by many it seems (if can face reading various comments section on tech sites lol) as an NVIDIA "pushed it into DX12" in a way to combat AMD.
Again more of a perception thing, and another good reason for Intel to be active with CR/ROV collaborations.

Cheers

Ext3h · Mar 4, 2016

CSI PC said:
and glad Intel is stepping up to make this more than just a fight between AMD and NVIDIA on who controls the performance of gaming.

Are Intel GPUs competitive in terms of performance per watt?
Respectively are they still competitive if you attribute for the efficiency improvement from the advanced manufacturing process, assuming characteristics which are on par with the 14/16nm industry standards?
And even if they are efficient, are they are also cost efficient in terms of die size per performance?

It's not much of a surprise that Intels hardware is the most feature complete one, but so far the issues with Intel GPUs were mostly raw performance. And if the efficiency isn't on par, we can't expect these GPUs to grow until they provide any real alternative to dedicated GPUs (or AMDs APUs) any time soon, at least not for any AAA title.

3dilettante · Mar 4, 2016

Ext3h said:
Are Intel GPUs competitive in terms of performance per watt?

There are implementations of Intel processors with integrated graphics that are.

Respectively are they still competitive if you attribute for the efficiency improvement from the advanced manufacturing process, assuming characteristics which are on par with the 14/16nm industry standards?
And even if they are efficient, are they are also cost efficient in terms of die size per performance?

This is difficult to tease out, since Intel does not make discrete GPUs and the confounding factors of product tier, product trade-offs, and device platform design can complicate things.
Getting all those conditionals at once at a bargain price is unlikely, much like it is difficult to find a product using the more graphics-aligned AMD APUs that isn't notably hobbled by cost-cutting.

It's not much of a surprise that Intels hardware is the most feature complete one, but so far the issues with Intel GPUs were mostly raw performance.

This hasn't always been the case. That Intel has caught up and exceeded the graphics experts in various aspects was noticed.

And if the efficiency isn't on par, we can't expect these GPUs to grow until they provide any real alternative to dedicated GPUs (or AMDs APUs) any time soon, at least not for any AAA title.

For lower-end discrete, barring compatibility and driver issues, the better and more recent ones do.
Versus AMD APUs, the same is mostly true, perhaps more so since the CPU portion of AMD's APUs is frequently a more important downside than an Intel shortfall in GPU performance.

And Intel's GPUs have grown, given the range from 24-32-48 EUs and a wide array of core and EDRAM segmentation that can yield good results.
The best ones don't come cheap, typically.

lanek · Mar 4, 2016

3dilettante said:
There are implementations of Intel processors with integrated graphics that are.

This is difficult to tease out, since Intel does not make discrete GPUs and the confounding factors of product tier, product trade-offs, and device platform design can complicate things.
Getting all those conditionals at once at a bargain price is unlikely, much like it is difficult to find a product using the more graphics-aligned AMD APUs that isn't notably hobbled by cost-cutting.

This hasn't always been the case. That Intel has caught up and exceeded the graphics experts in various aspects was noticed.

For lower-end discrete, barring compatibility and driver issues, the better and more recent ones do.
Versus AMD APUs, the same is mostly true, perhaps more so since the CPU portion of AMD's APUs is frequently a more important downside than an Intel shortfall in GPU performance.

And Intel's GPUs have grown, given the range from 24-32-48 EUs and a wide array of core and EDRAM segmentation that can yield good results.
The best ones don't come cheap, typically.

For me, its not a question or the actual raw performance, i use CPU that just dont have Integrated GPU's on my gaming system.. High end perf ( 6 cores Ivybridge ) + multiple dedicated gpu`s.... I dont say that im not interested to see good results on APU's, but where will they stand against dedicated gpu's + 8-12 cores cpu ?

Ext3h · Mar 4, 2016

lanek said:
I dont say that im not interested to see good results on APU's, but where will they stand against dedicated gpu's + 8-12 cores cpu ?

In a very odd position. Because what they lack in raw performance and dedicated memory bandwidth, they usually make up in entirely different domains, by CPU-GPU latency and zero-copy HSA features.

Which makes your second question (which you edited away) not as trivial as you might think. It still depends on how soon the different HSA initiatives will pick up, cross platform that is. Currently, it's still rather unintuitive to work with such a platform, from a developers point of view. It's getting better, but we are not quite there, as you still need to compile your application with at least 3 different compilers if you want to cover all 3 vendors. Cross vendor setups are messy, to say the least, and require nasty abstraction which effectively undoes the recent improvements.

Using a heterogenous architecture per DX12 would be possible, but probably not as efficient as you might think, respectively not in that way. Take CR for an example, it allows a number of smart tricks, but your IGP simply lacks in raw throughput to make it worth it. Where you do profit though, is if you offload portions onto the IGP which require frequent synchronisation with the application, as this becomes a lot cheaper in terms of latency, compared to the dedicated GPU.

The real question is though:
Is it worth to optimize for heterogeneous architectures? IHMO, it's not, at least not until you are developing a major engine which reaches a sufficient number of systems which have just such a configuration. And even then you have to evaluate if there are any tasks you can safely offload, without running into other limitations. Effectively, you are probably going for an horizontal cut of your render pipeline at a few predetermined breaking points, based on raw performance, but not based on differences in capabilities or timing characteristics. Simply because you can't account for the latter ones to be fulfilled by ANY device in an average system.

DX12 Performance Discussion And Analysis Thread

CarstenS

Moderator

CSI PC

CSI PC

Ethatron

Ryan Smith

Silent_Buddha

CarstenS

Moderator

Alessio1989

CSI PC

CSI PC

Deleted member 13524

Guest

Alessio1989

Ryan Smith

CSI PC

Alessio1989

CSI PC

Ext3h

3dilettante

lanek

Ext3h

Similar threads