DX12 Performance Discussion And Analysis Thread

3dilettante · Mar 14, 2016

To add further context: http://gpuopen.com/dcc-overview/ describes the addition of delta compression to the color block.

Jawed · Mar 14, 2016

The AMD APP SDK has a memory bandwidth tester.

CarstenS · Mar 14, 2016

Ext3h said:
Can you post the kernel source as a reference to that chart?

I cannot. AFAIK its source is not open, and I don't have access to it.

fellix · Mar 14, 2016

Ahh, that's the old GPCBenchmark. Carsten, can you post some numbers from the local memory sub-test with Fiji and Hawaii?

By the way, Fiji doubles the L2 size because of the doubled count of memory controllers: 32 × 64KB partitions = 2048KB. The bandwidth should also scale proportionally.
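For reference, the arithmetic in that post can be checked with a few lines of Python. This is only a sketch of the post's own figures (32 partitions of 64KB each); the 16-partition count for Hawaii is an assumption inferred from the word "doubles", and the function name is made up for illustration.

```python
KIB_PER_MIB = 1024


def l2_size_kib(partitions: int, kib_per_partition: int = 64) -> int:
    """Total L2 capacity in KiB for a given number of L2 partitions."""
    return partitions * kib_per_partition


# 32 partitions is the figure stated in the post.
fiji_kib = l2_size_kib(32)
# Assumption: half as many partitions on Hawaii, inferred from "doubles".
hawaii_kib = l2_size_kib(16)

print(f"Fiji:   {fiji_kib} KiB = {fiji_kib // KIB_PER_MIB} MiB")
print(f"Hawaii: {hawaii_kib} KiB = {hawaii_kib // KIB_PER_MIB} MiB")
```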

Jawed · Mar 14, 2016

CodeXL will capture all the kernels running on the GPU and decompile them into meaningful OpenCL.

CarstenS · Mar 15, 2016

fellix said:
Ahh, that's the old GPCBenchmark. Carsten, can you post some numbers from the local memory sub-test with Fiji and Hawaii?

IMHO that test is quite erratic, so take this with an extra dose of salt: the results can vary by a couple of hundred GB/s from run to run. I took the best out of ~10 tries, so here you go.
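As an aside, the "best out of ~10 tries" approach can be sketched as below. This is not the benchmark's code (its source is not available); the measurement here is a plain host-memory copy standing in for a GPU kernel, and every name in it is hypothetical. Reporting the spread alongside the best value makes the run-to-run variation visible.

```python
import statistics
import time
from typing import Callable, Tuple


def copy_bandwidth_gbs(size_bytes: int = 16 * 1024 * 1024) -> float:
    """Time one host-memory copy and return GB/s (counting read + write).

    Stand-in for a real GPU bandwidth kernel, which is not available here.
    """
    src = bytearray(size_bytes)
    start = time.perf_counter()
    _dst = bytes(src)  # the copy being timed
    elapsed = time.perf_counter() - start
    return 2 * size_bytes / elapsed / 1e9


def summarize_runs(measure: Callable[[], float],
                   runs: int = 10) -> Tuple[float, float, float]:
    """Run a measurement several times; return (best, median, spread)."""
    samples = [measure() for _ in range(runs)]
    return max(samples), statistics.median(samples), max(samples) - min(samples)


best, median, spread = summarize_runs(copy_bandwidth_gbs, runs=10)
print(f"best {best:.1f} GB/s, median {median:.1f} GB/s, spread {spread:.1f} GB/s")
```

Taking the best run filters out slow outliers (for example, runs that started before the clocks ramped up), while the median and spread show how noisy the test is.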

fellix said:
By the way, Fiji doubles the L2 size because of the doubled count of memory controllers: 32 × 64KB partitions = 2048KB. The bandwidth should also scale proportionally.

The Fiji block diagram seems to imply otherwise:
http://www.hotchips.org/wp-content/...-GPU-Epub/HC27.25.520-Fury-Macri-AMD-GPU2.pdf

--
@Jawed: Good input. If no one beats me to it, I'll do it as soon as I can find the time.

fellix · Mar 15, 2016

CarstenS said:
IMHO that test is quite erratic, so take this with an extra dose of salt: the results can vary by a couple of hundred GB/s from run to run. I took the best out of ~10 tries, so here you go.

Thanks.
Indeed, it is erratic. I noticed that the application doesn't trigger the highest P-state or boost clock on my 980Ti. I have to find a way to run the tests with power management off somehow.

CarstenS · Mar 15, 2016

Yep, that's a problem with the GeForce cards. Radeons, though, react quickly enough for the very short duration of each test run.

Clukos · Mar 15, 2016

fellix said:
Thanks.
Indeed, it is erratic. I noticed that the application doesn't trigger the highest P-state or boost clock on my 980Ti. I have to find a way to run the tests with power management off somehow.

You can always flash a custom BIOS.

That's what I did to keep the voltage stable under 3D load (and effectively "disable" GPU Boost), because the way Nvidia has it set up created big fluctuations in games where the card wasn't being pushed enough, and I was getting driver crashes.

CSI PC · Mar 15, 2016

Here is a good list of conferences pertaining to DX12/gaming for GDC:
http://www.gdconf.com/news/get_advanced_graphics_tips_fro.html

Some of those presentations look very interesting, judging by what was mentioned about yesterday's.
Cheers

Clukos · Mar 15, 2016

Interesting slide from GDC this year

Full article: http://www.dualshockers.com/2016/03...-nvidia-and-amd-cards-lots-of-details-shared/

Alessio1989 · Mar 15, 2016

want more leaked slides Q_Q

CSI PC · Mar 15, 2016

I think they touch on something interesting regarding AMD: GPU memory management, and how it diverges between the Fury range using HBM and the lower cards, which have more memory, albeit GDDR5.
I assume developers need to consider, as part of their optimisation, handling the dynamic memory solution on the Fury range more aggressively, and taking a different approach on the lower cards, which are not as bandwidth efficient but benefit from the extra memory.

Cheers

CSI PC · Mar 15, 2016

Alessio1989 said:
want more leaked slides Q_Q

We could do with the slides from some of the other presentations I linked; they also seem to have some very interesting real-world experiences.
In case you missed it the first time: http://www.gdconf.com/news/get_advanced_graphics_tips_fro.html
Cheers

CSI PC · Mar 15, 2016

Tomorrow should be very interesting too, as it is more aligned with Intel/Avalanche (Just Cause 3), so it should also focus on the CR/ROV work that Intel collaborated on with Avalanche (I mentioned this in a post a while back).
https://software.intel.com/en-us/event/gdc/2016/sponsored-talks

Cheers

Clukos · Mar 15, 2016

From the Quantum Break presentation

DX11 drivers are able to circumvent HW pitfalls. We’re matching DX11 GPU perf on Maxwell + AMD.
CPU perf: Sure DX12 can be much faster, but if your engine design is such that you don’t swamp the API with draw calls, the actual API overhead might not be significant in your overall CPU cost. We saved ~10% overall renderer time

Full presentation here: Developing The Northlight Engine: Lessons Learned
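The CPU point on that slide is essentially an Amdahl's-law argument: speeding up only the API portion of the renderer saves at most that portion's share of the total time. A minimal sketch follows; the API shares and the 3x speedup are hypothetical illustration values, not numbers from the presentation.

```python
def renderer_time_saved(api_fraction: float, api_speedup: float) -> float:
    """Fraction of total renderer CPU time saved when only the API part speeds up.

    api_fraction: share of renderer CPU time spent in API/driver overhead (0..1).
    api_speedup:  how many times faster that share becomes (> 0).
    """
    if not 0.0 <= api_fraction <= 1.0:
        raise ValueError("api_fraction must be between 0 and 1")
    if api_speedup <= 0.0:
        raise ValueError("api_speedup must be positive")
    return api_fraction * (1.0 - 1.0 / api_speedup)


# Hypothetical API shares, all with the API part made 3x faster.
for fraction in (0.05, 0.15, 0.50):
    saved = renderer_time_saved(fraction, api_speedup=3.0)
    print(f"API share {fraction:.0%} -> overall saving {saved:.1%}")
```

With a small API share the overall saving stays small however fast the new API is, which is consistent with the slide's remark about engines that do not swamp the API with draw calls.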

Jawed · Mar 15, 2016

CarstenS said:
@Jawed: Good input. If no one beats me to it, I'll do it as soon as I can find the time.

The original website is dead as far as I can tell, so I don't have the test.

I dare say the bandwidth tester in the APP SDK will be more carefully coded, etc.

Clukos · Mar 16, 2016

From MS presentation

Faster porting between Xbox and PC?

Alessio1989 · Mar 16, 2016

Shader Model 6? The shader model shipped with DirectX 12 is 5.1, which is essentially SM 5.0 + direct resource indexing (oh, and Root Signature via HLSL). Yeah, we have not had a truly new shader model since SM 4.0...

Adored · Mar 16, 2016

Clukos said:
From MS presentation

Faster porting between Xbox and PC?

This is something that Dave has been talking about for months.

https://forum.beyond3d.com/threads/...nd-analysis-thread.57188/page-10#post-1869146
https://forum.beyond3d.com/threads/...and-analysis-thread.57188/page-4#post-1867679
