DX12 Performance Discussion And Analysis Thread

Here are the results from my Titan X SLI setup: Titan X with display, Titan X without display, and 2x Titan X with SLI enabled.
 

Attachments

  • Titan X results.zip
    4.4 KB · Views: 51
Used the new one on my 980 Ti. It took 14 minutes to run, and the Nvidia 355.82 driver crashed at async compute batch 455 (when it got near 3000ms/batch).
I also added an Afterburner log from the run, in case it interests anyone. From a quick glance, it uses more CPU, and GPU usage switched from 100% to 0% when going from one async compute batch to the next.
Yep, same for me, 355.60 drivers. Fresh Win 10, new GTX 970.

This is what GPU utilization looked like:
ei0wWLM.png
 

Attachments

  • GTX_970_run_not_finished.zip
    114.8 KB · Views: 18

Sorry, can't edit. I removed my OC, but it still crashed during "Graphics, compute single commandlist". So it is reproducible: this is the second time the driver has crashed when reaching a value between 2900 and 3000ms.
 
Nice test!

FirePro W8100 (basically an underclocked Radeon R9 290/390, up to 824MHz core) data attached. Up to 505ms for compute only and 302ms for graphics+compute, respectively, at iteration 512.
 

Attachments

  • FirePro_W8100_perf.log.zip
    210.8 KB · Views: 14
Update to FirePro W8100 post above. Interpreted the data wrong originally. Summary:

Compute only: 1. 35.47ms ~ 512. 503.18ms
Graphics only: 34.07ms (49.25G pixels/s)
Graphics + compute: 1. 35.33ms (47.49G pixels/s) ~ 512. 505.24ms (3.32G pixels/s)
Graphics, compute single commandlist: 1. 68.41ms (24.52G pixels/s) ~ 512. 302.77ms (5.54G pixels/s)
 
Here are my results. I had to Ctrl+C out of it during the single command list operations because it was starting to take too long. I figured that how far it got was plenty.
 

Attachments

  • 980ti.zip
    24.8 KB · Views: 26
Are we looking at a driver response timeout?
Just disable TDR then (through the registry); it's what I had to do when debugging CUDA kernels.
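
For anyone who wants to try this, here is a minimal sketch of the registry change. It assumes the documented `TdrLevel` and `TdrDelay` values under `HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers`; the helper name `tdr_reg_file` is made up for illustration. It only builds the text of a `.reg` file. Importing that file needs admin rights and a reboot, and with TDR off a hung GPU can freeze the desktop, so treat it as a debugging setting.

```python
# Registry key that holds the Windows TDR (Timeout Detection and Recovery) settings.
TDR_KEY = r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers"


def tdr_reg_file(level=None, delay_seconds=None):
    """Return .reg file text that sets TdrLevel and/or TdrDelay.

    level=0 turns timeout detection off; delay_seconds is how long
    Windows waits before resetting the driver (the default is 2 seconds).
    tdr_reg_file is a hypothetical helper, not part of any tool.
    """
    lines = ["Windows Registry Editor Version 5.00", "", f"[{TDR_KEY}]"]
    if level is not None:
        lines.append(f'"TdrLevel"=dword:{level:08x}')
    if delay_seconds is not None:
        lines.append(f'"TdrDelay"=dword:{delay_seconds:08x}')
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Print a file that disables TDR entirely.
    print(tdr_reg_file(level=0))
```

Raising `TdrDelay` instead of setting `TdrLevel` to 0 may be the gentler option if the batches only need a few seconds, for example `tdr_reg_file(delay_seconds=10)`.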

Thanks, did this. I was then able to finish without a driver crash. But I had to restart, and now the performance seems different from my previous runs: faster. Now the GPU stays mostly at 100% during the "Graphics, compute single commandlist" part, instead of switching constantly between 100% and 0%, as it also did on the 970. I don't know why it runs better now...

980TI, 355.82 no crash:

Compute only: 1. 5.67ms ~ 512. 76.11ms
Graphics only: 16.77ms (100.06G pixels/s)
Graphics + compute: 1. 21.15ms (79.34G pixels/s) ~ 512. 97.38ms (17.23G pixels/s)
Graphics, compute single commandlist: 1. 20.70ms (81.05G pixels/s) ~ 512. 2294.69ms (0.73G pixels/s)
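
A quick sanity check on the pixel rates: if the benchmark draws a fixed workload, multiplying each "Graphics only" time by its reported throughput should give the same pixel count on every card. The sketch below assumes that is how the G pixels/s figure is derived; the function name is mine, and the numbers are copied from the posts above.

```python
def implied_gigapixels(time_ms, gpixels_per_s):
    """Pixels drawn (in billions) = time in seconds * reported throughput."""
    return time_ms / 1000.0 * gpixels_per_s


# "Graphics only" lines quoted from the posts above: (time in ms, G pixels/s).
graphics_only = {
    "FirePro W8100": (34.07, 49.25),
    "980 Ti": (16.77, 100.06),
}

if __name__ == "__main__":
    for card, (ms, rate) in graphics_only.items():
        print(f"{card}: {implied_gigapixels(ms, rate):.3f} G pixels")
```

Both cards come out at about 1.68 G pixels, which suggests the graphics workload is the same size on each and the rate is simply that workload divided by the elapsed time.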
 

Attachments

  • perf.zip
    342.5 KB · Views: 27
Not that I want to inflict instability on someone's machine, but does the old behavior return if the TDR is restored?
The software might be trying to placate Windows.
 
I'm struggling to see how NVidia is failing by any sensible metric when Graphics + compute completes in 92ms on GTX980Ti and 444ms on Fury X. Or compute only which is 76 versus 468ms. AMD, whatever it's doing, is just broken.

Or maybe Fiji is just spending 25.9ms sleeping, then waking up momentarily to execute a kernel that should take about 8 microseconds.

At least we're seeing some steps on AMD.

3dilettante: wouldn't it be interesting if active TDR is slowing down these tests...
 
Because why not.


HD 7950, 15.7.1 drivers

Compute only: 1. 30.68ms ~ 512. 245.03ms
Graphics only: 62.50ms (26.84G pixels/s)
Graphics + compute: 1. 58.89ms (28.49G pixels/s) ~ 512. 245.48ms (6.83G pixels/s)
Graphics, compute single commandlist: 1. 89.52ms (18.74G pixels/s) ~ 512. 303.73ms (5.52G pixels/s)
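
One rough way to read these summaries is to compare the "Graphics + compute" time at 512 against what you would get by running graphics and compute back to back. The sketch below does only that arithmetic on the numbers posted so far in the thread; it assumes the serial cost is simply the sum of the two standalone times, and it says nothing about why a card behaves the way it does. The function name is hypothetical.

```python
def overlap_report(graphics_ms, compute_ms, combined_ms):
    """Return (serial estimate, time saved) for a combined run.

    The serial estimate assumes graphics and compute run back to back;
    a positive saving means the combined run beat that estimate.
    """
    serial_ms = graphics_ms + compute_ms
    saved_ms = serial_ms - combined_ms
    return serial_ms, saved_ms


# (graphics only, compute only @512, graphics + compute @512), from the posts above.
runs = {
    "FirePro W8100": (34.07, 503.18, 505.24),
    "980 Ti": (16.77, 76.11, 97.38),
    "HD 7950": (62.50, 245.03, 245.48),
}

if __name__ == "__main__":
    for card, (g, c, gc) in runs.items():
        serial, saved = overlap_report(g, c, gc)
        print(f"{card}: serial {serial:.2f}ms, measured {gc:.2f}ms, saved {saved:+.2f}ms")
```

On these figures the two AMD cards finish the combined run faster than the serial estimate, while the 980 Ti run is slightly slower than it.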
 

Attachments

  • HD 7950 (15.7.1).zip
    61 KB · Views: 10
  • HD 7950 (15.8 Beta).zip
    60.7 KB · Views: 7
R9 290X:

Compute only: 1. 27.75ms ~~~~ 512. 413.43ms
Graphics only: 26.78ms
Compute + Graphics: 1. 28.54ms ~~~~ 512. 441.78ms
Compute + Graphics, single command list: 1. 54.28ms ~~~~ 512. 250.74ms

Afterburner:

NW6gcXs.png
 

Attachments

  • perf.zip
    189.1 KB · Views: 7
Would someone please explain what the differences between the first and the second test are?
 
I'm struggling to see how NVidia is failing by any sensible metric when Graphics + compute completes in 92ms on GTX980Ti and 444ms on Fury X. Or compute only which is 76 versus 468ms. AMD, whatever it's doing, is just broken.

Hmm... I agree. The rest of the system is obviously different, but just looking at the latest two posts:

HD 7950: compute only takes 245.03ms at 512, while the R9 290X takes 413.43ms?
HD 7950: single cmd list is slower than async by about 24% (303.73ms vs 245.48ms), but the R9 290X is about 76% faster with a single cmd list (250.74ms vs 441.78ms)?
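
The percentages depend on which way round you divide, so here is the arithmetic spelled out using the 512-iteration times from the two posts above. This is only a sketch; `percent_slower` is a name I made up, and it measures how much longer one run takes relative to the other.

```python
def percent_slower(a_ms, b_ms):
    """How much longer a takes than b, as a percentage of b."""
    return (a_ms / b_ms - 1.0) * 100.0


# Times at iteration 512, copied from the HD 7950 and R9 290X posts.
hd7950_single, hd7950_async = 303.73, 245.48
r290x_single, r290x_async = 250.74, 441.78

if __name__ == "__main__":
    print(f"HD 7950: single cmd list is "
          f"{percent_slower(hd7950_single, hd7950_async):.1f}% slower than async")
    print(f"R9 290X: async is "
          f"{percent_slower(r290x_async, r290x_single):.1f}% slower than single cmd list")
```

Measured this way, the HD 7950 loses roughly a quarter going to a single command list, while the R9 290X gains roughly three quarters.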
 
R9 290x
Compute only: 1. 25.95ms ~ 512. 408.44ms
Graphics only: 29.70ms (56.49G pixels/s)
Graphics + compute: 1. 26.51ms (63.29G pixels/s) {67.70 G pixels/s} ~ 512. 388.92ms (4.31G pixels/s){62.44 G pixels/s}
Graphics, compute single commandlist: 64.23ms (26.12G pixels/s) [25.60] {43.63 G pixels/s} ~ 512. 236.06ms (7.11G pixels/s){54.64 G pixels/s}
 