DX12 Performance Discussion And Analysis Thread

Discussion in 'Rendering Technology and APIs' started by A1xLLcqAgt0qc2RyMz0y, Jul 29, 2015.

  1. Devnant

    Newcomer

    Joined:
    Sep 3, 2015
    Messages:
    10
    Likes Received:
    7
    I think that happened halfway through the single command list run. GPU usage starts going from 0 to 100% when the single command list starts.
     
    Razor1 and Jawed like this.
  2. CSI PC

    Veteran

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Pharma, thanks for the heads up on the edit count, good to know.

    PadyEos, do you have any 3rd-party software installed such as MSI Afterburner or anything comparable (even if CPU-oriented), or is the system pretty clean from that perspective?
    I guess most people would have some kind of 3rd-party performance-related software installed, and for most it would probably be MSI Afterburner. Yeah, a very unlikely correlation, I know, but it's a variable that cannot be ignored.
    Cheers
     
  3. HyperMatrix

    Joined:
    Sep 4, 2015
    Messages:
    3
    Likes Received:
    0
    Compute only:
    1. 9.76ms
    2. 9.75ms
    3. 9.75ms
    4. 9.75ms
    5. 9.13ms
    6. 8.87ms
    7. 8.87ms
    8. 8.87ms
    9. 8.81ms
    10. 8.49ms
    11. 8.49ms
    12. 8.49ms
    13. 8.51ms
    14. 8.48ms
    15. 8.49ms
    16. 8.51ms
    17. 8.48ms
    18. 8.49ms
    19. 8.51ms
    20. 8.52ms
    21. 8.50ms
    22. 8.52ms
    23. 8.48ms
    24. 8.49ms
    25. 8.49ms
    26. 8.49ms
    27. 8.48ms
    28. 8.51ms
    29. 9.18ms
    30. 8.53ms
    31. 8.50ms
    32. 16.92ms
    33. 19.02ms
    34. 19.02ms
    35. 19.02ms
    36. 19.02ms
    37. 19.03ms
    38. 19.02ms
    39. 19.02ms
    40. 19.02ms
    41. 19.02ms
    42. 19.02ms
    43. 21.17ms
    44. 19.02ms
    45. 19.02ms
    46. 19.02ms
    47. 19.02ms
    48. 19.02ms
    49. 19.02ms
    50. 21.16ms
    51. 19.02ms
    52. 19.02ms
    53. 19.02ms
    54. 19.02ms
    55. 19.02ms
    56. 19.03ms
    57. 19.02ms
    58. 19.02ms
    59. 19.02ms
    60. 19.02ms
    61. 19.02ms
    62. 21.18ms
    63. 19.03ms
    64. 27.43ms
    65. 27.43ms
    66. 27.43ms
    67. 29.59ms
    68. 27.44ms
    69. 27.43ms
    70. 27.43ms
    71. 27.44ms
    72. 29.57ms
    73. 27.43ms
    74. 27.43ms
    75. 27.43ms
    76. 29.59ms
    77. 27.43ms
    78. 27.43ms
    79. 27.43ms
    80. 29.59ms
    81. 27.44ms
    82. 27.43ms
    83. 27.43ms
    84. 27.44ms
    85. 27.45ms
    86. 27.44ms
    87. 27.44ms
    88. 27.44ms
    89. 29.59ms
    90. 27.44ms
    91. 27.44ms
    92. 27.50ms
    93. 27.44ms
    94. 29.58ms
    95. 27.44ms
    96. 35.84ms
    97. 35.84ms
    98. 37.98ms
    99. 37.97ms
    100. 35.84ms
    101. 35.87ms
    102. 35.85ms
    103. 35.85ms
    104. 37.99ms
    105. 35.85ms
    106. 35.85ms
    107. 35.85ms
    108. 37.99ms
    109. 35.85ms
    110. 35.85ms
    111. 38.00ms
    112. 35.85ms
    113. 35.85ms
    114. 38.00ms
    115. 35.85ms
    116. 35.85ms
    117. 35.85ms
    118. 35.87ms
    119. 35.85ms
    120. 35.85ms
    121. 38.00ms
    122. 35.85ms
    123. 35.85ms
    124. 35.85ms
    125. 37.99ms
    126. 35.85ms
    127. 35.85ms
    128. 46.40ms
    Graphics only: 14.24ms (117.80G pixels/s)
    Graphics + compute:
    1. 22.65ms (74.07G pixels/s)
    2. 22.53ms (74.46G pixels/s)
    3. 22.46ms (74.69G pixels/s)
    4. 22.53ms (74.47G pixels/s)
    5. 22.61ms (74.20G pixels/s)
    6. 22.42ms (74.83G pixels/s)
    7. 22.56ms (74.37G pixels/s)
    8. 22.55ms (74.40G pixels/s)
    9. 22.48ms (74.64G pixels/s)
    10. 22.42ms (74.83G pixels/s)
    11. 22.59ms (74.28G pixels/s)
    12. 22.56ms (74.35G pixels/s)
    13. 22.65ms (74.06G pixels/s)
    14. 22.41ms (74.86G pixels/s)
    15. 22.46ms (74.70G pixels/s)
    16. 22.42ms (74.83G pixels/s)
    17. 22.57ms (74.34G pixels/s)
    18. 22.45ms (74.72G pixels/s)
    19. 22.62ms (74.15G pixels/s)
    20. 22.63ms (74.14G pixels/s)
    21. 22.41ms (74.86G pixels/s)
    22. 22.52ms (74.48G pixels/s)
    23. 22.55ms (74.41G pixels/s)
    24. 22.61ms (74.22G pixels/s)
    25. 22.52ms (74.50G pixels/s)
    26. 22.49ms (74.61G pixels/s)
    27. 22.53ms (74.48G pixels/s)
    28. 22.43ms (74.79G pixels/s)
    29. 22.58ms (74.29G pixels/s)
    30. 22.62ms (74.16G pixels/s)
    31. 22.57ms (74.32G pixels/s)
    32. 30.86ms (54.37G pixels/s)
    33. 30.86ms (54.37G pixels/s)
    34. 30.99ms (54.13G pixels/s)
    35. 32.97ms (50.89G pixels/s)
    36. 31.00ms (54.12G pixels/s)
    37. 30.99ms (54.14G pixels/s)
    38. 30.85ms (54.39G pixels/s)
    39. 30.83ms (54.43G pixels/s)
    40. 31.11ms (53.93G pixels/s)
    41. 30.95ms (54.21G pixels/s)
    42. 30.89ms (54.31G pixels/s)
    43. 31.02ms (54.09G pixels/s)
    44. 30.92ms (54.25G pixels/s)
    45. 30.96ms (54.19G pixels/s)
    46. 30.91ms (54.28G pixels/s)
    47. 33.06ms (50.75G pixels/s)
    48. 31.01ms (54.11G pixels/s)
    49. 30.92ms (54.25G pixels/s)
    50. 31.01ms (54.11G pixels/s)
    51. 30.87ms (54.35G pixels/s)
    52. 30.90ms (54.30G pixels/s)
    53. 30.89ms (54.31G pixels/s)
    54. 31.00ms (54.12G pixels/s)
    55. 30.86ms (54.36G pixels/s)
    56. 30.99ms (54.14G pixels/s)
    57. 30.91ms (54.28G pixels/s)
    58. 31.03ms (54.06G pixels/s)
    59. 30.99ms (54.13G pixels/s)
    60. 30.89ms (54.31G pixels/s)
    61. 31.01ms (54.10G pixels/s)
    62. 31.00ms (54.12G pixels/s)
    63. 32.93ms (50.94G pixels/s)
    64. 39.36ms (42.63G pixels/s)
    65. 39.26ms (42.73G pixels/s)
    66. 39.40ms (42.58G pixels/s)
    67. 39.35ms (42.64G pixels/s)
    68. 39.36ms (42.62G pixels/s)
    69. 39.26ms (42.73G pixels/s)
    70. 39.22ms (42.77G pixels/s)
    71. 39.35ms (42.64G pixels/s)
    72. 39.30ms (42.69G pixels/s)
    73. 39.18ms (42.82G pixels/s)
    74. 39.23ms (42.76G pixels/s)
    75. 41.36ms (40.56G pixels/s)
    76. 39.34ms (42.64G pixels/s)
    77. 39.39ms (42.59G pixels/s)
    78. 39.55ms (42.42G pixels/s)
    79. 39.34ms (42.64G pixels/s)
    80. 39.38ms (42.61G pixels/s)
    81. 39.37ms (42.62G pixels/s)
    82. 41.31ms (40.61G pixels/s)
    83. 39.38ms (42.60G pixels/s)
    84. 39.41ms (42.57G pixels/s)
    85. 39.34ms (42.64G pixels/s)
    86. 39.36ms (42.62G pixels/s)
    87. 39.34ms (42.64G pixels/s)
    88. 39.46ms (42.52G pixels/s)
    89. 39.18ms (42.82G pixels/s)
    90. 39.22ms (42.77G pixels/s)
    91. 41.43ms (40.50G pixels/s)
    92. 39.38ms (42.60G pixels/s)
    93. 39.33ms (42.65G pixels/s)
    94. 39.36ms (42.62G pixels/s)
    95. 39.23ms (42.77G pixels/s)
    96. 47.59ms (35.25G pixels/s)
    97. 47.69ms (35.18G pixels/s)
    98. 47.77ms (35.12G pixels/s)
    99. 47.82ms (35.08G pixels/s)
    100. 49.79ms (33.70G pixels/s)
    101. 47.82ms (35.09G pixels/s)
    102. 47.85ms (35.06G pixels/s)
    103. 47.80ms (35.10G pixels/s)
    104. 47.78ms (35.12G pixels/s)
    105. 49.93ms (33.60G pixels/s)
    106. 47.68ms (35.18G pixels/s)
    107. 49.93ms (33.60G pixels/s)
    108. 47.74ms (35.14G pixels/s)
    109. 47.73ms (35.15G pixels/s)
    110. 47.72ms (35.16G pixels/s)
    111. 47.81ms (35.09G pixels/s)
    112. 47.69ms (35.18G pixels/s)
    113. 47.75ms (35.14G pixels/s)
    114. 47.72ms (35.16G pixels/s)
    115. 49.98ms (33.57G pixels/s)
    116. 47.80ms (35.10G pixels/s)
    117. 49.90ms (33.62G pixels/s)
    118. 47.80ms (35.10G pixels/s)
    119. 47.76ms (35.13G pixels/s)
    120. 47.78ms (35.11G pixels/s)
    121. 47.72ms (35.15G pixels/s)
    122. 47.78ms (35.11G pixels/s)
    123. 49.81ms (33.68G pixels/s)
    124. 47.76ms (35.12G pixels/s)
    125. 47.83ms (35.07G pixels/s)
    126. 47.68ms (35.19G pixels/s)
    127. 47.77ms (35.12G pixels/s)
    128. 58.32ms (28.77G pixels/s)
     
  4. pharma

    Veteran

    Joined:
    Mar 29, 2004
    Messages:
    4,894
    Likes Received:
    4,549
    Think you need to run the updated version.
    https://forum.beyond3d.com/posts/1869354/
     
  5. Mindtaker

    Newcomer

    Joined:
    Mar 31, 2015
    Messages:
    16
    Likes Received:
    30
    Here's mine: GTX 980, 355.84 drivers.
     

    Attached Files:

  6. pharma

    Veteran

    Joined:
    Mar 29, 2004
    Messages:
    4,894
    Likes Received:
    4,549
    Newer developer drivers ...
     
  7. Sinistar

    Sinistar I LIVE
    Regular Subscriber

    Joined:
    Aug 11, 2004
    Messages:
    660
    Likes Received:
    74
    Location:
    Indiana
    I saw some people running the earlier test, so I went back and tried it.

    I get 52 ms through the whole test.

    Edit: will rerun, I forgot I was running "Log.cmd" at the same time.
    Edit 2: it came out the same.
     

    Attached Files:

    #487 Sinistar, Sep 4, 2015
    Last edited: Sep 4, 2015
  8. Mindtaker

    Newcomer

    Joined:
    Mar 31, 2015
    Messages:
    16
    Likes Received:
    30
    Yes, to compare
     
    Razor1 and pharma like this.
  9. Darius

    Newcomer

    Joined:
    Sep 27, 2013
    Messages:
    37
    Likes Received:
    30
    OK, so I recorded a session in WPA and opened it in GPUView, and if I'm interpreting it correctly, it may be doing a little async on my 980 Ti? It's hard to tell. It would be very helpful if someone with a Radeon could do this as well so we can compare.
     
  10. Sinistar

    Sinistar I LIVE
    Regular Subscriber

    Joined:
    Aug 11, 2004
    Messages:
    660
    Likes Received:
    74
    Location:
    Indiana
    I recorded the older test. Well, I ran the newer one first, but the merged file was around 10 GB.
     
  11. HyperMatrix

    Joined:
    Sep 4, 2015
    Messages:
    3
    Likes Received:
    0

    Attached Files:

  12. Sinistar

    Sinistar I LIVE
    Regular Subscriber

    Joined:
    Aug 11, 2004
    Messages:
    660
    Likes Received:
    74
    Location:
    Indiana
    GPUView zoomed in on the section with graphics and compute, older version.
    [IMG]
     
    Jawed, Fantasma, Razor1 and 1 other person like this.
  13. pharma

    Veteran

    Joined:
    Mar 29, 2004
    Messages:
    4,894
    Likes Received:
    4,549
    I think turning TDR (Timeout Detection and Recovery) off eliminates the crash. Check out some previous posts on this page regarding TDR.
     
  14. Darius

    Newcomer

    Joined:
    Sep 27, 2013
    Messages:
    37
    Likes Received:
    30
    Excellent, I'm compiling mine now. Can you post a zoom in on the other 3 conditions, that way we have everything?
     
    pharma likes this.
  15. HyperMatrix

    Joined:
    Sep 4, 2015
    Messages:
    3
    Likes Received:
    0
    Yeah I read that afterwards, but I figured 483 was close enough, and did a good enough job showing how useless Maxwell 2 is in that metric. :p
     
  16. Nub

    Nub
    Newcomer

    Joined:
    Sep 1, 2015
    Messages:
    10
    Likes Received:
    18
    Hmm, I've added the "time saved by async" data into the scatter plot mode, in orange, as can be seen here:

    [IMG]

    I don't think a comparison feature is useful in this context. The test tool was never meant to be a benchmarking tool, so comparing numbers to numbers of different cards is pointless and misleading and may just start more pissing contests.

    As long as you can see whether the orange values are mostly at 0 or tend to stay close to the blue line, that's enough information for our purpose.

    As for plotting multiple charts side by side... it's a web page so just open multiple windows :wink:
     
    pharma likes this.
  17. Nub

    Nub
    Newcomer

    Joined:
    Sep 1, 2015
    Messages:
    10
    Likes Received:
    18
    Correction: blue or red line, whichever is lower.
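
To make the reading of the plot concrete, here is a minimal sketch of how a "time saved by async" value could be derived. It assumes the blue and red lines are the standalone compute and graphics timings, and that the orange value is the serial estimate minus the measured combined time; the tool's actual formula is not shown in this thread, and the function names are hypothetical. The sample numbers are three rows from HyperMatrix's results earlier on this page.

```python
def time_saved_by_async(graphics_ms, compute_ms, combined_ms):
    """Serial estimate (graphics + compute) minus the measured combined time.

    Flooring at zero is a choice made for this sketch, not something the
    thread confirms about the plotting tool.
    """
    return max(0.0, graphics_ms + compute_ms - combined_ms)


def best_case_saving(graphics_ms, compute_ms):
    """With perfect overlap the shorter workload is hidden entirely."""
    return min(graphics_ms, compute_ms)


GRAPHICS_ONLY_MS = 14.24
# (kernel count, compute-only ms, graphics + compute ms)
samples = [(1, 9.76, 22.65), (32, 16.92, 30.86), (128, 46.40, 58.32)]

for kernels, compute_ms, combined_ms in samples:
    saved = time_saved_by_async(GRAPHICS_ONLY_MS, compute_ms, combined_ms)
    best = best_case_saving(GRAPHICS_ONLY_MS, compute_ms)
    print(f"{kernels:>3} kernels: saved {saved:.2f} ms of a possible {best:.2f} ms")
```

A saved value near zero means the two workloads ran back to back; a value near the best case means the shorter one was almost fully hidden behind the longer one.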
     
  18. Sinistar

    Sinistar I LIVE
    Regular Subscriber

    Joined:
    Aug 11, 2004
    Messages:
    660
    Likes Received:
    74
    Location:
    Indiana
    The older test only has 2 sections.
    Here is the compute-only section.
    [IMG]
     
    drSeehas, Fantasma and pharma like this.
  19. Forceman

    Newcomer

    Joined:
    Dec 23, 2010
    Messages:
    11
    Likes Received:
    10
    Am I reading this wrong? It looks like a lot of Maxwell cards save a lot of time using async, while the Furys spend a lot of time saving virtually nothing. And then the 280X saves more time than anything?
     
  20. Darius

    Newcomer

    Joined:
    Sep 27, 2013
    Messages:
    37
    Likes Received:
    30
    Capture2.PNG

    OK, so a very different picture on the 980 Ti. The way the graph works is that only the block on the bottom is being actively worked on; any blocks stacked on top are queued. Blocks of the same color are the same work.

    So in the device context you can see the two separate jobs: the top, longer one is the compute from the test app, and the one below it is the graphics load from the test app. Now here's the weird thing: both green blocks are pushed into the 3D queue, and they don't run asynchronously at all. The graphics runs first, then the compute. So what's that brown stuff in the compute queue? It's DWM.exe, the desktop compositor, completely separate from the test. I'm no graphics programmer, but doesn't that seem backwards? The "compute" job is going in the 3D queue, but the DWM, which I presume would be more graphics related, is going in the compute queue. The DWM corresponds with the flip, so looking at the Fury graph, it's probably the light blue block that also corresponds with the flip, and it's in the 3D queue where I would assume it belongs.

    So taking a closer look at a section where only the compute job is running:

    Captur3.PNG


    The asyncompute.exe compute job comes in 4 separate bursts. The brown DWM compute spike overlaps with the first of a new batch of four. The last three are precisely 36ms each. The first is a little longer at 45ms, and the DWM compute spike is 15ms. If they had run serially, the first burst should have taken 36 + 15 = 51ms. And this pattern is repeatable: the sum is always larger than the actual run time. So it looks like there may be some asynchronous behavior here, but not with the work we expected?
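
The arithmetic above can be written out as a small sketch. It uses only the figures read off the GPUView capture (36ms per ordinary burst, 15ms for the DWM spike, 45ms observed for the first burst); the function name is hypothetical, and treating the gap as overlap assumes nothing else shortened the first burst.

```python
def overlapped_ms(burst_ms, spike_ms, observed_ms):
    """Time hidden by overlap: serial estimate minus the observed duration."""
    serial_ms = burst_ms + spike_ms
    return serial_ms - observed_ms


BURST_MS = 36      # each of the last three bursts
DWM_SPIKE_MS = 15  # DWM work in the compute queue
OBSERVED_MS = 45   # first burst, which coincides with the DWM spike

hidden = overlapped_ms(BURST_MS, DWM_SPIKE_MS, OBSERVED_MS)
share = hidden / DWM_SPIKE_MS
print(f"{hidden} ms hidden, {share:.0%} of the DWM spike")
```

On these numbers only part of the DWM spike is hidden, which fits the reading that there is some concurrency with the compute queue but not full overlap.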

    Maybe someone else can explain why the compute test isn't going into the compute queue. But the thing that IS going into that queue, unless I'm reading this entirely incorrectly, is running asynchronously.
     
    drSeehas, Razor1, pharma and 2 others like this.