DX12 Performance Discussion And Analysis Thread

Discussion in 'Rendering Technology and APIs' started by A1xLLcqAgt0qc2RyMz0y, Jul 29, 2015.

  1. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,503
    Likes Received:
    420
    Location:
    Varna, Bulgaria
    I can't get the latency reading below 12ms. :(
     
  2. huebie

    Newcomer

    Joined:
    Apr 10, 2012
    Messages:
    29
    Likes Received:
    5
    Did I misinterpret something, or is the status quo that AC (async compute) works fine on Maxwell? What has changed so far? I have lost track of this discussion, so a short summary would be very kind. :)
     
  3. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    15,476
    Likes Received:
    2,662
    Hope this hasn't already been posted.
     
    vLaDv, Razor1 and Lightman like this.
  4. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,733
    Likes Received:
    2,563
    Location:
    Finland
    Apparently the current situation is "yes and it's a disaster".
    Yes: it can run concurrent graphics and compute tasks.
    It's a disaster: the GPU resources are partitioned statically. Say you split them 50/50 between graphics and compute, and the graphics task finishes before the compute task: 50% of your resources then sit idle, waiting for the compute to finish. You can't change the split on the fly; repartitioning requires an expensive context switch.
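    A rough sketch of the arithmetic behind that complaint, not a model of any real scheduler. It assumes a task's run time scales linearly with the share of the GPU it gets; the function name and the workload numbers are made up for illustration.

```python
def static_partition_idle(graphics_work, compute_work, graphics_share=0.5):
    """Return (total_ms, idle_fraction) for a fixed graphics/compute split.

    Work is given in "full-GPU milliseconds": a task of size W takes
    W / share ms when it only gets `share` of the GPU (assumed linear scaling).
    """
    compute_share = 1.0 - graphics_share
    t_graphics = graphics_work / graphics_share
    t_compute = compute_work / compute_share
    total = max(t_graphics, t_compute)
    # Resource-time that stayed reserved for a task that had already finished.
    idle = (graphics_share * (total - t_graphics)
            + compute_share * (total - t_compute))
    return total, idle / total


# Hypothetical workload: 2 ms of graphics and 6 ms of compute on a 50/50 split.
total_ms, idle_fraction = static_partition_idle(graphics_work=2.0, compute_work=6.0)
# With perfect on-the-fly rebalancing the same work would fit in 2 + 6 = 8 ms.
ideal_ms = 2.0 + 6.0
print(total_ms, idle_fraction, ideal_ms)
```

    In this made-up case the fixed split takes 12 ms instead of 8 ms, and a third of the GPU's resource-time is spent idle.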
     
  5. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,293
    Location:
    Helsinki, Finland
    Could you also post your "Latency (compute starts 10ms after graphics)" results? I am really interested in that particular case on Intel GPUs.
     
  6. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,875
    Likes Received:
    2,181
    Location:
    Germany
    Here's the HD 530 (default clocks, but DDR4-3000) in an overclocked i7-6700K.
     

    Attached Files:

  7. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,875
    Likes Received:
    2,181
    Location:
    Germany
    And since I know no one else will bother, here's GTX 580. Yep, _5_80.
    BTW, MDolenc: does your program require 64-bit Windows or a certain amount of dedicated video memory? It does not run on my cheap-ass tablet with an Atom Z3735G and 32-bit (x86) Windows 10.
     

    Attached Files:

    pharma and fellix like this.
  8. MDolenc

    Regular

    Joined:
    May 26, 2002
    Messages:
    690
    Likes Received:
    425
    Location:
    Slovenia
    Yes, it's 64-bit, and it requires at least 128 MB of video memory. Let me know if you want to check that one out. :) But it would probably be better to shorten the graphics part for that one; it takes half a second per run even on not-so-slow integrated Intel GPUs.
     
  9. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,875
    Likes Received:
    2,181
    Location:
    Germany
    Na, it's ok. No one's really interested in that 4-EU-crap anyway I guess. It displays the browser window alright and shows some YT vids.
     
    Lightman likes this.
  10. Alessio1989

    Regular Newcomer

    Joined:
    Jun 6, 2015
    Messages:
    591
    Likes Received:
    300
    It would be interesting to see how the high priority queue will react on GCN Gen 4 GPUs: both latency and total rendering time. PS: NDA for RX 480 ends tomorrow :D
     
    CSI PC and Lightman like this.
  11. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    365
    Likes Received:
    319
    While we're at it, Fiji should be tested again as well, and the 480 results then compared against Fury with a recent driver.
     
  12. Dygaza

    Newcomer

    Joined:
    Aug 27, 2015
    Messages:
    40
    Likes Received:
    39
    Here are stock Fury X results for comparison, as requested. There is a lot of variation in the latency test. I ran it twice and both runs were the same (browsers closed).
     

    Attached Files:

    fellix likes this.
  13. lanek

    Veteran

    Joined:
    Mar 7, 2012
    Messages:
    2,469
    Likes Received:
    315
    Location:
    Switzerland
    AMD HD 7970 (1050 MHz). Not quite sure about the results, as I have two installed (but CFX was disabled).
     

    Attached Files:

  14. Kaarlisk

    Regular Newcomer Subscriber

    Joined:
    Mar 22, 2010
    Messages:
    293
    Likes Received:
    49
    Oops. That was not intentional.
     

    Attached Files:

  15. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,293
    Location:
    Helsinki, Finland
    600+ milliseconds of waiting = no high prio queue support at all (queue it after the GPU is idle). Ouch!

    High priority queues are definitely NOT working properly on Intel GPUs. Intel has UMA and shared caches making low latency GPGPU a perfect use case. Too bad the latency completely tanks when the GPU is rendering at the same time.

    Of course there's also a possibility to use the Intel iGPU solely for gameplay GPGPU tasks and discrete GPU solely for rendering. DX12 explicit multiadapter makes this possible. But if you are using the same shared scene data structures in the GPGPU code and in the rendering code, you need to duplicate them and maintain the state of both (copy modifications between the memory pools). Makes things complicated. And even this doesn't solve the case where the consumer only has an Intel iGPU or if he/she has a 6+ core Xeon/i7 (no iGPU, only discrete).

    It seems that high prio compute queues are NOT yet ready for shipping games. AMD has been ready since 2011. Nvidia is now ready with Maxwell and Pascal (Maxwell suffers some penalty, but gets the job done). Hopefully Intel could fix their high prio queues with a new driver (to match Maxwell's functionality). People seem to be blinded by concurrent execution. It is solely a GPU performance gain. High priority queues on the other hand enable games to offload game logic to the GPU, allowing completely new gameplay. Some modern console games do this already, making it hard to port them to PC.
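    To make the "600+ milliseconds of waiting" reading concrete, here is a toy latency model, offered only as a sketch. It assumes a single long graphics workload, a compute job submitted partway through it, and a hypothetical preemption granularity parameter; none of these names or numbers come from the benchmark itself or from any driver.

```python
import math


def compute_latency(graphics_ms, submit_ms, compute_ms, preempt_granularity_ms=None):
    """Time from submitting a high-priority compute job until it completes.

    preempt_granularity_ms=None models a GPU that ignores queue priority:
    the job only starts once the graphics work has finished.
    Otherwise the job may start at the next multiple of the granularity
    (a stand-in for e.g. a draw-call boundary; purely an assumption).
    """
    if submit_ms >= graphics_ms:
        start = submit_ms  # GPU is already idle
    elif preempt_granularity_ms is None:
        start = graphics_ms  # queued until the GPU goes idle
    else:
        boundary = math.ceil(submit_ms / preempt_granularity_ms) * preempt_granularity_ms
        start = min(boundary, graphics_ms)
    return start - submit_ms + compute_ms


# Hypothetical numbers: 600 ms of graphics, compute submitted 10 ms in, 1 ms of compute.
print(compute_latency(600, 10, 1))                            # no priority support
print(compute_latency(600, 10, 1, preempt_granularity_ms=4))  # preempt at 4 ms boundaries
```

    With these made-up inputs the first case waits out nearly the whole graphics run, while the second finishes a few milliseconds after submission, which is the gap between "queue it after the GPU is idle" and a working high priority queue.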
     
    #1455 sebbbi, Jun 29, 2016
    Last edited: Jun 29, 2016
    Lightman and chris1515 like this.
  16. Alessio1989

    Regular Newcomer

    Joined:
    Jun 6, 2015
    Messages:
    591
    Likes Received:
    300
    GCN Gen 4 should provide better support for high priority compute queues. I guess the same feature will be in the next iteration of the Microsoft and Sony consoles (though I hope for a new rasterizer too!). It will be very interesting to see how important such a feature becomes. But yes, we are still far from having three priority options on engine queues (D3D12 actually exposes only two priority values, normal and high; a low/background priority is missing).
     
    Lightman likes this.
  17. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    365
    Likes Received:
    319
    In fact, that part is only software. GCN3, especially Fiji, *used to* have worse support than it does now. With a recent driver, Fiji uses an entirely different MEC firmware, feature-wise very similar to what is known about Polaris. I'm not exactly sure about Tonga; I can't remember whether Tonga already had sufficient memory for the full MEC firmware or only for a cut-down version.
     
    #1457 Ext3h, Jun 29, 2016
    Last edited: Jun 29, 2016
  18. lanek

    Veteran

    Joined:
    Mar 7, 2012
    Messages:
    2,469
    Likes Received:
    315
    Location:
    Switzerland
    I see that on Polaris the HWS feature can be updated via microcode, but I imagine that was already the case before.
     
  19. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,875
    Likes Received:
    2,181
    Location:
    Germany
  20. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    365
    Likes Received:
    319
    It was. However, the maximum possible size of the microcode differs significantly. Only since Fiji has there been sufficient memory available to pack all the desired functionality into a single firmware image.
     