HAGS: Hardware Accelerated GPU Scheduling *Newt Thread*

Discussion in 'Architecture and Products' started by iroboto, Jun 24, 2020.

  1. pharma

    Veteran Regular

    Joined:
    Mar 29, 2004
    Messages:
    3,535
    Likes Received:
    2,220
    The fellow (PhazDelta) who tested Hardware Scheduling in RDR2 stated he had other issues with Win 2004 and rolled back to Win 1909.
    That fixed his issues (slowdown in GPU usage), and he decided to rerun RDR2 to see if there were any changes with HAGS. An Nvidia customer support rep noticed his posts and asked him to file a bug report.

    Driver 451.48 on Win 1909, Vulkan API, GTX 1080:
    [IMG]
     
  2. Malo

    Malo Yak Mechanicum
    Legend Veteran Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    7,618
    Likes Received:
    3,682
    Location:
    Pennsylvania
    @BRiT can we split off the HAGS discussion? It probably deserves its own thread, with an obligatory reference to hideous witches?
     
    BRiT and pharma like this.
  3. Malo

    Malo Yak Mechanicum
    Legend Veteran Subscriber

    I'm confused here. Isn't 2004 required for HAGS?
     
  4. pharma

    Veteran Regular

    I'm not sure, I just know the drivers did not enable the feature.

    I will ask PhazDelta to complete his 1909 test and run it w/o HAGS.
     
  5. pharma

    Veteran Regular

    Here are some tests done with an i5-4570 and a GTX 1060 on W10 2004.
    It's basically an apples-to-oranges comparison since he's using two different drivers, but the frametime chart comparison is interesting: it seems less "noisy" with HAGS.

    PC specs:
    GPU: MSI Gaming X 1060 6GB
    CPU: i5-4570
    RAM: 8GB
    PSU: Antec 500W Platinum
    MoBo: Acer

     
  6. pharma

    Veteran Regular

    Yes, 2004 is required. PhazDelta responded and said the 1909 test was w/o HAGS since the option was not available.
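
For reference, the OS-side requirement can be written down as a simple build-number check. This is only a sketch: the build numbers are the public ones for Windows 10 1909 (18363) and 2004 (19041), the function name is made up for illustration, and it says nothing about whether the driver or GPU actually expose the feature.

```python
# Build numbers for the two Windows 10 releases discussed in the thread.
WIN10_1909_BUILD = 18363
WIN10_2004_BUILD = 19041  # first release that exposes the HAGS toggle


def os_can_offer_hags(build_number: int) -> bool:
    """Return True if the OS build is new enough to expose HAGS.

    Hypothetical helper: it only covers the OS side. A driver that
    supports the feature and a suitable GPU are still needed, and this
    sketch does not check for either.
    """
    return build_number >= WIN10_2004_BUILD


if __name__ == "__main__":
    for name, build in (("1909", WIN10_1909_BUILD), ("2004", WIN10_2004_BUILD)):
        print(f"Windows 10 {name} (build {build}): "
              f"HAGS possible = {os_can_offer_hags(build)}")
```

That matches the result above: on 1909 the option simply is not there, whatever driver is installed.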
     
    Malo likes this.
  7. T2098

    Joined:
    Jun 15, 2020
    Messages:
    4
    Likes Received:
    11
    The cards with limited VRAM look to be the real winners here. I wish I hadn't given away my 1060 3GB, or I'd be running some tests of my own, particularly in VR, where I expect to see a massive performance increase.



    The above video shows up to 3x (!!!) performance increases at certain resolutions in GTAV with the 1060 3GB.
     
  8. T2098

    From watching the video again, it's not quite as dramatic a change: with HAGS on there are still situations where the framerate drops from ~30-40fps to ~10fps, but there do seem to be fewer of them, and in a few specific scenes the HAGS-on scenario eliminates those sustained drops entirely.

    Looks like HAGS just does a better job of managing VRAM usage if it's full up.
     
  9. Rootax

    Veteran Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    1,464
    Likes Received:
    831
    Location:
    France
    Will AMD cards with HBCC benefit from that too, since AFAIK, when you enable HBCC, more VRAM management is left to the GPU/driver?
     
    Per Lindstrom likes this.
  10. Per Lindstrom

    Newcomer Subscriber

    Joined:
    Oct 16, 2018
    Messages:
    33
    Likes Received:
    29
    Lightman, Malo and BRiT like this.
  11. BRiT

    BRiT Verified (╯°□°)╯
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    15,537
    Likes Received:
    14,090
    Location:
    Cleveland
    Any chance of seeing HAGS with embedded GPUs?
     
  12. Remij

    Newcomer

    Joined:
    May 3, 2008
    Messages:
    99
    Likes Received:
    120
    New DX Dev Blog update regarding Hardware Accelerated GPU Scheduling

    https://devblogs.microsoft.com/directx/hardware-accelerated-gpu-scheduling/

     
  13. OlegSH

    Regular Newcomer

    Joined:
    Jan 10, 2010
    Messages:
    388
    Likes Received:
    331
    8th page of this doc:
    https://developer.nvidia.com/sites/.../GDC16/GDC16_gthomas_adunn_Practical_DX12.pdf

    Command Lists #2
    Each ‘ExecuteCommandLists’ has a fixed CPU overhead
    Underneath this call triggers a flush
    So batch up command lists.
    Try to put at least 200 μs of GPU work in each ‘ExecuteCommandLists’, preferably 500μs
    Submit enough work to hide OS scheduling latency
    Small calls to ‘ExecuteCommandLists’ complete faster than the OS scheduler can submit new ones


    So HW scheduling should, in theory, minimize the OS scheduling overhead for small ECL calls.
    I wonder how this feature interacts with the recent low-latency features of the NVIDIA and AMD drivers.
    The new ultra low latency modes minimize the CPU queue size and have some perf overhead; I wonder whether HW GPU scheduling can fix that while still keeping latency low.
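
The batching advice from the slide quoted above can be sketched roughly like this. It is a toy model, not D3D12 code: the per-command-list GPU time estimates are assumed to come from somewhere (for example, timings from a previous frame), the function name is made up, and each returned batch stands for one 'ExecuteCommandLists' call.

```python
from typing import List

MIN_BATCH_US = 200.0  # lower bound suggested on the slide; 500 us is preferred


def batch_command_lists(estimates_us: List[float],
                        min_batch_us: float = MIN_BATCH_US) -> List[List[int]]:
    """Group command-list indices, in order, into submission batches.

    A batch is closed as soon as its estimated GPU time reaches
    min_batch_us. Leftover work at the end is merged into the previous
    batch so that no tiny trailing submission is produced (unless it is
    the only batch).
    """
    batches: List[List[int]] = []
    current: List[int] = []
    current_us = 0.0
    for index, cost in enumerate(estimates_us):
        current.append(index)
        current_us += cost
        if current_us >= min_batch_us:
            batches.append(current)
            current, current_us = [], 0.0
    if current:
        if batches:
            batches[-1].extend(current)
        else:
            batches.append(current)
    return batches


if __name__ == "__main__":
    # Six command lists with estimated GPU times in microseconds.
    print(batch_command_lists([50, 80, 90, 300, 20, 30]))
```

With those example numbers, six small submissions collapse into two, which is the kind of application-side workaround that hardware scheduling would make less necessary if it really does cut the per-submission cost.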
     
    pjbliverpool and Dictator like this.
  14. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    384
    Likes Received:
    389
    Is there any indication that HW scheduling is even expected to reduce fixed submission costs, or is this effectively "only" permitting semaphores and the merging of independent contexts / queues to be shifted from the OS scheduler to the lower end of the driver / hardware?

    In the latter case it's still expected to get rid of pipeline stalls introduced by bad scheduling choices on semaphores on the OS side. E.g. when you have a tight, bidirectional dependency between queues, you end up with a lot of latency on every blocking (unfulfilled) semaphore, because the work has not even *really* been submitted to the device queue yet: semaphores on Windows had been a pure software construct managed by the OS, handled entirely before priority scheduling.
     
  15. pharma

    Veteran Regular

    Hardware Accelerated GPU Scheduling
    June 30, 2020
    https://devblogs.microsoft.com/directx/hardware-accelerated-gpu-scheduling/
     
    Kyyla, T2098, Lightman and 2 others like this.
  16. Per Lindstrom

    Newcomer Subscriber

    From your link:
    "The goal of the first phase of hardware accelerated GPU scheduling is to modernize a fundamental pillar of the graphics subsystem and to set the stage for things to come… but that’s going to be a story for another time."

    Same score in Time Spy, with and without HAGS for me.
    https://www.3dmark.com/compare/spy/12774709/spy/12268374
    I think this technology needs some time to mature before we can really enjoy the benefits.
     
    #36 Per Lindstrom, Jul 1, 2020
    Last edited: Jul 1, 2020
    pharma likes this.
  17. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,081
    Likes Received:
    2,952
    Location:
    Finland
    To my understanding it's not so much about the technology itself as about the way current applications are built for the old system, hiding the latency as mentioned in MS's devblog.
     
  18. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    802
    Likes Received:
    826
    Location:
    55°38′33″ N, 37°28′37″ E
    Anti-lag reduces latency by reducing buffering of commands on the CPU side. Hardware scheduling IMHO goes even further by freeing the CPU from micromanaging the buffers, thus reducing driver overhead.

    They've said it explicitly in the blog post: "the new scheduler reduces the overhead of GPU scheduling".

    The WDDM driver is mostly not reentrant, so each driver call is served on a FIFO (first come, first served) basis. Thus eliminating a realtime-priority driver thread that processes GPU command buffers would reduce processing overhead.


    BTW they've also said that it's only "the first phase of hardware accelerated GPU scheduling is to modernize a fundamental pillar of the graphics subsystem and to set the stage for things to come"... WDDM 3.0?
     
    #38 DmitryKo, Jul 2, 2020
    Last edited: Jul 2, 2020
    pharma and Krteq like this.
  19. DmitryKo

    Regular

    BTW, WDDM 2.7 reports the state of the hardware-accelerated GPU scheduler with three separate on/off caps bits: HwSchSupported, HwSchEnabled and HwSchEnabledByDefault. This potentially allows some bizarre combinations, like 'not supported' together with 'enabled' and 'enabled by default'.

    WDDM 2.9 still reports the enabled/disabled state with the HwSchEnabled bit, but it adds a generic HwSchSupportState field which reports the details of the feature implementation in the driver: not supported (always off), experimental, stable, and always on (feature is required for the driver to work).

    It seems like these generic DXGK_FEATURE_SUPPORT_* states would replace individual version-specific feature caps bits for future revisions of WDDM interfaces (see d3dkmdt.h in the latest Insider Preview SDK 20161).
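
To make the "bizarre combinations" point concrete, here is a small sketch that enumerates the three WDDM 2.7 bits. The class is a local stand-in, not the real SDK structure, and the consistency rule (the 'enabled' bits only make sense when 'supported' is set) is my own assumption rather than anything the headers state.

```python
from itertools import product
from typing import List, NamedTuple


class HwSchCaps(NamedTuple):
    """Local stand-in for the three WDDM 2.7 bits named in the post."""
    supported: bool           # HwSchSupported
    enabled: bool             # HwSchEnabled
    enabled_by_default: bool  # HwSchEnabledByDefault


def is_consistent(caps: HwSchCaps) -> bool:
    """Assumed rule: 'enabled' and 'enabled by default' require 'supported'."""
    return caps.supported or not (caps.enabled or caps.enabled_by_default)


def inconsistent_combinations() -> List[HwSchCaps]:
    """Enumerate all 8 bit patterns and return those breaking the assumed rule."""
    all_caps = (HwSchCaps(*bits) for bits in product((False, True), repeat=3))
    return [caps for caps in all_caps if not is_consistent(caps)]


if __name__ == "__main__":
    for caps in inconsistent_combinations():
        print(caps)
```

Under that assumption, 3 of the 8 possible patterns are meaningless, which is presumably why a single support-state field is the cleaner design.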
     
    #39 DmitryKo, Jul 2, 2020
    Last edited: Jul 2, 2020
    Dictator, pharma and Krteq like this.
  20. eastmen

    Legend Subscriber

    Joined:
    Mar 17, 2008
    Messages:
    10,887
    Likes Received:
    2,062
    It's a big focus for Windows 10X and why it was delayed. They are trying to modernize Windows so it will work better on lower-end processors. I got to try a Neo with 10X and with regular 10. The responsiveness of the OS was much improved. Sadly, 10X seems delayed into the new year.
     
    ethernity likes this.