Direct3D feature levels discussion

Discussion in 'Rendering Technology and APIs' started by DmitryKo, Feb 20, 2015.

  1. Per Lindstrom

    Newcomer Subscriber

    Joined:
    Oct 16, 2018
    Messages:
    51
    Likes Received:
    51
    Location:
    Sweden
    I got "DISPATCH_BOUNDARY (1)".
    Could it be that your Hardware-accelerated scheduler is disabled?
    Code:
    Windows 10 version 2004 (build 19041.329 vb_release) x64
    
    ADAPTER 0
    "AMD Radeon RX 5700 XT"
    VEN_1002, DEV_731F, SUBSYS_E4111DA2, REV_C1
    Dedicated video memory : 8148.5 MB (8544296960 bytes)
    Total video memory : 40883.3 MB (42869293056 bytes)
    Video driver version : 27.20.1017.4017
    WDDM version : KMT_DRIVERVERSION_WDDM_2_7 (2700)
    Hardware-accelerated scheduler : Enabled, supported
    GraphicsPreemptionGranularity : DXGI_GRAPHICS_PREEMPTION_PRIMITIVE_BOUNDARY (1)
    ComputePreemptionGranularity : DXGI_COMPUTE_PREEMPTION_DISPATCH_BOUNDARY (1)
    Maximum feature level : D3D_FEATURE_LEVEL_12_1 (0xc100)
    DoublePrecisionFloatShaderOps : 1
    OutputMergerLogicOp : 1
    MinPrecisionSupport : D3D12_SHADER_MIN_PRECISION_SUPPORT_16_BIT (2) (0b0000'0010)
    TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_3 (3)
    ResourceBindingTier : D3D12_RESOURCE_BINDING_TIER_3 (3)
    PSSpecifiedStencilRefSupported : 1
    TypedUAVLoadAdditionalFormats : 1
    ROVsSupported : 1
    ConservativeRasterizationTier : D3D12_CONSERVATIVE_RASTERIZATION_TIER_3 (3)
    StandardSwizzle64KBSupported : 0
    CrossNodeSharingTier : D3D12_CROSS_NODE_SHARING_TIER_NOT_SUPPORTED (0)
    CrossAdapterRowMajorTextureSupported : 0
    VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation : 1
    ResourceHeapTier : D3D12_RESOURCE_HEAP_TIER_2 (2)
    MaxGPUVirtualAddressBitsPerResource : 44
    MaxGPUVirtualAddressBitsPerProcess : 44
    Adapter Node 0:         TileBasedRenderer: 0, UMA: 0, CacheCoherentUMA: 0, IsolatedMMU: 1, HeapSerializationTier: 0, ProtectedResourceSession.Support: 1, ProtectedResourceSessionTypeCount: 1 D3D12_PROTECTED_RESOURCES_SESSION_HARDWARE_PROTECTED
    HighestShaderModel : D3D12_SHADER_MODEL_6_5 (0x0065)
    WaveOps : 1
    WaveLaneCountMin : 32
    WaveLaneCountMax : 64
    TotalLaneCount : 2560
    ExpandedComputeResourceStates : 1
    Int64ShaderOps : 1
    RootSignature.HighestVersion : D3D_ROOT_SIGNATURE_VERSION_1_1 (2)
    DepthBoundsTestSupported : 1
    ProgrammableSamplePositionsTier : D3D12_PROGRAMMABLE_SAMPLE_POSITIONS_TIER_2 (2)
    ShaderCache.SupportFlags : D3D12_SHADER_CACHE_SUPPORT_SINGLE_PSO | LIBRARY | AUTOMATIC_INPROC_CACHE | AUTOMATIC_DISK_CACHE (15) (0b0000'1111)
    CopyQueueTimestampQueriesSupported : 1
    CastingFullyTypedFormatSupported : 1
    WriteBufferImmediateSupportFlags : D3D12_COMMAND_LIST_SUPPORT_FLAG_DIRECT | BUNDLE | COMPUTE | COPY (15) (0b0000'1111)
    ViewInstancingTier : D3D12_VIEW_INSTANCING_TIER_1 (1)
    BarycentricsSupported : 0
    ExistingHeaps.Supported : 1
    MSAA64KBAlignedTextureSupported : 1
    SharedResourceCompatibilityTier : D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER_2 (2)
    Native16BitShaderOpsSupported : 1
    AtomicShaderInstructions : 0
    SRVOnlyTiledResourceTier3 : 1
    RenderPassesTier : D3D12_RENDER_PASS_TIER_0 (0)
    RaytracingTier : D3D12_RAYTRACING_TIER_NOT_SUPPORTED (0)
    AdditionalShadingRatesSupported : 0
    PerPrimitiveShadingRateSupportedWithViewportIndexing : 0
    VariableShadingRateTier : D3D12_VARIABLE_SHADING_RATE_TIER_NOT_SUPPORTED (0)
    ShadingRateImageTileSize : 0
    BackgroundProcessingSupported : 0
    MeshShaderTier : D3D12_MESH_SHADER_TIER_NOT_SUPPORTED (0)
    SamplerFeedbackTier : D3D12_SAMPLER_FEEDBACK_TIER_NOT_SUPPORTED (0)
    DirectML maximum feature level : DML_FEATURE_LEVEL_2_0 (0x2000)
    Metacommands enumerated : 4
    Metacommands [parameters per stage]: Conv (Convolution) [84][1][6], Conv (Convolution) [108][5][6], GEMM (General matrix multiply) [67][1][6], GEMM (General matrix multiply) [91][5][6]
    
     
    #1081 Per Lindstrom, Jul 8, 2020
    Last edited: Jul 8, 2020
  2. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    881
    Likes Received:
    1,022
    Location:
    55°38′33″ N, 37°28′37″ E
    I've tweaked the output for the WDDM 2.7 path to report hardware-accelerated scheduling as 'Enabled by default', 'Enabled', 'Disabled', or 'Not supported', so 'Enabled' / 'Disabled' now imply the feature is supported by the driver, whereas 'Not supported' means there is no driver support.
    Please download the tool again.
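
    The four labels described above can be sketched as a small mapping. This is only an illustration, not the tool's actual source: the function name `hags_status` and its three boolean inputs are hypothetical, loosely modelled on the supported / enabled / enabled-by-default bits that a WDDM 2.7 driver capability query is assumed to return.

```python
def hags_status(supported: bool, enabled: bool, enabled_by_default: bool = False) -> str:
    """Map hypothetical driver capability bits to the four report labels."""
    if not supported:
        # No driver support at all, regardless of the other bits.
        return "Not supported"
    if enabled:
        # Supported and currently switched on.
        return "Enabled by default" if enabled_by_default else "Enabled"
    # Supported by the driver but switched off.
    return "Disabled"


for bits in [(False, False), (True, False), (True, True), (True, True, True)]:
    print(bits, "->", hags_status(*bits))
```

    With this scheme, any label other than 'Not supported' tells the reader that the driver exposes the feature.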

    That's possible - I have a different driver installed, the WDDM 2.9 beta for WSL2, which doesn't support the hardware scheduler.
     
    #1082 DmitryKo, Jul 8, 2020
    Last edited: Jul 9, 2020
  3. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    881
    Likes Received:
    1,022
    Location:
    55°38′33″ N, 37°28′37″ E
    #1083 DmitryKo, Jul 9, 2020
    Last edited: Jul 9, 2020
  4. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,206
    Likes Received:
    1,601
    Location:
    msk.ru/spb.ru
    201xx builds are for 21H1 release, no?
     
  5. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    881
    Likes Received:
    1,022
    Location:
    55°38′33″ N, 37°28′37″ E
    #1085 DmitryKo, Jul 9, 2020
    Last edited: Jul 9, 2020
  6. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,206
    Likes Received:
    1,601
    Location:
    msk.ru/spb.ru
  7. Ike Turner

    Veteran Regular

    Joined:
    Jul 30, 2005
    Messages:
    2,105
    Likes Received:
    2,290
     
    DavidGraham, iroboto and BRiT like this.
  8. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,789
    Likes Received:
    3,958
    Location:
    Finland
    I thought DirectX 12 Ultimate was supposed to be available already, all those Direct3D 12 features included, but they imply it's only upcoming and currently in insider builds?
     
  9. Ryan Smith

    Regular

    Joined:
    Mar 26, 2010
    Messages:
    629
    Likes Received:
    1,131
    Location:
    PCIe x16_1
    The features are all there. But Microsoft didn't actually add the feature level enumeration to Win10 2004.

    https://devblogs.microsoft.com/directx/announcing-directx-12-ultimate/#comment-92

    "We will be adding a 12_2 feature level in the API in the next update to Windows after 20H1. For now, all the features that make up DirectX 12 Ultimate are implemented ready for games to start using, but the feature level enum itself is not yet implemented,"
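
    Until the 12_2 enum exists, the practical check is to test the individual caps instead of the maximum feature level. A minimal sketch of that idea, working on the numeric values the feature checker prints in parentheses: the four minimums below cover only the headline DirectX 12 Ultimate features (raytracing 1.1, VRS tier 2, mesh shaders, sampler feedback), and it is an assumption that the eventual 12_2 level will not require more. The names `ULTIMATE_MINIMUMS` and `missing_ultimate_features` are made up for this example.

```python
# Assumed minimum numeric tiers for the four headline DirectX 12 Ultimate features,
# using the values shown in parentheses in the reports posted in this thread.
ULTIMATE_MINIMUMS = {
    "RaytracingTier": 11,          # D3D12_RAYTRACING_TIER_1_1
    "VariableShadingRateTier": 2,  # D3D12_VARIABLE_SHADING_RATE_TIER_2
    "MeshShaderTier": 10,          # D3D12_MESH_SHADER_TIER_1
    "SamplerFeedbackTier": 90,     # D3D12_SAMPLER_FEEDBACK_TIER_0_9
}


def missing_ultimate_features(caps: dict) -> list:
    """Return the caps that fall below the assumed minimums (empty list = all present)."""
    return [name for name, minimum in ULTIMATE_MINIMUMS.items()
            if caps.get(name, 0) < minimum]


# Values copied from the RX 5700 XT and RTX 3080 reports in this thread.
rx_5700_xt = {"RaytracingTier": 0, "VariableShadingRateTier": 0,
              "MeshShaderTier": 0, "SamplerFeedbackTier": 0}
rtx_3080 = {"RaytracingTier": 11, "VariableShadingRateTier": 2,
            "MeshShaderTier": 10, "SamplerFeedbackTier": 90}

print("RX 5700 XT missing:", missing_ultimate_features(rx_5700_xt))
print("RTX 3080 missing:", missing_ultimate_features(rtx_3080))
```

    Both cards report D3D_FEATURE_LEVEL_12_1 as their maximum, so only a per-cap check like this separates them on Windows 10 2004.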
     
    PSman1700, Kej, Malo and 4 others like this.
  10. Lurkmass

    Regular Newcomer

    Joined:
    Mar 3, 2020
    Messages:
    306
    Likes Received:
    345
    The best is yet to come for D3D12/DXC. Shader model 6.6 introduces true bindless resources with the GetResourceFromHeap method.

    I also heard from a former Intel engineer that some Intel HW is a closer match to a descriptor table rather than actual GPU addresses like we would see on the other hardware (AMD/NV) so D3D12 started with descriptor indexing rather than pointers or texture handles ...
     
    jlippo and Malo like this.
  11. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,206
    Likes Received:
    1,601
    Location:
    msk.ru/spb.ru
  12. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    18,795
    Likes Received:
    21,097
  13. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,206
    Likes Received:
    1,601
    Location:
    msk.ru/spb.ru
    The question I have (and why I've posted this here): considering that it routes the decompression onto GPU h/w, will it be part of some feature level? And which GPUs will support this?
     
    BRiT likes this.
  14. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,021
    Likes Received:
    15,765
    Location:
    The North
    For now we only know the RTX series will support it. We don't know why just those. GPU decompression via compute is supported on a lot of GPUs. If you use the Parquet format for data (I tend towards it, coming from Hadoop), RapidsAI supports GPU decompression of several algorithms for Parquet:

    This adds an interface for using GPU-accelerated reading and converting of Parquet to cuDF.

    List of changes:

    • adds a simple GPU-accelerated Parquet-to-cuDF reader
    • adds a top-level read_parquet interface and associated python bindings
    • adds an engine parameter to select between pyarrow and cudf implementations
    • adds to the existing parameterized pytest to check against pyarrow reference
    • adds GPU-accelerated decompression for Brotli, Gzip and Snappy compressed data
    I use Snappy, which gives about 5:1 compression on my data sets.

    None of this is related to DirectStorage though; I just wanted to point out that it's normal for GPUs to do this type of work. This is likely something else.
     
    #1094 iroboto, Sep 2, 2020
    Last edited: Sep 2, 2020
    BRiT and pharma like this.
  15. Alessio1989

    Regular Newcomer

    Joined:
    Jun 6, 2015
    Messages:
    605
    Likes Received:
    320
    It would be nice if they supported LZX or XPRESS and made them the default compression in NTFS, with automatic recompression on file changes.
     
  16. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,206
    Likes Received:
    1,601
    Location:
    msk.ru/spb.ru
    GeForce RTX 3080

    Code:
    Direct3D 12 feature checker (May 2020) by DmitryKo (x64)
    https://forum.beyond3d.com/posts/1840641/
     
    Windows 10 version 2004 (build 19041.508 vb_release) x64
     
    ADAPTER 0
    "NVIDIA GeForce RTX 3080"
    VEN_10DE, DEV_2206, SUBSYS_22061569, REV_A1
    Dedicated video memory : 10078.0 MB (10567548928 bytes)
    Total video memory : 18237.0 MB (19122927616 bytes)
    Video driver version : 27.21.14.5616
    WDDM version : KMT_DRIVERVERSION_WDDM_2_7 (2700)
    Hardware-accelerated scheduler : Disabled, supported
    GraphicsPreemptionGranularity : DXGI_GRAPHICS_PREEMPTION_PIXEL_BOUNDARY (3)
    ComputePreemptionGranularity : DXGI_COMPUTE_PREEMPTION_DISPATCH_BOUNDARY (1)
    Maximum feature level : D3D_FEATURE_LEVEL_12_1 (0xc100)
    DoublePrecisionFloatShaderOps : 1
    OutputMergerLogicOp : 1
    MinPrecisionSupport : D3D12_SHADER_MIN_PRECISION_SUPPORT_16_BIT (2) (0b0000'0010)
    TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_3 (3)
    ResourceBindingTier : D3D12_RESOURCE_BINDING_TIER_3 (3)
    PSSpecifiedStencilRefSupported : 0
    TypedUAVLoadAdditionalFormats : 1
    ROVsSupported : 1
    ConservativeRasterizationTier : D3D12_CONSERVATIVE_RASTERIZATION_TIER_3 (3)
    StandardSwizzle64KBSupported : 0
    CrossNodeSharingTier : D3D12_CROSS_NODE_SHARING_TIER_NOT_SUPPORTED (0)
    CrossAdapterRowMajorTextureSupported : 0
    VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation : 1
    ResourceHeapTier : D3D12_RESOURCE_HEAP_TIER_2 (2)
    MaxGPUVirtualAddressBitsPerResource : 40
    MaxGPUVirtualAddressBitsPerProcess : 40
    Adapter Node 0:     TileBasedRenderer: 0, UMA: 0, CacheCoherentUMA: 0, IsolatedMMU: 1, HeapSerializationTier: 0, ProtectedResourceSession.Support: 1, ProtectedResourceSessionTypeCount: 1 D3D12_PROTECTED_RESOURCES_SESSION_HARDWARE_PROTECTED
    HighestShaderModel : D3D12_SHADER_MODEL_6_5 (0x0065)
    WaveOps : 1
    WaveLaneCountMin : 32
    WaveLaneCountMax : 32
    TotalLaneCount : 8704
    ExpandedComputeResourceStates : 1
    Int64ShaderOps : 1
    RootSignature.HighestVersion : D3D_ROOT_SIGNATURE_VERSION_1_1 (2)
    DepthBoundsTestSupported : 1
    ProgrammableSamplePositionsTier : D3D12_PROGRAMMABLE_SAMPLE_POSITIONS_TIER_2 (2)
    ShaderCache.SupportFlags : D3D12_SHADER_CACHE_SUPPORT_SINGLE_PSO | LIBRARY (3) (0b0000'0011)
    CopyQueueTimestampQueriesSupported : 1
    CastingFullyTypedFormatSupported : 1
    WriteBufferImmediateSupportFlags : D3D12_COMMAND_LIST_SUPPORT_FLAG_DIRECT | BUNDLE | COMPUTE | COPY | VIDEO_DECODE | VIDEO_PROCESS | VIDEO_ENCODE (127) (0b0111'1111)
    ViewInstancingTier : D3D12_VIEW_INSTANCING_TIER_3 (3)
    BarycentricsSupported : 1
    ExistingHeaps.Supported : 1
    MSAA64KBAlignedTextureSupported : 1
    SharedResourceCompatibilityTier : D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER_2 (2)
    Native16BitShaderOpsSupported : 1
    AtomicShaderInstructions : 0
    SRVOnlyTiledResourceTier3 : 1
    RenderPassesTier : D3D12_RENDER_PASS_TIER_0 (0)
    RaytracingTier : D3D12_RAYTRACING_TIER_1_1 (11)
    AdditionalShadingRatesSupported : 1
    PerPrimitiveShadingRateSupportedWithViewportIndexing : 1
    VariableShadingRateTier : D3D12_VARIABLE_SHADING_RATE_TIER_2 (2)
    ShadingRateImageTileSize : 16
    BackgroundProcessingSupported : 1
    MeshShaderTier : D3D12_MESH_SHADER_TIER_1 (10)
    SamplerFeedbackTier : D3D12_SAMPLER_FEEDBACK_TIER_0_9 (90)
    DirectML maximum feature level : DML_FEATURE_LEVEL_2_0 (0x2000)
    Metacommands enumerated : 7
    Metacommands [parameters per stage]: Conv (Convolution) [84][1][6], CopyTensor [3][1][31], MVN (Mean Variance Normalization) [67][1][6], GEMM (General matrix multiply) [67][1][6], Conv (Convolution) [108][5][6], GEMM (General matrix multiply) [91][5][6], MVN (Mean Variance Normalization) [91][5][6]
     
    The only difference from Turing so far (though Turing was on an older driver) is PerPrimitiveShadingRateSupportedWithViewportIndexing : 1
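
    Spotting a single changed line between two of these reports is easier with a small diff script. The sketch below assumes every cap sits on its own line in the `Name : value` form used by the report above; `parse_report` and `diff_reports` are hypothetical helper names, and the two inputs are shortened excerpts rather than full reports.

```python
def parse_report(text: str) -> dict:
    """Parse 'Name : value' lines of a feature checker report into a dict."""
    caps = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" : ")
        if sep:  # lines without the ' : ' separator (headers, adapter name) are skipped
            caps[name.strip()] = value.strip()
    return caps


def diff_reports(a: dict, b: dict) -> dict:
    """Return {name: (value_in_a, value_in_b)} for every cap that differs."""
    return {name: (a.get(name), b.get(name))
            for name in sorted(a.keys() | b.keys())
            if a.get(name) != b.get(name)}


# Shortened excerpts; a real comparison would paste the two full reports.
ampere = parse_report("""
VariableShadingRateTier : D3D12_VARIABLE_SHADING_RATE_TIER_2 (2)
PerPrimitiveShadingRateSupportedWithViewportIndexing : 1
""")
turing = parse_report("""
VariableShadingRateTier : D3D12_VARIABLE_SHADING_RATE_TIER_2 (2)
PerPrimitiveShadingRateSupportedWithViewportIndexing : 0
""")

print(diff_reports(ampere, turing))
```

    Lines such as the OS version, the adapter name, and the metacommand list do not contain the ` : ` separator and are ignored by this parser.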
     
  17. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,206
    Likes Received:
    1,601
    Location:
    msk.ru/spb.ru
    I've checked Turing (2080) on the 456.38 driver and it still shows "PerPrimitiveShadingRateSupportedWithViewportIndexing : 0", so this seems to be an additional Ampere-only feature for now.
     
  18. oscarbg

    Newcomer

    Joined:
    Sep 2, 2009
    Messages:
    35
    Likes Received:
    13
    Hoping to see an Intel Xe (Gen12) report.

    Also, next week, an AMD RDNA2 6800/6900 report.
     
  19. CarstenS

    Legend Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,360
    Likes Received:
    3,096
    Location:
    Germany
    RDNA2: Radeon 6800 non-XT:
    What seems interesting:
    - GraphicsPreemptionGranularity worse than RTX 30
    - ComputePreemptionGranularity worse than RTX 30
    - TiledResourcesTier better than RTX 30
    - PSSpecifiedStencilRefSupported better than RTX 30
    - MaxGPUVirtualAddressBitsPerResource better than RTX 30
    - MaxGPUVirtualAddressBitsPerProcess better than RTX 30
    - HighestShaderModel better than RTX 30
    - TotalLaneCount "worse" (i.e. less) than RTX 3080/3090
    - ShaderCache.SupportFlags adds AUTOMATIC_INPROC_CACHE | AUTOMATIC_DISK_CACHE (15, versus 3 on the RTX 3080 above)
    - AdditionalShadingRatesSupported worse than RTX 30
    - ShadingRateImageTileSize smaller (i.e. better) than RTX 30
    - BackgroundProcessingSupported worse than RTX 30
    - SamplerFeedbackTier better than RTX 30
    - Fewer Metacommands than RTX 30
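
    The ShaderCache.SupportFlags numbers in these reports are bitmasks, so the comparison can be made mechanical. A minimal sketch, with the bit values inferred from the reports earlier in the thread, where (3) is printed as SINGLE_PSO | LIBRARY and (15) as all four flags. The class and helper names are made up for this example.

```python
from enum import IntFlag


class ShaderCacheSupport(IntFlag):
    """Bit assignments inferred from the (3) and (15) values in the reports."""
    SINGLE_PSO = 0x1
    LIBRARY = 0x2
    AUTOMATIC_INPROC_CACHE = 0x4
    AUTOMATIC_DISK_CACHE = 0x8


def flag_names(value: int) -> list:
    """List the names of all flags set in a raw bitmask value."""
    return [flag.name for flag in ShaderCacheSupport if flag & value]


def extra_flags(a: int, b: int) -> list:
    """List the flags that are set in mask a but not in mask b."""
    return flag_names(a & ~b)


print(flag_names(3))
print(extra_flags(15, 3))
```

    Calling `extra_flags(15, 3)` isolates exactly the two automatic-cache flags named in the list above.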
     
    #1099 CarstenS, Nov 18, 2020
    Last edited: Nov 18, 2020
    DmitryKo, oscarbg, tinokun and 3 others like this.
  20. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,206
    Likes Received:
    1,601
    Location:
    msk.ru/spb.ru
    Both are worse though.

    Same if you're testing with experimental features enabled.
     
    pharma likes this.