Direct3D feature levels discussion

Discussion in 'Rendering Technology and APIs' started by DmitryKo, Feb 20, 2015.

  1. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    1,458
    Likes Received:
    202
    Location:
    msk.ru/spb.ru
    Volta / Turing's async execution is similar to GCN / RDNA's. It's not that big of an uplift over Pascal though as NV's multiprocessors aren't idling too often in graphics in the first place.
     
  2. JoeJ

    Regular Newcomer

    Joined:
    Apr 1, 2018
    Messages:
    841
    Likes Received:
    938
    Sounds great :D
    It really matters if your workloads are not like those huge brute force tasks seen in games ;)
     
    DavidGraham likes this.
  3. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    365
    Likes Received:
    319
    Queues in Vulkan are not necessarily the same as queues on the GPU though, let alone the distinct engine types which are fed by each queue. They are merely to be read as the domain in which barriers are evaluated in submission order. As long as none of the synchronization constraints (barriers, semaphores) are violated, work may be scheduled wherever the driver sees fit. If the driver honors your choice by issuing work submitted on the 3D / compute / copy queue only to specialized engines, then that is already voluntary.

    Especially for work submitted to the copy queue, there are several cases where you actually end up with a kernel launch under the hood if the copy engine doesn't support the requested format conversion.
    The other way around, submissions to the same logical queue in Vulkan may very well be split round-robin across several device-side queues in order to increase out-of-order execution depth, as long as no ordering constraints are violated. Except this doesn't happen with any Vulkan driver so far, as far as I can tell.

    It mostly boils down to where the driver vendor draws the line between "implicit semantics the developer may assume even though not backed by the specification", and "optimization within the bounds of specified observable effects".

    And that isn't even limited to GCN. What you are seeing there is pretty much just out-of-order execution for kernel launches from the same device-side queue, which is legal whenever there is no barrier in the queue. NVidia's GPUs do that as well, albeit historically with the catch that they could only do so for compatible kernel configurations, due to the design limitation of having to pre-configure SMs for a number of kernel properties and then getting stuck with that configuration. If that wasn't a feature, you couldn't have related features like efficient render passes for small geometry (-> no full device launch for the fragment shader) either.

    Also, no, can't do that in DX11. Not unless you are strict about what you bind as write / read only, and everything you have bound as writable is distinct between launches. Two kernel launches having write access to the same resources imply a barrier. Only exception is for draw calls due to having the output merger in there giving defined behavior even for overlapping launches.

    The other half, as mentioned, is that Vulkan queues are really just a synchronization domain.
    In comparison, it's actually funny that in NVidia's own CUDA API, you can't express that overlapped launches are legal for a single CUDA stream. Which is why they pretty much abandoned that "stream" concept recently in favor of explicit dependency annotations, too. These are then baked into chunks of kernels which can be launched together without barriers.
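
    To make the "queue as a synchronization domain" idea concrete, here is a minimal toy sketch, not driver code. It assumes a queue is just a submission-ordered list of commands with explicit barrier markers; the `BARRIER` marker, the command names and `overlappable_batches` are all hypothetical. It ignores semaphores and the implicit hazard tracking of DX11, and only shows which commands a driver would be free to overlap or reorder under that simplified model.

```python
from typing import List

BARRIER = "barrier"  # hypothetical marker, not a real Vulkan or D3D12 identifier


def overlappable_batches(commands: List[str]) -> List[List[str]]:
    """Split a submission-ordered command list at barriers.

    In this toy model, commands inside one batch have no ordering
    constraint between them, so they may run concurrently or out of
    order. The batches themselves must still execute in order.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    for cmd in commands:
        if cmd == BARRIER:
            # A barrier closes the current batch; empty batches are dropped.
            if current:
                batches.append(current)
            current = []
        else:
            current.append(cmd)
    if current:
        batches.append(current)
    return batches


queue = [
    "dispatch_a", "dispatch_b", BARRIER,
    "dispatch_c", BARRIER, BARRIER,
    "copy_d", "dispatch_e",
]
for index, batch in enumerate(overlappable_batches(queue)):
    print(index, batch)
```

    Whether the hardware actually overlaps the commands of one batch, or on which engine they end up, is left entirely to the driver in this picture.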
     
    #1023 Ext3h, Nov 16, 2019
    Last edited: Nov 16, 2019
    JoeJ likes this.
  4. Per Lindstrom

    Newcomer Subscriber

    Joined:
    Oct 16, 2018
    Messages:
    24
    Likes Received:
    18
    Windows 1903, Radeon 5700 XT, Driver version 19.12.2, some errors.
    Direct3D 12 feature checker (July 2019) by DmitryKo (x64)
    https://forum.beyond3d.com/posts/1840641/

    Windows 10 version 1909 (build 18363.535 19h1_release) x64

    ADAPTER 0
    "AMD Radeon RX 5700 XT"
    VEN_1002, DEV_731F, SUBSYS_E4111DA2, REV_C1
    Dedicated video memory : 8151.4 MB (8547397632 bytes)
    Total video memory : 40886.3 MB (42872410112 bytes)
    Video driver version : 26.20.15002.61
    Maximum feature level : D3D_FEATURE_LEVEL_12_1 (0xc100)
    DoublePrecisionFloatShaderOps : 1
    OutputMergerLogicOp : 1
    MinPrecisionSupport : D3D12_SHADER_MIN_PRECISION_SUPPORT_16_BIT (2) (0b0000'0010)
    TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_3 (3)
    ResourceBindingTier : D3D12_RESOURCE_BINDING_TIER_3 (3)
    PSSpecifiedStencilRefSupported : 1
    TypedUAVLoadAdditionalFormats : 1
    ROVsSupported : 1
    ConservativeRasterizationTier : D3D12_CONSERVATIVE_RASTERIZATION_TIER_3 (3)
    StandardSwizzle64KBSupported : 0
    CrossNodeSharingTier : D3D12_CROSS_NODE_SHARING_TIER_NOT_SUPPORTED (0)
    CrossAdapterRowMajorTextureSupported : 0
    VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation : 1
    ResourceHeapTier : D3D12_RESOURCE_HEAP_TIER_2 (2)
    MaxGPUVirtualAddressBitsPerResource : 44
    MaxGPUVirtualAddressBitsPerProcess : 44
    Adapter Node 0: TileBasedRenderer: 0, UMA: 0, CacheCoherentUMA: 0, IsolatedMMU: 1, HeapSerializationTier: 0, ProtectedResourceSession.Support: 1
    Failed to query protected resource session type count
    Error 80070057: The parameter is incorrect.

    Failed to query protected resource session types
    Error 80070057: The parameter is incorrect.

    Failed to query maximum shader model
    Error 80070057: The parameter is incorrect.

    WaveOps : 1
    WaveLaneCountMin : 32
    WaveLaneCountMax : 64
    TotalLaneCount : 2560
    ExpandedComputeResourceStates : 1
    Int64ShaderOps : 1
    RootSignature.HighestVersion : D3D_ROOT_SIGNATURE_VERSION_1_1 (2)
    DepthBoundsTestSupported : 1
    ProgrammableSamplePositionsTier : D3D12_PROGRAMMABLE_SAMPLE_POSITIONS_TIER_2 (2)
    ShaderCache.SupportFlags : D3D12_SHADER_CACHE_SUPPORT_SINGLE_PSO | LIBRARY | AUTOMATIC_INPROC_CACHE | AUTOMATIC_DISK_CACHE (15) (0b0000'1111)
    CopyQueueTimestampQueriesSupported : 1
    CastingFullyTypedFormatSupported : 1
    WriteBufferImmediateSupportFlags : D3D12_COMMAND_LIST_SUPPORT_FLAG_DIRECT | BUNDLE | COMPUTE | COPY (15) (0b0000'1111)
    ViewInstancingTier : D3D12_VIEW_INSTANCING_TIER_1 (1)
    BarycentricsSupported : 0
    ExistingHeaps.Supported : 1
    MSAA64KBAlignedTextureSupported : 1
    SharedResourceCompatibilityTier : D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER_1 (1)
    Native16BitShaderOpsSupported : 1
    AtomicShaderInstructions : 0
    SRVOnlyTiledResourceTier3 : 1
    RenderPassesTier : D3D12_RENDER_PASS_TIER_0 (0)
    RaytracingTier : D3D12_RAYTRACING_TIER_NOT_SUPPORTED (0)
    AdditionalShadingRatesSupported : 0
    PerPrimitiveShadingRateSupportedWithViewportIndexing : 0
    VariableShadingRateTier : D3D12_VARIABLE_SHADING_RATE_TIER_NOT_SUPPORTED (0)
    ShadingRateImageTileSize : 0
    BackgroundProcessingSupported : 0
    Failed to query feature data 7
    Error 80070057: The parameter is incorrect.

    Metacommands enumerated : 4
    Metacommands [parameters per stage]: Conv (Convolution) [84][1][6], Conv (Convolution) [108][5][6], GEMM (General matrix multiply) [67][1][6], GEMM (General matrix multiply) [91][5][6]
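
    For readers decoding the output above by hand, a small sketch may help. It relies on the standard HRESULT bit layout (severity in bit 31, facility in bits 16 to 26, code in the low 16 bits) and on the flag values of `D3D12_SHADER_CACHE_SUPPORT_FLAGS`. The helper names `decode_hresult` and `decode_flags` are made up for illustration and are not part of the feature checker tool.

```python
from typing import Dict, List


def decode_hresult(hr: int) -> Dict[str, int]:
    """Split a 32-bit HRESULT into severity, facility and code."""
    hr &= 0xFFFFFFFF
    return {
        "failed": hr >> 31,              # 1 means failure
        "facility": (hr >> 16) & 0x7FF,  # 7 is FACILITY_WIN32
        "code": hr & 0xFFFF,             # 87 is ERROR_INVALID_PARAMETER
    }


# Bit values of D3D12_SHADER_CACHE_SUPPORT_FLAGS as reported in the log above.
SHADER_CACHE_FLAGS = {
    0x1: "SINGLE_PSO",
    0x2: "LIBRARY",
    0x4: "AUTOMATIC_INPROC_CACHE",
    0x8: "AUTOMATIC_DISK_CACHE",
}


def decode_flags(value: int, table: Dict[int, str]) -> List[str]:
    """Return the names of all bits set in value, lowest bit first."""
    return [name for bit, name in sorted(table.items()) if value & bit]


print(decode_hresult(0x80070057))
print(decode_flags(15, SHADER_CACHE_FLAGS))
```

    Error 80070057 is therefore E_INVALIDARG: the Win32 "invalid parameter" error wrapped in an HRESULT, which is what a D3D12 runtime typically returns when asked about a feature structure it does not recognize.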
     
  5. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    722
    Likes Received:
    643
    Location:
    55°38′33″ N, 37°28′37″ E
    These features are only available on Windows 20H1. My tool assumes build 18363 belongs to the 20H1 branch, but Microsoft assigned it to Windows 1909 since August... will fix it soon.
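
    The build number mix-up can be illustrated with a small sketch. This is not the tool's actual logic, only an assumed version gate: build 18362 shipped as version 1903 (19H1), 18363 as 1909 (19H2) and 19041 as 2004 (20H1). The names `RELEASES`, `release_for_build` and `is_20h1_or_newer` are hypothetical, and the simple threshold ignores 20H1 Insider builds with lower build numbers.

```python
from typing import Optional, Tuple

# Released Windows 10 build numbers mapped to (version, codename).
RELEASES = {
    18362: ("1903", "19H1"),
    18363: ("1909", "19H2"),
    19041: ("2004", "20H1"),
}


def release_for_build(build: int) -> Optional[Tuple[str, str]]:
    """Look up the release a build number shipped as, or None if unknown."""
    return RELEASES.get(build)


def is_20h1_or_newer(build: int) -> bool:
    """Assumed gate for queries that only a 20H1 runtime understands."""
    return build >= 19041


for build in (18362, 18363, 19041):
    print(build, release_for_build(build), is_20h1_or_newer(build))
```

    With a gate like this, build 18363 would skip the 20H1-only queries instead of issuing them and collecting "invalid parameter" errors.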
     
    Per Lindstrom and BRiT like this.
  6. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,183
    Likes Received:
    3,286
    Windows 10 20H1 (Version 2004) includes WDDM 2.7.

    Available in Windows 10 Insider builds starting from 10.0.19041.84, with Nvidia driver 450.12, Intel driver 27.20.100.7859, and AMD driver 27.20.1000.8009 or newer.

    • Hardware-accelerated GPU scheduling: It allows the video card to directly manage its video memory, which in turn significantly improves minimum and average FPS and thereby reduces latency. It works regardless of the API used by games and applications (DirectX, Vulkan, OpenGL). According to observations made before the release of Windows 10 version 2004, the option requires hardware support for Shader Model 6.3 or higher; this can be checked with AIDA64 but not with GPU-Z, which does not display it reliably. It is supported by Nvidia GeForce video cards starting from the 10 series, as well as by Intel integrated graphics from HD 500 onward. It is not supported by AMD cards, since their shader model level has not been updated and remains at 5.1 in hardware and 6.2 for the latest cards, which is not enough to enable this option. It is also worth noting that forcing the option through registry keys has no effect on unsupported cards. It is also possible that this technology is associated with the description of this patent.
    • Shader Model 6.5
    • DirectX 12 Raytracing Tier 1.1
    • DirectX 12 Mesh Shader
    • DirectX 12 Sampler Feedback: Texture Streaming, Texture-Space Shading

    https://en.wikipedia.org/wiki/Windows_Display_Driver_Model#WDDM_2.7
     
    Dictator, jlippo, Malo and 2 others like this.
  7. Malo

    Malo Yak Mechanicum
    Legend Veteran Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    7,441
    Likes Received:
    3,461
    Location:
    Pennsylvania
    What hardware features in particular is AMD missing for shader model 6.3?
     
  8. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,726
    Likes Received:
    2,552
    Location:
    Finland
    Hm? GCN 2.0 and newer have supported Shader Model 6.3 since the 18.10.1 drivers.
     
  9. pharma

    Veteran Regular

    Joined:
    Mar 29, 2004
    Messages:
    3,325
    Likes Received:
    1,952
    I imagine features related to DX12 raytracing acceleration. Not sure, but can AMD GPUs now utilize the DXR fallback layer if required?
     
  10. JoeJ

    Regular Newcomer

    Joined:
    Apr 1, 2018
    Messages:
    841
    Likes Received:
    938
    Guess no.
    Just yesterday I installed the newest Visual Studio to try somebody's DX12 demo, but it failed at initializing the DXR fallback layer on a Vega 56 GPU.
     
    Dictator and pharma like this.
  11. Malo

    Malo Yak Mechanicum
    Legend Veteran Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    7,441
    Likes Received:
    3,461
    Location:
    Pennsylvania
    What does hardware-accelerated GPU scheduling have to do with raytracing capabilities?
     
  12. pharma

    Veteran Regular

    Joined:
    Mar 29, 2004
    Messages:
    3,325
    Likes Received:
    1,952
    No direct correlation to raytracing capabilities, though I'd be surprised if hardware-accelerated raytracing was excluded from AMD's roadmap.
     
  13. JoeJ

    Regular Newcomer

    Joined:
    Apr 1, 2018
    Messages:
    841
    Likes Received:
    938
    I have been told NV's RTX is the sum of two things: fine-grained scheduling introduced with Volta, and Turing's RT cores.
    The scheduling is likely used to switch between various programs like ray generation / hit shaders and recursion, and to reroute / shuffle rays between them to improve coherency, if that's a thing. Task shaders may utilize it as well. (just guessing)

    So, probably those scheduling options can be used for other things as well, and MS is now utilizing this?
    Mentioning video memory also hints at dynamic allocation at a finer, potentially programmable level, maybe?

    It smells like there is a bunch of revolutionary options available, potentially fixing the second-class-citizen coprocessor status that GPUs currently still have. But it has to be programmable and exposed...
     
    pharma likes this.
  14. Alessio1989

    Regular Newcomer

    Joined:
    Jun 6, 2015
    Messages:
    591
    Likes Received:
    300
    stop posting this bullcrap. please..
    The fallback layer was primarily intended for debugging and development purposes and is no longer updated due to lack of interest from IHVs and third-party software developers.
    Nothing related to WDDM 2.7, however it would be nice to have barycentrics support without AGS..
     
    #1034 Alessio1989, Feb 28, 2020
    Last edited: Feb 28, 2020
  15. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,183
    Likes Received:
    3,286
    ???
     
  16. Alessio1989

    Regular Newcomer

    Joined:
    Jun 6, 2015
    Messages:
    591
    Likes Received:
    300
    nothing against you... but those "statements" about hw scheduling are 90% fake.
     
  17. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,183
    Likes Received:
    3,286
    Could you elaborate? Is the feature imaginary, or does it do something entirely different from the above description?
     
  18. Alessio1989

    Regular Newcomer

    Joined:
    Jun 6, 2015
    Messages:
    591
    Likes Received:
    300
    There is a change (not publicly documented) in the GPU driver scheduler, but none of those claims are true or correlated.

    All other new D3D12 features are already known and explained on the DirectX team dev blog: https://devblogs.microsoft.com/directx/
     
    #1038 Alessio1989, Feb 28, 2020
    Last edited: Feb 28, 2020
    BRiT likes this.
  19. Malo

    Malo Yak Mechanicum
    Legend Veteran Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    7,441
    Likes Received:
    3,461
    Location:
    Pennsylvania
    I'm sure we'll see an article on the usual clickbait sites soon about it.
     
    CarstenS likes this.