Direct3D feature levels discussion

Discussion in 'Rendering Technology and APIs' started by DmitryKo, Feb 20, 2015.

  1. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    7,787
    Likes Received:
    1,510
    Location:
    Finland
    FP16 is supposed to be available on Polaris (same speed as FP32) and Vega (twice the speed via Rapid Packed Math, RPM). The R9 380X is GCN3.
     
    Heinrich4 likes this.
  2. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,730
    Likes Received:
    1,957
    Location:
    Germany
    According to AnandTech, even first-iteration Tonga (R9 285) had FP16 instructions. There is no mention, though, of to what extent. The slide says "New 16 bit floating point and integer instructions for low power GPU compute and media processing."
    https://www.anandtech.com/show/8460/amd-radeon-r9-285-review/2
    The slide in question:
    https://images.anandtech.com/doci/8460/GCN12ISA.png

    I need to dig up my older DxDiag logs, but IIRC there were drivers which reported "16/32 Bit" for the minimum precision query even on GCN 1.2/GCN gen 3 (which is the same thing, depending on where you start counting), i.e. Tonga and Fiji. Not at 2× speed, of course.
     
    CSI PC likes this.
  3. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    843
    #903 CSI PC, May 1, 2018
    Last edited: May 1, 2018
  4. willardjuice

    willardjuice super willyjuice
    Moderator Veteran Alpha Subscriber

    Joined:
    May 14, 2005
    Messages:
    1,365
    Likes Received:
    215
    Location:
    NY
    GCN3/GCN4 FP16 was useful for register pressure even though it ran at the same rate as FP32 (you could pack two FP16 elements into a register). I think they also lacked some of the FP conversion instructions that Vega supports (I can't recall and I'm too lazy to check the ISA), which further limited their usefulness in mixed-precision situations. All in all, I think in practical terms the "FP16 support" on those cards wasn't useful. The planets really had to align to show a benefit for GCN3/4. It's probably why AMD "retired" support in drivers: more trouble than it was worth.
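The register-pressure point above can be illustrated on the CPU. This is only a rough sketch in Python, not GPU code: it packs two IEEE half-precision values into one 32-bit word, which is the storage trick the post describes. The helper names (pack_half2, unpack_half2, registers_needed) are made up for this example, and the register count is a deliberately simplified model that ignores everything a real shader compiler does.

```python
import struct
from typing import Tuple


def pack_half2(lo: float, hi: float) -> int:
    """Pack two FP16 values into one 32-bit word (lo in bits 0-15, hi in bits 16-31)."""
    return int.from_bytes(struct.pack("<2e", lo, hi), "little")


def unpack_half2(word: int) -> Tuple[float, float]:
    """Recover the two FP16 values from a packed 32-bit word."""
    return struct.unpack("<2e", word.to_bytes(4, "little"))


def registers_needed(value_count: int, packed: bool) -> int:
    """Simplified model: one 32-bit register per value, or per pair when packed."""
    return (value_count + 1) // 2 if packed else value_count


word = pack_half2(1.5, -2.0)
print(hex(word))                 # 0xc0003e00
print(unpack_half2(word))        # (1.5, -2.0)
print(registers_needed(8, packed=False), registers_needed(8, packed=True))  # 8 4
```

The arithmetic rate does not change in this picture; only the storage footprint does, which matches the "same rate as FP32, but less register pressure" description.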
     
    Heinrich4, Ryan Smith and CSI PC like this.
  5. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    545
    Likes Received:
    337
    Location:
    55°38′33″ N, 37°28′37″ E
    It looks like barycentric intrinsics are not enabled by default in the Catalyst driver: you have to use the AMD GPU Services (AGS) library to enable the video driver extensions, then check for AGS_DX12_EXTENSION_INTRINSIC_BARYCENTRICS:

    https://github.com/GPUOpen-LibrariesAndSDKs/Barycentrics12
    https://gpuopen-librariesandsdks.github.io/ags/amd__ags_8h.html
    https://github.com/GPUOpen-LibrariesAndSDKs/AGS_SDK
     
    #905 DmitryKo, May 27, 2018
    Last edited: May 28, 2018
    Lightman likes this.
  6. donjulio

    Joined:
    Dec 15, 2017
    Messages:
    8
    Likes Received:
    17
    I don't know if it makes sense, but I couldn't find the data for the Intel Core with Radeon RX Vega M Graphics. So here we go:

     
    Heinrich4 and BRiT like this.
  7. Malo

    Malo YakTribe.games
    Legend Veteran Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    6,502
    Likes Received:
    2,539
    Location:
    Pennsylvania
    Definitely doesn't seem to be the Vega ISA, thereby confirming it's based on Polaris?
     
  8. donjulio

    That’s why I ran this in the first place, as soon as I got access to that CPU. But I’m still not sure whether this confirms anything.
     
  9. Ryan Smith

    Regular Subscriber

    Joined:
    Mar 26, 2010
    Messages:
    587
    Likes Received:
    892
    Location:
    PCIe x16_1
    We've previously been able to work out that the graphics core is Polaris, based on AMD's Linux driver commits. It's using the gfx8 family branch, which is Tonga/Fiji/Polaris. Vega is gfx9.
     
  10. Locuza

    Newcomer

    Joined:
    Mar 28, 2015
    Messages:
    45
    Likes Received:
    101
    Resource Heap Tier 2 yeah:
    https://www.nvidia.com/content/dam/...ure/NVIDIA-Turing-Architecture-Whitepaper.pdf

    Compared to Pascal, Conservative Rasterization Tier 3 should now be in as well.
     
    Heinrich4 and pharma like this.
  11. Kaotik

  12. donjulio

    Turing/TU102:

    "NVIDIA GeForce RTX 2080 Ti"
    VEN_10DE, DEV_1E07, SUBSYS_12A410DE, REV_A1
    Dedicated video memory : 11048.0 MB (11584667648 bytes)
    Total video memory : 27368.7 MB (28698128384 bytes)
    Video driver version : 24.21.14.1151
    Maximum feature level : D3D_FEATURE_LEVEL_12_1 (0xc100)
    DoublePrecisionFloatShaderOps : 1
    OutputMergerLogicOp : 1
    MinPrecisionSupport : D3D12_SHADER_MIN_PRECISION_SUPPORT_NONE (0) (0b0000'0000)
    TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_3 (3)
    ResourceBindingTier : D3D12_RESOURCE_BINDING_TIER_3 (3)
    PSSpecifiedStencilRefSupported : 0
    TypedUAVLoadAdditionalFormats : 1
    ROVsSupported : 1
    ConservativeRasterizationTier : D3D12_CONSERVATIVE_RASTERIZATION_TIER_3 (3)
    StandardSwizzle64KBSupported : 0
    CrossNodeSharingTier : D3D12_CROSS_NODE_SHARING_TIER_NOT_SUPPORTED (0)
    CrossAdapterRowMajorTextureSupported : 0
    VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation : 1
    ResourceHeapTier : D3D12_RESOURCE_HEAP_TIER_2 (2)
    MaxGPUVirtualAddressBitsPerResource : 40
    MaxGPUVirtualAddressBitsPerProcess : 40
    Adapter Node 0: TileBasedRenderer: 0, UMA: 0, CacheCoherentUMA: 0, IsolatedMMU: 1
    HighestShaderModel : D3D12_SHADER_MODEL_6_1 (0x0061)
    WaveOps : 1
    WaveLaneCountMin : 32
    WaveLaneCountMax : 32
    TotalLaneCount : 4352
    ExpandedComputeResourceStates : 1
    Int64ShaderOps : 1
    RootSignature.HighestVersion : D3D_ROOT_SIGNATURE_VERSION_1_1 (2)
    DepthBoundsTestSupported : 1
    ProgrammableSamplePositionsTier : D3D12_PROGRAMMABLE_SAMPLE_POSITIONS_TIER_2 (2)
    ShaderCache.SupportFlags : D3D12_SHADER_CACHE_SUPPORT_SINGLE_PSO | LIBRARY (3) (0b0000'0011)
    CopyQueueTimestampQueriesSupported : 1
    CastingFullyTypedFormatSupported : 1
    WriteBufferImmediateSupportFlags : D3D12_COMMAND_LIST_SUPPORT_FLAG_DIRECT | BUNDLE | COMPUTE | COPY | VIDEO_DECODE | VIDEO_PROCESS (63) (0b0011'1111)
    ViewInstancingTier : D3D12_VIEW_INSTANCING_TIER_3 (3)
    BarycentricsSupported : 0
    ExistingHeaps.Supported : 1
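Several values in the report above are encoded numbers: the feature level (0xc100), the shader model (0x0061) and the two flag fields (3 and 63). A small Python sketch of how they can be decoded follows. The flag values mirror the D3D12 header enums as far as I know them; the helper names are invented for this example and are not part of the tool.

```python
from enum import IntFlag


class ShaderCacheSupport(IntFlag):
    SINGLE_PSO = 0x1
    LIBRARY = 0x2
    AUTOMATIC_INPROC_CACHE = 0x4
    AUTOMATIC_DISK_CACHE = 0x8


class CommandListSupport(IntFlag):
    DIRECT = 0x01
    BUNDLE = 0x02
    COMPUTE = 0x04
    COPY = 0x08
    VIDEO_DECODE = 0x10
    VIDEO_PROCESS = 0x20


def decode_feature_level(value):
    """0xc100 -> (12, 1): major in bits 12-15, minor in bits 8-11."""
    return (value >> 12) & 0xF, (value >> 8) & 0xF


def decode_shader_model(value):
    """0x0061 -> (6, 1): major in the high nibble, minor in the low nibble."""
    return (value >> 4) & 0xF, value & 0xF


def flag_names(flags):
    """List the names of the single-bit members set in an IntFlag value."""
    return [member.name for member in type(flags) if member in flags]


print(decode_feature_level(0xC100))         # (12, 1)
print(decode_shader_model(0x0061))          # (6, 1)
print(flag_names(ShaderCacheSupport(3)))    # ['SINGLE_PSO', 'LIBRARY']
print(flag_names(CommandListSupport(63)))
```

The last line prints all six command list types, matching the WriteBufferImmediateSupportFlags line in the report.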
     
  13. Kaotik

    So still doesn't match Intel IGPs or Vega?
     
  14. Malo

    Aren't you missing a few entries there at the bottom?
     
  15. donjulio

    That’s what the output file looks like.
     
  16. Malo

    Are you on an older Windows or tool version? Your last output file had these:

     
  17. donjulio

    Nope, same system and tool. Developer Mode enabled. I can check later what’s going on.
     
  18. pharma

    Veteran Regular

    Joined:
    Mar 29, 2004
    Messages:
    2,597
    Likes Received:
    1,338
    Is it still needed going forward?
     
  19. DmitryKo

    The D3D12CheckFeatureSupport tool has been updated to support render pass tier and raytracing tier in Windows 10 version 1809 (build 17763), as well as protected resource session, shader models 6_3 and 6_4 and experimental tiled resources tier 4.

    It should also correctly report SM 6_2 and higher on Windows 10 version 1803 and later; previously it only requested up to 6_1.
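Why the requested value matters: as I understand the D3D12 shader model query, the caller passes in the highest shader model it knows about, the runtime answers with the lower of that and what the driver supports, and a runtime that does not recognise the requested value fails the call. A tool therefore has to try from the newest model downward, and it can never report more than it asks for. The Python sketch below only simulates that behaviour; fake_check_feature_support is a hypothetical stand-in for the real call, and the runtime/driver maxima are illustrative numbers, not measurements.

```python
from typing import List, Optional

# Shader models encoded as in the reports: 0x62 means 6_2. Newest first.
UPDATED_TOOL_MODELS = [0x64, 0x63, 0x62, 0x61, 0x60, 0x51]
OLD_TOOL_MODELS = [0x61, 0x60, 0x51]


def fake_check_feature_support(requested: int, runtime_max: int,
                               driver_max: int) -> Optional[int]:
    """Hypothetical stand-in for the runtime query.

    Returns None where the real call would fail because the runtime
    does not know the requested shader model.
    """
    if requested > runtime_max:
        return None
    return min(requested, driver_max)


def highest_shader_model(tool_models: List[int], runtime_max: int,
                         driver_max: int) -> Optional[int]:
    """Probe from the newest known model downward until a query succeeds."""
    for requested in tool_models:
        result = fake_check_feature_support(requested, runtime_max, driver_max)
        if result is not None:
            return result
    return None


# Illustrative case: runtime and driver both top out at 6_2.
print(hex(highest_shader_model(OLD_TOOL_MODELS, 0x62, 0x62)))      # 0x61
print(hex(highest_shader_model(UPDATED_TOOL_MODELS, 0x62, 0x62)))  # 0x62
```

In this model the old tool under-reports simply because it never asks for anything above 6_1.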

    The new features as reported by Microsoft Basic Render Driver (WARP12):
    Code:
    Adapter Node 0:     TileBasedRenderer: 0, UMA: 1, CacheCoherentUMA: 1, IsolatedMMU: 0, HeapSerializationTier: 10, ProtectedResourceSession.Support: 0
    HighestShaderModel : D3D12_SHADER_MODEL_6_3 (0x0063)
    TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_4 (4)
    MSAA64KBAlignedTextureSupported : 1
    Native16BitShaderOpsSupported : 1
    AtomicShaderInstructions : 0
    SRVOnlyTiledResourceTier3 : 1
    RenderPassTier : D3D12_RENDER_PASS_TIER_1 (1)
    RaytracingTier : D3D12_RAYTRACING_TIER_NOT_SUPPORTED (0)
    
    Also, the MSAA64KBAlignedTextureSupported parameter supersedes ReservedBufferPlacementSupported, which was deprecated in the final release of SDK build 17134.

    Unfortunately, the Direct3D 12 documentation on Microsoft Docs is not being kept up to date and only reflects SDK version 1709 as of now...
     
    #919 DmitryKo, Oct 6, 2018
    Last edited: Oct 20, 2018
    jlippo, Malo, BRiT and 1 other person like this.
  20. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    1,314
    Likes Received:
    7
    Location:
    msk.ru/spb.ru
    Pascal / 1809 + DevMode

    Code:
    Direct3D 12 feature checker (October 2018) by DmitryKo (x64)
    https://forum.beyond3d.com/posts/1840641/
    
    Windows 10 version 1809 (build 17763.1) x64
    Checking for experimental features
    
    ADAPTER 0
    "NVIDIA GeForce GTX 1080"
    VEN_10DE, DEV_1B80, SUBSYS_1B8010DE, REV_A1
    Dedicated video memory : 8079.0 MB (8471445504 bytes)
    Total video memory : 40804.3 MB (42786373632 bytes)
    Video driver version : 25.21.14.1616
    Maximum feature level : D3D_FEATURE_LEVEL_12_1 (0xc100)
    DoublePrecisionFloatShaderOps : 1
    OutputMergerLogicOp : 1
    MinPrecisionSupport : D3D12_SHADER_MIN_PRECISION_SUPPORT_NONE (0) (0b0000'0000)
    TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_4 (4)
    ResourceBindingTier : D3D12_RESOURCE_BINDING_TIER_3 (3)
    PSSpecifiedStencilRefSupported : 0
    TypedUAVLoadAdditionalFormats : 1
    ROVsSupported : 1
    ConservativeRasterizationTier : D3D12_CONSERVATIVE_RASTERIZATION_TIER_2 (2)
    StandardSwizzle64KBSupported : 0
    CrossNodeSharingTier : D3D12_CROSS_NODE_SHARING_TIER_NOT_SUPPORTED (0)
    CrossAdapterRowMajorTextureSupported : 0
    VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation : 1
    ResourceHeapTier : D3D12_RESOURCE_HEAP_TIER_1 (1)
    MaxGPUVirtualAddressBitsPerResource : 40
    MaxGPUVirtualAddressBitsPerProcess : 40
    Adapter Node 0:    TileBasedRenderer: 0, UMA: 0, CacheCoherentUMA: 0, IsolatedMMU: 1, HeapSerializationTier: 0, ProtectedResourceSession.Support: 1
    HighestShaderModel : D3D12_SHADER_MODEL_6_1 (0x0061)
    WaveOps : 1
    WaveLaneCountMin : 32
    WaveLaneCountMax : 32
    TotalLaneCount : 2560
    ExpandedComputeResourceStates : 1
    Int64ShaderOps : 1
    RootSignature.HighestVersion : D3D_ROOT_SIGNATURE_VERSION_1_1 (2)
    DepthBoundsTestSupported : 1
    ProgrammableSamplePositionsTier : D3D12_PROGRAMMABLE_SAMPLE_POSITIONS_TIER_2 (2)
    ShaderCache.SupportFlags : D3D12_SHADER_CACHE_SUPPORT_SINGLE_PSO | LIBRARY (3) (0b0000'0011)
    CopyQueueTimestampQueriesSupported : 1
    CastingFullyTypedFormatSupported : 1
    WriteBufferImmediateSupportFlags : D3D12_COMMAND_LIST_SUPPORT_FLAG_DIRECT | BUNDLE | COMPUTE | COPY | VIDEO_DECODE | VIDEO_PROCESS (63) (0b0011'1111)
    ViewInstancingTier : D3D12_VIEW_INSTANCING_TIER_2 (2)
    BarycentricsSupported : 0
    ExistingHeaps.Supported : 1
    MSAA64KBAlignedTextureSupported : 1
    SharedResourceCompatibilityTier :  D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER_1 (1)
    Native16BitShaderOpsSupported : 0
    AtomicShaderInstructions : 0
    SRVOnlyTiledResourceTier3 : 1
    RenderPassTier :  D3D12_RENDER_PASS_TIER_0 (0)
    RaytracingTier :  D3D12_RAYTRACING_TIER_NOT_SUPPORTED (0)
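To compare this Pascal report with the Turing one posted earlier in the thread, the "Name : Value" lines can be parsed into dictionaries and diffed. The Python sketch below uses only a handful of lines copied from the two reports; parse_report and diff_reports are helper names made up for this example, and the parser assumes the " : " separator the tool prints.

```python
PASCAL_GTX_1080 = """\
Maximum feature level : D3D_FEATURE_LEVEL_12_1 (0xc100)
ConservativeRasterizationTier : D3D12_CONSERVATIVE_RASTERIZATION_TIER_2 (2)
ResourceHeapTier : D3D12_RESOURCE_HEAP_TIER_1 (1)
ViewInstancingTier : D3D12_VIEW_INSTANCING_TIER_2 (2)
BarycentricsSupported : 0
"""

TURING_RTX_2080_TI = """\
Maximum feature level : D3D_FEATURE_LEVEL_12_1 (0xc100)
ConservativeRasterizationTier : D3D12_CONSERVATIVE_RASTERIZATION_TIER_3 (3)
ResourceHeapTier : D3D12_RESOURCE_HEAP_TIER_2 (2)
ViewInstancingTier : D3D12_VIEW_INSTANCING_TIER_3 (3)
BarycentricsSupported : 0
"""


def parse_report(text):
    """Turn 'Name : Value' lines into a dict; lines without ' : ' are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def diff_reports(a, b):
    """Return {name: (value_in_a, value_in_b)} for fields present in both but different."""
    return {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}


changes = diff_reports(parse_report(PASCAL_GTX_1080), parse_report(TURING_RTX_2080_TI))
for name, (pascal, turing) in changes.items():
    print(f"{name}: {pascal} -> {turing}")
```

On these excerpts the differences are the conservative rasterization, resource heap and view instancing tiers, in line with the Turing comments above.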
     
    pharma and fellix like this.
