Direct3D feature levels discussion

Radeon R9 380X

Same result in experimental mode.
I think we will never see FP16 again on pre-Vega GPUs.
I also do not understand why AMD still does not expose the barycentric semantic in D3D12, since SM 6.1 is now supported (on D3D11, OpenGL and Vulkan all GCN GPUs should support it).
FP16 is supposed to be available on Polaris (same speed as FP32) and Vega (twice the speed via RPM). The R9 380X is GCN3.
 
According to AnandTech, even first-iteration Tonga (R9 285) had FP16 instructions. No mention, though, to what extent. The slide says: "New 16 bit floating point and integer instructions for low power GPU compute and media processing."
https://www.anandtech.com/show/8460/amd-radeon-r9-285-review/2
The slide in question:
https://images.anandtech.com/doci/8460/GCN12ISA.png

I need to dig up my older DxDiag logs, but IIRC there were drivers which reported back "16/32 Bit" for the minimum precision query even on GCN 1.2 / GCN gen 3 (which is the same, depending on where you start counting) Tonga and Fiji. Not 2× speed, of course.
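
For reference, the "minimum precision" report and true FP16 are separate caps in D3D12. A minimal sketch of checking both, assuming an already-created ID3D12Device and SDK headers new enough to define D3D12_FEATURE_D3D12_OPTIONS4 (this is not the tool's own code):

Code:
#include <d3d12.h>
#include <cstdio>

// Prints the min-precision hint (the "16/32 Bit" report) and the separate
// native-FP16 cap that was added later for SM 6.2 16-bit types.
void PrintFp16Caps(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS,
                                              &options, sizeof(options))))
    {
        // D3D12_SHADER_MIN_PRECISION_SUPPORT_16_BIT means min16float *may* run at FP16
        printf("MinPrecisionSupport           : %d\n", options.MinPrecisionSupport);
    }

    D3D12_FEATURE_DATA_D3D12_OPTIONS4 options4 = {};
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS4,
                                              &options4, sizeof(options4))))
    {
        // TRUE only when real 16-bit types (SM 6.2, -enable-16bit-types) are exposed
        printf("Native16BitShaderOpsSupported : %d\n", options4.Native16BitShaderOpsSupported);
    }
}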
 
GCN3/GCN4 FP16 was useful for register pressure even though it ran at the same rate as FP32 (you could pack two FP16 elements into a register). I think they also lacked some of the FP conversion instructions Vega supports (I can't recall and I'm too lazy to check the ISA), which further limited their usefulness in situations involving mixed precision. All in all, I think that in practical terms the "FP16 support" on those cards wasn't useful. The planets really had to align to show a benefit for GCN3/4. It's probably why AMD "retired" support in drivers - more trouble than it was worth.
 
It would be nice to know if BarycentricsSupported is reported with driver 23.20.768.12 on newer Polaris or Vega cards, similar to what is exposed on Vulkan.
I also do not understand why AMD still does not expose the barycentric semantic in D3D12, since SM 6.1 is now supported (on D3D11, OpenGL and Vulkan all GCN GPUs should support it).

It looks like barycentric intrinsics are not enabled by default in the Catalyst driver - you have to use the AMD GPU Services (AGS) library to enable the video driver extensions and then check for AGS_DX12_EXTENSION_INTRINSIC_BARYCENTRICS (see the sketch after the links below):

https://gpuopen.com/gaming-product/barycentrics12-dx12-gcnshader-ext-sample/
https://github.com/GPUOpen-LibrariesAndSDKs/Barycentrics12
https://gpuopen-librariesandsdks.github.io/ags/amd__ags_8h.html
https://github.com/GPUOpen-LibrariesAndSDKs/AGS_SDK
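
A minimal sketch of that AGS path, loosely following the Barycentrics12 sample linked above. It assumes the AGS 5.x header (amd_ags.h) where extensionsSupported is a plain bitmask; struct and field names differ in newer AGS releases, so treat this as an outline rather than copy-paste code:

Code:
#include <amd_ags.h>
#include <d3d12.h>
#include <dxgi1_4.h>

// Creates the device through AGS so the driver extensions get enabled,
// then checks the barycentrics extension bit.
bool HasBarycentricIntrinsics(IDXGIAdapter* adapter)
{
    AGSContext* ags = nullptr;
    AGSGPUInfo gpuInfo = {};
    if (agsInit(&ags, nullptr, &gpuInfo) != AGS_SUCCESS)
        return false;                                  // not an AMD driver / AGS unavailable

    AGSDX12DeviceCreationParams creation = {};
    creation.pAdapter     = adapter;
    creation.iid          = __uuidof(ID3D12Device);
    creation.FeatureLevel = D3D_FEATURE_LEVEL_11_0;

    AGSDX12ExtensionParams extensions = {};            // optional app/engine name + version
    AGSDX12ReturnedParams  returned   = {};

    bool supported = false;
    if (agsDriverExtensionsDX12_CreateDevice(ags, &creation, &extensions, &returned) == AGS_SUCCESS)
    {
        supported = (returned.extensionsSupported & AGS_DX12_EXTENSION_INTRINSIC_BARYCENTRICS) != 0;
        unsigned int refs = 0;
        agsDriverExtensionsDX12_DestroyDevice(ags, returned.pDevice, &refs);
    }
    agsDeInit(ags);
    return supported;
}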
 
I don't know if it makes sense, but I couldn't find the data for the Intel Core with Radeon RX Vega M Graphics. So here we go:

Windows 10 version 1803 (build 17101.1000) x64

ADAPTER 0
"AMD Radeon R9 200 Series"
VEN_1002, DEV_67B0, SUBSYS_30801462, REV_00
Dedicated video memory : 4072.4 MB (4270227456 bytes)
Total video memory : 12207.5 MB (12800466944 bytes)
Video driver version : 23.20.15017.4003
Maximum feature level : D3D_FEATURE_LEVEL_12_0 (0xc000)
DoublePrecisionFloatShaderOps : 1
OutputMergerLogicOp : 1
MinPrecisionSupport : D3D12_SHADER_MIN_PRECISION_SUPPORT_NONE (0) (0b0000'0000)
TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_2 (2)
ResourceBindingTier : D3D12_RESOURCE_BINDING_TIER_3 (3)
PSSpecifiedStencilRefSupported : 1
TypedUAVLoadAdditionalFormats : 1
ROVsSupported : 0
ConservativeRasterizationTier : D3D12_CONSERVATIVE_RASTERIZATION_TIER_NOT_SUPPORTED (0)
StandardSwizzle64KBSupported : 0
CrossNodeSharingTier : D3D12_CROSS_NODE_SHARING_TIER_NOT_SUPPORTED (0)
CrossAdapterRowMajorTextureSupported : 0
VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation : 1
ResourceHeapTier : D3D12_RESOURCE_HEAP_TIER_2 (2)
MaxGPUVirtualAddressBitsPerResource : 40
MaxGPUVirtualAddressBitsPerProcess : 40
Adapter Node 0: TileBasedRenderer: 0, UMA: 0, CacheCoherentUMA: 0, IsolatedMMU: 1, HeapSerializationTier: 0
HighestShaderModel : D3D12_SHADER_MODEL_5_1 (0x0051)
WaveOps : 0
WaveLaneCountMin : 64
WaveLaneCountMax : 64
TotalLaneCount : 2816
ExpandedComputeResourceStates : 1
Int64ShaderOps : 0
RootSignature.HighestVersion : D3D_ROOT_SIGNATURE_VERSION_1_1 (2)
DepthBoundsTestSupported : 1
ProgrammableSamplePositionsTier : D3D12_PROGRAMMABLE_SAMPLE_POSITIONS_TIER_2 (2)
ShaderCache.SupportFlags : D3D12_SHADER_CACHE_SUPPORT_SINGLE_PSO | LIBRARY | AUTOMATIC_INPROC_CACHE | AUTOMATIC_DISK_CACHE (15) (0b0000'1111)
CopyQueueTimestampQueriesSupported : 1
CastingFullyTypedFormatSupported : 1
WriteBufferImmediateSupportFlags : D3D12_COMMAND_LIST_SUPPORT_FLAG_DIRECT | BUNDLE | COMPUTE | COPY (15) (0b0000'1111)
ViewInstancingTier : D3D12_VIEW_INSTANCING_TIER_NOT_SUPPORTED (0)
BarycentricsSupported : 0
ExistingHeaps.Supported : 1
ReservedBufferPlacementSupported : 0
SharedResourceCompatibilityTier : D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER_0 (0)
Native16BitShaderOpsSupported : 0
AtomicShaderInstructions : 0
 
Definitely doesn't seem to be the Vega ISA, thereby confirming it's based on Polaris?
 
Resource Heap Tier 2 yeah:
Page 60 said:
DX12 introduced the ability to allow resource views to be directly accessed by shader programs without requiring an explicit resource binding step.
Turing extends our resource support to include bindless Constant Buffer Views and Unordered Access Views, as defined in Tier 3 of DX12’s Resource Binding Specification.
Turing’s more flexible memory model also allows for multiple different resource types (such as textures and vertex buffers) to be co-located within the same heap, simplifying aspects of memory management for the app.
Turing supports Tier 2 of resource heaps.
https://www.nvidia.com/content/dam/...ure/NVIDIA-Turing-Architecture-Whitepaper.pdf

In comparison to Pascal, Conservative Rasterization Tier 3 should also be in now.
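
For what it's worth, all three tiers mentioned here are reported through the same D3D12_FEATURE_D3D12_OPTIONS query the feature checker uses; a minimal sketch, assuming an existing ID3D12Device:

Code:
#include <d3d12.h>
#include <cstdio>

// Prints the binding, heap and conservative rasterization tiers discussed above.
void PrintTierCaps(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS,
                                           &options, sizeof(options))))
        return;

    printf("ResourceBindingTier           : %d\n", options.ResourceBindingTier);
    printf("ResourceHeapTier              : %d\n", options.ResourceHeapTier);
    printf("ConservativeRasterizationTier : %d\n", options.ConservativeRasterizationTier);
}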
 
Turing/TU102:

"NVIDIA GeForce RTX 2080 Ti"
VEN_10DE, DEV_1E07, SUBSYS_12A410DE, REV_A1
Dedicated video memory : 11048.0 MB (11584667648 bytes)
Total video memory : 27368.7 MB (28698128384 bytes)
Video driver version : 24.21.14.1151
Maximum feature level : D3D_FEATURE_LEVEL_12_1 (0xc100)
DoublePrecisionFloatShaderOps : 1
OutputMergerLogicOp : 1
MinPrecisionSupport : D3D12_SHADER_MIN_PRECISION_SUPPORT_NONE (0) (0b0000'0000)
TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_3 (3)
ResourceBindingTier : D3D12_RESOURCE_BINDING_TIER_3 (3)
PSSpecifiedStencilRefSupported : 0
TypedUAVLoadAdditionalFormats : 1
ROVsSupported : 1
ConservativeRasterizationTier : D3D12_CONSERVATIVE_RASTERIZATION_TIER_3 (3)
StandardSwizzle64KBSupported : 0
CrossNodeSharingTier : D3D12_CROSS_NODE_SHARING_TIER_NOT_SUPPORTED (0)
CrossAdapterRowMajorTextureSupported : 0
VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation : 1
ResourceHeapTier : D3D12_RESOURCE_HEAP_TIER_2 (2)
MaxGPUVirtualAddressBitsPerResource : 40
MaxGPUVirtualAddressBitsPerProcess : 40
Adapter Node 0: TileBasedRenderer: 0, UMA: 0, CacheCoherentUMA: 0, IsolatedMMU: 1
HighestShaderModel : D3D12_SHADER_MODEL_6_1 (0x0061)
WaveOps : 1
WaveLaneCountMin : 32
WaveLaneCountMax : 32
TotalLaneCount : 4352
ExpandedComputeResourceStates : 1
Int64ShaderOps : 1
RootSignature.HighestVersion : D3D_ROOT_SIGNATURE_VERSION_1_1 (2)
DepthBoundsTestSupported : 1
ProgrammableSamplePositionsTier : D3D12_PROGRAMMABLE_SAMPLE_POSITIONS_TIER_2 (2)
ShaderCache.SupportFlags : D3D12_SHADER_CACHE_SUPPORT_SINGLE_PSO | LIBRARY (3) (0b0000'0011)
CopyQueueTimestampQueriesSupported : 1
CastingFullyTypedFormatSupported : 1
WriteBufferImmediateSupportFlags : D3D12_COMMAND_LIST_SUPPORT_FLAG_DIRECT | BUNDLE | COMPUTE | COPY | VIDEO_DECODE | VIDEO_PROCESS (63) (0b0011'1111)
ViewInstancingTier : D3D12_VIEW_INSTANCING_TIER_3 (3)
BarycentricsSupported : 0
ExistingHeaps.Supported : 1
 
That’s what the output file looks like.
Are you on an older Windows or tool version? Your last output file had these:

ReservedBufferPlacementSupported : 0
SharedResourceCompatibilityTier : D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER_0 (0)
Native16BitShaderOpsSupported : 0
AtomicShaderInstructions : 0
 
The D3D12CheckFeatureSupport tool has been updated to support render pass tier and raytracing tier in Windows 10 version 1809 (build 17763), as well as protected resource session, shader models 6_3 and 6_4, and experimental tiled resources tier 4.

It should also correctly report SM 6_2 and higher on version 1803 and above - previously it was only requesting up to 6_1 (see the sketch below).
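
For anyone reproducing that: CheckFeatureSupport(D3D12_FEATURE_SHADER_MODEL) fails with E_INVALIDARG when the requested model is unknown to the installed runtime, so the query has to start at the highest model the headers define and fall back. A minimal sketch (not the tool's actual code; the topmost entry depends on your SDK headers):

Code:
#include <d3d12.h>

// Returns the highest shader model the runtime/driver pair reports.
D3D_SHADER_MODEL QueryHighestShaderModel(ID3D12Device* device)
{
    const D3D_SHADER_MODEL candidates[] = {
        D3D_SHADER_MODEL_6_4, D3D_SHADER_MODEL_6_3, D3D_SHADER_MODEL_6_2,
        D3D_SHADER_MODEL_6_1, D3D_SHADER_MODEL_6_0, D3D_SHADER_MODEL_5_1 };

    for (D3D_SHADER_MODEL model : candidates)
    {
        D3D12_FEATURE_DATA_SHADER_MODEL data = { model };
        if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_SHADER_MODEL,
                                                  &data, sizeof(data))))
            return data.HighestShaderModel;   // runtime lowers this to what the driver supports
    }
    return D3D_SHADER_MODEL_5_1;
}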

The new features as reported by Microsoft Basic Render Driver (WARP12):
Code:
Adapter Node 0:     TileBasedRenderer: 0, UMA: 1, CacheCoherentUMA: 1, IsolatedMMU: 0, HeapSerializationTier: 10, ProtectedResourceSession.Support: 0
HighestShaderModel : D3D12_SHADER_MODEL_6_3 (0x0063)
TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_4 (4)
MSAA64KBAlignedTextureSupported : 1
Native16BitShaderOpsSupported : 1
AtomicShaderInstructions : 0
SRVOnlyTiledResourceTier3 : 1
RenderPassTier : D3D12_RENDER_PASS_TIER_1 (1)
RaytracingTier : D3D12_RAYTRACING_TIER_NOT_SUPPORTED (0)

Also, the MSAA64KBAlignedTextureSupported parameter supersedes ReservedBufferPlacementSupported, which was deprecated in the final release of SDK build 17134.

Unfortunately, the Direct3D 12 documentation on Microsoft Docs is not being kept up to date and only reflects SDK version 1709 as of now...
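
Until the docs catch up, this is roughly how the new caps are surfaced through the public API: a minimal sketch querying D3D12_FEATURE_D3D12_OPTIONS5, which needs the 17763 SDK headers and simply fails on older runtimes (note the struct field is RenderPassesTier even though the tool prints "RenderPassTier"):

Code:
#include <d3d12.h>
#include <cstdio>

// Prints the render pass and raytracing tiers plus the SRV-only tiled resources cap.
void PrintOptions5(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS5 options5 = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS5,
                                           &options5, sizeof(options5))))
    {
        printf("D3D12_FEATURE_D3D12_OPTIONS5 not recognized by this runtime\n");
        return;
    }

    printf("SRVOnlyTiledResourceTier3 : %d\n", options5.SRVOnlyTiledResourceTier3);
    printf("RenderPassTier            : %d\n", options5.RenderPassesTier);
    printf("RaytracingTier            : %d\n", options5.RaytracingTier);
}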
 
Pascal / 1809 + DevMode

Code:
Direct3D 12 feature checker (October 2018) by DmitryKo (x64)
https://forum.beyond3d.com/posts/1840641/

Windows 10 version 1809 (build 17763.1) x64
Checking for experimental features

ADAPTER 0
"NVIDIA GeForce GTX 1080"
VEN_10DE, DEV_1B80, SUBSYS_1B8010DE, REV_A1
Dedicated video memory : 8079.0 MB (8471445504 bytes)
Total video memory : 40804.3 MB (42786373632 bytes)
Video driver version : 25.21.14.1616
Maximum feature level : D3D_FEATURE_LEVEL_12_1 (0xc100)
DoublePrecisionFloatShaderOps : 1
OutputMergerLogicOp : 1
MinPrecisionSupport : D3D12_SHADER_MIN_PRECISION_SUPPORT_NONE (0) (0b0000'0000)
TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_4 (4)
ResourceBindingTier : D3D12_RESOURCE_BINDING_TIER_3 (3)
PSSpecifiedStencilRefSupported : 0
TypedUAVLoadAdditionalFormats : 1
ROVsSupported : 1
ConservativeRasterizationTier : D3D12_CONSERVATIVE_RASTERIZATION_TIER_2 (2)
StandardSwizzle64KBSupported : 0
CrossNodeSharingTier : D3D12_CROSS_NODE_SHARING_TIER_NOT_SUPPORTED (0)
CrossAdapterRowMajorTextureSupported : 0
VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation : 1
ResourceHeapTier : D3D12_RESOURCE_HEAP_TIER_1 (1)
MaxGPUVirtualAddressBitsPerResource : 40
MaxGPUVirtualAddressBitsPerProcess : 40
Adapter Node 0:    TileBasedRenderer: 0, UMA: 0, CacheCoherentUMA: 0, IsolatedMMU: 1, HeapSerializationTier: 0, ProtectedResourceSession.Support: 1
HighestShaderModel : D3D12_SHADER_MODEL_6_1 (0x0061)
WaveOps : 1
WaveLaneCountMin : 32
WaveLaneCountMax : 32
TotalLaneCount : 2560
ExpandedComputeResourceStates : 1
Int64ShaderOps : 1
RootSignature.HighestVersion : D3D_ROOT_SIGNATURE_VERSION_1_1 (2)
DepthBoundsTestSupported : 1
ProgrammableSamplePositionsTier : D3D12_PROGRAMMABLE_SAMPLE_POSITIONS_TIER_2 (2)
ShaderCache.SupportFlags : D3D12_SHADER_CACHE_SUPPORT_SINGLE_PSO | LIBRARY (3) (0b0000'0011)
CopyQueueTimestampQueriesSupported : 1
CastingFullyTypedFormatSupported : 1
WriteBufferImmediateSupportFlags : D3D12_COMMAND_LIST_SUPPORT_FLAG_DIRECT | BUNDLE | COMPUTE | COPY | VIDEO_DECODE | VIDEO_PROCESS (63) (0b0011'1111)
ViewInstancingTier : D3D12_VIEW_INSTANCING_TIER_2 (2)
BarycentricsSupported : 0
ExistingHeaps.Supported : 1
MSAA64KBAlignedTextureSupported : 1
SharedResourceCompatibilityTier :  D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER_1 (1)
Native16BitShaderOpsSupported : 0
AtomicShaderInstructions : 0
SRVOnlyTiledResourceTier3 : 1
RenderPassTier :  D3D12_RENDER_PASS_TIER_0 (0)
RaytracingTier :  D3D12_RAYTRACING_TIER_NOT_SUPPORTED (0)
 