"ACE can operate in parallel with the Graphics CP, i.e. the ACEs have access to the shader engines with the rasterizer via Graphics CP >> GDS for mixed mode."
Compute pipelines have no access to graphics states.
"Ryan, are you planning on writing an article regarding feature level support of the various D3D12 supported architectures? It seems almost nobody outside of this forum even knows that D3D12 support != D3D12 feature level compliance. I think it'd get a lot of hits."
Once I have some confirmed information out of the various GPU vendors, yes. Right now everyone is being very hush-hush; I cannot get confirmation on anything except that Maxwell 2 is FL12_1.
"AMD will update the driver later to implement support for tiled resources tier 3 on DX12, because GCN1.0 supports Texture3D as well."
I don't follow the logic.
"Software Rasterization was planned for GCN1.0, but AMD can use ACE for Hardware Conservative Rasterization and ROVs."
Even if Conservative Rasterization and Rasterizer Ordered Views can be implemented with shaders, I honestly don't see how it is possible to program the fixed-function rasterizer stage from the compute units...
"ACE can operate in parallel with the Graphics CP, i.e. the ACEs have access to the shader engines with the rasterizer via Graphics CP >> GDS for mixed mode."
ACE is a dedicated scheduler which operates independently of the CPU host (i.e. ExecuteIndirect and asynchronous Render/Compute/Copy).
"Ryan, are you planning on writing an article regarding feature level support of the various D3D12 supported architectures? It seems almost nobody outside of this forum even knows that D3D12 support != D3D12 feature level compliance. I think it'd get a lot of hits."
Nobody even knows about feature levels in Direct3D 11, or the fact that Windows Phone 8.x hardware is mostly feature level 9_1/9_3 and very rarely 10_1 or 11_0, so it's actually Direct3D 11.2 on top of feature level 9_1... guess which part is cited in most reviews? WDDM 2.0 now supports virtual CPU memory address space, which should be quite common on mobile GPU parts that lack their own dedicated graphics memory. This means that Windows Phone 10 devices could have WDDM 2.0 drivers and Direct3D 12 on top of feature level 9_1... my brain already hurts over this proposition.
"ACE can operate in parallel with the Graphics CP, i.e. the ACEs have access to the shader engines with the rasterizer via Graphics CP >> GDS for mixed mode."
Operating in parallel is one thing. Accessing graphics state is another thing, or why do you think the compute queues have to be separated from the graphics queue in the multi-engine model? Yeah, ACEs have access to the same set of shader engines, but rasterisers? I wouldn't be so sure.
I gave some of the info in my presentation at GDC: https://software.intel.com/sites/de...ndering-with-DirectX-12-on-Intel-Graphics.pdf
Note that what the driver returns at this point is fairly arbitrary across all implementations. That's literally just querying caps bits that the driver sets; it's not as if it's testing the features or anything, so there are cases where something that will be supported just isn't flicked on yet, and other cases where things are set that may not even work yet. So while the stuff posted so far looks roughly accurate for those architectures (obviously tiled resources is not correct for GCN), do take it all with a grain of salt at this point.
Anyways for Haswell/Broadwell it's roughly:
- Feature level 11_1
- Tier 1 binding
- ROVs, doubles, OM logic ops are supported
- No conservative raster, additional typed UAV formats, standard swizzle or ASTC
- Half precision (fp16) is supported on Broadwell, but not Haswell
Stuff I don't remember for sure off the top of my head:
- I believe both will ultimately support Tier 1 tiled resources, but may not be reported yet
- PS specified stencil ref is probably not supported on Haswell, don't remember if it is on Broadwell
In any case it's the basic set of ~DX11-level features + ROVs on Haswell/Broadwell. Those architectures obviously predate the interesting design changes in DX12, so it's more a question of fitting the new API onto existing hardware than designing hardware for the new API (e.g. see what we have to do with resource binding in the presentation above). Definitely stay tuned and check again in the near future once new architectures come out.
It looks like GCN 1.1/1.2 and Xbox One do support feature level 12_0 - i.e. at least Resource Binding Tier 2, Tiled Resources Tier 2, and Typed UAV Load with additional formats!
ADAPTER 0
"AMD Radeon R9 200 Series (Engineering Sample - WDDM v2.0)"
VEN_1002, DEV_67B0, SUBSYS_30801462, REV_00
Dedicated video memory : 3221225472 bytes
Direct3D 12 is supported
Maximum feature level : D3D_FEATURE_LEVEL_11_1 (0xb100)
DoublePrecisionFloatShaderOps : 1
OutputMergerLogicOp : 1
MinPrecisionSupport : D3D12_SHADER_MIN_PRECISION_NONE (0)
TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_2 (2)
ResourceBindingTier : D3D12_RESOURCE_BINDING_TIER_3 (3)
PSSpecifiedStencilRefSupported : 1
TypedUAVLoadAdditionalFormats : 1
ROVsSupported : 0
ConservativeRasterizationTier : D3D12_CONSERVATIVE_RASTERIZATION_NOT_SUPPORTED (0)
MaxGPUVirtualAddressBitsPerResource : 38
StandardSwizzle64KBSupported : 0
ASTCProfile : D3D12_ASTC_PROFILE_NOT_SUPPORTED (0)
CrossNodeSharingTier : D3D12_CROSS_NODE_SHARING_NOT_SUPPORTED (0)
CrossAdapterRowMajorTextureSupported : 0
Adapter Node 0: TileBasedRenderer: 0, UMA: 0, CacheCoherentUMA: 0
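The three minimums named above for feature level 12_0 can be checked mechanically against a dump like this one. A minimal Python sketch, assuming the checker's `Name : VALUE (n)` line layout; `parse_caps` and `meets_fl12_0_minimums` are hypothetical helper names, and the thresholds are only the ones listed in the post, not the full Direct3D 12 requirements:

```python
import re

# A few lines copied from the dump above.
DUMP = """\
TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_2 (2)
ResourceBindingTier : D3D12_RESOURCE_BINDING_TIER_3 (3)
TypedUAVLoadAdditionalFormats : 1
ROVsSupported : 0
"""

# Matches "Name : SOMETHING (n)" or "Name : n"; other line shapes are skipped.
LINE_RE = re.compile(r"^(\w+)\s*:\s*(?:.*\((\d+)\)|(\d+))\s*$")


def parse_caps(text):
    """Turn checker output lines into a {name: integer value} dict."""
    caps = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            caps[match.group(1)] = int(match.group(2) or match.group(3))
    return caps


# Assumption: only the minimums mentioned in the post, not the whole spec.
FL12_0_MINIMUMS = {
    "ResourceBindingTier": 2,
    "TiledResourcesTier": 2,
    "TypedUAVLoadAdditionalFormats": 1,
}


def meets_fl12_0_minimums(caps):
    """True if every listed cap is reported at or above its minimum."""
    return all(caps.get(name, 0) >= minimum
               for name, minimum in FL12_0_MINIMUMS.items())


caps = parse_caps(DUMP)
print(meets_fl12_0_minimums(caps))
```

Note that this only compares reported caps bits. The driver in the dump still reports 11_1 as its maximum feature level, so passing this check says nothing about what the runtime will actually create.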
You're talking about your R9 290X, but the video memory is 3221225472 bytes. Is this a 280X?
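The arithmetic behind that question is quick to check. A small Python sketch (the helper name `bytes_to_gib` is made up for illustration) that converts the reported dedicated video memory to GiB:

```python
def bytes_to_gib(n_bytes):
    """Convert a byte count to binary gigabytes (1 GiB = 2**30 bytes)."""
    return n_bytes / 2**30


reported = 3221225472  # "Dedicated video memory" value from the dump above
print(bytes_to_gib(reported))  # 3.0
```

The reported figure is exactly 3 GiB.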
The tool uses D3DKMTQueryAdapterInfo, IDXGIAdapter4::GetDesc3, D3D12EnableExperimentalFeatures, ID3D12Device::CheckFeatureSupport and ID3D12Device5::EnumerateMetaCommands to check the supported Direct3D 12 options for every graphics adapter in the system.
Copy d3d12core.dll from the bin\x64 (or bin\arm64) folder of the NuGet package to the D3D12\ subfolder. Supported on Windows 10 version 1909 (build 18363.1350) and later. Enable the Developer mode in Windows Settings - Update & Security - For developers. Run the Agile build of the tool with checkfeatures_agile.cmd.
Copy d3d10warp.dll from the bin\x64 (or bin\arm64) folder to the tool folder.
D3D12_ERROR_INVALID_REDIST - download and install the required version of the Direct3D 12 Agility SDK runtime.
Direct3D 12 feature checker (March 2024) by DmitryKo (x64) (Agility SDK v613)
https://forum.beyond3d.com/posts/1840641/
Windows 10 version Dev (build 21390.2025 co_release) x64
Checking for experimental features SM6
ADAPTER 0
"AMD Radeon RX 7800 XT"
VEN_1002, DEV_747E, SUBSYS_53271849, REV_C8
Dedicated video memory : 16172.1 MB (16957689856 bytes)
Total video memory : 32518.2 MB (34097752064 bytes)
BIOS string : 113-APM6766SL-100
Video driver version : 31.0.24014.1006
WDDM version : KMT_DRIVERVERSION_WDDM_3_0 (3000)
Virtual memory model : GPUMMU
Hardware-accelerated scheduler : Disabled, DXGK_FEATURE_SUPPORT_EXPERIMENTAL (1)
GraphicsPreemptionGranularity : DXGI_GRAPHICS_PREEMPTION_PRIMITIVE_BOUNDARY (1)
ComputePreemptionGranularity : DXGI_COMPUTE_PREEMPTION_DMA_BUFFER_BOUNDARY (0)
Maximum feature level : D3D_FEATURE_LEVEL_12_2 (0xc200)
DoublePrecisionFloatShaderOps : 1
OutputMergerLogicOp : 1
MinPrecisionSupport : D3D12_SHADER_MIN_PRECISION_SUPPORT_16_BIT (2) (0b0000'0010)
TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_4 (4)
ResourceBindingTier : D3D12_RESOURCE_BINDING_TIER_3 (3)
PSSpecifiedStencilRefSupported : 1
TypedUAVLoadAdditionalFormats : 1
ROVsSupported : 1
ConservativeRasterizationTier : D3D12_CONSERVATIVE_RASTERIZATION_TIER_3 (3)
StandardSwizzle64KBSupported : 0
CrossNodeSharingTier : D3D12_CROSS_NODE_SHARING_TIER_NOT_SUPPORTED (0)
CrossAdapterRowMajorTextureSupported : 0
VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation : 1
ResourceHeapTier : D3D12_RESOURCE_HEAP_TIER_2 (2)
MaxGPUVirtualAddressBitsPerResource : 47
MaxGPUVirtualAddressBitsPerProcess : 48
Adapter Node 0: TileBasedRenderer: 0, UMA: 0, CacheCoherentUMA: 0, IsolatedMMU: 1, HeapSerializationTier: 0, ProtectedResourceSession.Support: 1, ProtectedResourceSessionTypeCount: 1 D3D12_PROTECTED_RESOURCES_SESSION_HARDWARE_PROTECTED
HighestShaderModel : D3D12_SHADER_MODEL_6_9 (0x0069)
WaveOps : 1
WaveLaneCountMin : 32
WaveLaneCountMax : 64
TotalLaneCount : 3840
ExpandedComputeResourceStates : 1
Int64ShaderOps : 1
RootSignature.HighestVersion : D3D_ROOT_SIGNATURE_VERSION_1_2 (3)
DepthBoundsTestSupported : 1
ProgrammableSamplePositionsTier : D3D12_PROGRAMMABLE_SAMPLE_POSITIONS_TIER_2 (2)
ShaderCache.SupportFlags : D3D12_SHADER_CACHE_SUPPORT_SINGLE_PSO | LIBRARY | AUTOMATIC_INPROC_CACHE | AUTOMATIC_DISK_CACHE | DRIVER_MANAGED_CACHE | SHADER_CONTROL_CLEAR | SHADER_SESSION_DELETE (127) (0b0111'1111)
CopyQueueTimestampQueriesSupported : 1
CastingFullyTypedFormatSupported : 1
WriteBufferImmediateSupportFlags : D3D12_COMMAND_LIST_SUPPORT_FLAG_DIRECT | BUNDLE | COMPUTE | COPY (15) (0b0000'1111)
ViewInstancingTier : D3D12_VIEW_INSTANCING_TIER_1 (1)
BarycentricsSupported : 1
ExistingHeaps.Supported : 1
MSAA64KBAlignedTextureSupported : 1
SharedResourceCompatibilityTier : D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER_2 (2)
Native16BitShaderOpsSupported : 1
AtomicShaderInstructions : 0
SRVOnlyTiledResourceTier3 : 1
RenderPassesTier : D3D12_RENDER_PASS_TIER_0 (0)
RaytracingTier : D3D12_RAYTRACING_TIER_1_1 (11)
AdditionalShadingRatesSupported : 0
PerPrimitiveShadingRateSupportedWithViewportIndexing : 1
VariableShadingRateTier : D3D12_VARIABLE_SHADING_RATE_TIER_2 (2)
ShadingRateImageTileSize : 8
BackgroundProcessingSupported : 1
MeshShaderTier : D3D12_MESH_SHADER_TIER_1 (10)
SamplerFeedbackTier : D3D12_SAMPLER_FEEDBACK_TIER_1_0 (100)
UnalignedBlockTexturesSupported : 1
MeshShaderPipelineStatsSupported : 1
MeshShaderSupportsFullRangeRenderTargetArrayIndex : 0
AtomicInt64OnTypedResourceSupported : 1
AtomicInt64OnGroupSharedSupported : 1
DerivativesInMeshAndAmplificationShadersSupported : 0
WaveMMATier : D3D12_WAVE_MMA_TIER_1_0 (10)
VariableRateShadingSumCombinerSupported : 1
MeshShaderPerPrimitiveShadingRateSupported : 0
AtomicInt64OnDescriptorHeapResourceSupported : 1
DisplayableTexture : 0
DisplayableTexture.SharedResourceCompatibilityTier : D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER_0 (0)
MSPrimitivesPipelineStatisticIncludesCulledPrimitives : 0
EnhancedBarriersSupported : 1
RelaxedFormatCastingSupported : 1
UnrestrictedBufferTextureCopyPitchSupported : 1
UnrestrictedVertexElementAlignmentSupported : 1
InvertedViewportHeightFlipsYSupported : 1
InvertedViewportDepthFlipsZSupported : 1
TextureCopyBetweenDimensionsSupported : 1
AlphaBlendFactorSupported : 1
AdvancedTextureOpsSupported : 1
WriteableMSAATexturesSupported : 1
IndependentFrontAndBackStencilRefMaskSupported : 1
TriangleFanSupported : 1
DynamicIndexBufferStripCutSupported : 1
DynamicDepthBiasSupported : 1
GPUUploadHeapSupported : 1
NonNormalizedCoordinateSamplersSupported : 1
ManualWriteTrackingResourceSupported : 0
RenderPassesValid : 1
MismatchingOutputDimensionsSupported : 1
SupportedSampleCountsWithNoOutputs : 29
PointSamplingAddressesNeverRoundUp : 1
RasterizerDesc2Supported : 1
NarrowQuadrilateralLinesSupported : 1
AnisoFilterWithPointMipSupported : 1
MaxSamplerDescriptorHeapSize : 67108864
MaxSamplerDescriptorHeapSizeWithStaticSamplers : 67108864
MaxViewDescriptorHeapSize : 33554432
ComputeOnlyCustomHeapSupported : 0
ComputeOnlyWriteWatchSupported : 1
RecreateAtTier : D3D12_RECREATE_AT_TIER_NOT_SUPPORTED (0)
WorkGraphsTier : D3D12_WORK_GRAPHS_TIER_1_0 (10)
ExecuteIndirectTier : D3D12_EXECUTE_INDIRECT_TIER_1_0 (10)
SampleCmpGradientAndBiasSupported : 1
ExtendedCommandInfoSupported : 1
Metacommands enumerated : 9
Metacommands [parameters per stage]: Conv (Convolution) [84][1][6], Conv (Convolution) [108][5][6], GEMM (General matrix multiply) [67][1][6], GEMM (General matrix multiply) [91][5][6], GEMM (General matrix multiply) [91][5][6], DSTORAGE [4][0][11], MHA (Multi-Head Attention) [299][13][16], MVN (Mean Variance Normalization) [91][5][6], MVN (Mean Variance Normalization) [67][1][6]
Wave Matrix Multiply Accumulate
[M]x[N] TYPE -> [K] TYPE
16x16 BYTE -> x16 INT32 (1)
16x16 FLOAT16 -> x16 FLOAT16, FLOAT (6)
16x16 FLOAT -> x16 NONE (0)
WaveMMA operations supported : 3
Not implemented: D3D12_FEATURE_PREDICATION | HARDWARE_COPY
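Several fields in this dump are bit masks, printed as a list of flag names followed by the numeric value, e.g. `WriteBufferImmediateSupportFlags ... (15)`. A minimal Python sketch of how such a value decomposes into flags; the bit positions are assumed from the order in which the dump prints the names, and `decode_flags` is a hypothetical helper:

```python
from enum import IntFlag


class CommandListSupport(IntFlag):
    # Assumption: one bit per name, in the order the dump lists them.
    DIRECT = 1
    BUNDLE = 2
    COMPUTE = 4
    COPY = 8


class ShaderCacheSupport(IntFlag):
    # Assumption: one bit per name, in the order the dump lists them.
    SINGLE_PSO = 1
    LIBRARY = 2
    AUTOMATIC_INPROC_CACHE = 4
    AUTOMATIC_DISK_CACHE = 8
    DRIVER_MANAGED_CACHE = 16
    SHADER_CONTROL_CLEAR = 32
    SHADER_SESSION_DELETE = 64


def decode_flags(flag_type, value):
    """Return the names of all flags in flag_type whose bit is set in value."""
    return [flag.name for flag in flag_type if value & flag]


print(decode_flags(CommandListSupport, 15))
print(len(decode_flags(ShaderCacheSupport, 127)))
```

With these assumed positions, 15 and 127 each have every listed bit set, which matches the names printed in the dump.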
The Direct3D 12 MSDN documentation has been updated with the release of Visual Studio 2015 RC with Windows SDK build 10069, which requires the latest Windows 10 build 10074. There are some minor changes to the API, and the SDK now defines feature levels 12_0 and 12_1.
Unfortunately many things are broken now. A Direct3D 12 device can't be created on the R9 290X anymore; creation returns E_INVALIDARG. DxCapsView performs erratically, now reporting Conservative Rasterization support (???) but only under the x64 version, not x86, and not reporting PS-Specified Stencil Ref anymore. Also, feature levels 9_x and 10_x fail to create on any adapter. So there are probably some breaking changes which require updated drivers, but maybe someone will have better luck with Haswell/Broadwell or Kepler/Maxwell.
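For reference, the hex values the checker prints next to each feature level in the dumps in this thread (`0xb100` for 11_1, `0xc200` for 12_2) carry the major and minor number in the top two hex digits. A small Python sketch of that decoding; `decode_feature_level` is an illustrative name, not part of any SDK:

```python
def decode_feature_level(value):
    """Split a feature level value such as 0xb100 into 'major_minor'."""
    major = (value >> 12) & 0xF  # top hex digit
    minor = (value >> 8) & 0xF   # second hex digit
    return f"{major}_{minor}"


for value in (0xb100, 0xc200):
    print(hex(value), decode_feature_level(value))
```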
Direct3D 12 Feature Checker (May 2015) by DmitryKo
https://forum.beyond3d.com/posts/1838269/
ADAPTER 0
"NVIDIA GeForce GTX 980"
VEN_10DE, DEV_13C0, SUBSYS_236819DA, REV_A1
Dedicated video memory : 3221225472 bytes
Total video memory : 4294901760 bytes
Created Direct3D 12 device at feature level 11_0
Maximum feature level : D3D_FEATURE_LEVEL_11_1 (0xb100)
DoublePrecisionFloatShaderOps : 1
OutputMergerLogicOp : 1
MinPrecisionSupport : D3D12_SHADER_MIN_PRECISION_SUPPORT_NONE (0)
TiledResourcesTier : D3D12_TILED_RESOURCES_TIER_3 (3)
ResourceBindingTier : D3D12_RESOURCE_BINDING_TIER_2 (2)
PSSpecifiedStencilRefSupported : 0
TypedUAVLoadAdditionalFormats : 1
ROVsSupported : 1
ConservativeRasterizationTier : D3D12_CONSERVATIVE_RASTERIZATION_TIER_1 (1)
MaxGPUVirtualAddressBitsPerResource : 38
StandardSwizzle64KBSupported : 0
CrossNodeSharingTier : D3D12_CROSS_NODE_SHARING_TIER_NOT_SUPPORTED (0)
CrossAdapterRowMajorTextureSupported : 0
VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation : 0
ResourceHeapTier : D3D12_RESOURCE_HEAP_TIER_2 (2)
Adapter Node 0: TileBasedRenderer: 0, UMA: 0, CacheCoherentUMA: 0
ADAPTER 1
"Microsoft Basic Render Driver"
VEN_1414, DEV_008C, SUBSYS_00000000, REV_00
Dedicated video memory : 0 bytes
Total video memory : 4276551680 bytes
Failed to create Direct3D 12 device at feature level 11_0
Error 887A0004: The specified device interface or feature level is not supported on this system.
FINISHED running on 2015-05-02 15:37:12
2 display adapters enumerated
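The `VEN_..., DEV_...` line under each adapter name carries the PCI vendor, device, subsystem and revision IDs in hex. A minimal Python sketch for pulling them apart; the vendor table only covers the three IDs that appear in the dumps in this thread, paired with the adapter names printed next to them, and `parse_adapter_ids` is a hypothetical helper:

```python
import re

# Only the vendor IDs seen in this thread's dumps.
VENDORS = {0x1002: "AMD", 0x10DE: "NVIDIA", 0x1414: "Microsoft"}


def parse_adapter_ids(line):
    """Parse 'VEN_xxxx, DEV_xxxx, SUBSYS_xxxxxxxx, REV_xx' into integers."""
    fields = dict(re.findall(r"(VEN|DEV|SUBSYS|REV)_([0-9A-Fa-f]+)", line))
    return {key: int(value, 16) for key, value in fields.items()}


ids = parse_adapter_ids("VEN_10DE, DEV_13C0, SUBSYS_236819DA, REV_A1")
print(VENDORS.get(ids["VEN"], "unknown"), hex(ids["DEV"]))
```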
Did you run it on build 10.0.10074 with Nvidia driver 352.63?