Direct3D feature levels discussion

You'd need non-trivial modifications to the shading language to support shaders generating descriptors (unless you want to go through memory, which defeats the purpose). AFAIK Mantle currently only supports ~HLSL with minor extensions.
HLSL doesn't need big extensions to support scalar unit registers and operations. The simplest implementation would be to add a "wave_invariant" keyword. This keyword could be used in variable declarations ( wave_invariant int4 myInteger; ) to make these variables scalar registers. Data loads to variables of this type could be optimized to use scalar loads. Assignments from vector registers would take the first lane; assignments from a scalar to a vector register would broadcast the value to all lanes. And the HLSL sampling and load instructions would of course need another overload that takes an int4 scalar (or two of them) as a parameter (to support generated resource descriptors). Simple as that. The shader would obviously only work on a specific wave width (all AMD GCN hardware is 64 wide). The same has been true of CUDA since the beginning (32-wide warps), and all optimized code assumes that. There have been no compatibility problems so far.
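
To make the proposed assignment rules concrete, here is a minimal CPU-side sketch in C++ that only simulates them. Nothing here is real HLSL or GCN code: `VectorReg`, `ScalarReg`, `to_scalar`, `to_vector` and `is_wave_invariant` are hypothetical names made up for illustration, and the model assumes all 64 lanes are active, so "first lane" is simply lane 0.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed wave width: 64 lanes, as stated in the post for AMD GCN.
constexpr std::size_t kWaveSize = 64;

// Hypothetical model of a vector register: one 32-bit value per lane.
using VectorReg = std::array<std::int32_t, kWaveSize>;

// Hypothetical model of a scalar ("wave_invariant") register: one value per wave.
struct ScalarReg {
    std::int32_t value;
};

// Read a single lane, with a bounds check on the lane index.
inline std::int32_t read_lane(const VectorReg& v, std::size_t lane) {
    assert(lane < kWaveSize);
    return v[lane];
}

// Vector -> scalar assignment: take the first lane (lane 0 in this model).
inline ScalarReg to_scalar(const VectorReg& v) {
    return ScalarReg{read_lane(v, 0)};
}

// Scalar -> vector assignment: broadcast the value to all lanes.
inline VectorReg to_vector(ScalarReg s) {
    VectorReg v;
    v.fill(s.value);
    return v;
}

// True if every lane holds the same value, i.e. the data could live in a scalar register.
inline bool is_wave_invariant(const VectorReg& v) {
    for (std::size_t lane = 1; lane < kWaveSize; ++lane) {
        if (v[lane] != v[0]) return false;
    }
    return true;
}
```

A round trip `to_vector(to_scalar(v))` keeps only lane 0 of `v`, which is why the keyword would only be safe for data that really is uniform across the wave.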

The GCN scalar unit has a full integer instruction set for 32-bit scalars, and a reduced instruction set for 64-bit scalars (no multiply, but all the necessary bitwise operations are present to construct a resource descriptor solely with scalar ALU code). I haven't programmed Mantle, so I don't know whether it allows developers to write shaders directly in GCN microcode. I haven't followed the low-level microcode details of PC AMD GPUs closely enough to say whether this would break compatibility between the different GCN versions.
 
The latest Nvidia driver (Alessio1989's link) provides support for older GPUs. As posted by someone:

WBUmd4O.png
 
@DmitryKo

About the Feature levels wiki page, I am pretty sure that all GPUs should now be able to use UAVs across all shader stages, even on FL 11.0 (Tier 1 of resource binding requires at least 8 UAV slots across all shader stages). I am not sure why this is not allowed in D3D11 (for example through a cap bit); maybe it is related to the internal API design or to WDDM 1.x.
Constant buffer offsetting and partial updates should also be allowed on all GPUs by API design.
I am not sure how many differences there are between FL 11.0 and FL 11.1 on D3D12, one is TIR (target-independent rasterization) support for sure.
 
I've already made some of the changes you proposed, and also simplified the Feature levels section in the main Direct3D article and forked all the more technical stuff into a new article: https://en.wikipedia.org/wiki/Feature_levels_in_Direct3D

Regarding UAVs in every stage, this would be quite a non-trivial change, since feature level 11_0 in Direct3D12 would be a superset of the same level in Direct3D11... which brings us back to the question whether "LEVEL_A" etc. would be a better naming scheme for Direct3D 12.

These changes need to be better explained and confirmed by Microsoft - until there are appropriate blog posts or MSDN updates, I'd leave the Wiki page as is, or we risk inciting another flame war with Nvidia fanboys...
 
To find out what the hardware can do, regardless of DX, you can check in OpenGL the image_object (2D UAVs) and storage_buffer (1D UAVs) limits per pipeline stage. Whether the driver reports what the hardware can actually do is another question. :)
 
I did check the "GL_MAX_*_IMAGE_UNIFORMS" limits from ARB/shader_image_load_store: there are 48 "combined" and 8 per stage for all Nvidia GPUs.

For ARB/shader_storage_buffer_object, GL_MAX_*_SHADER_STORAGE_BLOCKS is 96 "combined" and 16 per stage for all Nvidia GPUs. The same is 16 "combined" and 16 per stage for AMD GPUs. GL_MAX_SHADER_STORAGE_BUFFER_BINDINGS is 96 for Nvidia and 16 for AMD. GL_MAX_COMBINED_SHADER_OUTPUT_RESOURCES is 16 for Nvidia and 40 for AMD.
Not sure what you would make of all this.
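
For anyone who wants to read these limits on their own driver, here is a rough C++ sketch of the query loop. The getter is passed in as a parameter so the snippet compiles without any OpenGL headers. `LimitQuery`, `LimitResult` and `query_limits` are hypothetical helper names, and the getter is only assumed to have the same shape as `glGetIntegerv` (an enum in, an integer written through a pointer).

```cpp
#include <string>
#include <vector>

// Hypothetical description of one limit to query: a label plus the GL enum value.
struct LimitQuery {
    std::string name;
    unsigned int pname;
};

// Hypothetical result record: the label and the value the driver reported.
struct LimitResult {
    std::string name;
    int value;
};

// Calls get_integer(pname, &value) once per query and collects the answers.
// get_integer is assumed to behave like glGetIntegerv(GLenum, GLint*).
template <typename GetIntegerFn>
std::vector<LimitResult> query_limits(const std::vector<LimitQuery>& queries,
                                      GetIntegerFn get_integer) {
    std::vector<LimitResult> results;
    results.reserve(queries.size());
    for (const LimitQuery& q : queries) {
        int value = 0;  // stays 0 if the getter does not write anything
        get_integer(q.pname, &value);
        results.push_back(LimitResult{q.name, value});
    }
    return results;
}
```

With a current OpenGL context and the real headers included, you would pass `glGetIntegerv` as the getter and fill the query list with constants such as `GL_MAX_FRAGMENT_IMAGE_UNIFORMS`, `GL_MAX_COMBINED_IMAGE_UNIFORMS` or `GL_MAX_COMBINED_SHADER_STORAGE_BLOCKS`. As said above, the numbers you get back are whatever the driver chooses to report.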

I was unable to find any extension or cap named "image_object".
 
So I was just checking, and it is WDDM 1.3. I didn't take a pic, but if you need one I will make it tomorrow.

I spoke to manuelG and he told me the Fermi DX12 driver will be ready when Win 10 RTM is released.
 
Build 10158 has been released; it comes with some DX\WDDM bugfixes. We are not far from RTM now... New drivers are expected soon (hopefully..). VS2015 (and WinSDK 10) should reach RTM too, on the 20th of July.
 

The new version is running fine here. So the Fermi DX12 driver around the 20th of July :D That would be great!
Anyway, I made the pic with the new build.
 