Direct3D feature levels discussion

Would Intel want to completely hand over XeSS at that level to MS as in ceding all control of it going forward?
Why would Intel need to do that? AMD doesn't "cede control" of FSR2; they even continue improving it in 3.1, presumably beyond the 2.2.2 spec which went into the DXSR "standard implementation". Nothing stops MS from licensing and adding the current XeSS DP4a version in exactly the same way; the only difference would be the licensing model.

Let's look at Nvidia, for instance: does Nvidia want to push DLSS as the base model?
Nvidia would want that, but they'd need to make DLSS compatible with other h/w first, and the choice of which they want more is theirs to make.

One thing to note in all this is that DXSR won't solve the problem of developers still using IHVs' APIs, because a) there's still FG, which isn't a part of that and will still require an IHV-provided SDK implementation, and b) a new upscaler which won't fit into the DXSR API (say it would need a different set of inputs from the game) will go the same route DLSS/XeSS/FSR took previously. Trying to sell DXSR as if it will solve all these issues is misleading. And that's not even mentioning the fact that there are these pesky Vulkan and Linux.
 
FSR2 is the most widely supported. While technically the XeSS DP4a edition runs on most cards, it also comes with a pretty decent performance penalty in my experience.

My hot take (at least, in hardware-enthusiast circles): FSR2 isn't THAT bad, and 95% of the time I forget I'm running it vs something like DLSS. I actually tested this on myself today in Warzone: I forgot I had switched from DLSS to FSR2 to try it out. The only way I eventually remembered was that the game doesn't expose a sharpening slider for FSR2, so I noticed my image seemed more sharpened than usual (I usually run DLSS with 0 sharpening). Granted, this is at 4K Performance on a TV about 6 feet away, so your mileage may vary if you are running 1440p on a monitor less than a foot away.

FSR 2 is widely supported because it is the most outdated one. Just because something runs on a 10-year-old GPU doesn't make it useful for the future. Compute denoisers run on anything, too, and yet Ray Reconstruction is superior and has better optimization potential. Within 6 months RR has beaten state-of-the-art denoisers.
 
One thing to note in all this is that DXSR won't solve the problem of developers still using IHVs' APIs, because a) there's still FG, which isn't a part of that and will still require an IHV-provided SDK implementation, and b) a new upscaler which won't fit into the DXSR API (say it would need a different set of inputs from the game) will go the same route DLSS/XeSS/FSR took previously. Trying to sell DXSR as if it will solve all these issues is misleading. And that's not even mentioning the fact that there are these pesky Vulkan and Linux.

That’s true, but you gotta start somewhere. DXR 1.0 didn’t solve every problem either. At least with DXSR there is an avenue to introduce improvements to the common API if the IHVs play nice with each other.
 
FSR 2 is widely supported because it is the most outdated one. Just because something runs on a 10-year-old GPU doesn't make it useful for the future. Compute denoisers run on anything, too, and yet Ray Reconstruction is superior and has better optimization potential. Within 6 months RR has beaten state-of-the-art denoisers.
Something being outdated doesn't make it useless. It is good baseline tech for those that don’t use Nvidia, as you can essentially use it on everything, including Xbox.

Something being superior doesn’t make it the default if it doesn’t have broad hardware compatibility.
 
Interesting. FSR is the default implementation, but the API doesn’t mandate all of the inputs that FSR needs to work effectively. Weird setup.
 
Interesting. FSR is the default implementation, but the API doesn’t mandate all of the inputs that FSR needs to work effectively. Weird setup.
It doesn't mandate what FSR doesn't mandate.
Not every game will benefit from the optional inputs and it's for the devs to make use of them when it's worth it.
 
It doesn't mandate what FSR doesn't mandate.
Not every game will benefit from the optional inputs and it's for the devs to make use of them when it's worth it.

Yes but that means there’s a risk that AMD will still need to nudge/help devs to do things “properly”. Not ideal. Nvidia and Intel will care less because their stuff will work with just the mandatory inputs.
 
Yes but that means there’s a risk that AMD will still need to nudge/help devs to do things “properly”. Not ideal. Nvidia and Intel will care less because their stuff will work with just the mandatory inputs.

AMD can likely mitigate this with Xbox. Going forward, devs are likely just going to put FSR in their Xbox games.
 
Yes but that means there’s a risk that AMD will still need to nudge/help devs to do things “properly”. Not ideal. Nvidia and Intel will care less because their stuff will work with just the mandatory inputs.
All of them have optional parameters, and they can be implemented badly even if they didn't.
If you don't need something but it's not optional, then the dev sending in crap won't be helpful.
I think the article's wording around that is pretty bad, as it makes it sound like it's a bad implementation if you don't use every parameter, regardless of whether it's required or not.
 
All of them have optional parameters, and they can be implemented badly even if they didn't.
If you don't need something but it's not optional, then the dev sending in crap won't be helpful.
I think the article's wording around that is pretty bad, as it makes it sound like it's a bad implementation if you don't use every parameter, regardless of whether it's required or not.

Which optional parameters are used by DLSS and XeSS?
 
Hi,
just a question for DmitryKo and NPU people:

seeing https://forum.beyond3d.com/threads/direct3d-feature-levels-discussion.56575/page-70#post-2331017
and also
seeing in https://learn.microsoft.com/en-us/windows/win32/direct3d12/core-feature-levels:
"The overall driver model for compute-only devices is the Microsoft Compute Driver Model (MCDM)"
and compute-only device == MCDM device,
shown as the new D3D12 feature level D3D_FEATURE_LEVEL_1_0_CORE

EDIT: I also now see a new DXCORE_ADAPTER_ATTRIBUTE_D3D12_GENERIC_ML in addition to DXCORE_ADAPTER_ATTRIBUTE_D3D12_CORE_COMPUTE, and also a D3D_FEATURE_LEVEL_1_0_GENERIC
here: https://github.com/microsoft/DirectML/commit/0bd9f4f0c7775a77de8104abdb66ce1f8515f30d
so I don't know whether NPUs are D3D_FEATURE_LEVEL_1_0_CORE or D3D_FEATURE_LEVEL_1_0_GENERIC, or which of the two has more or fewer restrictions..

I have two questions..
1) DmitryKo, can you update your D3D12CheckFeatureSupport tool to run on NPUs? i.e. report Direct3D core compute device information..
in case it's already supported, can somebody with a Meteor Lake share D3D12CheckFeatureSupport logs?

also interested in what metacommands they expose vs current GPUs..
it would also be nice if they exposed WMMA (tensor-core ops) metacommands for DirectML..

2) it seems interesting whether NPUs can run general "compute shader" D3D12-only apps.. if yes, is it time for a DX12peak benchmark like clpeak or vkpeak?
interested in seeing perf of NPUs on clpeak via CLon12 or vkpeak via Dozen (CLon12 and Dozen might need some changes/massaging to support D3D_FEATURE_LEVEL_1_0_CORE)..

i.e. are these devices able to run simple D3D12 "compute shader only" apps with no screen/swapchain/DXGI creation?

if yes, at least there is some general use for the TOPS provided by NPUs like Meteor Lake, and later this year by Qualcomm Elite and also AMD XDNA1 or 2, in addition to running DirectML workloads..

note there is a DirectML NPU sample now (which only filters the Direct3D devices on the system down to core devices, to avoid GPUs):
 
it seems interesting whether NPUs can run general "compute shader" D3D12-only apps.. if yes, is it time for a DX12peak benchmark like clpeak or vkpeak?

NPU devices should be able to run compute shaders according to the Microsoft Learn documentation on feature level 1_0_CORE, but you have to request this specific level and not any higher feature levels when creating the Direct3D 12 device, and also enumerate MCDM adapters using the IDXCoreAdapterFactory interface instead of IDXGIFactory.
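
Something along these lines should work for the enumeration (a minimal sketch, assuming a recent Windows SDK where dxcore.h and D3D_FEATURE_LEVEL_1_0_CORE are available):

Code:
#include <dxcore.h>
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void EnumerateCoreComputeDevices()
{
    // Enumerate MCDM ("core compute") adapters through DXCore, not DXGI
    ComPtr<IDXCoreAdapterFactory> factory;
    DXCoreCreateAdapterFactory(IID_PPV_ARGS(&factory));

    const GUID attributes[] = { DXCORE_ADAPTER_ATTRIBUTE_D3D12_CORE_COMPUTE };
    ComPtr<IDXCoreAdapterList> list;
    factory->CreateAdapterList(_countof(attributes), attributes, IID_PPV_ARGS(&list));

    for (uint32_t i = 0; i < list->GetAdapterCount(); ++i)
    {
        ComPtr<IDXCoreAdapter> adapter;
        list->GetAdapter(i, IID_PPV_ARGS(&adapter));

        // Must request 1_0_CORE explicitly; higher levels fail on compute-only devices
        ComPtr<ID3D12Device> device;
        if (SUCCEEDED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_1_0_CORE,
                                        IID_PPV_ARGS(&device))))
        {
            // device accepts work on a D3D12_COMMAND_LIST_TYPE_COMPUTE queue
        }
    }
}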

DmitryKo, can you update your D3D12CheckFeatureSupport tool to run on NPUs? i.e. report Direct3D core compute device information..
I'll look into this; however, DXCore is a WinRT API - unlike DXGI, which is a plain Windows API (a.k.a. Win32) with only minimal 'lightweight COM' plumbing - and it's only implemented in Windows 10 version 2004 (build 19041) or higher, whereas DXGI is available all the way back to Windows Vista.

There could be a few potential problems with that. First, if the DXCore API requires its DLL exports to be statically linked into the executable, my app wouldn't even run on earlier versions of Windows.
Second, it may require me to switch to the UWP Console app model, which would add a lot of unnecessary WinRT plumbing to my executable - and also make it unable to run on earlier versions of Windows 10 which do not support the UWP Console API. Therefore I may have to restructure my code and build a separate version of my tool just for NPUs.

It would be much easier if Microsoft just extended DXGI to support these "core" devices, but so far I've found no indications to that end; I don't have an Intel NPU processor to test with.
 
NPU devices should be able to run compute shaders according to the Microsoft Learn documentation on feature level 1_0_CORE, but you have to request this specific level only and not include any higher feature levels when creating the Direct3D 12 device, and also enumerate MCDM adapters using the IDXCoreAdapterFactory interface instead of IDXGIFactory.


I'll look into this; however, DXCore is a WinRT API - unlike DXGI, which is a plain Windows API (a.k.a. Win32) - and it's only implemented in Windows 10 version 2004 (build 19041) or higher, whereas DXGI is available all the way back to Windows Vista.

There could be a few potential problems with that. First, if the DXCore API requires its DLL exports to be statically linked into the executable, my app wouldn't even run on earlier versions of Windows.
Second, it may require me to switch to the UWP Console app model, which would add a lot of unnecessary WinRT plumbing to my executable - and also make it unable to run on earlier versions of Windows 10 which do not support the UWP Console API. Therefore I may have to restructure my code and build a separate version of my tool just for NPUs.

It would be much easier if Microsoft just extended DXGI to support these "core" devices, but so far I've found no indications to that end; I don't have an Intel NPU processor to test with.
many thanks for the detailed information..
sad to hear (and didn't know) that it requires WinRT APIs and isn't available as a Win32 API like DXGI..
in that case I wouldn't touch your original app and would perhaps "fork" it as another one, dxcore_capsviewer or something else..
anyway no pressure!, I understand it's early days for NPUs, almost nobody has hardware to test, etc..
 
sad to hear (and didn't know) that it requires WinRT APIs and isn't available as a Win32 API like DXGI..

FYI I've been experimenting with DXCore, and it's actually available to regular Win32 'desktop apps', like most WinRT APIs today - so there is no need to switch to the UWP app model, and you can avoid a hard dependency on DXCore.dll exports with LoadLibrary() or delay-load linking.
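
A sketch of the run-time binding approach (so the executable still starts on older Windows versions that lack DXCore.dll):

Code:
#include <windows.h>
#include <dxcore.h>

typedef HRESULT (WINAPI *PFN_DXCoreCreateAdapterFactory)(REFIID riid, void** ppvFactory);

// Resolve DXCoreCreateAdapterFactory at run time instead of import time,
// so the EXE loads even where DXCore.dll is absent
HMODULE hDxCore = LoadLibraryExW(L"dxcore.dll", nullptr, LOAD_LIBRARY_SEARCH_SYSTEM32);
auto pfnCreateFactory = hDxCore
    ? reinterpret_cast<PFN_DXCoreCreateAdapterFactory>(
          GetProcAddress(hDxCore, "DXCoreCreateAdapterFactory"))
    : nullptr;
if (!pfnCreateFactory)
{
    // DXCore not present - fall back to DXGI-only enumeration
}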


It's still a typical WinRT API with no strong typing, unlike traditional 'lightweight COM' as implemented in Direct3D and DXGI. For example, adapter information is passed by reference as object 'properties' and you need to query these 'properties' with IDXCoreAdapter::IsPropertySupported() and IDXCoreAdapter::GetPropertySize() before you read them with IDXCoreAdapter::GetProperty() into a dynamically allocated typeless buffer, while the actual type information needs to be derived from WinMD metadata, as typical for .NET/CLI interfaces.
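
For illustration, reading a single string property goes something like this (a sketch, error handling omitted):

Code:
#include <dxcore.h>
#include <vector>
#include <cstdio>

// Reading the driver description: the size must be queried first, the buffer
// is typeless, and the returned string is UTF-8 rather than DXGI's wchar_t
void PrintDriverDescription(IDXCoreAdapter* adapter)
{
    if (adapter->IsPropertySupported(DXCoreAdapterProperty::DriverDescription))
    {
        size_t size = 0;
        adapter->GetPropertySize(DXCoreAdapterProperty::DriverDescription, &size);
        std::vector<char> desc(size);
        adapter->GetProperty(DXCoreAdapterProperty::DriverDescription, size, desc.data());
        printf("%s\n", desc.data());
    }
}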

So you have to use C++ features and language projections like C++/CX or C++/WinRT, since you can't infer type information for 'properties' from header files, as was standard in the C-style coding typical for DXGI. There was the C++/Win32 project, a derivative of C++/WinRT which could use WinMD files from the Win32 metadata repository to generate C++ projection interfaces for Win32 APIs, but this tool has been abandoned - like many other improvements to the desktop Windows platform announced alongside Project Reunion (aka Windows App SDK) back in 2021 - and only C# and Rust projections for Win32 are officially supported...



Anyway, I'll look into how DirectML's DXDispatch tool handles these DXCore 'property' types (in Adapter.cpp and Adapter.h). For now I will add a new command-line option to create the Direct3D 12 device with a minimum feature level of 1_0_CORE, and hopefully NPUs will also be visible to DXGI interfaces as standard graphics adapters - if they are only available through DXCore, that would require code refactoring to mix and match DXCore and DXGI adapters by their LUID identifiers.
 
I've added the necessary C++ plumbing to query DXCore adapters with IDXCoreAdapter::GetProperty(), and adapter LUIDs do match between DXGI and DXCore "core compute" adapters on the same graphics card (and the WARP12 software renderer) - so there's a good chance NPU devices will be visible as DXGI adapters as well, considering you can create a Direct3D 12 'core' device with minimum feature level 1_0_CORE on a DXGI adapter.
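
The cross-matching itself is straightforward - something like this sketch, using IDXGIFactory4::EnumAdapterByLuid on the DXGI side:

Code:
#include <dxcore.h>
#include <dxgi1_4.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Find the DXGI twin of a DXCore adapter by comparing LUIDs
ComPtr<IDXGIAdapter> FindDxgiTwin(IDXCoreAdapter* adapter)
{
    LUID luid = {};
    // Templated GetProperty overload infers the buffer size from the type
    adapter->GetProperty(DXCoreAdapterProperty::InstanceLuid, &luid);

    ComPtr<IDXGIFactory4> dxgiFactory;
    CreateDXGIFactory1(IID_PPV_ARGS(&dxgiFactory));

    ComPtr<IDXGIAdapter> dxgiAdapter;
    // Fails for adapters only visible through DXCore (possibly NPUs)
    dxgiFactory->EnumAdapterByLuid(luid, IID_PPV_ARGS(&dxgiAdapter));
    return dxgiAdapter;
}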

@oscarbg, do you have an actual NPU to test with?

DXCore defines a few 'adapter attribute' GUIDs to filter devices by type, like DXCORE_ADAPTER_ATTRIBUTE_D3D11_GRAPHICS, DXCORE_ADAPTER_ATTRIBUTE_D3D12_GRAPHICS, then D3D12_CORE_COMPUTE, D3D12_GENERIC_ML, and D3D12_GENERIC_MEDIA; the latest SDK headers define additional 'hardware attributes' like DXCORE_HARDWARE_TYPE_ATTRIBUTE_GPU, COMPUTE_ACCELERATOR, NPU, and MEDIA_ACCELERATOR, but these are not documented on Microsoft Learn yet.

I wonder how actual NPU devices report these attributes; on my RDNA3 video card, the IDXCoreAdapter::IsAttributeSupported method reports D3D11_GRAPHICS, D3D12_GRAPHICS and D3D12_CORE_COMPUTE, and the same for the WARP12 adapter.
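
For reference, each attribute is a simple boolean probe, e.g. (a sketch; the hardware-type GUIDs only compile with the newest SDK headers):

Code:
// Probe the adapter type attributes on an IDXCoreAdapter* 'adapter'
const bool d3d12Graphics = adapter->IsAttributeSupported(DXCORE_ADAPTER_ATTRIBUTE_D3D12_GRAPHICS);
const bool coreCompute   = adapter->IsAttributeSupported(DXCORE_ADAPTER_ATTRIBUTE_D3D12_CORE_COMPUTE);
const bool genericML     = adapter->IsAttributeSupported(DXCORE_ADAPTER_ATTRIBUTE_D3D12_GENERIC_ML);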


AFAIK there is no publicly available information about feature level 1_0_GENERIC and its differences in comparison to level 1_0_CORE. Considering they also added D3D_SHADER_MODEL_NONE, maybe they're linked to the new D3D12_GENERIC_MEDIA attribute above - something like a limited Direct3D 12 device with metacommands and/or video rendering functionality, but no compute shaders and even fewer processing capabilities?


BTW, WinRT APIs use UTF-8 encoded char* strings, instead of the UTF-16 encoded wchar_t strings in Win32 APIs like DXGI and WDDM thunking. Therefore I had to set the .UTF8 locale in the C runtime, which makes legacy string functions like printf_s() / puts() correctly support non-ASCII UTF-8 encoded strings. However UTF-8 support in the UCRT was only implemented in Windows 10 version 1803 (build 17134), so my app would break in earlier versions of Windows with a non-English locale.
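
That is, a single call at startup (a sketch):

Code:
#include <locale.h>

// Switch the UCRT to the UTF-8 locale so printf_s()/puts() pass through
// DXCore's UTF-8 char* strings correctly (needs Windows 10 1803+)
setlocale(LC_ALL, ".UTF8");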

Enums are straightforward for the most part, though of course Microsoft had to employ some different enums from the Direct3D KMD (kernel-mode driver) thunking header (d3dkmdt.h), and their integer values do not match the DXGI enums for the same functionality.
But at least the reported memory sizes all use the uint64_t type, and not size_t as in DXGI, which oscillates between 64-bit and 32-bit integers depending on the target platform...
 
OK then, I will stick to DXGI for now, and use DXCore to query hardware attributes and select the minimum feature level, and also add a command-line option to set it. This will be added in the next release of my tool once a new Agility SDK version rolls out, hopefully this June after Build 2024 concludes.

If NPUs are only available from DXCore, that would need refactoring into full-blown C++, to the tune of D3DX12CheckFeatureSupport. An NPU for testing would be nice too - though Zen5 'Granite Ridge' desktop parts might include one, and I'm long due for a CPU upgrade... :unsure:
 

RDNA2 supported rendering/blending to RGB9E5 render targets several years ago. I think this is a far more useful feature than superfluous things like VRS or "Sampler Feedback" ;)
 