DirectX 12 API Preview

Just pasting what Max wrote on that forum for easy reference.

Max said:
Some of the high-level details were already revealed in response to your post, but it's quite a jump from there to the API details you probably want to hear. D3D 12 doesn't have strongly typed memory allocations like D3D 11, which strictly limited the dimensionality and usage of memory at creation time. In 12, the main memory allocation parameters are CPU access/cacheability and GPU locality vs. CPU locality. Some examples:

Dynamic vertex buffers in 11 would be an application-managed ring buffer of memory in 12, allocated with write-combined CPU cacheability and CPU locality.
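A minimal sketch of what such a ring buffer looks like against the D3D12 API as it later shipped (the exact structures weren't public when this was posted); the buffer size and overall structure are my own illustration, only the type and function names come from the released headers:

```cpp
// Sketch: a persistently mapped upload buffer used as a ring for "dynamic" vertex data.
// D3D12_HEAP_TYPE_UPLOAD gives write-combined CPU pages in CPU-local (system) memory.
#include <d3d12.h>
#include <cstdint>

ID3D12Resource* CreateDynamicVBRing(ID3D12Device* device, uint64_t ringSize, void** cpuPtr)
{
    D3D12_HEAP_PROPERTIES heapProps = {};
    heapProps.Type = D3D12_HEAP_TYPE_UPLOAD;            // write-combined, CPU-local

    D3D12_RESOURCE_DESC desc = {};
    desc.Dimension        = D3D12_RESOURCE_DIMENSION_BUFFER;
    desc.Width            = ringSize;                   // application's choice, e.g. a few MB
    desc.Height           = 1;
    desc.DepthOrArraySize = 1;
    desc.MipLevels        = 1;
    desc.Format           = DXGI_FORMAT_UNKNOWN;
    desc.SampleDesc       = {1, 0};
    desc.Layout           = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;

    ID3D12Resource* ring = nullptr;
    device->CreateCommittedResource(&heapProps, D3D12_HEAP_FLAG_NONE, &desc,
                                    D3D12_RESOURCE_STATE_GENERIC_READ, nullptr,
                                    IID_PPV_ARGS(&ring));

    // Map once and keep the pointer; the application sub-allocates per-frame vertex data
    // from the ring and fences on the GPU before reusing a region (error handling omitted).
    D3D12_RANGE noRead = {0, 0};
    ring->Map(0, &noRead, cpuPtr);
    return ring;
}
```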

11-style default 2D textures have no CPU access and have GPU locality. 12 will also expose the ability to map multidimensional GPU-local resources, useful for example for reading out the results of a reduction operation with low latency. In this case it would be write-combined CPU access with GPU locality. In the GDC unveil of D3D 12 this was briefly mentioned in a slide, called "map default" or "swizzled texture access" IIRC.

Cacheability and locality will not be mutable properties of memory allocations, but 12 will allow memory with those given properties to be retasked for multiple resource types (1D/2D/3D, VB/Texture/UAV/..., width/height/depth, etc.). More details later this year....
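The "retasked for multiple resource types" part is what the shipped API exposes as heaps plus placed resources. A hedged sketch (assumes resource heap tier 2 so buffers and textures can share one heap; sizes and formats are arbitrary, and an aliasing barrier is still required between uses):

```cpp
// Sketch: one memory allocation (heap) reused as either a buffer or a 2D texture.
#include <d3d12.h>

void CreateAliasedResources(ID3D12Device* device)
{
    D3D12_HEAP_DESC heapDesc = {};
    heapDesc.SizeInBytes     = 16 * 1024 * 1024;                    // arbitrary example size
    heapDesc.Properties.Type = D3D12_HEAP_TYPE_DEFAULT;             // GPU-local, no CPU access
    heapDesc.Alignment       = D3D12_DEFAULT_RESOURCE_PLACEMENT_ALIGNMENT;
    heapDesc.Flags           = D3D12_HEAP_FLAG_ALLOW_ALL_BUFFERS_AND_TEXTURES;

    ID3D12Heap* heap = nullptr;
    device->CreateHeap(&heapDesc, IID_PPV_ARGS(&heap));

    D3D12_RESOURCE_DESC texDesc = {};
    texDesc.Dimension        = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
    texDesc.Width            = 1024;
    texDesc.Height           = 1024;
    texDesc.DepthOrArraySize = 1;
    texDesc.MipLevels        = 1;
    texDesc.Format           = DXGI_FORMAT_R8G8B8A8_UNORM;
    texDesc.SampleDesc       = {1, 0};
    texDesc.Layout           = D3D12_TEXTURE_LAYOUT_UNKNOWN;

    D3D12_RESOURCE_DESC bufDesc = texDesc;
    bufDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
    bufDesc.Width     = 4 * 1024 * 1024;
    bufDesc.Height    = 1;
    bufDesc.Format    = DXGI_FORMAT_UNKNOWN;
    bufDesc.Layout    = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;

    // Both resources are placed at offset 0 of the same heap; only one holds valid
    // contents at a time, and switching requires an aliasing barrier on the command list.
    ID3D12Resource *tex = nullptr, *buf = nullptr;
    device->CreatePlacedResource(heap, 0, &texDesc, D3D12_RESOURCE_STATE_COMMON,
                                 nullptr, IID_PPV_ARGS(&tex));
    device->CreatePlacedResource(heap, 0, &bufDesc, D3D12_RESOURCE_STATE_COMMON,
                                 nullptr, IID_PPV_ARGS(&buf));
}
```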

D3D 12 will have multiple methods for moving data between CPU & GPU, each serving different scenarios/performance requirements. More details later this year...

To Alessio1989's reply:

I expect the feature level/cap evolution to remain the same. D3D will expose some new features as independent caps and simultaneously bake sets of common caps together into a new feature level, to guarantee support and reduce the implementation/testing matrix for developers. It's the best of both worlds between D3D9 and D3D10+: 9 allowed fine-grained feature additions without forcing hardware vendors to perfectly align on feature sets, but created an unsupportable mess of combinations; 10 allowed developers to rely on a combination of features, but tended to delay API support for hardware features until most GPU vendors had built or nearly built that combination in hardware. 11 & 12 have evolved to use caps for initial exposure, with feature levels baking in a common set over time.
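In the API as it eventually shipped, that split shows up as a feature level query plus per-feature caps, both through CheckFeatureSupport. A minimal sketch (names are from the released D3D12 headers, which postdate this post):

```cpp
// Sketch: query the baked-in feature level plus individual fine-grained caps.
#include <d3d12.h>

void QueryCaps(ID3D12Device* device)
{
    // Feature level: the guaranteed bundle of functionality.
    D3D_FEATURE_LEVEL requested[] = { D3D_FEATURE_LEVEL_11_0, D3D_FEATURE_LEVEL_11_1,
                                      D3D_FEATURE_LEVEL_12_0, D3D_FEATURE_LEVEL_12_1 };
    D3D12_FEATURE_DATA_FEATURE_LEVELS levels = {};
    levels.NumFeatureLevels        = sizeof(requested) / sizeof(requested[0]);
    levels.pFeatureLevelsRequested = requested;
    device->CheckFeatureSupport(D3D12_FEATURE_FEATURE_LEVELS, &levels, sizeof(levels));
    // levels.MaxSupportedFeatureLevel now holds the highest supported level.

    // Individual caps: features exposed before they get baked into a feature level.
    D3D12_FEATURE_DATA_D3D12_OPTIONS opts = {};
    device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS, &opts, sizeof(opts));
    // e.g. opts.ConservativeRasterizationTier, opts.TiledResourcesTier,
    //      opts.ROVsSupported, opts.ResourceHeapTier
}
```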
 

No NDA required for the event; I guess that means they are ready to share the details? Or the event is just going to be more of what we already know.

Interesting regardless; love that the tickets are free. Sweden ;) ugh, I would consider going if it were held in Toronto ;). This lands on the same day as WWDC. Pretty intense June 2.
 
To answer my own question, slides are now available on AMD's website:
http://developer.amd.com/resources/documentation-articles/conference-presentations/
That particle system... is almost like a carbon copy of my own design, except that their tiles are quite big (32x32 pixels = 1024 pixels). Small particles suffer with these big tiles. Smaller tiles of course increase the binning cost, but conservative rasterization might be the perfect solution for that problem.

I also like that "vertex shader tricks" presentation... for obvious reasons (until we get multidraw on all rendering APIs).
 
That particle system... is almost like a carbon copy of my own design, except that their tiles are quite big (32x32 pixels = 1024 pixels). Small particles suffer with these big tiles. Smaller tiles of course increase the binning cost, but conservative rasterization might be the perfect solution for that problem.
Is your system for screen-aligned rects/billboards as well? Certainly that's a chunk of the overdraw in a scene, but there's usually a bunch of effects stuff in most games that uses actual geometry as well. Conservative rasterization should help there too, although sorting and interpolation become an issue there too, of course.
 
Is your system for screen-aligned rects/billboards as well? Certainly that's a chunk of the overdraw in a scene, but there's usually a bunch of effects stuff in most games that uses actual geometry as well. Conservative rasterization should help there too, although sorting and interpolation become an issue there too, of course.
Conservative rasterization allows stretched particles (etc.) without hassle. You can even support triangles (store an index into a triangle buffer). The triangle buffer contains a transform to barycentrics that you use for UV interpolation and the inside-triangle test. An 8x8 tile = 64 pixels = a single AMD wave.
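One plausible reading of the "transform to barycentrics" idea, sketched in plain C++ (the data layout here is my illustration, not necessarily sebbbi's): store per triangle a 2x3 affine transform from screen-space pixel position to two barycentric weights; the third falls out, and the inside test is three sign checks.

```cpp
// Sketch: per-triangle affine transform from pixel position to barycentrics,
// usable for both the inside-triangle test and UV interpolation
// (screen-space triangles, no perspective correction assumed).
struct TriangleRecord
{
    float m0[3];      // b0 = m0[0]*px + m0[1]*py + m0[2]
    float m1[3];      // b1 = m1[0]*px + m1[1]*py + m1[2]
    float uv[3][2];   // per-vertex UVs
};

bool ShadeParticlePixel(const TriangleRecord& t, float px, float py, float outUV[2])
{
    float b0 = t.m0[0] * px + t.m0[1] * py + t.m0[2];
    float b1 = t.m1[0] * px + t.m1[1] * py + t.m1[2];
    float b2 = 1.0f - b0 - b1;

    // Inside the triangle iff all barycentric weights are non-negative.
    if (b0 < 0.0f || b1 < 0.0f || b2 < 0.0f)
        return false;

    outUV[0] = b0 * t.uv[0][0] + b1 * t.uv[1][0] + b2 * t.uv[2][0];
    outUV[1] = b0 * t.uv[0][1] + b1 * t.uv[1][1] + b2 * t.uv[2][1];
    return true;
}
```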

For conservative rasterization storage you can, for example, use bucket lists or a perfect hash + linear probing. A normal (32-bit) atomic is enough for both; no need for UAV ordering. With UAV ordering you can of course do more complex stuff.
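A CPU-side C++ sketch of the kind of 32-bit-atomic insert described above (hash plus linear probing); on the GPU the compare-exchange would be an InterlockedCompareExchange on a UAV, and the table size and key packing are made up for illustration:

```cpp
// Sketch: insert a packed (tile, particle) key into a hash table using only a 32-bit
// atomic compare-exchange with linear probing. 0xFFFFFFFF marks an empty slot.
#include <atomic>
#include <cstdint>

constexpr uint32_t kTableSize = 1u << 16;           // power of two, illustration only
constexpr uint32_t kEmpty     = 0xFFFFFFFFu;
std::atomic<uint32_t> g_table[kTableSize];

uint32_t Hash(uint32_t key) { return (key * 2654435761u) & (kTableSize - 1); }

bool Insert(uint32_t key)
{
    uint32_t slot = Hash(key);
    for (uint32_t probe = 0; probe < kTableSize; ++probe)
    {
        uint32_t expected = kEmpty;
        if (g_table[slot].compare_exchange_strong(expected, key))
            return true;                             // claimed an empty slot
        if (expected == key)
            return true;                             // key already present
        slot = (slot + 1) & (kTableSize - 1);        // linear probing
    }
    return false;                                    // table full
}
```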
 
... Cuckoo hashing is also quite good on the GPU (especially queries, since they're guaranteed O(1) and branchless). Inserts are harder, since 32-bit atomics make kicking items out of buckets awkward if the payload + hash is larger than 32 bits. I would be happy to get 64-bit atomics (unfortunately most current hardware does not support them, so wide support isn't possible).
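For comparison, a sketch of the branchless two-table cuckoo query (again plain C++ standing in for shader code; the hash constants and empty marker are arbitrary):

```cpp
// Sketch: cuckoo-hash lookup. A key lives in exactly one of two candidate slots,
// so the query is two loads plus selects, with no data-dependent loop.
#include <cstdint>

constexpr uint32_t kSize  = 1u << 16;
constexpr uint32_t kEmpty = 0xFFFFFFFFu;
extern uint32_t g_table0[kSize];   // each entry stores the key, kEmpty if unused
extern uint32_t g_table1[kSize];

uint32_t Hash0(uint32_t k) { return (k * 2654435761u) & (kSize - 1); }
uint32_t Hash1(uint32_t k) { return (k * 40503u + 12345u) & (kSize - 1); }

// Returns the slot index holding 'key' (table1 slots flagged with the top bit),
// or kEmpty if the key is not present.
uint32_t Query(uint32_t key)
{
    uint32_t s0 = Hash0(key), s1 = Hash1(key);
    uint32_t r = kEmpty;
    r = (g_table1[s1] == key) ? (s1 | 0x80000000u) : r;   // selects, not divergent branches
    r = (g_table0[s0] == key) ? s0 : r;
    return r;
}
```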

Another thing I would like very much is transactional memory for the GPU (TSX style). Even single-cache-line transaction support would be huge. Lock the cache line in the shader (other waves stall on the memory operation if they try to access the locked line; GPU latency hiding of course handles this case and runs other waves instead of the stalled ones). Commit frees the line. The GPU programming model/scheduling should support this with minor changes. To avoid deadlock, the shader code shouldn't be allowed to read/write other cache lines inside the transaction.
 
SIGGRAPH'14: Intel shows DX12 performance gains on MS Surface Pro 3
Over 50% CPU Power Usage Reduction on Surface Pro 3!

To demonstrate the power gains of DirectX 12, Intel locked the framerate of the demo, rendered with DirectX 11 for a period of time and then toggled to DirectX 12 rendering the exact same content for an equal period of time. The graph below clearly indicates that DirectX 12 CPU power consumption was reduced more than 50% when compared to DirectX 11 rendering the exact same content at the same framerate. These power savings mean that your device can run longer and cooler!
 
Similarly, Guru3D also mentions the same Intel demo with Microsoft's DX12.

When rendering exactly the same content, DirectX 12's power consumption was impressively over 50% less than its predecessor. A key improvement in DirectX 12 is its ability to intuitively share the workload across multiple cores, and the performance gains we are seeing here show this in action.


In a blog post on MSDN, Microsoft claims: "In some cases, DirectX 12 can take a game that's otherwise unplayable on DirectX 11 and make it playable, without even increasing the power your device consumes!"

Intel demonstrated this by unlocking the frame rate in their demo to show a 50% increase in FPS (frames per second) using DirectX 12, despite not drawing any extra power.

All of the gains displayed come directly from switching from DirectX 11 to DirectX 12. The API, through working with hardware partners, has lower-level access than ever before, which significantly improves CPU utilisation. Even AMD, which has its own low-level API named 'Mantle', has put its support behind DirectX 12. Nvidia has announced that all of its DirectX 11 GPUs will support the latest version.

http://www.guru3d.com/news-story/directx-12-cuts-power-consumption-in-half-and-boosts-fps.html
 
Andy,
According to the TechReport article on this, you did the presentation at Siggraph. (Congratulations!)

From what little you know about Mantle, can you make a comparison between the two, at least in terms of efficiency improvement? Are there major differences in terms of what an engine developer needs to do or are they at the core pretty similar?

Thanks!
 
I'm sure this has been covered before, but what exactly does it mean when an IHV claims DX12 support with "feature level 11"? Does it mean their driver supports the new CPU-side APIs and optimizations but the hardware supports only DX11 goodies? And how is it different from old-school cap bits?
 
I'm sure this has been covered before, but what exactly does it mean when an IHV claims DX12 support with "feature level 11"? Does it mean their driver supports the new CPU-side APIs and optimizations but the hardware supports only DX11 goodies? And how is it different from old-school cap bits?
DirectX 11.2 already has some features not supported by DirectX 11.0 hardware (such as tiled resources, min/max blending, etc). I expect these features to be supported in the future API as well. Also, Microsoft's publicly available slides have mentioned new features such as conservative rasterization, UAV ordering and multidraw. It might be that some DirectX 11.0 hardware lacks support for some of these (or some other yet unannounced) new features. It's hard to speculate yet, because the full feature set hasn't been announced.
 
DirectX 11.2 already has some features not supported by DirectX 11.0 hardware (such as tiled resources, min/max blending, etc).
Actually it's not quite that simple: while GeForces are limited to feature level 11_0, at least Keplers support tiled resources from 11.2 (is there even a feature level 11_2, or is 11_1 still the max?)
 
at least Keplers support tiled resources from 11.2

I think they support a particular level of that? I might be wrong though. Can't say I'm thrilled with the revival of the whole caps bits stuff (in new clothes of course), but I can see why this is happening (unfortunately).
 
I think they support a particular level of that? I might be wrong though. Can't say I'm thrilled with the revival of the whole caps bits stuff (in new clothes of course), but I can see why this is happening (unfortunately).

Yes, the "first tier" apparently, while GCN (1.1 only?) should support the "second tier".
It's quite a mess at the moment, when you can support DX11.2 without supporting DX11.1's feature level 11_1.
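For what it's worth, the tier distinction being argued about here is something an application can already query on a D3D11.2 runtime. A minimal sketch using the released caps structures:

```cpp
// Sketch: query the tiled resources tier (tier 1 vs tier 2) on a D3D11.2 runtime.
#include <d3d11_2.h>

D3D11_TILED_RESOURCES_TIER QueryTiledTier(ID3D11Device* device)
{
    D3D11_FEATURE_DATA_D3D11_OPTIONS1 opts1 = {};
    if (SUCCEEDED(device->CheckFeatureSupport(D3D11_FEATURE_D3D11_OPTIONS1,
                                              &opts1, sizeof(opts1))))
        return opts1.TiledResourcesTier;   // NOT_SUPPORTED, TIER_1 or TIER_2
    return D3D11_TILED_RESOURCES_NOT_SUPPORTED;
}
```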
 
I believe it was the opposite: every GCN GPU supports tier 1; tier 2 was a feature added with GCN 1.1.
 