Next gen lighting technologies - voxelised, traced, and everything else *spawn*

An interesting case is the new Hitman game: they dropped DX12, although it was faster for the previous game (on older GPUs it was twice as fast!). Likely the cost of maintaining both APIs was too high to be worth it for them.
FIFA 19 dropped it as well, as did several Xbox exclusives such as Quantum Break, Sea of Thieves, Sunset Overdrive, etc.
I wouldn't say DX12 is a complete waste. World of Warcraft added a DX12 renderer in its new expansion, and a post-launch patch improved performance even further. The initial implementation was slower on NVIDIA hardware for whatever reason, so NVIDIA GPUs defaulted to the old DX11 renderer, while AMD and Intel defaulted to DX12. With the newest version, all vendors default to DX12.
WoW is a CPU-bound game; it's almost single-threaded at this point. And from the link you posted, it doesn't benefit one iota from DX12 by itself. Only when Blizzard activates multi-threading does the game gain a significant boost to performance. But multi-threading is only available in DX12, as Blizzard didn't bother bringing it to DX11.

In fact, when DX12 was first introduced to the game (July 2018), NVIDIA's DX11 path was 20% faster than AMD's DX12 path.
https://www.computerbase.de/2018-07...agramm-wow-directx-11-vs-directx-12-1920-1080
https://www.extremetech.com/gaming/273923-benchmarking-world-of-warcrafts-directx-12-support

It took significant time to bring multi-threading to DX12 (December 2018); once that happened, all GPUs saw improved fps. NVIDIA even touted a 25% uplift on a 2080 Ti at 1080p.
https://www.nvidia.com/en-us/geforce/news/world-of-warcraft-directx-12-performance-update/

I think the idea is that DX12/Vulkan was a break away from DX9-11 and whatever legacy items were in there; from that perspective there is benefit: a single API whose behaviours should be properly defined and supported, with no weird hacks to get an intended result.
But the weird hacks continued: NVIDIA delivered consistent driver updates that improved DX12 and Vulkan performance; we've seen that in Hitman and Doom. These lower-level APIs still rely on the driver for a great deal of their work.
 
Only when Blizzard activates multi-threading does the game gain a significant boost to performance.
It is most often said that low-level APIs only show a big advantage in combination with CPU multi-threading (generating command lists in parallel).

But there is another option, IMO much more interesting: do the entire gfx work GPU-only, with pre-recorded static command buffers that contain any work that might become necessary, driven by indirect dispatches.
Then all the CPU has to do is upload updated data like animated transforms, streaming, or whatever. This is where I see the largest advantage, because the GPU does not rely on expensive sync with the CPU to get work.
So I use low level to be faster on the GPU, not to save CPU time.

I've never heard of a game utilizing this approach much, or at all. In theory you could run the whole graphics workload with a single 'draw call' this way. In practice you still need multiple command lists to feed multiple async queues, and sync between them is still the CPU's responsibility and thus slow (an assumption based on personal experience).
So what we need is work generation on the GPU completely independent from the CPU. This is surely already possible, but not yet exposed even in low-level APIs.
There is far too little demand from the games industry here, because they still think in large chunks of brute-force work instead of more fine-grained and work-efficient approaches. Likely that's why there is more progress here on GPGPU APIs than on game APIs.
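
To make that concrete, here is a minimal, hypothetical Vulkan sketch (handle and buffer names are illustrative, render pass and pipeline creation are omitted): the command buffer is recorded once, a compute pass writes draw arguments on the GPU via an indirect dispatch, and a single count-based indirect draw consumes whatever it produced, so nothing is re-recorded on the CPU per frame.

    #include <vulkan/vulkan.h>

    // Recorded once at load time; replayed every frame without any CPU re-recording.
    // All handles are assumed to be created elsewhere; buffer names are illustrative.
    void RecordStaticFrame(VkCommandBuffer cmd,
                           VkPipeline cullPipeline,   // compute pipeline: culling / LOD, writes draw args
                           VkPipelineLayout layout,
                           VkDescriptorSet scene,
                           VkBuffer dispatchArgs,     // VkDispatchIndirectCommand, written by the GPU
                           VkBuffer drawArgs,         // array of VkDrawIndexedIndirectCommand
                           VkBuffer drawCount,        // uint32_t draw count, written by the GPU
                           uint32_t maxDraws)
    {
        VkCommandBufferBeginInfo begin{VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO};
        vkBeginCommandBuffer(cmd, &begin);

        // GPU-side culling: even the workgroup count comes from a GPU-written buffer.
        vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_COMPUTE, cullPipeline);
        vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_COMPUTE, layout,
                                0, 1, &scene, 0, nullptr);
        vkCmdDispatchIndirect(cmd, dispatchArgs, 0);

        // Make the GPU-written draw arguments visible to the indirect-draw stage.
        VkMemoryBarrier barrier{VK_STRUCTURE_TYPE_MEMORY_BARRIER};
        barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
        barrier.dstAccessMask = VK_ACCESS_INDIRECT_COMMAND_READ_BIT;
        vkCmdPipelineBarrier(cmd, VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
                             VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT, 0,
                             1, &barrier, 0, nullptr, 0, nullptr);

        // One "draw call" consumes everything the compute pass produced.
        // Requires Vulkan 1.2 / VK_KHR_draw_indirect_count; render pass begin/end,
        // graphics pipeline and vertex/index buffer binds are omitted for brevity.
        vkCmdDrawIndexedIndirectCount(cmd, drawArgs, 0, drawCount, 0,
                                      maxDraws, sizeof(VkDrawIndexedIndirectCommand));

        vkEndCommandBuffer(cmd);
    }

The CPU then only updates the buffers the shaders read (transforms, streamed data) and submits the same command buffer again.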

With DX12/Vulkan there are 10x more chances to shoot yourself in the foot, and this is exactly why DX12 is in a bad state in many games.
Not sure. That's why validation exists; it is constantly improved and helps a lot. Low level surely is more work, but it also means less guessing, trial and error, and fewer assumptions.
I think the main reason is simply the cost of maintaining multiple APIs. You must support DX11 for Win7 anyway, and it works, so why spend lots of time just to get a few FPS more?
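
As an aside, for anyone wondering what 'validation' refers to on the DX12 side: a minimal sketch, assuming a debug build on Windows, of enabling the debug layer before device creation so the runtime reports invalid API usage (Vulkan has the equivalent VK_LAYER_KHRONOS_validation layer).

    #include <d3d12.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    // Must be called before D3D12CreateDevice; the debug layer then validates
    // API usage and reports errors and warnings to the debug output.
    void EnableD3D12Validation()
    {
    #if defined(_DEBUG)
        ComPtr<ID3D12Debug> debug;
        if (SUCCEEDED(D3D12GetDebugInterface(IID_PPV_ARGS(&debug))))
            debug->EnableDebugLayer();
    #endif
    }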

NVIDIA delivered consistent driver updates that improved DX12 and Vulkan performance; we've seen that in Hitman and Doom.
AFAIK, NV ignores things like the resource transition barriers from the API (or some similar gfx thing I'm no expert on). They handle them differently, and likely that's the point where they still optimize per game.
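
For context, this is the kind of explicit transition barrier a DX12 application issues itself (a minimal sketch with placeholder resource and states); under DX11 the driver inferred such hazards on its own, which is exactly the area where a driver can still reinterpret or reorder things per title.

    #include <d3d12.h>

    // Tell the API that 'tex' changes from render target to shader resource;
    // the driver turns this into whatever cache flushes / decompression the HW needs.
    void TransitionToShaderResource(ID3D12GraphicsCommandList* cmd, ID3D12Resource* tex)
    {
        D3D12_RESOURCE_BARRIER barrier = {};
        barrier.Type                   = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
        barrier.Transition.pResource   = tex;
        barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
        barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
        barrier.Transition.StateAfter  = D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE;
        cmd->ResourceBarrier(1, &barrier);
    }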
 
Read @sebbbi's presentation from SIGGRAPH 2015:

http://advances.realtimerendering.c...siggraph2015_combined_final_footer_220dpi.pdf

GPU driven rendering on console
 
GPU driven rendering
Yeah, a rare exception of a more interesting approach :) Performance was not perfect - it may have needed some more work, but I'm sure it's the way to go... (Do they still use this engine, and has it been moved to DX12? Is it used in current AC games, which perform well?)
On the other hand, Doom still used traditional methods with CPU-generated command buffers. Performance was great, but one cannot compare the content those two games show.
 
I think some console games use this approach.
 
... to come back on topic, I wonder how such approaches are 'compatible' with DXR.
Does it support instanced geometry? Likely, but what about geometry shaders, or the new mesh shaders? I assume there are restrictions with any approach that generates temporary geometry on demand.

Edit: Forgot about tessellation.

Proposed solution: https://devblogs.nvidia.com/coffee-break-ray-plus-raster-era-begins/
  • The fundamental problem with tessellation and ray tracing is normally you would have to tessellate your mesh, then build an acceleration structure over the tessellation, and then trace against that. This can be fairly expensive. You instead gain great efficiency by doing the subdivision inside your shader instead. For instance, you can intersect over 1M hairs in real time.
Likely you'd need to define a bounding box over all the hairs and then use a custom intersection shader.
But this does not make much sense for something like tessellated terrain, I think, especially if it constantly changes in relation to camera distance.
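
As a rough illustration of the bounding-box-plus-intersection-shader route, this is how DXR exposes it on the host side; a hypothetical sketch only (the intersection shader and the acceleration structure build are omitted): the geometry is declared as procedural AABBs, and what a ray actually hits inside each box is decided by the custom intersection shader in the hit group.

    #include <d3d12.h>

    // Declare procedural geometry for a bottom-level acceleration structure:
    // 'aabbBuffer' points to an array of D3D12_RAYTRACING_AABB (e.g. one box per hair clump).
    D3D12_RAYTRACING_GEOMETRY_DESC MakeProceduralGeometry(D3D12_GPU_VIRTUAL_ADDRESS aabbBuffer,
                                                          UINT64 aabbCount)
    {
        D3D12_RAYTRACING_GEOMETRY_DESC desc = {};
        desc.Type  = D3D12_RAYTRACING_GEOMETRY_TYPE_PROCEDURAL_PRIMITIVE_AABBS;
        desc.Flags = D3D12_RAYTRACING_GEOMETRY_FLAG_OPAQUE;
        desc.AABBs.AABBCount           = aabbCount;
        desc.AABBs.AABBs.StartAddress  = aabbBuffer;
        desc.AABBs.AABBs.StrideInBytes = sizeof(D3D12_RAYTRACING_AABB);
        return desc;
    }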
 
So what we need is work generation on the GPU completely independent from the CPU. This is surely already possible, but not yet exposed even in low-level APIs.
Indeed, it seems all the gains they can achieve right now come from the CPU side, and they're not achieving stellar results there either.
I think the main reason is simply the cost of maintaining multiple APIs. You must support DX11 for Win7 anyway, and it works, so why spend lots of time just to get a few FPS more?
We have multiple developers going on the record saying things like: DX12 could be futile.
Don’t develop on DirectX 12 just chasing performance gains, says Ubisoft programmer
“If you take the narrow view that you only care about raw performance you probably won’t be that satisfied with the amount of resources and effort it takes to even get to performance parity with DX11,” explained Rodrigues. “I think you should look at it from a broader perspective and see it as a gateway to unlock access to new exposed features like async compute, multi GPU, shader model 6, etc.”
https://www.pcgamesn.com/microsoft/ubisoft-dx12-performance

Another developer states that developing for DX12 is hard, and can be worth it or not depending on your point of view; the gains from DX12 can easily be masked on high-end CPUs running maxed-out settings, leaving you with nothing. They also call async compute inconsistent and in need of further improvements, and state that the DX12 driver is still very much relevant to the scene.



https://www.techpowerup.com/231079/is-directx-12-worth-the-trouble
 
We're going pretty OT.
There are definitely truths there, though you're linking 2017 opinions; the sentiment may be different now that we're 2 years later, and even more so 2 years after that when next gen launches.

That being said, "porting" your DX11 game to DX12 is not going to yield results. It's still a DX11 pipeline, just coded in DX12. You want a pipeline that's developed specifically for DX12, and that's where you will see gains.

DX12 is an API for multiple IHVs to meet. The behaviours and inputs are described by the API, and the drivers are meant to make that happen. So yes, drivers and driver performance are still going to matter, and they can improve over time and adapt to the desired DX12 behaviour through architectural changes. And I can't see a scenario where you code to the metal on multiple IHVs; at that point you've lost the point of having a common interface like DX12.

Yes DX12 is hard. No one said lower level coding would be easy.
 
Hmmm... assume you want to make a cross-platform game: consoles, PC, Mac, maybe Linux as well. How many APIs do you have to support?
And now tell me how many GPU IHVs we have.

Maybe it would be better if each IHV created its own tailored API, and each platform supported just that? Seriously, maybe it would not be such a bad idea nowadays.
 
We've been there in the '90s. It wasn't good.
 
We've been there in the '90s. It wasn't good.
It wasn't that bad. Glide was the easiest API I've ever used, haha :)

I know what you mean, but with three times more APIs than vendors, we may have to admit the alternative has failed even worse.
 
Even if an IHV wanted to, would they even be capable of delivering an API on Apple's OS without Apple's support? I don't think Apple would tolerate fragmentation and allow anything to compete with Metal that wasn't directly released and controlled by Apple.

I also don't ever see Microsoft and Sony having exactly the same APIs for their consoles either, so there will always be those variations.
 

Probably not. When a group sent out a press release saying that Vulkan was coming to iOS and OSX, I was a bit hopeful.

But it turns out that Vulkan on those devices will just be a wrapper on top of Metal.

Regards,
SB
 
Falcor 3.2.1 was released a while ago.
Falcor 3.0 added support for DirectX Raytracing. As of Falcor 3.1, special build configs are no longer required to enable these features. Simply use the DebugD3D12 or ReleaseD3D12 configs as you would for any other DirectX project. The HelloDXR sample demonstrates how to use Falcor’s DXR abstraction layer.

  • Requirements:
    • Windows 10 RS5 (version 1809)
    • A GPU which supports DirectX Raytracing, such as the NVIDIA Titan V or GeForce RTX (make sure you have the latest driver)
Falcor doesn’t support the DXR fallback layer.
https://developer.nvidia.com/falcor
 
Sadly? At least it pushes their games towards open technologies and standards, something that couldn't be said when they were sponsored by NVIDIA.
Also, AMD sponsoring them doesn't suggest in any way that they won't include DXR support in future titles.
Sure, I like those things, but we are talking about AAA games here - it's not like they are really open anyway.

Say what you will about GameWorks or NV sponsorship and performance across IHVs, but with it we do at least get some neat extra graphical effects on PC that we don't usually see at all with AMD sponsorship. Something above the consoles in areas beyond shadow resolution, rendering resolution, or simple sample counts. VXAO, HairWorks, etc. are pretty neat in that regard!

BTW, regarding the VK Q2 PT video, some interesting facts from my conversation with Christoph:
1. It turns out there is little overhead to ray tracing instead of rasterization. According to Christoph, tracing the primary visibility costs around 2ms at 1440p on an RTX 2080 Ti.
2. There is an estimate that RT "contributes to around 50% of the path tracer runtime."
3. Path tracing instead of fixed-grid rasterisation makes TAA much more effective: you can jitter individual pixels rather than the entire grid.
4. In a different DXR framework, Christoph noted a 7x performance advantage of the RTX 2080 Ti vs. the Titan Xp.
5. Full-screen effects like fisheye rendering or cool "framebuffer distortion" looks are implemented by offsetting the ray directions.
 
Say what you will about GameWorks or NV sponsorship and performance across IHVs, but with it we do at least get some neat extra graphical effects on PC that we don't usually see at all with AMD sponsorship. Something above the consoles in areas beyond shadow resolution, rendering resolution, or simple sample counts. VXAO, HairWorks, etc. are pretty neat in that regard!
IIRC the first semi-accurate hair simulation in games was the Tomb Raider reboot, and AMD developed the tech. (Just to mention: we need neither AMD nor NV to do something like that.)
What I criticize about GameWorks is two things:
Some years earlier, NV released free SDKs with examples, including much of their research progress. (The term 'open source' was not so overused back then, but that's what it was.)
Nowadays they try to limit their stuff to their own HW exclusively, and it is no longer open sourced.
... Surely that's just business, but as a consumer I'm not happy with features limited to certain vendors, knowing there is no technical reason for it. I'm just not sure which to blame, NV or 'lazy' devs ;)

2. There is an estimate that RT "contributes to around 50% of the path tracer runtime."

So 2ms primary visibility, 8ms path tracing, leaving 6ms for denoising?

If somebody has the time to turn filtering on/off, I'd be curious about the performance difference... (the console command was visible in the DF video, IIRC)

5. Full-screen effects like fisheye rendering or cool "framebuffer distortion" looks are implemented by offsetting the ray directions.
Ha, yeah! It would be easy to make the FOV larger than 180 degrees, with little distortion at the centre but enemies behind you visible at the screen borders. I always wanted to try this. With rasterization you'd need to blend two framebuffers, which is cumbersome, but with RT it's a minute of work. (There was already some Quake mod with such projections around.)
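
A quick sketch of what 'offsetting the ray directions' could look like: an assumed equidistant fisheye mapping (not what the Q2VKPT code actually does), where the angle from the view axis grows linearly with distance from the screen centre, so the FOV can exceed 180 degrees and enemies behind the camera end up near the borders.

    #include <cmath>

    struct Vec3 { float x, y, z; };

    // Camera-space ray direction for pixel (px, py); -Z is forward.
    // fovDegrees is the full field of view along the vertical axis and may exceed 180.
    Vec3 FisheyeRayDirection(float px, float py, int width, int height, float fovDegrees)
    {
        // Normalised offset from the image centre; the vertical axis spans [-1, 1].
        float aspect = float(width) / float(height);
        float nx = (2.0f * (px + 0.5f) / width - 1.0f) * aspect;
        float ny = 1.0f - 2.0f * (py + 0.5f) / height;

        float r     = std::sqrt(nx * nx + ny * ny);                  // distance from centre
        float theta = r * 0.5f * fovDegrees * 3.14159265f / 180.0f;  // angle from view axis
        float phi   = std::atan2(ny, nx);                            // azimuth around the axis

        return { std::sin(theta) * std::cos(phi),
                 std::sin(theta) * std::sin(phi),
                 -std::cos(theta) };
    }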
 