Next gen lighting technologies - voxelised, traced, and everything else *spawn*

I'm not sure if I'm remembering correctly, but I'm pretty sure Claybook performance is not affected by the number of edits made to the world SDF - I don't know the correct term. I'm assuming Dreams would be the same. If that's true, is it likely we'll see a larger ray-marched SDF game, something like a linear third-person title? Is the issue mainly content creation, like modelling and animation? Or are there missing pieces, like high-frequency texturing and materials?

Curious if we'll ever see an alternative to DXR to compare performance.
 
Still, it doesn't look anywhere near as good as the top-tier titles of this gen.

Likely a matter of personal preference and art style. I have seen scenes that look pretty close to regular games.
I think the main limitation they have is the missing cube maps, so no PBR. But this is not a technical limitation - users generating the content probably have no fun placing probes manually.

But it has some interesting features neither rasterization nor current RT can offer:
High-frequency diffuse geometry
'Proper' DOF and motion blur (although I have not seen how the noise looks in the real game, only on YouTube)

It also supports huge worlds and rapid content generation (but the latter could be used for polys too).

I should have mentioned it earlier - it's really the thing I keep talking about: no fixed function, but the same performance as rasterization with more flexibility, and a creative development process that goes beyond the usual restrictions. According to the paper it was a trail of failures to get there - that's the spirit!

I'm not sure if I'm remembering correctly, but I'm pretty sure Claybook performance is not affected by the number of edits made to the world SDF - I don't know the correct term. I'm assuming Dreams would be the same. If that's true, is it likely we'll see a larger ray-marched SDF game, something like a linear third-person title? Is the issue mainly content creation, like modelling and animation? Or are there missing pieces, like high-frequency texturing and materials?

Dreams uses SDF only as an intermediate data structure for editing. The result is converted to point clouds before rendering; it is a splatting approach.
On PC this still isn't possible efficiently because 64-bit atomics to the framebuffer are not exposed. (A UE4 dev requested this recently - maybe things have changed.) It would be super interesting for foliage, but also for distant landscape.
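For what it's worth, here is a rough sketch of what that 64-bit atomic splatting trick looks like. This is my own illustration (not Dreams code), under the assumption that depth goes in the high bits and colour in the low bits, shown on the CPU with std::atomic just to make the idea concrete:

```cpp
// Pack (depthKey << 32) | rgba8 per pixel; one atomic min keeps the nearest splat.
// On GPU, this is what a 64-bit atomic to the framebuffer would enable.
#include <atomic>
#include <cstdint>
#include <vector>

struct SplatBuffer
{
    int width, height;
    std::vector<std::atomic<uint64_t>> pixels;   // smaller value == closer splat

    SplatBuffer(int w, int h) : width(w), height(h), pixels(size_t(w) * h)
    {
        for (auto& p : pixels) p.store(UINT64_MAX);          // clear to "infinitely far"
    }

    void splat(int x, int y, float depth, uint32_t rgba)
    {
        if (x < 0 || y < 0 || x >= width || y >= height) return;
        // Quantise depth (assumed >= 0) into an ordered 32-bit key.
        uint64_t key = (uint64_t(uint32_t(depth * 1024.0f)) << 32) | rgba;
        std::atomic<uint64_t>& px = pixels[size_t(y) * width + x];
        uint64_t cur = px.load();
        // Emulated atomicMin: retry until our key is stored or a closer one wins.
        while (key < cur && !px.compare_exchange_weak(cur, key)) {}
    }
};

int main()
{
    SplatBuffer fb(640, 360);
    fb.splat(10, 10, 5.0f, 0xff0000ffu);   // far splat
    fb.splat(10, 10, 2.0f, 0xff00ff00u);   // nearer splat wins
    return 0;
}
```

A GPU version would do the same per-pixel atomic min straight into the framebuffer, which is exactly the operation that isn't exposed on PC.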

Both SDF (sphere tracing) and splatting performance are affected by scene complexity. SDF (like RT) suffers with diffuse geometry, and splatting has the overdraw problem (like rasterization).
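To make the sphere-tracing side concrete, a minimal ray-march loop looks roughly like this. It's a toy example of mine with a placeholder single-sphere SDF, not code from either game; the per-ray cost is the step count, which is exactly what blows up with dense, high-frequency geometry:

```cpp
#include <cmath>
#include <cstdio>

struct Vec3 { float x, y, z; };

static Vec3 add(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
static Vec3 mul(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }

// Placeholder scene: a single unit sphere at the origin.
static float sceneSDF(Vec3 p)
{
    return std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z) - 1.0f;
}

// March along the ray, stepping by the distance bound the SDF returns.
static bool sphereTrace(Vec3 origin, Vec3 dir, float maxDist, float& hitT)
{
    float t = 0.0f;
    for (int i = 0; i < 128 && t < maxDist; ++i)
    {
        float d = sceneSDF(add(origin, mul(dir, t)));
        if (d < 1e-3f) { hitT = t; return true; }   // close enough: report a hit
        t += d;                                      // safe step: nothing is nearer
    }
    return false;                                    // missed, or ran out of steps
}

int main()
{
    float t;
    if (sphereTrace({ 0, 0, -3 }, { 0, 0, 1 }, 100.0f, t))
        std::printf("hit at t = %.3f\n", t);         // expected: ~2.0
    return 0;
}
```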

high-frequency texturing and materials?
In Dreams they solve the grainy output with regular TAA. According to the (outdated) paper it's not perfect and requires more work.

This depends a lot on what you prefer. There are two kinds of people:
Those who want smooth images and dislike high frequencies (like me - I reduce game resolution to 1080p because the image quality gets better due to smoothing, and nowadays TAA is so good it causes no scaling artifacts)
Those who want crisp and sharp images, 4K screens and textures (they never agree that CG is usually much too sharp)

I assume the first kind likes splatting more than the latter does.


With RTX, point clouds may become very inefficient because of the 'custom mini acceleration structure embedded in BVH boxes' problem. Would be interesting to know... Maybe they'll add a point primitive?
 
Very interesting discussion here (in German): https://www.forum-3dcenter.org/vbulletin/showthread.php?p=11887094#post11887094

Somebody says BFV with a Volta Titan V is as fast (60-100 fps, WQHD, high settings) as with an RTX GPU.
I would say this is nonsense, but it agrees with similar results for the Star Wars demo, of which I posted a screenshot earlier here somewhere.

My conclusion: shading cost is so high that the benefit from RT cores vanishes. (Volta has no RT cores, but it already had the fine-grained scheduling, AFAIK.)
This may also hint that NV's ray batching and compacted shading / tracing processing logic is not as advanced as I thought.
 
Seems Battlefield V is confirmed to be receiving DLSS support. According to VideoCardz, NVIDIA has some marketing materials claiming the RTX 2060 is capable of Medium DXR at 1080p60, and more with DLSS on.

https://videocardz.com/79505/nvidia-geforce-rtx-2060-pricing-and-performance-leaked
Except that the results don't match up.
First they say it can do unspecified RT at 1080p 65 FPS, then they say it can do Ultra RT at 1080p 58 FPS or Medium RT at 1080p 66 FPS.
We know from past tests on the same architecture that High results are close to Ultra, not close to Medium, so 65 FPS can't be High either.

Also, 1080p RT Off goes from 110 FPS in the first claim to 90 FPS.
 
That Titan is a bigger, more expensive chip. The benefit of the RT cores doesn't vanish - it's enabling a smaller, cheaper die to achieve the same workload as a bigger, faster one (if the data about performance is correct). The more apt interpretation is that ray testing only corresponds to a small part of the raytracing performance costs, with surface shading representing by far the more expensive aspect. However, that'll only apply to reflected surfaces. Lights and shadows will be bottlenecked by ray tests. Comparing Volta Titan V versus RTX in lighting-only scenes (simple coloured blocks with GI) may give very different results.

We need more demos! ;) Honestly, why aren't there some demos similar to the ones we've seen for volumetric lighting? Surely some RTX owners are dabbling?
 
I should have said 'almost vanishes', but both chips are large and one is older gen. Something is wrong here. We need to find out how to utilize HW better.

We need more demos! ;) Honestly, why aren't there some demos similar to the ones we've seen for volumetric lighting? Surely some RTX owners are dabbling?

Yeah. But IIRC it took half a year or a whole year after the paper was released until I saw a VCT demo. RTX is very easy to use and test in comparison, but people have zero experience with denoising, so for 'demos' it will take time.
But I want a reflection test app:

Unified shader vs. tens or hundreds of shaders.
Adjustable roughness, so we can see the effect of divergent directions. (Maybe even with advanced tech you never get 32 close hits per material to utilize the shader cores fully.)
Variable scene, e.g. created from some cubified procedural volume noise, so we can test dense vs. sparse geometry (a throwaway sketch of that idea is at the end of this post).
Turn shading on/off at any time.

No denoising, zeppelins or shiny robo girls required :)

Probably I need to do this myself. But not now... :(
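As a rough illustration of the 'cubified procedural volume noise' scene mentioned in the list above, even something as dumb as thresholding a hash over a voxel grid gives a density knob to play with. This is my own throwaway sketch, with a hash standing in for proper 3D value noise:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

struct Cube { int x, y, z; };   // unit cube at integer coordinates

// Small integer hash mapped to [0,1) - a stand-in for real 3D value noise.
static float hashNoise(int x, int y, int z)
{
    uint32_t h = uint32_t(x) * 73856093u ^ uint32_t(y) * 19349663u ^ uint32_t(z) * 83492791u;
    h ^= h >> 13; h *= 0x5bd1e995u; h ^= h >> 15;
    return (h & 0xffffffu) / float(0x1000000);
}

// Emit one cube per "solid" voxel; density controls sparse vs. dense geometry.
static std::vector<Cube> buildScene(int size, float density)
{
    std::vector<Cube> cubes;
    for (int z = 0; z < size; ++z)
        for (int y = 0; y < size; ++y)
            for (int x = 0; x < size; ++x)
                if (hashNoise(x, y, z) < density)
                    cubes.push_back({ x, y, z });
    return cubes;
}

int main()
{
    auto sparse = buildScene(64, 0.05f);
    auto dense  = buildScene(64, 0.60f);
    std::printf("sparse: %zu cubes, dense: %zu cubes\n", sparse.size(), dense.size());
    return 0;
}
```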
 
It should be easy to create some generic test scenes for an experienced dev. Probably plenty of in-house experiments out there! What's a bit weird is the lack of stuff on YouTube - people are forever sharing their projects. Stuff like this...


Are RTX cards just too pricey for experimenters to own? Did people get DXR running on 1080s? That would be a good comparison too.

Incidentally, notice the noise on that is static, like the Gran Turismo raytracing video.
 
Devs are likely too busy with mesh shaders, which give an immediate improvement, and other microarchitecture improvement tests, before they get around to the more drastic changes required for BVH/RT?
 
In games, probably. But for standalone experiments, raytracing would be pretty exciting. Should be quick to implement and with very visible results.
 
That Titan is a bigger, more expensive chip. The benefit of the RT cores doesn't vanish - it's enabling a smaller, cheaper die to achieve the same workload as a bigger, faster one (if the data about performance is correct). The more apt interpretation is that ray testing only corresponds to a small part of the raytracing performance costs, with surface shading representing by far the more expensive aspect. However, that'll only apply to reflected surfaces. Lights and shadows will be bottlenecked by ray tests. Comparing Volta Titan V versus RTX in lighting-only scenes (simple coloured blocks with GI) may give very different results.

We need more demos! ;) Honestly, why aren't there some demos similar to the ones we've seen for volumetric lighting? Surely some RTX owners are dabbling?

Titan V is bigger, but don't forget it also has double precision and 128 ROPs.
Yes, we need more demos; I don't have the time to make some.
The RT cores can do fast BVH traversal for coherent rays.
They must be sitting almost idle in BFV if they can do 10 Gray/s but only do something like 96 Mray/s (40% of 4 MP x 60 Hz).
Not sure that will improve much for 1 spp random rays for lighting, but yes, shading will be less of a bottleneck.
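Just to show where the 96 Mray/s figure comes from, here is the back-of-envelope version. The inputs are the assumptions above (roughly a 4 MP frame, ~40% of pixels firing a reflection ray, 60 fps, 10 Gray/s quoted peak), not measurements:

```cpp
#include <cstdio>

int main()
{
    double pixels      = 4.0e6;     // roughly 2560x1440
    double rayFraction = 0.40;      // BFV's variable-rate reflection rays
    double fps         = 60.0;
    double peakRays    = 10.0e9;    // quoted RT-core peak throughput

    double raysPerSecond = pixels * rayFraction * fps;   // = 96 Mray/s
    std::printf("%.0f Mray/s used = %.1f%% of the quoted peak\n",
                raysPerSecond / 1.0e6, 100.0 * raysPerSecond / peakRays);
    return 0;
}
```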
 
If you follow Twitter, a lot of devs just don't have personal RTX cards yet. And they may not be at liberty to share what happens at work.
 
With something like that, you could play around with sampling frequency and shader complexity and such, and generate a load of metrics comparing quality vs. tradeoffs. It'd be a fun time to be at the forefront of rendering tech!
 
Something is wrong here. We need to find out how to utilize HW better.
Not really, just apply Amdahl's law to RT in BF V and it will be super easy to model the perf of something like a Titan V.

For example, the RTX 2080 Ti shows 129 FPS @ 2560x1440 Ultra settings w/o RTX; with RTX Ultra, perf drops to 68 FPS in the same scene.

Based on these results, the RTX portion of the total avg frame time is 47% - that's BVH construction, geometry skinning, RT shading, CS spatial filtering and temporal accumulation for RT, and ray-triangle / bounding-box intersection tests. As we know, only the intersection tests are HW accelerated. Let's say these take 15% out of the total 47%, and let's say that w/o HW acceleration they would take 10x more time: 15%*10 + 32% = 182% for the RTX portion on a 2080 Ti w/o HW acceleration.

A 2080 Ti w/o HW acceleration would then be 2.35x slower, because 182% + 53% (raster part) = 235%; that's 29 FPS at 2560x1440. So a Titan V can be close to 60 FPS at lower resolutions, at least in some scenes, if the avg intersection-test portion of the frame is less than 15% on a 2080 Ti.
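For anyone who wants to poke at the numbers, here is the same model in a few lines. All the inputs are the assumptions from this post (the 15% share and the 10x factor are guesses, not measurements):

```cpp
#include <cstdio>

int main()
{
    double fpsRasterOnly = 129.0;   // 2080 Ti, 1440p Ultra, RTX off
    double fpsWithRT     = 68.0;    // same scene, RTX Ultra

    // With RT on, the whole frame is "100%". The RT-related portion is
    // everything that disappears when RTX is turned off.
    double rtPortion     = 1.0 - fpsWithRT / fpsRasterOnly;     // ~0.47
    double rasterPortion = 1.0 - rtPortion;                     // ~0.53

    // Assumptions: ~15% of the frame is HW-accelerated intersection testing,
    // and without RT cores that part runs ~10x slower.
    double hwPart   = 0.15;
    double restOfRT = rtPortion - hwPart;                       // ~0.32
    double scaled   = hwPart * 10.0 + restOfRT + rasterPortion; // ~2.35

    double frameMs  = 1000.0 / fpsWithRT;
    std::printf("Modelled no-RT-core frame: %.1f ms (%.0f fps)\n",
                frameMs * scaled, 1000.0 / (frameMs * scaled)); // ~34.6 ms, ~29 fps
    return 0;
}
```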
 
Still something wrong; I could use the same math and come to a different conclusion:

129 FPS = 7.75 ms
68 FPS = 14.7 ms

RTX cost = 7 ms
15% RT core = 1 ms

So a non-RTX GPU would require 6 + 1*10 ms RT + 7.7 ms raster = 23.7 ms = 42 fps, but he gets 80 fps average on his Titan V (not knowing which level and so on...).

What can be wrong?
The 10x speed-up is more likely a peak value not achieved in practice?
Volta is broken, e.g. RT core <-> shader communication?

Can it be that the fixed-function RT core is not worth it, and they would have achieved the same with more shader cores instead? Is it just that shading, BVH and denoising sum up to so much that the tracing cores cannot really help?

I assume shadow and AO rays would draw a different picture, but even then the question remains.

You may dig up my older post about the Star Wars Reflections demo; I think it was in this thread. According to it, the 2070 was only slightly faster than a 1080 Ti, IIRC (but we know the Titan V is about twice as fast with the DXR SDK app, because it already has that fine-grained scheduling stuff).
Not sure if this is valid - the site removed the benchmarks shortly after.
Why did they get the demo for benchmarks, but we do not?

NV has responded to similar thoughts here: http://www.pcgameshardware.de/Battl...ng-wird-wohl-ohne-RT-Kerne-berechnet-1272342/
They dodge the question - they do not explain why both cards are equally fast in practice. They just mention the 10x speed-up with RT cores.

We cannot clarify this without knowing more about the tech than just speculative math with FPS numbers.
 
Remedy's RTX presentation for Northlight gives some examples comparing Titan V and Turing. For ambient occlusion with a single ray per pixel @1080p they saw 5 ms on Titan V and 1 ms or less on Turing. Shadows with a single ray per pixel @1080p were 4 ms on Titan V and roughly 1 ms or less on Turing. For their ray-traced lighting, the final image composition is about 3-5 ms for rays @1080p on Titan V and 1 ms or less on Turing. These are ray costs, not including shading. Of course this was before NVIDIA made driver improvements etc.

It could be that, if you're clever, you can hide some of the cost of rays by overlapping them with other compute work before you start shading?
 
It could be that, if you're clever, you can hide some of the cost of rays by overlapping them with other compute work before you start shading?

Sure, that's what you want to do. A BFV dev mentioned compute and RT cannot overlap yet, but NV is working on it... this was in the Eurogamer interview before the patch.
But yes, and this is a reason why we cannot do such simple math as above.
 
I can't be certain on this so don't take it as accurate, but wasn't there talk about not being able to run compute work while running RT operations? Or was that based on early patches and temperature measurements? It was something funky, so I wouldn't be surprised if it turned out not to be correct, or if it was based on early drivers which are far from optimal.
 
@BRiT @JoeJ

https://www.eurogamer.net/articles/digitalfoundry-2018-battlefield-5-rtx-ray-tracing-analysis

What are planned optimisations for the future?

Yasin Uludağ: One of the optimisations that is built into the BVHs are our use of “overlapped” compute - multiple compute shaders running in parallel. This is not the same thing as async compute or simultaneous compute. It just means you can run multiple compute shaders in parallel. However, there is an implicit barrier injected by the driver that prevents these shaders running in parallel when we record our command lists in parallel for BVH building. This will be fixed in the future and we can expect quite a bit of performance here since it removes sync points and wait-for-idles on the GPU.

We also plan on running BVH building using simultaneous compute during the G-Buffer generation phase, allowing ray tracing to start much earlier in the frame, and the G-Buffer pass. Nsight traces shows that this can be a big benefit. This will be done in the future.

It's more specific than compute overlapping with RT. If you have multiple command lists in your queue, I guess they should be able to execute in parallel, but the NVIDIA driver was inserting a resource barrier that forced the command lists for their BVH building to run sequentially. I'm curious whether this is a general compute bug, or if it's specific to DXR because the command lists contain some commands related to DXR (or something like that).

I'm not sure if this driver fix is complete, or if they've shifted their ray tracing to start earlier in the frame. Would be really nice if Digital Foundry did a follow-up interview at some point to see where they're at with their implementation.
 
It reminds me a lot of the back and forth about async compute. I'm out of date and do not know if they ever fixed this. (I should have tested it when I had the 1070, but forgot...) :)
 