Next gen lighting technologies - voxelised, traced, and everything else spawn

OlegSH · Nov 13, 2019

Shifty Geezer said:
Hmmm. It seems like they're getting very usable results without needing RTRT hardware.

Just for absolute mirror surfaces - no need for complex denoising (or no need for denoising at all) and rays should be very coherent - no thread divergence, which is great for traversal on wide SIMD processors.
How many materials have an absolutely smooth surface?

No need to build BVH every frame either, since the triangles at the bottom of the voxel acceleration structure are linked to SVO nodes and the whole scene is pretty much static (so the acceleration structure is static and prebuilt), except for a single moving drone.

Remove all these limitations - add support for a wide range of materials like in BFV, add global destruction and tons of moving objects like in BFV (requires runtime BVH building), add complex denoising for rough materials - and the performance numbers will change drastically.

Not even talking about geometry complexity, which is visually much higher in BFV.

JoeJ · Nov 13, 2019

Download link: https://www.cryengine.com/marketplace/product/neon-noir
Requires CryEngine account, still have one. But 4.4 GB... will take some time...

Scott_Arm · Nov 13, 2019

Rodéric said:
It's odd to see parity between a RX5700XT and a GTX1070, I wonder what that code does... (which bottleneck it hits)

In part of the video they mention that GPUs are getting better at branching code. I wonder if some of this is branching performance within compute shaders. It's something you wouldn't typically see, as branching is normally avoided on GPUs.

JoeJ · Nov 13, 2019

OlegSH said:
Just for absolute mirror surfaces - no need for complex denoising

For rough materials they could fall back to using voxels+mips, which is good for perf, and the prefiltered results likely still need no denoising.
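A minimal sketch of the voxels+mips idea, assuming a generic voxel cone tracing setup rather than anything confirmed about the CryEngine demo: the cone footprint grows with distance, and the mip level is picked so that one voxel roughly covers that footprint. The function name and all parameters here are hypothetical.

```python
import math

def cone_mip_level(distance, half_angle_rad, base_voxel_size, max_mip):
    """Pick the voxel mip whose cell size roughly matches the cone footprint."""
    # Footprint diameter of the cone at this distance from its apex.
    diameter = 2.0 * distance * math.tan(half_angle_rad)
    # Footprints smaller than a base voxel just use the finest level.
    if diameter <= base_voxel_size:
        return 0.0
    # Each mip doubles the voxel size, so the level is log2 of the size ratio,
    # clamped to the coarsest mip that exists.
    return min(math.log2(diameter / base_voxel_size), float(max_mip))
```

Because the coarser mips are prefiltered, a wider cone (rougher material) samples fewer, larger voxels, which is why this path would probably not need a denoiser.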

OlegSH said:
No need to build BVH every frame

For dynamic objects they could transform the prebuilt BVH. (guess World of Tanks does this)
The remaining skinned characters could just refit the prebuilt tree.
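The refit idea can be sketched roughly as below. This is an illustration under assumptions, not the demo's code: the node layout (a flat list of dicts with `leaf`, `left`, `right`, `lo`, `hi` keys) is hypothetical, and children are assumed to be stored after their parent so a reverse pass visits children first.

```python
def refit(nodes, leaf_bounds):
    """Recompute AABBs bottom-up while keeping the tree topology fixed.

    nodes: flat list of dicts; children have higher indices than their parent.
    leaf_bounds: per-leaf (min_xyz, max_xyz) tuples for the deformed geometry.
    """
    for node in reversed(nodes):
        if node["leaf"] is not None:
            # Leaf: take the new bounds of the skinned primitives directly.
            node["lo"], node["hi"] = leaf_bounds[node["leaf"]]
        else:
            # Inner node: union of the two already-updated child boxes.
            left, right = nodes[node["left"]], nodes[node["right"]]
            node["lo"] = tuple(min(a, b) for a, b in zip(left["lo"], right["lo"]))
            node["hi"] = tuple(max(a, b) for a, b in zip(left["hi"], right["hi"]))
    return nodes
```

Refitting is linear in the node count and needs no sorting or splitting, which is what makes it cheaper than a rebuild. The tree quality degrades if the deformation is large.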

But, just dreaming... personally i gave up considering compute RT.

JoeJ · Nov 13, 2019

Scott_Arm said:
In part of the video they mention that GPUs are getting better at branching code. I wonder if some of this is branching performance within compute shaders. It's something you wouldn't typically see, as branching is normally avoided on GPUs.

Could be a factor because AMD's 64 threads are more likely to diverge than NV's 32. But now Navi has a 32 mode too, and the whitepaper says compute is more likely to be compiled for 32 mode, and pixel shaders more likely for 64. But that's just guessing and i don't know where the tracing happens.
I do not understand the statement, because nothing has changed in how GPUs do branching (aside from some subgroup optimizations which have existed for many years).
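To put a rough number on the 64 vs. 32 point, here is a toy model. It assumes every thread takes a branch independently with the same probability, which real (spatially coherent) rays do not do, so treat it only as an upper-bound intuition. The function name is made up for this sketch.

```python
def divergence_probability(wave_size, p_taken):
    """Chance that a wave ends up with threads on both sides of a branch,
    assuming each thread takes it independently with probability p_taken."""
    all_taken = p_taken ** wave_size
    none_taken = (1.0 - p_taken) ** wave_size
    return 1.0 - all_taken - none_taken

for wave_size in (32, 64):
    print(wave_size, divergence_probability(wave_size, 0.99))
```

With a branch taken 99% of the time, this model gives roughly 27% divergent waves at width 32 and roughly 47% at width 64.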

I still guess NV is better with random memory access. David Graham speculated that Turing's concurrent integer / floating point units could be a factor in World of Tanks, but this applies to most other shaders too and is not specific to RT. (How do cache sizes differ?)

Deleted member 2197 · Nov 13, 2019

chris1515 said:
I think the demo is using RTX

Sounds like RTX acceleration is not implemented yet, but the demo can run on RTX cards.

Scott_Arm · Nov 13, 2019

JoeJ said:
Could be a factor because AMD's 64 threads are more likely to diverge than NV's 32. But now Navi has a 32 mode too, and the whitepaper says compute is more likely to be compiled for 32 mode, and pixel shaders more likely for 64. But that's just guessing and i don't know where the tracing happens.
I do not understand the statement, because nothing has changed in how GPUs do branching (aside from some subgroup optimizations which have existed for many years).

I still guess NV is better with random memory access. David Graham speculated that Turing's concurrent integer / floating point units could be a factor in World of Tanks, but this applies to most other shaders too and is not specific to RT. (How do cache sizes differ?)

Time stamped:

Not sure what improvements he's referring to.

It's pretty interesting. They have a distance cutoff where they switch between a triangle representation for near objects and voxels for more distant objects when they're casting rays. So I imagine when they implement RTX it would probably be for "near" objects only.
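One possible reading of that cutoff, sketched under assumptions: the ray is tested against triangles only up to the cutoff distance, and the voxel structure takes over beyond it. `trace_triangles` and `trace_voxels` are hypothetical callbacks, each taking (origin, direction, t_min, t_max) and returning a hit or None. How the demo actually blends the two is not stated in the thread.

```python
def trace_hybrid(origin, direction, cutoff, trace_triangles, trace_voxels):
    """Trace exact geometry near the ray origin, voxels past the cutoff."""
    # Near range: precise triangle intersection, limited to [0, cutoff].
    hit = trace_triangles(origin, direction, 0.0, cutoff)
    if hit is not None:
        return hit
    # Far range: cheaper, coarser voxel representation from the cutoff onward.
    return trace_voxels(origin, direction, cutoff, float("inf"))
```

If hardware RT were added later, it would slot into the `trace_triangles` role, with the voxel fallback left as is.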

OlegSH · Nov 13, 2019

JoeJ said:
For rough materials they could fall back to using voxels+mips, which is good for perf, and the prefiltered results likely still need no denoising.

Even diffuse VXAO in Rise of the Tomb Raider is noticeably inferior to GI in Metro Exodus quality wise.
For less diffuse effects, such as glossy reflections, the difference will be even more noticeable: a voxel grid has a fixed resolution after all, and it's not like dynamic voxelization is cheap.
Moreover, glossy reflections require tracing a lot of narrow cones, and tracing many cones for stable reflections won't be cheap either.
Most likely, classic RT will be cheaper, better quality-wise and easier to implement. There is a reason behind the lack of glossy reflections via cone tracing in modern games.

JoeJ · Nov 13, 2019

Scott_Arm said:
So I imagine when they implement RTX it would probably be for "near" objects only.

If such a fallback makes sense even with RTX present, maybe it's still worth working on some compute alternatives. But with upcoming LOD support for HW RT, i'm still unwilling to invest personally.

Tried the demo, it runs very well. 1080p at high, > 60 fps on Vega 56. Feels solid and smooth, very impressed. The only case where perf drops < 60 is the scene showing a few bullets in a puddle and nothing else. Maybe it's reflections of reflections.

Sadly i can not spot the transition between triangles / voxels, and i can not edit a cfg file to show this.

OlegSH said:
Even diffuse VXAO in Rise of the Tomb Raider is noticeably inferior to GI in Metro Exodus quality wise.

Agree, looks crappy - but voxels fail on accurate contact shadows, which are the most important feature for AO. CE 'could' fix this now with triangle support.
However, i'm the last one to be convinced by voxels.
(Give me some time, then we can argue about classic RT being the best option for GI or not - looots of time.)

---

More voxels with impressive tracing results. Has been discussed already but now announced as an upcoming game:

PSman1700 · Nov 13, 2019

JoeJ said:
But 4.4 GB... will take some time...

Not long on a gigabit connection

fellix · Nov 13, 2019

My GTX 1080Ti scored 9281 points at 1080p Ultra settings.

Kaotik · Nov 13, 2019

fellix said:
My GTX 1080Ti scored 9281 points at 1080p Ultra settings.

Seems to just crash here upon launch: it loads fine, and after clicking start benchmark it loads up and I can see the framerate counter in the upper left corner, then the window turns black and it exits to desktop.
Might need the newest drivers? 5700XT on 19.10.2 at the moment.
edit: dropping resolution from 1440p to 1080p got it to run, but it crashes to desktop when the drone is landing.

DavidGraham · Nov 14, 2019

RX 5700XT is at least 10% faster than Radeon VII in Crytek's RT demo. RTX 2080 is about 80% faster. So even a copious amount of bandwidth doesn't seem to help Vega.

JoeJ · Nov 14, 2019

DavidGraham said:
So even a copious amount of bandwidth doesn't seem to help Vega.

This post has some explanation: https://forum.beyond3d.com/posts/2085699/
(Personally i never understood GPU memory systems that well, or why exactly NV seemed better with memory access.)

BTW, recently i learned Kepler did not support atomics to LDS memory and had to emulate them using slow global memory. Also there was no L2 cache yet, IIRC. This may explain why i saw such large compute differences between Kepler and GCN even when the cards performed similarly in games.

Rodéric · Nov 15, 2019

Is that demo D3D11? (I see it written in the youtube video) That could explain things regarding AMD GPU performance.

chris1515 · Nov 15, 2019

Rodéric said:
Is that demo D3D11? (I see it written in the youtube video) That could explain things regarding AMD GPU performance.

It is DirectX 11.

It will be interesting to compare the performance next year with the 5.7 version, when they add raytracing hardware support plus DirectX 12 and Vulkan support. It will probably still be possible to run the demo on GPUs without raytracing hardware with the new API support.

fellix · Nov 15, 2019

Could the low INT32 throughput in both GCN and Navi contribute to the performance deficit here, on top of the memory access issues?

JoeJ · Nov 15, 2019

fellix said:
low INT32 throughput in both GCN and Navi

Why low? It's full rate on both.

The question can only be: Are Turing's concurrent int / float units a significant advantage here?
But if we compare 5700XT vs 2070, which are usually on par in games, this also means comparing 10TF against 7.5TF, which probably compensates for this ALU advantage already.
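For reference, the 10TF vs. 7.5TF figures follow from the usual peak FP32 formula. The ALU counts and boost clocks below are the publicly listed reference specs as far as I recall them (RX 5700 XT: 2560 ALUs at 1.905 GHz, RTX 2070: 2304 ALUs at 1.62 GHz), so treat them as assumptions, not measurements.

```python
def peak_fp32_tflops(shader_alus, boost_clock_ghz):
    """Theoretical peak FP32 throughput in TFLOPS."""
    # One fused multiply-add per ALU per clock counts as 2 FLOPs.
    return shader_alus * 2 * boost_clock_ghz / 1000.0

print(peak_fp32_tflops(2560, 1.905))  # RX 5700 XT (assumed specs)
print(peak_fp32_tflops(2304, 1.62))   # RTX 2070 (assumed specs)
```

That comes out to about 9.75 TF against about 7.46 TF. These are peak rates only and say nothing about how much of that a memory-bound tracing shader can use.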

JoeJ · Nov 15, 2019

Rodéric said:
Is that demo D3D11? (I see it written in the youtube video) That could explain things regarding AMD GPU performance.

Possibly, assuming shaders wait a lot on memory, async compute could help AMD more than NV.
But i guess Turing, unlike older generations, has no more limitations here and this advantage is gone?

Ethatron · Nov 15, 2019

JoeJ said:
Possibly, assuming shaders wait a lot on memory, async compute could help AMD more than NV.
But i guess Turing, unlike older generations, has no more limitations here and this advantage is gone?

Async to what? Reflection is sandwiched between G-Buffer generation and tiled shading. Async to SSDO which is even more expensive than tiled shading?
