Next gen lighting technologies - voxelised, traced, and everything else spawn

DavidGraham · Nov 20, 2018

Voxilla said:
That is very much what I thought it would be, isn't it?

Continue reading further down:

We also had a bug that spawned rays off the leaves of vegetation, trees and the like. This compounded with the aforementioned bounding box stretching issue, where rays were trying to escape OUT while checking for self intersections of the tree and leaves. This caused a great performance dip. This has been fixed in the upcoming patch version and significantly improves performance.

This is what caused the fps dip in those vegetation stages, on top of the bounding box expansion.

JoeJ · Nov 20, 2018

DavidGraham said:
Honestly, and so do most of the RTX opposition so far, coming in from a position of false assumptions and rigid old convictions.

Haha,

'old convictions'... as if raytracing would be something new. I'm not against RT - i criticize with the aim of improvement, and you know this. My critique still holds: Less functionality == less black boxes, but the remainder is still black. Sorry for not being an RTX expert.

DavidGraham · Nov 20, 2018

JoeJ said:
i criticize with the aim of improvement, and you know this.

And my criticism toward your criticism (and others) is that it's based on incomplete knowledge about the underlying tech.

manux · Nov 20, 2018

This is a good read https://devblogs.nvidia.com/vulkan-raytracing/

I guess the acceleration structure and its traversal are black boxes. On the other hand, the user is in full control of which rays to shoot and how.

Who in their right mind would want a vendor-specific or, even worse, chip-specific API today? If the acceleration structure is to be opened up, the only reasonable way for that to happen is for ray tracing to mature first and a cross-vendor API to be created afterwards. To me it looks like ray tracing is not yet mature enough for that, and a lot of interesting stuff can be done with what is available.

GDC2019 could be where we get the first real indication of the wider sentiment among developers about ray tracing and its adoption.

Voxilla · Nov 20, 2018

JoeJ said:
Unfortunately i won't get my hands on RTX anytime soon. Likely i will even completely miss these first gen RTX cards, and this really sucks...

I'm sure, as you are a motivated Dev, Nvidia will kindly provide you one for free

JoeJ · Nov 20, 2018

DavidGraham said:
based on incomplete knowledge about the underlying tech.

Knowledge about black boxed tech is mostly incomplete, hihi

(sry, stopping wasting space here now, couldn't resist)

Voxilla · Nov 20, 2018

DavidGraham said:
Continue reading further down:

This one is also interesting:
DICE: Other quality and performance improvements in development include a hybrid rendering system that uses traditional screen-space reflections where the effect is accurate, only using ray tracing where the technique fails. This should boost performance and hopefully improve some of the pop-in issues RT reflections occasionally exhibit right now.

As a bonus, SSR will also give back the reflections of falling leaves, which the ray tracing is lacking.
(They are not missing because they would obscure those nice reflections, but because falling leaves are a real performance problem for realtime raytracing.)
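A minimal sketch of the hybrid idea DICE describes, not their implementation: march the reflection ray through the depth buffer first, and only hand it to the ray tracer when the screen-space march finds nothing. The buffer layout, the `march_screen_space` and `reflect_hybrid` names, and the `trace_ray` callback are all assumptions made up for illustration.

```python
def march_screen_space(origin, step, depth, max_steps=32, thickness=0.25):
    """March a ray through a depth buffer; return the hit pixel or None.

    origin and step are (x, y, z) in screen space, depth is a 2D list
    indexed as depth[y][x]. Hypothetical layout, for illustration only.
    """
    x, y, z = origin
    dx, dy, dz = step
    height, width = len(depth), len(depth[0])
    for _ in range(max_steps):
        x, y, z = x + dx, y + dy, z + dz
        if not (0.0 <= x < width and 0.0 <= y < height):
            return None  # ray left the screen: SSR has no data here
        scene_z = depth[int(y)][int(x)]
        if scene_z <= z <= scene_z + thickness:
            return int(x), int(y)  # ray passed just behind a visible surface
    return None


def reflect_hybrid(origin, step, depth, color, trace_ray):
    """Use the already shaded pixel if SSR hits, else fall back to ray tracing."""
    hit = march_screen_space(origin, step, depth)
    if hit is not None:
        px, py = hit
        return color[py][px], "ssr"  # shading is free: reuse the frame buffer
    return trace_ray(origin, step), "rt"  # expensive path: trace and shade the hit
```

The point of the split is that the `"ssr"` branch costs only a few buffer reads, while the `"rt"` branch pays for traversal plus shading of the hit point.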

Scott_Arm · Nov 20, 2018

The Microsoft documentation for ray-tracing:

https://github.com/Microsoft/DirectX-Graphics-Samples/tree/master/Samples/Desktop/D3D12Raytracing
http://forums.directxtech.com/index.php?topic=5985.0
http://forums.directxtech.com/index.php?action=dlattach;topic=5985.0;attach=4779 (RS5 docs, if you can log in)

DavidGraham · Nov 21, 2018

NVIDIA just released an article detailing its RTX implementation in a game called Justice, a Chinese MMO.

RTX is deployed for reflections and shadows together, reflections will be added to armor, weapons, objects, puddles, rivers, canals, and others. Shadows will be added for translucency, complex interactions and the increase of number of shadow casting lights. There is also real-time ray-traced caustics, as reflections can cast light and shadows as well.

There are video and screenshot comparisons:
https://www.nvidia.com/en-us/geforce/news/justice-online-geforce-rtx-ray-tracing-dlss/

OCASM · Nov 21, 2018

DavidGraham said:
NVIDIA just released an article detailing its RTX implementation in a game called Justice, a Chinese MMO.

RTX is deployed for reflections and shadows together, reflections will be added to armor, weapons, objects, puddles, rivers, canals, and others. Shadows will be added for translucency, complex interactions and the increase of number of shadow casting lights. There is also real-time ray-traced caustics, as reflections can cast light and shadows as well.

There are video and screenshot comparisons:
https://www.nvidia.com/en-us/geforce/news/justice-online-geforce-rtx-ray-tracing-dlss/

Caustics!

Jupiter · Nov 21, 2018

Jupiter said:
One question arises for me. Why does raytracing reduce the performance in the first place?[…]

I looked into it. Raytracing effects like reflections are expensive because shading the hit point is as complex as shading a screen pixel. The classic screen space methods reuse already calculated pixels from previous frames, where the shading is free. The transfer to the raytracing data structure via compute shader is also important. Generic shading computing power cannot be replaced by anything else. Better handling of divergence and resources benefits all areas, so from my point of view it is not completely true that one has to be sacrificed for the other.

JoeJ · Nov 21, 2018

Jupiter said:
I looked into it. Raytracing effects like reflections are expensive because shading the hit point is as complex as shading a screen pixel. The classic screen space methods reuse already calculated pixels from previous frames, where the shading is free. The transfer to the raytracing data structure via compute shader is also important. Generic shading computing power cannot be replaced by anything else. Better handling of divergence and resources benefits all areas, so from my point of view it is not completely true that one has to be sacrificed for the other.

I also assume this is the bottleneck with BFV. For example at each hitpoint they need to check all affecting shadow maps to calculate shading.
Curious how much they gain from fixing the bbox bug and improving foliage, but likely this is the reason they can only trace 20% of screen pixels, while early on we heard 'about 4-8 rays per pixel'. So, reflections are expensive because of shading requirements.
Worth mentioning again that texture space shading could eliminate this completely, if we sacrifice 'reflections of reflections'.

We will not see this soon, but i'm optimistic on the long run, even for future mid range GPUs...
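To put the two figures side by side, here is a back-of-the-envelope ray count. The 1920x1080 resolution and the reading of "20% of screen pixels" as one ray per traced pixel are assumptions for illustration, not numbers from DICE.

```python
def rays_per_frame(width: int, height: int, rays_per_pixel: float) -> int:
    """Total rays per frame for a given average ray count per screen pixel."""
    return round(width * height * rays_per_pixel)


WIDTH, HEIGHT = 1920, 1080  # assumed resolution, not stated in the thread

traced_now = rays_per_frame(WIDTH, HEIGHT, 0.2)  # 20% of pixels, one ray each
early_low = rays_per_frame(WIDTH, HEIGHT, 4)     # early '4 rays per pixel' figure
early_high = rays_per_frame(WIDTH, HEIGHT, 8)    # early '8 rays per pixel' figure

print(traced_now, early_low, early_high)
```

Under these assumptions the shipped budget is 20 to 40 times smaller than the early figure, which fits the idea that shading each hit, not tracing, sets the limit.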

Scott_Arm · Nov 21, 2018

Jupiter said:
I looked into it. Raytracing effects like reflections are expensive because shading the hit point is as complex as shading a screen pixel. The classic screen space methods reuse already calculated pixels from previous frames, where the shading is free. The transfer to the raytracing data structure via compute shader is also important. Generic shading computing power cannot be replaced by anything else. Better handling of divergence and resources benefits all areas, so from my point of view it is not completely true that one has to be sacrificed for the other.

Right now the Nvidia drivers don't allow you to run compute shaders and ray-tracing shaders in parallel, so I imagine that's the barrier to their planned implementation of doing reflections in screen space, but spawning ray-traced shaders in the cases where SSR would fail because the ray is reflected off screen.

Terminology is going to get weird distinguishing between the rays cast for SSR via compute shaders vs rays cast from DXR or RTX.

JoeJ · Nov 22, 2018

OCASM said:
2) Couldn't you compute low res cubemaps as well with RT?

I'm still thinking about replacing my compute raytracing with RTX, and after watching the Remedy video from the other thread, i did a rough rule-of-thumb comparison given their performance numbers.
The result is: I get about the same RT performance on a Fury X as they mention for Turing. The comparison is not fair. My geometry is simpler (surfel hierarchy, smallest surfels 10cm), but my diffuse rays also have infinite length - not just a short range. Scene size and coarse complexity are similar.
(I'd still need to run half of my stuff beside RTX, so using RTX here would indeed cause a slow down. Reflections remain my first application for it.)

I'm saying this to substantiate my critique, so you understand why i say we no longer need new fixed function stuff: just improve compute and let us implement what we need in the best way possible.
Of course you'll still doubt all this, but you should consider you might be wrong yourself.

... directed to all who think the purpose requires fixed function and justifies black boxes! (not to you personally)

That said (again... won't stop until i have my work generation shaders), the video has evidence shading is indeed the bottleneck for reflections.
They say they need 1ms for tracing, and 7ms for shading the hit. Exactly the 8ms we see in BFV.
That's awesome! 1ms is better than i've expected. Photorealism is near, guys, it's coming...

(At least a huge step towards)
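The split quoted from the Remedy video works out as follows. The 1 ms and 7 ms figures are the ones cited above; the 60 fps frame budget is an assumption added only to give the 8 ms some context.

```python
TRACE_MS = 1.0  # time spent tracing rays, as quoted from the video
SHADE_MS = 7.0  # time spent shading the hit points, as quoted from the video

total_ms = TRACE_MS + SHADE_MS        # cost of the whole reflection pass
shading_share = SHADE_MS / total_ms   # fraction of the pass spent on shading

FRAME_BUDGET_MS = 1000.0 / 60.0       # assumed 60 fps target
reflection_share_of_frame = total_ms / FRAME_BUDGET_MS

print(f"shading is {shading_share:.1%} of the reflection pass")
print(f"reflections take {reflection_share_of_frame:.0%} of a 60 fps frame")
```

Even an infinitely fast tracer would only remove the 1 ms part, so any further gain has to come from cheaper shading.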

Shifty Geezer · Nov 22, 2018

How efficient is the triangle intersection in compute? Could some large, low-latency eDRAM/SRAM improve tests or would they still be too broad to fit any size cache and you'll always be limited by bus transfers? From looking at your compute performance, what are the bottlenecks and could they realistically be better addressed through fixed hardware rather than compute tweaks?

DavidGraham · Nov 22, 2018

So about the imminent release of RTX shadows in Tomb Raider: having played through the game on max settings, most shadows are very low resolution by today's standards, many small details or geometry don't receive/cast shadows as a result, and shadows flicker a lot! Most small and point lights don't cast shadows, dynamic lights (fire effects, camp fire, explosions) don't either, and flashlights don't cast shadows on every object. And of course contact hardening/percentage closer filtering is completely absent.

What we need ray traced shadows to do is this:

-Add percentage closer filtering/contact hardening to shadows (guaranteed)
-Increase shadow resolution and force small geometry to cast shadows (guaranteed)
-Fix shadows flickering (not 100% guaranteed)
-Make every point or small light cast shadows (announced but not guaranteed on every light)
-Force every dynamic light (fire effects, explosions, flares, gun muzzle flashes) to cast shadows (not 100% guaranteed)
-Fix the flash light shadows to include every 3d object (not 100% guaranteed?)

I think if all of these points are covered, we should have a pretty solid shadow system in Tomb Raider. As for the overall visual impact, it wouldn't completely transform the look of the game: it definitely would for certain scenes, but not across all of them, as the impact of shadows varies from scene to scene.
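For readers unfamiliar with contact hardening: the penumbra gets wider the further the receiver is from the occluder, and shrinks to a hard edge where the two touch. A small sketch of the usual similar-triangles estimate (the one used in percentage-closer soft shadows) is below. It says nothing about how Tomb Raider implements its shadows, and the function name and parameters are made up for illustration.

```python
def penumbra_width(light_size: float, blocker_distance: float,
                   receiver_distance: float) -> float:
    """Estimate penumbra width from similar triangles.

    Distances are measured from the light. At the contact point
    (receiver_distance == blocker_distance) the width is zero, which is
    the 'hardening' part of contact hardening.
    """
    if blocker_distance <= 0.0:
        raise ValueError("blocker_distance must be positive")
    return light_size * (receiver_distance - blocker_distance) / blocker_distance


# Shadow of an occluder 2 units from a light of size 1:
print(penumbra_width(1.0, 2.0, 2.0))  # receiver touches the occluder
print(penumbra_width(1.0, 2.0, 4.0))  # receiver twice as far from the light
```

With ray traced shadows the same behaviour falls out of sampling the area light directly, instead of being approximated by a filter on a shadow map.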

JoeJ · Nov 22, 2018

Shifty Geezer said:
How efficient is the triangle intersection in compute? Could some large, low-latency eDRAM/SRAM improve tests or would they still be too broad to fit any size cache and you'll always be limited by bus transfers? From looking at your compute performance, what are the bottlenecks and could they realistically be better addressed through fixed hardware rather than compute tweaks?

I don't know about triangles because i use discs instead, and bounding spheres instead of boxes. (Mainly to make things smaller to fit into LDS. A ray - box test would be very fast with compute; ray - triangle is quite a bunch of instructions and i would assume it benefits from tailored instructions, but personally i don't need them for GI.)
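For reference, the two tests mentioned here are short. This is a generic textbook sketch of a ray versus bounding sphere test and a ray versus disc test, not JoeJ's code. The function names are made up, and `direction` is assumed to be normalized.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def ray_hits_sphere(origin, direction, center, radius):
    """Boolean bounding-sphere test; direction must be unit length."""
    oc = sub(center, origin)
    t_closest = dot(oc, direction)                 # ray parameter of closest approach
    dist_sq = dot(oc, oc) - t_closest * t_closest  # squared distance ray line to center
    if dist_sq > radius * radius:
        return False
    # Reject spheres fully behind the origin, unless the origin is inside.
    return t_closest >= 0.0 or dot(oc, oc) <= radius * radius


def ray_disc_distance(origin, direction, center, normal, radius):
    """Return the hit distance t along the ray, or None if the disc is missed."""
    denom = dot(direction, normal)
    if abs(denom) < 1e-8:
        return None  # ray runs parallel to the disc plane
    t = dot(sub(center, origin), normal) / denom
    if t < 0.0:
        return None  # plane lies behind the ray origin
    hit = tuple(o + t * d for o, d in zip(origin, direction))
    offset = sub(hit, center)
    return t if dot(offset, offset) <= radius * radius else None
```

Both tests need only a handful of dot products, and a sphere or disc is stored as a center plus radius (plus a normal for the disc), which is compact compared to a box or a triangle.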

I'm not sure about RAM, because at the moment i can not see how bandwidth limits affect me. I made the implementation in Vulkan and OpenCL. VK had no profiling tools at the time, and CodeXL for CL does not help here. I assume i'll see bandwidth limits when implementing 4*4 or 8*8 environment maps, but i have not ported this to GPU yet. My environment maps would not fit there anyway (200-500 MB or more? Not thought about compression...). But likely i would benefit from such RAM for various worklists. Writing them takes half of the time of my tree traversals.

About performance, i see a speedup of 2 for VK vs. CL, resulting from prerecorded command buffers and indirect dispatches available with VK. But if i reduce the workload to updating just 10% (enough for a dynamic scene), i see only a speedup of 2, not 10. Also, on CPU the raytracing takes 90% of the time (expected), but on GPU it takes only 30-40% (unexpected). Tiny workloads, mostly about work generation, eat up performance here, and most likely the cause is zero work dispatches and unnecessary barriers in the command buffers. This is why i want to generate the work and barriers directly on GPU. (RTX likely has fine grained scheduling under the hood which is already beyond my needs but not exposed. I don't know what AMD already has.) Async compute will also help, but it does not allow for fine grained small workloads - sync across queues quickly kills the benefit. So i plan to do async environment map or rendering work...
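The "speedup of 2, not 10" observation can be turned into a rough overhead estimate. The sketch below assumes a simple linear model (frame time = fixed overhead + time proportional to the workload), which is an illustration and not something measured in the post.

```python
def fixed_overhead_fraction(workload_fraction: float, observed_speedup: float) -> float:
    """Share of the full-workload time that does not scale with the workload.

    Assumed model: T(f) = overhead + f * work, normalized so that T(1) = 1.
    From T(1) / T(f) = speedup it follows that
    overhead = (1 / speedup - f) / (1 - f).
    """
    if not 0.0 <= workload_fraction < 1.0:
        raise ValueError("workload_fraction must be in [0, 1)")
    return (1.0 / observed_speedup - workload_fraction) / (1.0 - workload_fraction)


# Updating 10% of the scene gave a speedup of only 2:
print(f"{fixed_overhead_fraction(0.1, 2.0):.1%} of the full update is fixed cost")
```

Under this model roughly 44% of the full-update time is spent on things that do not shrink with the workload, which is consistent with the suspicion about zero work dispatches and unnecessary barriers.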

Of course my stuff would run faster if it would be fixed function, but it is very complex, so only the bottlenecks would make sense. Unsurprisingly that's traversal and tracing - but both are totally different from classical raytracing, and other algorithms would not benefit. No hard shadows or sharp reflections.
So no, even if successful i would not want fixed function. Maybe some kind of ASICs for the future... would make more sense i guess.

My main problem is not performance at all - it's good even on old GCNs. The problem is making automated tools for seamless global parametrization - that's very hard and in research for a decade (== Quadrangulation, often used for finite element analysis). I'm on par with current state of the art, but for games we want very large quads (or texels) representing the top levels of a LOD hierarchy (or mip maps, if you want). So that's the remaining open problem i have to solve before i can start to work on a renderer supporting game models. I hope i can do this within next year... last graphics work was a GL ES1.1 mobile game... so some stuff to learn about rendering since then

JoeJ · Nov 22, 2018

DavidGraham said:
What we need ray traced shadows to do is this:

I have similar thoughts here as for reflections:
With texture space shading and stochastic updates RT can become even cheaper than shadow maps. (With shadow maps you'd still need to render all of them per frame, even if you want only stochastic updates. Only RT can take the full benefit here.)

But after all that praise i have to mention the drawbacks too:
The surface we need to shade increases by a factor of... about 8? So even if we update only 10% there is no win, just higher memory requirements.
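The arithmetic behind "no win" is short. The factor of 8 and the 10% update rate are the rough guesses from the post; the function name is made up for illustration.

```python
def relative_shading_cost(surface_factor: float, update_fraction: float) -> float:
    """Texels shaded per frame in texture space, relative to shading every screen pixel once."""
    return surface_factor * update_fraction


SURFACE_FACTOR = 8.0   # rough guess from the post: ~8x more surface than screen pixels
UPDATE_FRACTION = 0.1  # stochastic update of 10% of the texels per frame

cost = relative_shading_cost(SURFACE_FACTOR, UPDATE_FRACTION)
print(f"shading work per frame: {cost:.1f}x screen space")
print(f"shaded texel storage:   {SURFACE_FACTOR:.0f}x screen space")
```

So the per-frame shading work ends up close to the screen space cost, while the storage for shaded texels grows by the full surface factor.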

The only solution here is to reduce shading resolution, and so visible texture resolution as well.
Personally i think computer graphics are too sharp anyway and that's no big problem, but some people want 16K textures and flickering, pixel-crawling high frequency details that hurt your eyes and that a real camera can never produce.

I remember the outcry about Quantum Break. To convince such people, we need a real big leap (and a sharpening filter).

So the solution i propose is far from certain to be good, and likely nobody will try it soon; people will keep working around the problem with other optimizations.

Voxilla · Nov 23, 2018

JoeJ said:
The only solution here is to reduce shading resolution, and so visible texture resolution as well.
Personally i think computer graphics are too sharp anyway and that's no big problem, but some people want 16K textures and flickering, pixel-crawling high frequency details that hurt your eyes and that a real camera can never produce.

That's why rasterization makes use of mip-mapping and bi/tri/anisotropic filtering, right?
In texture space you can still do this to reduce cached texture memory requirements in the distance.
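As a rough illustration of how mip selection cuts the texel count with distance: pick the level from the pixel footprint, and each level holds a quarter of the texels of the one before. This is the generic isotropic rule, not any particular engine's or GPU's implementation. `texels_per_pixel` here means the footprint edge length in base-level texels, and both function names are made up.

```python
import math


def mip_level(texels_per_pixel: float, num_levels: int) -> float:
    """Isotropic mip level for a pixel footprint measured in base-level texels."""
    if texels_per_pixel <= 1.0:
        return 0.0  # magnification: stay on the base level
    return min(math.log2(texels_per_pixel), float(num_levels - 1))


def texels_at_level(base_size: int, level: int) -> int:
    """Texel count of a square texture at the given integer mip level."""
    size = max(1, base_size >> level)
    return size * size


# A 4096x4096 texture has 13 levels (4096 down to 1).
level = int(mip_level(4.0, 13))  # footprint covers 4 base texels per axis
print(level, texels_at_level(4096, 0) // texels_at_level(4096, level))
```

A surface whose pixels each cover 4 base texels per axis only needs level 2 resident, which is 16 times fewer texels to shade or cache than the base level.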

eloic · Nov 23, 2018

JoeJ said:
I'm still thinking about replacing my compute raytracing with RTX, and after watching the Remedy video from the other thread, i did a rough rule-of-thumb comparison given their performance numbers.
The result is: I get about the same RT performance on a Fury X as they mention for Turing. The comparison is not fair. My geometry is simpler (surfel hierarchy, smallest surfels 10cm), but my diffuse rays also have infinite length - not just a short range. Scene size and coarse complexity are similar.
(I'd still need to run half of my stuff beside RTX, so using RTX here would indeed cause a slow down. Reflections remain my first application for it.)

I'm saying this to substantiate my critique, so you understand why i say we no longer need new fixed function stuff: just improve compute and let us implement what we need in the best way possible.
Of course you'll still doubt all this, but you should consider you might be wrong yourself.

... directed to all who think the purpose requires fixed function and justifies black boxes! (not to you personally)

That said (again... won't stop until i have my work generation shaders ), the video has evidence shading is indeed the bottleneck for reflections.
They say they need 1ms for tracing, and 7ms for shading the hit. Exactly the 8ms we see in BFV.
That's awesome! 1ms is better than i've expected. Photorealism is near, guys, it's coming... (At least a huge step towards)

I'm glad to read your enthusiasm. Good luck with your work and please don't forget to share your findings!
