Next gen lighting technologies - voxelised, traced, and everything else *spawn*

Q: Why is NV boycotting its own invention, namely GPGPU? They have new options to generate work on GPUs for raytracing and procedural mesh generation. Why do they not expose this to general-purpose compute? Why lock up a fundamental building block like tree traversal?
A: Because they do not want us to be software developers or inventors. They want us to become their USERS, limited and tied to their products.
Because CUDA (and the half-assed OpenCL) development on one side and OpenGL/Vulkan/DirectX on the other are handled by completely disjoint departments at NVidia.

They have completely different policies about which features to support on which hardware, where and when to apply artificial throttles, and which hardware functions they may access or expose publicly.

Not only are the departments disjoint, their knowledge is too. You can safely assume that the CUDA team does not know how the rasterizer-related hardware (scheduling) extensions work, or how scheduling for graphics works at all. But they did get to expose an API for the tensor cores, while the graphics team didn't get permission to expose them.

If I were to exaggerate, I would say you have data scientists with their exclusive toys and marketing in one department, and proper engineers in the other, while the latter try to stick to standardized APIs where possible.
 

I agree, but I'm talking about the question 'why didn't the graphics team get permission to expose it?', if you like.
I feel a bit like a sheep, fed new fixed-function features that I do not really need in order to solve open graphics problems, but which keep me busy implementing and utilizing them. I feel expected to present my awesome idea of how to denoise shadows from two light sources at once at next year's GDC... wow!

To solve open problems, it makes no difference whether you are a graphics developer or a quantum physicist. You need general-purpose functionality so you are not restricted in your choice or invention of algorithms. Because the problems are open, the solution is unknown. If fixed-function hardware could solve it, the problem wouldn't be open anymore. Fixed function only gives speedups of an order of magnitude - that's not enough. (Example: voxel reflections take 2ms - BFV needs 7ms! Do they look so much better?)
 
I disagree with this being creative or cool. This is total bullshit, and my main reason for criticizing RTX.

Remember when people started to use pixel shaders for other purposes? What horrible, inefficient workarounds they had to use just to utilize GPU power?

It is already happening again... this is why some NV folks say 'raytracing is the new compute' - pah! We are back in the stone age of GPUs, and of course NV likes to see that, because they took the lead in sending us back in time, and they are the only ones who profit from it.


All this guy is doing is looking for workarounds to utilize the traversal hardware. Don't you think he could build better algorithms if he had direct access to this feature, beyond the RT restrictions and black boxes?


Q: Why is NV boycotting its own invention, namely GPGPU? They have new options to generate work on GPUs for raytracing and procedural mesh generation. Why do they not expose this to general-purpose compute? Why lock up a fundamental building block like tree traversal?
A: Because they do not want us to be software developers or inventors. They want us to become their USERS, limited and tied to their products.
Or maybe because it would be much slower, to the point of having to wait many years to achieve the results they can get now with hardware acceleration.

It'll then be up to Intel and AMD to come up with general-purpose hardware that's powerful enough to do better.
Larrabee...

 
Or maybe because it would be much slower, to the point of having to wait many years to achieve the results they can get now with hardware acceleration.

Yes... maybe. But we totally don't know if there would be a difference at all.

But what is it that impresses you? Is it the dancing robot demo (which somebody here mentioned runs on multiple GPUs, IIRC? And it's still a small scene, too!), or is it the BFV reflections, which are still not affordable for the average consumer?
Or is it just that the promise of realtime path tracing has come a bit closer again?

Seriously, voxel reflections have not been utilized in games (with some exceptions), but apart from extremely sharp reflections they can compete in quality and are likely faster, if you're willing to spend effort on solving the slow voxelization problem.
The diffuse interreflection seen in Metro (and another game whose name I can't remember) seems limited: static, some objects excluded, approximated, single bounce, low spatial resolution. I do not know how it works, but it's not better than Crytek's recent voxel GI. (I do not like voxels, but I need an example.)
So maybe you will STILL be waiting many years from now?

The problem then is: during those years of focus on RTX, the alternatives might fall by the wayside. And much worse, public research on improving RT performance is stalled from now on. Only a fool would research raytracing performance now that RTX is here.
Personally, the most impressive RT stuff I have seen is still the noisy Brigade videos. The V-Ray guys' RTX demo does not come close to that. Nothing shown on RTX comes close. (Not sure why.)


And now another hypothetical scenario - what if Intel had decided: yes, the time is right - release Larrabee to consumers!
You would still wait many years, but this hardware has no black boxes or restrictions. You can do anything with it, not just raytracing. It is so open, you do not even need an API.
What could happen? We do not know, but we have learned this much: exposing GPU power to general-purpose programming brought us practical deep learning. Unexpected, but it happened.

I know the Larrabee example is a bad one, because many-core CPUs are often not efficient in comparison to GPUs. What we need is a GPU architecture as open to programming as CPUs are.
We do not need fixed function, we need the flexibility to implement things ourselves. Only then can we make ideal progress during those years you'll still have to wait. In the short term RTX gives new options and a push, but in the long term it only adds restrictions.


Sigh... I sound like a priest and I repeat myself - sorry :) But I'd still like to hear what impresses you and why you are excited... Maybe I need to change my mind a bit too, so share some optimism... :)
 
What interests me is seeing whether games take similar paths to the ones movies took. Maybe production cost also becomes an even bigger part of the equation. Maybe lowering production cost can create better graphics some time in the future? True day/night cycles in car games/open-world games, yes please.
 
The diffuse interreflection seen in Metro (and another game whose name I can't remember) seems limited: static, some objects excluded, approximated, single bounce, low spatial resolution. I do not know how it works, but it's not better than Crytek's recent voxel GI. (I do not like voxels, but I need an example.)
You need to read some of the other threads discussing RT, particularly the "relevance of Turing for consoles" and/or "predict next gen hardware" threads.

I was arguing in favour of voxelised GI, but eventually we found recent demos showing it wasn't scaling well, so the early, simple demos that ran at realtime framerates a few years ago weren't resulting in realtime framerates in games. RT lighting could well end up faster.
 
Seeing ray tracing, even if in limited form, in an actual game at 1080p 60fps is pretty impressive. In terms of effects, we can finally ditch shadow maps and get accurate shadows, especially area shadows, which are pretty much non-existent in videogames and are massively important for realism. PBR shading is too good for the crappy (in comparison) dynamic lighting techniques used in videogames. In order for lighting to look good nowadays you need lightmaps, which means static environments. With ray tracing that's no longer the case.

Also, the death of awful screen space effects like SSR and SSAO.

Voxel cone tracing is nice, but in order to get the same resolution you see right now from ray tracing, it seems to me you would need massive amounts of memory. Also, its max resolution is fixed. With ray tracing you can zoom in as much as you want and not lose any quality. Crytek's implementation shows this; it's too low-detail (though better than nothing, for sure).
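
To put a rough number on the memory point (back-of-the-envelope only, assuming a dense grid with one RGBA8 value per voxel and a full 3D mip chain for cone tracing - real implementations use sparse octrees or clipmaps, so treat this as an upper bound):

```python
# Back-of-the-envelope memory for a dense voxel grid (illustrative assumption:
# one RGBA8 value per voxel, plus a 3D mip chain which adds roughly 1/7 on top).
def dense_voxel_grid_gib(resolution, bytes_per_voxel=4, with_mips=True):
    base = resolution ** 3 * bytes_per_voxel
    total = base * 8 / 7 if with_mips else base
    return total / 2 ** 30

for res in (256, 512, 1024, 2048):
    print(f"{res}^3 grid: {dense_voxel_grid_gib(res):.2f} GiB")
```

Sparse structures cut this down a lot, of course, but the cubic growth is why the effective voxel resolution stays so coarse compared to per-pixel rays.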

In terms of fully programmable hardware, that's nice but it costs you a lot of speed. Is the trade-off worth it at this point? I don't think so.
 
What interests me is seeing whether games take similar paths to the ones movies took. Maybe production cost also becomes an even bigger part of the equation. Maybe lowering production cost can create better graphics some time in the future? True day/night cycles in car games/open-world games, yes please.
I assume exploding production costs are the main problem for games, so any new tech comes with the promise of lowering them.
But I do not think RT can keep this promise (assuming NV would not pay for / assist with the implementation):
* Need to maintain multiple code paths with/without RT for many years.
* Need to learn new stuff like denoising and experiment with solving new problems.
* And finally, those few engine programmers do not really affect the costs at all, and the benefit for artists is tiny.
Day and night change so slowly that we could update baked lighting in the background without needing new hardware.

I was arguing in favour of voxelised GI, but eventually we found recent demos showing it wasn't scaling well, so the early, simple demos that ran at realtime framerates a few years ago weren't resulting in realtime framerates in games.
I see voxels as a failure in general, not only for lighting. But 2ms for reflections is true, and an existing game on the 'weak' PS4 is true as well. And there are better approaches than that.
But actually the resulting FPS with RTX is just as bad if not worse, and the VCT idea is 10 years old and still considered impractical.

PBR shading is too good for the crappy (in comparison) dynamic lighting techniques used in videogames.
PBS works well in games, but it requires environment data, and denoising can't reconstruct that. So either you lose PBS, as seen in the denoising papers, or you use RTX to update sparse environment probes in the background.
Because I can compute low-res environments at dense locations faster with compute than with RTX, I'm personally not impressed. So I reduce my interest to sharp RTX reflections, which are not that important but already cost twice as much as my own GI stuff.

Also, the death of awful screen space effects like SSR and SSAO.
Agree - though it also comes with an FPS drop.

Voxel cone tracing is nice, but in order to get the same resolution you see right now from ray tracing, it seems to me you would need massive amounts of memory. Also, its max resolution is fixed. With ray tracing you can zoom in as much as you want and not lose any quality. Crytek's implementation shows this; it's too low-detail (though better than nothing, for sure).
Agree. And I'd add that my approach also has a fixed max resolution (about 10cm on the current console gen), which is why I want RT and path tracing in the long run even if I succeed.



In terms of fully programmable hardware, that's nice but it costs you a lot of speed. Is the trade-off worth it at this point? I don't think so.

I don't know either, but I do know introducing compute had ZERO negative performance effect on pixel shaders, although they run on the same shader cores.
There is no reason to think it would be different with RT cores, and even less reason to think so if we talk about work generation from the GPU, which is definitely there now but not exposed.

Conclusion: the black box has political reasons, not technical ones. So being critical is what we should do, even if we are excited.
 
In terms of fully programmable hardware, that's nice but it costs you a lot of speed. Is the trade-off worth it at this point? I don't think so.
You mention that as an absolute, but there are degrees of programmability and overhead. Replacing the hardware with CPUs would crash performance, but tweaking a memory intersect tester to support alternative memory structures might, for argument's sake, increase total GPU silicon by 1% without affecting the performance of the tests while increasing total throughput notably due to optimisations.

When people talk about programmable hardware, we aren't wanting it all replaced with CPUs, but with well-balanced processors that are flexible in their specific area where that makes sense.
 
I see voxels as a failure in general, not only for lighting. But 2ms for reflections is true, and an existing game on the 'weak' PS4 is true as well. And there are better approaches than that.
But actually the resulting FPS with RTX is just as bad if not worse, and the VCT idea is 10 years old and still considered impractical.

I would not write off voxels in general, as, for example, medical scanners produce 'assets' in voxel format. And these need to be visualized, as can be seen here (my own work, BTW).
 
Does anyone know if the traced effects run at native res? Dropping to half res won't be at all apparent and would save a lot of time.

This reminds me of an interesting benchmark I've seen on a German site here: http://www.pcgameshardware.de

Strangely, the results have since been removed from there, but I did save a pic for discussion with another dev:

[Attachment: bench.JPG]

But the interesting part was written only in the text. According to it, the 20X0 models raytrace at 1440p and use DLSS to upscale, while the 10X0 raytraces at native 4K (see the 'native resolution' comment in the picture).
Considering this, the 2070 is not so much faster than the 1080ti:

2070: 19.8 fps at 2560x1440 (about 3.7 MP per frame) = roughly 73 MP traced per second
1080ti: 10.1 fps at 3840x2160 (about 8.3 MP per frame) = roughly 84 MP traced per second

(I'm shocked - I did not do the math before - plain FPS is such a terrible performance measurement! Or did I get something wrong with the math?)
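
If anyone wants to check me, here is the whole calculation. The fps numbers are from the saved chart; the internal resolutions are the ones the article text claims, so treat those as assumptions:

```python
# Traced-pixel throughput from the quoted benchmark: internal resolution * fps.
# fps values come from the saved chart; the internal resolutions (1440p for the
# 2070 via DLSS, native 4K for the 1080 Ti) are what the article text claims.
cards = {
    "RTX 2070 (1440p internal + DLSS)": (2560 * 1440, 19.8),
    "GTX 1080 Ti (native 4K)":          (3840 * 2160, 10.1),
}

for name, (pixels, fps) in cards.items():
    print(f"{name}: {pixels * fps / 1e6:.0f} MPix/s traced")
```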

However, there must be something really wrong here. Maybe that's why they removed the results, and secondly, I do not expect great optimization from early work with UE4.
 
And these need to be visualized, as can be seen here (my own work, BTW).

Impressive! (And it's good to know there are some devs here :)

I think voxels are useful in games for highly diffuse geometry like vegetation and for many other special cases. But regarding UD or Atomontage, claiming they will replace triangles in general smells more like luring investors than a reasonable option. So I mean just this, plus my personal failure with voxels for GI.
 
1) Well, I wouldn't draw too many conclusions from the first implementation in a commercial game (as an afterthought). A game designed for Turing might not incur such performance pitfalls.

2) Couldn't you compute low res cubemaps as well with RT? With the benefit that you could do it stochastically and maybe amortize the cost over several frames.

3) It gives games a more cinematic look :LOL:

4) I guess it could still be good enough for games that don't require ultra-realism.

5) I hope you're right but since Turing is all we have right now it's hard to tell.

You mention that as an absolute, but there are degrees of programmability and overhead. Replacing the hardware with CPUs would crash performance, but tweaking a memory intersect tester to support alternative memory structures might, for argument's sake, increase total GPU silicon by 1% without affecting the performance of the tests while increasing total throughput notably due to optimisations.

When people talk about programmable hardware, we aren't wanting it all replaced with CPUs, but with well-balanced processors that are flexible in their specific area where that makes sense.
Maybe but until I see some hardware like that I'm not getting my hopes up.
 
So many optimizations still to be made, not to mention the bugs that need fixing.

such as the bounding boxes expanding insanely far due to some feature implemented for the rasteriser that didn't play well with ray tracing. We only noticed this when it was too late. Basically, whenever an object has a feature for turning certain parts on and off, the turned-off parts would be skinned by our compute shader skinning system for ray tracing exactly like the vertex shader would do for the rasteriser. (Remember, we have shader graphs and we convert every single vertex shader automatically to compute and every pixel shader to a hit shader; if the pixel shader has alpha testing, we also make an any-hit shader that can call IgnoreHit() instead of the clip() instruction that alpha testing would use.) The same problem also happens with destructible objects, because that system collapses vertices too.

This bug has been fixed and will be shipping soon, and we can expect every game level and map to see large, significant performance improvements.

We also had a bug that spawned rays off the leaves of vegetation, trees and the like. This compounded with the aforementioned bounding box stretching issue, where rays were trying to escape OUT while checking for self-intersections of the tree and leaves. This caused a great performance dip. This has been fixed and significantly improves performance.


Why do you claim to know more than DICE? They write:
"Status: Currently investigating"
This means we don't know the cause yet.
I hope you are convinced now.
 

Best read so far about RTX - it tells me more than reading the API docs :)
Most interesting is the section about ray binning and how much they care about it. This implies the hardware probably does not implement it under the hood, as I had assumed. If so, this means two things: other vendors can catch up more easily, and the implementation is not as black-boxed as I thought.
Not sure, however.
Without development experience myself, I'm making a lot of assumptions here. Many things may be just wrong!
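
For anyone who hasn't run into the term: ray binning just means grouping rays that will likely touch similar parts of the scene (for example by direction octant plus a coarse origin cell) and tracing each group together, so memory accesses stay coherent. A minimal CPU-side sketch of the general idea - not NVIDIA's or the article's actual scheme, and the bin key here is an arbitrary choice of mine:

```python
# Minimal ray-binning sketch: bucket rays by direction octant plus a coarse
# origin cell, so similar rays can be traced together as a coherent batch.
from collections import defaultdict

def bin_key(origin, direction, cell_size=4.0):
    octant = tuple(d < 0.0 for d in direction)            # 8 direction octants
    cell = tuple(int(c // cell_size) for c in origin)     # coarse spatial cell
    return octant, cell

def bin_rays(rays):
    bins = defaultdict(list)
    for origin, direction in rays:
        bins[bin_key(origin, direction)].append((origin, direction))
    return bins

rays = [((0.5, 1.0, 2.0), (1.0, 0.2, 0.1)),
        ((0.6, 1.1, 2.2), (0.9, 0.3, 0.2)),    # lands in the same bin as above
        ((9.0, 9.0, 9.0), (-1.0, -0.5, 0.0))]
for key, group in bin_rays(rays).items():
    print(key, "->", len(group), "ray(s)")
```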

2) Couldn't you compute low res cubemaps as well with RT? With the benefit that you could do it stochastically and maybe amortize the cost over several frames.

That's exactly what I'm doing. I do RT, but I cannot expect improvements from HW RT tied to triangles, bounding-box BVHs and isolated rays. I could implement my main ideas using RTX, but it would likely be too slow and a waste of resources. It's better to utilize it for the missing high-frequency stuff I can't deliver. This way RTX also stays optional, and I expect the next console gen will not have it anyway.
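
To make the 'update sparse probes in the background' idea concrete, this is roughly the scheduling pattern I mean: a fixed ray budget per frame, a few probes picked round-robin, random directions, and the result blended into the cached value so the cost spreads over many frames. Names like trace_radiance are placeholders for whatever actually shoots the ray (compute or RTX), so this is only a sketch:

```python
# Amortized, stochastic environment-probe updates: trace only a small ray
# budget each frame and blend the new estimate into the cached irradiance.
# trace_radiance() is a placeholder for the real tracer (compute or RTX).
import random

def random_direction():
    v = [random.gauss(0.0, 1.0) for _ in range(3)]
    n = sum(c * c for c in v) ** 0.5 or 1.0
    return tuple(c / n for c in v)

def trace_radiance(position, direction):
    return 1.0   # stand-in: plug in the actual ray/cone tracer here

def update_probes(probes, frame, probes_per_frame=4, rays_per_probe=64, blend=0.1):
    for i in range(probes_per_frame):
        probe = probes[(frame * probes_per_frame + i) % len(probes)]
        estimate = sum(trace_radiance(probe["position"], random_direction())
                       for _ in range(rays_per_probe)) / rays_per_probe
        # exponential moving average hides the noise of the tiny sample count
        probe["irradiance"] = (1.0 - blend) * probe["irradiance"] + blend * estimate

probes = [{"position": (float(x), 0.0, 0.0), "irradiance": 0.0} for x in range(16)]
for frame in range(60):    # 16 probes, 4 per frame -> each refreshed every 4 frames
    update_probes(probes, frame)
```

In a real renderer the cached value would be directional (SH or a tiny cube map) rather than a single float, but the amortization pattern is the same.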


Unfortunately I won't get my hands on RTX anytime soon. Likely I will even completely miss the first-gen RTX cards, and that really sucks... :(
 
I hope you are convinced now.

DICE: Another problem we are currently having in the launch build is with alpha-tested geometry like vegetation. If you turn off every single alpha-tested object, suddenly ray tracing is blazingly fast when it only has to handle opaque surfaces.

That is very much what I thought it would be, isn't it?
I'm not too sure the slow raytracing in the presence of foliage is a bug.
See: Effectively Integrating RTX Ray Tracing into a Real-Time Rendering Engine
"Note that the dependent memory access issue typical to all hit shaders can be especially pronounced in the alpha test shader as it is so trivial, and the compiler doesn’t have many opportunities for latency hiding."
 