Polygons, voxels, SDFs... what will our geometry be made of in the future?

Part of the problem with mesh shaders, I suppose, is the lack of direct support on PS5, which kind of kills them for cross-platform games this entire generation, unless two implementations are done and one of them, in software like Nanite's, exists just to run on PS5.
It's not that they implemented Nanite's compute rasterizer because the PS5 lacks Mesh Shaders; they probably started working on it before there was a Mesh Shader API at all. The PS5 Primitive Shader and Mesh Shader APIs are so close that supporting both won't be a problem.
 
Yeah - see those seams. Also this cache is noisy and bad quality:
[attached image: surface cache debug view]
They must blur this like crazy for the final shading:
[attached image]
So this is where my expected AO-like detail gets lost; but given the noise, it never really existed in the first place. So the advantage of the surface cache over volume probes remains theoretical for them.

What I don't understand is the use of cards instead of a hemispherical basis function. For example, Square Enix's "Virtual Spherical Gaussian Lights" is a great paper and only missing fast occlusion. The end results are incredibly compelling, including far more detailed and accurate reflections, the possibility of doing specular bounces, etc.
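For reference, the 'spherical Gaussian' those virtual lights are built from is just an exponential lobe on the sphere; closed-form products and integrals are what keep the paper's lighting math cheap. A minimal sketch (the names and the tiny Vec3 helper are mine, not from the paper's code):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// A spherical Gaussian (SG) lobe: amplitude * exp(sharpness * (dot(v, axis) - 1)).
struct SG {
    Vec3  axis;      // unit lobe direction
    float sharpness; // higher = narrower lobe
    float amplitude; // peak value along the axis
};

// Evaluate the lobe in unit direction v.
static float evalSG(const SG& g, Vec3 v)
{
    return g.amplitude * std::exp(g.sharpness * (dot(v, g.axis) - 1.0f));
}

// Useful property: the product of two SGs is again an SG, which keeps
// light-lobe-times-BRDF-lobe math in closed form.
```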

A surfel cache of that seems... well, the right move. You can place them arbitrarily: use the SDF capsules already out there for skinned characters, use full spheres for foliage. Ah, well.

It's not that they implemented Nanite's compute rasterizer because the PS5 lacks Mesh Shaders; they probably started working on it before there was a Mesh Shader API at all. The PS5 Primitive Shader and Mesh Shader APIs are so close that supporting both won't be a problem.

I mean, sure. It was just lucky they already had such a thing, is all. Just pointing out that others are going to have to deal with API differences as well.
 
So the micropolygon bit is a lie? Especially with anisotropy, trying to maintain sub-pixel triangles with a fixed per-cluster LOD is a lost cause.
IDK if the term 'micropolygon' strictly means sub-pixel triangles, or what Epic has communicated. And it may depend on HW and data what your final geometry resolution is.
But I think it's fine in any case. Most importantly, the system can still support low-spec HW efficiently. Tiny triangles are not necessary, just an option.
The clusters end up quite small, and a LOD switch seemingly always happens for an island of clusters. I assume this limits the complexity of stitching, and it's very clever if so. I don't notice popping on switches; maybe I would if I looked for it specifically and disabled TAA.
So it's not REYES, but a practical realtime solution.

What I don't understand is the use of cards instead of a hemispherical basis function. For example, Square Enix's "Virtual Spherical Gaussian Lights" is a great paper and only missing fast occlusion. The end results are incredibly compelling, including far more detailed and accurate reflections, the possibility of doing specular bounces, etc.
IDK how they store irradiance. I call it 'texels' only because I assume they are grid-aligned to the cards, similar to textures. But they surely use some angular basis to support normal mapping and rough reflections. We could peek at the code to find out.

A surfel cache of that seems... well, the right move. You can place them arbitrarily.
To me, the mapping between probe/surfel and the actual surface is still the final open problem, and I fail to come up with an efficient solution.
If you place them arbitrarily, you have to search for the probes affecting a pixel, which is expensive. That's one reason a probe volume like DDGI is so simple and attractive: the lookup is easy and constant time.
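To make the 'constant time' point concrete, a volume lookup is basically just an index computation plus trilinear weights, no search at all. A rough sketch with a hypothetical dense grid layout (real DDGI additionally weights by probe visibility and surface normal, which I omit here):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

struct ProbeGrid {
    Vec3 origin;                  // world-space corner of the grid
    float spacing;                // distance between probes
    int nx, ny, nz;               // probe counts per axis
    std::vector<Vec3> irradiance; // one payload value per probe
};

// Constant-time lookup: find the cell containing p and trilinearly blend the
// 8 surrounding probes. No search, no tree traversal.
static Vec3 sampleProbes(const ProbeGrid& g, Vec3 p)
{
    float fx = (p.x - g.origin.x) / g.spacing;
    float fy = (p.y - g.origin.y) / g.spacing;
    float fz = (p.z - g.origin.z) / g.spacing;
    int ix = std::clamp((int)std::floor(fx), 0, g.nx - 2);
    int iy = std::clamp((int)std::floor(fy), 0, g.ny - 2);
    int iz = std::clamp((int)std::floor(fz), 0, g.nz - 2);
    float tx = std::clamp(fx - ix, 0.0f, 1.0f);
    float ty = std::clamp(fy - iy, 0.0f, 1.0f);
    float tz = std::clamp(fz - iz, 0.0f, 1.0f);

    Vec3 result = {0, 0, 0};
    for (int dz = 0; dz < 2; ++dz)
    for (int dy = 0; dy < 2; ++dy)
    for (int dx = 0; dx < 2; ++dx) {
        float w = (dx ? tx : 1 - tx) * (dy ? ty : 1 - ty) * (dz ? tz : 1 - tz);
        const Vec3& v = g.irradiance[(iz + dz) * g.ny * g.nx + (iy + dy) * g.nx + (ix + dx)];
        result.x += w * v.x; result.y += w * v.y; result.z += w * v.z;
    }
    return result;
}
```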

For Epic, using the cards, it's already more complex. To update the probes, they can precompute a best-fit point on the surface and trace from there.
But the other way around (shading the G-buffer) is difficult. They could project from the surface along the normal to the card and then do a 'texture fetch' to interpolate probes. But this only handles the per-object box, so it would not resolve the seams we see in the debug view. To handle multiple boxes, they would need to look up a global spatial data structure and iterate over all the cards found. Doing this per pixel has some cost.
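If they really do it that way, the per-pixel card fetch could look something like this; pure speculation on my side, all names and the data layout are made up, but it shows why anything not covered by a card's box falls through the cracks:

```cpp
#include <algorithm>
#include <vector>

struct Vec3 { float x, y, z; };
static Vec3  sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// A 'card': an oriented rectangle with a small grid of cached lighting texels.
struct Card {
    Vec3 origin, axisU, axisV;   // axisU/axisV are unit vectors in the card plane
    float sizeU, sizeV;          // world-space extent of the card
    int resU, resV;
    std::vector<Vec3> texels;    // cached lighting, resU * resV entries
};

// Project a shaded surface point into the card plane and fetch the nearest
// cached texel. Returns false if the projection lands outside the card,
// which is exactly where seams like those in the debug view would come from.
static bool fetchCard(const Card& c, Vec3 surfacePos, Vec3* out)
{
    Vec3 d = sub(surfacePos, c.origin);
    float u = dot(d, c.axisU) / c.sizeU;   // 0..1 across the card
    float v = dot(d, c.axisV) / c.sizeV;
    if (u < 0.0f || u > 1.0f || v < 0.0f || v > 1.0f)
        return false;                      // not covered by this card
    int iu = std::min((int)(u * c.resU), c.resU - 1);
    int iv = std::min((int)(v * c.resV), c.resV - 1);
    *out = c.texels[iv * c.resU + iu];
    return true;
}
```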

Another option would be to render the debug view and filter some kernel around the shaded pixel. But this has a cost too, and would miss the angular information needed to support surface direction, so they would have to render the debug view to something like an SH framebuffer, which is fat and thus even more costly. IDK what they do; worth taking a look...
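To make 'fat' concrete: even the cheapest angular representation, band 0 plus band 1 SH, is already four coefficients per color channel per pixel. A sketch of what such a buffer would have to accumulate and evaluate (standard real SH basis constants):

```cpp
// Minimal band-0 + band-1 spherical harmonics, per color channel:
// 4 floats where a plain framebuffer stores 1.
struct SH4 { float c[4] = {0, 0, 0, 0}; };

// Real SH basis for bands 0 and 1, evaluated in unit direction (x, y, z).
static void shBasis(float x, float y, float z, float b[4])
{
    b[0] = 0.282095f;       // Y_0^0
    b[1] = 0.488603f * y;   // Y_1^-1
    b[2] = 0.488603f * z;   // Y_1^0
    b[3] = 0.488603f * x;   // Y_1^1
}

// Add one sample of incoming light arriving from unit direction (x, y, z).
static void shAccumulate(SH4& sh, float x, float y, float z, float value, float weight)
{
    float b[4]; shBasis(x, y, z, b);
    for (int i = 0; i < 4; ++i) sh.c[i] += value * b[i] * weight;
}

// Reconstruct the signal in a direction, e.g. the surface normal.
// (Proper irradiance would also convolve with the cosine lobe; omitted here.)
static float shEvaluate(const SH4& sh, float x, float y, float z)
{
    float b[4]; shBasis(x, y, z, b);
    return sh.c[0]*b[0] + sh.c[1]*b[1] + sh.c[2]*b[2] + sh.c[3]*b[3];
}
```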

My own plan here initially was to have a UV channel on the surface to address a texture of probes, just like lightmaps. To prevent seams, I worked on quadrangulation of the geometry to generate seamless UVs. That's difficult already, and if we add LOD to the mix, the LOD mechanism has to match the probe hierarchy. It also has to support reduction of topological complexity, e.g. closing small holes if we see the object from a larger distance. All this is possible and makes sense, but requires remeshing of the models, resulting in a loss of geometric accuracy and detail. It's also not practical for diffuse geometry like foliage.
So I think I can use this system only for most of the opaque stuff (terrain, solid walls of buildings, etc.), but not for detailed, human-made geometry (a fence, chairs, detailed furniture...) or foliage.

Thus, I need something more general as well. The obvious solution is to traverse my BVH of surfels for each shaded pixel, but I think this would be more costly than the entire algorithm to compute the GI. It's not as expensive as ray tracing, but it's still a range query on a tree.
To do better, I can make tiles of the Z-buffer, calculate a bounding box per tile, and do one traversal per tile. That's still very expensive, but likely fast enough. Downside: it's not compatible with RT. Grouping ray hits spatially just to support such a shared traversal algorithm does not make much sense, so I can't avoid doing a traversal per hit just to find the already computed GI probes.
Binning all probes into a regular volume grid each frame also seems too expensive.
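Roughly, the tile idea looks like this: one range query over the surfel BVH per tile, after which the pixels in the tile only loop over the short result list. The tile's world-space bounds would come from unprojecting its min/max depth; all names here are hypothetical:

```cpp
#include <vector>

struct Vec3 { float x, y, z; };
struct AABB { Vec3 lo, hi; };

static bool overlaps(const AABB& a, const AABB& b)
{
    return a.lo.x <= b.hi.x && a.hi.x >= b.lo.x &&
           a.lo.y <= b.hi.y && a.hi.y >= b.lo.y &&
           a.lo.z <= b.hi.z && a.hi.z >= b.lo.z;
}

// Flat BVH node over surfels (illustrative layout).
struct BVHNode {
    AABB bounds;
    int  left, right;        // child indices, or -1 if leaf
    int  firstSurfel, count; // valid if leaf
};

// One range query per tile: gather every surfel whose bounds touch the tile's
// world-space AABB. Per-pixel shading then only iterates this short list.
static void querySurfelsForTile(const std::vector<BVHNode>& nodes,
                                const AABB& tileBounds,
                                std::vector<int>* outSurfels)
{
    std::vector<int> stack = {0};          // start at the root
    while (!stack.empty()) {
        const BVHNode& n = nodes[stack.back()];
        stack.pop_back();
        if (!overlaps(n.bounds, tileBounds))
            continue;
        if (n.left < 0) {                  // leaf: emit its surfels
            for (int i = 0; i < n.count; ++i)
                outSurfels->push_back(n.firstSurfel + i);
        } else {
            stack.push_back(n.left);
            stack.push_back(n.right);
        }
    }
}
```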

So, I can sing a song about the difficulties of surface probes. This 'simple' mapping problem seems negligible at first, but it is the main reason I'm years late in making my stuff 'ready for production'. Any ideas welcome! ;)
 
It's not that they implemented Nanite's compute rasterizer because the PS5 lacks Mesh Shaders; they probably started working on it before there was a Mesh Shader API at all. The PS5 Primitive Shader and Mesh Shader APIs are so close that supporting both won't be a problem.
Is the main difference just the lack of amplification shaders?
 
So the micropolygon bit is a lie? Especially with anisotropy, trying to maintain sub-pixel triangles with a fixed per-cluster LOD is a lost cause.

By the common definition of what a micropolygon is with respect to Reyes, it's a marketing fib, but you could say that if you run it at a supersampled screen resolution, or thanks to the effects of temporal AA, it almost could be true. If you look at the coloured primitive-triangle view of UE5, the average screen size of each triangle is maybe around 8x8 pixels, down to about 4x4 pixels in places.
 
Is the main difference just the lack of amplification shaders?

You could actually implement something like UE5's hierarchical triangle cluster LOD using domain/tessellation/geometry shader stuff, but it would be nowhere near as efficient as mesh shaders, which by design get rid of an important pipeline bottleneck. In fact you could implement all of this well on a PlayStation 2.
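The cluster-LOD selection itself is API-agnostic; a toy sketch of the general idea is an error-threshold cut through a cluster hierarchy. This is not Epic's actual data layout or error metric, and it ignores the crack-free grouping Nanite does across cluster islands:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Toy hierarchy node: a group of triangle clusters plus a simplified 'proxy'
// cluster to draw when the group is far away.
struct ClusterNode {
    float centerX, centerY, centerZ;
    float simplificationError;   // world-space error of drawing this node's proxy
    int   proxyCluster;          // cluster to draw if we stop at this node
    std::vector<int> children;   // empty = leaf (full-detail clusters)
};

static float distanceTo(const ClusterNode& n, float cx, float cy, float cz)
{
    float dx = n.centerX - cx, dy = n.centerY - cy, dz = n.centerZ - cz;
    return std::sqrt(dx*dx + dy*dy + dz*dz);
}

// Cut the hierarchy where the projected error drops below ~1 pixel.
// screenScale folds in resolution and field of view.
static void selectClusters(const std::vector<ClusterNode>& nodes, int nodeIndex,
                           float camX, float camY, float camZ,
                           float screenScale, float pixelThreshold,
                           std::vector<int>* drawList)
{
    const ClusterNode& n = nodes[nodeIndex];
    float dist = std::max(distanceTo(n, camX, camY, camZ), 1e-4f);
    float projectedError = n.simplificationError * screenScale / dist;
    if (n.children.empty() || projectedError < pixelThreshold) {
        drawList->push_back(n.proxyCluster);     // coarse enough, stop here
    } else {
        for (int child : n.children)             // refine further
            selectClusters(nodes, child, camX, camY, camZ,
                           screenScale, pixelThreshold, drawList);
    }
}
```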
 
IDK how they store irradiance. I call it 'texels' only because I assume they are grid-aligned to the cards, similar to textures. But they surely use some angular basis to support normal mapping and rough reflections. We could peek at the code to find out.

I suspect nVidia can very quickly push features/performance into GPUs specifically to suit this, as hierarchical triangle cluster LOD can be implemented in mesh shaders and then fed to the RTX units. On the Lumen side, they could integrate functionality for RTX to better accelerate sampling and runtime modification/caching of an AABB BVH mapped to texture samples for the lighting cache.
 
IDK how they store irradiance. I call it 'texels' only because I assume they are grid-aligned to the cards, similar to textures. But they surely use some angular basis to support normal mapping and rough reflections. We could peek at the code to find out.

Do they use it as irradiance though? If you just store radiance you can store a flat representation. Then if you're tracing a hemisphere anyway...

To me, the mapping between probe/surfel and the actual surface is still the final open problem, and I fail to come up with an efficient solution. [...] So, I can sing a song about the difficulties of surface probes. This 'simple' mapping problem seems negligible at first, but it is the main reason I'm years late in making my stuff 'ready for production'. Any ideas welcome! ;)

For foliage... non-spherical terms almost make sense. You can store one 2D SDF plane for the geometry, which plays much nicer with indirect tracing: get the distance to the plane / the distance to the 2D SDF. Then materials and radiance are just... textures. A pile of cards attached to a mesh in a tree's case, just cards for something like grass... and use the lowest LODs as the geometry approximations, since those are just alpha-tested planes anyway.
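The 'distance to plane / distance to 2D SDF' combination works because the exact 3D distance to a flat cutout factors into exactly those two terms. A sketch, with the 2D SDF sample stubbed out as a placeholder:

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
static Vec3  sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Placeholder 2D SDF: a unit disc. A real version would sample a
// signed-distance texture of the leaf/grass silhouette at card-local (u, v).
static float sample2dSDF(float u, float v)
{
    return std::sqrt(u * u + v * v) - 1.0f;
}

// A flat card: origin, plane normal, and two in-plane unit axes.
struct FoliageCard { Vec3 origin, normal, axisU, axisV; };

// Exact 3D distance to the cutout on the card: inside the silhouette it is
// just the distance to the plane, outside it is the hypotenuse of
// (distance to plane, 2D distance to the silhouette edge).
static float distanceToCard(const FoliageCard& c, Vec3 p)
{
    Vec3 d = sub(p, c.origin);
    float planeDist = dot(d, c.normal);
    float sdf2d = sample2dSDF(dot(d, c.axisU), dot(d, c.axisV));
    if (sdf2d <= 0.0f)
        return std::fabs(planeDist);
    return std::sqrt(planeDist * planeDist + sdf2d * sdf2d);
}
```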

As for skinned complex models... we've got SDF capsules, and those are fast. An albedo approximation can just be a spherical function mapped to the shape. Cached radiance is obviously just another spherical function if you need that. Or rather, it can look like a voxel collection of SGGX, and the models can look like SGGX mapped to capsules. As a representation of albedo it's fuzzy, but it has apparent detail that's correct over all views, and could easily be enough for indirect lighting. Heck, screw it, maybe just have it all be SGGX-stored voxel albedo, static models too.
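And the capsule query itself is the classic closed-form segment distance, so marching or tracing against a character's capsule rig stays cheap (standard formulation; the small vector helpers are mine):

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };
static Vec3  sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static float length(Vec3 v)      { return std::sqrt(dot(v, v)); }

// Classic capsule SDF: distance from p to the segment a-b, minus the radius r.
static float sdCapsule(Vec3 p, Vec3 a, Vec3 b, float r)
{
    Vec3 pa = sub(p, a);
    Vec3 ba = sub(b, a);
    float h = std::clamp(dot(pa, ba) / dot(ba, ba), 0.0f, 1.0f); // closest point on segment
    Vec3 onSegment = {ba.x * h, ba.y * h, ba.z * h};
    return length(sub(pa, onSegment)) - r;
}
```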

As for seams... that's tough. Though if you want seamless mapping... the math behind this paper is pretty out there, but the results are great looking: locally injective conformal maps for any mesh! Though that doesn't help with LOD... maybe it really should all just be volumetric? Mip-map away! The SGGX paper itself mentions that while this isn't necessarily correct (I think this honestly breaks the Bekenstein limit from physics), the end result is that you don't really notice.

 
Do they use it as irradiance though?
After some looking up, I'm no longer sure I got the terms irradiance vs. radiance right at all. Lighting terms are still confusing to me, tbh.
What I meant was that they likely store incoming light, not outgoing light, in the cache. The advantage is that it's easy to have different resolutions of lighting and materials this way.
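In code terms the decoupling is just this: the cache holds incoming light at whatever coarse resolution, albedo stays in the full-resolution material textures, and the two only meet at shading time. Lambertian case only, hypothetical names:

```cpp
struct Color { float r, g, b; };

// Lambertian outgoing radiance = albedo / pi * irradiance.
// 'cachedIrradiance' would come from a coarse cache texel, 'albedo' from the
// full-res material textures; lighting and material resolution stay independent.
static Color shadeLambert(Color albedo, Color cachedIrradiance)
{
    const float kInvPi = 0.31830988f;
    return { albedo.r * cachedIrradiance.r * kInvPi,
             albedo.g * cachedIrradiance.g * kInvPi,
             albedo.b * cachedIrradiance.b * kInvPi };
}
```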
the math behind this paper is pretty out there. But the results are great looking, though that doesn't help with LOD...
Crane is my favorite researcher on geometry; I'm always checking his site :) Though he never worked on seamless parametrization. Parametrization as shown only serves as a first step toward that, because it ignores the integer matching problem coming from discrete texels. Quadrangulation methods address this. Even if we are not interested in quad meshes, we can use those results to get seamless UVs for regular triangle meshes. This paper is pretty good, but lacking control over singularities, I had to come up with another solution.
To support LOD, I wanted to calculate very large base quads for low detail, which can then simply be subdivided for high detail. So those big quads would correspond to those UE5 cards, but with a robust bijective mapping to the surface and variable detail.
I think this was the hardest work I've ever done; here's an example of the result:
[attached image: quadrangulation result]
The various mip levels of UV quads correspond to the surfel hierarchy I use for GI, and mapping from the surface is straightforward. It also enables detail/compression options like displacement, which likewise needs seamless parametrization to work properly beyond just height maps.
Nice, but the maximum possible size of the quads depends on topological features, like the ears in this case. This gave me the motivation to use a volumetric representation of geometry (just like those voxel trees in your video) to reduce this complexity by merging such features into a single bump, closing small holes, etc.

Well, I'll see how far that goes. Preprocessing times are an issue, for example.
When I came up with that solution for GI many years ago, I really did not expect the need to solve LOD and geometry as well. It sucks how all those problems are related to each other.
 

This is an interesting early look at limitations so far by an artist:
https://www.artstation.com/ronanm/blog/qM77/testing-lumen-vs-raytraced-scenes-in-unreal-5

Also a lot of posts with other issues in their forums that are interesting:
https://forums.unrealengine.com/tag/Lumen

Not a criticism of the tech, but it feels like a lot of people were expecting it to 'just work' without the usual production fiddling (which they do mention in the docs, to be fair).
 
This is an interesting early look at limitations so far by an artist: [...] it feels like a lot of people were expecting it to 'just work' without the usual production fiddling (which they do mention in the docs, to be fair).

Eh, Lumen is disappointing versus Nanite.

Nanite is limited to static meshes, and it's disappointing to hear there won't be a solution for skinned meshes and foliage yet. It sounds like there's going to be an incredibly distinct mismatch in asset quality there for quite a while. What's more disappointing is that they sound like they want a "done" solution like Nanite, rather than "good enough". Seems like you could run a more standard meshlet/cone pipeline on geometry groups that are relatively static to each other, then store that in a tree hierarchy per object; it would play nicely with their existing occlusion and instancing pipeline, and allow for a large increase in asset quality even if the geometry runtime weren't close to constant and it didn't have the art-pipeline benefits of Nanite.
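For what it's worth, the 'meshlet/cone' part boils down to precomputing a normal cone per meshlet offline and rejecting whole meshlets that are guaranteed back-facing before rasterization. A common formulation, roughly (names are mine):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
static Vec3  sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static float length(Vec3 v)      { return std::sqrt(dot(v, v)); }

// Per-meshlet cone, precomputed offline: an apex, the average triangle normal,
// and a cutoff encoding the cone's opening angle (plus some slack).
struct MeshletCone { Vec3 apex, axis; float cutoff; };

// True if every triangle in the meshlet is guaranteed back-facing from this
// camera position, so the whole meshlet can be skipped.
static bool coneCull(const MeshletCone& c, Vec3 cameraPos)
{
    Vec3 toApex = sub(c.apex, cameraPos);
    float len = length(toApex);
    if (len < 1e-6f) return false;                 // camera at the apex: keep it
    Vec3 dir = { toApex.x / len, toApex.y / len, toApex.z / len };
    return dot(dir, c.axis) >= c.cutoff;
}
```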

As for Lumen, well, I was hoping that with Epic's huge resources they could generally match the small-team open-source engine Godot. Not that the Godot team isn't smart; they're brilliant in places, the GI being a distinctive part of that. But with that example already out there, making it easier to understand what came before, and with how much Epic could throw at it... I mean. In comparison, Lumen has quite poor performance, much worse sharp reflections, no support for on-the-fly procedural mesh generation, etc.
 
For a first release of UE5 this is still very good :)

Technology takes time to mature and people need to use the tools more to provide feedback. I’m sure UE will get where it needs to go.

I'm actually quite curious to see how Unity will respond to this, or if the two engines will completely diverge now in the segments they want to cover.
 
For a first release of UE5 this is still very good :) [...] I'm actually quite curious to see how Unity will respond to this, or if the two engines will completely diverge now in the segments they want to cover.

Unity is going to trail behind in terms of renderer technology, but they have an even bigger vision: transitioning their engine to an ECS architecture, which will enable more scalable multi-threaded gameplay code and allow for more real-time interactivity. UE, on the other hand, is quickly approaching the limit of its single-threaded game framework...
 
Unity is going to trail behind in terms of renderer technology, but they have an even bigger vision: transitioning their engine to an ECS architecture, which will enable more scalable multi-threaded gameplay code and allow for more real-time interactivity. UE, on the other hand, is quickly approaching the limit of its single-threaded game framework...
Is this what @sebbbi was hired to help out on?
Perhaps not... he's probably working on rendering.
 