Unreal Engine 5, [UE5 Developer Availability 2022-04-05]

Isn't that a BVH LOD? Have a high-detail one for close range and a low-detail one for distance, where a very coarse representation is adequate?
I guess with BVH LOD you mean something like DXR TL/BL AS, or UE5 global/local SDF?
Problem is, while this sounds reasonable, it only holds for a static viewport, or for a limited amount of time (e.g. a console life cycle).
You cannot solve LOD this way. It's just a hack which is initially impressive, then merely good enough, and finally a real issue requiring urgent replacement again.

To solve LOD for real, we need to make detail decisions locally at every level of a hierarchy. And if we take this very literally, it even rules out a combination of points and triangles, for example.
Point splatting distant foliage seems really attractive. But how do we handle the transition to a single-LOD triangle mesh, which is still too detailed at the location of the transition, and too coarse for close-up views?
Considering this, is it even worth investing in splatting foliage if it can't really solve any problem we already had before?

Or another example: Nanite has this local detail feature we want, but only per model. It does not solve the problem at greater distances, when the number of models becomes too large again and they should gradually merge.
The solution to this is again just hacks.
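A minimal sketch of what "making detail decisions locally at every level of a hierarchy" means in practice: walk a detail tree and, per node, either accept its representation or recurse, based on projected error. The node layout and error metric here are placeholder assumptions, not any particular engine's scheme.

```cpp
// Minimal sketch: choose a cut through a detail hierarchy by making a
// local decision at every node. Node layout and error metric are
// illustrative assumptions only.
#include <cmath>
#include <vector>

struct LodNode {
    float center[3];
    float radius;                   // bounding sphere
    float geometricError;           // world-space error if we stop here
    std::vector<LodNode*> children; // empty -> finest available detail
};

// Rough projected error: world-space error scaled by 1/distance.
static float ProjectedError(const LodNode& n, const float camPos[3])
{
    float dx = n.center[0] - camPos[0];
    float dy = n.center[1] - camPos[1];
    float dz = n.center[2] - camPos[2];
    float dist = std::sqrt(dx * dx + dy * dy + dz * dz);
    dist = std::fmax(dist - n.radius, 1e-3f);
    return n.geometricError / dist;
}

// Emit the coarsest nodes whose error is acceptable; recurse otherwise.
void SelectCut(LodNode& node, const float camPos[3], float errorThreshold,
               std::vector<LodNode*>& outCut)
{
    if (node.children.empty() ||
        ProjectedError(node, camPos) <= errorThreshold) {
        outCut.push_back(&node);   // this node's representation is enough
        return;
    }
    for (LodNode* child : node.children)
        SelectCut(*child, camPos, errorThreshold, outCut);
}
```

The hard problems this sketch ignores are exactly the ones discussed here: what geometry each node actually stores, and how neighboring nodes at different levels stitch without cracks.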

The LOD system that I work on solves those problems, at least in theory. But only for a static scene, and with bad compromises regarding quality for close-ups. There's no way around hacks to address this.

I think we'll be doomed to combine hacks for a long, long time. It feels kinda hopeless, really.
You could say that's what makes our work interesting, but I say it's depressing just as much.
 

Is this solved in offline rendering? If so, then it's just about getting that solution running fast enough?
 
No. There is little need for dynamic LOD in offline rendering.
(which probably is the reason why certain 'amateurs' forgot about it completely)
So basically you are saying it will go away with enough computing power, problem solved :D
 
But how do we handle the transition to a single-LOD triangle mesh, which is still too detailed at the location of the transition, and too coarse for close-up views?
Considering this, is it even worth investing in splatting foliage if it can't really solve any problem we already had before?

Or another example: Nanite has this local detail feature we want, but only per model. It does not solve the problem at greater distances, when the number of models becomes too large again and they should gradually merge.
The solution to this is again just hacks.
Indeed. As I say, it's basically mipmapping in 3D space. We haven't solved continuous 2D texture detail either; we just merge between two fixed ideal states to approximate the in-between states. Perhaps the solution in 3D starts with a solution for 2D, or even 1D, progressive resolution? Or perhaps it's impossible, as we try to represent the complexities of 4-dimensional space-time in a very finite, slow 2D memory array?!
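For reference, the "merge between two fixed ideal states" approach is trivially small in code, which is part of why it's so hard to move past. A sketch, with an assumed log2 distance-to-level mapping:

```cpp
// Blend between two discrete detail levels, mip-selection style.
// The distance-to-level mapping is an assumed example.
#include <algorithm>
#include <cmath>

struct LodBlend {
    int   fine;     // finer discrete level
    int   coarse;   // next coarser discrete level
    float weight;   // 0 = fully fine, 1 = fully coarse
};

LodBlend SelectLod(float distance, float baseDistance, int levelCount)
{
    float level = std::log2(std::fmax(distance / baseDistance, 1.0f));
    level = std::fmin(level, float(levelCount - 1));
    LodBlend b;
    b.fine   = int(level);
    b.coarse = std::min(b.fine + 1, levelCount - 1);
    b.weight = level - float(b.fine);
    return b;
}
```

The in-between states themselves are never represented, only approximated by fading between two fixed neighbors, which is the limitation being pointed out.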

So basically you are saying it will go away with enough computing power, problem solved :D
No, it's a memory problem. We have an excess of computing power but can't get access to the right data to process. With enough memory bandwidth and zero access latency, it's a solved problem. ;)
 
So basically you are saying it will go away with enough computing power, problem solved :D
Hehe, sure. But you know my agenda: we can no longer rely on increasing compute power. So we need to work harder than just implementing NV brute-force papers.
The software and hardware industries' interests diverge. We are on our own from now on. :O

Luckily game devs are very experienced with just dodging the problem.

As I say, it's basically mipmapping in 3D space.
That's precisely what I do. :D
It's trivial and boring. I really tried 'advanced' stuff before, like quadrangulation, if you remember the bunny image. But this was again just a 'mix of two' approach. You can increase LOD per quad efficiently, but above that point you cannot reduce it easily. And we can't represent complex models with just 'large' quads.
Now I voxelize the scene into multi-level regular density grids (so mip maps), then do a shitty isosurface extraction, then make a nice quad-dominant remesh out of that.
That's general and can do everything. It generates discrete LODs per mip level. But true continuous LOD is by definition impossible, due to the changing-topology problem. So starting from discrete levels is not actually bad, I think.
However, the quantization problems from using a regular grid remain. It cannot handle thin objects, e.g. the curtains from the Sponza model. It will process them without crashing and will give some output, but this output is of bad quality and has holes which shouldn't be there.
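A sketch of the mip-chain part of that pipeline, assuming the level-0 density grid already exists from voxelization; each level would then get its own isosurface extraction and remesh. The grid layout is an assumption.

```cpp
// Sketch: build a mip chain of density grids by averaging 2x2x2 blocks.
// Each level would then get its own isosurface extraction and remesh,
// giving one discrete LOD per mip level. Layout is an assumption.
#include <cstddef>
#include <vector>

struct DensityGrid {
    int res;                     // cubic resolution, power of two
    std::vector<float> density;  // res*res*res values in [0,1]
    float& at(int x, int y, int z)       { return density[(size_t(z) * res + y) * res + x]; }
    float  at(int x, int y, int z) const { return density[(size_t(z) * res + y) * res + x]; }
};

std::vector<DensityGrid> BuildMipChain(DensityGrid level0)
{
    std::vector<DensityGrid> chain;
    chain.push_back(std::move(level0));
    while (chain.back().res > 1) {
        const DensityGrid& src = chain.back();
        DensityGrid dst;
        dst.res = src.res / 2;
        dst.density.resize(size_t(dst.res) * dst.res * dst.res);
        for (int z = 0; z < dst.res; ++z)
        for (int y = 0; y < dst.res; ++y)
        for (int x = 0; x < dst.res; ++x) {
            float sum = 0.0f;
            for (int k = 0; k < 8; ++k)  // average the 2x2x2 block
                sum += src.at(x * 2 + (k & 1), y * 2 + ((k >> 1) & 1),
                              z * 2 + (k >> 2));
            dst.at(x, y, z) = sum * 0.125f;
        }
        chain.push_back(std::move(dst));
    }
    return chain;
}
```

Averaging 2x2x2 blocks also shows where the thin-object problem comes from: anything thinner than a voxel at a given level simply dilutes towards zero density and drops out of that level's isosurface.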

It's good enough to give a high-quality surfel hierarchy for GI, and the system can process volumetric scenes of 'infinite' size with out-of-core processing. This opens up some new options for content generation, which I'm exploring with some success. Using it to generate dynamic LOD geometry for rendering surely is possible (either triangles or points), but not for every kind of model. For 'imperfect' terrain it's ideal; for 'perfect' human-made stuff, not so much.

Actually I would say: use Nanite for the latter and modular instances, mine for unique terrain, and something else like splatting for foliage.
Parametric surfaces would beat Nanite in many cases, like cars.
That's 4 techniques then, which sucks. But I don't know better.

Though, at a low level, I'm convinced all geometry - no matter how it's generated, compressed and streamed, etc. - could use the same BVH. And also the same rasterization system for triangles.
Besides DXR, nothing speaks against doing this.
I expect efficient rendering, but increasing cost for background threads to stream data from disk, enhance it procedurally, and generate the final data structures, which can then be cached and reused as long as their LOD remains static.
I guess this vision makes sense to anybody. It's obvious. But it really shows the BVH MUST be a dynamic data structure, even for static scenes, due to LOD.
Which brings me back to my request ;D
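A toy illustration of that last point: the BVH changing for a completely static scene purely because the LOD cut moves. All types and names are made up.

```cpp
// Toy illustration of a BVH that changes for a *static* scene purely
// because the LOD cut moves: a node holds either a coarse proxy cluster
// or refined children, and background work swaps between the two.
// All types and fields are invented for illustration.
#include <memory>
#include <vector>

struct Aabb { float min[3], max[3]; };

struct ClusterHandle { int id = -1; };  // e.g. ~128 triangles, streamed

struct BvhNode {
    Aabb bounds;
    ClusterHandle coarseProxy;                      // used while unrefined
    std::vector<std::unique_ptr<BvhNode>> children; // present when refined
    bool refined() const { return !children.empty(); }
};

// Called from a background thread once finer clusters finished streaming.
void Refine(BvhNode& node, std::vector<std::unique_ptr<BvhNode>> finerChildren)
{
    node.children = std::move(finerChildren);
}

// LOD cut moved away again: drop the refined children, fall back to proxy.
void Coarsen(BvhNode& node)
{
    node.children.clear();
}
```

Every Refine/Coarsen invalidates ancestor bounds and whatever flattened structure the ray tracer consumes, so the acceleration structure has to support cheap incremental updates even though nothing in the scene moved.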
 
JoeJ is talking about the 'brute force' work needed to get the dedicated HW to work, which in itself can be considered an accelerated 'brute force' solution. It's not a criticism of RT HW itself, but of the nature of that hardware.
 
Dedicated HW usually is more efficient though
No. Dedicated HW usually just implements brute-force algorithms, which is fine and expected.
But software is not, which is what I meant. Almost all of NV's research papers are about brute-force software solutions, made practical by the 'magic' of parallel processing.
It's been like that for decades. Gfx devs no longer even know that something other than brute force exists. More raw power is the only path towards further progress they can see.
So they ignore the conflict of interests and adopt NV's newest breakthroughs, although they know making data-center HW a standard for gamers will only hurt their sales and isn't practical.

The tech priests of modern times, their visions, marketing lies and promises. The network of self-confirmation bias the hype generates over years and masses of people is hard to escape.
I know what you think now, but what about this example:
Elon Musk sends rockets to Mars. No return. The goal is to colonize it. To save humanity. Would you take a seat? To die in a lonely desert in the best case? No?
Good then. But the situation is the same: you don't know how far the speculated 'efficiency' you mention will get us. You just assume it will be 'far' because that's what you'd like to see. You trust the promise without doubt, and you get confirmation from others who do the same.
As a result, many people will spend those 2000 bucks, and many will take a seat in Musk's rockets.

Well - maybe my political views are arguable. But my request is not.
 

Right, I understand. How to convince every vendor/company to improve things then? :p We're at least stuck with current-gen machines for another five years, and those are going to follow the same route for BC and all that. Same for IHVs and even software companies.
It all could have been better, but it ain't.
 
It's not a criticism of RT HW itself, but of the nature of that hardware.
I do not criticise the HW, nor do I see a need to change this HW from what I know. I'm unsure if heavy RT is worth it for mainstream, but that's another story, subjective, and we will just see.
What I do criticize is the black box around that hardware, preventing some of us from using it efficiently or at all.
And this black box isn't implemented in HW; it's done in software, in the form of an API restricted to legacy standards. A simplified high-level API, missing half of the thing, inside a low-level API like DX12. This conflict alone shows something is wrong.
 
How to convince every vendor/company to improve things then?
Hard route, as proposed by me:
Expose and specify BVH data formats per chip. Leave it to the devs to do the rest.

Easy route, and I assume Epic has already proposed / requested this:
Make 'cluster of triangles' a primitive for raytracing.
The BVH over those clusters could be streamed and provided by game engines, or - if that's really such a big problem for IHVs - the driver could manage it to keep data structures hidden and proprietary.

Both are possible without any doubt.
But what do we get? DMM. Seems pretty meh.
Still waiting on an email from NV about further details on that. But maybe they check forums and blacklist saboteurs :D
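Purely as an illustration of the 'easy route', a hypothetical shape such an API could take. Nothing like this exists in DXR or Vulkan RT as of this discussion, and every name below is invented.

```cpp
// Purely hypothetical API sketch of the 'cluster of triangles as an RT
// primitive' idea - every name here is invented for illustration.
#include <cstdint>

struct ClusterDesc {
    uint64_t vertexBufferVa;   // positions for up to ~128 triangles
    uint64_t indexBufferVa;
    uint32_t triangleCount;
    float    aabbMin[3];
    float    aabbMax[3];       // engine-provided bounds, valid for this LOD
};

// The engine streams clusters in/out as its LOD cut changes and only tells
// the driver which clusters currently exist; the BVH over them is kept by
// the driver (or by the engine, under the 'hard route').
struct ClusterAccelerationStructureBuildInput {
    const ClusterDesc* clusters;
    uint32_t           clusterCount;
    bool               allowIncrementalUpdates; // add/remove without full rebuild
};
```

The point is that the unit the driver sees matches the unit engines already stream (Nanite-style clusters), so a LOD change becomes an incremental add/remove instead of a full BLAS rebuild.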
 

Highly doubt NV or any IHV sees you as some kind of saboteur ;)
 
I actually think per-vendor API kind of stuff is starting to make more and more sense; this seems logical to me.
I wanted this too, but now with Intel entering as well... probably too much fragmentation.
So I'm thinking about querying the BVH data structure definition from the DX12 / VK driver. Such queries already exist, e.g. to figure out if a GPU has support for 64-bit atomics.
Using such a definition, we could procedurally generate compute shader code on the client to implement our needs.
So this would even work for future GPUs, if it can indeed be that simple.
Not many people would use this feature, but it would also be almost no work for IHVs to implement.
However, it would reveal some details about how their HW works internally, and maybe they don't want this.
But I don't think so. It's just that LOD is traditionally neglected, and Epic's demos may not change this mindset so quickly. So there is probably little demand.
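A hypothetical sketch of that query-plus-codegen idea. No such query exists in DX12 or Vulkan; the descriptor and the generated HLSL snippet below are invented purely to show the shape of it.

```cpp
// Hypothetical: the driver reports its BVH node layout, and the client
// generates traversal shader code from it. The descriptor is invented.
#include <cstdio>
#include <string>

struct BvhLayoutDesc {
    int  nodeSizeBytes;        // e.g. 64
    int  childCount;           // e.g. 4 for a BVH4
    int  childOffsetBytes;     // where the child indices live in the node
    int  boundsOffsetBytes;    // where the quantized child AABBs live
    bool boundsAreFp16;        // quantization format
};

// Client-side: emit a tiny piece of HLSL matching this GPU's layout.
// A real generator would also cover leaves, quantization, traversal order.
std::string EmitNodeFetchHlsl(const BvhLayoutDesc& d)
{
    char buf[256];
    std::snprintf(buf, sizeof(buf),
        "uint4 LoadChildren(ByteAddressBuffer bvh, uint nodeIndex) {\n"
        "    return bvh.Load4(nodeIndex * %d + %d);\n"
        "}\n",
        d.nodeSizeBytes, d.childOffsetBytes);
    return buf;
}
```

An IHV would only have to fill in a struct like this per architecture; the catch is that it pins down layout details they may prefer to keep changeable, which is the concern mentioned above.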
 

Numbers are on his PC

On LinkedIn he gives an example for Series X and PS5: below 100 microseconds (0.1 ms) on the CPU per character, below 50 microseconds (0.05 ms) on the GPU per character, and 1 MB of RAM. They want to improve the CPU performance further. Maybe this is what they use in Tekken 8. It can work with cloth deformation, but they have a better ML model for clothing.


Unreal Engine 5.1 ships with an updated ML Deformer which can greatly improve mesh deformation in real-time. Basically you can train it on any deformation. In the example below it is trained to approximate the results of a FEM solver that tries to preserve volume. However, you can use anything (for now without dynamics).

We have built a framework that allows people to build custom ML mesh deformation models. The framework gives you an editor to train, inspect and test your deformation on a given character. It also provides an ML Deformer asset type and an ML Deformer component, to have a unified user experience.

UE 5.1 ships with a model that we developed and call the Neural Morph Model. The great thing about this is that it is relatively lightweight in both memory usage and performance.

The ML Deformer for the model below uses around 1 megabyte of memory on the GPU and it evaluates on latest consoles below 100 microseconds per character on CPU and below 50 microseconds on the GPU. We hope to still reduce the CPU time significantly.

While the Neural Morph Model is good for body deformations, and can also handle certain types of clothing, we also have another model in production that performs better on clothing.

The goal of this feature is to bring VFX quality deformation to real-time. Or at least, close to that quality.

The video below shows the difference between a character that is linear skinned (including helper bones), and with ML Deformer enabled. I am enabling and disabling it throughout the video. Notice the chest area.

I know the character in this video might not be the best example, as it is an older Paragon character, but I hope I can show more later on! :)
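For a rough idea of why the cost can be that low: as publicly described, this class of deformer uses a small network to map pose inputs (e.g. bone rotations) to weights for a set of learned morph targets, added on top of normal linear skinning. The sketch below only shows that shape of computation, with made-up sizes; it is not Epic's implementation.

```cpp
// Sketch of the general shape of such a deformer: a small fully connected
// network maps pose inputs to weights for learned morph targets, whose
// deltas are added on top of linear skinning. Sizes and layout are made up.
#include <cstddef>
#include <vector>

struct TinyMlp {
    // One hidden layer, row-major weights; trained offline.
    std::vector<float> w0, b0;   // [hidden x inputs], [hidden]
    std::vector<float> w1, b1;   // [morphs x hidden], [morphs]
    size_t inputs, hidden, morphs;

    std::vector<float> Evaluate(const std::vector<float>& poseInputs) const
    {
        std::vector<float> h(hidden), out(morphs);
        for (size_t i = 0; i < hidden; ++i) {
            float a = b0[i];
            for (size_t j = 0; j < inputs; ++j)
                a += w0[i * inputs + j] * poseInputs[j];
            h[i] = a > 0.0f ? a : 0.0f;               // ReLU
        }
        for (size_t i = 0; i < morphs; ++i) {
            float a = b1[i];
            for (size_t j = 0; j < hidden; ++j)
                a += w1[i * hidden + j] * h[j];
            out[i] = a;                               // morph target weights
        }
        return out;
    }
};

// Per vertex, the predicted weights then blend precomputed per-morph deltas:
// finalPos = skinnedPos + sum_k(weight[k] * morphDelta[k][vertex]).
```

A network this small is on the order of 10^4-10^5 multiply-adds per character, and the expensive part it approximates (the FEM solve) happened offline during training, which is consistent with the sub-0.1 ms and ~1 MB figures quoted above.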

 
So this kind of ML can run on basically anything, given its minimal CPU/GPU usage. Last-gen consoles should be able to run it too, though not at typical UE5 fidelity.
 