Unreal Engine 5, [UE5 Developer Availability 2022-04-05]


Afaict this is UE4, so sorry for posting here.
But, um, this game looks better than anything? I think the emissive materials look mind-blowing.
So can anybody tell me why this looks so great? Maybe it's this new 'convolution bloom'?
Looks great
 
It seems to say that DirectML is a problem for tensor core usage. From what I have seen, he said they don't use tensor cores, and that the code runs fast on an AMD 5700x, but he doesn't talk about Nvidia hardware.
Regarding this matter, this is how AnandTech describes it.

For Turing, NVIDIA changed how standard FP16 operations were handled. Rather than processing it through their FP32 CUDA cores, as was the case for GP100 Pascal and GV100 Volta, NVIDIA instead started routing FP16 operations through their tensor cores.

The tensor cores are of course FP16 specialists, and while sending standard (non-tensor) FP16 operations through them is major overkill, it’s certainly a valid route to take with the architecture. In the case of the Turing architecture, this route offers a very specific perk: it means that NVIDIA can dual-issue FP16 operations with either FP32 operations or INT32 operations, essentially giving the warp scheduler a third option for keeping the SM partition busy.


Ampere and Ada also follow the same path as Turing.
 
The "nVidia ACE" demo was built with UE5 and RTXDI:

Jin and his Ramen Shop scene were created by the NVIDIA Lightspeed Studios art team and rendered entirely in Unreal Engine 5, using NVIDIA RTX Direct Illumination (RTXDI) for ray traced lighting and shadows, and DLSS for the highest possible frame rates and image quality.
 
Very nice for an easily controllable small space where it's easy to keep the BVH from exploding in size. Now let's see them do something similar with a large, highly detailed open world with lots of animated NPCs and objects, without the RT requiring a simplified representation (to keep the BVH manageable), which potentially causes mismatches between the RT and the actual geometry. Hell, even without the NPCs and objects.

Would certainly love to see some form of point and click adventure game with those graphics, however. :)

Regards,
SB
 
2023: 'Don't play games at launch. They're not finished yet.'
2026: 'Don't play games at launch. They are so crowded, the publishers' servers running the language models can't keep up, so NPCs take a minute to respond.'

Or can this run on a client GPU as well? How large is the data such language models need to work?
I saw some Skyrim + ChatGPT + speech synthesis stuff before, but I guess most of that was streamed from servers.
Current 13B open-source models like Manticore Chat, WizardLM Uncensored etc. are coherent enough to pull this off nicely, but they take a lot of resources even quantized. If you want to run one on the GPU, you're looking at around 12 GB of VRAM for full context (2048 tokens). You can use CPU offloading, though, so it runs at decent speeds even on weaker hardware. But as you can see, LLMs alone already push the hardware to its limits; once you add natural-sounding text-to-speech and a whole game to render, it almost becomes an impossible feat.
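For a rough sense of where a figure like 12 GB comes from, here's a back-of-envelope estimate (a sketch with assumed numbers: 4-bit weights, an fp16 KV cache, and a LLaMA-13B-like layer shape; real runtimes differ):

```python
# Rough VRAM estimate for a quantized 13B model; every number is an assumption.
def vram_estimate_gb(params=13e9, bits_per_weight=4,
                     layers=40, hidden=5120, ctx=2048):
    weights = params * bits_per_weight / 8     # quantized weight storage (bytes)
    kv_cache = 2 * layers * ctx * hidden * 2   # K and V per layer, fp16 (2 bytes)
    overhead = 1.5e9                           # activations, buffers (rough guess)
    return (weights + kv_cache + overhead) / 1e9

print(f"{vram_estimate_gb():.1f} GB")  # about 10 GB, in the same ballpark
```

The weights dominate; the KV cache grows linearly with context length, which is why longer contexts push the requirement up.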

Thankfully hardware will get stronger. Maybe in a couple of years.
 
Discussion of AI tech moved here.

Remember, this thread is for talking about UE5 as an engine, not every tool that plugs into or runs on it.
 
With all the hype about Nanite, does it actually work with destruction and terrain deformation? That's not something I've seen talked about. I know for the Matrix demo, when car collisions occurred, Nanite wasn't used? For me, if it doesn't work with deformation and destruction, it's entirely useless for the type of games I want to play. I and many others are tired of the set-dressing worlds in games from the PS4/XB1 generation.
 
With all the hype about Nanite, does it actually work with destruction and terrain deformation? That's not something I've seen talked about. I know for the Matrix demo, when car collisions occurred, Nanite wasn't used? For me, if it doesn't work with deformation and destruction, it's entirely useless for the type of games I want to play. I and many others are tired of the set-dressing worlds in games from the PS4/XB1 generation.
It does -- it absolutely works for destruction (most destruction is static fragments that need to fly around and perform well when there are lots on screen occluding each other -- perfect for Nanite). And in recent UE versions they added support for what Unreal Engine calls "World Position Offset" (WPO), which deforms the mesh in the vertex shader and is what's used for most deformation effects. I understand there's some kind of performance cliff to using WPO on Nanite, but I don't know the details.
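Conceptually, WPO is just a per-vertex offset evaluated in the material. A toy sketch of the usual sine-based sway (plain Python illustrating the math, not actual UE material code; all the constants here are made up):

```python
import math

# Toy World-Position-Offset-style sway: offset each vertex along `direction`
# by a sine wave phased by its position, the way wobble materials usually do.
def wpo_sway(pos, time, amplitude=0.1, frequency=2.0, direction=(1.0, 0.0, 0.0)):
    phase = sum(p * d for p, d in zip(pos, direction))  # vary phase across mesh
    s = amplitude * math.sin(frequency * time + phase)
    return tuple(p + s * d for p, d in zip(pos, direction))

v = wpo_sway((1.0, 2.0, 0.0), time=0.25)  # vertex nudged along the x axis
```

The engine evaluates this per vertex, per frame, in the vertex shader, so the source mesh data never changes.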
 
I know for the matrix demo, when car collision occurred, nanite wasn’t used? For me, if it doesn’t work with deformation and destruction, it’s entirely useless for the type of games I want to play. I and many others are tired of the set dressing worlds in games during the ps4xb1 generation.
The Matrix solution doesn't really preclude your desires here. At any given time there will still be a lot more "intact" geometry than damaged/destroyed geometry, and it's really an okay solution to render the smaller set with a different technique if needed. As noted, there is more flexibility on deformation than there used to be in Nanite, so it could now be done with World Position Offset, but you wouldn't want that on every instance either. Fortnite uses something similar: most stuff renders via the fast Nanite path, and WPO is only enabled selectively when objects need to wobble or be deformed (see the FN Nanite blog for more details).

And that really gets at the main point: no matter how you do the deformation, you still need instancing to keep these large worlds feasible at all. While it's easy to say "I want a world of millions of objects, all of which can be deformed arbitrarily *at the same time*", even back-of-the-envelope math will tell you there's no way your PC has enough RAM for that. If too much stuff gets destroyed and deformed in some sort of open-world physics sandbox, then triangles are probably not the right representation either (that's where all the voxel demos end up going, heh).
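That back-of-the-envelope math looks something like this (a sketch; every figure is a made-up but generous assumption):

```python
# Memory cost if every instance needs its own uniquely deformed copy of the
# geometry vs. one shared instanced copy. All figures are illustrative.
objects = 1_000_000          # "millions of objects"
verts_per_object = 10_000    # modest for Nanite-class detail
bytes_per_vert = 12          # 3 x float32 position only; normals/UVs ignored

unique_gb = objects * verts_per_object * bytes_per_vert / 1e9
shared_gb = verts_per_object * bytes_per_vert / 1e9

print(unique_gb)   # 120.0 -- GB of positions alone, far beyond consumer RAM
print(shared_gb)   # 0.00012 -- GB for a single instanced copy
```

Instancing is the six-orders-of-magnitude difference between those two numbers, which is why arbitrary per-instance deformation everywhere doesn't fit in memory.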

To summarize though, there's nothing really in the use of Nanite that precludes destructible stuff any more than without it. Both the Matrix solution and Fortnite work fine in practice.
 
The Matrix solution doesn't really preclude your desires here. [...] there's nothing really in the use of Nanite that precludes destructible stuff any more than without it.

It does -- it absolutely works for destruction [...] I understand there's some kind of performance cliff to using wpo on nanite, but not the details.

I'll need to look more into world position offset.
 
The Matrix solution doesn't really preclude your desires here. [...] no matter how you do the deformation, you still need instancing to keep these large worlds feasible at all.

This is why you need low-detail proxy meshes for animation now, not just physics. This solution, limiting what can be animated and by how much, makes animation worse generation over generation. Yet animation is the one clear area where realtime hasn't eclipsed older CG features like The Incredibles. We've done better than that in realtime in everything but animation, which still looks primitive in comparison.

The RAM limitations, among other things, are a clear bottleneck. But animation still needs to be pushed forward, not limited even more for the sake of marching toward ever-diminishing returns on visuals like geometry detail.

Proxy meshes offer a workaround for this. Tetrahedron meshes, used as cages, offer an interesting prospect of simplifying transforms and reducing RAM usage while manipulating ultra-high-detail models for animation at the same time. Considerations should definitely be made before every game starts looking like Cyberpunk, where you can crank the lighting detail ever higher, and with each tick up the poor animation looks ever more obvious, stilted, and out of place.
 
Proxy meshes offer a workaround for this. Tetrahedron meshes, used as cages, offer an interesting prospect of simplifying transforms and reducing RAM usage while manipulating ultra-high-detail models for animation at the same time. Considerations should definitely be made before every game starts looking like Cyberpunk, where you can crank the lighting detail ever higher, and with each tick up the poor animation looks ever more obvious, stilted, and out of place.
Skeletal rigs (by the nature of being manually placed and weighted per vertex, and having triangular meshes rather than some kind of magic raymarching equation) already offer much more precision and control as proxies. Games are limited because it's expensive to deform vertices at the volume we need at realtime framerates; we need to do one of two things: either store the deformation beforehand (which costs storage and bandwidth) or animate a proxy and deform at runtime (which costs compute).
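The "animate a proxy and deform at runtime" route is classic linear blend (matrix palette) skinning. A minimal sketch, using tiny 2D rigid transforms for brevity (real engines do this per vertex on the GPU, in 3D, usually with four or more bone influences):

```python
import math

# Minimal linear blend skinning: each vertex is moved by a weighted sum of
# its bones' transforms. 2D rigid transforms keep the example small.
def bone_transform(angle, tx, ty):
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, tx), (s, c, ty))      # 2x3 rotation + translation

def apply(m, v):
    x, y = v
    return (m[0][0]*x + m[0][1]*y + m[0][2],
            m[1][0]*x + m[1][1]*y + m[1][2])

def skin(vertex, bones, weights):
    # weights are assumed pre-normalized (sum to 1), as an exporter would ensure
    moved = [apply(b, vertex) for b in bones]
    return (sum(w * p[0] for w, p in zip(weights, moved)),
            sum(w * p[1] for w, p in zip(weights, moved)))

bones = [bone_transform(0.0, 0.0, 0.0), bone_transform(math.pi / 2, 1.0, 0.0)]
print(skin((1.0, 0.0), bones, [0.5, 0.5]))  # -> (1.0, 0.5)
```

The per-vertex cost is a handful of multiply-adds per bone influence, which is exactly why it's the compute-side option in the trade-off above.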
 
The ram limitations, among other things, are a clear bottleneck. But animation still needs to be pushed forward, not limited even more for the sake of marching into ever more limited return on investment visuals like geometry detail.

Proxy meshes offer a workaround for this. Tetrahedron meshes, used as cages, offer an interesting prospect of simplifying transforms and reducing RAM usage while manipulating ultra-high-detail models for animation at the same time.
Cage deformers are interesting, but for character animation they fail to address the real problem, much like volumetric offline skin simulation methods using tetrahedralization. The deformation may be smooth across cage faces, and it may even preserve volume, but it still doesn't simulate the underlying bone and muscle movement and the sliding of skin needed for anatomically correct results.
I don't think offline rendering has solved this either. At least not in Disney / Pixar movies, where they get away with non-realistic results due to the cartoon art style.
There are good results coming from full-body simulations. But it's not only expensive to run; it's mainly the huge amount of setup work that makes it impractical. Modeling all the bones and muscles and understanding how the various joints work in detail keeps you busy for months, if not years.
Currently ML methods promise a practical solution, but to me this feels more like a minor improvement with questionable accuracy and performance.

I agree the state of the art is terrible, and it's embarrassing we keep failing to get this right. Artists came up with workarounds using a good set of extra bones, but the results are acceptable only because there is nothing better to compare against, and the failure cases on acrobatic poses are still more monstrosity than human.
Maybe I'm more sensitive to this than the average person, but I doubt it, because we are all used to paying close attention to anatomical correctness.

That's also one of the first problems I tried to address, again and again over the years. Currently my solution is a kind of deformation patch made by decoupling the twist and swing motion of joints. It works pretty well. It can do sliding and volume preservation based on artist setup, which is some work, but not too much if I create proper tools. Runtime cost is similar to matrix palette skinning.
But I still want bone and muscle collisions as well for details, so I need to do the tedious work of setting up virtual Adam and Eve base characters. Finally there exists some proper reference data: https://www.z-anatomy.com/
So I could get started. But I have to finish some other things first, so probably I'll be dead before I get back to this... :D
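For anyone curious, the twist/swing decoupling mentioned above typically starts from the standard swing-twist decomposition of a joint's rotation quaternion. A sketch (quaternions as (w, x, y, z) tuples; assumes a unit quaternion and a non-degenerate twist about the given unit axis):

```python
import math

# Decompose a unit quaternion q into q = swing * twist, where `twist` rotates
# purely about `axis` and `swing` tilts the axis itself.
def swing_twist(q, axis):
    w, x, y, z = q
    d = x*axis[0] + y*axis[1] + z*axis[2]      # project vector part onto axis
    twist = (w, d*axis[0], d*axis[1], d*axis[2])
    n = math.sqrt(sum(c*c for c in twist))     # renormalize the projection
    twist = tuple(c / n for c in twist)
    # swing = q * conjugate(twist)
    tw, tx, ty, tz = twist[0], -twist[1], -twist[2], -twist[3]
    swing = (w*tw - x*tx - y*ty - z*tz,
             w*tx + x*tw + y*tz - z*ty,
             w*ty - x*tz + y*tw + z*tx,
             w*tz + x*ty - y*tx + z*tw)
    return swing, twist

q = (0.5, 0.5, 0.5, 0.5)   # 120 degrees about normalized (1, 1, 1)
swing, twist = swing_twist(q, (0.0, 0.0, 1.0))
```

For this example the twist comes out as a 90-degree rotation about z and the swing as a 90-degree rotation about y, and multiplying them back together reproduces q.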

All that is to point out that the main problem here can't be solved with some fancy deformation method alone, and runtime performance is not the main problem either.
The first problem is how to model the complexity of the human body with acceptable effort.

Besides this, I don't see any other animation issues in games. We can easily animate mechanical machines made from rigid bodies. We could improve foliage waving in the wind with large-scale fluid simulation if desired (which I doubt).
Hair and cloth are also just performance problems, and not too difficult if we can afford the cost.

either store the deformation beforehand (which costs storage and bandwidth) or animate a proxy and deform at runtime (which costs compute).
Regarding compute costs, it's worth mentioning that they now increase if we need some kind of acceleration structure, e.g. the BVH used for RT or Nanite.
Personally I do not understand why Nanite does not support deformation. The easiest way would be to make the bounding volume per node large enough that it still bounds all potential animation.
After that, node positions just need to be deformed like the mesh vertices, and there is no need to update parent bounds by looking at their child nodes, which would create an expensive barrier per tree level.
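The "bound all potential animation" idea amounts to conservatively dilating each node's AABB by the maximum displacement the deformation can ever produce; culling stays valid because the bounds never under-estimate, at the cost of being looser. A minimal sketch of that idea (the poster's proposal, not how Nanite actually works):

```python
# Conservatively grow a node's AABB so it still bounds the mesh under any
# deformation whose per-vertex displacement never exceeds `max_offset`.
def dilate_aabb(mins, maxs, max_offset):
    return ([m - max_offset for m in mins],
            [m + max_offset for m in maxs])

lo, hi = dilate_aabb([0.0, 0.0, 0.0], [1.0, 2.0, 1.0], max_offset=0.25)
```

Precomputing this once per node avoids the bottom-up parent-bound refit (and its per-level barrier) entirely, in exchange for slightly more conservative culling.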

But I also don't think missing deformation support is a big limitation for Nanite. Starting with rigid transforms is reasonable and makes sense, and I guess they'll keep lifting limitations over time.
 
Personally i do not understand why Nanite does not support deformation. The easiest way would be to make the bounding volume per node large enough it still bounds all potential animation.
Good post, I agree with most everything overall. On this point, I suppose you could recreate vertex-shader GPU skinning with WPO? Not sure how it would perform; it would be a good worst case/ceiling. I suspect that (a) handling skin weight data across the whole mesh and between cluster seams is somehow annoying or too expensive, and (b) needing to run skinning breaks the fast paths for culling (... oh, maybe that's the limitation for WPO on Nanite? I still need to look at the docs!)

One other thing I disagree with is your concern about the labor cost of rigging up muscle deformers: any AAA game easily has the resources for character artists and riggers to do that, and we've seen mediocre end results from similar up-front efforts in games before. The limit to shipping something film-quality is still perf and storage, although I agree with your overall thrust that there is tons of room to improve the state of the art and let studios with budgets less than ~2 million per character do the same.
 
Just got this UE5 game to try, called Desordre.

It uses everything UE5 offers: Nanite, Lumen, TSR and VSM.

All of the walls and small cranks are pure geometry using Nanite, and the reflections and lighting look gorgeous.

Ultra settings at native 1440p runs above 60 fps, and with DLSS balanced mode it's locked at 120 fps (the game supports DLSS, TSR, FSR and XeSS).
 

The reality is that most games are never going to have destructibility on the level some people imagine. There is just too much for the engine to calculate, and it's an entirely different problem compared to back when games were much simpler and assets were easier to deform at low cost.

With this in mind, we already have examples of UE5 games with large-scale destruction systems. Not everything needs to use Nanite to be destroyed in the game itself.
 