Unreal Engine 5, [UE5 Developer Availability 2022-04-05]

My video should be going live today at 17:00, and at one point I mention a visual artefact with Nanite that I have seen no one else post about before - I do wonder if it is a feature of how Nanite functions or if it is just an issue in the current EA version of UE5. I would be curious to hear what people think could be the cause! Essentially there is some shuffling of the Nanite geometry when the camera changes position, not anything like tessellation boiling or a discrete LOD shift, but more as if the world pieces shuffle into place as the camera comes to a rest.
Good video, thanks for posting!

Watching the video I think there are two things, both fairly minor. 1) The heat haze effect obviously makes things warble a little. You can disable it in CFR_PostProcessVolume_Global here if you haven't already worked out how:
[Attached screenshot: upload_2021-6-10_22-7-38.png]

The second I think Guillaume already responded to you about on Twitter - for everyone else here, it's nothing to do with Nanite; it's a precision issue in the velocity buffer.

Indeed if you disable TSR in the demo it doesn't seem to happen here at least.
 
Do you envision a time in the near future where a software renderer's flexibility will de facto trump any efficiency gained from the mesh/primitive shader path (like mesh/task shaders beating the input assembler and tessellator)?
So as Brian noted in his presentation, the consoles already use something more akin to a mesh shader path for the hardware rasterized triangles and while it's an improvement, the software rasterizer still wins for small triangles. The line of where it throws things at one or the other is fully configurable (via CVar) so you can go play with it yourself if you want. As relative efficiencies on different platforms adjust it's easy enough to adjust the optimal point, but at least so far on all platforms the software rasterizer wins for small triangles by a pretty large margin. This shouldn't be terribly surprising honestly... it's not so much "hardware vs software" here as when you get to small triangles you really need to use a different rasterization algorithm. Things like heavy primitive assembly and triangle setup and stamps no longer make sense if you are only drawing a few pixels. Hardware may one day take this back over (and it would be great if it did! There are always more uses for freeing up flops!), but I can't see that fundamental constraint changing any time soon. Making an efficient rasterizer for all the different sizes of triangles is going to have to branch at certain sizes.
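
A rough illustration of that branch point (a hypothetical C++ sketch, not Nanite's actual code): once a triangle covers only a few pixels, you can drop tiles, stamps and most per-triangle setup and simply test every pixel in its tiny screen bounding box with edge functions, so the cost tracks the handful of covered pixels instead of the fixed overhead.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical example types/names; this is not engine code.
struct Vec2i { int x, y; };

// Signed parallelogram area of (a->b, a->c); the sign says which side of
// edge a->b the point c lies on.
inline int64_t EdgeFunction(Vec2i a, Vec2i b, Vec2i c) {
    return int64_t(b.x - a.x) * (c.y - a.y) - int64_t(b.y - a.y) * (c.x - a.x);
}

// Brute-force rasterization of a single small triangle (assumed front-facing
// with positive edge functions; subpixel precision and fill rules omitted).
// For a triangle touching a few pixels the loop body runs only a handful of
// times, which is exactly why skipping heavyweight setup pays off.
void RasterizeSmallTriangle(Vec2i v0, Vec2i v1, Vec2i v2,
                            std::vector<uint32_t>& framebuffer,
                            int width, int height, uint32_t payload) {
    const int minX = std::max(0,          std::min({v0.x, v1.x, v2.x}));
    const int maxX = std::min(width - 1,  std::max({v0.x, v1.x, v2.x}));
    const int minY = std::max(0,          std::min({v0.y, v1.y, v2.y}));
    const int maxY = std::min(height - 1, std::max({v0.y, v1.y, v2.y}));

    for (int y = minY; y <= maxY; ++y) {
        for (int x = minX; x <= maxX; ++x) {
            const Vec2i p{x, y};
            // Inside test: covered if on the positive side of all three edges.
            if (EdgeFunction(v0, v1, p) >= 0 &&
                EdgeFunction(v1, v2, p) >= 0 &&
                EdgeFunction(v2, v0, p) >= 0) {
                framebuffer[size_t(y) * width + x] = payload;
            }
        }
    }
}
```

A hardware pipeline amortizes its setup across many pixel quads/stamps; when a triangle touches only one or two of them, that fixed cost dominates, which is the crossover the CVar-controlled threshold is balancing.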

I wonder why it's so heavy; the PS5 was a rock-steady 1440p30 in the first iteration, and now a 2070 Super struggles at 1080p.
This has been said a bunch of times but bears repeating again... the Ancient scene is extremely heavy and probably not representative of what a game would do. In Brian's talk they go over in a fair amount of detail how they wanted to see how far they could push the kitbashing thing and learned a lot in the process. It's sort of amazing that it works as well as it does, but in a shipping game you would not have dozens of layers of geometry intersecting in a small area like the demo does.
 
I noticed that UE5 is using proprietary driver extensions on D3D12, but is there a possibility that the final release could use SM 6.6 atomic operations? Does the software rasterizer perform depth testing via AtomicOpMinU64, where the depth value is stored in the upper 32 bits and the colour information is stored in the lower 32 bits?
 
So how much worse will games look with decent performance and many NPCs, enemies, etc.? I'm still hyped for UE5, but performance at 1080p upscaled to 4K is a little worrying (is the quality of the demo sustainable in a real game environment?).
 

This last demo is heavily unoptimized from a content point of view. Game content will probably be more like the first demo.

And they can make tons of performance improvements on the Lumen side, and probably some with Nanite too; adding the missing features (skinned meshes and transparent materials) will help with performance as well.

This is a work in progress.
 

Thank you for the response.
 
Maybe Epic should make Lumen in the Land of Nanite available, so people can compare the performance for themselves.
 

I think it's too early for people to worry about benchmarking. It's not production ready yet, and not fully optimized. I think there are a select few AAA devs working with it right now, but they have resources beyond the average UE user. For the rest of us I think it's better to just wait, because we have no idea how much performance could improve by the time it's production ready.
 
I noticed that UE5 is using proprietary driver extensions on D3D12, but is there a possibility that the final release could use SM 6.6 atomic operations?
Yes, we'll use the SM6.6 stuff when broadly available (both in drivers and once the SM6 path in Unreal has stabilized a bit more). Some might guess that it's not a coincidence that this made it into SM6.6 in the first place ;)
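
For anyone curious what "broadly available" has to look like on the application side, here is a minimal sketch (illustrative requirements only, error handling trimmed; not what Unreal actually checks) of querying a D3D12 device for Shader Model 6.6 and the 64-bit atomic capability bits:

```cpp
#include <d3d12.h>

// Assumes 'device' is a valid ID3D12Device* created elsewhere and a recent
// Windows SDK so that D3D_SHADER_MODEL_6_6 and OPTIONS9 are defined.
bool SupportsSM66Atomics(ID3D12Device* device) {
    // Ask for the highest shader model; the runtime lowers it (or fails) if
    // SM 6.6 is unknown or unsupported.
    D3D12_FEATURE_DATA_SHADER_MODEL shaderModel = { D3D_SHADER_MODEL_6_6 };
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_SHADER_MODEL,
                                           &shaderModel, sizeof(shaderModel))) ||
        shaderModel.HighestShaderModel < D3D_SHADER_MODEL_6_6) {
        return false;
    }

    // 64-bit atomic capability bits (typed UAVs / groupshared) live in
    // OPTIONS9; descriptor-heap-resource atomics are a separate OPTIONS11
    // bit not checked here.
    D3D12_FEATURE_DATA_D3D12_OPTIONS9 options9 = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS9,
                                           &options9, sizeof(options9)))) {
        return false;
    }
    return options9.AtomicInt64OnTypedResourceSupported &&
           options9.AtomicInt64OnGroupSharedSupported;
}
```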

Does the software rasterizer perform depth testing via AtomicOpMinU64, where the depth value is stored in the upper 32 bits and the colour information is stored in the lower 32 bits?
Yes, although it's visibility buffer info in the low bits, not colour per se. We write to the visibility buffer with atomics even when using the hardware rasterizer (for a variety of reasons), and sometimes overlap HW/SW rasterization with async compute. (I think that is currently disabled on PC, but it's not super critical to performance either way.)
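
To make the packing concrete: because the depth sits in the upper 32 bits, a single unsigned 64-bit min is decided by depth first, and whatever visibility payload rides in the lower 32 bits comes along with the winning fragment. A CPU-side C++ sketch of the idea (field layout and names are assumptions for illustration, not the engine's exact format; on the GPU this is one 64-bit atomic min, emulated here with a compare-exchange loop since std::atomic has no fetch_min before C++26):

```cpp
#include <atomic>
#include <cstdint>

// High 32 bits: depth encoded as an ordered uint (assuming smaller == closer).
// Low 32 bits: visibility-buffer payload (e.g. instance/triangle IDs).
inline uint64_t PackFragment(uint32_t depthBits, uint32_t visibility) {
    return (uint64_t(depthBits) << 32) | visibility;
}

// Keep the closest fragment and its payload with a single logical atomic min.
inline void AtomicDepthTest(std::atomic<uint64_t>& pixel, uint64_t candidate) {
    uint64_t current = pixel.load(std::memory_order_relaxed);
    // Retry until we either observe a closer fragment already stored or we
    // successfully swap ours in; compare_exchange_weak refreshes 'current'
    // on failure.
    while (candidate < current &&
           !pixel.compare_exchange_weak(current, candidate,
                                        std::memory_order_relaxed)) {
    }
}
```

Reading back is just the reverse: depth is `value >> 32` and the visibility payload is the low 32 bits.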

So how much worse will games look with decent performance and many NPCs, enemies, etc.? I'm still hyped for UE5, but performance at 1080p upscaled to 4K is a little worrying (is the quality of the demo sustainable in a real game environment?).
I wouldn't try and speculate too much on what games will look and perform like at this point. The demos are meant to inspire and show people what is possible, but I have no doubt game teams will find their own balances, and in many cases look even better than these demos.

There's no fundamental reason you can't have something that looks the same as the Ancient demo but perform quite a bit better (arguably see last year's demo); it's really more a tradeoff of how the content is *built* than anything. Kit-bashing in the manner done in the demo is neat and convenient to produce a lot of variation quickly from a small set of megascans, but in a production game you'd likely want to either optimize the result further (i.e. remove a lot of the stuff that will never be seen), or build things in a somewhat different way to avoid the performance costs. People are just starting to experiment with how best to use the tech and I imagine in a year we'll have much better ideas around building efficient content pipelines for UE5.

Honestly I'm impressed at a bunch of the stuff the community has thrown together very quickly and with early access code, and really excited to see what people do in the future!
 
I really need to get through my Steam backlog. After a few years of Nanite / mesh shaders / GI everywhere older games will become unplayable. They will look ridiculously flat in comparison.

Tell me about it. I'm getting genuinely concerned about this. And I have so many fricking games to get through!!
 

A verified artist on Era, who I suppose works on a game using UE5, said games will look as good as or better than the demo.
 

Nice! Although the 64-bit atomics limit you to two vendors, since others such as Intel or Apple virtually don't support the feature ...
 
I meant benchmarks between last year's and this year's demos.

To be honest, I doubt last year's demo running on PS5 would be possible on PC right now, as PC lacks DirectStorage. The demo, especially in the last segment, was very likely using the NVMe SSD and its hardware decompressor to stream data in and out fast.

When using the drone in the current demo at high speed, it stutters heavily, despite having much less asset variety than what was shown in last year's demo.
 
Ugh... this again... smh

And again, you've been told... by Epic employees right in this very thread... that with the amount of overdraw and the disregard for optimization in this new PC demo vs the PS5 demo, it's even MORE demanding.

We can actually monitor storage usage as well... There are no bandwidth constraints happening here...
 
I must have missed something. I only know Nanite was 2x more demanding than in last year's demo, but that doesn't say much, as Nanite is a fraction of the overall GPU rendering cost, I would imagine.

Last year's demo had much wider asset variety and actual physics. This year's reuses a ton of assets; it's pretty obvious. A ton of stuff was happening in last year's demo while streaming data in and out ultra fast, and again with much higher asset variety and probably better, more consistent textures as well. This year's demo is impressive, but it doesn't hold a candle to the old one. Maybe we wouldn't be having this discussion if Epic weren't so secretive about that demo and would just release it.

Yes, I've monitored it as well. While using the drone at high speed, my GPU is not close to being maxed out and yet it still stutters and frames drop. VRAM, RAM and CPU usage aren't maxed out either, and my NVMe SSD is not even being used. So IDK where the bottleneck is.
 