Unreal Engine 5, [UE5 Developer Availability 2022-04-05]


Interesting UE5 talk here.

The performance target thing worries me A TON though. 1080p 30 with Hardware RT Lumen on next-gen consoles. That's not acceptable, even with their new upsampler.

...Just why? I thought hardware RT was faster than software RT at the same quality settings? Why not combine Medium Lumen + Hardware RT to get crazy good performance? It looks more than good enough. Or is triangle-based ray tracing really that inefficient, and their software solution simply faster?

I can't wrap my head around this. This is going to destroy performance on RT-capable cards; instead of accelerating the effects like it should, it cranks up the settings to no end even though most people won't be able to tell the difference. I don't care if some effects are still in screen space. If I want 60 FPS on a low-end RT card, I basically have to leave its RT cores unused. Which is a shame. At least give us the option to use HW acceleration to make Lumen run faster...
.......Or maybe, just maybe, this is an engine in early access and not indicative of the final performance when it eventually launches. Not to mention that UE is an ever-evolving platform these days.......
 

Interesting UE5 talk here.

The performance target thing worries me A TON though. 1080p 30 with Hardware RT Lumen on next-gen consoles. That's not acceptable, even with their new upsampler.

...Just why? I thought hardware RT was faster than software RT at the same quality settings? Why not combine Medium Lumen + Hardware RT to get crazy good performance? It looks more than good enough. Or is triangle-based ray tracing really that inefficient, and their software solution simply faster?

I can't wrap my head around this. This is going to destroy performance on RT-capable cards; instead of accelerating the effects like it should, it cranks up the settings to no end even though most people won't be able to tell the difference. I don't care if some effects are still in screen space. If I want 60 FPS on a low-end RT card, I basically have to leave its RT cores unused. Which is a shame. At least give us the option to use HW acceleration to make Lumen run faster...
Simply put, on consoles we will more often see the software version of Lumen, or modes built around it. Also, there are no dedicated RT cores on RDNA2, just fixed-function intersection hardware in the TMUs, and using it reduces texture fillrate.
 

Interesting UE5 talk here.

The performance target thing worries me A TON though. 1080p 30 with Hardware RT Lumen on next-gen consoles. That's not acceptable, even with their new upsampler.

...Just why? I thought hardware RT was faster than software RT at the same quality settings? Why not combine Medium Lumen + Hardware RT to get crazy good performance? It looks more than good enough. Or is triangle-based ray tracing really that inefficient, and their software solution simply faster?

I can't wrap my head around this. This is going to destroy performance on RT-capable cards; instead of accelerating the effects like it should, it cranks up the settings to no end even though most people won't be able to tell the difference. I don't care if some effects are still in screen space. If I want 60 FPS on a low-end RT card, I basically have to leave its RT cores unused. Which is a shame. At least give us the option to use HW acceleration to make Lumen run faster...

S S D
 
I can't wrap my head around this. This is going to destroy performance on RT-capable cards; instead of accelerating the effects like it should, it cranks up the settings to no end even though most people won't be able to tell the difference.
Accelerating which effects precisely?
Adding RT to previous-gen rendering will pretty much always incur a performance hit, because you're in fact adding RT rather than accelerating something that was already present.
And if you want that RT to actually be visible and influential on the overall IQ, then you have to add it in large amounts, which also means a significant hit on performance compared to rendering without it.

Another thing to keep in mind here: UE5 titles most likely won't come out sooner than 2023, which means they will launch alongside next-gen PC GPUs, so having some additional engine scalability beyond something targeting PS4/XBO, or even Turing/RDNA1, can only be a good thing.
 

Interesting UE5 talk here.

The performance target thing worries me A TON though. 1080p 30 with Hardware RT Lumen on next-gen consoles. That's not acceptable, even with their new upsampler.

...Just why? I thought hardware RT was faster than software RT at the same quality settings? Why not combine Medium Lumen + Hardware RT to get crazy good performance? It looks more than good enough. Or is triangle-based ray tracing really that inefficient, and their software solution simply faster?

I can't wrap my head around this. This is going to destroy performance on RT-capable cards; instead of accelerating the effects like it should, it cranks up the settings to no end even though most people won't be able to tell the difference. I don't care if some effects are still in screen space. If I want 60 FPS on a low-end RT card, I basically have to leave its RT cores unused. Which is a shame. At least give us the option to use HW acceleration to make Lumen run faster...
I think Lumen supports hardware RT already, but the insane amount of polygons probably ends up harming performance more than the hardware helps.

If the geometry of the scene scales with the camera (in order to achieve 1:1 pixel density), you need to recreate the acceleration data structures a lot more often than you would in a non-Nanite game in order to maintain accuracy. Not to mention that the higher overall scene complexity might mean building those structures takes longer too.
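A rough sketch of what that means per object per frame, assuming a hypothetical bit of renderer-side bookkeeping (this is not UE5 or DXR code): an acceleration structure can only be refit in place while the triangle set stays the same, and any change in the selected cluster/LOD set forces a full rebuild, which camera-driven geometry selection triggers far more often than a traditional discrete-LOD game would.

#include <cstdint>

// Hypothetical per-mesh state tracked between frames.
struct MeshState {
    uint32_t lastBuiltLodHash; // hash of the cluster/LOD selection used for the last BVH build
};

enum class BvhUpdate { None, Refit, Rebuild };

BvhUpdate ChooseUpdate(const MeshState& s, uint32_t currentLodHash, bool verticesAnimated)
{
    if (currentLodHash != s.lastBuiltLodHash)
        return BvhUpdate::Rebuild; // triangle set changed: a refit is no longer valid
    if (verticesAnimated)
        return BvhUpdate::Refit;   // same topology, vertex positions moved: cheap in-place update
    return BvhUpdate::None;        // nothing changed: reuse last frame's acceleration structure
}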

On the other hand, with highly scalable geometry it could also be easier to pick just the right amount of detail that lets the hardware be used effectively without destroying performance. And games using lower LODs for RT are nothing out of the ordinary. But then there's the question: hardware RT lets you trace rays through the polygon scene faster than building/searching those data structures without the hardware, but is it faster than, for example, ray tracing in voxel space or within screen space? I think that may be a no. We moved past those methods because they had quality compromises, but together perhaps they can be good enough (given limitations like the lack of mirror-like reflections)? Or at least until hardware RT is so fast that it can eat through all that geometry with no sweat.
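That "use the cheap methods until they fail" idea is basically a fallback chain. A minimal sketch, with the three tracers left as hypothetical stubs (illustrative only, not Lumen's actual implementation):

struct Ray { float origin[3]; float dir[3]; float tMax; };
struct Hit { bool valid = false; float t = 0.0f; };

Hit TraceScreenSpace(const Ray&) { return {}; } // cheapest, but misses anything off-screen or occluded
Hit TraceGlobalSdf(const Ray&)   { return {}; } // coarse but robust world-space trace (no mirror-sharp hits)
Hit TraceHardwareBvh(const Ray&) { return {}; } // precise triangle RT, the most expensive step

// Try the cheap methods first and only fall back to hardware triangle RT for
// rays the other methods could not resolve.
Hit TraceHybrid(const Ray& ray)
{
    Hit hit = TraceScreenSpace(ray);
    if (hit.valid) return hit;
    hit = TraceGlobalSdf(ray);
    if (hit.valid) return hit;
    return TraceHardwareBvh(ray);
}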

Regardless of that, it does seem that right now this demo is too heavy for an actual game using all those features to be within the consoles' reach, not unlike UE4 when it was first presented. But who knows how much more optimized it will be when it launches, or in a few years' time. Either way, it's best to take this demo as a sneak peek at how games will look towards the end of the generation rather than something short term.
 
On consoles we are most likely going to get 1440p with Nanite, SW Lumen + TSR TAA. And real-time Lumen isn't strictly a necessity imo, at least the GI part; not every game needs to have dynamic lighting / GI.
 
The performance target thing worries me A TON though. 1080p 30 with Hardware RT Lumen on next-gen consoles. That's not acceptable, even with their new upsampler.

...Just why? I thought hardware RT was faster than software RT at the same quality settings? Why not combine Medium Lumen + Hardware RT to get crazy good performance? It looks more than good enough. Or is triangle-based ray tracing really that inefficient, and their software solution simply faster?
Did not watch the video, but to sum up the points already made above before eventually getting to why software can be faster:
1. UE5 tends to have lots of overlapping models / instances. With HW RT this means overlapping BLAS, and each of them has to be processed to find all ray hits; only after that do we know which object was hit first, so the rest of the traversal and intersection work is wasted.
With SDF tracing, the distance to the closest hit is already known on ray entry into each per-object SDF. So we can reject all objects but the closest one early and only trace the single object that will give us the closest hit (see the sketch below this list). That's a really interesting advantage of SDF over BVH.
To fix the problem, they could avoid overlapping multiple BLAS by merging all geometry into a single BLAS (or fewer of them). But this would not reduce the number of (mostly hidden) triangles, so it's still not optimal, and we would lose the option to save lots of memory by instancing one BLAS per object.
A better compromise might be to remove hidden triangles below the surface, which would be possible on the client using the SDF blocks. That breaks easy streaming, though, so it's some work and will take time. The downside is that with a full RT solution, having SDF blocks just for that purpose wastes storage and memory, so it would be better to mark hidden triangles per instance offline and use that smaller data instead of SDF blocks.
2. Current DXR does not allow continuous LOD like Nanite, so from the above we can only expect lower-poly proxies, which still need discrete LODs at least to keep things scalable (a small LOD-selection sketch follows at the end of this post).
Epic may try to request RT API improvements so they can work on a proper solution instead of hacks, which will take even more time.
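To make the early-reject idea from point 1 concrete, here is a minimal sketch assuming hypothetical types and a conservative per-object bound; it's not engine code, just the scheduling logic. Candidates are visited in order of their conservative entry distance, and once one of them produces a hit, every remaining object that starts farther away than that hit can be skipped without marching it at all. With overlapping BLAS, every candidate would have to be traversed before the closest hit is known.

#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical per-object SDF candidate along a ray (illustrative only).
struct ObjectSdf {
    float boundEntryT;            // ray distance at which the object's SDF volume is entered
    float (*march)(float entryT); // sphere-march inside the object; returns hit distance, or a negative value on miss
};

float TraceClosest(std::vector<ObjectSdf> candidates)
{
    // Visit the candidates nearest-first.
    std::sort(candidates.begin(), candidates.end(),
              [](const ObjectSdf& a, const ObjectSdf& b) { return a.boundEntryT < b.boundEntryT; });

    float closest = std::numeric_limits<float>::max();
    for (const ObjectSdf& obj : candidates) {
        if (obj.boundEntryT >= closest)
            break;                            // early reject: this object cannot beat the hit we already have
        float t = obj.march(obj.boundEntryT); // expensive march, done for very few objects in practice
        if (t >= 0.0f && t < closest)
            closest = t;
    }
    return closest; // max() means no hit
}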

So maybe the disappointing RT performance at the moment is a good thing, helping to improve RT APIs, which were designed without LOD in mind. To me, the cause of the failure clearly lies on the API side and isn't Epic's fault.
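Related to point 2 above: since the RT representation cannot follow Nanite's continuous LOD, the likely stopgap is an old-fashioned pick from a small set of discrete proxies per instance, e.g. by projected screen size. A minimal sketch with made-up names and thresholds (not UE5 or DXR code):

#include <algorithm>
#include <vector>

struct RtProxySet {
    std::vector<float> minScreenSize; // LOD k is used while projected size >= minScreenSize[k]; sorted finest to coarsest
};

int SelectRtLod(const RtProxySet& proxies, float boundingRadius,
                float distanceToCamera, float fovScale)
{
    // Rough projected size of the object's bounds on screen, a common LOD metric.
    float screenSize = (boundingRadius * fovScale) / std::max(distanceToCamera, 1e-3f);
    for (size_t lod = 0; lod < proxies.minScreenSize.size(); ++lod)
        if (screenSize >= proxies.minScreenSize[lod])
            return static_cast<int>(lod);                      // finest proxy whose threshold is met
    return static_cast<int>(proxies.minScreenSize.size()) - 1; // coarsest proxy (assumes at least one level)
}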
 
For some of the folks losing their minds about performance speculation, I think last week's Twitch stream is worth a listen (and this week and next week are likely to be interesting too). In particular, the bits about how the current demo was constructed, what the goals were and weren't, and why you may not want to use it as a proxy for the performance of a more crafted game experience. The screenshot in the Nanite document about the geometry in that demo is worth checking out... that sort of geometry is not just bad for RT, it's bad for Nanite and Lumen too.

(Specifically around 1:32:00)

Regarding RT, obviously everyone would love to be able to raytrace stuff at high performance. I imagine many hybrids are in everyone's futures. There are many problems here with the current hardware/APIs though that will need to evolve to be able to handle this level of geometry streaming and LODs efficiently. That's neither unexpected nor should be panic-inducing... DXR was very much designed as a first stab at the problem with the full knowledge up front that it was not going to be an ideal long term solution in several areas. But you can't figure out the right APIs and hardware until you start playing with it more broadly, which is exactly what is happening now. In the immediate term, the areas around BVH building, management and updates need to evolve significantly. Luckily those are largely in software at the moment meaning R&D can be done relatively efficiently.

Also in this day and age of power limits you need not worry about "leaving the RT cores empty/busy". Most parts of the chip are idle at any given time and are even designed to be... if you lit up everything at once that's a power virus. In terms of area/power, you should probably be more mad about the amount of tensor cores put on these consumer chips (yes, even with DLSS...), and the fact that Nanite is leaving the poor ROPs idle too :p
 