Unreal Engine 5, [UE5 Developer Availability 2022-04-05]

Yikes. It seems like they are using the software path for the High scalability setting in Lumen, so that it runs at 60 FPS on consoles and lower-end RTX GPUs. http://advances.realtimerendering.com/s2022/SIGGRAPH2022-Advances-Lumen-Wright et al.pdf

This is really, really bad and my worst fear come true. Console users and lower-end RTX owners will choose 60 FPS over 30 FPS almost any day, and that means the ray accelerators and RT cores sit idle in that mode, as it uses software RT instead of HW-RT.

What the fuck, Epic? Why are you neglecting HW-RT so badly? It's not like 4A Games proved that extremely fast AND beautiful RTGI using HW-RT is easily feasible, even on consoles and at 60 FPS. Maybe, just MAYBE, you should rethink your entire strategy for incorporating HW-RT into the engine.

Jesus Christ. The software path is supposed to be for legacy hardware, not your main path for getting acceptable performance in UE5. This really sucks.

And as most games in the future will use Lumen and UE5, millions of RTX and console gamers will have their RT cores and Ray accelerators idling instead of contributing to performance like they should.

HWRT is in there. It can always be used by a dev that thinks it's worth it.
 
That's not my point.

The point is that the software RT path is going to be the path optimized for 60 FPS on consoles and lower-end RTX GPUs (the High scalability setting). So if you prefer 60 fps over 30 fps with better graphics (which is almost everyone in this day and age), then your RT cores / ray accelerators are basically idling and wasting die space, which is not something such a modern engine should do. And again, Metro Exodus Enhanced Edition proves that superb RTGI using HW-RT is possible on a tight 16.6 ms budget, even on consoles.

I really don't know why they chose to do this. High scalability already runs quite well with HW-RT enabled, so IDK why they are getting rid of HW-RT in that mode instead of tweaking it further.
 
Just because the RT accelerators are not being used doesn't mean that performance will be worse overall, since the RT units only accelerate part of the pipeline. Software developers should care about optimising the ms needed to perform a given algorithm, not optimising "die space".
 
Sony and MS should have released consoles without RT support (RDNA2) then? Just because Epic wouldn't make use of it doesn't mean the rest of the world doesn't.
 
Just because the RT accelerators are not being used doesn't mean that performance will be worse overall, since the RT units only accelerate part of the pipeline. Software developers should care about optimising the ms needed to perform a given algorithm, not optimising "die space".
His point isn't about performance suffering but about part of the hardware going unused where it could be applied for better graphics.

Sony and MS should have released consoles without RT support (RDNA2) then? Just because Epic wouldn't make use of it doesn't mean the rest of the world doesn't.
Huh? Dampf is making a point about Epic's choices, not MS/Sony's. There's no complaint about the inclusion of HWRT. Quite the opposite: if the hardware is in there, does it not make sense to maximise the available hardware? If it's not being used, it's idle silicon.

The only logical counterargument to Dampf is an explanation of how the HWRT capabilities of the consoles will still be used in UE5 games running RT in software mode. If the hardware sits idle in games, his point is valid.
 
I meant that if Epic's/studios' choice is to go the 60 fps way, that's not the fault of the hardware/console/IHV etc. I sure hope the RT hardware isn't going to sit idle in so many games. Studios should adapt to the hardware that's there, I think. Which they'll probably end up doing anyway.
 
His point isn't about performance suffering but about part of the hardware going unused where it could be applied for better graphics.
Well, he said in his original post that: "console gamers will have their RT cores and Ray accelerators idling instead of contributing to performance like they should".

Does Lumen's GI solution look noticeably worse than the GI solution used in Metro Exodus? I feel that this needs to be demonstrated rather than assumed. And again, I don't see how die space is relevant. What matters is performance vs. quality. The fact a particular algorithm leaves a given GPU resource idling shouldn't be a relevant factor in and of itself. And as I understand it, GPU resources are left idling all the time and part of the difficulty in optimising ray tracing is precisely making sure that work is scheduled efficiently so the GPU is kept occupied.
 
Okay, so that we are all on the same page, I will enlighten you about my little outrage yesterday!

First, Epic is currently working to get Lumen running at 60 FPS on the consoles.


  • Improved performance optimizations in High scalability mode with the goal of achieving 60 fps on consoles

This High scalability mode will certainly be the one chosen by most console players and by RTX/RDNA2 PC players with console-level performance (2070 Super and below), as, I'm sure many of you know, 60 fps is the preferred way to play.

Previously, I thought that mode would be using hardware ray tracing as well, just optimized for a tighter frame budget. That makes sense: if you're running UE5 and turn scalability to High, it would still make use of hardware ray tracing.

However, Epic has now released this SIGGRAPH presentation about Lumen: http://advances.realtimerendering.com/s2022/SIGGRAPH2022-Advances-Lumen-Wright et al.pdf

On page 133 they actually state which path they are going to use for that High scalability mode in the future:

Projects that use Lumen have to choose which tracing method they’re going to use. Software Ray Tracing is the best choice for projects that need the absolute fastest tracing possible, like 60 frames per second on next generation consoles. Projects that have lots of overlapping meshes built using kitbashing should also use Software Ray Tracing, which was the case in our tech demos ‘Lumen in the land of Nanite’ and the ‘Valley of the Ancients’. Projects should use Hardware Ray Tracing if they need the absolute top quality possible, like Architectural Visualization. Projects should also use Hardware Ray Tracing if they need mirror reflections, like in ‘The Matrix Awakens’, or skinned meshes affecting the indirect lighting in a significant way

As you can see, they are planning to use software ray tracing for that High scalability mode, which renders HW-RT useless for the vast majority of gamers; as I've stated above, most gamers will prefer 60 fps over 30.

Of course the 30 fps "Epic" mode is still going to use HW-RT, as demonstrated by the Matrix demo, but let's be frank: that is not the mode players are going to use if a 60 FPS mode is available. That is why the RT hardware in many upcoming games (if not most, as so many developers have switched to UE5 and got rid of their own engines because of Lumen and Nanite) will be idling, and not only on consoles. On PC too, many people only have RTX/RDNA2 GPUs below 2070 Super level of performance, so they are going to use High scalability, which disables HW-RT.
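If you want to check which path a given project uses, the switches are exposed as renderer settings / console variables. A minimal sketch of the project-side config; the cvar names are as I understand them from UE5's renderer settings, and the scalability mapping in the note below is my assumption:

```
; DefaultEngine.ini (illustrative fragment, not a complete config)
[/Script/Engine.RendererSettings]
r.DynamicGlobalIlluminationMethod=1   ; 1 = Lumen
r.Lumen.HardwareRayTracing=1          ; 0 = software (SDF) tracing, 1 = HW ray tracing
```

At runtime the scalability level is driven by sg.GlobalIlluminationQuality (High should be 2 and Epic 3, if I have the mapping right), and my complaint is exactly that at High the effective path becomes the software one regardless of that cvar.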

Why is this a problem? Because it leaves super powerful hardware unused. The goal of an engine should be to squeeze every piece of silicon out of a GPU; that is why Nintendo games on older hardware still look great. Avatar: Frontiers of Pandora will actually do this: they are using a software approach for older hardware and then use HW-RT for faster performance (yes, this means their SW-RT solution will be triangle-based, so I am curious how this will run without HW-RT). Metro Exodus Enhanced Edition makes full use of hardware ray tracing, and its incredible RTGI solution runs at 60 fps on consoles. Why can't Epic do the same? Going back to UE4, they were always the ones tanking RT performance badly for little benefit, especially compared to other solutions like in Control, Metro, Marvel and many more.

If you really read between the lines, Epic is not that keen on HW ray tracing in general. They treat it as a way to get better lighting and reflections, specifically for ArchViz projects, but it's clear they want you to use the software path for actual games. The Matrix demo is the only demo making use of HW-RT, while there are already two for software RT ('Lumen in the Land of Nanite' and 'Valley of the Ancients'). And HW-RT is disabled by default EVEN on compatible hardware. It seems like their HW-RT solution gets bottlenecked hard as soon as scenes have a ton of overlapping meshes (which is basically any video game with rocks, foliage, etc.).

To be honest, I wouldn't be bothered so much by this if Epic didn't have such a monopoly right now. But nearly every game developer is switching to UE5 (even CDPR), so that's the way it's going to be for future games.
 
I think the big question is why software RT is faster than hardware RT. That's extremely counterintuitive and defeats the purpose of implementing acceleration in hardware!
 
They aren't, like, tracing against triangles in BVH trees (the thing the hardware accelerates) in software. The "software RT" path is marching against SDF representations instead.
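If it helps, here's a minimal CPU-side sketch of what that marching looks like: sphere tracing against a single analytic sphere, with hand-picked constants. Purely illustrative, not Lumen's actual code (which traces per-mesh distance fields plus a merged global distance field):

```cpp
#include <cmath>
#include <cstdio>

struct Vec3 { float x, y, z; };
static Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 mul(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static float len(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }

// One cheap distance query replaces testing the ray against every triangle:
// SDF of a unit sphere at the origin.
static float sceneSDF(Vec3 p) { return len(p) - 1.0f; }

// Sphere tracing: step along the ray by the distance to the nearest surface;
// the SDF guarantees a step of that size cannot overshoot geometry.
static bool trace(Vec3 origin, Vec3 dir, float maxT, float* hitT) {
    float t = 0.0f;
    for (int i = 0; i < 64 && t < maxT; ++i) {
        float d = sceneSDF(add(origin, mul(dir, t)));
        if (d < 1e-3f) { *hitT = t; return true; } // close enough: report a hit
        t += d;                                    // safe step size
    }
    return false;                                  // missed, or ran out of steps
}

int main() {
    float t = 0.0f;
    bool hit = trace({0, 0, -3}, {0, 0, 1}, 100.0f, &t);
    std::printf("hit=%d t=%.3f\n", hit, t); // expect: hit=1 t=2.000
    return 0;
}
```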
 
I think the big question is why software RT is faster than hardware RT. That's extremely counterintuitive and defeats the purpose of implementing acceleration in hardware!
I think in this case it is because they are not doing the same thing, and because console HWRT is "weak". Software RT is not tracing against the same complexity (or with the same precision) as the HWRT path. If they were equal in quality, I think we can imagine that HWRT would perhaps be the winning performance option there.

On console, the things actually accelerated are only a small part of the total frame, so HWRT with its greater complexity is not seeing as awesome speed-ups as it otherwise might on other hardware (Intel/NV).
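To put toy numbers on that (all assumed for illustration, not measured from any real game):

```cpp
// Amdahl-style back-of-envelope: accelerating only the traced slice of a
// frame barely moves the whole-frame number. All values are assumptions.
#include <cstdio>

int main() {
    const double frameMs   = 16.6; // 60 fps budget
    const double traceMs   = 4.0;  // assumed share of the frame spent tracing
    const double hwSpeedup = 2.0;  // assumed HW speed-up on that share only
    const double newFrameMs = (frameMs - traceMs) + traceMs / hwSpeedup;
    std::printf("%.1f ms -> %.1f ms (~%.0f%% faster overall)\n",
                frameMs, newFrameMs, (frameMs / newFrameMs - 1.0) * 100.0);
    // Prints: 16.6 ms -> 14.6 ms (~14% faster overall)
    return 0;
}
```

So even a clean 2x on the traced part only buys ~2 ms here; if the HW path also drags extra complexity into the rest of the frame, the win shrinks further.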
 
If they were equal in quality, I think we can imagine that HWRT would perhaps be the winning performance option there.
That's exactly what I think too. A High scalability mode with similar(!) quality settings to software ray tracing, but hardware accelerated, could achieve that sweet 60 fps target on consoles and low-end GPUs.

But that's not what they are planning to do; instead they plan to get rid of HW-RT in the scalability settings below Epic and replace it with SW-RT to reach 60 fps on lower-end RTX/RDNA2 GPUs and consoles. And that is the issue I am seeing here.
 
The one area that HW will have trouble with regardless of scalability settings is massive overlapping assets close to the camera: traversing all of the various overlapped geometry instances will run poorly in HWRT in comparison to the lower-accuracy SWRT solution. I am not sure if that means they need some sort of merging to occur for that to work better with HWRT, but it is a thing to also consider.

I am not exactly sure how common that type of level generation will end up being, though. Just overlapping dozens of meshes in a kitbash way is interesting for a demo, for sure, but would a game world really be made that way? Especially for terrain? (Instead of some sort of procedural, heightmap, etc. thing.) I am not exactly sure...
 
But that's not what they are planning to do; instead they plan to get rid of HW-RT in the scalability settings below Epic and replace it with SW-RT to reach 60 fps on lower-end RTX/RDNA2 GPUs and consoles. And that is the issue I am seeing here.

That's just not how it works. Triangles fundamentally don't scale the same way as SDFs: very low-res tri meshes don't capture a shape as well as a low-res SDF, you can't approximate good lighting with fewer rays the same way as you can with SDFs, and the hardware doesn't accelerate SDFs. (On the high end this reverses: the amount of data you have to store to capture exact, precise shapes with SDFs is immense compared to tris. But UE5 in particular has a great solution for high-res shadows that couldn't be achieved with RT anyway, so giving up exact precision for a cheaper representation of the general shape and only doing the trace for GI is smart!) There's no lack of focus on hardware from Epic; they built an entire hardware path basically entirely for high-end PCs!
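On "good lighting with fewer rays": one concrete example is the classic distance-field soft-shadow trick (popularized by Inigo Quilez), where a single march toward the light yields a penumbra estimate that a triangle tracer would need many shadow rays to approximate. A sketch, with a one-sphere occluder and constants picked purely for illustration:

```cpp
// Distance-field soft shadows from ONE ray: track how closely the march
// grazes geometry and turn that into a penumbra factor. Illustrative only.
#include <algorithm>
#include <cmath>
#include <cstdio>

// SDF of a unit-radius occluder sphere centred at (0, 2, 0).
static float sceneSDF(float x, float y, float z) {
    return std::sqrt(x * x + (y - 2) * (y - 2) + z * z) - 1.0f;
}

// March from a surface point toward the light. k controls penumbra width.
static float softShadow(float ox, float oy, float oz,   // ray origin
                        float dx, float dy, float dz,   // unit direction
                        float maxT, float k) {
    float res = 1.0f;
    float t = 0.05f;                       // small offset off the surface
    while (t < maxT) {
        float h = sceneSDF(ox + dx * t, oy + dy * t, oz + dz * t);
        if (h < 1e-4f) return 0.0f;        // ray hit the occluder: full shadow
        res = std::min(res, k * h / t);    // grazing distance -> partial shadow
        t += h;
    }
    return res;                            // 1.0 = fully lit
}

int main() {
    // Straight below the sphere: fully shadowed. Offset to the side: penumbra.
    std::printf("under: %.2f\n", softShadow(0.0f, 0, 0, 0, 1, 0, 10.0f, 8.0f));
    std::printf("edge:  %.2f\n", softShadow(1.2f, 0, 0, 0, 1, 0, 10.0f, 8.0f));
    return 0;
}
```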


I am not exactly sure how common that type of level generation will end up being, though. Just overlapping dozens of meshes in a kitbash way is interesting for a demo, for sure, but would a game world really be made that way? Especially for terrain? (Instead of some sort of procedural, heightmap, etc. thing.) I am not exactly sure...
I suspect nonexistent. If nothing else, it's just not very hard to post-process all your scenes and remove overlaps (if a reader works at a AAA studio that doesn't know how, feel free to DM me a job offer). However, dense overlaps aren't the only thing that scales badly for BVH traces; basically anything dense and close together (from the perspective of the ray) does. (I don't think that's the reason they prefer SDFs for the low end, though.)
 
Are you mad that they aren't using HW rasterization for Nanite either? Or that the tessellation units sat idle in 90% of PS4/Xbox One generation games?
No engine keeps all HW parts occupied at all times. In fact, it's probably not even possible to do that without hitting some other bottleneck.

If HWRT could help accelerate Lumen at the 60 fps quality settings, they sure as hell would be using it. If they are recommending against it, it's because the overhead is actually slowing the render down rather than speeding it up.

In fact, I suppose on HW such as PS5, where the clock is variable, not using every single part of the silicon might help keep the clock at its max speed, thus allowing the actually useful parts to run as fast as possible.
 
A few questions:
Is the BVH format implemented at a hardware level?
Is it possible to accelerate other models?
Is it the divergent nature that causes the problems in the end?
 
I believe (on PC at least) you don't manually manage the BVH, so it's not practical to build and use a BVH for your SDFs or whatever with the hardware. Not sure how it breaks down on console; this is something I'd also love to know!
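To illustrate the PC side: in D3D12's DXR the only geometry inputs are triangles or AABBs (the latter paired with a custom intersection shader), and what you get back is an opaque, vendor-defined acceleration structure. A sketch of the build-input query; device creation, resource allocation, and barriers are omitted, so treat it as a fragment, not working renderer code:

```cpp
// D3D12 DXR fragment: the only inputs the API accepts are triangles or
// AABBs; the resulting BVH is an opaque, vendor-defined blob. So there is
// no way to hand it your own node layout, or an SDF, as the structure.
#include <d3d12.h>

void queryBlasSize(ID3D12Device5* device,
                   D3D12_GPU_VIRTUAL_ADDRESS vertexBuffer, UINT vertexCount)
{
    D3D12_RAYTRACING_GEOMETRY_DESC geom = {};
    geom.Type = D3D12_RAYTRACING_GEOMETRY_TYPE_TRIANGLES; // or ..._PROCEDURAL_PRIMITIVE_AABBS
    geom.Triangles.VertexBuffer.StartAddress  = vertexBuffer;
    geom.Triangles.VertexBuffer.StrideInBytes = 12;        // 3 floats per vertex
    geom.Triangles.VertexFormat = DXGI_FORMAT_R32G32B32_FLOAT;
    geom.Triangles.VertexCount  = vertexCount;

    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_INPUTS inputs = {};
    inputs.Type = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL;
    inputs.DescsLayout = D3D12_ELEMENTS_LAYOUT_ARRAY;
    inputs.NumDescs = 1;
    inputs.pGeometryDescs = &geom;

    // The driver reports how many bytes its opaque BVH will need; you then
    // allocate a buffer and build with
    // ID3D12GraphicsCommandList4::BuildRaytracingAccelerationStructure.
    D3D12_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO info = {};
    device->GetRaytracingAccelerationStructurePrebuildInfo(&inputs, &info);
    (void)info.ResultDataMaxSizeInBytes;
}
```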
 
On PC too, many people only have RTX/RDNA2 GPUs below 2070 Super level of performance, so they are going to use High scalability, which disables HW-RT.
I think PC games will implement both options: software RT at standard quality for low/medium-end hardware to reach 60 fps, and hardware RT at maximum quality for powerful hardware to render at its fullest potential; the 4090 can easily push past 90 fps even in hardware mode. In many UE5 demos available online, hardware RT actually delivers far better image quality than the software mode, at the same or even higher fps.

Consoles too will probably have games with the two modes: 30fps max quality and 60fps standard quality.
 