That video is old news.
There is quite a bit of detail about TSR in this feedback thread: details on optimisations, and on changing how they measure it, since the quality depends on the framerate as well as the input resolution.
EDIT: The 5.1 release schedule compared to Chapter 4, and all the changes needed in TSR, meant it was still not stable enough, which is not great considering absolutely all 3D rendered pixels go through it. This is where the upcoming 5.2 release is in fact a big deal for TSR: it is the first release that is production proven, stable, and comes with more debugging tools.
What we can do specifically on PS5 and XSX is that, conveniently, most of their AMD GPU architecture is public ( https://www.amd.com/system/files/TechDocs/rdna2-shader-instruction-set-architecture.pdf ), so we can go really deep into hardware details and imagine and experiment with crazy ideas. And this is what TSR did in 5.1: it heavily exploits the performance characteristics of RDNA's 16-bit instructions, which can have huge performance benefits. In UE5/Main for 5.3, a shader permutation was added ( https://github.com/EpicGames/UnrealEngine/commit/c83036de30e8ffb03abe9f9040fed899ecc94422 ) to finally tap into these instructions, which are exposed in standard HLSL in Shader Model 6.2 ( 16 Bit Scalar Types · microsoft/DirectXShaderCompiler Wiki · GitHub ), and for instance on an AMD 5700 XT the performance savings in TSR are identical to how much these consoles benefit from this optimisation too.
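As a purely illustrative sketch (not TSR's actual code; the resource and function names here are made up), this is roughly what using the SM6.2 16-bit scalar types looks like in a compute shader compiled with DXC and the -enable-16bit-types flag:

```hlsl
// Minimal sketch, assuming DXC with -enable-16bit-types (Shader Model 6.2+).
// Names are hypothetical; this is not engine code.
Texture2D<float4>   InputColor  : register(t0);
RWTexture2D<float4> OutputColor : register(u0);

[numthreads(8, 8, 1)]
void MainCS(uint2 DispatchThreadId : SV_DispatchThreadID)
{
    // Convert to 16-bit as early as possible so the bulk of the math runs
    // on half-width registers, reducing register pressure.
    float16_t4 Color  = float16_t4(InputColor[DispatchThreadId]);
    float16_t  Weight = float16_t(0.25);

    // On hardware with native FP16 this multiply can run at double rate;
    // on hardware without it, the types behave like 32-bit floats.
    OutputColor[DispatchThreadId] = float4(Color * Weight);
}
```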
What makes TSR in 5.1+ so different from 5.0 is how many more convolutions it does using these exposed hardware capabilities. The RejectShading pass does something like 15 convolutions in 5.1+, each 3x3, on current hardware, compared to 3 in 5.0, which makes TSR substantially smarter thanks to some very neat properties discovered from chaining certain very particular convolutions. And while the number of convolutions massively increased, by a factor of 5, the runtime cost of this part of TSR didn't change, and this gain in smartness of the algorithm allowed cutting a significant amount of other costs that were no longer required in the rest of the TSR algorithm, which is the core reason behind the performance saving from 3.1ms to 1.5ms on these consoles. Sadly these hardware capabilities exposed in standard HLSL do not benefit all GPUs equally, because of how each vendor decided to architect their hardware.
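To make the "3x3 convolution" part concrete, here is a minimal hedged sketch of a single such pass kept entirely in FP16. It is not the RejectShading pass itself, and the kernel weights below are placeholders, not the ones TSR derives:

```hlsl
// Illustrative sketch only: one 3x3 convolution over a colour texture,
// with all arithmetic kept in FP16. Border clamping is omitted for brevity.
Texture2D<float4>   SceneColor : register(t0);
RWTexture2D<float4> Filtered   : register(u0);

[numthreads(8, 8, 1)]
void Convolve3x3CS(uint2 PixelPos : SV_DispatchThreadID)
{
    // Hypothetical Gaussian-like 3x3 kernel; purely a placeholder.
    const float16_t Kernel[9] = {
        float16_t(0.0625), float16_t(0.125), float16_t(0.0625),
        float16_t(0.125),  float16_t(0.25),  float16_t(0.125),
        float16_t(0.0625), float16_t(0.125), float16_t(0.0625)
    };

    float16_t4 Sum = float16_t4(0.0, 0.0, 0.0, 0.0);
    for (int y = -1; y <= 1; y++)
    {
        for (int x = -1; x <= 1; x++)
        {
            float16_t4 Tap = float16_t4(SceneColor[int2(PixelPos) + int2(x, y)]);
            Sum += Tap * Kernel[(y + 1) * 3 + (x + 1)];
        }
    }
    Filtered[PixelPos] = float4(Sum);
}
```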
So they are saying TSR is super optimized for the console GPUs in the 5.2 release?
Spell it out for the dummies in the room
How much this saves in runtime cost depends largely on whether the driver says it supports it, but also on whether the hardware is able to take advantage of this optimisation or not. 16-bit often saves register pressure, which, when register bound, can give up to a 2x performance improvement; but RDNA GPUs for instance also have packed instructions like v_pk_mul_f16, capable of doing two multiplications for the price of one, which is another 2x. So the use of 16-bit instructions on that shader can bring almost a 4x perf improvement.
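To illustrate the packed-math part, a minimal sketch (assuming DXC with -enable-16bit-types and RDNA-class hardware; the function name is hypothetical): a multiply on a two-wide 16-bit vector is the kind of operation the compiler can map to a single packed instruction such as v_pk_mul_f16:

```hlsl
// Two FP16 multiplications expressed as one vector op; on RDNA the compiler
// can emit a single packed instruction (e.g. v_pk_mul_f16) for this,
// i.e. two multiplies for the price of one.
float16_t2 PackedMul(float16_t2 A, float16_t2 B)
{
    return A * B;
}
```

This is why the two effects stack: fewer, smaller registers from 16-bit types, plus two lanes per instruction from the packed ops.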
So is that good for the performance outcome of the game?
They use FP16 and dual FP16, but it is also used on PC with GPUs compatible with it.
Vega, RDNA1 and RDNA2 support Rapid Packed Math, where FP16 operations run twice as fast as FP32 operations.
For NVIDIA, Turing, Ampere and Ada support running FP16 ops on the tensor cores for huge throughput, while also allowing simultaneous FP16 (on Tensor cores) and FP32 (on ALUs) execution.
Also one last thing. It would be great if we could switch the Unreal Engine anti-aliasing pipeline to tensor cores, Xe cores, or AMD matrix cores, for performance reasons. (A console command would allow devs to let players decide to tap into unused GPU components if DLSS, XeSS, etc. is not available.)
TSR could run on tensor cores instead of the main graphics compute units that produce the game's main render frames.
Would be great! But we are limited to the APIs exposed to us: for instance, DirectML requires a round trip to main memory between a programmed shader and the matrix multiplications, which is not in the interest of performance. This is where each IHV has a lot fewer constraints compared to us, because they know exactly what their hardware can do and how, and can exploit many advantages that are not standardized, like avoiding round trips to main memory, to squeeze as much runtime GPU performance as possible and invest even more in quality.
What we can do specifically on PS5 and XSX is that, conveniently, most of their AMD GPU architecture is public ( https://www.amd.com/system/files/TechDocs/rdna2-shader-instruction-set-architecture.pdf ), so we can go really deep into hardware details and imagine and experiment with crazy ideas.
I believe FP16 ops are routed automatically to Tensor cores on NVIDIA hardware, without developer intervention. It seems they don't use Tensor cores.