Unreal Engine 5, [UE5 Developer Availability 2022-04-05]

I believe FP16 ops are routed automatically to Tensor cores on NVIDIA hardware, without developer intervention.
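For reference, and not something claimed in the video being discussed: in CUDA the plain FP16 path and the tensor-core path are separate instruction paths, which is roughly what the "does FP16 get routed to tensor cores automatically?" question comes down to. A minimal sketch, assuming an sm_70+ GPU; the 16x16x16 tile is just the smallest wmma shape and neither kernel comes from any real codebase:

```
// Minimal sketch, not production code: plain packed FP16 math vs. an explicit
// tensor-core matrix op in CUDA (wmma requires sm_70+, half2 requires sm_53+).
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

// Packed FP16: two multiply-adds per instruction, runs on the regular SIMT ALUs,
// no tensor cores involved.
__global__ void fp16_fma(const half2* a, const half2* b, half2* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = __hfma2(a[i], b[i], c[i]);
}

// One warp computes a 16x16 tile of D = A*B on the tensor cores via wmma;
// this is the kind of op the hardware feature is actually exposed for.
__global__ void wmma_tile(const half* A, const half* B, float* D)
{
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc;

    wmma::fill_fragment(acc, 0.0f);
    wmma::load_matrix_sync(a, A, 16);            // leading dimension = 16
    wmma::load_matrix_sync(b, B, 16);
    wmma::mma_sync(acc, a, b, acc);              // the tensor-core instruction
    wmma::store_matrix_sync(D, acc, 16, wmma::mem_row_major);
}
```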

I don't know, but what he is saying is that they don't use tensor cores at all. And FP16 will only be available on PC with 5.3.

EDIT: He seems to say that DirectML is a problem for tensor core usage. From what I have seen, he said they don't use tensor cores, and that the code runs fast on an AMD 5700X, but he doesn't talk about Nvidia hardware.

EDIT: He said they don't use tensor cores.
What makes TSR in 5.1+ so different from 5.0 is how many more convolutions it does using these exposed hardware capabilities. The RejectShading pass does something like 15 convolutions in 5.1+, each 3x3, on current hardware, compared to 3 in 5.0, which allows TSR to be made substantially smarter thanks to some very neat properties we discovered from chaining certain convolutions. And while the number of convolutions increased by a factor of 5, the runtime cost of this part of TSR didn't change, and the smarter algorithm let us cut a significant amount of other work that was no longer required in the rest of TSR, which is the core reason behind the performance saving from 3.1ms to 1.5ms on these consoles. Sadly, these hardware capabilities exposed in standard HLSL don't benefit all GPUs equally, because of how each vendor decided to architect their hardware.

We can't do miracles using a specifically marketed hardware feature in the most efficient manner when only a subset of these capabilities is exposed to us at the moment. But we can still do some surprising things on existing hardware that is wrongly assumed by the public to be incapable of certain things, and we can do it with just standard features and a better understanding of how the GPU works, thanks for instance to that AMD PDF I linked above.
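To make the "lots of 3x3 convolutions with packed 16-bit math" idea concrete, here is a hedged sketch in CUDA rather than HLSL (this is not Epic's RejectShading code; the two-channel layout, weights and border clamping are assumptions for the example). Packing two channels into a half2 means every __hfma2 performs two FP16 multiply-adds, which is where the "more convolutions at the same runtime cost" headroom comes from:

```
// Illustrative 3x3 convolution over a 2-channel image stored as half2
// (requires sm_53+ for half2 arithmetic). Not engine code.
#include <cuda_fp16.h>

__global__ void conv3x3_half2(const half2* __restrict__ in,
                              half2* __restrict__ out,
                              const float* __restrict__ w,   // 9 weights, row-major
                              int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    half2 acc = __float2half2_rn(0.0f);
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            int sx = min(max(x + dx, 0), width  - 1);        // clamp at the border
            int sy = min(max(y + dy, 0), height - 1);
            half2 weight = __float2half2_rn(w[(dy + 1) * 3 + (dx + 1)]);
            acc = __hfma2(weight, in[sy * width + sx], acc); // 2 channels per FMA
        }
    }
    out[y * width + x] = acc;
}
```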

EDIT: On Nvidia and Intel GPUs, developers will probably use DLSS and XeSS respectively.
 
I finally got around to testing the DLSS version of The Matrix Awakens demo. Performance is far better than the non-DLSS version (which was mostly 30fps for me, even on an RTX 4090): now I'm bouncing between 55 and 60fps in most scenes with Epic IQ settings, with slight dips below that during fast flying traversal into new areas. Such a gorgeous demo...

[screenshots]
 
I'm curious to see if they keep fortnite up to date with these minor UE releases. I'm not sure if they're at 5.0 or 5.1 or if they keep updating the game so it gets all of these continual improvements.
 

Currently 5.1


Fortnite Battle Royale Chapter 4 is here! With the release of the new Chapter, Fortnite Battle Royale now makes use of Unreal Engine 5’s newest, most innovative features, via Unreal Engine 5.1.
 
I had a look at the Layers of Fear demo today, and while visually it doesn't seem to be setting the world on fire, I didn't notice any shader comp stutter. During the whole demo I had maybe 2 very small stutters, and they were for loading, since I noticed the little icon in the bottom corner. By contrast, the shader comp stutter in Scorn was so bad I almost rage quit; the only thing that stopped me was that it was actually the end of the game, so it was basically shader comp stuttering from start to end.

I only bring that up because Layers of Fear, being a similar type of game using UE5, was almost perfect in that regard. Is this what we can expect from UE5 going forward? Or was this just an effect of it being a less complex game, and I shouldn't get my hopes up that UE5 will/has fixed the shader compiling problems?
 

We can only hope. As I mentioned in another thread, Bloober's games have always had tons of stutter on PC regardless of how small they were. That this demo doesn't suffer from this at all (mostly) is encouraging, even in its limited scope. Also noteworthy was the remarkably low VRAM usage.

Also, I don't believe the game is even using Unreal 5.2, which has additional improvements to PSO gathering and async compiling. Fingers crossed!

The real litmus test is when Fortnite is updated. With all the skins/effects it's like a worst-case scenario. When they update to 5.2 is when we'll really see how this new system performs.
 
IIRC Layers of Fear wasn't using Nanite. So one of the biggest changes users will see in games that use it (the amount of possible geometric detail on display and the lack of pop-in) wasn't shown off.
 
Yeah, I've played Blair Witch and The Medium, and this demo seems to be in a much better state than either of those technically. I think the demo might have been a good call in this case; I wasn't even contemplating getting this game after the issues I had with The Medium, but this has put it back on my radar.

Even without Nanite, if these visuals are what we can expect indie/AA devs to manage going forward without getting themselves into performance hell, I think it's a positive for UE5. Obviously we need more releases, and ones with bigger environments, before coming to a conclusion, but this has given me a glimmer of hope.
 
Unreal Engine 5.2 Burned Dead Forest Tech Demo Released (via DSOG)




'This tech demo features a 2sqkm procedural burned forest that users can freely roam.
According to the team, this tech demo takes full advantage of Unreal Engine 5’s Lumen and Nanite techs. The entire 2sqkm forest is made up of billions of Nanite triangles of custom photogrammetry scanned trees, plants, rocks, debris etc.'
 

There is an executable

https://www.mawiunited.com/_demo/bdf1

Hardware requirements: Minimum: RTX 2080, Recommended: RTX 3080
 

Getting ~26fps on a 3090 at 4K, ultra quality, ultra TSR in the burned dead biome. CPU usage is very evenly distributed; no hero threads in sight. The TSR setting seems to change the render resolution that's then upscaled to 4K. I'm assuming TSR ultra is native 4K, but there's no way to know for sure. It's hard to tell what the render quality setting is doing. Low and medium are washed out, as if lots of shadows are missing. A lot more trees cast and receive shadows at high and above. High renders more trees in the foreground than very high or ultra, which is probably a bug.

Some interesting observations from Nsight.
  • Very compute heavy. Average SM instruction issue rate for the entire frame is ~40%. Nice!
  • L2 hit rate is low and VRAM bandwidth usage is high for significant parts of the frame. Ada will probably do better here.
  • Negligible use of async compute
  • Hardware rasterizer is still used for some tasks.
 
Gave the burned forest a quick test (not impressed with it graphically, tbh).

To get ~60fps on my 4070 Ti with a 4K output, I used the Very High setting with TSR at Balanced.

Ultra quality cut my FPS in half compared to Very High.
 

Yeah, FP16 on PC is coming in 5.3.

Also, programming for FP16 optimization on these high-end GPUs is hard. Figuring out where and how you get speedups feels like guess and check; it's not fill-in-the-blank and you get a speedup, and the speedup can be different for different vendors. I wouldn't be surprised if this is part of the reason why, say, Jedi Survivor runs much faster on AMD cards than Nvidia ones relative to other games.
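A hedged illustration of that guess-and-check loop, in CUDA rather than HLSL since it's easier to show standalone: time the same FMA loop in FP32 and in packed FP16 and see what the ratio actually is on your part. The grid size, iteration count and constants are arbitrary, and a synthetic ALU loop is not a real shader workload, but it's the kind of per-vendor, per-GPU measurement you end up doing:

```
// Rough FP32 vs packed-FP16 throughput probe (compile with nvcc -arch=sm_53 or newer).
#include <cstdio>
#include <cuda_fp16.h>

__global__ void fma_f32(float* out, int iters)
{
    float a = 0.999f, x = threadIdx.x * 1e-6f;
    for (int i = 0; i < iters; ++i) x = fmaf(a, x, 0.5f);
    out[blockIdx.x * blockDim.x + threadIdx.x] = x;          // keep the loop alive
}

__global__ void fma_f16x2(half2* out, int iters)
{
    half2 a = __float2half2_rn(0.999f);
    half2 b = __float2half2_rn(0.5f);
    half2 x = __float2half2_rn(threadIdx.x * 1e-6f);
    for (int i = 0; i < iters; ++i) x = __hfma2(a, x, b);    // 2 FMAs per instruction
    out[blockIdx.x * blockDim.x + threadIdx.x] = x;
}

int main()
{
    const int blocks = 4096, threads = 256, iters = 1 << 14;
    float* d32; half2* d16;
    cudaMalloc(&d32, blocks * threads * sizeof(float));
    cudaMalloc(&d16, blocks * threads * sizeof(half2));

    cudaEvent_t s, e;
    cudaEventCreate(&s); cudaEventCreate(&e);

    fma_f32<<<blocks, threads>>>(d32, iters);                // warm-up
    cudaEventRecord(s);
    fma_f32<<<blocks, threads>>>(d32, iters);
    cudaEventRecord(e);
    cudaEventSynchronize(e);
    float ms32 = 0.0f; cudaEventElapsedTime(&ms32, s, e);

    fma_f16x2<<<blocks, threads>>>(d16, iters);              // warm-up
    cudaEventRecord(s);
    fma_f16x2<<<blocks, threads>>>(d16, iters);
    cudaEventRecord(e);
    cudaEventSynchronize(e);
    float ms16 = 0.0f; cudaEventElapsedTime(&ms16, s, e);

    printf("fp32: %.3f ms   fp16x2: %.3f ms   ratio: %.2fx\n", ms32, ms16, ms32 / ms16);
    cudaFree(d32); cudaFree(d16);
    return 0;
}
```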

I'd love to see a comparison between TSR, FSR 2.2 and DLSS. I miss those from Digital Foundry; it'd be interesting to see how good TSR is now. But all their PC guy does now is whinge about how games are a bit buggy and don't have totally pointless stuff like fancy options menus.

That fluid physics is insane!
It's all precalculated, I'm afraid, and the creator said the biggest cached sim was 33GB, and that's after culling all the sim that isn't directly in frame. So this level of fluid sim, even cached, isn't going to make it into games anytime soon :confused:
 