Unreal Engine 5, [UE5 Developer Availability 2022-04-05]

I suspect popular reception in regard to MegaLights' performance will be similar to that of Nanite. Nanite allows millions of triangles to be displayed at 60FPS, which would be a slideshow without Nanite.
Not exactly. If your machine can handle, say, 20 million polygons per frame with Nanite, it can also handle that with another engine. It's just that Nanite manages all LOD transitions automatically to stay within that "polygon budget" and push the maximum detail up close.
 

People seem to be really pissed about UE5's poor performance. Even a great feature like MegaLights cannot improve its reputation.

And I think I can agree. Getting rid of traversal stutter and shader compilation stutter should be the main priority. Many games from many different developers exhibit the typical UE performance issues.

Even Fortnite still has awful stuttering, despite now having a shader compilation step, and its performance is much lower than a year ago.
 
Won't really matter as long as shader stutter and performance are not fixed. UE5 games have really been impacted by this for me.
I mean, they make good demos, but games gave UE5 a bad rep, and rightfully so 🤷‍♂️
 
The discourse in that Reddit thread above is the type of sentiment I see on almost every gaming forum when a game is announced to be using Unreal Engine. You have the "WOW! Amazing graphics" reaction, followed up by #stutterstruggle. People are getting more vocal about it, and regardless of whose fault it actually is, it has created a very negative image for the engine.
 
Tbf, of all the UE5 games I've played this year, I can't think of any where I've run into an especially high amount of stuttering. The combination of Epic improving PSO gathering and developers paying attention to it seems to result in UE5 being just your regular PC engine when it comes to stutters these days, i.e. there are some, but nothing that sets UE5 apart from other engines.
 
Tbf, of all the UE5 games I've played this year, I can't think of any where I've run into an especially high amount of stuttering. The combination of Epic improving PSO gathering and developers paying attention to it seems to result in UE5 being just your regular PC engine when it comes to stutters these days, i.e. there are some, but nothing that sets UE5 apart from other engines.
Which games are those?
 
This was Sebbi's comment on the Nanite debate.
Nanite’s software raster solves quad overdraw. The problem is that software raster doesn’t have HiZ culling. Nanite must lean purely on cluster culling, and their clusters are over 100 triangles each. This results in significant overdraw to the V-buffer with kitbashed content (such as their own demos). But V-buffer is just a 64 bit triangle+instance ID. Overdraw doesn’t mean shading the pixel many times.
Nanite actually does make use of hardware HiZ culling, but in a rather unique way. It doesn't prevent overdrawing to the V-buffer but it does stop over-shading the pixel.
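
To illustrate why V-buffer overdraw is comparatively cheap, here is a rough sketch of a 64-bit visibility texel. It assumes depth sits in the high 32 bits and a packed instance+triangle ID in the low 32 bits, and it uses a plain `max` as a stand-in for a GPU 64-bit atomic max. The bit split, the reverse-Z convention, and all function names are assumptions made for illustration, not Nanite's actual layout.

```python
# Illustrative only: the bit split and every name below are assumptions.
TRI_BITS = 7                      # room for 128 triangles per cluster (assumed)
TRI_MASK = (1 << TRI_BITS) - 1


def pack_texel(depth_bits: int, instance_id: int, tri_index: int) -> int:
    """Pack depth (high 32 bits) and instance+triangle ID (low 32 bits)."""
    assert 0 <= depth_bits < (1 << 32)
    assert 0 <= instance_id < (1 << (32 - TRI_BITS))
    assert 0 <= tri_index <= TRI_MASK
    payload = (instance_id << TRI_BITS) | tri_index
    return (depth_bits << 32) | payload


def unpack_texel(texel: int):
    """Return (depth_bits, instance_id, tri_index)."""
    payload = texel & 0xFFFFFFFF
    return texel >> 32, payload >> TRI_BITS, payload & TRI_MASK


def write_vbuffer(vbuffer: dict, pixel, texel: int) -> None:
    # Stand-in for a 64-bit atomic max: with depth in the high bits,
    # the largest value is the closest surface (reverse-Z assumed).
    vbuffer[pixel] = max(vbuffer.get(pixel, 0), texel)


vbuffer = {}
# Three overlapping triangles hit the same pixel: that is V-buffer overdraw.
write_vbuffer(vbuffer, (10, 20), pack_texel(1000, instance_id=5, tri_index=3))
write_vbuffer(vbuffer, (10, 20), pack_texel(4000, instance_id=9, tri_index=1))
write_vbuffer(vbuffer, (10, 20), pack_texel(2500, instance_id=7, tri_index=2))

winner = unpack_texel(vbuffer[(10, 20)])
shaded_pixels = len(vbuffer)      # only the surviving texel gets shaded later
```

Each overdrawn write costs one small integer operation, and only the surviving ID per pixel is shaded afterwards, which is the distinction the two posts above are drawing.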

GPUs seem to become more universal over time, with more and more workloads done as compute shaders these days. Will we end up with some generic, highly parallel compute machines with no fixed-function hardware? I don’t know. But Nanite technology from the new Unreal Engine 5 makes a step in this direction by implementing its own rasterizer for some of its triangles, in the form of a compute shader. I recommend a good article about it: “A Macro View of Nanite – The Code Corsair” (it seems the link is broken already; here is a copy on the Internet Archive Wayback Machine). Apparently, for tiny triangles of around single-pixel size, custom rasterization is faster than what GPUs provide by default.

But in the same article we can read that Epic also does the opposite in Nanite: they use some fixed-function parts of the graphics pipeline very creatively. When applying materials in screen space, they render a full-screen pass per material, but instead of drawing just a full-screen triangle, they draw a regular grid of quads covering tiles of NxN pixels. They then perform coarse-grained culling of these tiles in a vertex shader. To reject a tile, they output vertex position = NaN, which makes the triangle invalid so that it spawns no pixels. Then, more fine-grained culling is performed using the Z-test. The per-pixel material identifier is encoded as depth in a depth buffer! This can be fast, as modern GPUs apply “HiZ”, an internal optimization that rejects whole groups of pixels that fail the Z-test before their pixel shaders are launched.
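
The two-level culling described above can be mimicked on the CPU as a minimal sketch. It assumes the depth test is an "equal" comparison against the material's depth value, and that a per-tile set of present materials is available for the coarse step. The tile size, the ID-to-depth encoding, and all names are hypothetical and are not taken from the engine.

```python
import math

TILE = 2  # tile size N in pixels; illustrative value only


def material_depth(material_id: int, max_materials: int = 1 << 14) -> float:
    """Map a material ID to a unique depth value in [0, 1) (assumed encoding)."""
    return material_id / max_materials


def tile_vertex(tile, tile_materials, material_id):
    """Coarse cull, as a vertex shader would: NaN position rejects the tile."""
    if material_id not in tile_materials[tile]:
        return (math.nan, math.nan)
    return (float(tile[0] * TILE), float(tile[1] * TILE))


def draw_material(ids, material_id):
    """Run one 'full-screen pass' for a material; return shaded pixels and culled tiles."""
    height, width = len(ids), len(ids[0])
    # Depth buffer holding the per-pixel material ID encoded as depth.
    depth = [[material_depth(m) for m in row] for row in ids]
    # Which materials appear in each tile (assumed to be precomputed in practice).
    tile_materials = {}
    for y in range(height):
        for x in range(width):
            tile_materials.setdefault((x // TILE, y // TILE), set()).add(ids[y][x])

    shaded, culled_tiles = [], 0
    target = material_depth(material_id)
    for tile in sorted(tile_materials):
        pos = tile_vertex(tile, tile_materials, material_id)
        if math.isnan(pos[0]):
            culled_tiles += 1          # invalid triangle: no pixels spawned
            continue
        x0, y0 = int(pos[0]), int(pos[1])
        for y in range(y0, min(y0 + TILE, height)):
            for x in range(x0, min(x0 + TILE, width)):
                if depth[y][x] == target:   # fine cull: depth-equal test
                    shaded.append((x, y))
    return shaded, culled_tiles


material_ids = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [1, 3, 3, 3],
    [1, 3, 3, 3],
]
shaded, culled = draw_material(material_ids, material_id=3)
```

For material 3, two of the four tiles are rejected in the coarse step, and within the remaining tiles only pixels whose stored depth matches are shaded. On real hardware the fine step is where HiZ would reject groups of pixels before any pixel shader runs.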
 
Not exactly. If your machine can handle, say, 20 million polygons per frame with Nanite, it can also handle that with another engine. It's just that Nanite manages all LOD transitions automatically to stay within that "polygon budget" and push the maximum detail up close.

This seems a little deceptive. If we're talking drawn polygons, then Nanite will handle more of them, faster. It can handle one polygon per pixel much better than a pure hardware rasterizer would. That's part of the point.

Really depends on how you count polygons. If you just add up all of the LOD0 polygons for all of the meshes in a scene, you could say a hardware rasterizer and a software rasterizer like Nanite can both handle 20 million polygons or something, but the actual screen output won't be the same. The hardware rasterizer will have to make more aggressive LOD changes to keep pixel coverage per triangle higher. There is some inflection point where Nanite can outperform the hardware rasterizer and draw smaller polygons faster, leading to more triangles drawn and more geometric detail.
 
This seems a little deceptive. If we're talking drawn polygons, then Nanite will handle more of them, faster. It can handle one polygon per pixel much better than a pure hardware rasterizer would. That's part of the point.

Really depends on how you count polygons. If you just add up all of the LOD0 polygons for all of the meshes in a scene, you could say a hardware rasterizer and a software rasterizer like Nanite can both handle 20 million polygons or something, but the actual screen output won't be the same. The hardware rasterizer will have to make more aggressive LOD changes to keep pixel coverage per triangle higher. There is some inflection point where Nanite can outperform the hardware rasterizer and draw smaller polygons faster, leading to more triangles drawn and more geometric detail.
To add to this: unless I’m out of date, hardware triangle ratings are specced at 16 pixels per triangle or larger.
Once you go below 16, you lose tris/clock exponentially until you can barely put out triangles at 1 triangle per pixel.
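
One contributing factor that is easy to put numbers on is 2x2 quad shading: GPUs launch pixel shader work in 2x2 pixel quads, so a triangle pays for every quad it touches, not just the pixels it covers. The sketch below only models that quad occupancy. It says nothing about triangle setup or raster throughput, and the function name is made up for the example.

```python
def quad_efficiency(covered_pixels) -> float:
    """Fraction of launched shader lanes that land on covered pixels.

    Each touched 2x2 quad launches 4 lanes, whether or not all 4 pixels
    are actually inside the triangle.
    """
    quads = {(x // 2, y // 2) for x, y in covered_pixels}
    launched_lanes = 4 * len(quads)
    return len(covered_pixels) / launched_lanes


# A triangle covering a single pixel still launches a whole quad.
one_pixel = quad_efficiency([(5, 5)])

# 16 pixels, perfectly aligned to the quad grid: every lane is useful.
aligned_16 = quad_efficiency([(x, y) for y in range(4) for x in range(4)])

# The same 16 pixels shifted by one pixel now straddle 9 quads.
shifted_16 = quad_efficiency([(x, y) for y in range(1, 5) for x in range(1, 5)])

print(f"1 pixel:       {one_pixel:.2f}")
print(f"16 px aligned: {aligned_16:.2f}")
print(f"16 px shifted: {shifted_16:.2f}")
```

In this toy model a single-pixel triangle uses a quarter of the lanes it launches, and when several tiny triangles land in the same quad, each one launches that quad again. That repeated cost is the quad overdraw that a compute-based rasterizer avoids.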
 
I suspect popular reception in regard to MegaLights' performance will be similar to that of Nanite. Nanite allows millions of triangles to be displayed at 60FPS, which would be a slideshow without Nanite. MegaLights does the same for hundreds of shadowcasting lights. But gamers will expect it to somehow provide better performance in every situation across the board, so when games using Nanite and MegaLights with millions of triangles and hundreds of shadowcasting lights release and have worse performance than games with a hundred thousand triangles and 2-3 shadowcasting lights, they'll be disappointed.
Nvidia talked about it. You don't use something like MegaLights when you don't need hundreds of individual light sources. It is less efficient than doing ray tracing for just a few lights.
 
This seems a little deceptive. If we're talking drawn polygons, then Nanite will handle more of them, faster. It can handle one polygon per pixel much better than a pure hardware rasterizer would. That's part of the point.
And even more on point, it's a silly statement to make because if you knew exactly which triangles you needed to draw, you wouldn't need 20 million to draw onto a ~2-8 million pixel screen. LOD and culling *are* the relevant parts of rasterization, so the interesting comparison is how effective those parts are. Waving my hand and saying "oh well if I just happened to have a perfect LOD for all my objects (and no big/near objects for which a mesh-wide LOD selection is insufficient) and how everything was occluded then I could submit just those draw calls as fast as Nanite" is very much missing the point.

But yes, it is well known that GPU pipelines are not well optimized for small triangles. As I've mentioned in the past, if they suddenly got better then you could change a cvar in Nanite to shift where software rasterization happens (or even disable it entirely) and that would be great. Software rasterization gets a lot of public attention but really it is an implementation detail, not the main component of Nanite. When people discuss things like "nanite vs mesh shaders" or "nanite vs explicit LOD" they are also missing the point. It's like comparing an entire car to just a transmission or something.
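
For a sense of scale on the "20 million triangles onto a ~2-8 million pixel screen" point, here is the arithmetic for a few common resolutions. The resolution table and helper name are just for illustration.

```python
RESOLUTIONS = {
    "1080p": (1920, 1080),
    "1440p": (2560, 1440),
    "2160p": (3840, 2160),
}


def triangles_per_pixel(triangles: int, width: int, height: int) -> float:
    """Average number of submitted triangles per screen pixel."""
    return triangles / (width * height)


for name, (w, h) in RESOLUTIONS.items():
    mpix = w * h / 1e6
    tpp = triangles_per_pixel(20_000_000, w, h)
    print(f"{name}: {mpix:.1f} Mpix, {tpp:.1f} triangles per pixel")
```

Even at 2160p, 20 million triangles works out to more than two per pixel, so most of them cannot contribute a visible sample. That is why choosing which triangles to draw matters more than raw triangle throughput.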
 
Nvidia talked about it. You don't use something like MegaLights when you don't need hundreds of individual light sources. It is less efficient than doing ray tracing for just a few lights.
Most modern games with large environments or open worlds do have hundreds of lights. They are mostly unshadowed, though, and that is where MegaLights can bring a great visual quality improvement (at the expense of its denoising artifacts, unfortunately).
 

So from watching Digital Foundry's latest video, it seems like this was a UE5 (Nvidia branch) demo using all of their new tech. It seems Nvidia's Mega Geometry can handle Nanite's geometry density, so they can ray trace and have shadows and indirect lighting that accurately reflect the Nanite representation.
 