Current Generation Games Analysis Technical Discussion [2022] [XBSX|S, PS5, PC]

Curiously, NvRTX 5.0 includes improvements that address some ray tracing issues, such as shadow mismatches when using Nanite.


The NvRTX branch also has a BVH tool for finding spots of overlapped kitbashed geometry and clearing them, or excluding them entirely from the BVH structure (i.e. excluding them from ray tracing).

[image: NvRTX tech session slide]
Are these improvements exclusive to the RTX branch? AFAIK you could already "solve" the shadow mismatch issues with Nanite in the main/upstream branch of UE5 by excluding Nanite geometry from ray tracing, adjusting the bias factor, making the Nanite fallback mesh as precise as the source asset, or, more aptly, just turning off ray tracing and using virtual shadow mapping as Epic Games recommended ...

Some of the cases mentioned above, such as excluding geometry from being ray traced or making the fallback mesh equally detailed, are non-starters, as explained in prior posts. What if the whole scene consists of Nanite meshes? Excluding the whole scene from being ray traced just means you're back to having no shadows, which is unacceptable to Epic Games. Making fallback meshes as detailed as the source asset is still too slow even on the most powerful hardware available, especially in dynamic scenes. Adjusting the bias factor introduces a peter-panning effect where the shadow caster is disconnected from its shadow ...

None of these solutions were good enough according to Epic Games, and likely nothing Nvidia is cooking up will be elegantly compatible with Nanite short of introducing a new fixed-function hardware-accelerated unit for acceleration structure construction ...
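The bias/peter-panning trade-off above can be sketched with a toy 1D model: the shadow ray is traced against a coarser proxy than the surface that was rasterized, so without bias the surface shadows itself, and with enough bias to hide the mismatch, real contact shadows detach. All numbers here are made up for illustration.

```python
# Toy 1D model of the shadow bias trade-off when ray-traced shadows
# are tested against a coarser proxy than the rasterized surface.

def shadowed(surface_height, occluder_height, bias):
    """A shadow ray starts at the rasterized surface point, offset
    upward by `bias`; it is blocked if the occluder sits above it."""
    ray_origin = surface_height + bias
    return occluder_height > ray_origin

# The hi-res (Nanite) surface and its low-res proxy disagree by 5 units.
hires, proxy = 100.0, 105.0

# No bias: the proxy sits "above" the true surface, so the surface
# shadows itself -> the self-shadowing artifact.
self_shadow_no_bias = shadowed(hires, proxy, bias=0.0)

# Bias large enough to clear the mismatch removes the artifact...
self_shadow_biased = shadowed(hires, proxy, bias=6.0)

# ...but that same bias also skips any genuine occluder closer than
# 6 units, detaching contact shadows from casters (peter panning).
real_occluder = 104.0
contact_shadow_lost = not shadowed(hires, real_occluder, bias=6.0)
```

The bias that fixes one artifact necessarily creates the other, which is why Epic treated it as a non-starter rather than a tuning problem.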
 
Are these improvements exclusive to the RTX branch ?
So far, yes. I guess it should be coming to the main branch soon.

such as excluding geometry from being ray traced
The BVH tool excludes kitbashed "overlapped" geometry only. It also gives you the ability to "un-overlap" the spots that cause trouble.
nothing Nvidia is cooking up will be elegantly compatible with Nanite
They are already providing solutions compatible with Nanite; they had to, for the sake of their RTXDI to work. You may call these solutions inelegant, but they work.
 
This is the Nanite mesh:
[image: Nanite mesh render]


And this is the Nanite fallback mesh. This is no longer a problem of the shadow being precise to the pixel: the shadows don't look the same as the Nanite mesh's at all. This is not a pixel-size error ...
[image: Nanite fallback mesh render]
I don't think I understand what you mean -- both these shots look exactly how I'd expect. What are you trying to point out that's odd?


Nanite proxies tend to be (much) lower polygon count than meshes in a non-nanite game, but this makes sense. Rather than one mesh which is like, 10k tris, you get one mesh that's 10 million tris that you render for the visibility buffer and for shadow maps, and one mesh that's 2k tris that you use for mesh collisions, raytraced GI (and raytraced shadows, if you were to use them for some reason), etc.

In a perfect world you'd love to ray trace against the 10 million tri meshes, but we can't do that with current technology -- RT hardware isn't flexible enough to do it in principle, or fast enough to do it in practice. Instead, we could just ship UE4-style games that don't use Nanite at all and all have ~10k tri meshes, last-gen-style polycounts, and ray traced shadows. You can do that if you want.

A third solution is to use a cutting-edge shadow map system that can run with Nanite at 60 fps and have reasonably high-quality shadows against zillion-triangle geo. Then we can use the low-res proxies for RT GI and occlusion, because they're accurate enough for that, and if we need soft shadows we can cone trace against the SDFs, which is accurate enough in a lot of cases (although much less accurate in some, like the cube-with-a-nearby-point-light example). This is a reasonable set of tradeoffs, so it's the standard path for UE5 content.
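The memory asymmetry between the full mesh and its proxy is a big part of why tracing only the proxy is attractive. A back-of-the-envelope estimate, assuming a ballpark of ~64 bytes of acceleration structure per triangle (an assumed figure, not a measured one):

```python
# Rough BLAS memory comparison: full-detail Nanite mesh vs. fallback
# proxy. BYTES_PER_TRI is an assumed ballpark, not a measured value.

BYTES_PER_TRI = 64

def blas_mb(tri_count):
    """Approximate acceleration structure size in MB."""
    return tri_count * BYTES_PER_TRI / (1024 ** 2)

full_mesh_mb = blas_mb(10_000_000)  # the 10M-tri source mesh
proxy_mb     = blas_mb(2_000)       # the ~2k-tri fallback proxy

# One full-detail mesh alone would cost on the order of 600 MB of
# acceleration structure; the proxy costs a fraction of a megabyte.
```

With dozens of unique full-detail meshes in a scene, acceleration structure memory alone would dwarf a console's budget, which is consistent with the tradeoffs described above.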
 
The BVH tool excludes kitbashed "overlapped" geometry only. It also gives you the ability to "un-overlap" the spots that cause trouble.
"Overlapped" geometry isn't the problem here. The problem is that the geometry in our local scene representation (the rasterized depth buffer) isn't identical to the non-local scene representation (the acceleration structure) within the same camera view frustum, hence the self-shadowing artifacts observed when using RT shadows with Nanite ...
They are already providing solutions compatible with Nanite; they had to, for the sake of their RTXDI to work. You may call these solutions inelegant, but they work.
If by "compatible" you mean non-interactive framerates in dynamic scenes, then Nvidia must have set the bar very low compared to Epic Games. Epic Games too had existing solutions that met those same conditions, but then they had a massive epiphany: they couldn't ship RT shadows with Nanite because they realized there's no hardware release on the horizon that could rebuild acceleration structures for every new frame. Nanite's technology works on the basis that every meshlet has fine-grained and continuous LoD transitions. A meshlet's LoD can very easily change between frames, so without rebuilding the acceleration structure every frame, artists are quickly going to find themselves accumulating self-shadowing artifacts ...

With these facts in mind, Epic Games basically had rationale to implement virtual shadow mapping ...
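The rebuild-every-frame point can be sketched as a toy simulation: with continuous per-cluster LoD, almost any camera move changes some cluster's selected LoD, so a refit (which keeps BVH topology fixed) is never sufficient. The cluster counts and the LoD selection rule below are invented purely for illustration.

```python
# Toy sketch: per-cluster LoD selection vs. acceleration structure
# topology. The LoD rule and cluster counts are made up.
import math

def lod_for(cluster_distance):
    """Each cluster independently picks an LoD from its distance --
    far finer granularity than per-model LoD switching."""
    return max(0, int(math.log2(max(cluster_distance, 1.0))))

def needs_rebuild(distances_prev, distances_now):
    """A refit only moves existing triangles; if any cluster's LoD
    index changed, the triangle set itself changed -> full rebuild."""
    return any(lod_for(a) != lod_for(b)
               for a, b in zip(distances_prev, distances_now))

clusters = [float(d) for d in range(1, 1001)]   # frame N distances
moved    = [d * 1.05 for d in clusters]         # small camera move

rebuild = needs_rebuild(clusters, moved)
# Even a 5% change in distance pushes some cluster across an LoD
# boundary, so in practice every frame demands a rebuild.
```

This is the structural mismatch: DXR's refit path assumes stable topology, and Nanite's per-cluster streaming violates that assumption almost every frame.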
 
I don't think I understand what you mean -- both these shots look exactly how I'd expect. What are you trying to point out that's odd?


Nanite proxies tend to be (much) lower polygon count than meshes in a non-nanite game, but this makes sense. Rather than one mesh which is like, 10k tris, you get one mesh that's 10 million tris that you render for the visibility buffer and for shadow maps, and one mesh that's 2k tris that you use for mesh collisions, raytraced GI (and raytraced shadows, if you were to use them for some reason), etc.

In a perfect world you'd love to ray trace against the 10 million tri meshes, but we can't do that with current technology -- RT hardware isn't flexible enough to do it in principle, or fast enough to do it in practice. Instead, we could just ship UE4-style games that don't use Nanite at all and all have ~10k tri meshes, last-gen-style polycounts, and ray traced shadows. You can do that if you want.

A third solution is to use a cutting-edge shadow map system that can run with Nanite at 60 fps and have reasonably high-quality shadows against zillion-triangle geo. Then we can use the low-res proxies for RT GI and occlusion, because they're accurate enough for that, and if we need soft shadows we can cone trace against the SDFs, which is accurate enough in a lot of cases (although much less accurate in some, like the cube-with-a-nearby-point-light example). This is a reasonable set of tradeoffs, so it's the standard path for UE5 content.

That's his point. When RT is involved you have to use the Nanite fallback mesh to prevent problems from showing up, which leads to less accurate shadows than non-RT.

We'll have to see if the special NV branch actually addresses that.

Regards,
SB
 
I don't think I understand what you mean -- both these shots look exactly how I'd expect. What are you trying to point out that's odd?


Nanite proxies tend to be (much) lower polygon count than meshes in a non-nanite game, but this makes sense. Rather than one mesh which is like, 10k tris, you get one mesh that's 10 million tris that you render for the visibility buffer and for shadow maps, and one mesh that's 2k tris that you use for mesh collisions, raytraced GI (and raytraced shadows, if you were to use them for some reason), etc.

In a perfect world you'd love to ray trace against the 10 million tri meshes, but we can't do that with current technology -- RT hardware isn't flexible enough to do it in principle, or fast enough to do it in practice. Instead, we could just ship UE4-style games that don't use Nanite at all and all have ~10k tri meshes, last-gen-style polycounts, and ray traced shadows. You can do that if you want.

A third solution is to use a cutting-edge shadow map system that can run with Nanite at 60 fps and have reasonably high-quality shadows against zillion-triangle geo. Then we can use the low-res proxies for RT GI and occlusion, because they're accurate enough for that, and if we need soft shadows we can cone trace against the SDFs, which is accurate enough in a lot of cases (although much less accurate in some, like the cube-with-a-nearby-point-light example). This is a reasonable set of tradeoffs, so it's the standard path for UE5 content.
This is not for you but for everyone, just to understand. ;) Maybe I forgot to add the +1 in my post. I agree 100% with your post.
As I understand it, all that's being said here is that you can't use RT shadows with Nanite. So this doesn't impact RT global illumination, reflections, or AO?

So what's the issue then? Clearly there is still a big place for HWRT in Nanite based games if the above is true?
Exactly
 
As I understand it, all that's being said here is that you can't use RT shadows with Nanite. So this doesn't impact RT global illumination, reflections, or AO?

So what's the issue then? Clearly there is still a big place for HWRT in Nanite based games if the above is true?

I think it's premature to dismiss ray traced Nanite meshes altogether. UE5 is very much still a work in progress, and Epic will continue to experiment with hardware capabilities. We keep saying in this thread that high-density geometry and RT won't work on current hardware, but are there any actual stats to back that up? You can dial in the density you want from any given Nanite mesh, so there must be some threshold under which HWRT is practical.

Regarding shadow accuracy, I'm glad to see other folks champion pixel-perfect shadows. This is one of the killer features of RT that is underappreciated when rendering local or self shadows, where shadow maps simply lack sufficient resolution.
 
Very interesting interview: https://wccftech.com/destroy-all-hu...40-features-will-be-used-if-supported-by-ue5/

The new GeForce RTX 4000 Series also introduced Shader Execution Reordering (SER), Opacity Micro-Maps (OMM), and Displaced Micro-Mesh (DMM) to help with ray tracing optimization. All of these have to be explicitly enabled and set up by game developers. Are you planning to take advantage of any of them?

As we are using UE5 for our next game and try in general not to touch the core rendering pipeline of the engine, it highly depends on Epic’s roadmap on making optimizations using those new Nvidia optimizations.

The micro-mesh technology is very similar to what Nanite is doing in UE5. Consequently, if Epic decides to implement some Nanite-specific optimizations taking advantage of DMM, we will definitely use this. As it applies to all new tech, if the implementation is there at the right time in the project for us, we will be able to take full advantage of those new technologies.
This is what I've been telling @DavidGraham over and over again. When Nvidia-specific features like RTXGI/RTXDI and the new Ada Lovelace features are not implemented directly into the engine, many developers will just ignore them.

That is why Epic has to integrate these features into the release branch of UE5; only then will these features get widespread use.
 
These features will not see widespread use even if put into the main UE branch, IMO. Proprietary features have never seen broad usage when they require changes to the art workflow or core rendering.
 
I think it's premature to dismiss ray traced Nanite meshes altogether. UE5 is very much still a work in progress, and Epic will continue to experiment with hardware capabilities. We keep saying in this thread that high-density geometry and RT won't work on current hardware, but are there any actual stats to back that up? You can dial in the density you want from any given Nanite mesh, so there must be some threshold under which HWRT is practical.

Regarding shadow accuracy, I'm glad to see other folks champion pixel-perfect shadows. This is one of the killer features of RT that is underappreciated when rendering local or self shadows, where shadow maps simply lack sufficient resolution.
When we look at the BFV presentation on ray traced reflections, it took 64 ms to do a naïve rebuild of the acceleration structure at full quality on what were firmly last-generation-quality polygonal models. At least half of the tricks mentioned with regards to acceleration structure management in Nvidia's best practices for ray tracing aren't going to work with full-quality Nanite meshes. With BFV and other similar last-generation games, you have the benefit that these games do somewhat harsh LoD transitions. With Nanite, you don't get that benefit for applying more clever acceleration structure management, since LoD transitions happen more frequently and they change per-cluster rather than per-model ...

The only workarounds available would be doing no LoD transitions and building the acceleration structure once per level while keeping every mesh at full detail. In this scenario, you'd still have to update the acceleration structure for dynamic scenes to refit geometry, and memory consumption would quickly become an issue, which would place soft restrictions on level designs (smaller levels/worlds/maps) ...

The other workaround is to introduce a fixed-function hardware-accelerated unit that can rebuild the acceleration structure every frame, to elegantly handle high-detail on-screen geometry + decent-quality off-screen geometry + LoD/level streaming for Nanite meshes. You would have a lower-quality scene representation, but performance would massively improve and memory consumption would be more reasonable compared to the aforementioned case. The drawback is spending a good chunk of the die size to make the hardware robust enough for Nanite meshes ...

I think we can safely dismiss the idea that our acceleration structures will be able to keep up with the full generality of Nanite ...
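The 64 ms BFV figure can be turned into a rough budget argument, assuming (purely for illustration) that build cost scales linearly with triangle count and guessing a last-gen scene size; neither number comes from the presentation itself.

```python
# Back-of-the-envelope: scale the quoted 64 ms naive rebuild to a
# Nanite-class scene. Both triangle counts are assumed, and linear
# scaling of build cost is itself an assumption.

LASTGEN_TRIS  = 5_000_000      # guessed BFV-era scene size
LASTGEN_BUILD = 64.0           # ms, from the presentation
FRAME_BUDGET  = 1000.0 / 60.0  # ms per frame at 60 fps

ms_per_million = LASTGEN_BUILD / (LASTGEN_TRIS / 1_000_000)  # 12.8 ms/Mtri

nanite_tris  = 100_000_000     # a modest full-detail Nanite scene, assumed
nanite_build = ms_per_million * (nanite_tris / 1_000_000)    # 1280 ms

# Ignoring everything else in the frame, the rebuild alone would need
# roughly a 77x speedup just to fit inside the 16.6 ms budget.
speedup_needed = nanite_build / FRAME_BUDGET
```

Even with generous assumptions, the gap is orders of magnitude, which is the core of the argument for either fixed-function build hardware or not tracing full-detail Nanite at all.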
 
When we look at the BFV presentation on ray traced reflections, it took 64 ms to do a naïve rebuild of the acceleration structure at full quality on what were firmly last-generation-quality polygonal models. At least half of the tricks mentioned with regards to acceleration structure management in Nvidia's best practices for ray tracing aren't going to work with full-quality Nanite meshes. With BFV and other similar last-generation games, you have the benefit that these games do somewhat harsh LoD transitions. With Nanite, you don't get that benefit for applying more clever acceleration structure management, since LoD transitions happen more frequently and they change per-cluster rather than per-model ...

The only workarounds available would be doing no LoD transitions and building the acceleration structure once per level while keeping every mesh at full detail. In this scenario, you'd still have to update the acceleration structure for dynamic scenes to refit geometry, and memory consumption would quickly become an issue, which would place soft restrictions on level designs (smaller levels/worlds/maps) ...

The other workaround is to introduce a fixed-function hardware-accelerated unit that can rebuild the acceleration structure every frame, to elegantly handle high-detail on-screen geometry + decent-quality off-screen geometry + LoD/level streaming for Nanite meshes. You would have a lower-quality scene representation, but performance would massively improve and memory consumption would be more reasonable compared to the aforementioned case. The drawback is spending a good chunk of the die size to make the hardware robust enough for Nanite meshes ...

I think we can safely dismiss the idea that our acceleration structures will be able to keep up with the full generality of Nanite ...

What do you make of the note in the UE5 thread from earlier today on experimental support for tracing Nanite meshes? It's doubtful the early work on BFV is still relevant today, as there has been rapid iteration on RT implementations since that time and developers have a much better understanding of the API and hardware.

  • (Experimental) You can enable initial support for native ray tracing and path tracing of Nanite meshes by setting r.RayTracing.Nanite.Mode=1. This approach preserves all detail while using significantly less GPU memory than zero-error fallback meshes. Early tests show a 5-20% performance cost over ray tracing a low-quality fallback mesh, but results may vary based on content.
 
What do you make of the note in the UE5 thread from earlier today on experimental support for tracing Nanite meshes? It's doubtful the early work on BFV is still relevant today, as there has been rapid iteration on RT implementations since that time and developers have a much better understanding of the API and hardware.

  • (Experimental) You can enable initial support for native ray tracing and path tracing of Nanite meshes by setting r.RayTracing.Nanite.Mode=1. This approach preserves all detail while using significantly less GPU memory than zero-error fallback meshes. Early tests show a 5-20% performance cost over ray tracing a low-quality fallback mesh, but results may vary based on content.
The note wasn't one bit helpful. What content? Where are the detailed performance metrics? It could be anything from 16 ms -> ~19 ms or 800 ms -> 960 ms if we apply the 20% figure without specifics. Also, they enabled this for "path tracing", so how do you know this change isn't intended to improve offline rendering support as opposed to real-time rendering?

BFV is very much a relevant example even today, because hardware design hasn't evolved in any way to make building acceleration structures inherently faster besides more brute force, and the situation has arguably gotten worse with respect to Nanite's per-cluster LoD streaming system. DXR 1.1 being 3 years old at this point also reflects the reality that hardware design hasn't changed all that much to necessitate another new API revision ...
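The objection about the relative figure is easy to make concrete: a "5-20% over the fallback mesh" number says nothing without the baseline it multiplies. Both baselines below are hypothetical, chosen only to show how far apart the absolute costs can be.

```python
# A relative overhead figure is meaningless without its baseline.
# Both baseline frame times here are hypothetical.

def with_overhead(baseline_ms, overhead):
    """Apply a fractional overhead to a baseline cost in ms."""
    return baseline_ms * (1.0 + overhead)

cheap_baseline = with_overhead(16.0, 0.20)    # 16 ms  -> 19.2 ms: shippable
heavy_baseline = with_overhead(800.0, 0.20)   # 800 ms -> 960 ms: offline only

# The same "+20%" headline covers both a real-time scenario and one
# that is two orders of magnitude away from interactive.
```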
 
The note wasn't one bit helpful. What content? Where are the detailed performance metrics? It could be anything from 16 ms -> ~19 ms or 800 ms -> 960 ms if we apply the 20% figure without specifics. Also, they enabled this for "path tracing", so how do you know this change isn't intended to improve offline rendering support as opposed to real-time rendering?

BFV is very much a relevant example even today, because hardware design hasn't evolved in any way to make building acceleration structures inherently faster besides more brute force, and the situation has arguably gotten worse with respect to Nanite's per-cluster LoD streaming system. DXR 1.1 being 3 years old at this point also reflects the reality that hardware design hasn't changed all that much to necessitate another new API revision ...

I didn't imply the API or hardware has changed significantly. However, developers are certainly better equipped today than they were in 2018 to efficiently integrate RT into their games. The BFV example isn't very relevant, as several other games have already surpassed it in both IQ and performance.

The UE5.1 release notes didn't offer any details, but they are evidence that things are constantly evolving. DXR obviously doesn't support Nanite's continuous LOD out of the box, but maybe they've found a creative way to make it work.
 
I just wanted to make a post about how disappointed I am in the lack of adoption/utilization of Xbox technologies by MS and 3rd parties thus far this generation.

At the beginning of the generation, when MS announced the Series X.. they did an excellent job of "marketing" the features of the console. I loved their fancy name for their architecture.

The VELOCITY Architecture
DirectStorage
- a new I/O subsystem allowing them to unleash the power and speed of the SSD.
Dedicated hardware decompression block - ensuring lightning fast loading and smaller game storage footprints.
Sampler Feedback Streaming - texture asset loading in fine granularity, multiplier for storage bandwidth.

New Graphics Features
Variable Rate Shading
- more efficient shading, saving resources for the parts of the image that actually matter most.
Mesh Shaders - to enable geometric detail never before thought possible.
DXR Ray Tracing - Hardware accelerated Ray Tracing support
DirectML - Machine learning hardware support for advanced ML algorithms (reconstruction, AI, etc.)

System Features
Quick Resume
- Quickly switch between multiple games on the go.
VRR - support for variable refresh rates, improving latency and reducing screen tearing.


Now.. both of the system features have gotten very good and extensive use. Quick Resume and VRR are great additions.. not going to dispute that. However, the rest of the console's features have felt completely invisible. The Velocity Architecture hasn't really been utilized at all yet. Games basically load as quickly as they would on a PC with a similar-speed SSD and CPU. We're 2 years into the generation and there are still no big Xbox exclusives advertising that they take complete advantage of the Velocity Architecture and actually demonstrating something mind-blowing with it. No games are using Sampler Feedback Streaming to reduce texture footprint and push vastly higher detail. There are no games with Mesh Shaders proudly touting their advanced geometry and object counts..

Variable Rate Shading has gotten some use, but it's mostly been a negative, with varying levels of quality, and it's largely considered a crutch that Xbox was leaning on during its first year on the market..

And then there's Ray Tracing. Outside of Forza Horizon 5, which, let's be honest... has a pathetic RT implementation... there's NOTHING from Xbox taking advantage of this stuff. Even Minecraft, which was working in prototype form, was apparently scrapped and never released.

So basically you have a whole bunch of nothing from MS so far. Now I know people will say that they haven't really begun to release their next-gen games yet... but we're 2 years in, and whether all of this stuff comes in year 3 or not... it's pretty sad that we haven't seen any of it bear fruit.

Perhaps now, with DirectStorage on PC and the new wave of fully next-gen/PC-only titles coming, we'll see them start to tout this stuff... but I'm skeptical. DirectStorage on PC feels tangibly cool... whereas on Xbox it just feels like it's matching up to where PC already was. I hope I'm wrong about that and that we start seeing some real next-level ideas from MS that take advantage of this soon.
 
I just wanted to make a post about how disappointed I am in the lack of adoption/utilization of Xbox technologies by MS and 3rd parties thus far this generation.

At the beginning of the generation, when MS announced the Series X.. they did an excellent job of "marketing" the features of the console. I loved their fancy name for their architecture.

The VELOCITY Architecture
DirectStorage
- a new I/O subsystem allowing them to unleash the power and speed of the SSD.
Dedicated hardware decompression block - ensuring lightning fast loading and smaller game storage footprints.
Sampler Feedback Streaming - texture asset loading in fine granularity, multiplier for storage bandwidth.

New Graphics Features
Variable Rate Shading
- more efficient shading, saving resources for the parts of the image that actually matter most.
Mesh Shaders - to enable geometric detail never before thought possible.
DXR Ray Tracing - Hardware accelerated Ray Tracing support
DirectML - Machine learning hardware support for advanced ML algorithms (reconstruction, AI, etc.)

System Features
Quick Resume
- Quickly switch between multiple games on the go.
VRR - support for variable refresh rates, improving latency and reducing screen tearing.


Now.. both of the system features have gotten very good and extensive use. Quick Resume and VRR are great additions.. not going to dispute that. However, the rest of the console's features have felt completely invisible. The Velocity Architecture hasn't really been utilized at all yet. Games basically load as quickly as they would on a PC with a similar-speed SSD and CPU. We're 2 years into the generation and there are still no big Xbox exclusives advertising that they take complete advantage of the Velocity Architecture and actually demonstrating something mind-blowing with it. No games are using Sampler Feedback Streaming to reduce texture footprint and push vastly higher detail. There are no games with Mesh Shaders proudly touting their advanced geometry and object counts..

Variable Rate Shading has gotten some use, but it's mostly been a negative, with varying levels of quality, and it's largely considered a crutch that Xbox was leaning on during its first year on the market..

And then there's Ray Tracing. Outside of Forza Horizon 5, which, let's be honest... has a pathetic RT implementation... there's NOTHING from Xbox taking advantage of this stuff. Even Minecraft, which was working in prototype form, was apparently scrapped and never released.

So basically you have a whole bunch of nothing from MS so far. Now I know people will say that they haven't really begun to release their next-gen games yet... but we're 2 years in, and whether all of this stuff comes in year 3 or not... it's pretty sad that we haven't seen any of it bear fruit.

Perhaps now, with DirectStorage on PC and the new wave of fully next-gen/PC-only titles coming, we'll see them start to tout this stuff... but I'm skeptical. DirectStorage on PC feels tangibly cool... whereas on Xbox it just feels like it's matching up to where PC already was. I hope I'm wrong about that and that we start seeing some real next-level ideas from MS that take advantage of this soon.

Outside of demos, isn't Unreal Engine 5 with Nanite and Lumen in the same boat?

Epic has had access to PC hardware capable of supporting UE5 far longer than devs have had access to XS dev kits.

Yet here we are.

🤷🏻‍♂️
 
I just wanted to make a post about how disappointed I am in the lack of adoption/utilization of Xbox technologies by MS and 3rd parties thus far this generation.

At the beginning of the generation, when MS announced the Series X.. they did an excellent job of "marketing" the features of the console. I loved their fancy name for their architecture.

The VELOCITY Architecture
DirectStorage
- a new I/O subsystem allowing them to unleash the power and speed of the SSD.
Dedicated hardware decompression block - ensuring lightning fast loading and smaller game storage footprints.
Sampler Feedback Streaming - texture asset loading in fine granularity, multiplier for storage bandwidth.

New Graphics Features
Variable Rate Shading
- more efficient shading, saving resources for the parts of the image that actually matter most.
Mesh Shaders - to enable geometric detail never before thought possible.
DXR Ray Tracing - Hardware accelerated Ray Tracing support
DirectML - Machine learning hardware support for advanced ML algorithms (reconstruction, AI, etc.)

System Features
Quick Resume
- Quickly switch between multiple games on the go.
VRR - support for variable refresh rates, improving latency and reducing screen tearing.


Now.. both of the system features have gotten very good and extensive use. Quick Resume and VRR are great additions.. not going to dispute that. However, the rest of the console's features have felt completely invisible. The Velocity Architecture hasn't really been utilized at all yet. Games basically load as quickly as they would on a PC with a similar-speed SSD and CPU. We're 2 years into the generation and there are still no big Xbox exclusives advertising that they take complete advantage of the Velocity Architecture and actually demonstrating something mind-blowing with it. No games are using Sampler Feedback Streaming to reduce texture footprint and push vastly higher detail. There are no games with Mesh Shaders proudly touting their advanced geometry and object counts..

Variable Rate Shading has gotten some use, but it's mostly been a negative, with varying levels of quality, and it's largely considered a crutch that Xbox was leaning on during its first year on the market..

And then there's Ray Tracing. Outside of Forza Horizon 5, which, let's be honest... has a pathetic RT implementation... there's NOTHING from Xbox taking advantage of this stuff. Even Minecraft, which was working in prototype form, was apparently scrapped and never released.

So basically you have a whole bunch of nothing from MS so far. Now I know people will say that they haven't really begun to release their next-gen games yet... but we're 2 years in, and whether all of this stuff comes in year 3 or not... it's pretty sad that we haven't seen any of it bear fruit.

Perhaps now, with DirectStorage on PC and the new wave of fully next-gen/PC-only titles coming, we'll see them start to tout this stuff... but I'm skeptical. DirectStorage on PC feels tangibly cool... whereas on Xbox it just feels like it's matching up to where PC already was. I hope I'm wrong about that and that we start seeing some real next-level ideas from MS that take advantage of this soon.
Eh. Anything that can't be used in a uniform fashion across the rest of multiplatform console development will go by the wayside unless it's in an Xbox exclusive. Same with any features Sony might have. That will last beyond the cross-gen phase.

MS hyped up a lot of bullet points because it was something they could get the hype cycle to run on. That has nothing to do with those things actually being utilized in games. The same goes for Sony's talk about the SSD, which also won't be exploited outside of exclusives, even if all devs benefit from the consoles having SSDs in general.
 
Digital Foundry posted a new analysis of A Plague Tale: Requiem and got true 1:1 comparisons between PC GPUs and consoles.

With the exact same settings, a 2070 Super delivers a 3.8% lead over the PS5,

while the PS5 is faster than an RX 5700 by 28% (so I think in this title it is vastly faster than an RX 5700 XT as well, which is not shown). The game doesn't seem to scale as well on AMD compared to Nvidia.
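Those two relative results can be chained to get the figure not shown directly; this is just illustrative arithmetic on the quoted percentages.

```python
# Chaining Digital Foundry's two relative results (illustrative
# arithmetic only, using the quoted 3.8% and 28% figures).

rtx2070s_vs_ps5 = 1.038   # 2070 Super leads PS5 by 3.8%
ps5_vs_rx5700   = 1.28    # PS5 leads RX 5700 by 28%

# Relative scaling composes multiplicatively, not additively.
rtx2070s_vs_rx5700 = rtx2070s_vs_ps5 * ps5_vs_rx5700   # ~1.329

# So in this title the 2070 Super comes out roughly 33% ahead of the
# RX 5700, consistent with the game scaling better on Nvidia here.
```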

 