Digital Foundry Article Technical Discussion [2022]

Surprising how well this port does on PC. They are (or are considering?) bringing over the same dialed-down settings the PS5 uses; currently the PC version doesn't dial things down as far as the PS5 does. Nixxes also loves being able to push way beyond what the console version does. From the EG article:

"So as I think you know, from your early analysis, to achieve 60fps on PS5 with ray tracing it makes other compromises [even beyond ray tracing]. So it turns down crowd density for example, or there's fewer cars around. And so that compensates for some of those CPU things, and we didn't make that very easy for the user to do in the [early review] build you played. So we are actually offering up some more options to make that to allow that to be better balanced [in the retail build].

And then if we don't dial down things that are dialled down on the console, we now have even more work to do on the CPU."
 
Settings that are dialed down on the consoles should always be available on the PC, so configurations with CPUs close to the PS5, like my i7 9750H laptop, will be able to play these games nicely.
 
Interesting to hear directly from Nixxes that the initial PC release was doing WAY more than the PS5 version, with some things not reducible to the level they are at on PS5. Good on them for working to allow more things to be reduced post-release, though.

Great example of how it's difficult or even impossible to directly compare PC releases to console releases a lot of the time.

Regards,
SB
 
Alex's tech interview with Nixxes is up on Eurogamer.net


The section about PSO/Shader compilation confirms what we already knew. It mostly comes down to how tedious it is for developers (QA) to collect PSO data so they can use it to pre-compile shaders during loading screens, or in the background when assets are loaded.
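For the curious, here's roughly what that collect-then-prewarm flow looks like at the API level. This is a minimal D3D12 sketch of my own (not Nixxes' actual code), where the library blob persisted between runs plays the role of the PSO data their QA passes effectively collect:

#include <d3d12.h>
#include <wrl/client.h>
#include <vector>
using Microsoft::WRL::ComPtr;

// Open (or create) a pipeline library from a blob cached on disk by a
// previous run. An empty blob just yields an empty library.
ComPtr<ID3D12PipelineLibrary> OpenLibrary(ID3D12Device1* device,
                                          const std::vector<char>& blob)
{
    ComPtr<ID3D12PipelineLibrary> lib;
    device->CreatePipelineLibrary(blob.data(), blob.size(), IID_PPV_ARGS(&lib));
    return lib;
}

// During a loading screen: try the cache first; only pay for a full driver
// compile on a miss, then store the result so the *next* run can prewarm it.
ComPtr<ID3D12PipelineState> GetPSO(ID3D12Device1* device,
                                   ID3D12PipelineLibrary* lib,
                                   const wchar_t* name,
                                   const D3D12_GRAPHICS_PIPELINE_STATE_DESC& desc)
{
    ComPtr<ID3D12PipelineState> pso;
    if (FAILED(lib->LoadGraphicsPipeline(name, &desc, IID_PPV_ARGS(&pso))))
    {
        device->CreateGraphicsPipelineState(&desc, IID_PPV_ARGS(&pso));
        lib->StorePipeline(name, pso.Get());
    }
    return pso;
}

The hard part is exactly what the interview describes: the library only helps for pipelines that someone (QA, telemetry) has actually encountered at least once.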

Hats off to Nixxes for doing it right, for as much of a pain in the butt as it is, by having their QA teams go through and generate the best data possible to ensure this isn't an issue. I wish more studios took the issue as seriously as they do.

It's also an indication of what a huge hassle this is for smaller teams who don't have a QA team of testers at the ready who can play through the entire game again if they want to optimize a few shaders late in development. I don't know how DX13/14 or future Vulkan versions could deal with this, but I'm interested to see the approaches.

Insomniac also deserves some credit for managing their materials well so there's not a huge list of shaders to compile to begin with - the fact that some can be done during ~4 second loading screens is likely a testament to that. It's a long game and semi-open world too, but ultimately it's a narrative, story-driven game with a straightforward path to completion. Games with more dynamic, branching content just emphasize to me how the 'play every possible inch of this game' approach to gathering PSO data is such a struggle for some teams/game designs.

It's great that they do this and I wish more devs would prioritize it for PC ports, no doubt, but otoh it further underlines how broken this approach to shader compilation seems atm. While we wait for potential changes in APIs going forward, to me this indicates we really, really need some method of sharing compiled PSOs for D3D games that works across all storefronts.

Settings that are dialed down on the consoles should always be available on the PC, so configurations with CPUs close to the PS5, like my i7 9750H laptop, will be able to play these games nicely.

That's true, but they have since given the options to reduce traffic/crowd density, and it actually makes very little difference to the CPU bottlenecks in this game (both with RT and without - this game is not only CPU-bound when using RT!). The performance deficit the PC version is exhibiting here is not just about equalizing settings.

Maybe more stuff was said off the record, but dunno - I kinda got the impression from that interview that we shouldn't expect many significant performance increases with future patches; this is more of an engine/architecture bottleneck. We'll see.
 
This was interesting btw:

Nixxes said:
Jurjen Katzman: We certainly felt like DLSS didn't work very well, but it wasn't so much changing DLSS specifically but more changing the stability of reflections overall and then DLSS picks up on that.

Rebecca Fernandez: Because on PlayStation 5, they have the temporal upscaling like the temporal injection upscaler. So they built their reflections with that in mind. So when we don't have that there, we have to make some adjustments, not just for DLSS. I mean, also, if you run with no anti-aliasing whatsoever [Alex laughs] - people do! - we still need to make sure that looks okay.

This is mostly referring to RT reflections with DLSS, but I find that in many spots in the game DLSS doesn't play well with certain effects - often no reconstruction tech at all (just bilinear, I imagine, if the upscaler is set to 'off') with dynamic res will produce a better result. I figured the issue was primarily that some effects were tailored to Insomniac's temporal upscaler, and DLSS kind of freaks out in those same specific conditions.

Hopefully they can continue to improve this. DLSS outside of these cases generally looks very good, but you do tend to perform these actions quite often, and as such DLSS 'overall' can be more distracting than just TAA upscaling.
 
Alex's tech interview with Nixxes is up on Eurogamer.net


The section about PSO/Shader compilation confirms what we already knew. It mostly comes down to how tedious it is for developers (QA) to collect PSO data so they can use it to pre-compile shaders during loading screens, or in the background when assets are loaded.

Hats off to Nixxes for doing it right, for as much of a pain in the butt as it is, by having their QA teams go through and generate the best data possible to ensure this isn't an issue. I wish more studios took the issue as seriously as they do.

Some awesome information in there. Here's a few key points I picked out:

  • The UMA of the PS5 is much simpler to work with (more dev friendly) than the split memory pools of the PC and it takes quite a lot of work to ensure memory management is implemented correctly on the PC side. No real info on whether there is a performance impact of the split pools vs UMA (positive or negative) when memory management is done well though.
  • The game doesn't use DirectStorage, and they don't see it having a major impact on load times because the IO stack and decompression aren't the major bottleneck here. They see more benefit in-game, but only from the decompression offload, which isn't part of DirectStorage yet. That may be why they didn't bother to include it. They did do some experiments with it though, which explains the references in the code base.
  • The high in-game CPU overhead relative to the PS5 seems down to two major items: decompression, which they apparently throw a whole core at(!) and which they mention DirectStorage GPU decompression might address, and API overhead. So perhaps there's not much room to improve things through patches until DS GPU decompression is up and running (see the sketch after this list). It would be massively interesting to see the impact if Nixxes were to patch it in when available.
  • The longer load times on PC sound like they are more down to shader compilation and BVH generation for static objects - neither of which is required on console, as I understand it. So the only answer here may be throwing lots more CPU power at it.
  • As we've often discussed here, the "min spec" for a game isn't necessarily the lowest spec the game will run on, or even the lowest spec it's been tested on.
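To make the decompression-offload point concrete, here's a minimal sketch of my own of what the GPU path looks like against the public DirectStorage API (the GDeflate format arrived with DirectStorage 1.1; the file name and sizes here are hypothetical):

#include <dstorage.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Enqueue one compressed asset read whose decode runs on the GPU instead of
// occupying a CPU core. 'compressedSize'/'uncompressedSize' would come from
// the asset's metadata (hypothetical here).
void LoadAssetGpuDecode(ID3D12Device* device, ID3D12Resource* dest,
                        UINT32 compressedSize, UINT32 uncompressedSize)
{
    ComPtr<IDStorageFactory> factory;
    DStorageGetFactory(IID_PPV_ARGS(&factory));

    ComPtr<IDStorageFile> file;
    factory->OpenFile(L"assets/example.gdeflate", IID_PPV_ARGS(&file));

    DSTORAGE_QUEUE_DESC qdesc = {};
    qdesc.SourceType = DSTORAGE_REQUEST_SOURCE_FILE;
    qdesc.Capacity   = DSTORAGE_MAX_QUEUE_CAPACITY;
    qdesc.Priority   = DSTORAGE_PRIORITY_NORMAL;
    qdesc.Device     = device;
    ComPtr<IDStorageQueue> queue;
    factory->CreateQueue(&qdesc, IID_PPV_ARGS(&queue));

    DSTORAGE_REQUEST req = {};
    req.Options.SourceType        = DSTORAGE_REQUEST_SOURCE_FILE;
    req.Options.DestinationType   = DSTORAGE_REQUEST_DESTINATION_BUFFER;
    req.Options.CompressionFormat = DSTORAGE_COMPRESSION_FORMAT_GDEFLATE; // GPU decode
    req.Source.File.Source = file.Get();
    req.Source.File.Offset = 0;
    req.Source.File.Size   = compressedSize;    // bytes on disk
    req.UncompressedSize   = uncompressedSize;  // bytes after decode
    req.Destination.Buffer.Resource = dest;
    req.Destination.Buffer.Offset   = 0;
    req.Destination.Buffer.Size     = uncompressedSize;

    queue->EnqueueRequest(&req);
    queue->Submit(); // completion fence omitted for brevity
}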
 
This was interesting btw:



This is mostly referring to RT reflections with DLSS, but I find that in many spots in the game DLSS doesn't play well with certain effects - often no reconstruction tech at all (just bilinear, I imagine, if the upscaler is set to 'off') with dynamic res will produce a better result. I figured the issue was primarily that some effects were tailored to Insomniac's temporal upscaler, and DLSS kind of freaks out in those same specific conditions.

Hopefully they can continue to improve this. DLSS outside of these cases generally looks very good, but you do tend to perform these actions quite often, and as such DLSS 'overall' can be more distracting than just TAA upscaling.
There are certain aspects of reflections which still exhibit nasty ghosting depending on the glass's material properties, and interior reflections seem to be missing interpolation, looking quite choppy/delayed. Puddle reflections can also be quite grainy and shimmer a lot, depending on resolution.

I'd like to see them continue to fix, tweak, and refine things further... but I am perfectly happy with where the game is right at this moment.
 
The UMA of the PS5 is much simpler to work with (more dev friendly) than the split memory pools of the PC and it takes quite a lot of work to ensure memory management is implemented correctly on the PC side. No real info on whether there is a performance impact of the split pools vs UMA (positive or negative) when memory management is done well though.

From the previous generation at least, it was generally said that UMA and split memory pools each had their own advantages and disadvantages. I remember AF being one of them, and memory contention of course. UMA is easier to develop for; this has always been true as far back as the OG Xbox (which had a single memory pool too).
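For anyone who hasn't touched it: a large chunk of the "quite a lot of work" on PC is explicit staging. Here's a minimal D3D12 sketch (my own, assuming a device and a recording command list; error handling omitted) of the upload-heap copy that a UMA console simply doesn't need:

#include <d3d12.h>
#include <wrl/client.h>
#include <cstring>
using Microsoft::WRL::ComPtr;

// On PC, data destined for fast GPU-local memory is first written to a
// CPU-visible upload heap, then copied across PCIe. On a UMA console the
// CPU can write straight into the same pool the GPU reads from.
ComPtr<ID3D12Resource> UploadBuffer(ID3D12Device* device,
                                    ID3D12GraphicsCommandList* cmdList,
                                    const void* data, UINT64 size,
                                    ComPtr<ID3D12Resource>& staging)
{
    D3D12_RESOURCE_DESC desc = {};
    desc.Dimension        = D3D12_RESOURCE_DIMENSION_BUFFER;
    desc.Width            = size;
    desc.Height           = 1;
    desc.DepthOrArraySize = 1;
    desc.MipLevels        = 1;
    desc.SampleDesc.Count = 1;
    desc.Layout           = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;

    // 1) CPU-visible staging buffer in the upload heap.
    D3D12_HEAP_PROPERTIES upload = { D3D12_HEAP_TYPE_UPLOAD };
    device->CreateCommittedResource(&upload, D3D12_HEAP_FLAG_NONE, &desc,
        D3D12_RESOURCE_STATE_GENERIC_READ, nullptr, IID_PPV_ARGS(&staging));
    void* mapped = nullptr;
    staging->Map(0, nullptr, &mapped);
    std::memcpy(mapped, data, size);
    staging->Unmap(0, nullptr);

    // 2) GPU-local destination in the default heap, plus the actual copy.
    D3D12_HEAP_PROPERTIES gpuLocal = { D3D12_HEAP_TYPE_DEFAULT };
    ComPtr<ID3D12Resource> dest;
    device->CreateCommittedResource(&gpuLocal, D3D12_HEAP_FLAG_NONE, &desc,
        D3D12_RESOURCE_STATE_COPY_DEST, nullptr, IID_PPV_ARGS(&dest));
    cmdList->CopyBufferRegion(dest.Get(), 0, staging.Get(), 0, size);
    return dest; // caller transitions state; keep 'staging' alive until the copy completes
}

And that's the easy case - textures, residency, and eviction under memory pressure are where the real bookkeeping lives.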
 
Some awesome information in there. Here's a few key points I picked out:

  • The UMA of the PS5 is much simpler to work with (more dev friendly) than the split memory pools of the PC and it takes quite a lot of work to ensure memory management is implemented correctly on the PC side. No real info on whether there is a performance impact of the split pools vs UMA (positive or negative) when memory management is done well though.
  • The game doesn't use DirectStorage, and they don't see it having a major impact on load times because the IO stack and decompression aren't the major bottleneck here. They see more benefit in-game, but only from the decompression offload, which isn't part of DirectStorage yet. That may be why they didn't bother to include it. They did do some experiments with it though, which explains the references in the code base.
  • The high in-game CPU overhead relative to the PS5 seems down to two major items: decompression, which they apparently throw a whole core at(!) and which they mention DirectStorage GPU decompression might address, and API overhead. So perhaps there's not much room to improve things through patches until DS GPU decompression is up and running. It would be massively interesting to see the impact if Nixxes were to patch it in when available.
  • The longer load times on PC sound like they are more down to shader compilation and BVH generation for static objects - neither of which is required on console, as I understand it. So the only answer here may be throwing lots more CPU power at it.
  • As we've often discussed here, the "min spec" for a game isn't necessarily the lowest spec the game will run on, or even the lowest spec it's been tested on.
Yep, and I figured as much after I saw how long the RT Geometry Detail setting takes to kick in while in-game.

Loading my save from main menu:
RT(on) - 3.67sec
RT(off) - 3.01sec
Changing RT in game - 3.61sec

So basically the total load time is about what it takes for the RT BVH to be built, and without that step it's around half a second quicker. There's not much that can be done to further reduce that, unless they added an option in the game's menu to compile all shaders ahead of time. That's probably the only way developers will get under that 2-second north star. Like I figured with Forspoken, the bigger impact of DirectStorage might come in the form of reduced CPU utilization during gameplay and asset streaming before it lowers initial loading times: ensuring the highest-resolution assets and textures can be streamed in incredibly efficiently, reducing pop-in and the like.
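That half-second or so lines up with the acceleration structure builds for static geometry, which on PC happen at load for whatever GPU/driver is present. A rough DXR sketch of the per-mesh work (my illustration, not the game's code; 'geomDesc' and the scratch/result buffers are assumed set up by the caller):

#include <d3d12.h>

// Build one bottom-level acceleration structure (BLAS) for a static mesh.
// Console builds can effectively ship with this done ahead of time; on PC
// it's redone at load, which is the extra time being measured above.
void BuildStaticBlas(ID3D12Device5* device,
                     ID3D12GraphicsCommandList4* cmdList,
                     const D3D12_RAYTRACING_GEOMETRY_DESC& geomDesc,
                     ID3D12Resource* scratch, ID3D12Resource* result)
{
    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_INPUTS inputs = {};
    inputs.Type  = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL;
    inputs.Flags = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PREFER_FAST_TRACE;
    inputs.DescsLayout    = D3D12_ELEMENTS_LAYOUT_ARRAY;
    inputs.NumDescs       = 1;
    inputs.pGeometryDescs = &geomDesc;

    // Query scratch/result sizes; the caller's buffers must be at least this big.
    D3D12_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO prebuild = {};
    device->GetRaytracingAccelerationStructurePrebuildInfo(&inputs, &prebuild);

    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC build = {};
    build.Inputs = inputs;
    build.ScratchAccelerationStructureData = scratch->GetGPUVirtualAddress();
    build.DestAccelerationStructureData    = result->GetGPUVirtualAddress();
    cmdList->BuildRaytracingAccelerationStructure(&build, 0, nullptr);
}

Multiply that by every static mesh in a city-sized game and the extra ~0.6s on load is easy to believe.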
 
Some awesome information in there. Here's a few key points I picked out:

  • The high in-game CPU overhead relative to the PS5 seems down to two major items: decompression, which they apparently throw a whole core at(!) and which they mention DirectStorage GPU decompression might address, and API overhead. So perhaps there's not much room to improve things through patches until DS GPU decompression is up and running. It would be massively interesting to see the impact if Nixxes were to patch it in when available.

While it's not perfect without RT, the biggest CPU hit compared to the PS5 is with ray tracing, so I'm not sure texture decompression is really the main factor behind the CPU limitation most are complaining about.

I'd like to see a better benchmark that incorporates more dynamic swinging through the city, so computerbase.de's CPU benchmarks aren't the best example, but in a CPU-limited scenario without RT there is almost no difference from a 6-core i5 to a 16-core i9 in the most stressful moments:

[Image: computerbase.de CPU scaling chart, no RT]


the bigger impact of DirectStorage might come in the form of reduced CPU utilization during gameplay and asset streaming before it lowers initial loading times: ensuring the highest-resolution assets and textures can be streamed in incredibly efficiently, reducing pop-in and the like.

That is still an issue in the latest patch - some textures on some buildings don't stream in until you get closer - but it's so sporadic I'm not sure whether it's more of a mip bug than a processing limitation. It doesn't really make sense that you would have 95% of the city in perfect clarity while one lone building, or a patch on a guard's uniform, stays at a low-res mip if it's a processing issue.

There are certain aspects of reflections which still exhibit nasty ghosting depending on the glass's material properties, and interior reflections seem to be missing interpolation, looking quite choppy/delayed. Puddle reflections can also be quite grainy and shimmer a lot, depending on resolution.

Tree shadows can also shimmer pretty significantly vs the PS5, but only under some specific lighting conditions. They have made some DLSS improvements; while ghosting was never a big concern for me, there were some horrible black trails in front of lit billboards in Times Square that have been significantly reduced with later patches.

I'd like to see them continue to fix, tweak, and refine things further... but I am perfectly happy with where the game is right at this moment.

I'll echo what I said in my initial write-up: it's overall good, but while the quality-of-life nods in terms of UI and features are great to see, they still don't necessarily trump final performance for me. As it stands now, you basically need the fastest x86 CPU on the planet to have a hope of maintaining above 60fps with RT, and you probably still wouldn't reach the framerate the PS5 version gets when running with VRR in many situations.

Granted, you can do this at a higher res with a top-end GPU (and the selection will be much better in a few months), but I think this is the first time we've seen a release where top-end PC hardware doesn't greatly outpace the native console version. With the poor frame pacing on lower-end GPUs (and that means anything below a 2060 Super), this is a pretty restrictive title in terms of PC hardware that can just match a console experience, let alone beat it. This is not a dig at Nixxes, mind you; they may be doing the best they can - but if so, that speaks to either some significant inefficiencies in the PC's architecture or the potential difficulty of ports that are more 'native' to the next-gen consoles going forward - most of the ports brought over until now have largely been PS4 versions with PS5 patches.
 
The UMA of the PS5 is much simpler to work with (more dev friendly) than the split memory pools of the PC and it takes quite a lot of work to ensure memory management is implemented correctly on the PC side. No real info on whether there is a performance impact of the split pools vs UMA (positive or negative) when memory management is done well though.

One of the main reasons I've wanted to see more powerful APUs released on the PC, but I get the market issues with producing such a chip without a closed market.
 
So perhaps there's not much room to improve things through patches until DS GPU decompression is up and running
Instead of using CPU resources they are going to use GPU resources then. I'm not sure it's going to help improve performance if they do that during gameplay, particularly on Xboxes and their relatively modest GPUs (compared to PC, where some GPUs might have the headroom).

Also, how efficient is it going to be on the GPU? It can only be a temporary solution until dedicated I/O and decompression hardware appears on PC (and inevitably Xbox).
 
Instead of using CPU resources they are going to use GPU resources then. I'm not sure it's going to help improve performance if they do that during gameplay, particularly on Xboxes and their relatively modest GPUs (compared to PC, where some GPUs might have the headroom).

Also, how efficient is it going to be on the GPU? It can only be a temporary solution until dedicated I/O and decompression hardware appears on PC (and inevitably Xbox).
Far quicker and more efficient than on the CPU. You won't even notice a hit on the GPU, I'll bet.

Xbox already has dedicated silicon for I/O and decompression.
 
One of the main reasons I've wanted to see more powerful APUs released on the PC, but I get the market issues with producing such a chip without a closed market.

Not something I'd want to see. Yes, it's easier for the devs, but it also brings the disadvantages of such a setup.

Instead of using CPU resources they are going to use GPU resources then. I'm not sure it's going to help improve performance if they do that during gameplay, particularly on Xboxes and their relatively modest GPUs (compared to PC, where some GPUs might have the headroom).

Also, how efficient is it going to be on the GPU? It can only be a temporary solution until dedicated I/O and decompression hardware appears on PC (and inevitably Xbox).

According to Nvidia it's not even noticeable - a fraction of GPU resources, yet much higher and more flexible performance. It's the way forward.
 
Also, how efficient is it going to be on the GPU? It can only be a temporary solution until dedicated I/O and decompression hardware appears on PC (and inevitably Xbox).

I'd have to hunt for the quote, but every indication I read from Nvidia on RTX I/O was that it was a relatively minuscule GPU hit - GPUs are extremely efficient at this kind of work vs the CPU. Remember also that these are devices with local bandwidth from ~300GB/sec to approaching 1TB/sec - you're tasking them with decompressing textures into something approaching ~10GB/sec at peak. The overall % of GPU resources this will require is likely insignificant.
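Back-of-envelope with those same numbers, purely illustrative:

  10 GB/s decode output ÷ 300 GB/s local bandwidth ≈ 3.3% (low end)
  10 GB/s decode output ÷ 1000 GB/s local bandwidth ≈ 1.0% (high end)

So even before counting the decode compute itself - which is exactly the kind of wide, parallel integer work GPUs chew through - the bandwidth cost is close to a rounding error.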

Overall though, yeah - I think custom silicon is really the key to unlocking significant performance gains in a future where process shrinks are few and far between. I'd love to see CPUs/chipsets with specialized low-power cores to offload common tasks that currently eat your CPU. Let me render a 4K video with 10% CPU involvement like I can on my MacBook. Have an AI core pick out faces in my huge photo library without affecting performance one iota. Have a texture decompression block. Gimme gimme gimme. The downside is that this requires more developer involvement.
 
There are existing posts on the B3D forums with some early figures from RAD Game Tools on GPU decompression, and they should have improved performance since then. Worth reading before claiming GPU decompression is a heavy task. They have even done GPU decompression on the PS4.

TLDR from fuzzy memory: Just a few TF of GPU resources and it outpaces the PS5. This is not an issue considering the PC has 50% more GPU TFs.
 
Overall though, yeah - I think custom silicon is really the key to unlocking significant performance gains in a future where process shrinks are few and far between. I'd love to see CPUs/chipsets with specialized low-power cores to offload common tasks that currently eat your CPU. Let me render a 4K video with 10% CPU involvement like I can on my MacBook. Have an AI core pick out faces in my huge photo library without affecting performance one iota. Have a texture decompression block. Gimme gimme gimme. The downside is that this requires more developer involvement.

I think today's CPUs are incorporating media encoders/decoders, Intel CPUs especially. They're very fast and efficient at certain decoding/encoding workloads.

Granted, you can do this at a higher res with a top-end GPU (and the selection will be much better in a few months), but I think this is the first time we've seen a release where top-end PC hardware doesn't greatly outpace the native console version. With the poor frame pacing on lower-end GPUs (and that means anything below a 2060 Super), this is a pretty restrictive title in terms of PC hardware that can just match a console experience, let alone beat it. This is not a dig at Nixxes, mind you; they may be doing the best they can - but if so, that speaks to either some significant inefficiencies in the PC's architecture or the potential difficulty of ports that are more 'native' to the next-gen consoles going forward - most of the ports brought over until now have largely been PS4 versions with PS5 patches.

Ray tracing remains a beast. CPUs specced close to the PS5's perform about as well in this game. That things don't scale all that well upwards from there isn't all that uncommon when RT is included (as of yet).
 
Another thing: the Xbox already has I/O and decompression hardware; it just doesn't use the Kraken compression scheme.

It's really quite something to see a single post contain so many completely wrong statements. It makes me believe they are deliberately doing so in bad faith, only to prop up their little plastic box called PlayStation in relation to PC and Xbox.
 