Digital Foundry Article Technical Discussion [2022]

Decompression happening on the CPU is also a reason for the performance demands, as Alex mentioned in his analysis video.

Spiderman's 'PS4 level' asset streaming needs are nothing compared to R&C's.

So good luck decompressing the data required for those portal transitions and building/updating the BVH on a PC CPU when it can barely handle Spiderman.

What information do you have to suggest R&C is streaming more in-game vs Spiderman? Sure, portal transitions will be loading a lot of data very quickly - but that's not background streaming that will impact framerate. That's just a very short loading screen. World traversal is much faster in Spiderman, which is one of the primary drivers for in-game streaming. The comparison with PS4 is silly given that the PS5 version has higher detail assets, textures and shadows, all of which would balloon data streaming requirements.

And it's the current release I'm referring to.

Although with 3 years to get a single game out the gate I dread to think how long it will take for the GPU decompression to be implemented.

That's a different discussion. I just wanted to point out that the Direct Storage technology is designed to address the hardware requirement as well. I agree it'll likely be a while before we see it.
 
A little disappointed we have to wait for the DLSS/FSR comparisons, but a separate video just means more detail so I'm looking forward to it. This was expectedly thorough though, well done as always, Alex.

One niggle: I would have liked to see a few more performance comparisons with the PS5's VRR mode. I don't have a VRR display, so I'm not sure if the Performance RT mode on the PS5 is allowed to go beyond 60fps in that mode; if it can, it would have been interesting to see the particular stress test areas to compare what PC CPU is needed to match it with RT. I know the Fidelity mode can go slightly beyond 40fps with VRR now at least, and non-RT Performance can reach ~80fps in spots, but it's difficult to determine as DF never did a specific video for the 40fps/VRR Spiderman update, and the bulk of the videos on YouTube about this are from guys reporting their LG CX1's VRR on-screen indicator, which doesn't show the actual framerate due to LFC (hence a lot of 'holy crap, my PS5 is running Spiderman at 120fps!' comments).

And it's the current release I'm referring to.

Although with 3 years to get a single game out the gate I dread to think how long it will take for the GPU decompression to be implemented.

This is true, the GPU decompression was the big 'get' with DirectStorage as it was hyped, so it being 'released' without this crucial component doesn't exactly excite me.

That being said, we're coming up on almost 2 years since the PS5 release, and there was certainly a lot of hype near and during its debut about the seemingly untouchable architectural advantage this hardware texture decompression along with the SSD would bring. Perhaps Ratchet and Clank would struggle on the PC if it were ported currently, who knows - I'd at least hazard a guess it would be the toughest porting challenge to date, sure.

But it's also fair to say we haven't seen a flood of titles from Sony that exactly live up to the hype in this area either. I think the pandemic has affected development across the board, so delayed schedules are not exactly the best indicator for timelines going forward.
 
Because they're unloading what you can't see and then re-loading it as you turn the camera, that alone (plus the much higher asset quality compared to Spiderman) all but guarantees it streams more than Spiderman.


How extensively this technique is used is open to extreme scepticism. The game is only 33GB on the SSD. Say 66GB uncompressed. The PS5 has roughly 10GB of usable GPU memory. Are you suggesting that 1/6th of the entire game's data is present in the visible viewport at any given time? If it were then the game would have a serious issue with variety - which it doesn't. I suspect they are exaggerating quite a bit there in terms of what the SSD enables them to do vs what they actually do with it.
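For anyone who wants to sanity check that, here's the arithmetic spelled out as a quick sketch (the 2:1 ratio and the 10GB figure are the assumptions from this post, not measured numbers):

Code:
# Back-of-envelope check of the figures above (all assumed, not measured).
on_disk_gb = 33          # install size quoted above
compression_ratio = 2.0  # Sony / RAD Game Tools' average claim
usable_vram_gb = 10      # rough PS5 VRAM figure used in this post

uncompressed_gb = on_disk_gb * compression_ratio
fraction_resident = usable_vram_gb / uncompressed_gb

print(f"uncompressed size: ~{uncompressed_gb:.0f} GB")
print(f"VRAM could hold:   ~{fraction_resident:.0%} of all game data at once")
# -> ~66 GB and ~15%, i.e. roughly the 1/6th figure above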

It's highly likely that when the game comes to PC, if the technique above is used on PS5 it will be done entirely differently on PC. e.g. the data that is loaded on PS5 when turning will simply be present already in the much larger GPU+CPU RAM pool. Hence the CPU impact from decompression requirements actually will be negligible.
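To make that trade-off concrete, here's a minimal sketch with entirely made-up numbers - none of these figures come from Insomniac or Nixxes, it's just the shape of the argument:

Code:
# Toy comparison of the two approaches described above. All numbers are
# illustrative assumptions only.
level_asset_set_gb = 10   # hypothetical: everything one level might show
visible_set_gb = 6        # hypothetical: what a console-sized budget holds
turn_time_s = 1.0         # hypothetical: how quickly a 180-degree turn completes

# Console-style: keep only the visible set resident and stream the rest
# back in as the camera turns -> needs a high burst of IO + decompression.
required_burst_gb_s = (level_asset_set_gb - visible_set_gb) / turn_time_s

# PC-style: keep the whole level's asset set resident in the larger
# GPU+CPU RAM pool -> costs extra memory, but almost no streaming on a turn.
extra_ram_needed_gb = level_asset_set_gb - visible_set_gb

print(f"stream-on-turn: ~{required_burst_gb_s:.0f} GB/s burst of IO/decompression")
print(f"keep-resident:  ~{extra_ram_needed_gb:.0f} GB of extra RAM instead")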

And just to address this earlier point more directly:

So good luck decompressing the data required for those portal transitions and building/updating the BVH on a PC CPU when it can barely handle Spiderman.

Watch the DF video from here:


That's a PC CPU handling Spiderman at over 100fps in some of the most demanding traversal scenes. Perhaps we have different definitions of barely?
 
How extensively this technique is used is open to extreme scepticism.
But until such a source exists to counter Insomniac games this is what we go on.
The game is only 33GB on the SSD. Say 66GB uncompressed.
The maximum compression ratio with Kraken+Oodle Texture is 3.16:1, so it could very well be over 66GB uncompressed.
The PS5 has roughly 10GB of usable GPU memory.
Does it? It has a unified memory system where developers can choose how much they want to allocate as VRAM. The PS5 can have over 10GB if the developer chooses; the Series X is the one with the 10GB VRAM pool, so are you confusing the two?
Are you suggesting that 1/6th of the entire game's data is present in the visible viewport at any given time? If it were then the game would have a serious issue with variety - which it doesn't. I suspect they are exaggerating quite a bit there in terms of what the SSD enables them to do vs what they actually do with it.
No, I'm suggesting there is a lot of decompression and I/O going on; you need to stop thinking of it as purely decompression.

The game may only stream, say, 500MB/s constantly as the camera moves, but if that is made up of hundreds/thousands of files it's not the decompression itself that will kill the CPU performance, it's the stuff after, like file check-in and copying, that will kill performance.

You've already said the game doesn't have an issue with variety, so it very well could have hundreds/thousands of files that need to be managed after decompression.

This is something the PS5 also has dedicated hardware to handle, and something that Direct Storage will still do on the CPU in a PC.
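Just to illustrate the shape of that argument, here's a toy cost model - every constant in it is a made-up assumption, nothing is measured from a real OS, API or game:

Code:
# Toy cost model: fixed per-request CPU work vs per-byte decompression work.
stream_rate_mb_s = 500          # steady streaming rate assumed in the post above
avg_file_kb = 256               # hypothetical average asset chunk size
per_request_us = 50             # hypothetical fixed CPU cost per file/request
decompress_mb_per_core_s = 1000 # hypothetical CPU decompression throughput

files_per_s = stream_rate_mb_s * 1024 / avg_file_kb
request_cpu_ms_per_s = files_per_s * per_request_us / 1000
decompress_cpu_ms_per_s = stream_rate_mb_s / decompress_mb_per_core_s * 1000

print(f"requests per second:  {files_per_s:.0f}")
print(f"per-request overhead: ~{request_cpu_ms_per_s:.0f} ms of CPU time per second")
print(f"decompression:        ~{decompress_cpu_ms_per_s:.0f} ms of CPU time per second")
# With these made-up numbers the fixed per-file overhead (~100 ms/s) is a real
# fraction of the decompression cost (~500 ms/s), which is the batching problem
# DirectStorage's request model is meant to reduce.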
It's highly likely that when the game comes to PC, if the technique above is used on PS5 it will be done entirely differently on PC. e.g. the data that is loaded on PS5 when turning will simply be present already in the much larger GPU+CPU RAM pool. Hence the CPU impact from decompression requirements actually will be negligible.
Are you really going to have potentially an insane RAM requirement? RAM is not cheap.
That's a PC CPU handling Spiderman at over 100fps in some of the most demanding traversal scenes. Perhaps we have different definitions of barely?
I was talking in general; a 12900k is not your average CPU, and anything less than an 11700k or a 5800x cannot lock the game to 60fps at max settings.

An 8700k, still a very capable gaming CPU, has a minimum FPS of 47fps.

This game has also put me in the shit as I'm looking at upgrading my 4770k. I was looking at a Ryzen 3600, but looking at how this game runs I may need to wait to see what happens with the next generation of CPUs and spend even more money... ffs.
 
If there's any hyper-special hardware advantage in a game like Ratchet, the experience you'd get on PC without the super special hardware is slightly more LOD and texture streaming artifacts*. There's simply not enough mesh and texture data being passed around in an AAA 3D scene to imagine anything more serious happening. At a certain point it doesn't even matter how much data you can load; your GPU is not going to hit 40fps if it's being fed too much. Ratchet runs at a solid fps, so we know the total budget is reasonable.

*and possibly brutal shader compilation hitches because pc is an awful gaming platform compared to either console despite other advantages
 
*and possibly brutal shader compilation hitches because pc is an awful gaming platform compared to either console despite other advantages

Well yeah, every platform is far worse than the other if you don't count the advantages. :)

I will certainly agree that shader compilation stutter (and stutter in general) definitely gives me significant pause for new releases on PC now vs. console; that, and the near eradication of the low end due to crypto/the pandemic, has really hurt the PC's image over the past couple of years, especially in the price/performance arena. On the other hand, you also see releases like the PlayStation exclusives we never would have dreamed of a few years ago. The best place to play Horizon Zero Dawn, God of War, Days Gone, and arguably Spider-Man now, is the PC.

The price to get those advantages is still somewhat prohibitive, as is the delay seeing them released - but these are not titles anyone really expected before now. With Bethesda being owned by MS as well (um...eventually?), that's potentially another slew of games where you'll require 2 consoles to be able to experience them. So while I'm highly critical of some of the hassles PC gaming brings and think consoles are probably a better fit for the majority of families, the PC now being regarded as an 'awful' platform due to UE4 shader stutter is a bit much.

Edit: Bear in mind if you mean as a gaming development platform only vs. the consumer side, then ok disregard the above. :)
 
But until such a source exists to counter Insomniac games this is what we go on.
Fair enough but by the same logic, until we see this presenting some kind of special challenge in terms of CPU performance in a PC port there's no reasonable basis upon which to assume it will.

The maximum compression ratio with Kraken+Oodle Texture is 3.16:1, so it could very well be over 66GB uncompressed.

Both Sony and RAD Game Tools claim a 2:1 average compression ratio. There's no reason whatsoever to assume anything better than that.

Does it? It has a unified memory system where developers can choose how much they want to allocate as VRAM. The PS5 can have over 10GB if the developer chooses; the Series X is the one with the 10GB VRAM pool, so are you confusing the two?

I said roughly, and I believe there's a thread on here that confirms it's something like 12.5GB available for games, so 10GB as pure VRAM seems reasonable. In any case, if it's more then that strengthens my point, so ho hum.
No, I'm suggesting there is a lot of decompression and I/O going on; you need to stop thinking of it as purely decompression.

The game may only stream, say, 500MB/s constantly as the camera moves, but if that is made up of hundreds/thousands of files it's not the decompression itself that will kill the CPU performance, it's the stuff after, like file check-in and copying, that will kill performance.

You've already said the game doesn't have an issue with variety, so it very well could have hundreds/thousands of files that need to be managed after decompression.

This is something the PS5 also has dedicated hardware to handle, and something that Direct Storage will still do on the CPU in a PC.

IO overhead is relatively small compared to decompression but in any case, no, the PS5 does not have dedicated hardware to handle that (there's a great blog entry from RAD Game Tools on this). It has a very efficient API which makes that overhead negligible which is exactly what DirectStorage does. So if what you say is correct (which I don't agree with) then the simple implementation of the existing version of DirectStorage on the PC version of this game would more or less equalise the CPU overhead.

Are you really going to have potentially an insane RAM requirement? RAM is not cheap.

Insane? Even a low end gaming PC these days would have at least 24GB of total RAM making 16-20GB of that available for games. A decent amount more than the PS5.

A good gaming PC by the time that game releases could reasonably be expected to contain 40GB+

You could fit an entire half of R&C in that. Why under those circumstances would you need to clear down and reload RAM for a simple 180 degree turn in a single environment?
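Laying the sizes out, using the same assumed figures as above (install size, compression ratio and RAM estimates are all from this thread, not spec sheets):

Code:
# The sizes being argued over, using the figures assumed in these posts.
uncompressed_game_gb = 66      # ~33 GB install at the assumed ~2:1 ratio
game_available_ram_gb = 16     # lower end of the 16-20 GB estimate above
future_pc_ram_gb = 40          # the "40GB+" guess above

print(f"half the game:         {uncompressed_game_gb / 2:.0f} GB")
print(f"16 GB holds:           {game_available_ram_gb / uncompressed_game_gb:.0%} of the game")
print(f"40 GB fits half of it: {future_pc_ram_gb >= uncompressed_game_gb / 2}")
# -> 33 GB, roughly a quarter of all game data, and True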

I was talking in general; a 12900k is not your average CPU, and anything less than an 11700k or a 5800x cannot lock the game to 60fps at max settings.

Max settings are far from PS5 settings. At PS5 settings the 12900k can exceed the PS5's performance by more than 50%. There are many CPUs, including those you reference above, that the 12900k is less than 50% faster than.

An 8700k, still a very capable gaming CPU, has a minimum FPS of 47fps.

At PS5 matched settings (as per Alex's video)?

This game has also put me in the shit as I'm looking at upgrading my 4770k. I was looking at a Ryzen 3600, but looking at how this game runs I may need to wait to see what happens with the next generation of CPUs and spend even more money... ffs.

So you're disappointed that you were planning to get a CPU with only 75% of the cores of the PS5 CPU while sharing the same architecture and now it turns out to be performing slightly worse?

Perhaps check your expectations.
 
Decompression happening on the CPU is also a reason for the performance demands, as Alex mentioned in his analysis video.

Spiderman's 'PS4 level' asset streaming needs are nothing compared to R&C's.

So good luck decompressing the data required for those portal transitions and building/updating the BVH on a PC CPU when it can barely handle Spiderman.
I already did a bunch of testing when I upgraded my PS5 drive and posted about it here. The PS5 version of Spider-Man MM is vastly more IO intensive than the PS4 version and this remaster is seemingly similar on PC judging by the video.

I also tested R&C in that thread and it didn't strike me as being much more read intensive. I'll have to wait until SM finishes downloading on my PC though.
 
Fair enough but by the same logic, until we see this presenting some kind of special challenge in terms of CPU performance in a PC port there's no reasonable basis upon which to assume it will.
I think it's safe to assume the portal transitions would be a pain in the arse to get working on PC at the same speed and quality as they are on PS5 without serious help from a GPU for the decompression.
Both Sony and RAD Game Tools claim a 2:1 average compression ratio. There's no reason whatsoever to assume anything better than that.
The 3.16:1 figure is from RAD Game Tools after they announced Oodle Texture; the 2:1 ratio was declared prior.
IO overhead is relatively small compared to decompression but in any case
Last gen I/O overhead was low, next gen overhead gets quickly out of hand as Mark Cerny said himself.
The PS5 does not have dedicated hardware to handle that (there's a great blog entry from RAD Game Tools on this).
Yes it does. The I/O complex contains more than just the fixed-function decompressor; there are other co-processors that handle all the other stuff like check-in and file copying, which means the PS5's CPU doesn't have to do any I/O work. Neither PCs nor the Series consoles (for that matter) have that hardware.
It has a very efficient API which makes that overhead negligible which is exactly what DirectStorage does. So if what you say is correct (which I don't agree with) then the simple implementation of the existing version of DirectStorage on the PC version of this game would more or less equalise the CPU overhead.
Again, it's not just the API as even with Direct Storage the CPU will still be required to do things PS5's CPU won't as the I/O complex handles it all.
Insane? Even a low end gaming PC these days would have at least 24GB of total RAM making 16-20GB of that available for games. A decent amount more than the PS5.
And without knowing how much pre-loading would be needed, that may not be enough, as some games available now max out the 8GB on my 3060 Ti.
A good gaming PC by the time that game releases could reasonably be expected to contain 40GB+
At the rate prices are increasing for hardware, I doubt the average PC will be at that level any time soon, and what if people don't have the memory? Is it reasonable to expect them to pay for the game and for a RAM upgrade?
You could fit an entire half of R&C in that.
You/we don't know that for certain
At PS5 matched settings (as per Alex's video)?
I will have to double check
So you're disappointed that you were planning to get a CPU with only 75% of the cores of the PS5 CPU while sharing the same architecture and now it turns out to be performing slightly worse?

Perhaps check your expectations.
PS5 only has 6 cores for gaming while running at a lower clock speed, so the difference is moot in reality.

So there's no need to check my expectations.
 
I already did a bunch of testing when I upgraded my PS5 drive and posted about it here. The PS5 version of Spider-Man MM is vastly more IO intensive than the PS4 version and this remaster is seemingly similar on PC judging by the video.

I also tested R&C in that thread and it didn't strike me as being much more read intensive. I'll have to wait until SM finishes downloading on my PC though.

Your testing only showed data transfer; it didn't show anything more than that, such as how many individual files were sent, what the compression ratio was, etc.
 
Because they're unloading what you can't see and then re-loading it as you turn the camera, that alone (plus the much higher asset quality compared to Spiderman) all but guarantees it streams more than Spiderman.


If you have a strong LOD system you're unlikely to use 12.5 GB just on what's in front of you + other game logic stuff. Certainly, nothing we've seen so far would seem to be worth that magnitude of footprint, even in R&C. Plus, if you have raytraced reflections YOU CAN'T FUCKING UNLOAD EVERYTHING BEHIND YOU.

I swear to god that context-lacking tweets + hyperbole make everything worse.
 
Your testing only showed data transfer; it didn't show anything more than that, such as how many individual files were sent, what the compression ratio was, etc.
That's why I want to test it on PC. The PC version is roughly the same size as the PS5 version and if they show similar data throughput then we can assume them to be broadly the same.

That especially applies if it comes with the Oodle DLL. I expect that it does but I'll need to see when it finishes.
 
You know... these games are HIGHLY tailored to the specific hardware and architecture they were meant to run on. You can easily bottleneck even the most powerful PCs with code architected for something completely different. This isn't your average 3rd party game. Despite the fact that this is at its core a PS4 game, certain realities of the game's design at a fundamental level limit its potential, especially when contending with budget and time constraints. They don't have the budget to completely redesign the thing from the ground up.

And hearing Ratchet and Clank come up now in this thread... remember there was this rumor that PS developers were asked to keep PC in mind as they developed PS5 games. It's possible that some smarter design decisions made when developing THOSE games, as opposed to porting older PS4 games... could make a big difference in the outcome of those games vs these ones.

If Ratchet and Clank was built from the ground up for PC... a reasonable spec machine would have no damn problem at all doing anything the PS5 version does... not that I think it will anyway. And now that PS developers know that future PC ports are likely to happen, they can plan and make better decisions earlier on that can have a much more positive impact on ports for the future.
 
That's why I want to test it on PC. The PC version is roughly the same size as the PS5 version and if they show similar data throughput then we can assume them to be broadly the same.

That especially applies if it comes with the Oodle DLL. I expect that it does but I'll need to see when it finishes.
Quick update: The game doesn't ship with the Oodle Kraken DLL, which genuinely surprised me. I fully expected that it would, since both GOW and HZD did, along with countless other games released in the past few years.

It's still fairly IO intensive: the game reads about 10GB from disk during the first 3 minutes of intro + gameplay with 32GB of memory. Limiting free memory to just 8GB increased those reads to around 18GB during the same sequence.

For perspective, the R&C intro lasts about 3:30 and reads ~18GB from disk. The Spider-Man MM intro sequence lasts about 2:30 and reads ~32GB from disk.

I've had a few beers so take this with a grain of salt. But these numbers should be mostly accurate.
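Taking those rough figures at face value, the implied average read rates are easy to work out (a sketch only - the inputs are the ballpark numbers above):

Code:
# Average read rate implied by the rough numbers above (GB read, duration).
tests = {
    "Spider-Man PC intro, 32 GB RAM": (10, 3 * 60),       # ~10 GB over ~3:00
    "Spider-Man PC intro, 8 GB free": (18, 3 * 60),       # ~18 GB over ~3:00
    "R&C PS5 intro":                  (18, 3 * 60 + 30),  # ~18 GB over ~3:30
    "Spider-Man MM PS5 intro":        (32, 2 * 60 + 30),  # ~32 GB over ~2:30
}
for name, (gb, seconds) in tests.items():
    print(f"{name}: ~{gb * 1024 / seconds:.0f} MB/s average")
# Works out to very roughly 60-220 MB/s on average; peaks during the heaviest
# moments will obviously be much higher than the average.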
 
One niggle: I would have liked to see a few more performance comparisons with the PS5's VRR mode. I don't have a VRR display, so I'm not sure if the Performance RT mode on the PS5 is allowed to go beyond 60fps in that mode; if it can, it would have been interesting to see the particular stress test areas to compare what PC CPU is needed to match it with RT. I know the Fidelity mode can go slightly beyond 40fps with VRR now at least, and non-RT Performance can reach ~80fps in spots, but it's difficult to determine as DF never did a specific video for the 40fps/VRR Spiderman update, and the bulk of the videos on YouTube about this are from guys reporting their LG CX1's VRR on-screen indicator, which doesn't show the actual framerate due to LFC (hence a lot of 'holy crap, my PS5 is running Spiderman at 120fps!' comments).

Was able to find a video that shows uncapped PS5 framerates - it's IGN, so uh... NX Gamer, of course. He states the Performance RT mode with VRR is 'dynamic 4K but often settling around 1440p', but from DF and other outlets before this VRR patch, it was commonly reported as 1440p at its height, down to 1080p in spots. So not sure how accurate that is.

Quibbles on res aside though, it can at least show us how the PS5 is doing on the CPU side of things, and frankly these numbers - even considering the lowered ray tracing settings, and even if it was 1080p - paint the PC version (or its hardware) in a somewhat less flattering light than before viewing this, imo.

[Screenshot from the NX Gamer video showing the uncapped PS5 framerate counter]


Perhaps it's not the most demanding area, but seeing framerates that are usually between 80-100, with RT, is very impressive. That's 12900k territory; hopefully it indicates there's a lot more optimization work to come on Nixxes' behalf, but I do think if you're going to compare CPU performance to the PS5 version you really have to use the uncapped VRR mode to paint an accurate picture.

Edit: One thing that's always bugged me about Miles Morales is that, unlike SM: Remastered, which is a basically perfect locked 60fps (understandable considering the headroom shown above), on my PS5 MM had routine 1-2fps hitches flying around most city sections in either Performance or Performance RT mode. I haven't played it in quite a while so I just tried it again, and they're still there. So I'm not sure how the game can run at 80-100fps now with VRR and still drop below 60fps on a 60Hz display. Perhaps it's more of a GPU issue and VRR mode allows the dynamic res to be more aggressive in dropping resolution, but if anything that would make less sense with VRR as opposed to a fixed refresh, where a drop below 60fps is far more noticeable. Dunno, just something I found odd after seeing the video above; perf may vary significantly depending on location, I guess.
 
I think it's safe to assume the portal transitions would be a pain in the arse to get working on PC at the same speed and quality as they are on PS5 without serious help from a GPU for the decompression.

Even if they were (it's a different argument, but pre-caching to the larger RAM pool is one easy way to get around this), this has no impact on the game's frame rate, as I mentioned above, because this isn't an example of in-game streaming; it's an example of a very short load screen where frame rates are irrelevant.

The 3.16:1 figure is from RAD Game Tools after they announced Oodle Texture; the 2:1 ratio was declared prior.

No, that figure was for a specific texture set in a specific game, not the overall compression ratio, which in the same article they state to be "closer to 2:1" when using Oodle Texture:


Last gen I/O overhead was low, next gen overhead gets quickly out of hand as Mark Cerny said himself.

It can be significant, but still lower than decompression compute requirements which also scales with data throughput rate. This diagram from Nvidia suggests at a 7GB/s raw throughput the compute requirements to handle the IO are 2 full cores. When decompression is added on top of that the requirements go up to 24 cores.

[Nvidia diagram: CPU cores required to handle 7GB/s of raw IO, with and without software decompression]
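Taken at face value, the figures in that diagram imply the following per-core breakdown (a sketch using only the numbers quoted above):

Code:
# Per-core breakdown implied by the quoted Nvidia figures (taken at face value).
raw_throughput_gb_s = 7            # raw IO rate in the diagram
io_only_cores = 2                  # cores needed just to manage that IO
cores_with_sw_decompression = 24   # cores needed once software decompression is added

decompression_cores = cores_with_sw_decompression - io_only_cores
print(f"IO management: ~{raw_throughput_gb_s / io_only_cores:.1f} GB/s per core")
print(f"decompression: ~{raw_throughput_gb_s / decompression_cores:.2f} GB/s per core")
# -> roughly 3.5 GB/s per core for IO management but only ~0.3 GB/s per core of
# decompression, which is why decompression rather than IO handling dominates
# the CPU cost in that chart.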

Yes it does. The I/O complex contains more than just the fixed-function decompressor; there are other co-processors that handle all the other stuff like check-in and file copying, which means the PS5's CPU doesn't have to do any I/O work. Neither PCs nor the Series consoles (for that matter) have that hardware.

Nope. From the same RAD Game Tools link above:

Fabian Giesen said:
In the PS5 case, the goal was for the decompressors to never be the bottleneck in real workloads, so they're dialed in to be fast enough to keep up with the SSD at all times, with a decent safety margin. That's all there is to it.

Along the same lines, 2 helper processors in an IO block that has both a full Flash controller and the decompression/memory mapping/etc. units is not by itself remarkable. Every SSD controller has one. That's what processes the SATA/NVMe commands, does the wear leveling, bad block remapping and so forth. The special part is not that these processors exist, but rather that they run custom firmware that implements a protocol and feature set quite different from what you would get in an off-the-shelf SSD.

I think the coherency engines are unique (as well as the hardware decompressor). Everything else is just standard SSD controller hardware. Don't let yourself be fooled by marketing.

Again, it's not just the API as even with Direct Storage the CPU will still be required to do things PS5's CPU won't as the I/O complex handles it all.

See above. I'll grant the PS5 will probably remain more efficient at IO than even the final version of Direct Storage on PC just due to the nature of the platform; however, when we're at the stage of 10% of 1 core vs 50% of 1 core it's going to be largely irrelevant to real world gaming performance, especially in light of how much more powerful PC CPUs already are than the one in the PS5.

And without knowing how much pre-loading would be needed, that may not be enough, as some games available now max out the 8GB on my 3060 Ti.

It's literally 25% or more of the entire game content. Are you suggesting that having 25% of the entire game content resident in memory would still be insufficient to allow the character to do a 180 degree turn in a single game environment? Is the game only 4 seconds long?

At the rate prices are increasing for hardware, I doubt the average PC will be at that level any time soon, and what if people don't have the memory? Is it reasonable to expect them to pay for the game and for a RAM upgrade?

We're not talking about the "average PC". We're talking about gaming PCs that are going to be used for playing current gen games. 8GB of video RAM and 32GB of system RAM is not at all uncommon in such PCs today and will only become more so as time moves on.

You/we don't know that for certain

It's simple math?? The game is around 66GB uncompressed. Half of 66GB is 33GB. 40>33. Not sure why I'm having to explain this.

I will have to double check

I assume the answer was no? I'm not sure of the relevance anyway. The 8700k pre-dates the PS5 by 3 years and has only 75% of its cores. We should damn well hope it's slower.

PS5 only has 6 cores for gaming while running at a lower clock speed, so the difference is moot in reality.

So the PS5 has a higher OS and system overhead than a PC running a full fat version of Windows now does it? Wow, I hadn't realised it was so inefficient.
 
Perhaps it's not the most demanding area, but seeing framerates that are usually between 80-100, with RT, is very impressive. That's 12900k territory; hopefully it indicates there's a lot more optimization work to come on Nixxes' behalf, but I do think if you're going to compare CPU performance to the PS5 version you really have to use the uncapped VRR mode to paint an accurate picture.

Yes, I agree we need a proper settings-matched head to head performance comparison in the same area using unlocked framerates. I was hoping we'd get that in Alex's video but I assume the time constraints and the already massive amount of content he had to cover made that a step too far. Unfortunately, without it, the internet is left to take NXG's head to heads at face value despite them almost certainly not being settings matched.

Perhaps someone on here could do a performance comparison to that specific area NXG uses above while using the PS5 Performance settings identified by Alex here:


The tricky part will be the resolution to choose. As I understand it the PS5 is rendering internally between 1080p-1440p and upscaling to 4k in 60fps mode. But how the heck does DRS work with an uncapped frame rate? Perhaps fixing to 1080p to prevent the GPU from becoming the bottleneck would at least isolate CPU performance.

That said, if Nixxes (via Alex) are saying they still have work to do to significantly improve BVH generation efficiency then any CPU performance comparison at this point is likely to be misleading.
 