Ratchet & Clank technical analysis *spawn

With vsync, 60fps. System reboot between each as usual.

DS On:

View attachment 9316

DS Off:
View attachment 9317

It was apparent even just eyeballing it that any stutters during the world transitions were less pronounced with DS disabled.

At this stage I can't help but feel the implementation is just broken. The increased load on the CPU from turning off DS is pretty small (say 10-15% at Very High Textures), which means the decompression load must be pretty light. That's further supported by the data read info that Holysmoke provided, which is nothing extraordinary. It doesn't make sense that the GPU impact of that during normal gameplay would be so high, especially not on something like a 4090.

We really need to understand whether AMD are using DS at all. If they are, and are not seeing performance degradation, then it's potentially an implementation issue specific to Nvidia and Intel.
 
Maybe the API or GPU hardware prevents efficient simultaneous calculation of graphics and decompression currently? I recall when Nvidia bought PhysX, their GPUs couldn’t run the workload simultaneously with graphics which caused a certain amount of overhead.
 
I always thought that the scene data was too light to be bandwidth limited such that you require an SSD w/DS in order to move data from the drive into usable VRAM. It all just doesn't make sense to me tbh.
 
That was when the GPUs could not context switch elegantly, hence the stall when doing PhysX compute.
Now, since decompression in 2023 is a compute shader and NV supports graphics and compute queues and async compute, it must be something else.

Would GDeflate decompression usually be hidden with async compute, like BVH transform and update?
 
I wasn't aware it was processed as a compute shader. Do we know what parts of the GPU limit performance during decompression?
 
If the GPU has decompressed what it needs to within 1ms and then has to wait for the CPU to decompress all the smaller stuff, which takes 1.5ms, wouldn't that cause latency issues?

I would have thought they would have shifted everything they could over to GPU decompression and not just the big stuff.
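
As a rough way to frame that question, here is a minimal sketch, assuming the GPU and CPU decompression paths either overlap fully or run back to back. The function name and the two scheduling modes are hypothetical, chosen only for illustration, and the 1ms and 1.5ms figures are the made-up ones from the question above, not measurements from the game.

```python
def batch_ready_ms(gpu_ms: float, cpu_ms: float, overlapped: bool = True) -> float:
    """Time until a batch of assets is usable.

    Hypothetical model: the large assets are decompressed on the GPU and the
    small ones on the CPU. If the two paths overlap, the batch is ready when
    the slower path finishes; if they run back to back, the costs add up.
    """
    if overlapped:
        return max(gpu_ms, cpu_ms)
    return gpu_ms + cpu_ms


# Figures from the question above (illustrative only).
print(batch_ready_ms(1.0, 1.5))                    # overlapped paths
print(batch_ready_ms(1.0, 1.5, overlapped=False))  # serialized paths
```

Under these assumptions the batch is gated by the slower CPU path either way, so the GPU finishing early would not make the data available any sooner.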
 
Not surprised. Why wouldn't it work the same? DirectStorage is the fantasy of people who talk about things without really having a clue. They hear something in one place and repeat it in another. If the CPU is not saturated, DirectStorage is worthless in practice.
 
Not surprised. Why wouldn't it work the same? DirectStorage is the fantasy of people who talk about things without really having a clue.
Can we please stop with these judgements over opinions of tech? Particularly untested, nascent tech. B3D should be about enjoying the exploration of tech's progress, its highs and lows. Too many people are invested in being right about a tech before it's even manifest.

Regarding DS, no-one has a clue. It was a paper spec and a few promotional benchmarks. Until it's field tested and refined in real games, we've no idea what it can bring. Similar to, say, DLSS, which looked pants to begin with and which some felt had nowhere to go, but is now incredible. Or maybe like Nanite+Lumen, which looked amazing in the UE5 showcases, but now in games we're starting to realise that for the time being they are too demanding to feature on consoles. Or DX12's promises of low overheads ending up being a real downer on PC game performance.

Maybe DS will go nowhere. Maybe it'll one day be a huge leap for PC IO performance. It doesn't really matter - all that matters is we can make our best observations and guesses and see if we nailed it or not, with no downside to calling it wrong.
 
That sounds fine, though this was a jab at fanboys on other sites shouting "Kraken!" at the slightest opportunity. Maybe Kraken, or an equivalent to it, isn't that necessary on PC.

It might be useful in the future in PCs where energy conservation is a must 'cos of not having a very powerful CPU, but on a desktop PC, for now it's all bells and whistles.
 
If the CPU is not saturated, DirectStorage is worthless,

I would agree with this. You always want to put the processing load where there are spare processing cycles, and in the case of R&C on a typical system that seems to be the CPU. It would be interesting to see how this works on a system with a big GPU and slow CPU though, e.g. 1600X + 4090. But that doesn't represent a realistic user scenario. That said, Flappy's test above in a CPU-limited scenario still showed a slowdown with DS, which doesn't make sense to me unless there is literally something wrong with the implementation.

It's a shame we can't see how it works in TLOU, where the CPU was a significant bottleneck when decompression was taking place.
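
To make the "spare cycles" argument concrete, here is a minimal sketch, assuming frame time is simply whichever of the CPU or GPU takes longer, and that decompression adds serially to whichever processor runs it (no async overlap). All names and millisecond figures are hypothetical and are not measurements from R&C.

```python
def frame_ms(cpu_ms: float, gpu_ms: float,
             decomp_cpu_ms: float, decomp_gpu_ms: float,
             gpu_decompression: bool) -> float:
    """Toy frame-time model: the frame takes as long as the busier processor.

    Assumption: the decompression cost is added in full to the processor that
    does the work, with no overlap with rendering or game logic.
    """
    if gpu_decompression:
        gpu_ms += decomp_gpu_ms
    else:
        cpu_ms += decomp_cpu_ms
    return max(cpu_ms, gpu_ms)


# GPU-limited system (hypothetical): CPU has spare cycles.
print(frame_ms(8.0, 16.0, 2.0, 1.0, gpu_decompression=True))   # DS on
print(frame_ms(8.0, 16.0, 2.0, 1.0, gpu_decompression=False))  # DS off

# CPU-limited system (hypothetical): GPU has spare cycles.
print(frame_ms(16.0, 8.0, 2.0, 1.0, gpu_decompression=True))   # DS on
print(frame_ms(16.0, 8.0, 2.0, 1.0, gpu_decompression=False))  # DS off
```

In this toy model GPU decompression only costs frame time when the GPU is already the bottleneck, and it should help when the CPU is. A slowdown with DS in a CPU-limited test does not fit the model, which is why it looks like an implementation problem rather than an inherent cost.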
 
Go Go Nixxes!


  • Resolved texture streaming issues that could result in certain textures remaining low resolution.
  • Fixed visual issues with water reflections that occurred when ambient occlusion was set to anything other than SSAO.
  • Resolved an issue that caused the interact button prompt to remain visible on the screen.
  • Fixed a visual issue with weapon previews at the vendor when using ultra-wide resolutions.
  • Various bug fixes, stability improvements and optimizations.

Potentially two of the biggest issues still remaining with the game are addressed with this patch. I'm off to test it now!
 
So I've just done a little testing in the first section working through each of the comparison points that DF showed in their first video.

Comparison 1 - the RT reflections in the audience member's eye, specifically the blocky bush textures. This is no longer an issue; the bushes have the proper level of detail (fixed in the first patch).
Comparison 2 - the grass shadows that were blocky. Flappy already confirmed they are now fixed, and I can confirm it too.
Comparison 3 - this one is interesting. They compared an area under a glass dome where the PC version was in full shadow with RT reflections while there was a more pleasant natural light on the console. This was confirmed by looking at the sun and seeing it was occluded by a highway. Now however, with RT shadows on, the area is back in daylight like the PS5, even though the sun is still occluded by the highway... I can only assume Nixxes have taken the artistic decision to stop that highway from casting a shadow. Very interesting if so, and I wonder if they have made similar changes throughout the game. That could certainly make RT shadows more of a win, particularly as they clearly extend further into the distance, which in that opening scene is pretty cool as you see a lot more moving shadows of the flying vehicles in the distance.
Comparison 4 - The non-loading textures on Nefarious... FIXED :) So much for this being an IO limitation eh? Oh, and I'm running without DirectStorage enabled.

1690824678274.png

1690824702570.png

EDIT: I can also confirm the Screen Space Reflections are now working properly :D.

Amazing job Nixxes! Now we just need water caustics enabled and this game will be in great shape on the PC.
 
After seeing the screengrabs in your most recent post, where you didn't use DirectStorage, I wonder if this Kraken thing is mainly there to make up for the fact that console CPUs, excellent as they are overall, share bandwidth with the GPU and so don't have the speed of a dedicated PC CPU.

A good CPU with enough bandwidth can transfer data to the GPU very quickly.

So maybe DirectStorage is ONLY an advantage for small PCs like the Steam Deck or ASUS ROG Ally (coincidentally for the same reason as consoles: a shared CPU and GPU, with more limited CPUs that are still powerful), or for PCs with less powerful CPUs.
 
Halo ring effect fixed:

1690826956753.png

DLSS reflections seem to be checkerboarded now (I still have that issue, at least on my system, with corruption at the bottom of the screen with RT reflections and DLSS). This is 4K with performance DLSS, reflections set to High:

1690827020636.png

One issue that still remains is water caustics: the PS5 has a wave effect that occurs when you walk through water, and this is still missing on the PC.
 
DLSS reflections seem to be checkerboarded now (I still have that issue, at least on my system, with corruption at the bottom of the screen with RT reflections and DLSS). This is 4K with performance DLSS, reflections set to High:
Isn't that normal? From my understanding, PS5 uses reflections equivalent to High in Quality Mode and the reflections are checkerboarded. Very High uses full resolution.
 
Isn't that normal? From my understanding, PS5 uses reflections equivalent to High in Quality Mode and the reflections are checkerboarded. Very High uses full resolution.

Yes, I'm saying that's why it's fixed now. It wasn't before, see DF's video. When using DLSS you previously just got a straight pre-reconstructed reflection, which was very low res in comparison to the proper checkerboarded solution used in IGTI/XeSS/FSR.

1690828038522.png
 
Can we please stop with these judgements over opinions of tech? Particularly untested, nascent tech. B3D should be about enjoying the exploration of tech's progress, its highs and lows. Too many people are invested in being right about a tech before it's even manifest.
We don't need to "wait and see" to prove that GPU decompression only provides a benefit if you don't have enough frame time to run CPU decompression. That fact doesn't make DirectStorage good or bad. Tech does specific things; if we aren't basing our understanding on what it actually does, our analysis will always be terrible. A purely wait-and-see approach had humans thinking the sun revolved around the earth for hundreds of years.
 
Huh? That's not the only claim that was being made in light of R&C's initial PC performance and the philosophy isn't extended to 'don't bother thinking' as you suggest. It's simply dumb to jump to conclusions based on limited evidence.

At the moment we don't know whether the GPU decompression is working as it should be or not. Obviously, if you have spare CPU cycles you can use those without eating rendering time, but otherwise what GPU decompression might bring at any level, and how that compares to custom hardware, isn't revealed in this one title and needs more data.
 