Ratchet & Clank technical analysis *spawn


Nvidia Hotfix driver for DS in R&C does improve things, but not much. :(

I'd say that's a pretty big improvement (178 vs 201fps). However the performance loss with DS is clearly still present.

I'm guessing what we're seeing here is simply that the Reflex element of that performance loss is now cleared up, but DS itself still has a hit.

Shame he didn't test with Reflex on vs off. I will check that out myself in a bit.
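
For anyone wanting to put the 178 vs 201fps figures in frame-time terms, here is a quick sketch. It assumes 178 is the pre-hotfix number and 201 the post-hotfix number; the helper names are made up for illustration.

```python
def frame_time_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


def percent_gain(before_fps: float, after_fps: float) -> float:
    """Relative fps improvement, as a percentage of the 'before' figure."""
    return (after_fps - before_fps) / before_fps * 100.0


# Assumption: 178 fps is before the hotfix, 201 fps is after it.
before, after = 178.0, 201.0
saved_ms = frame_time_ms(before) - frame_time_ms(after)

print(f"{percent_gain(before, after):.1f}% more fps")   # 12.9% more fps
print(f"{saved_ms:.2f} ms saved per frame")             # 0.64 ms saved per frame
```

At these frame rates a double-digit percentage gain is still well under a millisecond per frame, which is worth keeping in mind when comparing it against the remaining DS hit.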
 
I tested my usual test area with DS on and Reflex on vs off. I can confirm that the performance drop with Reflex on is now significantly decreased. It's hard to tell without a perfectly repeatable sequence, but I'd say at worst I'm losing 2-3 fps with Reflex on now vs ~10 fps previously.
 
Yep, as you confirmed, and as I pointed out when I read Nvidia's notes before the hotfix, that was indeed the only focus:


Surprising that AMD's performance with DS hasn't been looked into more closely at this point. I think everyone assumes they're not using GPU decompression due to suffering no performance hit with it enabled, but even CPU DS can have performance benefits - would like to see CPU usage with/without the DS files on Radeon.

Edit: Well they do during the portal sequence, not so much during gameplay:

 
Agreed. The log I mentioned does state whether DS is enabled or not as well (although not specifically GPU decompression) so if anyone here has the game with a Radeon it would be worth looking in there to see if it's enabled.
 
Again, this depends somewhat on whether Radeon is using GPU decompression, but even if it is - when the far and away #1 vendor shows a significant performance decrease with GPU decompression, I think some of this has to be placed on Nixxes' shoulders too. Surely they were seeing these numbers well before release - why then pursue GPU decompression at all? It doesn't provide a benefit to any architecture, no benefit can be seen even when running on slow CPUs, and the weakest platform with any hope of sales (Steam Deck) manages fine without it - how does this get greenlit?

This poster claims there's slow texture loading with DS, but I think that's a general game bug that's oddly happening with high-end cards, I can't replicate it:

 

Those AMD results are just confusing. DS looks like a win outside of the portal sequence (albeit slight) and despite the lower lows/stutters still has a higher average even in the portal scene.
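
A higher average alongside lower lows isn't contradictory, as a toy example shows. This is only a sketch with made-up frame times, not data from the game, and the "1% low" definition used here (average over the slowest 1% of frames) is just one of several in circulation.

```python
def average_fps(frame_times_ms):
    """Average fps over a capture: total frames divided by total time."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def one_percent_low_fps(frame_times_ms):
    """Average fps over the slowest 1% of frames (at least one frame)."""
    worst_first = sorted(frame_times_ms, reverse=True)
    count = max(1, len(worst_first) // 100)
    slowest = worst_first[:count]
    return 1000.0 * len(slowest) / sum(slowest)


# Hypothetical captures, in milliseconds per frame.
smooth = [10.0] * 1000               # steady 100 fps, no hitches
spiky = [9.0] * 990 + [40.0] * 10    # faster most of the time, ten 40 ms hitches

for name, capture in (("smooth", smooth), ("spiky", spiky)):
    print(name, round(average_fps(capture), 1), round(one_percent_low_fps(capture), 1))
```

The spiky capture wins on average fps while losing badly on 1% lows, which is the same shape as a run that stutters in the portal scene but is slightly faster everywhere else.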

Then again, the whole game's performance profile is confusing to me. FG no longer seems to win me any performance and I've no idea what, if anything, has changed to result in that. I'd say it might be the hotfix driver, but I've seen similar things happen previously which seemed to resolve themselves for some undetermined reason, and now nothing I do seems to resolve it.

I might try rolling back the driver and/or a clean driver install.

EDIT: Strike that, it's fixed itself again. No idea why. Might be RTSS based, as it seemed to resolve after I fixed RTSS at 60fps without FG (which plays great), then switched FG on with the same frame limit (which plays like arse), then turned off the RTSS limiter, and I was back up to 100+ fps (which I then lock to 90, which plays like heaven).
 
Hey, we actually found a scenario where DS has a benefit - when your card has a crippled PCI-E bus. :)


That's interesting, although you would expect that when PCIe bandwidth is constrained. More interesting still, though, is that it would seem to support the conclusion that AMD is using GPU decompression. So it really does seem they likely don't suffer the same general performance drop-off as Nvidia. Perhaps RTX IO is just not as efficient as AMD's implementation despite all the hype.
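
As a back-of-the-envelope sketch of why a narrow bus favours sending compressed data: the link speeds, asset size and 2:1 compression ratio below are hypothetical round numbers, not measured PCIe or game figures, and the model ignores the GPU-side cost of decompressing.

```python
def transfer_time_ms(asset_mb: float, link_gb_per_s: float,
                     compression_ratio: float = 1.0) -> float:
    """Time to push an asset across the bus when it is sent compressed.

    With 1 GB taken as 1000 MB, MB divided by GB/s comes out in milliseconds.
    compression_ratio = 1.0 means the asset crosses the bus uncompressed.
    """
    return asset_mb / compression_ratio / link_gb_per_s


ASSET_MB = 512.0     # hypothetical burst of streamed assets
RATIO = 2.0          # assumed 2:1 compression

for label, link in (("narrow link", 4.0), ("wide link", 32.0)):
    raw = transfer_time_ms(ASSET_MB, link)
    packed = transfer_time_ms(ASSET_MB, link, RATIO)
    print(f"{label}: {raw:.0f} ms raw, {packed:.0f} ms compressed, {raw - packed:.0f} ms saved")
```

Under these assumptions the narrow link saves 64 ms per burst versus 8 ms on the wide one, so any fixed decompression cost is much easier to pay back when the bus is the bottleneck.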
 
That would be funny, considering this compression format was developed by Nvidia lol.

Wonder if it has something to do with AMD having better Async compute implementation?
 
Async compute is coded by the game, not by the driver! Running a trace would show whether or not async compute is even used, let alone for DS.
 
Here's how it looks on Turing during a dimensional warp. There's not much overlap (which might look different on Ampere/Ada/RDNA) but it's definitely running on async queues.

Mind that it's a cherry picked example and not gameplay.

2u8YNPN.png
 
Is this perhaps a reason that XSX versions of games are performing less well than expected versus PS5? DirectStorage getting in the way? I know it's not quite the same thing in a console environment, but maybe they simply haven't done the work to implement it in as clean a manner as Sony's equivalent?
 
Not the same on console. Series consoles have the same setup as PS5. PS5 licensed RAD Game Tools' tech for its hardware decompressor. XSX still uses LZ, I believe; it serves the same role but doesn't compress as well. Wrt the pipeline, assets stay compressed on the SSD and are decompressed into memory.
 
On the upside XBS has an improved texture compression format, no?
 
MS released a special RDO format for their textures called BCPack; it's unclear if this is available on PC releases.

Oodle from RAD Game Tools is also an RDO texture compression format and should perform similarly, if I recall correctly. I don't recall if Sony's license agreement applies to both, but likely.
 
DS doesn't require an IHV solution to provide GPU decompression. DS has its own built-in GDeflate solution that's hardware agnostic. IHVs can provide additional DS optimizations for their own hardware using a metacommand-enabled graphics driver. Nvidia has RTX IO, and AMD has a solution, but it's still under NDA and hasn't been officially released.
 
The XSX DS solution functions more like PS5 IO than the PC DS GPU decompression scheme. Both the PS5 and XSX look more like the CPU decompression scheme of DS, but with the CPU replaced by a hardware-based decompressor within the GPU's DMAs.

To me at least, the current DS scheme for GPU decompression looks to be just a stopgap until a better scheme (one whose pathway doesn't include a stop in system memory) is available. It presents too many hoops that the data has to jump through to get to its final destination. It eats up shader resources while amplifying bandwidth use on the GPU.
 
I don't see that direct pathway from storage to GPU memory happening anytime soon, though. So for the time being, DirectStorage *needs* to work better. PC is a major platform these days. It's not something multiplatform developers can just disregard.

The alternative is that either devs disregard the PC platform, or they have to provide a specific, downgraded version for PC where info can be streamed in without hurting performance.

OR, they just don't make I/O heavy games at all, which ruins the whole point of these new consoles.

Basically, unless DirectStorage can be made to work better, PC might be a big problem for the generation.
 
If there are implementation issues with how DS as an API is designed, devs will just rely on hardware hopefully getting fast enough to brute force past the issues. Outside of Sony 1st party I don't expect devs to push IO hard though.
 