Next-Generation NVMe SSD and I/O Technology [PC, PS5, XBSX|S]

At the same time, DirectStorage on good hardware doesn't need to be as efficient and "fast" as the PS5 solution. Being close enough will be enough to have a similar experience, loading/streaming wise, imo. Plus modern PC hardware has other advantages, so all in all, if DirectStorage is not a total letdown, it will do the job I think.
 

I think any PC with a SATA SSD or an NVMe SSD will be OK. In Forspoken, loading in a little under 4 seconds with a SATA SSD is great. I think within the next two years at most, the last few PC gamers without an SSD will buy one, SATA or NVMe, and everyone will enjoy fast loading, everyone will be happy, and no one will go back.

Imo this is like asking whether you could go back to using a PC or a Mac with only an HDD after having one with an SSD or a hybrid SSD/HDD. Since I got a SATA SSD, I am annoyed any time I need to use a PC with only an HDD.
 
I think real PS5 exclusives pose a challenge for PC ports though. The custom hardware in PS5 is massive. Leveraging fast I/O just like PS5 is simply not possible on PC. I am a firm disbeliever in DirectStorage's efficiency, and only a really good implementation that performs just like PS5 in the same game will make me revise my thoughts on the matter. A cobbled-together solution that uses ~some GPU resources together with a very new API AND the SSD protocols cannot be as efficient and frictionless as PS5's all-hardware-based, abstracted solution.
Cerny made it very clear that ALL steps need to be frictionless, not only the decompression. DMA, check-in and load management are done on PS5 by another custom unit which is equivalent to another two Zen 2 cores. The added latency on PC for all those steps will eventually cause friction. Then there are the PCs inferior to PS5. Can they also profit from DirectStorage?
Don't get me wrong, DirectStorage WILL enhance PC games!
I just don't think that it will be on par with PS5 on most PCs.. The GPU resources alone.. I mean it is cute that Jensen thinks that every RTX card could produce 14GB/s of decompressed data.. But I want to know how much that will take away from rasterisation performance, or heck, raytracing performance.. But if the RAM pool solution is enough, then all the mumbo jumbo I just wrote is irrelevant. And of course lowering some settings might do it for some as well.
With Sporfoken :ROFLMAO: using DS on PC we will have a first pointer to what the future will be for the tech. On PS5 the demo's fast travel is really instantaneous. Like < 1sec.
PC can actually leverage faster IO than PS5. The demands of decompression on the GPU through DirectStorage will be less impactful to game performance than if they were done on the CPU. That's just simple logic. The CPU is the bottleneck, on PC we have GPU cycles for days. Streaming during gameplay isn't even remotely as bandwidth intensive as initial bulk loading of assets into memory. PCs have FARRRRR more memory. So not only can the GPU bulk load data into memory at much higher bandwidths than consoles (it's way more than 14GB/s).. that data also remains compressed until it is needed, and RAM to VRAM has far less latency than retrieving data directly from console storage. So PC doesn't need to be as efficient as consoles... it never has needed to be, and never will... and that doesn't stop the results from being better.

I believe that DirectStorage now puts PC in a far better position overall than the PS5... since there's more memory available, and data remains compressed in memory until it's needed. That's an architectural advantage that PS5 can never overcome no matter what you do. Latency can be overcome by sheer brute force. The biggest thing holding back loading in general right now is the game engines and other bottlenecks in the pipeline.

I think you'll see PS5 exclusives on PC which... if they actually utilize DirectStorage, will load just as fast if not faster than PS5.
 
On pure "load this level" scenarios it will be 100% fine with DirectStorage. I remain doubtful about on-demand streaming during gameplay, except maybe on super high-end rigs.. IIRC Ratchet & Clank: Rift Apart uses the PS5 I/O not only for the portals and general loading but also for streaming during gameplay. I think one of their devs made a Twitter post which said that they only load what is in the player's viewport plus some percentage outside the player's view, to have a little wiggle room for loading during fast turns. Their plan for PS5 exclusives is, according to Cerny, to have only the next second of gameplay sitting in RAM. Such a scenario will be a challenge for DirectStorage and will also melt the occasional SSD without proper cooling in desktop PCs, I suppose :-?
 
Time to dig out prior TLDRs...


TLDR: They get 6 to 12 GB/s per 1 TF GPU Decompression. A mere 2 TF will outperform the PS5.

Post in thread 'Blazing Fast NVMEs and Direct Storage API for PCs *spawn*' https://forum.beyond3d.com/threads/...-storage-api-for-pcs-spawn.61761/post-2158006


GPU decompression won't take much to exceed that of even the PS5, maybe 2TF or so. I posted it before, but on an early version 1.0 of GPU-based decompression they were getting 60-120 GB/s on a PS5. There's still room for improvement, but even at first pass that's 6-12 GB/s per GPU TF used.

Here's a repost of the info from the Nvidia Ampere thread.

----------
RAD Game Tools comes up quite a few times in the Console Tech section. Here are some links to nice posts about it. Be sure to read the full Twitter threads as well.

Mostly that you could get 60-120 GB/s of textures decompressed if you used the entire PS5 GPU (10.28 TF). The Ampere has near that much TF to spare over and above the PS5.

Naturally, you wouldn't need to use that much, but it gives you an idea on how powerful the GPUs are when it comes to decompression.
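To make the "6-12 GB/s per TF" arithmetic easy to check, here is a rough sketch. It assumes decode throughput scales linearly with the share of GPU compute used, which the benchmark thread itself does not claim (it lists many caveats), and the function names are just for illustration. The only inputs are the figures quoted in this thread: 60-120 GB/s on the full 10.28 TF PS5 GPU.

```python
# Figures quoted in the thread: BC7Prep GPU decode of 60-120 GB/s
# when the entire PS5 GPU (10.28 TF) is used for it.
PS5_GPU_TFLOPS = 10.28
DECODE_GBPS_RANGE = (60.0, 120.0)


def decode_rate_per_tflop(decode_gbps, gpu_tflops=PS5_GPU_TFLOPS):
    """GB/s of decoded output per TF, assuming naive linear scaling."""
    return decode_gbps / gpu_tflops


def decode_rate_for_budget(tflops_budget):
    """Estimated (low, high) GB/s when only `tflops_budget` TF is spent on decode."""
    low, high = DECODE_GBPS_RANGE
    return (decode_rate_per_tflop(low) * tflops_budget,
            decode_rate_per_tflop(high) * tflops_budget)


if __name__ == "__main__":
    per_tf_low = decode_rate_per_tflop(DECODE_GBPS_RANGE[0])
    per_tf_high = decode_rate_per_tflop(DECODE_GBPS_RANGE[1])
    print(f"per TF: {per_tf_low:.1f}-{per_tf_high:.1f} GB/s")

    low, high = decode_rate_for_budget(2.0)
    print(f"2 TF budget: {low:.1f}-{high:.1f} GB/s")
```

Under that linear assumption a 2 TF budget lands at roughly 12-23 GB/s. Real async compute jobs share the GPU with rendering, so treat this as an upper-bound style estimate, not a measurement.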

https://forum.beyond3d.com/posts/2134570/

https://forum.beyond3d.com/posts/2151140/
https://forum.beyond3d.com/posts/2134405/


External references --

http://www.radgametools.com/oodlecompressors.htm
http://www.radgametools.com/oodletexture.htm
https://cbloomrants.blogspot.com/

Oodle is so fast that running the compressors and saving or loading compressed data is faster than just doing IO of uncompressed data. Oodle can decompress faster than the hardware decompression on PS4 and XBox One.



GPU benchmark info thread unrolled: https://threadreaderapp.com/thread/1274120303249985536

A few people have asked the last few days, and I hadn't benchmarked it before, so FWIW: BC7Prep GPU decode on PS5 (the only platform it currently ships on) is around 60-120GB/s throughput for large enough jobs (preferably, you want to decode >=256k at a time).
That's 60-120GB/s output BC7 data written; you also pay ~the same in read BW. MANY caveats here, primarily that peak decode BW if you get the entire GPU to do it is kind of the opposite of the way this is intended to be used.
These are quite lightweight async compute jobs meant to be running in the background. Also the shaders are very much not final, this is the initial version, there's already several improvements in the pipe. (...so many TODO items, so little time...)
Also, the GPU is not busy the entire time. There are several sync points in between so utilization isn't awesome (but hey that's why it's async compute - do it along other stuff). This is all likely to improve in the future; we're still at v1.0 of everything. :)
 
Being able to turn over all of your sound/mesh/texture assets within one second isn't the same as doing it literally every second, unless they're shipping very disorienting multi terabyte games.
 
lol no. Why would it be a challenge for DirectStorage... especially when you could keep the next 3-4 seconds in RAM? Game levels aren't changing entirely every 60 frames... They aren't swapping out the entire RAM and loading all new assets every second.. No games are streaming in that much data. No PS5 games will be doing that this generation. There's so many other bottlenecks elsewhere which prevent that from even being close to reality.

Also, Ratchet and Clank works perfectly fine with an underspec'd SSD in the PS5...
 
Everybody loves zero load times, but zero load times between levels are not that interesting in regards to gameplay.
Streaming of course made it possible to do more, but I hope that these SSD speeds will give us some interesting gameplay too.
Whether a PC will beat a PS5 in a speed race, or whether a PC port of a PS5 game doesn't run as fast as on the PS5, is the least interesting aspect, to me at least.
 
PC and Xbox Series are good to go. The people at RAD Game Tools, who helped Sony build the hardware decompressor and also helped on the software side, told PS fanboys to stop fantasizing. I think a SATA SSD will be good enough most of the time.

Outside of a hypothetical Doctor Strange game*, I don't see any scenario where an NVMe SSD would be pushed to the max during streaming. If other engines start using virtualized texturing and geometry like UE5 in the future, most games will run very well on a SATA SSD.

* With the scenery changing completely depending on the player's actions as Doctor Strange.
 
I think you misunderstood me. Ratchet & Clank doesn't keep ~3-4 seconds in RAM (like the immediate area around the player). Actually, what you don't see (i.e. what's not in your sight) is not in RAM. When you turn, the world behind you vanishes from RAM, and it is loaded in again if you turn back 180°.
That's the concept behind it.

Thx @ Brit for linking the thread. I'm currently reading it again. I think I was aware of it when it was fresh but did not participate because it was all theoretical, as DirectStorage was so far in the future.
 
Nah, I understood. I'm well aware of how it works. What I said was that on PC you could keep the next 3-4 seconds in RAM, unlike PS5... because there's far more of it... but that's not even the point. The point is that in a game level, even the stuff behind you, outside the view frustum, uses many of the same assets as what's right in front of you. There aren't unique textures and assets for every object in the game's world... the world is full of repeated assets and textures and enemies. That stuff always stays in RAM.. so the demand for streaming in the "missing" data that isn't already in memory is much lower.
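A toy sketch of that point, with made-up asset names (nothing here comes from an actual engine): if residency is tracked per asset, only the difference between what the new view needs and what is already in RAM has to be read from storage.

```python
def assets_to_stream(resident, needed):
    """Assets that must be read from storage: needed by the new view but not in RAM."""
    return needed - resident


# Hypothetical asset sets, purely for illustration.
resident = {"rock_a", "tree_b", "grass", "enemy_grunt"}
needed_after_turn = {"rock_a", "tree_b", "grass", "crate", "bridge"}

missing = assets_to_stream(resident, needed_after_turn)
print(sorted(missing))  # ['bridge', 'crate']
```

Three of the five assets needed after the turn are shared with the previous view, so only two have to be streamed in. How large the shared fraction is in a real game depends entirely on the content.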
 
R&C isn't as read intensive as many people think. I recorded a full 8 hour playthrough when I bought my secondary drive and it read exactly 1512GB from disk which translates to 53MB/s on average.

The burst reads during area warps are only in the 400-500MB range.
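For anyone who wants to redo the average, a quick check of the numbers above, assuming decimal units (1 GB = 1000 MB) and a playthrough of exactly 8 hours:

```python
def average_read_rate_mb_s(total_gb, hours):
    """Average read throughput in MB/s, using decimal units (1 GB = 1000 MB)."""
    return total_gb * 1000 / (hours * 3600)


print(f"{average_read_rate_mb_s(1512, 8):.1f} MB/s")  # 52.5 MB/s
```

That is 52.5 MB/s, which rounds to the ~53MB/s quoted.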
 
That actually makes perfect sense as that's what Spiderman on PC can hit at burst speeds too.

Insomniac have said they're being limited by their engine now in terms of I/O, so it looks like their engine's limit is 400-500MB/s.
 
Yup! Even if you only had the next 1 second* of gameplay in memory, the *next* second's gameplay might only involve changing a few hundred or a few tens of MB (or less) of what's actually in RAM.

*no game actually has a constant data change rate, other than maybe, I dunno, an FMV game?
 
So yeah, I read the linked thread and I still wouldn't believe everything Jensen says. The idea that a GPU will outperform PS5 with 2 TF is a statement from Nvidia. I'll believe it when I see it. Btw I am talking here about RTX 2xxx cards and maybe 3xxx cards. A PS5 does not need to be better than an RTX 4xxx card. A GPU with 100-odd TF will of course have enough resources to do the decompression. Also, again, decompression is one thing. But PS5 has a lot more custom tech to account for the bottlenecks that would follow the decompression. I am not seeing the PS5 I/O block in its entirety being mimicked with DirectStorage.
And another thought (I mentioned that before):
If decompressing on the GPU were such a logical and viable thing to do, why did they not choose this path with the PS5? They could have avoided all the custom I/O shenanigans and had a beefier GPU instead. But they went with the heavily customized I/O solution anyway.
Something is not adding up. There's more to PS5's I/O, and lovely Jensen flipping a switch and "just leaving the GPU all the work" is not going to cut it.

I remember reading about that 400MB/s with Ratchet & Clank as well. Also the fact that lesser add-on SSDs were capable of streaming the game on PS5. Which both make perfect sense if the solution they've found with their first PS5 exclusive is a mediocre one, because their engine is not at all fitted to the bandwidth provided by PS5.
But, contrary to what one might think, that doesn't mean PS5 exclusives will stay light on I/O in the future; rather, they will rely much more heavily on PS5's bandwidth. What we see is just the beginning, and devs on PS5, Xbox and PC are just starting to figure out what to do with all those cores and I/O.
Happy times ahead. :yes:
 
Keep in mind that these loads happen in much less than a second. 500MB loaded in 0.1s is still equivalent to 5GB/s.
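A small sketch of that conversion. The burst size is the 500MB figure from this thread; the load windows are assumptions for illustration, since nobody here measured how long a burst actually takes.

```python
def equivalent_bandwidth_gb_s(burst_mb, seconds):
    """Sustained-rate equivalent (GB/s, decimal) of reading `burst_mb` in `seconds`."""
    return burst_mb / 1000 / seconds


# Assumed load windows; only the 0.1 s case is the one mentioned above.
for window_s in (1.0, 0.5, 0.1):
    rate = equivalent_bandwidth_gb_s(500, window_s)
    print(f"500 MB in {window_s:.1f} s -> {rate:.1f} GB/s")
```

So the same 500MB burst is anywhere from 0.5 GB/s to 5 GB/s depending on how short the window really is, which is why the burst size alone doesn't tell you what drive speed is needed.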
 
I think you're overselling the PS5 solution. It is great in a console. It is better from a cost perspective and, from what the RAD Game Tools people hint, probably also from a performance-per-watt perspective, but it is not magic.

We're starting to see people test DirectStorage with demos on PC, and the results are good.

Outside of a hypothetical Doctor Strange game pushing unpredictable bursts all the time because it is part of the gameplay, I don't see any game where a SATA SSD, or an NVMe SSD slower than the PS5's (and probably the Xbox Series') drive, will be a problem, or where the bursts will significantly reduce the GPU resources available for rendering.
 