Next-Generation NVMe SSD and I/O Technology [PC, PS5, XBSX|S]

Interesting that the initial load is faster on the HDD, as I've found some early access UE4 games that load up significantly faster on my 5400RPM drive than they do on a SATA SSD I store games on. One game took almost 10 minutes to load to the menu from the SSD and loads in about 60-90 seconds on the slow HDD.
 
Interesting that the initial load is faster on the HDD, as I've found some early access UE4 games that load up significantly faster on my 5400RPM drive than they do on a SATA SSD I store games on. One game took almost 10 minutes to load to the menu from the SSD and loads in about 60-90 seconds on the slow HDD.
Those 10 minutes make me think it was probably compiling the shaders.
 
AFAIK PC games mostly use zlib, and Windows' I/O is a complete drag compared to the new consoles that are made for NVMe SSDs.

I have 64GB RAM and I've tried to run games directly from a RAM drive that benches at over 10GB/s (and has much lower latency than any SSD out there). The result is really disappointing, as the CPU becomes a major bottleneck thanks to all the hoops Windows makes game data jump through, plus zlib's poor CPU decompression performance.
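
If anyone wants to see how limiting plain zlib is, a quick single-core micro-benchmark along these lines makes the point (Python; the half-random test data is just a stand-in for real game assets, which will obviously compress differently):

```
import os, time, zlib

# ~256 MB of semi-compressible stand-in data (half random, half zeros).
chunk = os.urandom(1 << 20) + bytes(1 << 20)
raw = chunk * 128
packed = zlib.compress(raw, 6)

start = time.perf_counter()
out = zlib.decompress(packed)
elapsed = time.perf_counter() - start

assert out == raw
print(f"ratio:  {len(raw) / len(packed):.2f}x")
print(f"decode: {len(raw) / elapsed / 1e9:.2f} GB/s on one core")
```

On a typical desktop core the decode figure lands well below the read speed of a fast NVMe drive, which is the whole problem.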

The lack of improvement you're seeing from a RAM drive likely has very little to do with the IO and decompression overhead (which is worse on PC right now, no question) and more to do with the fact that the game isn't designed to take advantage of such high-speed IO. I.e. it's bottlenecked by some other CPU process long before the impact of a RAM drive or even an NVMe drive comes into play. One of the RAD Game Tools guys had an amazing Twitter thread on this topic which I think Dictator may have posted previously, possibly in this very thread. That's why most PC games load no faster on an NVMe drive than on a SATA SSD, and the same reasoning applies to the majority of next-gen console games that don't load any (or significantly) faster than their PC counterparts.

Yes, this is unfortunately true.
I only know of two non-game demos on PC requiring SSDs to run: the UE5 demo and Star Citizen. Neither of them is a game, as all games on the market still need to support HDDs.

There are no PC games requiring SATA SSDs, much less NVMe PCIe 3.0 at >2GB/s, let alone NVMe PCIe 4.0 at >5GB/s.

Even after DirectStorage comes out, IMO it'll be years before PC games can afford to demand a minimum 2.5GB/s NVMe from their audience like the Series consoles do. We might see something that takes advantage of DirectStorage and faster I/O for faster loading, but not faster I/O that is instrumental to gameplay (like we see in e.g. Ratchet & Clank: Rift Apart or pretty much any UE5 game that makes heavy use of Nanite).

Once again, you do not need a high minimum standard of IO performance on PC to release games that can take advantage of high IO performance, yes even at a gameplay level... PC games scale. Take your R&C example. Your assertion here is that without making the minimum requirement on PC something in the ballpark of a 5.5GB/s NVMe SSD, it could not be released on PC. And here's why that's not the case:

1. You incorporate scaling options into the game to reduce the IO requirement: texture resolution, LOD, draw distance etc.
2. You implement pre-caching to system RAM on PCs with sufficient memory to ease the IO requirements (see the sketch after this list).
3. You advise users of a reduced experience on drives below a specific standard, perhaps recommending an NVMe (of any speed) and making a SATA SSD the minimum. That wouldn't be at all unreasonable over the next few years; several games already do this, e.g. Cyberpunk and The Medium.
4. Users for whom none of the above helps have to suffer through a few extra seconds of portal transit animation at those relatively occasional points in the game where it's required. Hardly a deal breaker.
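
To make points 1 and 2 concrete, here's a toy Python sketch of the kind of decision a port could make at startup. All the thresholds and setting names are invented for illustration, not taken from any real game:

```
# Toy startup heuristic for a hypothetical PS5-class port on PC.
def pick_streaming_profile(drive_gbps: float, free_ram_gb: float) -> dict:
    if drive_gbps >= 5.5:              # PS5-class NVMe: no compromises
        return {"texture_res": "4k", "precache_gb": 0}
    if drive_gbps >= 2.5:              # mid-range NVMe: halving texture res
        # roughly quarters texture bandwidth; pre-cache a little on top
        return {"texture_res": "2k", "precache_gb": min(free_ram_gb, 8)}
    # SATA SSD floor: lean hard on RAM pre-caching and aggressive LOD
    return {"texture_res": "1k",
            "precache_gb": min(free_ram_gb, 16),
            "lod_bias": "aggressive"}

print(pick_streaming_profile(drive_gbps=3.0, free_ram_gb=24))
# -> {'texture_res': '2k', 'precache_gb': 8}
```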

If you build a game that depends on NVMe levels of access times and bandwidth, people with their games installed on HDDs won't be able to play them. We know what will happen when they try:


Yes, so you include scaling options, and if you really need to, you set a SATA SSD as the minimum spec to be used in combination with reduced settings. Or alternatively you set a minimum level of system RAM and implement a good pre-caching system. There are plenty of ways around this.
 
Yes, so you include scaling options, and if you really need to, you set a SATA SSD as the minimum spec to be used in combination with reduced settings. Or alternatively you set a minimum level of system RAM and implement a good pre-caching system. There are plenty of ways around this.
There are ways around it if people invest in and upgrade their PCs. You either need to invest in a fast SSD or a lot of RAM. I'm not sure how big the overlap is in the Venn diagram of PC owners who have HDDs or slow SSDs but also 32-64GB of RAM.
 
There are ways around it if people invest in and upgrade their PCs.

That would obviously be preferable, but as noted above, there's no reason the software shouldn't be able to scale down to reduce the IO requirement, meaning hardware equivalence is not required in order to play ports of games designed around the PS5's fast IO.

You either need to invest in a fast SSD or a lot of RAM.

Or neither and live with lower settings (subject to minimum hardware requirements).

I'm not sure how big the overlap is in the Venn diagram of PC owners who have HDDs or slow SSDs but also 32-64GB of RAM.

I imagine there would be quite a few PC owners with 32GB RAM and SATA SSDs. But why stop there? Anything less than a 5.5GB/s PS5-equivalent SSD could benefit from pre-caching to additional RAM. I'd wager a very significant percentage of gaming PCs sporting a PCIe3 NVMe drive feature at least 32GB RAM, for example. In that case the extra 24GB (over what Windows uses) could potentially be used to pre-cache sufficient data to allow, for example, a 3GB/s NVMe drive to offer an equivalent experience to the PS5.
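
As a back-of-envelope check on that idea (all numbers assumed, and it only works if the pre-cache actually holds the right data):

```
# How long can spare RAM paper over a slower drive? Assumed figures only.
demand_gbps  = 5.5    # PS5-like worst-case streaming demand
drive_gbps   = 3.0    # a decent PCIe3 NVMe
spare_ram_gb = 24.0   # 32 GB system minus ~8 GB for Windows and the game

shortfall = demand_gbps - drive_gbps    # 2.5 GB/s deficit
burst_sec = spare_ram_gb / shortfall    # how long the cache can absorb it
print(f"{spare_ram_gb:.0f} GB of pre-cache covers ~{burst_sec:.0f} s of worst-case demand")
```

Covering roughly ten seconds of absolute worst-case demand is plausible for burst scenarios like portal transitions; sustained minutes of it obviously couldn't be bridged this way.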
 
That would obviously be preferable, but as noted above, there's no reason the software shouldn't be able to scale down to reduce the IO requirement, meaning hardware equivalence is not required in order to play ports of games designed around the PS5's fast IO.

But ToTTenTranz said "a game that depends on NVMe levels of access times":

If you build a game that depends on NVMe levels of access times and bandwidth, people with their games installed on HDDs won't be able to play them.

If the game depends upon it, how do you scale it down? If you can scale it down and still run the game, it no longer depends on it. :???: Even if somebody would be content with awful texture pop-in, if the game depends on streaming in vast amounts of basic geometry and cannot do it quickly enough, in the sustained way the game requires, it will fall apart very quickly.
 
But ToTTenTranz said "a game that depends on NVMe levels of access times":

If the game depends upon it, how do you scale it down? If you can scale it down and still run the game, it no longer depends on it. :???: Even if somebody would be content with awful texture pop-in, if the game depends on streaming in vast amounts of basic geometry and cannot do it quickly enough, in the sustained way the game requires, it will fall apart very quickly.

The same way every game on the PS5 depends on a 36CU RDNA2 GPU and yet still manages to run just fine on PCs of lesser capability at lower settings. On the PS5 the game depends upon a 5.5GB/s drive at a specific set of graphical settings. If you change those settings you obviously change the hardware requirement. The easiest and most obvious way of doing this is to reduce texture resolution. Textures are by far the biggest consumer of IO bandwidth (Microsoft claims 80% in a typical game) and a simple reduction from, say, 4K to 2K would quarter their size, resulting in a massive reduction in IO requirement. Geometry requirements could be reduced through more aggressive LOD settings or decreased draw distance. It's not as if any of these settings are uncommon in PC games, and people are already content to use them when running hardware below the recommended requirements.
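
The arithmetic behind that, for what it's worth (the 80% texture share is Microsoft's "typical game" figure; everything else here is illustrative):

```
total_gbps   = 5.5                # PS5-class worst-case streaming demand
texture_gbps = total_gbps * 0.8   # ~4.4 GB/s of textures (MS's 80% figure)
other_gbps   = total_gbps * 0.2   # ~1.1 GB/s of geometry, audio, etc.

# Dropping textures from 4K to 2K quarters their footprint.
scaled = texture_gbps / 4 + other_gbps
print(f"{scaled:.1f} GB/s")       # ~2.2 GB/s: modest-NVMe territory
```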
 
The same way every game on the PS5 depends on a 36CU RDNA2 GPU and yet still manages to run just fine on PCs of lesser capability at lower settings.
No PS5 game is running on PCs of lesser capability because there are no PC ports of PS5 games. So far Sony has only released PC ports of PS4 games that are designed for HDDs. PS5 ports haven't even been announced.


I think you're making quite a number of assumptions around I/O scaling down that go directly against statements from several developers, while having no proof of these claims.
There are plenty of videos on Youtube showing what happens when you play games/demos that depend on low-latency I/O to stream assets from an HDD, and I posted one of those above.
Though this is one of those circular arguments on B3D where everyone has made their point and no one will budge until we get e.g. one PS5 game that depends on constant streaming (like Rift Apart) launching on PC with formal support for HDDs (and without running like ass).
At this stage the discussion seems a bit pointless IMO.



I will say one thing though: I was wrong about the PS5's I/O being completely unmatched on the PC side for several years. Oodle Selkie is an absolute beast at CPU decompression, and with Oodle Texture it can already get a better compression ratio than zlib. With Selkie, 1 CPU core can already output close to 6GB/s from a 2.5GB/s source, which is probably plenty for most use cases, and with a 7GB/s NVMe drive and up to 3 CPU cores that can rise to over 15GB/s.
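
Running those numbers (all figures are the claimed ones above, not measurements of mine):

```
out_per_core_gbps = 6.0    # claimed Selkie decompressed output per CPU core
ratio = 6.0 / 2.5          # ~2.4x, implied by 6 GB/s out of a 2.5 GB/s source
drive_gbps = 7.0           # fast PCIe4 NVMe read speed
cores = 3

by_drive = drive_gbps * ratio         # the drive can feed ~16.8 GB/s of output
by_cpu   = cores * out_per_core_gbps  # 3 cores can emit ~18 GB/s
print(f"effective output: {min(by_drive, by_cpu):.1f} GB/s")  # ~16.8 GB/s
```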

This might be where future PS5 -> PC ports will go. They'll probably demand a fast NVMe drive and use Selkie for CPU decompression, after DirectStorage is figured out on Windows 11.
I think they'll do that instead of redesigning the game's I/O engine, which streams data on the fly, into something that uses massive amounts of RAM as a cache; that probably represents a development cost well above whatever Sony is willing to spend on PS5 games that have lost sales relevance on the original platform.
 
The same way every game on the PS5 depends on a 36CU RDNA2 GPU and yet still manages to run just fine on PCs of lesser capability at lower settings.

Because some games are designed to run on lower-specification hardware like last-gen consoles and older PCs. This seems to be a difficult concept for you to grasp but - again - ToTTenTranz is talking about a game that isn't designed to run on older hardware.

That's the entire point. Embracing the new I/O paradigm and eschewing older hardware and the problems it caused for game designers and developers.

On the PS5 the game depends upon a 5.5GB/s drive at a specific set of graphical settings.

I think most people would agree that what limits PS5's graphics has nothing to do with the NVMe and I/O systems; it's the available GDDR6, the internal memory bandwidth and the number of CUs. In a bunch of instances the Series X is pushing higher resolutions and better quality effects with a slower storage drive.

I don't doubt for a second that many games could physically feed higher quality assets to PS5 game engines, but the graphics hardware couldn't do them justice in terms of the PS5's raw performance.
 
No PS5 game is running on PCs of lesser capability because there are no PC ports of PS5 games. So far Sony has only released PC ports of PS4 games that are designed for HDDs. PS5 ports haven't even been announced.

I think you're making quite a number of assumptions around I/O scaling down that go directly against statements from several developers, while having no proof of these claims.

Because some games are designed to run on lower-specification hardware like last-gen consoles and older PCs. This seems to be a difficult concept for you to grasp but - again - ToTTenTranz is talking about a game that isn't designed to run on older hardware.

That's the entire point. Embracing the new I/O paradigm and eschewing older hardware and the problems it caused for game designers and developers.

I think most people would agree that what limits PS5's graphics has nothing to do with the NVMe and I/O systems; it's the available GDDR6, the internal memory bandwidth and the number of CUs. In a bunch of instances the Series X is pushing higher resolutions and better quality effects with a slower storage drive.

I don't doubt for a second that many games could physically feed higher quality assets to PS5 game engines, but the graphics hardware couldn't do them justice in terms of the PS5's raw performance.

Okay, so let's be absolutely clear about what you're both saying here... Is it your claim that reducing texture resolution, geometry LOD and draw distance in a game like Ratchet and Clank: Rift Apart would have zero impact on the amount of data it has to stream from the SSD?
 
Okay, so let's be absolutely clear about what you're both saying here... Is it your claim that reducing texture resolution, geometry LOD and draw distance in a game like Ratchet and Clank: Rift Apart would have zero impact on the amount of data it has to stream from the SSD?

No.
The point I'm making (and I think @DSoup is making as well) is that a game built around an I/O engine that streams data with the expected latency of an NVMe drive will probably not work decently from an HDD, regardless of texture resolution, geometry LOD and draw distance.

Sure, you can reduce the IQ settings and resolution, and that will lower the raw bandwidth demands on the I/O system, but not the latency.
 
Okay, so let's be absolutely clear about what you're both saying here... Is it your claim that reducing texture resolution, geometry LOD and draw distance in a game like Ratchet and Clank: Rift Apart would have zero impact on the amount of data it has to stream from the SSD?
No, that is not what I'm saying. How did you get there? :-?

There are things you can pare back. But if there were no bounds to the ability to scale, PC games wouldn't need minimum hardware requirements.
 
Is it your claim that reducing texture resolution, geometry LOD and draw distance in a game like Ratchet and Clank: Rift Apart would have zero impact on the amount of data it has to stream from the SSD?



Excellent. So you both acknowledge that it's possible to scale the graphics of a game downwards to reduce the IO requirement, and that therefore it would not require a 5.5GB/s SSD in a PC to enable a port of a PS5 game which takes full advantage of the PS5's IO. Since I never claimed such games could necessarily be scaled down to HDDs, we have no argument there and it seems we're now in full agreement.
 
No.
The point I'm making (and I think @DSoup is making as well) is that a game built around an I/O engine that streams data with the expected latency of an NVMe drive will probably not work decently from an HDD, regardless of texture resolution, geometry LOD and draw distance.

Sure, you can reduce the IQ settings and resolution, and that will lower the raw bandwidth demands on the I/O system, but not the latency.
RAM has lower latency than an NVMe drive...

That game, ported to PC, would likely require a higher system/video memory allocation from the game to reduce the strain on the HDD. The developers would likely opt to prefetch more of the game data ahead of time. RAM can act as a buffer, giving the game more time to load in assets.
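
A bare-bones sketch of that prefetching idea (the file layout and prediction logic are invented here, and a real engine would use its own async IO rather than Python threads):

```
import threading, queue

# Toy prefetcher: a background thread reads assets the game predicts it
# will need soon, so slow-drive latency is paid early and hidden in RAM.
cache = {}               # asset path -> bytes, held in RAM
todo = queue.Queue()     # paths the game predicts it will need soon

def prefetch_worker():
    while True:
        path = todo.get()
        if path is None:          # shutdown sentinel
            break
        if path not in cache:
            with open(path, "rb") as f:
                cache[path] = f.read()    # slow read happens ahead of time

threading.Thread(target=prefetch_worker, daemon=True).start()

def request_asset(path):
    # Hit: served at RAM speed. Miss: we eat the full drive latency.
    if path in cache:
        return cache.pop(path)
    with open(path, "rb") as f:
        return f.read()
```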
 
IO speeds != latency / seek times

Sort of want to echo that: one part of the debate is really about how data is laid out on a hard drive, with levels therefore reliant on seek times, versus the quality and size of the textures, i.e. the IO transfer rate.

Having near-zero seek times allows for very dynamic levels in which many different things can come together at once. Having huge seek times means curated levels, inventory, options etc., because it's difficult to recall anything on demand.

The above can be scaled in texture sizes to suit the IO, but that doesn't mean a player can just randomly grab anything from anywhere in the assets folder without an impact on their game performance.
 
I think I kind of follow the logic, or at least what I think is being said.

If you use 1/4 the texture sizes, you can use the same memory buffer to read 4x the resources, which allows for 4x the IO latency range. Even assuming perfect scaling, I don't think going from something built for 8μs to something adjusted for 32μs will be enough. I think you'd need to double or triple the memory buffer.

Of course you need to have predictive loading and know exactly what to preload. This is the biggest issue. I don't see that being feasible without extensive profiling and constraints. As a DEV I'd prob just brute force it with a huge booty buffer and use base textures that are always available in memory until the specific unique texture is loaded.
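
That base-texture fallback is easy to picture in code; something like this hypothetical sketch, where the renderer never stalls waiting for the unique asset (all names are illustrative):

```
# Fallback pattern: render with an always-resident low-res base mip
# immediately, swap in the unique high-res mip whenever the (slow)
# streaming request completes.
class StreamedTexture:
    def __init__(self, base_mip):
        self.base_mip = base_mip   # tiny, loaded once at level start
        self.full_mip = None       # filled in later by the streaming system

    def on_stream_complete(self, data):
        self.full_mip = data       # async IO callback

    def bind(self):
        # Never stall a frame: fall back to the resident base mip.
        return self.full_mip if self.full_mip is not None else self.base_mip
```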
 
I think there's a lot of validity in the argument that you can't scale down to HDD levels of performance (unless you have a very big pool of RAM to pre-cache a significant proportion of the game to). I believe Epic's Brian Karis said that it was SSD latency which made Nanite possible in the first place. HDDs have latencies in the millisecond range, SATA SSDs are in the 10s-100s of microseconds, and NVMe drives are in the 1s-10s of microseconds. So I think it's quite likely that a SATA SSD could be made the minimum spec for a PC port of a PS5 exclusive. I could imagine some extreme cases where an NVMe drive may be made the minimum spec as well, in part because DirectStorage doesn't support SATA SSDs. However, in those cases I'm pretty certain there will be no specific speed associated with the NVMe requirement, i.e. there will never be a scenario where a PC port of a PS5 exclusive requires at least a 5.5GB/s PCIe4 NVMe drive.
 
I think I kind of follow the logic, or at least what I think is being said.

If you use 1/4 the texture sizes, you can use the same memory buffer to read 4x the resources, which allows for 4x the IO latency range. Even assuming perfect scaling, I don't think going from something built for 8μs to something adjusted for 32μs will be enough. I think you'd need to double or triple the memory buffer.

Of course you need to have predictive loading and know exactly what to preload. This is the biggest issue. I don't see that being feasible without extensive profiling and constraints. As a DEV I'd prob just brute force it with a huge booty buffer and use base textures that are always available in memory until the specific unique texture is loaded.
From what I understand this was typically handled by texture and asset duplication per area, so that seek times were avoided and each level/area had the same duplicated assets built into the zone.

I wouldn’t be surprised were this the case on PC as well.
 