Blazing Fast NVMe and DirectStorage API for PCs *spawn*

Will they though? Game install sizes don't look like they're going to go up that much, and these consoles have twice as much VRAM as last generation.

Look at Miles Morales: a 40GB install. Let's say 80GB after decompression. You can fit 20% of the entire game in VRAM at any given point. I don't see why, under those circumstances, you'd need to be streaming much from the SSD at all, merely enough to keep VRAM topped up as you traverse the map.

If you were streaming at the PS5's maximum speed, you'd have read the entire game's content in less than 10 seconds. I expect actual in-game streaming speeds to be far, far below the peak burst speed used at loading screens.
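The arithmetic behind this can be sanity-checked in a few lines. All the figures below are illustrative assumptions (the throughput number in particular is a hypothetical effective rate, not a measured one):

```python
# Rough sanity check of the figures above (all numbers are assumptions).
install_size_gb = 40        # Miles Morales install, from the post
decompressed_gb = 80        # assuming roughly 2:1 compression
memory_budget_gb = 16       # next-gen console unified memory
peak_rate_gbps = 8.5        # hypothetical effective (post-decompression) rate

fraction_resident = memory_budget_gb / decompressed_gb
seconds_for_whole_game = decompressed_gb / peak_rate_gbps

print(f"{fraction_resident:.0%} of the game resident at once")
print(f"whole game readable in ~{seconds_for_whole_game:.1f} s at peak")
```

Under these assumptions, a fifth of the whole game fits in memory and the entire thing could be read in under ten seconds, which is why sustained peak-rate streaming would exhaust the content almost immediately.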

This is a really good point and I see it as a perfectly valid consideration. I just want to add some nuance, while agreeing with the broader argument.

Even if you have 16GB of RAM on next-gen machines, you won't be filling that up to the brim with static assets brought in from storage. Of course, we never did that before either; we need a place to store game state, physics/animation data, dynamic lighting, framebuffers, volumetric data, etc. But the thing is, once you know you can grab random assets from storage at a finger's snap and at fine granularity, engines might start using much more of those 16GB for the aforementioned dynamic stuff, and the actual pool for static asset data might shrink to as low as 4GB or less. That trades more aggressive constant streaming for more available RAM to be used for the actual cool shit.

Also, I can think of many scenarios where you are constantly cycling through repeated data without even noticing it.

Imagine Spider-Man: you are cruising through NY. In the game there are four types of food stand (hypothetical here): hot dogs, burgers, ice cream, and boiled goose. As you cruise across town at speed, you may pass multiple dozens of those within a minute or so of gameplay, yet there is usually only one of them in view at full-detail LOD. That means the game is streaming the data for those carts in, out, and in again dozens of times every few seconds. Something a game would never do last gen, but which now becomes sensible.
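The food-cart scenario can be sketched as a tiny LRU cache standing in for a small resident asset pool; evicted assets get re-streamed from the SSD on demand. Cart names, sizes, and the pool capacity here are all made up for illustration:

```python
# Toy model of the food-cart scenario: a small resident pool (LRU cache)
# that re-streams cart assets as they cycle in and out of view.
from collections import OrderedDict

class AssetPool:
    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.resident = OrderedDict()   # asset name -> size in MB
        self.ssd_reads = 0              # counts simulated streaming traffic

    def touch(self, name, size_mb):
        if name in self.resident:       # already in RAM: mark recently used
            self.resident.move_to_end(name)
            return
        self.ssd_reads += 1             # miss: "stream" the asset from SSD
        while sum(self.resident.values()) + size_mb > self.capacity_mb:
            self.resident.popitem(last=False)   # evict least recently used
        self.resident[name] = size_mb

pool = AssetPool(capacity_mb=64)        # room for only one full-LOD cart
carts = ["hot_dog", "burger", "ice_cream", "boiled_goose"]
for _ in range(30):                     # a minute of swinging across town
    for cart in carts:
        pool.touch(cart, size_mb=50)

print(f"{pool.ssd_reads} streaming reads for just {len(carts)} unique carts")
```

With a pool that only fits one cart at full LOD, every pass over the four cart types misses the cache, so a minute of traversal generates 120 reads of the same four assets, exactly the kind of repeated churn described above.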
 

Or an even simpler example: imagine a shooter where you constantly switch weapons. On last gen(s), all the weapon models (and related animations, sounds, particle effects, UI graphics, etc.) would stay resident in RAM constantly. Now it can all be pulled in from storage on weapon switch. That means, during a single level, a player can potentially switch weapons enough times for the game to pull thousands of GB through the streaming system, all by cycling through a relatively small set of repeating data.
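A quick order-of-magnitude check on the weapon-switch idea; every figure below is a made-up assumption, chosen only to show how the traffic accumulates:

```python
# Back-of-envelope numbers for the weapon-switch example.
# All figures are hypothetical assumptions.
mb_per_weapon_set = 150      # model + animations + sounds + effects
switches_per_minute = 4
minutes_per_level = 30
levels_per_playthrough = 40  # including replays

gb_per_level = mb_per_weapon_set * switches_per_minute * minutes_per_level / 1024
gb_total = gb_per_level * levels_per_playthrough
print(f"~{gb_per_level:.0f} GB per level, ~{gb_total:.0f} GB per playthrough")
```

Even with these modest numbers, a single level streams on the order of tens of GB and a full playthrough hundreds, all from a weapon set that might only be a few hundred MB of unique data on disk.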
 

Now imagine all sounds are streamed exactly on demand from the SSD. Every time you shoot a gun, RAM holds only the first 1/4 second of that sound clip, and the rest is pulled in as it starts playing. That's what, a couple of KB of data? But those couple of KB will be streamed over and over and over throughout a game session, literally thousands of times.

Imagine every sound effect, gunshot, explosion, footstep, and line of dialogue is pulled on demand, in small chunks, loaded just in time. All of those are assets (sound assets in this case) that end up being re-used repeatedly millions of times during a game.
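A sketch of what that just-in-time audio model might look like: only the head of each clip stays resident, and the tail is "fetched" each time playback starts. Clip names, durations, and the byte rate are illustrative assumptions, not anyone's actual engine design:

```python
# Sketch of just-in-time audio: only the first quarter second of each clip
# stays resident; the tail is pulled from storage each time playback starts.
HEAD_MS = 250                                  # resident head, ~1/4 second
BYTES_PER_MS = 192                             # 48 kHz, 16-bit stereo PCM

class StreamedClip:
    def __init__(self, name, duration_ms):
        self.name = name
        self.head = bytes(HEAD_MS * BYTES_PER_MS)              # kept in RAM
        self.tail_bytes = max(0, duration_ms - HEAD_MS) * BYTES_PER_MS
        self.bytes_streamed = 0                # total pulled from "SSD"

    def play(self):
        # Head plays instantly from RAM; tail is fetched while it plays.
        self.bytes_streamed += self.tail_bytes

gunshot = StreamedClip("gunshot", duration_ms=800)
for _ in range(1000):                          # a busy firefight
    gunshot.play()
print(f"resident: {len(gunshot.head) // 1024} KB, "
      f"streamed: {gunshot.bytes_streamed / 1e6:.0f} MB over 1000 plays")
```

At these rates, a single 800 ms gunshot keeps under 50 KB resident but generates over 100 MB of streaming traffic across a thousand plays, which is the repeated-small-data pattern the post describes.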

The same can be done with character animations, by the way...

Once you open up to modern, ambitious uses of fast streaming, the idea of those crazy speeds being used more often than just at level load becomes way more plausible.
 
Theoretically the game would upload the game audio into the X-RAM so it didn't have to use system RAM or stream it off the HDD.
The cheaper X-Fi cards had 2MB of X-RAM, and I think it was just used as a cache, so the game didn't need to be coded to take advantage of it (could be wrong on that).

Here's some info on X-RAM:
http://ixbtlabs.com/articles2/multimedia/creative-x-fi-part3.html

Oh OK, thanks, yes I remember that. I have to say, those higher-end X-Fi models did deliver. Not all games supported it, but quite a few did, considering what a niche market the 64MB X-Fi versions were.
No idea about performance, but in BF2, for example, there was a huge difference in sound quality and placement (channels too).
 
This sounds like exactly what RTX IO is doing. More evidence to suggest RTX IO is just Nvidia's branding of DirectStorage.


DirectX DirectStorage is another API, but one that's not dedicated to graphics.

This software library was originally created for Xbox Series X / S consoles and is being brought across to Windows 10 computers to offer similar advantages. DirectStorage will allow games and other programs more direct access to resources on a primary storage drive, than how things are currently done.
...
The key tricks DirectStorage employs to help out are bundling I/O requests into batches, rather than handling them in a serial manner, and letting games decide when they need to be told that a request has been completed. The CPU still has to work through all of this, of course, but it can now apply multiple threads in parallel to the task.
...
DirectStorage probably won't provide any boost for older PCs and games, but for the latest machines and titles to come, the use of this API will help to give us quicker loading times, faster data streaming, and a bit more CPU breathing space. All thanks to a new software library!
...
Meanwhile, you may have heard that Nvidia is developing something they're calling RTX IO. This system is not the same DirectStorage under a proprietary name, as it's about having a means to bypass copying resources to the system memory. Instead, a game could directly transfer data from the storage drive to the graphics card's local memory.

However, it is being designed to be used in conjunction with Microsoft's forthcoming API, to further reduce the amount of I/O requests the CPU manages. And where one vendor has started something, others soon follow.
November 30, 2020
What's New in DirectX 12? Understanding DirectML, DirectX Raytracing and DirectStorage (techspot.com)
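The batching idea the article describes can be illustrated in miniature: instead of issuing one read, waiting, and issuing the next, requests are queued and submitted together, and the caller checks completion once per batch. This mimics the DirectStorage model in spirit only; the real API is C++/COM, and everything below (class names, thread count) is a made-up sketch:

```python
# Toy illustration of batched vs. serial I/O requests.
from concurrent.futures import ThreadPoolExecutor

def read_block(path, offset, length):
    """One storage request: read `length` bytes at `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

class BatchQueue:
    def __init__(self, workers=8):
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self.pending = []

    def enqueue(self, path, offset, length):
        self.pending.append((path, offset, length))    # no I/O happens yet

    def submit(self):
        # Issue every queued request in parallel; the caller can then wait
        # on the whole batch once instead of per request.
        futures = [self.pool.submit(read_block, *r) for r in self.pending]
        self.pending.clear()
        return futures
```

A game engine using this pattern would enqueue every block needed for the next frame, call submit() once, and react to a single batch-level completion rather than thousands of tiny per-request notifications.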
 

Disappointing if accurate, but they do come across as if they know this rather than speculating.

That said, they do say this at the start of the article which suggests they don't have any more information than we do and thus could simply be drawing their own conclusions:

"The third enhancement we're going to cover hasn't been released yet, and it's still yet to hit the developer preview stage. That means details are still scarce besides what Microsoft has told us via their developer blog."
 
True, at this point it's largely speculation whether the two features perform the same function. From past reading I know Nvidia was keen on developing a means of direct transfer to GPU memory in lieu of relying on system memory. While the end result might be the same, it's possible Nvidia has their own take on the API's purpose, with less dependence on the CPU for memory transfers. Time will tell...
 
So Direct Storage improves batching and I/O request handling on the CPU side, and allows Nvidia and AMD to "plug in" their respective GPU decompression technologies which gives the GPU access to local storage directly, further reducing CPU related I/O work.
 
So it seems that indeed DirectStorage on PC is crippled and NVIDIA is uncrippling it, shame on Microsoft.
 

I read the devblog again; they never say the CPU needs to read the data before sending it to the GPU. It looks like the techspot article is wrong.

In a world where a game knows it needs to load and decompress thousands of blocks for the next frame, the one-at-a-time model results in loss of efficiency at various points in the data block’s journey. The DirectStorage API is architected in a way that takes all this into account and maximizes performance throughout the entire pipeline from NVMe drive all the way to the GPU.

It does this in several ways: by reducing per-request NVMe overhead, enabling batched many-at-a-time parallel IO requests which can be efficiently fed to the GPU, and giving games finer grain control over when they get notified of IO request completion instead of having to react to every tiny IO completion.

In this way, developers are given an extremely efficient way to submit/handle many orders of magnitude more IO requests than ever before ultimately minimizing the time you wait to get in game, and bringing you larger, more detailed virtual worlds that load in as fast as your game character can move through it.
 
So Direct Storage improves batching and I/O request handling on the CPU side, and allows Nvidia and AMD to "plug in" their respective GPU decompression technologies which gives the GPU access to local storage directly, further reducing CPU related I/O work.

If that is the case, which still seems a bit up in the air, then the problem is that AMD haven't announced any GPU-based decompression capability, and the ideal time to have done so has already passed.
 
If that is the case, which still seems to be a bit up in the air, then the problem is that AMD haven't announced any GPU based decompression capability, and the ideal time to have done so has already passed.
Ideal time, possibly.

But as long as they announce it and have it ready before DirectStorage ships and is actually used, it shouldn't be a problem at all.

NVIDIA may get some bar graphs against competing products out of the head start, but the feature isn't being used yet.
 
So it seems that indeed DirectStorage on PC is crippled and NVIDIA is uncrippling it, shame on Microsoft.
Except that what NVIDIA has described is literally nothing new, Vega did it all already.

If that is the case, which still seems to be a bit up in the air, then the problem is that AMD haven't announced any GPU based decompression capability, and the ideal time to have done so has already passed.
"GPU-based decompression capability" is nothing but a program running on the GPU's shaders; what is there to advertise?
 
Except that what NVIDIA has described is literally nothing new, Vega did it all already.
Doesn't Vega (HBCC) use system memory as cache, whereas RTX IO bypasses system memory?
(Slide: amd-vega-ces-2017-press-deck_Page_36.jpg)
 
Doesn't Vega (HBCC) use system memory as cache, whereas RTX IO bypasses system memory?
Pretty sure it can, but it doesn't need to, as shown by the Radeon SSGs with directly connected SSDs.
In that slide, system DRAM is just another memory it can connect to, alongside NVRAM, network storage, etc.
 