Blazing Fast NVMEs and Direct Storage API for PCs *spawn*

Found this little bit of info over at TPU from the 6800XT review:

https://www.techpowerup.com/review/amd-radeon-rx-6800-xt/3.html

TechPowerUp said:
AMD is also introducing support for the DirectStorage API, which can accelerate game-loading times by giving the GPU direct access to game resource data from an NVMe SSD in its native compressed format, and performing the decompression on the GPU, leveraging compute shaders. Since the GPU isn't performing much 3D rendering during level-loading scenes, you won't feel its impact on frame rates, but loading times will be cut down.

This sounds like exactly what RTX IO is doing. More evidence to suggest RTX IO is just an Nvidia branding of DirectStorage.

Excellent news for the PC platform as a whole if true, as it means uptake of this feature will be much faster and more widespread.
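For a rough sense of why moving decompression to the GPU cuts loading times, here's a back-of-envelope sketch; all throughput numbers and function names are illustrative assumptions, not measured figures:

```python
# Back-of-envelope comparison of the two loading paths the TPU quote
# describes. All numbers are illustrative assumptions, not benchmarks.

def load_time_cpu_path(compressed_gb, ssd_gbps=7.0, cpu_decomp_gbps=1.5):
    """Traditional path: read compressed data, decompress on CPU cores,
    then upload to VRAM. CPU decompression is the bottleneck."""
    read = compressed_gb / ssd_gbps
    decompress = compressed_gb / cpu_decomp_gbps
    return read + decompress  # upload time ignored for simplicity

def load_time_gpu_path(compressed_gb, ssd_gbps=7.0, gpu_decomp_gbps=14.0):
    """DirectStorage-style path: compressed data goes to the GPU, which
    decompresses it with compute shaders at much higher throughput."""
    read = compressed_gb / ssd_gbps
    decompress = compressed_gb / gpu_decomp_gbps
    return read + decompress

if __name__ == "__main__":
    gb = 8.0  # hypothetical level size, compressed
    print(f"CPU path: {load_time_cpu_path(gb):.1f} s")
    print(f"GPU path: {load_time_gpu_path(gb):.1f} s")
```

In a real pipeline the read and decompression stages overlap, so summing them is conservative; the relative gap is the point.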
 
It's what I and many others have been saying: AMD is not going to sit back and let Nvidia take the blazing-fast SSD advantage in ALL games going forward while employing it on consoles.

Going to be interesting. I think raw speeds are already 7 GB/s, with Nvidia claiming 14 GB/s and higher after decompression. AMD is most likely doing exactly the same there with their hardware. Also, on PC, I assume we are not going to see reduced install sizes, which means less to decompress to begin with, as I assume PC gamers are not sitting with only 600 GB of internal storage.
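As a sanity check on those numbers, the claimed effective rates imply roughly a 2:1 average compression ratio (the figures below are the publicly quoted ones; the arithmetic is just illustration):

```python
def implied_ratio(effective_gbps, raw_gbps):
    """Average compression ratio implied by a post-decompression
    bandwidth claim over the raw SSD read speed."""
    return effective_gbps / raw_gbps

# Nvidia's RTX IO claim: 14 GB/s effective over a 7 GB/s PCIe 4.0 SSD
print(implied_ratio(14.0, 7.0))           # 2.0
# PS5's quoted figures: ~8-9 GB/s typical over 5.5 GB/s raw
print(round(implied_ratio(9.0, 5.5), 2))  # 1.64
```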

But it will impact GPU performance during streaming.

Nvidia suggested this would be extremely minimal, and I have no doubt AMD's hardware will achieve close to that; they have experience in this area from before (HBCC, etc.).
 
True, but the amount of data being streamed into VRAM while actually gaming, as opposed to during a load or transition screen (which includes fast travel), will be minuscule and thus likely unnoticeable. Naturally, at load time you might want to bring over 16 GB of data into VRAM as fast as possible. But doing anything close to that on a constant basis during gameplay won't happen, because game install sizes don't support it.
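The gap between a load-screen burst and steady in-game streaming can be put in numbers; the 0.5 GB/s gameplay figure below is a deliberately generous assumption, not a measurement:

```python
PEAK_GBPS = 14.0  # assumed effective rate after GPU decompression
VRAM_GB = 16.0    # hypothetical amount to fill at a load screen

# Load screen: refill all of VRAM at the peak effective rate
print(f"full VRAM fill: {VRAM_GB / PEAK_GBPS:.1f} s")    # 1.1 s

# Gameplay: even a generous 0.5 GB/s trickle is a tiny share of peak
print(f"gameplay share of peak: {0.5 / PEAK_GBPS:.1%}")  # 3.6%
```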
 
Like I said the second they introduced RTX IO: it's nothing but their marketing name for DirectStorage support.
And that's why I've also tried to correct people on these forums (too) who use it as if it were some fancy NVIDIA Secret Sauce.
And AMD has had the capabilities since at least Vega, regardless of whether it will ever get official DirectStorage support or not.
 
Yes, and we all can be glad you are correct.
 
Yes, but there's a difference between making assumptions with no evidence because you want them to be true, and coming to grounded conclusions based on the available evidence. Only now is the evidence starting to build that RTX IO is just a standard implementation of DirectStorage, and as I've said all along, that is the best-case scenario.

Also, I'm not aware that Vega has demonstrated this capability before. The Radeon Pro SSG may have an SSD connected directly to the GPU, but I've seen no claims of real-time compute-shader-based decompression from it. It's also not like DirectStorage anyway, since they had to physically connect an SSD to the GPU to make it work. DirectStorage does not.
 
If by evidence you mean "someone spelling it out somewhere other than forums", then yes. Other than that, it was crystal clear from the beginning: RTX IO was described as doing exactly what DirectStorage is meant for, as per all the MS talks about DirectStorage and Project Velocity.

Real-time compute-shader-based decompression is nothing new; it's not some "built-in capability in the shader core", it's just a program running on the shaders. You don't advertise such things when they're not highly relevant to your use case; it became relevant because of DirectStorage.
The physical connection is just a medium. HBCC doesn't require directly connected SSDs (it even supports network storage), but it was a hassle-free solution for the SSG models, without the need for support from the OS, CPU, chipset, and whatnot.
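To illustrate the "just a program running on shaders" point, here's a toy run-length decoder in plain Python; real GPU decompression formats are far more elaborate, but the principle — decompression as ordinary data-parallel ALU work — is the same:

```python
def rle_decode(pairs):
    """Toy run-length decoding: (count, byte_value) pairs -> bytes.
    Each run is independent of the others, which is exactly the kind
    of work that maps well onto compute shaders."""
    out = bytearray()
    for count, value in pairs:
        out.extend([value] * count)
    return bytes(out)

print(rle_decode([(3, 0xAA), (2, 0x00)]))  # b'\xaa\xaa\xaa\x00\x00'
```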
 
There is absolutely no detail from Microsoft, Nvidia, or anyone else of equal reliability from the September/October timeframe that makes it "crystal clear", or even vaguely suggests, that the basic implementation of DirectStorage is identical to RTX IO. You made that assumption, and kudos to you for what is starting to appear to be a correct assumption; I'm glad you are likely correct. But please don't make out that this was a known fact, because it really wasn't. In fact it still isn't, but the signs are now looking positive, for which I'm very happy. I will of course acknowledge my mistake should you provide the evidence you're referring to.

You're taking disconnected technologies and loosely tying them together to say "look, Vega could already do DirectStorage", whereas the reality is we don't yet know the system requirements for DirectStorage. For the platform's sake, I hope Vega (and Pascal) will be able to support it, but so far all we know for sure is that Ampere, Turing, and RDNA2 support it. There may be some specific DMA requirements, or perhaps some DX12U-based hardware requirement for the particular compression formats that DirectStorage uses. Or who knows what else. Let's not jump to conclusions before we have all the information.
 
As per TechPowerUp's RX 6000 series reviews:

AMD is also introducing support for the DirectStorage API, which can accelerate game-loading times by giving the GPU direct access to game resource data from an NVMe SSD in its native compressed format, and performing the decompression on the GPU, leveraging compute shaders. Since the GPU isn't performing much 3D rendering during level-loading scenes, you won't feel its impact on frame rates, but loading times will be cut down.

They say the GPU isn't performing much rendering during level-loading scenes so you won't feel its impact on frame rates, which is as true as it is right now with CPU decompression. But next-generation games will be streaming in far more data than ever before... so the real question is how this will affect frame rates during gameplay, when assets are being streamed in on the fly.

I believe Nvidia said there was a "negligible" performance impact, and in fact that overall frame rates could possibly improve due to the CPU being freed up to perform other tasks instead.

Can Tensor cores perform these functions, or does it have to be shader based?
 
but next generation games will be streaming in far more data than ever before...

Will they though? Game install sizes don't look like they're going to go up that much and these consoles have twice as much vram as last generation.

Look at Miles Morales. 40GB install. Let's say 80GB after decompression. You can get 20% of the entire game in vram at any given point. I don't see why under those circumstances you'd need to be streaming much from SSD at all, merely enough to keep the vram topped up as you traverse the map.

If you were streaming at the PS5's maximum speed you'd have used your entire game content in less than 10 seconds. I expect actual in-game streaming speeds to be far, far below the peak burst speed used at loading screens.
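The Miles Morales arithmetic above can be checked directly (the 2:1 compression ratio and the 9 GB/s typical PS5 effective rate are assumptions):

```python
install_gb = 40.0       # Miles Morales install size
decompressed_gb = 80.0  # assuming ~2:1 compression
memory_gb = 16.0        # console unified memory (not all game-usable)

print(f"{memory_gb / decompressed_gb:.0%} of the game fits in memory")  # 20%
print(f"entire game streamed in {decompressed_gb / 9.0:.1f} s")         # 8.9 s
```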
 
That, and PC games might actually not be compressed as much as console games, given the cost of SSD space.
 
I don't think Miles Morales is a great example of a next-generation game, to be honest... but I understand your point. There's only so much content that developers can create, and they have to keep game sizes in check... but there's a reason why Cerny went with the SSD that he did. Being able to load smaller chunks of game worlds and stream data in at a higher rate allows that VRAM to be used for content that's actually within view/reach of the player, rather than having the next few rooms loaded into memory. So visual fidelity goes up, despite RAM capacity being the same. Yes, there's only so much content, but developers are crafty at reusing and repurposing content over and over again in different ways. Being able to get to it, get rid of it, and get to it again will be important, I think.

AAA games could easily be pushing 100-150 GB next gen. Game sizes could double or triple, yet system RAM has only doubled. I guess we'll just have to see how devs go about designing their games, given this new freedom.


On a side note... I wonder about next-gen games like MS Flight Sim 2020 utilizing the net: games that stream in data from the internet to an allocated part of the SSD, where that data gets loaded into memory super fast when the game needs it, then gets overwritten by new content as the game requires it. That could allow them to have crazy high-detail assets streamed in from the net on the fly, while only requiring a modest chunk of SSD space as a cache.

I know that's not likely to happen... but I think something like that could be really cool. I wish some developer would just go absolutely crazy, just to show what is possible. :D
 
I don't think Miles Morales is a great example of a next generation game to be honest... but I understand your point.

Yeah, I just used it as a convenient current-gen example, but I don't think the basic point changes if game size is scaled up. Let's say we quadruple the install size to 160GB, which then doubles to 320GB after decompression. You could still stream the entire content of your game into VRAM in just over 30 seconds on the PS5.

I agree with the point about re-using assets but I think for say a 10 hour game you're going to be re-using assets a hell of a lot if you will have seen everything the game has to offer in 30 seconds! So I suspect real streaming speeds will be much lower.

The main reason I think Sony went for such a high speed solution is to eliminate or at least vastly reduce load time, including transition loads like fast travel. There may be some corner cases where you want to suddenly grab GB of data from the SSD without a transition screen that you weren't able to predict and preload into memory, but I suspect those situations will be rare enough for it to not be a general concern for the GPU impact of DirectStorage based streaming decompression.

I think this is a good example of what I'm talking about above. Flight Sim is certainly doing a lot of streaming and it's clearly working well. But how fast will it be streaming that data? A typical internet connection is probably no more than 20MB/s and even very high end connections are less than 0.5GB/s. So nothing that will stress even a CPU for decompression, nevermind a GPU.
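Putting numbers on the Flight Sim case (the connection speeds are the figures assumed above; the 5 GB/s CPU decompression budget is a rough multi-core assumption):

```python
def decompressor_load(stream_gbps, decomp_gbps):
    """Fraction of a decompressor's throughput a stream consumes."""
    return stream_gbps / decomp_gbps

typical_net = 0.02  # ~20 MB/s internet connection, in GB/s
fast_net = 0.5      # very high-end connection, in GB/s
cpu_budget = 5.0    # assumed multi-core CPU decompression rate, GB/s

print(f"typical: {decompressor_load(typical_net, cpu_budget):.1%}")   # 0.4%
print(f"high end: {decompressor_load(fast_net, cpu_budget):.1%}")     # 10.0%
```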
 
Bluepoint Games was talking about sometimes needing to load 3 to 4 GB of compressed data, but they don't need to preload. Basically, when the player arrives near the place where the data is needed, they load it.

The problem with the Cerny example, or with creating a game with the same asset quality as the UE5 demo, is the size on the SSD. You'd need to be able to have a 10 TB game for this to work.
 
I don't know the context, but I'm curious as to why they can't predict and pre-load that data as the player approaches the area, unless it's a fast-travel-type scenario where the player can jump anywhere in the map/game content with no way to predict it. In any case, I expect these to be rare instances rather than the norm for streaming.
 
They speak about turning a corner; they probably "preload" the data one or two seconds before. They don't need to continuously load data, and it's a linear game, not an open world. They use it like this to be able to keep more RAM for the scenery on display; the asset quality and density in the game is high.

On PS4 they would have had to load data far in advance and lower the quality and density of assets considerably.
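The "one or two seconds before" timing fits the raw numbers (the 3-4 GB figure is from the post above; 5.5 GB/s is the PS5's quoted raw read speed):

```python
worst_case_gb = 4.0  # Bluepoint's larger just-in-time loads, compressed
raw_gbps = 5.5       # PS5 raw SSD read speed

# Reading the whole chunk at raw speed comfortably fits the window
print(f"worst-case load: {worst_case_gb / raw_gbps:.2f} s")  # 0.73 s
```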
 
The problem with the Cerny example, or with creating a game with the same asset quality as the UE5 demo, is the size on the SSD. You'd need to be able to have a 10 TB game for this to work.

UE5 doesn't really fit with Cerny's example. A 1M-poly Nanite object is around the size of a 4K texture.* The space taken up in the game package by geometry isn't going to be TBs, even though the raw assets might be.

On the streaming side it requires low latency, but memory allocation for Nanite is around 750MB in the flying scene of the demo. Not sure at what rate new level of detail is streamed in though. The scene has 100,000s of objects.

* That is rather ambiguous depending on compression, but we're still talking single figure MB.
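For scale on that footnote, here's the GPU-resident size of a 4096×4096 texture at common block-compression rates (the per-texel rates and ~1/3 mip overhead are standard; which format a given game uses is an assumption, and on-disk package compression can shrink these further):

```python
def texture_size_mb(side, bytes_per_texel, with_mips=True):
    """Approximate GPU texture size; a full mip chain adds ~1/3."""
    size = side * side * bytes_per_texel
    if with_mips:
        size *= 4.0 / 3.0
    return size / (1024 ** 2)

print(round(texture_size_mb(4096, 0.5), 1))  # BC1 (0.5 B/texel): 10.7
print(round(texture_size_mb(4096, 1.0), 1))  # BC7 (1 B/texel): 21.3
```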
 