DirectStorage GPU Decompression, RTX IO, Smart Access Storage

Question is, will RTX IO bring something new to the table on top of DirectStorage?
Now that they've walked this back and made it clear that any DX12 and SM6.6 capable GPU will work, I think it's fairly safe to assume 'RTX IO' was a bit of a celebratory name for something far simpler in reality, and that it won't be using any RT/tensor/DX12U features at all, as expected. It seems like it's just standard shader work, and given how much they dismissed the idea of this affecting performance, it's probably not going to be very taxing, so nothing slightly older GPUs couldn't handle just fine.
 
Now that they've walked this back and made it clear that any DX12 and SM6.6 capable GPU will work, I think it's fairly safe to assume 'RTX IO' was a bit of a celebratory name for something far simpler in reality, and that it won't be using any RT/tensor/DX12U features at all, as expected. It seems like it's just standard shader work, and given how much they dismissed the idea of this affecting performance, it's probably not going to be very taxing, so nothing slightly older GPUs couldn't handle just fine.
Why lock it to RTX cards then? We need more info.
 
Now that they've walked this back and made it clear that any DX12 and SM6.6 capable GPU will work, I think it's fairly safe to assume 'RTX IO' was a bit of a celebratory name for something far simpler in reality, and that it won't be using any RT/tensor/DX12U features at all, as expected. It seems like it's just standard shader work, and given how much they dismissed the idea of this affecting performance, it's probably not going to be very taxing, so nothing slightly older GPUs couldn't handle just fine.

Nvidia have defined RTX IO as requiring its own API that works in tandem with DirectStorage, though. So unless that's not true, it must be doing something that DirectStorage isn't.
 
Nvidia have defined RTX IO as requiring its own API that works in tandem with DirectStorage, though. So unless that's not true, it must be doing something that DirectStorage isn't.
I think that just means the sort of GPU end of the DirectStorage pipeline rather than it being some totally separate process. I'd guess AMD had to do their own work for how to handle the incoming data and how to decompress it and all too, they just didn't bother coming up with a marketable name for it.
 
Did they? As far as I know RTX IO requires an RTX card. DirectStorage will presumably work on any GPU supporting SM 6.0.
What I meant is that MS walked back the requirements for DirectStorage. There's no requirement for DX12U GPUs.

But I mean, what do you think RTX IO is, exactly? Nothing Nvidia talked about mentioned any special hardware or anything that would make it RTX-specific technology. And you know dang well they'd have tooted their own horn about any advantages RTX IO has over what other GPUs could do. Plus we also know from Microsoft's talks that the decompression is just done on the compute shaders, and there's nothing particularly special about RTX GPUs in that respect.

I still maintain the assumption that RTX IO was just a way for Nvidia to have a marketable name for technology that would work fine on somewhat older GPUs. It just seems like the most likely situation, all things considered, given that there's genuinely nothing to suggest otherwise other than the name.
 
So at first it seemed as if "Sampler Feedback" was a key component in DirectStorage, but that seems to have disappeared.
I don't see why Sampler Feedback would be connected to DS in any way?
This was an error. Sampler Feedback (sometimes Sampler Feedback Streaming), Tiled Resources, texture compression, and DirectStorage were always presented as separate parts of what they call 'Xbox Velocity Architecture'.

The GameStack Live presentation for DirectStorage includes a real-time 'Game Asset Streaming demo', which shows how these APIs work together to reduce video memory usage and implement seamless on-demand disk streaming.
 
I sincerely believe the console wars have (greatly) overstated the "neeeeeeeeed" for DirectStorage in the PC space. I get why a memory-limited system like a console would want to ensure assets can be loaded directly from high-speed storage, bypassing the need to work through ever more clever memory mapping and paging techniques to keep more and higher-quality assets on screen.

I'm sure there will be reasons why DirectStorage could be interesting on the PC, but it's going to take a lot of convincing to show me that it's any sort of game changer. Hell, it's been very well demonstrated that the move from a SATA 3 (6 Gbps) SSD to even the very fastest NVMe SSDs has a negligible impact on PC game loading times in the majority of titles. It simply isn't the thing a PC is waiting on...
 
I sincerely believe the console wars have (greatly) overstated the "neeeeeeeeed" for DirectStorage in the PC space. I get why a memory-limited system like a console would want to ensure assets can be loaded directly from high-speed storage, bypassing the need to work through ever more clever memory mapping and paging techniques to keep more and higher-quality assets on screen.

Additional memory absolutely would at least partially mitigate the need for an ultra fast IO system with a good pre-caching solution in place. I don't think it would fully mitigate the advantages of fast IO though (but would come with its own set of advantages too).

Hell, it's been very well demonstrated that the move from a SATA 3 (6 Gbps) SSD to even the very fastest NVMe SSDs has a negligible impact on PC game loading times in the majority of titles. It simply isn't the thing a PC is waiting on...

I think there are two primary reasons for this:

1. The legacy IO stack and CPU decompression are already bottlenecking IO performance at SATA SSD speeds, so making the SSD faster doesn't help anything. DirectStorage should largely remove that bottleneck, putting it back on the raw SSD speed, so you'll see a direct improvement in that regard.

2. Because of the pre-existing IO bottlenecks, games aren't designed to take advantage of a super fast IO system. For example, the world setup might be single-threaded because even then it's faster than the data load. So moving to a super fast IO solution may not result in the expected speedup. We've seen this on the next-gen consoles and we'll see it on PC too.

But make no mistake, once games start to be authored to take advantage of super fast IO, PCs without DirectStorage (which by default includes all SATA SSD based PCs) will be left in the dust. It is, however, IMO unlikely to have much impact on those PCs' ability to play the vast majority of games outside of initial load and fast travel times.
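To make the bottleneck argument above concrete, here is a rough back-of-the-envelope sketch. It assumes reading and decompression run as a pipeline, so the slower stage sets the pace. Every number in it (data size, CPU and GPU decompression rates) is a made-up placeholder for illustration, not a measurement of any real game or of DirectStorage itself.

```python
def load_time_s(data_mb, drive_mb_s, decompress_mb_s):
    """Pipelined load: the slower of the two stages sets the overall pace."""
    return data_mb / min(drive_mb_s, decompress_mb_s)


DATA_MB = 4000         # hypothetical compressed data for one level
CPU_DECOMPRESS = 400   # hypothetical MB/s with decompression on a busy CPU
GPU_DECOMPRESS = 6000  # hypothetical MB/s with decompression moved to the GPU

for drive_mb_s in (500, 5000):  # roughly SATA-class vs NVMe-class bandwidth
    legacy = load_time_s(DATA_MB, drive_mb_s, CPU_DECOMPRESS)
    offloaded = load_time_s(DATA_MB, drive_mb_s, GPU_DECOMPRESS)
    print(f"{drive_mb_s} MB/s drive: legacy {legacy:.1f} s, offloaded {offloaded:.1f} s")
```

Under these assumed numbers the legacy path takes the same time on both drives because decompression is the slow stage, while the offloaded path scales with the drive. Whether real games behave this way is exactly what is being argued in this thread.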
 
1. The legacy IO stack and CPU decompression are already bottlenecking IO performance at SATA SSD speeds, so making the SSD faster doesn't help anything. DirectStorage should largely remove that bottleneck
I'm pretty sure it isn't "bottlenecking IO performance at SATA SSD speeds." There is empirical proof of NVMe drives getting massive increases in both bandwidth and IOPS on modern Windows operating systems. Converting those massive increases into application performance is what we're really talking about here, and (as an example) SQL 2019 can chew straight through an entire NVMeOF frame without much issue. A million IOPS or even more? Yup, absolutely happens when a random business user crushes the cluster with a shitty ad hoc query they built to scan a non-indexed column against a four trillion row table. :(

DirectStorage is certainly about reducing I/O overhead on the CPU; however, you must also consider the context: a power-limited and already heavily subscribed CPU built into consoles. Issuing 50,000 IOPS on any system is going to be power intensive, which is more meaningful in a console where the power budget is very specific and not what a desktop would deal with. Yes, reducing the CPU cycles necessary to service those IOPS will be beneficial; it will be most beneficial to power-limited systems like consoles.

One of the HP DL560s can eat up 2.5 kW and plow straight through whatever I/O bottleneck might be alluded to. My desktop rig at home can crank up about a kilowatt of draw (enough to piss off the 1.3 kVA UPS it's plugged into) and chew straight through it too. My home rig doesn't have an NVMeOF frame to abuse, but it can certainly beat down the NVMe drive I've attached to a PCIe 3.0 x4 expansion card.

So yeah, it'll matter to the consoles for sure, and probably to laptop gamers. For someone who isn't on a tiny and strict power budget, it's going to matter less and less. Windows isn't inherently limited to SATA SSD speeds; that's pure hyperbole.
 
If DirectStorage were very much needed and Win11 has restrictive requirements, then it should be ported back to Win10. What reason would any PC app or game developer have to learn it, support it, never mind build a game to really leverage it, if it cuts their prospective market with no upside? Xbox's use of DirectStorage only impacts PC adoption of it to the degree that AAA multi-platform titles have influence on PC gaming. Look at the most played PC games on Steam or Twitch -- it's not like all the big games are console ports.
Almost all the big names are running on Unreal and Unity, which, by nature of needing to support Xbox, will have support for DirectStorage out of the box. Only really large developers use their own engines anymore, and, well, most of those are going to be on Xbox too.
I sincerely believe the console wars have (greatly) overstated the "neeeeeeeeed" for DirectStorage in the PC space. I get why a memory-limited system like a console would want to ensure assets can be loaded directly from high-speed storage, bypassing the need to work through ever more clever memory mapping and paging techniques to keep more and higher-quality assets on screen.

I'm sure there will be reasons why DirectStorage could be interesting on the PC, but it's going to take a lot of convincing to show me that it's any sort of game changer. Hell, it's been very well demonstrated that the move from a SATA 3 (6 Gbps) SSD to even the very fastest NVMe SSDs has a negligible impact on PC game loading times in the majority of titles. It simply isn't the thing a PC is waiting on...

The entire reason why faster SSDs show no performance impact on PC game loading is the IO stack, which is what DirectStorage aims to address. It will absolutely result in much faster load times on PC. Keep in mind the issue for PC gaming is not access to the SSD by the CPU, or the time it takes to put data into system RAM, but the speed at which data can be transferred to the GPU's dedicated RAM. It's not the same issue as your example. DirectStorage is about being able to feed the GPU directly instead of having to navigate through system RAM with all the overhead related to that. Consoles have a unified RAM pool and had lower IO overhead to begin with, so it's the other parts of the Xbox Velocity Architecture that matter the most there. DirectStorage being done the way it was done was more about making sure the APIs were handled the same across Xbox and PC. The changes to the PC IO stack are solving a real problem.
 
The entire reason why faster SSDs show no performance impact on PC game loading is because of the IO stack
Based on what verifiable evidence do you make this claim?
which is what DirectStorage aims to address.
DirectStorage is indeed about reducing CPU time on a per-IO request. I addressed this very specifically in my prior reply.

Keep in mind the issue for PC gaming is not access to the SSD by the CPU, or the time it takes to put it into the system ram, but the speed data can be transferred to the GPU's dedicated ram. It's not the same issue as your example.
The IO enhancements being made by Microsoft are batching and threading of bulk-rate disk requests, not changing the path of data from disk to GPU memory.

I'd like to see verifiable evidence of "stunted" transfer rate from system memory to GPU memory as well, because I'm not aware of any data which shows this as a bottleneck. Please educate me by showing me some data to support this claim.
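As a rough illustration of why batching disk requests reduces CPU time per request, here is a toy cost model. It assumes every submission to the IO stack carries a fixed CPU overhead regardless of how many requests it contains. The 10 microsecond overhead and the batch size of 64 are hypothetical placeholders, not measured Windows or DirectStorage figures; the 50,000 requests per second figure is the one used earlier in this thread.

```python
import math


def submit_overhead_us(requests, batch_size, per_submit_us):
    """Fixed CPU overhead (microseconds) spent submitting `requests`
    when they are grouped into batches of `batch_size`."""
    submissions = math.ceil(requests / batch_size)
    return submissions * per_submit_us


REQUESTS_PER_SECOND = 50_000  # figure mentioned earlier in the thread
PER_SUBMIT_US = 10            # hypothetical fixed cost per submission

for batch_size in (1, 64):    # 64 is an arbitrary illustrative batch size
    overhead = submit_overhead_us(REQUESTS_PER_SECOND, batch_size, PER_SUBMIT_US)
    print(f"batch size {batch_size}: {overhead / 1000:.2f} ms of CPU per second")
```

The model only covers the submission overhead. It says nothing about where the data travels afterwards, which is the separate point in dispute above.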
 
Based on what verifiable evidence do you make this claim?

DirectStorage is indeed about reducing CPU time on a per-IO request. I addressed this very specifically in my prior reply.


The IO enhancements being made by Microsoft are batching and threading of bulk-rate disk requests, not changing the path of data from disk to GPU memory.

I'd like to see verifiable evidence of "stunted" transfer rate from system memory to GPU memory as well, because I'm not aware of any data which shows this as a bottleneck. Please educate me by showing me some data to support this claim.
DirectStorage is necessary to enable things like this: RTX IO: GPU Accelerated Storage Technology | NVIDIA

Though it isn't like Nvidia doesn't have a history of lying about stuff, so shrug.

APIs in a vacuum don't really do anything. If the RTX IO numbers are even in the ballpark of realistic, they'll matter a lot. But this sort of thing is why I said hardware support in the GPU was going to matter and that "slow" adoption wasn't really relevant. It's implementations of features like that that'll result in actual changes.
 
I'm pretty sure it isn't "bottlenecking IO performance at SATA SSD speeds." There is empirical proof of NVMe drives getting massive increases in both bandwidth and IOPS rate on modern Windows operating system

Why else would we see virtually no improvement in load times when going from a 500 MB/s drive to a 5000 MB/s drive?

I'm sure some workloads can take advantage of that speed, but the many small, often compressed IO requests of a typical gaming workload are a worst-case scenario for that old IO stack and CPU decompression.

The good news is that when freed of that bottleneck then any PC with an NVMe drive should absolutely fly from an IO perspective.
 
Why else would we see virtually no improvement in load times when going from a 500 MB/s drive to a 5000 MB/s drive?
Because, tada, I/O rate isn't a limiter on the PC platform. How can we know? Because I/O wait and service times are basically zero on the NVMe drive. There are tools directly built into Windows to track this data, and it's been measured, and it isn't a problem.

Also, you, like many other people, are too focused on megabytes per second. The "responsiveness" of I/O isn't measured in bandwidth; it's measured in IOPS and service times. NVMe crushes the IOPS rate and service times of a legacy SATA SSD by more than the paltry 10x increase in bandwidth lets on. Just as with SATA SSDs over their spinning counterparts, the simplistic bandwidth increase was never why those drives were so much faster than their predecessors.
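The relationship between bandwidth, IOPS, and service time can be sketched with some simple arithmetic. This assumes fixed-size requests and a steady queue depth (Little's law), and the request sizes, queue depth, and service times below are illustrative placeholders, not benchmarks of any drive.

```python
def throughput_mb_s(iops, request_kb):
    """Bandwidth implied by a given IOPS rate at a fixed request size."""
    return iops * request_kb / 1024


def iops_needed(target_mb_s, request_kb):
    """IOPS required to sustain a target bandwidth at a fixed request size."""
    return target_mb_s * 1024 / request_kb


def iops_from_service_time(queue_depth, service_time_us):
    """Little's law: completed requests per second for a steady queue depth."""
    return 1_000_000 * queue_depth / service_time_us


# The same 500 MB/s needs very different IOPS depending on request size.
for request_kb in (4, 64):
    print(f"{request_kb} KB requests: {iops_needed(500, request_kb):,.0f} IOPS for 500 MB/s")

# At a fixed queue depth, cutting service time raises IOPS proportionally.
for service_time_us in (100, 20):  # hypothetical service times
    print(f"{service_time_us} us at QD1: {iops_from_service_time(1, service_time_us):,.0f} IOPS")
```

With small requests, a headline MB/s figure translates into a very large IOPS requirement, which is why service time rather than raw bandwidth tends to decide how responsive small random I/O feels.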

I'm sure some workloads can take advantage of that speed, but the many small, often compressed IO requests of a typical gaming workload are a worst-case scenario for that old IO stack and CPU decompression.
The smaller and more random the workloads, the bigger NVMe's advantage over a traditional SATA interface. Enterprise-grade relational database software will absolutely crush the disk I/O requirements of something stupid like a video game, and Windows performs those workloads with aplomb.

I do enterprise storage and servers for a living, and have for more than two decades. The I/O stack in Windows isn't perfect, just like it isn't perfect in Linux. It also isn't the bottleneck you and others seem to assume it is.
 