I wanted to start a thread focusing on the capabilities of, and questions surrounding, the next generation of NVMe SSD and I/O technology in the PC and console gaming space. Please feel free to post any additional information or videos that should be added to this post. And of course, mods can add or change any information in this post.
So, this thread is a work in progress.
----------------------------------------------------------------------------------------------------
PC
General SSD & I/O Architecture Overview
Nvidia RTX IO Technology
Leveraging the advanced architecture of our new GeForce RTX 30 Series graphics cards, we’ve created NVIDIA RTX IO, a suite of technologies that enable rapid GPU-based loading and game asset decompression, accelerating I/O performance by up to 100x compared to hard drives and traditional storage APIs. When used with Microsoft’s new DirectStorage for Windows API, RTX IO offloads dozens of CPU cores’ worth of work to your GeForce RTX GPU, improving frame rates, enabling near-instantaneous game loading, and opening the door to a new era of large, incredibly detailed open world games.
Object pop-in and stutter can be reduced, and high-quality textures can be streamed at incredible rates, so even if you’re speeding through a world, everything runs and looks great. In addition, with lossless compression, game download and install sizes can be reduced, allowing gamers to store more games on their SSD while also improving their performance.
“Microsoft is delighted to partner with NVIDIA to bring the benefits of next generation I/O to Windows gamers. DirectStorage for Windows will let games leverage NVIDIA’s cutting-edge RTX IO and provide game developers with a highly efficient and standard way to get the best possible performance from the GPU and I/O system. With DirectStorage, game sizes are minimized, load times reduced, and virtual worlds are free to become more expansive and detailed, with smooth & seamless streaming.” - Bryan Langley - Group Program Manager for Windows Graphics and Gaming
NVIDIA RTX IO plugs into Microsoft’s upcoming DirectStorage API, which is a next-generation storage architecture designed specifically for gaming PCs equipped with state-of-the-art NVMe SSDs, and the complex workloads that modern games require. Together, the streamlined and parallelized APIs, specifically tailored for games, allow dramatically reduced IO overhead and maximize performance/bandwidth from NVMe SSD to your RTX IO-enabled GPU.
Specifically, NVIDIA RTX IO brings GPU-based lossless decompression, allowing reads through DirectStorage to remain compressed while being delivered to the GPU for decompression. This removes the load from the CPU, moving the data from storage to the GPU in its more efficient, compressed form, and improving I/O performance by a factor of 2.
GeForce RTX GPUs are capable of decompression performance beyond the limits of even Gen4 SSDs, offloading dozens of CPU cores’ worth of work to deliver maximum overall system performance for next generation games.
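To put the "factor of 2" claim above in concrete terms, here is a rough back-of-the-envelope sketch in Python. It assumes the gain comes purely from moving compressed data over the link and that the GPU can decompress at least as fast as data arrives. The function name, the 7 GB/s drive speed and the 2:1 compression ratio are my own illustrative assumptions, not figures from Nvidia's text.

```python
def effective_read_rate(link_gb_per_s: float, compression_ratio: float) -> float:
    """Uncompressed data delivered per second when compressed data crosses the link.

    compression_ratio is uncompressed size / compressed size (2.0 means 2:1).
    Assumes decompression on the GPU is never the bottleneck.
    """
    if link_gb_per_s <= 0 or compression_ratio < 1:
        raise ValueError("need a positive link rate and a ratio >= 1")
    return link_gb_per_s * compression_ratio


# Hypothetical numbers for illustration only: a 7 GB/s drive and 2:1 compression.
print(effective_read_rate(7.0, 2.0))  # 14.0
```

In other words, with data that compresses 2:1, the same drive and bus deliver twice as much usable asset data per second, as long as the decompression step keeps up.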
Articles
TomsHardware 09/04/2020 Article
Microsoft this week said that it would bring a preview of its DirectStorage application programming interface that powers the company’s Xbox Velocity Architecture to Windows 10 developers in 2021. The API is designed to speed up game loading times and improve performance of games by eliminating storage API-related bottlenecks and reducing CPU involvement, but on a client PC it can do much more than that. Nvidia has also adopted the technology, branded Nvidia RTX IO, for its Ampere graphics cards.
Modern PC games use tens of gigabytes of storage and to load them quickly one needs an SSD that supports the NVMe protocol and boasts a high sequential read speed. To further optimize performance by ensuring that all the necessary data like textures and sounds fits into memory (both system RAM and GPU RAM), contemporary game engines break the assets into blocks and load only those that are needed for the scene being rendered. These blocks may be rather small, but they are still larger than 4 KB blocks used to rate random input/output (I/O) performance of SSDs.
According to Microsoft, the custom SSD used in the upcoming Xbox Series X console generates well over 35,000 64 KB I/O requests per second to hit its peak sequential read speed of 2.4 GB/s. The NVMe protocol and modern SSDs can handle multiple queues simultaneously (which is called queue depth) and each of them can contain many requests. But raw performance of the drive is only a part of the equation.
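As a quick sanity check on the numbers above, this small Python sketch works out how many 64 KB requests per second are needed to sustain 2.4 GB/s. It assumes GB means 10^9 bytes and KB means 1024 bytes; the article does not say which units it uses, so treat the exact result as approximate.

```python
def requests_per_second(throughput_bytes_per_s: int, request_bytes: int) -> float:
    """Number of fixed-size requests per second needed to reach a given throughput."""
    return throughput_bytes_per_s / request_bytes


THROUGHPUT = 2_400_000_000  # 2.4 GB/s, assuming decimal gigabytes
REQUEST_SIZE = 64 * 1024    # 64 KB, assuming binary kilobytes

rate = requests_per_second(THROUGHPUT, REQUEST_SIZE)
print(round(rate))  # 36621
```

That lands at roughly 36,600 requests per second, which lines up with the "well over 35,000" figure.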
Existing storage APIs require the application to manage its I/O requests sequentially: submit the request, wait for it to complete, handle its completion, move on to another request. Older games that generated hundreds of requests (as they were designed primarily with hard drives in mind) did not produce a significant overhead and therefore did not use too much CPU time. But with upcoming titles that generate tens of thousands of requests that overhead gets so substantial that it might prevent modern systems from taking full advantage of modern SSDs and/or leave no CPU horsepower for other tasks.
PCWorld 09/03/2020 Article
Nvidia’s Huang said that RTX IO offers “APIs for fast loading and streaming directly from SSD to GPU memory” and GPU lossless decompression. It’s unclear yet whether that’s a special sauce, or just Nvidia glomming onto the benefits of DirectStorage itself. Nvidia’s marketing did a killer job of tying real-time ray tracing to its RTX branding, but the technology is actually built on Microsoft’s underlying Direct Raytracing API, which is why you’ll be seeing it in the Xbox Series X and AMD’s RDNA 2-based “Big Navi” graphics cards later this year.
Microsoft’s post makes it clear that you’ll need an NVMe drive to tap into DirectStorage’s benefits, however. That’s because NVMe drives offer both extremely high bandwidth compared to traditional SATA-based storage, as well as multiple “NVMe queues” that can contain multiple IO requests, making them “a perfect match to the parallel and batched nature of modern gaming workloads”—and GPU capabilities.
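To illustrate the difference between the one-at-a-time request pattern and the batched, parallel pattern the articles describe, here is a minimal Python sketch. To be clear, this is not DirectStorage and does not use any DirectStorage API; it is only an analogy built on ordinary file reads and a thread pool. All function names, the 64 KB chunk size and the worker count are my own assumptions for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 64 * 1024  # 64 KB per request (illustrative choice)


def read_chunk(path: str, index: int) -> bytes:
    # Each call is one independent I/O request for one block of the file.
    with open(path, "rb") as f:
        f.seek(index * CHUNK)
        return f.read(CHUNK)


def load_serial(path: str, count: int) -> bytes:
    # Submit, wait, handle, repeat: only one request is ever in flight.
    return b"".join(read_chunk(path, i) for i in range(count))


def load_batched(path: str, count: int, workers: int = 8) -> bytes:
    # Hand the whole batch to the pool up front; results come back in order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(read_chunk, [path] * count, range(count)))


def make_demo_file(count: int) -> str:
    # Create a throwaway file of `count` random chunks and return its path.
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(count * CHUNK))
    return path


if __name__ == "__main__":
    demo_path = make_demo_file(32)
    try:
        # Both strategies must return identical data; only the scheduling differs.
        assert load_serial(demo_path, 32) == load_batched(demo_path, 32)
    finally:
        os.remove(demo_path)
```

Both loaders return the same bytes. The point is just that the batched version keeps several requests outstanding at once, which is the kind of workload multiple NVMe queues are built for. Any actual speedup will depend heavily on the drive, the OS cache and the file size, so I'm not claiming specific numbers here.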
DirectStorage looks like it’ll change that, when it arrives on PCs, that is. While the technology will be part of the Velocity Architecture inside the Xbox Series X this fall, Microsoft says it’s hoping to get a DirectStorage preview in the hands of PC developers sometime in 2021. If the dream of instantly loading worlds turns into a gaming reality, the wait will be worth it.
HotHardware
A demo to show the theoretical benefits of NVIDIA RTX IO, that works in conjunction with Microsoft's DirectStorage API, was also shown. During the demo, handling the level load and decompression took about 4X as long on a PCIe Gen 4 SSD using current methods and used significantly more CPU core resources. The demo was run on a 24-core Threadripper system and the standard load / decompress took over 5 seconds. With RTX IO, that time was cut to just 1.61 seconds. We won’t even talk about the hard drive’s performance here. Ouch – it hurts just to look at the chart.
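For anyone who wants to check the demo figures quoted above, here is a tiny Python sketch. The article only says the standard path took "over 5 seconds", so the 5.0 used below is a lower bound and the result is a minimum speedup, not an exact one. The function name is mine.

```python
def speedup(baseline_s: float, accelerated_s: float) -> float:
    """How many times faster the accelerated path is than the baseline."""
    return baseline_s / accelerated_s


RTX_IO_SECONDS = 1.61          # from the article
BASELINE_LOWER_BOUND_S = 5.0   # "over 5 seconds", so the true value is higher

# Minimum speedup implied by the quoted times.
print(round(speedup(BASELINE_LOWER_BOUND_S, RTX_IO_SECONDS), 2))  # 3.11

# Baseline time that an exact 4x figure would imply.
print(round(4 * RTX_IO_SECONDS, 2))  # 6.44
```

So the quoted times imply at least about 3.1x, and the "about 4X" figure would fit a standard load time of roughly 6.4 seconds.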
Reveal and Deep Dive Videos
NVMe SSD & I/O Performance Showcase Videos (pending)