Next-Generation NVMe SSD and I/O Technology [PC, PS5, XBSX|S]

Discussion in 'Console Technology' started by Shortbread, Sep 18, 2020.

  1. PSman1700

    PSman1700 Legend

    Again, back up your bold claims with evidence. Just calling them half-arsed and lazy attempts isn't much of a discussion.

    @davis.anthony Let's take it the way Shifty implies: maybe they are lying, maybe they are not. The solution should be 'comparable to the PS5 IO'; I'll take that at face value, and if it weren't true, they were lying.
     
    RootKit likes this.
  2. pjbliverpool

    pjbliverpool B3D Scallywag Legend

    For GPU performance we've already been given information by Nvidia in that the performance hit from the GPU based decompression is barely measurable. So no issues there.

    For the CPU I agree, we need to wait for benchmarks, but common sense should tell you that when you're removing over 80% of the decompression workload and significantly reducing the IO management workload, the CPU is going to be significantly freed up.

    Why did dedicated PhysX processors disappear? Why did we still have standalone sound cards long after they were useful? Who knows what the business drivers are, or even if it will ever actually materialise. My point isn't that there is zero benefit from a fully hardware-based solution, but rather that my expectation is that the real-world benefits will not be worth the increased cost and complexity.

    I didn't say it was dead, I said it was contrary to the general direction of the industry.

    This isn't remotely comparable. Any Shader Model 6 GPU will run the GPU decompression of Direct Storage. That's pretty much every modern GPU in every PC on the market today.

    The hardware-based CPU solution would be starting from zero, expecting people to buy a new CPU that is more expensive than it would be without the hardware unit, to carry out a function their existing GPU already handles perfectly well. I expect little appetite for that from the market.

    No, it doesn't. The PS5 CPU still has to handle all the other activities that need to be done when loading a game. It is these "CPU bottlenecks" along with the decompression remaining on the CPU that the GDC talk is most likely referring to.

    I don't think that needs to be assumed. While it likely was derived from a diagram where the block represented a separate switch, as others have pointed out, it could easily be used here to represent the data flowing through the PCIe root complex in the CPU but without any CPU intervention.

    It's not necessarily needed. If RTX IO were somehow enabling P2P DMA through the root complex, the NVMe drive could use its full bandwidth to transfer data directly to the GPU, to main memory, or to both in any combination required.
     
    RootKit and PSman1700 like this.
  3. Remij

    Remij Regular

    Maybe Nvidia is privy to some knowledge that we aren't?

    Anyway... no sense in fighting over a diagram which, in my mind, was pretty clearly just a repurposing of their GPUDirect diagrams to illustrate a basic idea of how data could be routed in the future using RTX IO. RTX IO could allow for anything... however, in its current form it's entirely dependent on DirectStorage, and we know precisely how DirectStorage is going to work in the short term, so as it stands I say just take it as a general diagram of the optimal flow of data that RTX IO could allow.
     
    dobwal and davis.anthony like this.
  4. Davros

    Davros Legend

    Except those who store their Steam library on a NAS.
     
  5. Silent_Buddha

    Silent_Buddha Legend

    I'm waiting until I can move to 10 or 25 Gbit networking (with an NVMe-based NAS) in my home before doing that for most of my games. That said, I do have smaller, less load-intensive games on a NAS.

    Although I guess you might just be storing them there, while I'm playing those less load-intensive games directly off the NAS, with the intention of eventually playing most games from the NAS so that it basically operates as my game drive.

    Regards,
    SB
     
    dobwal, PSman1700 and BRiT like this.
  6. Albuquerque

    Albuquerque Red-headed step child Veteran

    As someone who does high-performance storage as a component of their job, I still see an outstanding problem that's hinted at in this diagram:
    [image]

    Several of you in this thread have been picking on the storage being on the "other side" of the NIC in this diagram. By keeping the storage on the "other side" of a network link, we avoid all conversations about how a filesystem abstraction has to be handled. This diagram really, truly has no bearing on how a consumer PC is constructed and the complexity therein.

    Think about this: every file in a modern consumer-facing file system is a collection of hundreds, thousands, even millions of individual data blocks, all mapped together by some sort of file system bitmap, journal index, or similar. Basically, there's a master table somewhere in the filesystem's inner workings which translates the operating system's request for a file into the related zillions of pointers linking to the literal blocks inside a logical partition scheme. That single file may very well live on multiple partitions simultaneously (spanned volumes in Windows are a thing that does occur), and those partitions may span multiple underlying storage devices. These partitions then map downwards again into the physical storage layer, which sometimes has its own set of pointer remappings into even lower-level storage devices (e.g. a RAID card).
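    To make that translation concrete, here's a toy sketch of the lookup chain just described: a file resolved through a "master table" into partitions, then into physical device blocks. All names and numbers here are made up for illustration; no real filesystem works from dictionaries like this.

```python
# Toy model of file -> blocks translation. Everything here is
# illustrative: FILE_TABLE plays the role of the filesystem's master
# table, PARTITIONS the partition-to-device mapping.

# Master table: file path -> list of (partition, logical block) pairs.
FILE_TABLE = {
    "game/assets.pak": [("part0", 17), ("part0", 18), ("part1", 3)],
}

# Partition table: partition -> (device, starting LBA on that device).
PARTITIONS = {
    "part0": ("nvme0", 2048),
    "part1": ("nvme1", 0),
}

def resolve(path):
    """Translate a file path into (device, physical LBA) pairs."""
    blocks = []
    for part, logical in FILE_TABLE[path]:
        device, base = PARTITIONS[part]
        blocks.append((device, base + logical))
    return blocks

# Note the single file spans two partitions on two physical devices.
print(resolve("game/assets.pak"))
```

    Any "direct to GPU" path has to perform this entire resolution (plus permissions, logging, and possibly decryption) before it knows which physical blocks to DMA.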

    We also must consider the access and auditing controls built into modern filesystems. Are you permitted to read this file? If you do read it, doesn't the last-access time need to be updated in the file system metadata? Does the file access itself need to be logged for security reasons? Someone brought up disk encryption, and TPM was hand-waved in as the solution; however, TPM solutions presume whole-disk encryption. As it turns out, file-system-level encryption is very much a thing, and isn't linked to TPM-based whole-disk encryption.

    All this to say: a native PCIe transaction from disk to memory only works if you can fully map every single one of those pointers from the parent file descriptor at the file system level into the literal discrete blocks of physical storage attached to the PCIe bus, and only after determining you're allowed to make that read, possibly while still having to update the file system metadata and logs, and assuming a file-system encryption scheme (read: Microsoft EFS) isn't being used.

    Flowing storage through a NIC as a data stream (presumably NVMe-oF) completely removes all of this complexity from the local host.

    Transferring raw block storage into memory pages is actually quite simple. Making a GPU call translate into a full-stack file system read access is not the same at all.
     
  7. dobwal

    dobwal Legend

    I imagine that reads/writes through the file system are handled by DirectStorage and may not be a responsibility of RTX IO. However, GPUDirect makes access through the file system possible by enabling a distributed file system that runs in parallel to the OS-managed one. With GPUDirect, the CPU writes commands to the DMA engines on the storage device to drive data to and from the GPU. Nvidia states this minimizes interference with the other commands the CPU sends to the GPU.

    https://on-demand.gputechconf.com/s...-to-gpu-memory-alleviating-io-bottlenecks.pdf

    However, unless local game apps moving data through the cloud for rendering graphics actually becomes a thing, this isn’t all that relevant for consoles. But I can see a similar solution where apps that don’t need DS use the traditional file system and games using a DS system to create a low latency and more direct path from the SSD to the gpu.
     
    function, see colon, iroboto and 2 others like this.
  8. DSoup

    DSoup Series Soup Legend Subscriber

    I agree. Allowing any part of the graphics subsystem (GPU hardware, on-board controller, driver) to work around the filesystem's security/permissions model is unthinkable.
     
  9. nutball

    nutball Veteran Subscriber

    Yes I'm glad you've raised this. I've been thinking about the issue of peer-to-peer NVMe -> GPU transfers for a work project. My current conclusion is that the simplest solution is to treat the NVMe drives as raw block devices and write my own very simple file system to keep track of what is where. Having a GPU understand XFS, ext4 or ZFS seems like a bit of a stretch.
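    A minimal sketch of that "raw block device plus my own very simple file system" idea might look like the following. This is purely illustrative (no real NVMe access, made-up class and names): the drive is treated as a flat array of fixed-size blocks, with a tiny allocation table tracking what is where.

```python
# Sketch of a trivially simple "file system" over a raw block device:
# a free list of block indices plus a flat name -> blocks table.
# Illustrative only; a real implementation would persist this table
# to the device and issue actual block I/O.

BLOCK_SIZE = 4096

class RawBlockFS:
    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))   # free block indices
        self.table = {}                       # name -> list of block indices

    def write(self, name, data):
        nblocks = -(-len(data) // BLOCK_SIZE)  # ceiling division
        blocks = [self.free.pop(0) for _ in range(nblocks)]
        self.table[name] = blocks
        # These raw block indices are exactly what a P2P DMA engine
        # would need as targets -- no filesystem walk required.
        return blocks

fs = RawBlockFS(num_blocks=1024)
print(fs.write("texture.bin", b"x" * 10000))  # 10000 bytes -> 3 blocks
```

    The point is that the GPU (or DMA engine) only ever needs to understand this flat table, not XFS/ext4/ZFS internals.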

    I guess this sort of solution is also possible in a console, but on a general purpose computer it sounds ... complicated (and a potential security nightmare).
     
    Albuquerque likes this.
  10. Albuquerque

    Albuquerque Red-headed step child Veteran

    What this means, said another way, is GPUDirect would require its own reserved storage space and methods of access, unrelated to the "user" file system. For a machine that already exists and didn't already have this reservation built in, this means a lot of funky partition management business will need to happen to shrink the existing partition(s) to then create a new and physically contiguous partition for use by this GPUDirect functionality.

    I really can't see how Microsoft would push such a design into the commodity PC world. There's something else in this mix we aren't seeing yet...
     
    davis.anthony and nutball like this.
  11. DSoup

    DSoup Series Soup Legend Subscriber

    If there is a Windows/NTFS mechanism for this, I've never used or heard of it. That may be why both Nvidia and AMD have experimented with attaching an SSD directly to the GPU to augment onboard memory.
     
  12. Albuquerque

    Albuquerque Red-headed step child Veteran

    It wouldn't be NTFS, that's at least part of the point. "Regular" file systems don't worry about file packing methods on the physical media; whatever this new tech is probably needs to consider it differently.

    Here's another characteristic of storage worth noting: modern physical storage devices are built with 4 KiB blocks, and have been for a decade or more. One of the reasons for this move was how unwieldy logical block addressing was getting; the old 512-byte block size meant a single disk larger than 2 TiB needed internal LBA pointers bigger than 32 bits. Anyone who followed the tech at the time understood this obvious call-out.
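    The arithmetic behind that 32-bit LBA limit is easy to verify:

```python
# Capacity addressable by a 32-bit LBA at each sector size.
SECTOR_512 = 512
SECTOR_4K = 4096
MAX_SECTORS_32BIT = 2 ** 32  # sectors addressable with 32-bit LBAs

tib = 2 ** 40  # bytes per TiB

print(MAX_SECTORS_32BIT * SECTOR_512 / tib)  # 2.0 TiB with 512 B sectors
print(MAX_SECTORS_32BIT * SECTOR_4K / tib)   # 16.0 TiB with 4 KiB sectors
```

    So moving to 4 KiB blocks stretched the same 32-bit addressing from 2 TiB out to 16 TiB.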

    But why did 4KB make sense for a modern storage block size?

    For multiple decades, operating systems have managed main memory in -- can you guess? -- 4 KiB chunks. Yet all modern operating systems offer another option, called Huge Pages (Linux), Super Pages (BSD and macOS) or Large Pages (Windows), to permit managing memory in far larger chunks: 2 MB pages are the limit on BSD and Windows, and a whopping 1 GB page size is available in modern Linux distros. Aligning I/O request sizes to memory page sizes is a significant efficiency play for "big IO" workloads, and yet another example of how something like DirectIO / DirectStorage / GPUDirect / RTX IO needs even more non-trivial thought on how to accomplish the task set ahead of it.
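    The alignment play described above boils down to rounding an I/O request's offset down and its end up to page boundaries, so every transfer covers whole pages. A quick sketch (the function name and sizes are just for illustration):

```python
# Align an I/O request to memory page boundaries: round the start
# offset down and the end offset up to multiples of the page size.

PAGE_4K = 4096
PAGE_2M = 2 * 1024 * 1024  # "large"/"huge" page size on several OSes

def align_io(offset, length, page=PAGE_4K):
    start = offset - (offset % page)            # round start down
    end = -(-(offset + length) // page) * page  # round end up
    return start, end - start

# A 10000-byte read at offset 5000 becomes 3 whole 4 KiB pages...
print(align_io(5000, 10000))           # (4096, 12288)
# ...or a single 2 MB page when large pages are in play.
print(align_io(5000, 10000, PAGE_2M))  # (0, 2097152)
```

    The trade-off is visible immediately: bigger pages mean fewer, larger transfers, at the cost of reading bytes the caller never asked for.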
     
    Shifty Geezer, davis.anthony and BRiT like this.
  13. DSoup

    DSoup Series Soup Legend Subscriber

    OK, so not NTFS, but my question remains. On mountable storage, Windows has ultimate authority over which processes access anything within the storage hierarchy. I am not aware of any Windows filesystem feature that allows Windows to release a portion of storage - bypassing the rest of the security/permissions model.

    Windows is going to need to have access in case the storage area needs maintenance, plus games need to be installed in the first place. You may wish to adjust the size parameters of the partition, or back it up.
     
  14. Albuquerque

    Albuquerque Red-headed step child Veteran

    Yup. Which is why I said earlier:
     