Next-Generation NVMe SSD and I/O Technology [PC, PS5, XBSX|S]

Discussion in 'Console Technology' started by Shortbread, Sep 18, 2020.

  1. Mihailjones

    Newcomer

    Joined:
    May 3, 2017
    Messages:
    82
    Likes Received:
    62
    The Xbox SSD is "slow", so drives that are as fast or faster have been available for years already.

    So finding a drive that is as fast or much faster isn't the issue; compatibility is. Someone will probably test soon what happens with a different SSD.
     
  2. mpg1

    Veteran Newcomer

    Joined:
    Mar 5, 2015
    Messages:
    2,248
    Likes Received:
    1,989
    The removable SSD is probably good from a console repairability perspective, but I'm not sure there is much of an advantage from a consumer perspective. That form factor of M.2 isn't widely available....
     
  3. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    18,264
    Likes Received:
    20,003
    Yeah, not much from a consumer perspective except for a lower overall product price; it's said Microsoft uses it in their Surface products, so that should net them better prices.
     
  4. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    8,446
    Likes Received:
    2,626
    Location:
    Guess...
    On a typical Zen 2 system you're likely to be looking at around 51.2GB/s between the CPU and system RAM. CPU to GPU bandwidth is about 14-15GB/s in each direction over PCIe 3.0. Over PCIe 4.0 it's double that, so around 30GB/s in a single direction.
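
    For anyone who wants to check those figures, here's a rough sketch of the theoretical peaks. The dual-channel DDR4-3200 setup is my assumption for a "typical" Zen 2 box, and the function names are just for illustration. These are line-rate maxima, so usable throughput comes in a bit lower, which is why the PCIe results land slightly above the 14-15GB/s and ~30GB/s quoted above.

```python
# Rough peak-bandwidth arithmetic; real-world throughput will be lower.

def ddr_bandwidth_gbs(mt_per_s: float, channels: int, bus_bytes: int = 8) -> float:
    """Theoretical DRAM bandwidth in GB/s, assuming 64-bit (8-byte) channels."""
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


def pcie_bandwidth_gbs(gt_per_s: float, lanes: int) -> float:
    """Theoretical one-direction PCIe bandwidth in GB/s.

    Uses the 128b/130b encoding shared by PCIe 3.0 and 4.0 and ignores
    packet/protocol overhead.
    """
    return gt_per_s * (128 / 130) * lanes / 8


ram = ddr_bandwidth_gbs(3200, channels=2)    # assumed dual-channel DDR4-3200
gen3_x16 = pcie_bandwidth_gbs(8, lanes=16)   # PCIe 3.0 runs at 8 GT/s per lane
gen4_x16 = pcie_bandwidth_gbs(16, lanes=16)  # PCIe 4.0 runs at 16 GT/s per lane

print(f"RAM: {ram:.1f} GB/s")
print(f"PCIe 3.0 x16: {gen3_x16:.2f} GB/s per direction")
print(f"PCIe 4.0 x16: {gen4_x16:.2f} GB/s per direction")
```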

    So I wouldn't really describe it as a tiny fraction. Granted latency will be rubbish compared to local memory but we're not talking about the CPU rendering out of VRAM here, we're talking about the time to copy what is likely a few hundred MB of game data from where it is decompressed in VRAM to where it needs to be in main RAM for the CPU to work on it (assuming that's how DirectStorage/RTX-IO even works, which is far from given). And all this is to be done at a loading screen so we're talking about timescales in full seconds, not the microseconds of latency that are added by having to work over a PCIe bus rather than from local memory.

    You have 4 channels coming from SSD to GPU via the SSD<->CPU link and then the CPU<->GPU PCI link. So essentially 4 of your 16 channels from CPU->GPU are taken up by that. You still have 16 channels going back from GPU to CPU to move any data that needs to be in system RAM back. Given that data is now decompressed and thus potentially twice as large as when it came over, those 16 channels should still be double what you need to keep up with the maximum speed from the SSD into the GPU. Not that you're likely to need anything like that maximum speed as the data required by the CPU would only be a very small proportion of the total data streaming in from the SSD. MS say 80% of streamed game data is textures, so at the very most you're only looking at 20% of what you stream to the GPU having to go back over the 16x PCIe bus into main memory.
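
    A quick sanity check of that return-path argument, as a sketch only. It assumes PCIe 4.0 line rates, a 2:1 compression ratio, and the 80/20 texture split mentioned above; the variable names are made up for the example.

```python
# All figures are assumptions taken from the post or from PCIe 4.0 line rates.
GEN4_LANE_GBS = 16 * (128 / 130) / 8    # ~1.97 GB/s per lane, one direction

ssd_in = 4 * GEN4_LANE_GBS              # most a x4 SSD can push towards the GPU
gpu_to_cpu = 16 * GEN4_LANE_GBS         # return path over the x16 link
decompressed = ssd_in * 2               # assumed 2:1 compression ratio
cpu_share = 0.20                        # if ~80% of streamed data is textures

worst_case_return = decompressed              # everything has to go back
likely_return = decompressed * cpu_share      # only the non-texture 20% goes back

print(f"Return link: {gpu_to_cpu:.1f} GB/s")
print(f"Worst-case return traffic: {worst_case_return:.1f} GB/s "
      f"({gpu_to_cpu / worst_case_return:.1f}x headroom)")
print(f"Likely return traffic: {likely_return:.1f} GB/s")
```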

    As an example, let's say at the game load you need to pull 10GB off the SSD and into memory. 2GB of that is for the CPU and 8GB is for the GPU. To keep things simple let's say you have a 5GB/s SSD with an effective throughput of 10GB/s with compression.

    Provided you load and decompress the CPU data first, that will be in VRAM and decompressed in the first 0.4 seconds. You then push that back over the CPU<->GPU PCIe link (4GB of it now) at a rate of ~30GB/s, so it takes about 0.13 seconds to put that decompressed data into system RAM. Meanwhile you're still spending the next 1.6 seconds bringing in the remainder of the GPU data into VRAM from the SSD.

    So I'm not seeing why the PCIe bridge between CPU and GPU is acting as a bottleneck in any way in this scenario. Even if you didn't transfer the CPU data from SSD first and push it back in parallel to streaming the remainder of the GPU data from the SSD, you're still adding at worst 0.13 seconds to your 2 second timeframe.
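
    The worked example can be written out as a few lines of arithmetic. This is only a sketch of the numbers in this post (5GB/s raw SSD, 2:1 compression, ~30GB/s PCIe 4.0 x16); `load_timeline` is a made-up helper, not anything from DirectStorage.

```python
def load_timeline(cpu_gb=2.0, gpu_gb=8.0, ssd_gbs=5.0, ratio=2.0, pcie_gbs=30.0):
    """Timings in seconds for the hypothetical 10GB load described in the post."""
    cpu_read_s = cpu_gb / ssd_gbs             # compressed CPU data read into VRAM
    copy_back_s = cpu_gb * ratio / pcie_gbs   # decompressed copy from VRAM to system RAM
    gpu_read_s = gpu_gb / ssd_gbs             # remaining GPU data read into VRAM
    # Best case: the copy-back overlaps the GPU stream and adds nothing.
    overlapped_s = cpu_read_s + max(copy_back_s, gpu_read_s)
    # Worst case: the copy-back is fully serialised after the SSD reads.
    serial_s = cpu_read_s + gpu_read_s + copy_back_s
    return cpu_read_s, copy_back_s, gpu_read_s, overlapped_s, serial_s


cpu_read_s, copy_back_s, gpu_read_s, overlapped_s, serial_s = load_timeline()

print(f"CPU data in VRAM after {cpu_read_s:.2f} s")
print(f"Copy back to system RAM takes {copy_back_s:.2f} s")
print(f"Rest of GPU data takes {gpu_read_s:.2f} s")
print(f"Total: {overlapped_s:.2f} s overlapped, {serial_s:.2f} s serialised")
```

    Either way the copy-back is a small slice of the load, because the SSD read dominates the total.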

    You're doing this at a load/transition screen. Why would PCIe traffic between the CPU and GPU be heavily utilised at that point by audio and networking? More to the point, why would that be impacting the CPU<->GPU PCIe link at all? Those functions sit on the south bridge, which would have its own separate PCIe link to the CPU.
     
    thicc_gaf, BRiT, turkey and 2 others like this.
  5. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    14,402
    Likes Received:
    10,447
    Location:
    London, UK
    I think your whole post is based on the I/O consistently achieving close to its theoretical maximum throughput on modern hardware, with data arranged so optimally that it needs only one PCIe exchange (SSD to GPU) or two (SSD to GPU to CPU) at most. That may be the case for some games packaged in future to work on this exact arrangement, but not for every game shipped to date, nor for those shipping in the next 6-12 months. And probably not many after that, if Ubisoft's lazy-arse port of Valhalla on next-gen consoles is anything to go by.

    Let's see if this is actually the case once DirectStorage's released.

    What else is your computer doing?
     
    BoardBonobo likes this.
  6. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    8,446
    Likes Received:
    2,626
    Location:
    Guess...
    Yes, agreed. My post was just to illustrate the lack of hardware bottlenecks associated with the CPU<->GPU PCIe link in this scenario (game loading/fast travel). It's down to the developers and DirectStorage to make the best use of the hardware and to enable anything close to these maximum throughput levels. That said, I expect SSD transfer speeds to be more likely to fall short of their peak rates than transfers between VRAM and system RAM over the PCIe link.


    No doubt plenty, but in terms of moving Gigabytes of data across these specific links during a game loading screen I wouldn't expect there to be anything that would be heavily impacted. Taking my example above on a PCIe4 system you still have more than a full PCIe3 16x link of free bandwidth between the CPU and GPU even when the SSD is running at full pelt and data is being transferred back from vram to system RAM so there's more than enough to spare.
     
    PSman1700 likes this.
  7. Jay

    Jay
    Veteran Regular

    Joined:
    Aug 3, 2013
    Messages:
    3,346
    Likes Received:
    2,629
    XSX SSD format and layout


    On the surface of it, there shouldn't be much trouble upgrading to a larger size.
    That will be especially interesting for the XSS in future.

    Biggest headache is probably getting to them in the console.
     
    turkey and chris1515 like this.
  8. Vega86

    Newcomer

    Joined:
    Sep 25, 2018
    Messages:
    135
    Likes Received:
    94
    Where do the dedicated ssd decompressors reside? GPU for both systems? How are they different from each other?
     
  9. liams

    Newcomer

    Joined:
    Jul 1, 2020
    Messages:
    181
    Likes Received:
    169
    Both have dedicated decompression hardware. For the PlayStation we know that it's a separate block on the SoC. For the Xbox I don't think we know exactly where it is, but most likely it's a separate piece as well. And by separate I don't mean a separate chip, just that a specific portion of the silicon is dedicated to it. When they talk about hardware decompression, neither is talking about GPU-based decompression.
     
    thicc_gaf, Vega86 and PSman1700 like this.
  10. glow

    Newcomer

    Joined:
    May 6, 2019
    Messages:
    32
    Likes Received:
    27
    I would guess in the SoC itself. I don't know if decompression happens before or after decryption, but either way, for the PS5, they don't have enough PCIe lanes to handle it off SoC (at least, an M.2 slot doesn't have enough PCIe lanes to do it), and the Xbox doesn't seem to either. MSFT claims 8 PCIe 4.0 lanes on the SoC itself. Not all are used at gen4 speeds, since 1 lane goes directly to the GbE controller (RTL8111HM - I am assuming it is an RTL8111H MAC+PHY with the extra M standing for Microsoft) and what looks like 2 lanes to the southbridge. This leaves 5 lanes, of which two are taken by the internal SSD and I'm guessing two by the external slot. Either way, not enough leftover lanes to support an external decompression path.
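
    Tallying that lane budget as a quick sketch. The counts are the ones guessed at in this post, not confirmed figures, and the dictionary labels are just for illustration.

```python
# Lane counts as estimated in the post; none of these are confirmed figures.
TOTAL_LANES = 8  # PCIe 4.0 lanes MSFT claims on the SoC

allocations = {
    "GbE controller": 1,
    "southbridge": 2,            # "what looks like 2 lanes"
    "internal SSD": 2,
    "external storage slot": 2,  # a guess in the post
}

used = sum(allocations.values())
spare = TOTAL_LANES - used

for name, lanes in allocations.items():
    print(f"{name}: {lanes}")
print(f"Used {used} of {TOTAL_LANES} lanes, {spare} spare")
```

    With a single lane left over, there isn't room for an off-SoC decompressor that would need SSD-class bandwidth in and out.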
     
    Vega86 likes this.
  11. Vega86

    Newcomer

    Joined:
    Sep 25, 2018
    Messages:
    135
    Likes Received:
    94
    Didn't MS do a press conference for their chip? No info from there on what part it resides in? How certain is it that Xbox isn't using GPU-based decompression?
     
  12. turkey

    Veteran Newcomer

    Joined:
    Oct 21, 2014
    Messages:
    1,086
    Likes Received:
    866
    Location:
    London
    These will be on the APU for cost and security reasons.

    Sony confirmed the I/O is on the APU

    upload_2020-12-6_9-16-38.png

    MS have decompression and decryption on the APU

    upload_2020-12-6_9-19-24.png
     
  13. liams

    Newcomer

    Joined:
    Jul 1, 2020
    Messages:
    181
    Likes Received:
    169

    Hey, the Hot Chips slides had a leak on them!

    They only announced Microsoft Pluton, a hardware security processor that's going to be built into future CPUs, last month.
    But it's right there under the Security and Decompression bullet point, as HSP/Pluton.
     
    turkey and BRiT like this.
  14. glow

    Newcomer

    Joined:
    May 6, 2019
    Messages:
    32
    Likes Received:
    27
    Pluton has been public and acknowledged by MSFT officials since at least last year, and probably before that. Basically before Hot Chips 32. At least one MSFT security engineer has described the work they did on the Xbox One as the genesis of AMD PSP and Pluton.
     
    liams, turkey and BRiT like this.
