Digital Foundry Article Technical Discussion [2020]

Discussion in 'Console Technology' started by BRiT, Jan 1, 2020.

Thread Status:
Not open for further replies.
  1. PSman1700

    Legend

    Joined:
    Mar 22, 2019
    Messages:
    7,118
    Likes Received:
    3,090
    We have no data. I can counter with the UE5 demo running on an NVMe/RTX3080Q (RTX 2070 dGPU) laptop. It was supposedly an SSD-showcasing tech demo. It ran better on the laptop (higher res).
     
    Cyan likes this.
  2. chris1515

    Legend

    Joined:
    Jul 24, 2005
    Messages:
    7,157
    Likes Received:
    7,965
    Location:
    Barcelona Spain
    As I said before, there are three duplicated frames at the beginning of each purple portal. I suppose there is some latency between the I/O request and the moment the data actually starts loading. They wouldn't need these three duplicated frames if the loading began earlier.

    All of this is compatible with SSD technology, latency in the tenth to the hundredth milliseconds. :wink4: 8 to 9 GB of data loaded per second, and from 1 to 1.6 seconds to load a level.

    The Spider-Man level in the PS5 demo loaded in 0.8 seconds, but the RAM has doubled on PS5. It's all logical... :wink4:

    Again, we don't know if it was the same quality, and resolution has nothing to do with the SSD.
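
    As a rough check on the figures quoted in this post, the amount of data that fits in a load window is just throughput multiplied by time. The sketch below is only that arithmetic, assuming the 8 to 9 GB/s rate is sustained for the whole window; the function name is illustrative and not from any SDK.

```python
def data_loaded_gb(throughput_gb_s: float, seconds: float) -> float:
    """Data volume in GB moved at a sustained throughput over a time window."""
    return throughput_gb_s * seconds


# Throughput and load times quoted in the thread (assumed sustained, which is optimistic).
for throughput in (8.0, 9.0):
    for seconds in (0.8, 1.0, 1.6):
        volume = data_loaded_gb(throughput, seconds)
        print(f"{throughput:.0f} GB/s for {seconds} s -> {volume:.1f} GB")
```

    On these assumptions a 1 to 1.6 second load moves somewhere between 8 and about 14.4 GB, which is in the range of a console's full RAM pool.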
     
  3. techuse

    Veteran

    Joined:
    Feb 19, 2013
    Messages:
    1,425
    Likes Received:
    908
    This is not correct IIRC. No concrete performance info was given out about UE5 on anything other than a PS5. An Epic employee said it would run pretty well on an RTX 2070 from a GPU standpoint. It was never shown running on a laptop. The laptop was playing back the video of the PS5 version.
     
  4. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,235
    Likes Received:
    4,259
    Location:
    Guess...
    The compression ratio of RTX IO is 2:1. That's perfectly in line with the XSX's BCPACK. I strongly suspect that's not coincidental.
     
    pharma, PSman1700 and BRiT like this.
  5. chris1515

    Legend

    Joined:
    Jul 24, 2005
    Messages:
    7,157
    Likes Received:
    7,965
    Location:
    Barcelona Spain
    Nvidia said this is best-case compression. It will often be less than this. The compression ratio will vary from level to level depending on the set of textures. There is no such thing as a single compression ratio, even within the same game.

     
    #1245 chris1515, Sep 5, 2020
    Last edited: Sep 5, 2020
  6. PSman1700

    Legend

    Joined:
    Mar 22, 2019
    Messages:
    7,118
    Likes Received:
    3,090
    RTX IO is the fastest SSD tech of the bunch now. Yes, it's marketing, but so was Sony's. Ratchet doesn't provide any data, and UE5 doesn't either, but a PCIe NVMe laptop did the same thing at a higher fps.
     
    Cyan likes this.
  7. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    Interested in seeing the source for this.
     
    goonergaz likes this.
  8. Globalisateur

    Globalisateur Globby
    Veteran Subscriber

    Joined:
    Nov 6, 2013
    Messages:
    4,592
    Likes Received:
    3,411
    Location:
    France
    Not even on paper, as the PS5 best case is 22GB/s. Besides, there are no bottlenecks on PS5, while there are still plenty on PC + RTX. In practice PS5 I/O will still be quite a bit faster. Currently the loading times are so quick, almost too quick (from 0.8 sec to 1.6 sec), that many people are in denial: it can't be, the loading must be happening before and after, that kind of thing.

    We'll have a better assessment with games like The Witcher 3 with its quite long loading times. It's going to be hard to deny anything then with such a fair comparison.

    Also finally don't forget that 22GB/s is using the custom hardware on PS5. They could have even better compression if they used the GPU shaders like RTX IO.
     
    DSoup likes this.
  9. Remij

    Regular

    Joined:
    May 3, 2008
    Messages:
    677
    Likes Received:
    1,256
    I would imagine the difference between them would be inconsequential past a point. PCs throughout the generation will have far more RAM and VRAM and could keep more in memory, reducing the need to swap full sets of assets in and out at any given time. And why would they even bother using compute resources for decompression on PS5 if they have a dedicated hardware block? It's already plenty fast enough. They're going to want to keep all those precious resources for the GPU. Also, with RTX I/O, since the decompression is done with the shader cores, there are plenty of resources to decompress assets, which Nvidia has already stated has a negligible performance hit. It should also scale as more powerful GPUs and faster storage drives are released. There's also a chance that at some point Nvidia and AMD could include a dedicated decompression block right on the GPU as well.

    That of course doesn't change the fact that PS5's implementation is more efficient... but efficiency doesn't always mean "faster" or "better".
     
    PSman1700, BRiT and pharma like this.
  10. DSoup

    DSoup Series Soup
    Legend Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    16,777
    Likes Received:
    12,691
    Location:
    London, UK
    I could have done with Nvidia's slide on PC architecture bottlenecks about six months back, to show to the folks who struggled to understand it (or even believe it). Unless I'm missing something, RTX I/O only solves half the problem and requires a new approach to structuring game data to achieve that.

    For compressed data read off storage that is for sole use by the GPU (textures, geometry, shaders, data for compute tasks etc), rather than the data flow being 1) storage, 2) bus, 3) main memory, 4) CPU (unpacking), 5) bus, and 6) GPU/VRAM that data flow can now be 1) storage, 2) bus, 3) GPU/VRAM.

    But for compressed data read off storage that is for sole use by the CPU, or is needed by both CPU and GPU, like geometry data where AI, collision detection and any other interactions are handled by the CPU, you're still waiting on the CPU, which will have had some load lifted. I think the conundrum comes down to how games, or rather installers, package data. If you have a supported GeForce RTX card (this only shows the GeForce 30xx series on Nvidia's site but surely must include first generation RTX cards), a PCIe 4.x board and a fast NVMe drive, all of your games ever released still have all the data shoved together in one pack, stored for the CPU to pick apart. You now need GPU-only and CPU-only data stored separately so they can be routed to the appropriate RAM pool right up front.
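
    The packaging split described above can be sketched as a simple partition step at build or install time. This is a toy model, not how DirectStorage or RTX IO actually describe assets: the `Asset` class, the `split_packs` function and the file names are all hypothetical, and real engines track far more than two flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Hypothetical asset record: who consumes the decompressed data."""
    name: str
    used_by_cpu: bool
    used_by_gpu: bool


def split_packs(assets):
    """Partition assets into a GPU-only pack and a pack the CPU must still touch."""
    # GPU-only data can take the short path: storage -> bus -> GPU/VRAM.
    gpu_only = [a.name for a in assets if a.used_by_gpu and not a.used_by_cpu]
    # Anything the CPU needs (alone or shared) still goes through main memory.
    cpu_path = [a.name for a in assets if a.used_by_cpu]
    return gpu_only, cpu_path


assets = [
    Asset("rock_albedo.tex", used_by_cpu=False, used_by_gpu=True),
    Asset("terrain.mesh", used_by_cpu=True, used_by_gpu=True),  # collision + rendering
    Asset("npc_schedule.ai", used_by_cpu=True, used_by_gpu=False),
]

gpu_only, cpu_path = split_packs(assets)
print("GPU-only pack:", gpu_only)
print("CPU-path pack:", cpu_path)
```

    Shared data such as the terrain mesh is the awkward case: in this sketch it stays on the CPU path, which is the "half of the problem" that remains unsolved.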

    If this takes off, will this result in a proliferation in games patches that re-organize game data to support Nvidia's brand of DirectStorage? What if AMD come up with a different implementation?


    I like Nvidia's approach but you can only do so much when most games store all their data together and the CPU and GPU have their own RAM pools. Once games start to support it, it should be an overall win, but it's still a distant step from the simplified, unified architecture of nextgen consoles that don't have this problem to solve. This is an architectural design approach where one model's advantages are the other model's disadvantages and vice versa.
     
    #1250 DSoup, Sep 5, 2020
    Last edited: Sep 5, 2020
    chris1515, Jawed and Lalaland like this.
  11. Rootax

    Veteran

    Joined:
    Jan 2, 2006
    Messages:
    2,400
    Likes Received:
    1,845
    Location:
    France
    My understanding was that DirectStorage is a sort of API / "norm", like Direct3D? So, if Nvidia's solution utilises DirectStorage, and AMD's does too, it should not be a problem for the devs, since they only need to make it work for DirectStorage?
     
  12. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,235
    Likes Received:
    4,259
    Location:
    Guess...
    That's the peak rate of the decompression block, relevant to extreme corner cases only, not to the average. The average is 8-9GB/s; Sony have been unambiguous about that.

    So you're saying that Sony's 10TF GPU could outperform the 22GB/s of its hardware block but for some reason a 30TF Ampere couldn't?

    I assume you have detailed insider knowledge of exactly how RTX IO and Direct Storage work in order to know this. Would you care to share the details?

    It depends how much they've re-architected the game to take advantage of the new IO paradigms. Sony certainly has a better chance of that, but even Cerny has said that games won't automatically benefit from ultra fast loading times because older software simply isn't designed with these new paradigms in mind.
     
    tinokun, Cyan, BRiT and 1 other person like this.
  13. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,235
    Likes Received:
    4,259
    Location:
    Guess...
    I seem to remember some members struggling to believe something like RTX IO was even possible on a PC ;)

    Yes, they've specifically stated that games need to be "Direct Storage enabled" to take advantage of this. The good news is that PCs don't need to be Direct Storage capable in order to run Direct Storage enabled games. So developers can use it without worrying about whether a PC can run it or not. That should help adoption significantly.

    It'll be interesting to see how RTX IO handles this, i.e. does everything go through the GPU for decompression first and then get doled out to the CPU or GPU as required? Or does it work as you say, with the CPU handling the decompression of its own data? Nvidia's claims of the overhead reduction suggest the former, but if it's the latter then you're still talking about an 80%+ reduction in the load on the CPU (typical percentage of streamed game content made up of textures according to Microsoft).

    PCIe 4.x isn't a requirement, and yes Turing is also supported. Time for me to upgrade!

    Can you have your own brand of Direct Storage? The whole point of an API like this is so that any game that supports it will run on any hardware that supports it, regardless of how it's implemented. I'm sure AMD will have their own implementation of Direct Storage but I don't expect games to have to cater to one or the other.
     
  14. DSoup

    DSoup Series Soup
    Legend Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    16,777
    Likes Received:
    12,691
    Location:
    London, UK
    Then why are Nvidia branding this as Nvidia RTX I/O and not just DirectStorage? Perhaps it's just marketing. But some standard needs adhering to for interoperability with the GPU. Maybe this is the equivalent of Nvidia API extensions for DirectStorage.

    Touché ;-). I know this is in jest, but while it's solving half of the problem, lifting the load will help in equal amounts, I'd have thought.

    This may take a while to gain much traction, or supporting will transition in over time. But you have to start somewhere. Direct3D took a while to gain traction. I'd expect DirectStorage 1.0 to be the beginning of a more comprehensive solution which will require further tweaks to the PC's architectural arrangement.

    This will be interesting to watch. By flipping the role of who decompresses, the worst case scenario should be that it is no worse than the situation now (with the CPU taking this role), and better because the GPU is faster at decompressing. But decompression is half of the equation; it depends how much of the data you decompressed needs to be in the other RAM pool. Having literally no data at all to look at, I would be really quite surprised if more data was required by the CPU/main RAM than the GPU/VRAM. We know how massive geometry and texture data is. There may be some edge cases but this surely also has to be a win.

    But there is a question of how much data in existing games is packed optimally for the GPU decompressors. It's not always the case that textures are here, shaders are there, geometry data is here - I've seen data packed in very weird ways, some games pack shader data and other graphics tech into the world geometry data (thank you Witcher 3 and Infamous Second Son that's genius :nope:)

    PCIe 4.x isn't a technical requirement, but without it you're losing a lot of potential bandwidth.

    Many of the DirectX set of APIs have a core set, plus a method of extension. You want an API to set the standard but not limit future hardware. Nvidia and AMD-specific graphics extensions have been common on graphics cards for many years remember. :yes:
     
    Pete, PSman1700 and pjbliverpool like this.
  15. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,235
    Likes Received:
    4,259
    Location:
    Guess...
    I imagine in stating "best case compression" they're referring to the physical speed of the drive rather than the compression ratio. I'm fairly sure I've seen them state elsewhere that the typical compression ratio is 2:1 (just like MS claims for BCPACK) but that will only achieve 14GB/s on a best case 7GB/s PCIe4 NVMe drive.
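
    The relationship between raw drive speed and delivered data in this post is a single multiplication. The sketch below only restates that arithmetic with the thread's own numbers (7GB/s and 2.4GB/s drives, 2:1 typical ratio); the lower ratios are illustrative values added to show the "it varies per level" point and are not measured figures.

```python
def effective_throughput_gb_s(raw_gb_s: float, compression_ratio: float) -> float:
    """Decompressed data delivered per second for a given raw read speed."""
    return raw_gb_s * compression_ratio


# Figures quoted in the thread: best-case PCIe4 NVMe drive and the XSX drive, both at 2:1.
print(effective_throughput_gb_s(7.0, 2.0))   # 14.0
print(effective_throughput_gb_s(2.4, 2.0))   # 4.8

# Illustrative only: the same 7GB/s drive if a level compresses less well.
for ratio in (1.25, 1.5, 2.0):
    print(f"ratio {ratio}:1 -> {effective_throughput_gb_s(7.0, ratio):.2f} GB/s")
```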
     
    BRiT and PSman1700 like this.
  16. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,235
    Likes Received:
    4,259
    Location:
    Guess...
    Microsoft claim 80% of streamed game data is textures. So yes, certainly seems like a big win.

    My guess would be none as Nvidia have already stated a game needs to be Direct Storage compatible to work with RTX IO. Looks like we're looking at a whole new paradigm for developers.

    Fair point, it'll be interesting to see how that turns out. It'd certainly be a massive handicap if games have to specifically support RTX IO rather than just Direct Storage. I don't see it going that way personally but it does seem to be a possibility.
     
  17. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,235
    Likes Received:
    4,259
    Location:
    Guess...
    This is particularly interesting and possibly gives us a hint at the peak performance of RTX IO.

    Andrew Goossen told Digital Foundry that the XSX was doing work equivalent to 5x Zen 2 cores at the drive's max speed (2.4GB/s) if Direct Storage and the hardware decompression block weren't involved. That comes out at 480MB/s per CPU core. Nvidia's claims of needing 14 CPU cores to do the same for a 7GB/s drive fall right in line with that (anyone else wonder if they're singing from the same hymn book?).

    RTX IO is 3.33x faster than 24 Zen 2 cores in this example. Or the equivalent of 80(!) Zen 2 cores. 80*480MB/s = 38.4GB/s.


    Actually, since only 14 cores are required to max out the decompression requirements of a 7GB/s drive, the above makes no sense, since a 24-core Threadripper should already completely remove the decompression bottleneck unless they're using some kind of crazy RAID setup to achieve 23GB/s+ SSD throughput. So perhaps what we're seeing here is the non-decompression related benefits of RTX IO / Direct Storage.
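
    The back-of-envelope numbers in this post can be reproduced in a few lines. This is only the arithmetic from the quoted figures (5 cores at 2.4GB/s, a 7GB/s drive, a 3.33x speed-up over 24 cores); it assumes decompression cost scales linearly with throughput and core count, which is a simplification.

```python
XSX_RAW_GB_S = 2.4   # XSX drive speed quoted by Goossen
XSX_CORES = 5        # Zen 2 cores said to be needed at that speed

# Per-core decompression rate implied by those two figures.
per_core_gb_s = XSX_RAW_GB_S / XSX_CORES
print(f"per core: {per_core_gb_s * 1000:.0f} MB/s")

# Cores needed to keep up with a 7GB/s drive at the same per-core rate.
cores_for_7 = 7.0 / per_core_gb_s
print(f"cores for 7 GB/s: {cores_for_7:.1f}")

# Nvidia's 3.33x over 24 cores, expressed as equivalent cores and throughput.
equivalent_cores = 3.33 * 24
implied_gb_s = round(equivalent_cores) * per_core_gb_s
print(f"equivalent cores: {equivalent_cores:.1f}, implied rate: {implied_gb_s:.1f} GB/s")

# What 24 cores alone could sustain under the same linear assumption.
print(f"24 cores: {24 * per_core_gb_s:.2f} GB/s")
```

    Under these assumptions 24 cores would already sustain about 11.5GB/s of input, well above a single 7GB/s drive, which is the inconsistency the last paragraph points out.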
     
    #1257 pjbliverpool, Sep 5, 2020
    Last edited: Sep 5, 2020
    Cyan and PSman1700 like this.
  18. Osamar

    Newcomer

    Joined:
    Sep 19, 2006
    Messages:
    231
    Likes Received:
    43
    Location:
    40,00ºN - 00,00ºE
    What I understand is that RTX I/O is the marketing name for the changes/drivers Nvidia have made to be DirectStorage compatible.
     
    Cyan, Man from Atlantis, BRiT and 3 others like this.
  19. DSoup

    DSoup Series Soup
    Legend Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    16,777
    Likes Received:
    12,691
    Location:
    London, UK
    Yup, and normally this would be a big ask for developers: asking them to change the way data is organised, structured and compressed. If Nvidia had rolled this out in isolation I'd be skeptical of its adoption (speaking as a 2080Ti owner), but with this benefiting nextgen consoles as well, hopefully devs will make the effort to embrace the paradigm shift.

    It is a shame that few existing games will benefit but we don't yet know what performance leaps console games will get on nextgen hardware either. Given their dog slow HDDs I'd expect massive leaps forward but equally I'm not expecting miracles.
     
  20. Rootax

    Veteran

    Joined:
    Jan 2, 2006
    Messages:
    2,400
    Likes Received:
    1,845
    Location:
    France
    About PCIe gen3 or gen4 and RTX I/O, I guess even with gen3 it's great. Yeah, the max bandwidth is lower, but you can still benefit from lower CPU usage. It's equally important imo.
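
    For a sense of how much lower the gen3 ceiling is, the theoretical link bandwidth can be worked out from the per-lane transfer rate. The figures below are not from this thread: they assume the standard 8 GT/s (gen3) and 16 GT/s (gen4) rates with 128b/130b encoding and a typical x4 NVMe link, and they ignore protocol overhead, so real drives land somewhat below them.

```python
def link_bandwidth_gb_s(gigatransfers_per_s: float, lanes: int) -> float:
    """Theoretical PCIe bandwidth in GB/s for gen3/gen4 style 128b/130b encoding."""
    encoding_efficiency = 128 / 130          # 128 payload bits per 130 bits on the wire
    bits_per_s = gigatransfers_per_s * encoding_efficiency
    return bits_per_s / 8 * lanes            # bits -> bytes, then scale by lane count


gen3_x4 = link_bandwidth_gb_s(8.0, lanes=4)
gen4_x4 = link_bandwidth_gb_s(16.0, lanes=4)
print(f"PCIe 3.0 x4: {gen3_x4:.2f} GB/s")
print(f"PCIe 4.0 x4: {gen4_x4:.2f} GB/s")
```

    So a gen3 x4 drive tops out at a little under 4GB/s raw, roughly half of the 7GB/s best case discussed earlier, while the CPU offload applies either way.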
     

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.