davis.anthony
Veteran
It's the loaded file size (likely the original uncompressed dataset size), 5.65 GB, divided by the decompression time, 0.8s and 2.36s respectively, giving the rate of final data delivered in GB/s. At 100% CPU utilisation it can decompress 2.4 GB/s. The GPU path uses 15% CPU to decompress 7 GB/s. No mention of GPU utilisation or power draw though, which would be interesting comparison points, especially in relation to the value of custom decompression hardware.
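For anyone wanting to check the arithmetic, a quick sketch; the 5.65 GB, 0.8s and 2.36s figures are from the article, while reading 5.65 GB as the uncompressed output size is my assumption:

```cpp
#include <cstdio>

// Back-of-the-envelope throughput: uncompressed output size divided
// by wall-clock decompression time. Input figures from the article.
int main() {
    const double output_gb   = 5.65;  // assumed uncompressed dataset size
    const double gpu_seconds = 0.8;   // GPU decompression time
    const double cpu_seconds = 2.36;  // CPU decompression time

    std::printf("GPU: %.2f GB/s\n", output_gb / gpu_seconds);  // ~7.06 GB/s
    std::printf("CPU: %.2f GB/s\n", output_gb / cpu_seconds);  // ~2.39 GB/s
}
```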
Edit: It's also worth noting the text describes it as a 'highly optimised sample', implying a best-case comparison rather than the general case. Oh, and there's no mention of the GPU used either! Very little comparative data, so this is just a nice preview of the potential and an indicator that, at some level at least, PC should scale okay with next-gen storage at the raw data level.
I think the bandwidth cost could be a bigger consumer of GPU resources than the decompression task itself, as that'll likely be done async.
The 3 GB of compressed assets sitting in VRAM can't fit into the GPU's caches at once, so they'll need to be decompressed in chunks, which will mean a larger write cost than read cost:
Read 1MB into GPU cache > Decompress > 2MB output written to VRAM
Likely won't be a problem for the monster GPUs that have a ton of bandwidth, but for GPUs on smaller 192-bit buses with lower bandwidth, those few GB/s might be precious for actual rendering performance, especially if RT is on.
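To put rough numbers on that, a minimal sketch of the read-plus-write VRAM traffic; the 2:1 compression ratio and the 360 GB/s bandwidth for a 192-bit card are assumptions of mine, not figures from the article:

```cpp
#include <cstdio>

// Rough VRAM traffic from chunked GPU decompression: each pass reads
// the compressed chunk and writes the decompressed output, so traffic
// is (read + write) at the output rate quoted above (~7 GB/s).
int main() {
    const double output_rate = 7.0;                  // GB/s decompressed output
    const double ratio       = 2.0;                  // assumed compression ratio
    const double read_rate   = output_rate / ratio;  // GB/s compressed reads
    const double traffic     = read_rate + output_rate;

    const double bus_bw = 360.0;  // hypothetical 192-bit-bus GPU, GB/s
    std::printf("VRAM traffic: %.1f GB/s (%.1f%% of %.0f GB/s)\n",
                traffic, 100.0 * traffic / bus_bw, bus_bw);
}
```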
And I'm not surprised they didn't disclose the specs; they've never been particularly open about DirectStorage.
Last edited: