DirectStorage GPU Decompression / RTX IO

Discussion in 'Rendering Technology and APIs' started by DavidGraham, Apr 21, 2021.

  1. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,535
    Likes Received:
    4,195
    What are the chances NVIDIA intends to use its many Tensor cores to do the decompression heavy lifting? It would make sense: they are expanding support to all RTX GPUs (Turing + Ampere), and Tensor cores are only found on RTX GPUs. They also sit mostly idle unless DLSS is engaged, and even then they only work for a fraction of a second after each frame.
     
    PSman1700 likes this.
  2. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,242
    Likes Received:
    1,675
    Location:
    msk.ru/spb.ru
    I doubt that tensor cores would be able to do decompression due to both their precision limitations and the fact that they are matrix multiplication units.
     
    DmitryKo, chris1515 and iroboto like this.
  3. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    902
    Likes Received:
    1,076
    Location:
    55°38′33″ N, 37°28′37″ E
    Tensor cores use reduced-precision floating point processing, intended for training of neural networks.
    https://blogs.nvidia.com/blog/2016/08/22/difference-deep-learning-training-inference-ai/

    But LZ-family compression algorithms treat the data as a stream of bytes (8-bit integers, like ASCII text), and decoding is a simple dictionary look-up. There's no use for reduced-precision arithmetic during decompression.
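
    To illustrate the point about decoding being a look-up, here is a minimal sketch of an LZ77-style decoder in Python. The token format (`"lit"` / `"copy"` tuples) and the function name `lz_decode` are made up for this example and do not match any real file format; the only operations involved are byte appends and copies from earlier output, with no arithmetic on the data itself.

```python
def lz_decode(tokens):
    """Decode a list of LZ77-style tokens into bytes.

    Hypothetical token format, for illustration only:
      ("lit", b)                 -> emit the literal byte value b
      ("copy", distance, length) -> copy `length` bytes starting
                                    `distance` bytes back in the output
    """
    out = bytearray()
    for token in tokens:
        if token[0] == "lit":
            out.append(token[1])
        else:
            _, distance, length = token
            if distance <= 0 or distance > len(out):
                raise ValueError("back-reference outside the window")
            start = len(out) - distance
            # Copy one byte at a time so overlapping matches
            # (length > distance) repeat the pattern correctly.
            for i in range(length):
                out.append(out[start + i])
    return bytes(out)


tokens = [
    ("lit", ord("a")), ("lit", ord("b")), ("lit", ord("c")),
    ("copy", 3, 6),            # repeat "abc" twice
    ("lit", ord("d")),
]
decoded = lz_decode(tokens)
print(decoded)  # b'abcabcabcd'
```

    Note that each copy depends on output produced by earlier tokens, which is one reason a single LZ stream is awkward to spread across many parallel threads.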


    Now neural network algorithms can be used during the encoding process to improve compression ratios, as they can discover additional repeating patterns with a larger dictionary - at the cost of reducing the processing bandwidth by an order of magnitude.
     
    #23 DmitryKo, Apr 22, 2021
    Last edited: Apr 22, 2021
    BRiT likes this.
  4. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,242
    Likes Received:
    1,675
    Location:
    msk.ru/spb.ru
    They actually do support INT8 and INT4 at double the rate of FP16. But I doubt that their matrix-multiply (MM) nature would fit the decompression algorithm.
     
  5. Jay

    Jay
    Veteran Regular

    Joined:
    Aug 3, 2013
    Messages:
    3,531
    Likes Received:
    2,861
    I also wonder if the new type of compression is dictionary based, as wouldn't that make it harder to work well on GPUs?

    It would make sense that it's BCPack, but the fact that they went out of their way to not say BCPack muddies the waters for me.

    How much detail do we have on BCPack?
     
    #25 Jay, Apr 22, 2021
    Last edited: Apr 22, 2021
  6. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,819
    Likes Received:
    3,976
    Location:
    Finland
    What exactly do you consider fast? For example, RTX has achieved a 10% share of NVIDIA GPUs in a little over 2 years; is that fast?
    Comments like these are often caused by tech forums giving the wrong impression of how most people update and upgrade hardware.
     
  7. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,242
    Likes Received:
    1,675
    Location:
    msk.ru/spb.ru
    Most games will still be targeting previous-gen console h/w by the time PCs get parts with dedicated decompression units. I consider this "fast".
     
    PSman1700 likes this.
  8. Alessio1989

    Regular Newcomer

    Joined:
    Jun 6, 2015
    Messages:
    609
    Likes Received:
    320
    What about having automatic, programmable xpress8K compression? Since it is NOT part of NTFS, re-compression after a write must be done manually (could it be done by the OS? So far that's not the case). The other compression methods, apart from xpress4K for slower CPUs, are kind of meh (xpress16K and lzx have the worst compression/overhead ratio).
     
  9. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    902
    Likes Received:
    1,076
    Location:
    55°38′33″ N, 37°28′37″ E
    The DEFLATE algorithm - i.e. LZ77/LZSS + Huffman coding, specified in RFC 1951 and used in the ZIP format - is already a combination of dictionary coding and entropy coding (Huffman). The dictionary coder works well with repeating patterns of bytes (e.g. text data), while entropy coding assigns a shorter 'prefix code', often just a few bits, to bytes that occur frequently.
    Modern lossless image and audio compression formats also use entropy coding, often arithmetic coding.

    Other lossless compression methods are run-length encoding (RLE), which only works well for streams of repetitive data, and wavelet encoding, which is good at encoding transients (i.e. audio signals and smooth gradients) but is computationally expensive. Neither is part of DEFLATE, though.
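
    As a rough illustration of the entropy coding half, the sketch below builds plain Huffman code lengths for a short byte string in Python. It is a simplification: real DEFLATE uses length-limited canonical Huffman codes over combined literal/length and distance alphabets, and the helper name `huffman_code_lengths` is made up for this example.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data):
    """Return {symbol: code length in bits} for a Huffman code over `data`."""
    freq = Counter(data)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth in tree}).
    # The tie-breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol one level deeper.
        merged = {s: d + 1 for s, d in a.items()}
        merged.update({s: d + 1 for s, d in b.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


text = b"aaaaaaaabbbbccd"          # 'a' x8, 'b' x4, 'c' x2, 'd' x1
lengths = huffman_code_lengths(text)
encoded_bits = sum(lengths[s] for s in text)
print({chr(s): n for s, n in sorted(lengths.items())})
print(encoded_bits, "bits instead of", 8 * len(text))
```

    For this input the most frequent byte gets a 1-bit code and the rarest ones get 3-bit codes, so the 15 bytes take 25 bits instead of 120 (ignoring the cost of storing the code table).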

    On BCPack details: still not much. GameStack's DirectStorage for Windows session featured a slide where the texture compression method is described as DEFLATE over standard BC (i.e. S3TC/DXTC); they didn't specifically name it BCPACK, though (see the 11:50 time mark in the video, and slide #12 in the PDF file).

    I'd still stand by my earlier assumption that BCPACK uses a two-stage process similar to Oodle Leviathan/Kraken, i.e. an LZ-family compression pass applied after a "lossless transform" step, which reorders bytes in a BCn texture to improve the compression ratio of the LZ pass. It could also include an improved lossy texture compression algorithm, similar to Oodle Texture Rate Distortion Optimization (RDO) processing, with finer control of quality/compression ratios, which decodes to BCn formats.
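
    To make the "lossless transform, then LZ pass" idea concrete, here is a small Python sketch. It is purely an assumption for illustration and is NOT the actual BCPACK or Oodle transform, which are not described in this thread: it takes BC1 blocks (8 bytes each: two RGB565 endpoints followed by 32 bits of 2-bit indices), groups all endpoint bytes into one stream and all index bytes into another, and hands the result to zlib (DEFLATE). The function names and the synthetic test data are made up.

```python
import struct
import zlib

BLOCK_SIZE = 8  # one BC1 block: 4 bytes of endpoints + 4 bytes of indices


def split_streams(blocks):
    """Hypothetical transform: all endpoint bytes first, then all index bytes."""
    if len(blocks) % BLOCK_SIZE:
        raise ValueError("input must be a whole number of BC1 blocks")
    endpoints = bytearray()
    indices = bytearray()
    for off in range(0, len(blocks), BLOCK_SIZE):
        endpoints += blocks[off:off + 4]
        indices += blocks[off + 4:off + 8]
    return bytes(endpoints + indices)


def merge_streams(data):
    """Inverse of split_streams: re-interleave into 8-byte BC1 blocks."""
    if len(data) % BLOCK_SIZE:
        raise ValueError("input must be a whole number of BC1 blocks")
    half = len(data) // 2
    out = bytearray()
    for i in range(0, half, 4):
        out += data[i:i + 4]                  # endpoints of this block
        out += data[half + i:half + i + 4]    # indices of this block
    return bytes(out)


# Synthetic stand-in for texture data: slowly changing endpoints,
# scrambled index bits. Real textures will behave differently.
blocks = b"".join(
    struct.pack("<HHI", 0x8410 + n, 0x4208 + n, (n * 2654435761) & 0xFFFFFFFF)
    for n in range(256)
)

packed = zlib.compress(split_streams(blocks))        # transform + LZ/Huffman pass
restored = merge_streams(zlib.decompress(packed))    # decode + inverse transform

print("plain zlib:", len(zlib.compress(blocks)), "bytes")
print("split + zlib:", len(packed), "bytes")
print("round trip ok:", restored == blocks)
```

    The transform itself is a pure byte shuffle, so it is exactly reversible; whether it actually helps the LZ pass depends on the data, which is why the sketch prints both sizes rather than assuming a gain.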
     
    #29 DmitryKo, Apr 23, 2021
    Last edited: Apr 23, 2021
    Krteq, DegustatoR and BRiT like this.
  10. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    902
    Likes Received:
    1,076
    Location:
    55°38′33″ N, 37°28′37″ E
    We've discussed this earlier in the DirectStorage for Windows thread.

    CompactOS compression is currently implemented as a file system filter driver in the Windows I/O Manager stack. It's not just a command-line tool - it's actually available to any Windows application through a well-documented 'Compression API'. Yes, it's true that any write will decompress the file, but game assets only need to be written once during the installation process (or by game updates), and new user data can be re-compressed by the application using the Compression API.

    As an added benefit, CompactOS uses contiguous write allocations, so the compressed file is not fragmented and is written in one single chunk (provided there is enough free space on the disk) - whereas the older NTFS cluster-based compression results in a very heavily fragmented file (and it only works with 4K clusters, even though cluster sizes up to 64 KB, and recently up to 2 MB, have been supported by NTFS).


    I believe the CompactOS filter driver could be refactored to use GPU compute (or a hardware decoder) to process the data directly in GPU memory, using some 'fast path' in the I/O Manager stack. But it remains to be seen whether the actual Windows implementation of DirectStorage includes improvements to CompactOS.
     
    #30 DmitryKo, Apr 23, 2021
    Last edited: Apr 23, 2021
    DegustatoR, Alessio1989 and BRiT like this.
  11. Alessio1989

    Regular Newcomer

    Joined:
    Jun 6, 2015
    Messages:
    609
    Likes Received:
    320
    It would be nice to have automatic re-compression even on folders that are not directly controlled by a digital distribution client like steam/gog/egs etc. A lot of games would gain something, at least on main loading times, if not on almost all resource loading from disk. Unfortunately this is only possible with the old and inefficient NTFS compression.
    A hardware decoder would be sweet for xpress16K or lzx, but with xpress4k and xpress8k we get some improvements even on a low-tier CPU.
     
    #31 Alessio1989, Apr 23, 2021
    Last edited: Apr 23, 2021

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.