In the entire history of GPUs, or any larger chip really, there are very few instances of fixed-function decompression hardware. Basically all but one were added only after decades of planning around, or use of, the same coding scheme. The lossy block codecs are naturally consortium driven, for a variety of reasons, among them patent issues as well as longevity concerns. That's why these formats are normally driven by standards bodies like ISO or Khronos; S3TC, ETC and ASTC are good examples of this, as are video codecs using arithmetic coders or whatnot. This is a lesson from the arithmetic coder wars of the '80s and '90s.
Other lossless variants are almost always used only in embedded systems. The GPU in a PS4 is embedded in that context: it can't change, it's always there; and likewise products for the PS4 are sort of embedded software, because they only run on that machine. In this context you can do whatever you as a product maker want, because you own it, and it'll never change. This is why Kraken is possible (that easily), and why other chips can use LZ77 or Huffman or whatever you fancy.
Because the content creator is responsible for how the data is encoded, it's enormously difficult to introduce changes to a system wired in a fixed way. While it's possible in principle to transcode from one coding to the next, I've never really heard of transcoding hardware (zip to rar? Huffman to arithmetic?). The general problem is that encoding is at times a very hard problem, and a general-purpose CPU is perfectly suited to chew on it for a couple of hours, often with large memory use. Not the stuff for a hardware block.
A graphics card is replaceable, and people tend to replace it fairly often, so whatever scheme you come up with should *never* change. At all. Because you can't realistically transcode later, and otherwise you have to question the usefulness of a proprietary scheme when, on most of the installed hardware, it falls back to the general-purpose CPU anyway.
GPU decompression schemes have been around since GPGPU became a term; unsurprisingly, the "G" for general-purpose shows here, in that actual transcoders were run on GPUs (JPEG-XR to BTC in Rage). Huffman and arithmetic coders have been trivially runnable at rates in excess of a GB/s on a compute shader for a long while now. It's an easy topic to investigate.
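For illustration, here's a minimal sketch of the chunk-parallel pattern such compute-based decompressors rely on: the asset is split into independently coded chunks with a small offset table, and each thread bit-serially walks a Huffman tree for its own chunk. The kernel name, the tree layout (internal nodes as array indices, leaves stored as ~symbol) and the offset tables are all assumptions made up for the example, not any shipping format; it's a sketch of the principle, not a definitive implementation.

```
// Hypothetical illustration only: chunk-parallel Huffman decode, one thread per chunk.
// Assumed layout (invented for this example, not any real format):
//   bits             - concatenated bitstream of all chunks, LSB-first within each byte
//   chunk_bit_offset - start bit of each chunk
//   out_offset/len   - where each chunk's decoded bytes go, and how many
//   left/right       - Huffman tree as arrays; node >= 0 is internal, a leaf is stored as ~symbol
#include <cstdint>

__global__ void decode_chunks(const uint8_t*  bits,
                              const uint32_t* chunk_bit_offset,
                              const uint32_t* out_offset,
                              const uint32_t* out_len,
                              const int16_t*  left,
                              const int16_t*  right,
                              uint8_t*        out,
                              int             num_chunks)
{
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c >= num_chunks) return;

    uint32_t bitpos = chunk_bit_offset[c];
    uint8_t* dst    = out + out_offset[c];

    for (uint32_t i = 0; i < out_len[c]; ++i) {
        int node = 0;                                      // start at the root
        while (node >= 0) {                                // descend until we hit a leaf
            int bit = (bits[bitpos >> 3] >> (bitpos & 7)) & 1;
            ++bitpos;
            node = bit ? right[node] : left[node];
        }
        dst[i] = (uint8_t)~node;                           // leaf carries ~symbol
    }
}
```

Launched with one thread per chunk, throughput already scales with the number of chunks; real compute-shader decoders get to those GB/s figures by decoding multiple symbols per table lookup and cooperating within a warp, but the principle is the same: cut the bitstream into independent pieces so the GPU's width can actually be used.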
Now, I see basically no reason why a developer would go through the hassle of compressing data offline in a proprietary scheme which might disappear with the next card, or become utterly bad relative to new inventions. Nobody wants to be locked in on a non-embedded platform. Nvidia's developers know this very well, they're not amateurs and have intimate knowledge of the tradeoffs and the history of compression schemes in chips (they are part of it). Even something as innocent as a (hypothetical or accidentally guessed) NVLink peer-to-peer compression scheme has major ramifications for product design, with regard to compatibility and maintenance.
Any type of compression scheme on something like a PC would either go through a standards body a decade before it lands in a chip, or it would be in software. Whatever compute-based solution is shipped with a game is embedded in that product, so if you don't update your product, you are free of any side effects. If you do, all of the same concerns apply, but at least it's easy to transcode, because it's all software.
The GPU compressor Nvidia mentions might be an original invention (and might even only be practical with tensor cores or whatnot), but it is unlikely to be anything done with dedicated hardware units; and it is entirely embedded in the products using it.