Just wanted to point out something: the thread started out about shader compilation and has now moved on to decompression.
Do we even know how DirectStorage works? Security models? File systems? Access methods, etc.?
It seems a bit premature to make such grand superiority statements when the driver, OS, and security model can have such a large impact on performance.
No. Nvidia did not specify the algorithm or the maximum bandwidth, but if they're using LZ-family block compression, CUDA-based libraries are typically reaching a few GByte/s according to academic papers, so 28 GByte/s would be too high even when you consume the entire GPU, and not just 'a fraction of GPU'.
The entire idea of DirectStorage is to free the CPU from loading and decompression tasks by streaming the data directly to video memory as fast as possible, using a dedicated hardware chip (on the Xbox) or compute units (on the PC) - which means their block compression algorithm has to be designed for simplicity and low decompression overhead, not for the best possible compression efficiency or processing bandwidth. Not sure why that is so hard to understand.
No word on Silicon Motion SM2264 and SM2267 products, though ADATA showed a few prototypes at CES 2020.
The updated ASUS Hyper M.2 x16 Gen4 card
Gigabyte AORUS Gen4 AIC Adaptor
GC-4XM2G4
https://www.gigabyte.com/Solid-State-Drive/AORUS-Gen4-AIC-Adaptor
"interestingly matching BCPACK (which is not part of the LZ-family)"
XBTC/BCPack in the Xbox Series X SDK tools, just like the Oodle RDO and BC7Prep in PlayStation 5, is not a general-purpose lossless data compression algorithm - it's a lossy texture compression format similar to the S3TC/DXTn/BCn texture compression, which is strictly limited to specific resource formats. That is unlike the Oodle Kraken algorithm licensed for the PlayStation 5 SSD controller, which is an LZMA-family decoder that runs solely on the CPU threads and only goes up to about 1 Gbyte/s.
"14GB/s output is explicitly stated as something they can exceed and which consumes a very small percentage of GPU resources."
These numbers come from the RTX IO slides above, but 14 Gbyte/s with a 2:1 compression ratio using only 32 compute cores (<1% of the total) just doesn't add up for a lossless data compression algorithm working on binary (non-text) data, like textures, normal maps, and geometry/meshes.
Compression efficiency is 2:1 if that's what you're referring to
"In terms of processing bandwidth/requirements, I'd say that would be critical, considering that whatever the decompression costs is subtracted from your rendering capabilities. So efficiency in terms of processing requirements would have to be a high priority."
High coding efficiency (compression rate) requires high complexity; vice versa, low complexity will result in a low compression rate. It's that simple.
XBTC / BCPack, just like BC7Prep, is not a general-purpose lossless data compression algorithm - it's a lossy texture compression format similar to the S3TC (BC/BCH),
so that's strictly limited to specific resource formats - unlike the Oodle Kraken algorithm licensed for PlayStation 5, which is an LZMA-family decoder that runs solely on the CPU and only goes up to several megabytes per second.
These numbers come from the botched RTX IO slides, but 14 Gbyte/s with a 2:1 compression ratio using only 32 compute cores (<1% of the total) just doesn't add up for a lossless data compression algorithm working on binary (non-text) data, like textures, normal maps, and geometry/meshes.
"But BC7Prep is lossless"
It's a lossless transform for BC7 compressed textures, and BCn algorithms are lossy. Have you read the links for Oodle Texture tools posted above by @BRiT?
"I've not seen any confirmation as to whether BCPACK is lossy or not"
BCPack is a block texture compression used by the Xbox Texture Compressor (XBTC) tool - what's the point of making the compression algorithm lossless when it needs to produce textures in lossy DXTn/BCn compression formats for the hardware TMUs to consume?
"I think it works on all BCn formats, which make up the vast majority of a modern game's texture set as far as I'm aware. Non-texture data (around 20% of total) is handled by zlib in the XSX decompression unit, I believe."
I'd guess BCn textures would be further processed by LZ compression. For example, in Oodle tools, the original RGB texture resources are first encoded by the lossy BCn compression tools, which use RDO (rate-distortion optimisation) metrics to trade off some image quality for improved compression ratios. Then the entire game assets are further compressed by LZ-family lossless data compression (Oodle Kraken/Mermaid/Selkie), and specifically BC7-compressed resources can be additionally rearranged by BC7Prep to improve LZ-family compression ratios.
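To make the "rearrange first, then LZ" step a bit more concrete, here is a minimal sketch in Python. It is not BC7Prep and does not touch real BC7 data: the two-byte "blocks", the field layout, and the names `make_blocks`, `split_fields` and `merge_fields` are all made up for illustration, and zlib stands in for Kraken. It only shows the general idea that a lossless regrouping of block fields can make the same bytes more compressible for an LZ-family coder.

```python
import random
import zlib

BLOCK_COUNT = 4096  # hypothetical number of toy "texture blocks"

def make_blocks(count: int, seed: int = 0) -> bytes:
    """Toy blocks: one noisy byte followed by one predictable byte."""
    rng = random.Random(seed)
    out = bytearray()
    for _ in range(count):
        out.append(rng.randrange(256))  # stands in for per-block index bits
        out.append(0)                   # stands in for a repeated mode/header field
    return bytes(out)

def split_fields(data: bytes) -> bytes:
    """Lossless rearrangement: all noisy bytes first, then all predictable bytes."""
    return data[0::2] + data[1::2]

def merge_fields(data: bytes) -> bytes:
    """Inverse of split_fields: re-interleave the two halves."""
    half = len(data) // 2
    out = bytearray(len(data))
    out[0::2] = data[:half]
    out[1::2] = data[half:]
    return bytes(out)

blocks = make_blocks(BLOCK_COUNT)
rearranged = split_fields(blocks)

# Same LZ-family coder (zlib, level 9) applied with and without the transform.
plain_size = len(zlib.compress(blocks, 9))
prepped_size = len(zlib.compress(rearranged, 9))

# Round trip: decompress, undo the rearrangement, compare with the input.
restored = merge_fields(zlib.decompress(zlib.compress(rearranged, 9)))

print(f"raw: {len(blocks)} bytes")
print(f"zlib only: {plain_size} bytes")
print(f"rearranged + zlib: {prepped_size} bytes")
print(f"lossless round trip: {restored == blocks}")
```

The transform itself compresses nothing (the rearranged buffer is the same size as the input); any gain only shows up after the LZ pass, which is the sense in which such a transform is "not a compression algorithm" on its own.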
"Nvidia has been explicit about the 2:1 compression ratio, which would result in 14GB/s on the fastest PCIe 4.0 drives"
Again, without giving any details about the compression algorithm(s) used. So far only the names of the BCPack block texture compression algorithm and the Xbox Texture Compressor tool were disclosed, but it's not known how DirectStorage handles LZ format decoding on the PC - whereas we know that Sony licensed the entire Oodle Texture Compression (RDO/BC7Prep) and Oodle Data Compression (Kraken) toolset for the PlayStation 5 developer kit, and included a hardware Kraken decompressor chip in the disk I/O data path.
"Where do the 32 compute cores come from?"
Just an approximation of <1% of GPU compute resources.
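For anyone who wants to check the arithmetic being argued over, here is a quick back-of-envelope sketch in Python. It assumes a 7 GB/s raw read rate for the fastest PCIe 4.0 drives (which is what the 14 GB/s at 2:1 claim implies) and takes the 32-core figure at face value, even though it is only an approximation of <1% of the GPU. The function names are made up for illustration.

```python
def effective_output_gbps(raw_read_gbps: float, compression_ratio: float) -> float:
    """Decompressed output rate when the drive is the bottleneck."""
    return raw_read_gbps * compression_ratio

def per_core_gbps(output_gbps: float, cores: int) -> float:
    """Output each core would have to sustain if the work is split evenly."""
    return output_gbps / cores

RAW_READ_GBPS = 7.0   # assumed raw read speed of a top PCIe 4.0 SSD
RATIO = 2.0           # the 2:1 compression ratio quoted in the thread
CORES = 32            # the thread's rough stand-in for <1% of the GPU

output = effective_output_gbps(RAW_READ_GBPS, RATIO)
per_core = per_core_gbps(output, CORES)

print(f"decompressed output: {output:.1f} GB/s")       # 14.0 GB/s
print(f"per-core share: {per_core * 1000:.1f} MB/s")   # 437.5 MB/s
```

Whether roughly 440 MB/s per core is plausible for a lossless decoder is exactly the point of disagreement here; the sketch only makes the numbers explicit.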
"Yes, so as I said, the BC7Prep part is lossless."
My point was that BC7Prep performance at 100+ Gbyte/s is not relevant for an assessment of RTX IO, because BC7Prep is not a compression algorithm at all.
"The fact that it's working to further compress an already lossy compressed texture isn't relevant as ... Kraken would also further losslessly compress a lossy compressed BC7 texture."
But these three technologies (RDO-aware BCn lossy texture compression, BC7Prep transform, and LZ-family lossless data compression) work in accord with each other - RDO texture compression is aware of both image quality and compression ratio metrics, so it can choose a color encoding that makes the resulting BCn texture more compressible by the LZ-family algorithms, while BC7Prep rearranges the BC7 data format to additionally increase the compression ratio of the LZ pass.
Just like Kraken, BCPACK is further compressing already lossy compressed BCn textures to reduce their size.
BCPACK only works on BCn textures whereas Kraken works on all file types, while BCPACK gets a much higher compression rate.
It's not used instead of BCn, it's used on top of it.
"It'll certainly be interesting to learn what's going on here when the information is made public."
My point exactly. We know how LZ/DEFLATE compression works on the consoles - i.e. using a hardware decompressor in the I/O controller - but the PC/Windows implementation is still a mystery.
Does RTXIO support general compression (LZ family) routines on the GPU for decompression of all data types, or is it purely for texture data only (perhaps using BCPACK)? And if RTXIO does support general compression routines, does it decompress everything streamed from the SSD and pass the relevant data sets back to system memory for the CPU? Or does everything go via the CPU first, where the GPU data is separated out and sent on before decompression?
"Yes, it was the topic of quite a detailed discussion in another thread some time ago"
Ah. Sorry, missed that discussion entirely.
Samsung finally released their PCIe Gen4 NVMe SSD: 7 GB/s read, 5 GB/s write (2 GB/s write on TLC), $229 + tax for 1TB. Not too bad. I know this is the wrong thread, but by the time I run out of disk space on the PS5, the SSD upgrade will not be too bad, assuming this Samsung drive or something cheaper works.
This drive in the 2TB size would be decent enough for my next PC build. I still have dreams of Optane as a boot/apps drive, but maybe that is nothing but fun with specs.
https://www.anandtech.com/show/16087/the-samsung-980-pro-pcie-4-ssd-review
I wouldn't hold your breath on Samsung Pro drives getting significantly cheaper before their replacement comes out. Samsung's Pro drives tend to stay at a high premium throughout their life with little to no decrease in price.
That may possibly change this generation as they've moved to TLC versus MLC on their Pro drives and people aren't happy about it. But I doubt it will change much as their EVO drives were still popular using TLC and those didn't really drop much in price either.
Regards,
SB
If my Google mojo didn't turn out all sour, it looks like the 970 Pro 1TB MSRP was $449. The current selling price on Newegg is $313. If the 980 Pro follows a similar path, it's going to get a bit cheaper but not necessarily cheap.
"Just an update. 1 month ago I disabled my virtual memory (have 32GB); so far I have had zero problems"
Just an update: I was installing some games and the setup.exe on some of them would just exit with no error. I did some troubleshooting (clearing the temp folder, compatibility modes, run as admin, disabling antivirus, etc.) and nothing worked. Then I remembered I had disabled my swap file, so I set it to system managed and everything worked fine.