Next-Generation NVMe SSD and I/O Technology [PC, PS5, XBSX|S]

SSDs were too expensive in 2013. Compression should have been in 2013 though. That would have been a large benefit.

Yes, like I mentioned, it was just too costly; cost has held SSD tech back for too long. At least the 2013 consoles got quite a lot of RAM for the time.
 
PS4 definitely has hardware zlib decompression and I'm pretty sure Xbox One does too.
 
I don't think Sony clarified where the decompression hardware was architecturally located, Mark Cerny just said "special hardware".
Hmmm.
To further help the Blu-ray along, the system also has a unit to support zlib decompression -- so developers can confidently compress all of their game data and know the system will decode it on the fly. "As a minimum, our vision is that our games are zlib compressed on media," said Cerny.
Were you referring to the above quote?

I think it's still the DMA engine.
 
Hmm yea I wonder if LZ and zlib are foundationally the same.

According to their blog, Oodle's Kraken running on Jaguar is still much faster than using the hardware compression units on both consoles.
I could be wrong in my reading of the article though.

http://cbloomrants.blogspot.com/2018/02/oodle-260-leviathan-performance-on-ps4.html
And on the Sony PS4 (clang x64 AMD Jaguar 1.6 GHz) :
Oodle 2.6.0 -z8 :

Leviathan : 2.780 to 1 : 271.53 MB/s
Kraken : 2.655 to 1 : 342.49 MB/s
Mermaid : 2.437 to 1 : 669.34 MB/s
Selkie : 1.904 to 1 :1229.26 MB/s

non-Oodle reference (2016) :

brotli-11 : 2.512 to 1 : 77.84 MB/s
miniz : 1.883 to 1 : 85.65 MB/s
brotli-9 : 2.358 to 1 : 95.36 MB/s
zlib-ng : 1.877 to 1 : 109.30 MB/s
zstd : 2.374 to 1 : 133.50 MB/s
lz4hc-safe : 1.669 to 1 : 673.62 MB/s
LZSSE8 : 1.626 to 1 : 767.11 MB/s


The Microsoft XBox One has similar performance to the PS4. Mermaid & Selkie can decode faster than the hardware DMA compression engine in the PS4 and Xbox One, and usually compress more if they aren't limited to small chunks like the hardware DMA engine needs.
 
It may be faster, but it could come out worse overall if the Jaguar cores can't do anything else during that time, whereas the DMA engine runs in parallel. So the built-in DMA engines are essentially "free". It's really difficult to compete with free.
 
So zlib is around 1.8:1, and Kraken goes further, to around 2.5:1 or so.
Is the major difference here how much faster it can decode compared with zlib, making super-high throughput possible, whereas zlib is somewhat limited in decompression speed?
 
SSDs were too expensive in 2013. Compression should have been in 2013 though. That would have been a large benefit.

Compression was available. Both the One and PS4 (I remember someone mentioning it) have lossless decompression (LZ based). The One had the hardware in one of its DMEs.

However, when you are talking about 20-50 MB/s off the HDD, it would have had limited impact. Decompression plays a bigger role now because the SSD offers far more bandwidth.
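To put rough numbers on that, here is a minimal sketch of a two-stage pipeline where the drive feeds compressed data and the decoder expands it, so the slower stage sets the pace. The function name `delivered_rate` is made up for illustration, the zlib-ng figures are taken from the benchmark quoted earlier in the thread, and it assumes those decode speeds are measured in decompressed output bytes, which the quote does not spell out.

```python
def delivered_rate(drive_mb_s, ratio, decoder_mb_s):
    """Uncompressed MB/s reaching the game.

    The drive delivers compressed bytes, so its contribution is scaled by the
    compression ratio; the decoder caps the result at its own output speed.
    (Hypothetical helper, not from any console SDK.)
    """
    return min(drive_mb_s * ratio, decoder_mb_s)

# zlib-ng on Jaguar from the quoted benchmark: 1.877 to 1, 109.30 MB/s decode.
ZLIB_RATIO = 1.877
ZLIB_DECODE_MB_S = 109.30

# HDD range mentioned above, then the 5.5 GB/s raw SSD figure from this thread.
for drive in (20, 50, 5500):
    rate = delivered_rate(drive, ZLIB_RATIO, ZLIB_DECODE_MB_S)
    print(f"{drive} MB/s drive -> {rate:.2f} MB/s delivered")
```

Under these assumptions the HDD cases stay drive-bound, so compression only adds a few tens of MB/s, while at SSD speeds the decoder becomes the bottleneck.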
 

Compression ratios for zlib aren’t etched in stone. Ratios can change and depend on different factors. FPGA and ASIC hardware for zlib compression is a common area of research, and greater compression ratios in that setting often come with greater complexity, resulting in more silicon area.
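A quick way to see this for yourself is a small sketch using Python's standard `zlib` module. The two inputs are made up for illustration (repetitive text-like data and pseudo-random noise), not game assets, so the exact ratios say nothing about real textures; the point is only that the ratio moves with both the data and the compression level.

```python
import random
import zlib

def ratio(data, level):
    """Compression ratio (original size / compressed size) at one zlib level."""
    packed = zlib.compress(data, level)
    assert zlib.decompress(packed) == data  # lossless round trip
    return len(data) / len(packed)

rng = random.Random(0)  # fixed seed so runs are repeatable

# Hypothetical sample inputs: highly repetitive words vs. incompressible noise.
words = [b"tree ", b"rock ", b"grass ", b"water "]
texty = b"".join(rng.choice(words) for _ in range(20000))
noise = bytes(rng.getrandbits(8) for _ in range(100000))

for name, data in (("texty", texty), ("noise", noise)):
    for level in (1, 6, 9):
        print(f"{name} level {level}: {ratio(data, level):.3f} to 1")
```

The noise input should come out at roughly 1:1 at every level, which is the same situation described elsewhere in the thread for data that is already compressed.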
 
Right. Neither are Kraken’s, for that matter. You can compress more at the cost of speed.

I will say that the discussion around compression seems to have gone too far into marketing numbers here. It seems to revolve around how fast and how much, using the largest numbers to represent real-world performance. The real question is what you are compressing.

As I understand it, modern games all use BC7 texture compression, and BC7 is very difficult to compress further; even Oodle Texture bc7prep with Kraken can only get 5-15% more compression on BC7. That’s not really a lot, and it takes one additional step to decode the bc7prep.

So it’s a question of how many developers are comfortable with lossy RDO textures vs lossless. When I look at the modern landscape of games with quality modes, performance modes and photo modes, to me it makes sense to still stay lossless. It takes more space to duplicate the texture, and if you use bc7prep you are spending a compute shader to decode the texture after retrieval; so to me the raw throughput of 5.5GB/s is actually the most important number here. I see 9 and 11, but what are the chances developers are willing to lose that texture quality? I.e. UE5 was likely using lossless textures, I assume, at least for the landscape.

It seems like if you want to make a graphical tour de force, you’re going to stick very close to lossless for the things that matter; you’ll optimize and use RDO Oodle for fast and heavy lossy compression on lower-quality, fewer-channel textures like normals. Heck, IIRC no normal maps for UE5. You’ll still need BC6H if you want HDR; not sure how well these do with Kraken either. But it just seems like the discussion has focused on how high compression can go.
 

The 8-9GB/s figure Sony originally stated is apparently using Kraken alone, not Oodle Texture. Kraken alone is lossless so there's no reason not to use it. They're stating a real world average compression ratio of ~1.5:1 for Kraken alone which gets them to 8-9GB/s. I've seen examples of Kraken compression rates on actual textures which make that sound at least feasible so I'd probably take it at face value (although examples can vary widely in results).

To get to 11GB/s they have to use Oodle Texture + Kraken, and Oodle Texture is lossy so your points above come into play there.

As yet we don't know whether BCPACK is lossless or lossy, so a comparison to what MS is doing in that regard is difficult. And we don't even know what compression scheme RTX IO uses, or even if it's restricted to a specific one at all.
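For reference, the arithmetic behind those figures can be sketched as below, assuming the 5.5GB/s raw read rate mentioned earlier in the thread and treating effective throughput as simply raw rate times average compression ratio. The helper names are made up for illustration.

```python
RAW_GB_S = 5.5  # raw SSD read rate quoted in this thread

def effective_rate(raw_gb_s, ratio):
    """Uncompressed GB/s delivered for a given average compression ratio."""
    return raw_gb_s * ratio

def required_ratio(raw_gb_s, target_gb_s):
    """Average compression ratio needed to hit a target effective rate."""
    return target_gb_s / raw_gb_s

print(effective_rate(RAW_GB_S, 1.5))   # Kraken alone at ~1.5:1 -> 8.25
print(required_ratio(RAW_GB_S, 11.0))  # ratio implied by 11 GB/s -> 2.0
```

So ~1.5:1 lands inside the 8-9GB/s range, and 11GB/s implies an average of about 2:1 across whatever is being streamed.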
 
Kraken is nearly 1:1 for compressing BC7 textures. So once again, you're looking at best-case results here. It basically can't compress BC7 any further without Oodle Texture, and Oodle Texture can't really compress BC7 (bc7prep) either without a compute shader decode step if you want to keep things lossless, and even then it's only 5-15%.

I have the feeling that raw throughput is the winner here for PS5. Compression numbers are great, but the heavy lifting will be all the things that can't be compressed. And I suspect developers will be willing to trade off compression/hard drive space in order to keep that quality high and just rely on the fast streaming speed of the SSD.

I've no clue what BCPack is; there is very little to no information. I'm not sure if it's a separate compression mechanism like Kraken/zlib, or an encoding like Oodle Texture is, as in a BC8 if there were such a naming convention.
 
Another upside of better compression, in addition to read performance, is installation size. Better compression allows far more reasonable installation sizes. The Miles Morales PS4 version is 52GB and the PS5 version is 50GB. The PS5 install is smaller despite better assets thanks to compression and less duplication of assets. Having the best possible asset in RAM to use should not be a problem with that type of install size and I/O speed. Gone are the days of pop-in and blurry textures suddenly becoming sharper? Or so one could hope.
 
Pop-in is not necessarily a function of not having enough I/O. It can be, but I/O is not entirely responsible for it. Fixing blurry textures etc., all of those things, costs bandwidth. I.e. if you only had 76GB/s of memory bandwidth but an SSD of 15GB/s, you wouldn't resolve the pop-in/blurry textures. You just don't have the bandwidth, or possibly the compute, to do everything you need to on all the textures.
 

You can see streaming fall behind in some games. You will see lower-resolution mipmap textures, and higher detail levels gradually load as streaming catches up when you stop moving. This happens, for example, in GTA V if you move very fast through the world.

Edit: I'm not saying faster streaming will solve all issues, but it will solve some.
 
Indeed, but that can also be attributed to mip-map selection algorithms.
 