Digital Foundry Article Technical Discussion [2021]

Status
Not open for further replies.
DF Article @ https://www.eurogamer.net/articles/...ysis-remastered-next-gen-console-patch-tested

Crysis Remastered gets upgraded for Xbox Series X/S and PlayStation 5
DF gets an early look at the Xbox patch with PS5 to follow.

Crysis Remastered is being patched with support for the new wave of consoles and Digital Foundry managed to get access to Xbox Series X and Series S builds of the upgrade ahead of launch. Owing to the way 'back-compat plus' titles are distributed, we can't check out the PlayStation 5 build until it launches but thankfully, the wait there shouldn't be too long: we were informed today that the patch is out now - and we'll update this article with PS5 impressions as soon as we are able.

Similar to the recently released 2.1 update for the PC version, there are plenty of additions, tweaks and improvements to the game beyond the support for the new consoles. Taking pride of place in the list of upgrades is the inclusion of the Ascension level, a stage so taxing that it was previously removed entirely from all console versions of the game. Also welcome is the inclusion of Nanosuit mode swapping more in line with the PC original (this may have arrived in a prior update, but certainly wasn't present at launch - regardless, it's a nice feature and works well).

However, there is the sense that we're still missing some features found in the 2007 game, removed for the Xbox 360 and PlayStation 3 versions and which still haven't been restored for Crysis Remastered. The granularity of destruction found in the original still hasn't been patched, volumetrics still aren't on par with the PC original and this effect is completely missing in the resurrected Ascension stage, even though it is present on the PC version of Crysis Remastered. Other OG Crysis features are also pared back or missing: vegetation animation still runs at a lower update speed than the rest of the game, while explosions still don't have any impact on foliage.


...

 
I wasn't talking about a specific game, just general performance differences between zlib and Kraken based on publicly available information. Does anyone know the performance of the compression hardware in PS4/Xbox One? What about real-world IO performance from their HDDs? There could be scenarios where a software decompression system is faster in real-world terms if the compression ratio is better, because the IO is so limited.
I don't believe Sony or Microsoft ever published specifications for their custom zlib hardware. Mark Cerny said it works "on the fly", which suggests it decompresses as quickly as data is pulled off the HDD. The hardware in both PS4 and XBO would have to be utter shite to be outpaced by a Jaguar APU software-based solution. :yep2:
 
I don't believe Sony or Microsoft ever published specifications for their custom zlib hardware. Mark Cerny said it works "on the fly", which suggests it decompresses as quickly as data is pulled off the HDD. The hardware in both PS4 and XBO would have to be utter shite to be outpaced by a Jaguar APU software-based solution. :yep2:

So on-the-fly means input at a rate of at most 120 MB/s, more likely around 60 MB/s? Multiply that by whatever compression factor to get your output rate.
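
That multiplication can be sketched in a few lines of Python. The 60 and 120 MB/s input rates come from the post above; the compression ratios are assumptions picked purely for illustration, not measured figures for any console or title.

```python
def output_rate_mb_s(input_rate_mb_s: float, compression_ratio: float) -> float:
    """Decompressed MB/s produced when compressed data arrives at input_rate_mb_s.

    compression_ratio is uncompressed size divided by compressed size.
    """
    if input_rate_mb_s < 0 or compression_ratio <= 0:
        raise ValueError("rate must be >= 0 and ratio must be > 0")
    return input_rate_mb_s * compression_ratio


# HDD input rates from the post; the ratios are hypothetical.
for hdd_rate in (60.0, 120.0):
    for ratio in (1.5, 2.0):
        rate = output_rate_mb_s(hdd_rate, ratio)
        print(f"{hdd_rate:.0f} MB/s in, ratio {ratio}: {rate:.0f} MB/s out")
```

This only models a decoder that keeps pace with the drive; a decoder slower than the drive would cap the input rate instead.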
 
I don't believe Sony or Microsoft ever published specifications for their custom zlib hardware. Mark Cerny said it works "on the fly", which suggests it decompresses as quickly as data is pulled off the HDD. The hardware in both PS4 and XBO would have to be utter shite to be outpaced by a Jaguar APU software-based solution. :yep2:
I would've thought it's not that simple.

Maybe for the way it's being compressed it's not as efficient using the hardware.
Maybe dictionary is bigger than hardware supports for the way Kraken is using it.
 
I don't believe Sony or Microsoft ever published specifications for their custom zlib hardware. Mark Cerny said it works "on the fly", which suggests it decompresses as quickly as data is pulled off the HDD. The hardware in both PS4 and XBO would have to be utter shite to be outpaced by a Jaguar APU software-based solution. :yep2:

MS may have not published it. But I believe details were published by VGleaks.

Edit:

Here u go.

https://vgleaks.com/world-exclusive-durangos-move-engines/


Generic lossless compression and decompression
One move engine out of the four supports generic lossless encoding and one move engine supports generic lossless decoding. These operations act as extensions on top of the standard DMA modes. For instance, a title may decode from main RAM directly into a sub-rectangle of a tiled texture in ESRAM.

The canonical use for the LZ decoder is decompression (or transcoding) of data loaded from off-chip from, for instance, the hard drive or the network. The canonical use for the LZ encoder is compression of data destined for off-chip. Conceivably, LZ compression might also be appropriate for data that will remain in RAM but may not be used again for many frames—for instance, low latency audio clips.

The codec employed by the move engines is LZ77, the 1977 version of the Lempel-Ziv (LZ) algorithm for lossless compression. This codec is the same one used in zlib, glib and other standard libraries. The specific standard that the encoder and decoder adhere to is known as RFC1951. In other words, the encoder generates a compliant bit stream according to this standard, and the decoder can decompress certain compliant bit streams, and in particular, any bit stream generated by the encoder.

LZ compression involves a sliding window and operates in blocks. The window represents the history available to pattern-match against. A block denotes a self-contained unit, which can be decoded independently of the rest of the stream. The window size and block size are parameters of the encoder. Larger window and block sizes imply better compression ratios, while smaller sizes require less calculation and working memory. The Durango hardware encoder and decoder can support block sizes up to 4 MB. The encoder uses a window size of 1 KB, and the decoder uses a window size of 4 KB. These facts impose a constraint on offline compressors. In order for the hardware decoder to interpret a compressed bit stream, that bit stream must have been created with a window size no larger than 4 KB and a block size no larger than 4 MB. When compression ratio is more important than performance, developers may instead choose to use a larger window size and decode in software.
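
The window constraint described above can be tried out with stock zlib. This is a minimal sketch, assuming Python's standard `zlib` module as a stand-in for an offline compressor: `wbits=-12` requests a raw RFC 1951 stream with a 4 KB window (the decoder limit quoted above), and `wbits=-15` requests the usual 32 KB window. The sample data is made up, and the 4 MB block limit is not modelled.

```python
import random
import zlib


def deflate_raw(data: bytes, window_bits: int) -> bytes:
    """Compress to a raw DEFLATE (RFC 1951) stream with a 2**window_bits byte window."""
    # A negative wbits value tells zlib to emit no zlib/gzip header or trailer.
    comp = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=-window_bits)
    return comp.compress(data) + comp.flush()


def inflate_raw(stream: bytes, window_bits: int) -> bytes:
    """Decompress a raw DEFLATE stream using a 2**window_bits byte window."""
    decomp = zlib.decompressobj(wbits=-window_bits)
    return decomp.decompress(stream) + decomp.flush()


# Made-up data: an 8 KB random chunk repeated four times, so the only
# redundancy sits 8 KB back, beyond the reach of a 4 KB window.
rng = random.Random(0)
chunk = bytes(rng.getrandbits(8) for _ in range(8192))
data = chunk * 4

small_window = deflate_raw(data, 12)   # 4 KB window
large_window = deflate_raw(data, 15)   # 32 KB window

print(f"raw: {len(data)} bytes")
print(f"4 KB window: {len(small_window)} bytes")
print(f"32 KB window: {len(large_window)} bytes")
```

The larger window finds the repeats and the smaller one cannot, which is the trade-off the text describes: a better ratio in exchange for giving up a decoder limited to a 4 KB window.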

The LZ decoder supports a raw throughput of 200 MB/s compressed data. The LZ encoder is designed to support a throughput of 150-200 MB/s for typical texture content. The actual throughput will vary depending on the nature of the data.



Read more https://vgleaks.com/world-exclusive-durangos-move-engines/
 
So, still drops to <60. I don't really think this is a problem of DRS, more a "single-thread-old-gameengine" problem.
 
https://vgleaks.com/world-exclusive-durangos-move-engines/
[SNIP]

These facts impose a constraint on offline compressors. In order for the hardware decoder to interpret a compressed bit stream, that bit stream must have been created with a window size no larger than 4 KB and a block size no larger than 4 MB. When compression ratio is more important than performance, developers may instead choose to use a larger window size and decode in software.
Sounds about right. Plus, a more modern format like Kraken might offer better performance and a better ratio. Or, cross platform compatibility if you are shipping on PC, mobile, or somewhere else.
 
I would've thought it's not that simple. Maybe for the way it's being compressed it's not as efficient using the hardware. Maybe dictionary is bigger than hardware supports for the way Kraken is using it.

LZ compression was designed to have one decompression method; it's only the method of compression that varies. So how the data is compressed does not matter; it's not like variable compression or encoding technologies where the complexity of compression impacts decompression. This is why LZ and zlib are so popular. You can be sure that zlib decompression hardware built now (or ten years ago) will work forever and benefit from improved compression techniques.
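
The "one decoder, many encoders" point can be illustrated with Python's standard `zlib` module. This is a sketch with an arbitrary sample payload: the encoder settings differ, the decoder call does not. It only holds while the encoder stays inside the decoder's limits, such as the window size discussed in the Durango post.

```python
import zlib

# Arbitrary, repetitive sample payload (an assumption for illustration).
payload = b"explosions still don't have any impact on foliage. " * 200

# Different encoder effort levels and strategies give different bit streams...
streams = {
    "level 1": zlib.compress(payload, 1),
    "level 9": zlib.compress(payload, 9),
}
filtered = zlib.compressobj(level=9, strategy=zlib.Z_FILTERED)
streams["level 9, Z_FILTERED"] = filtered.compress(payload) + filtered.flush()

# ...but the same decoder call handles every one of them.
for name, stream in streams.items():
    restored = zlib.decompress(stream)
    print(f"{name}: {len(stream)} bytes, round trip ok: {restored == payload}")
```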
 
Logically, why would RAD advertise Oodle for PS4/XBO if it was not competitive with the zlib HW in those devices? PC has a different SKU, so there's no problem using Oodle there and zlib on last-gen consoles if the performance indicated it. It's just that at build time you need to define what to use, and then call different decompression methods.
 
Logically, why would RAD advertise Oodle for PS4/XBO if it was not competitive with the zlib HW in those devices? PC has a different SKU, so there's no problem using Oodle there and zlib on last-gen consoles if the performance indicated it. It's just that at build time you need to define what to use, and then call different decompression methods.
Oodle has nothing to do with zlib. Currently on PS4 and Xbox all data is compressed by zlib, which is ideal because both those machines have hardware zlib decompressors that can work on the fly, albeit at a rather reduced pace because they were first designed to decompress data from the disc.

Cerny said:
To further help the Blu-ray along, the system also has a unit to support zlib decompression -- so developers can confidently compress all of their game data and know the system will decode it on the fly. "As a minimum, our vision is that our games are zlib compressed on media," said Cerny.

https://www.gamasutra.com/view/feature/191007/inside_the_playstation_4_with_mark_.php?page=3
 
LZ compression was designed to have one decompression method; it's only the method of compression that varies. So how the data is compressed does not matter; it's not like variable compression or encoding technologies where the complexity of compression impacts decompression. This is why LZ and zlib are so popular. You can be sure that zlib decompression hardware built now (or ten years ago) will work forever and benefit from improved compression techniques.
As mentioned in @dobwal's post (https://forum.beyond3d.com/posts/2198819/):

I wasn't talking about compatibility but efficiency.
In that post you can see the block limit (I referred to it as dictionary size) and also the throughput limit.
Both things that may make using Kraken on cpu a better solution.
 
As mentioned in @dobwal's post Digital Foundry Article Technical Discussion [2021]

I wasn't talking about compatibility but efficiency.
In that post you can see the block limit (I referred to it as dictionary size) and also the throughput limit.
Both things that may make using Kraken on cpu a better solution.

Using compression/decompression on non-fixed-function hardware is probably the way forward. Yes, inefficient decompression is an order of magnitude slower, but newer decompression tech could be (and is) much more flexible and doesn't need fixed-function hardware blocks.
GPUs can be used for decompression as well; it's much faster, more flexible and doesn't require additional die area.
 
Using compression/decompression on non-fixed-function hardware is probably the way forward. Yes, inefficient decompression is an order of magnitude slower, but newer decompression tech could be (and is) much more flexible and doesn't need fixed-function hardware blocks.
GPUs can be used for decompression as well; it's much faster, more flexible and doesn't require additional die area.

I would like to see decompression in AMD's next-gen IO die. Then forward the uncompressed data over the PCIe 5 bus to either the GPU or main RAM depending on the use case. Should be highly efficient, flexible and low power. Ideally make the decompression block also support zip and other formats and hook it up so that application installs etc. can also benefit from HW decompression. It frustrates me to no end to see those progress bars when installing (decompressing) apps/updates. Microsoft could integrate this into Windows so any app using the built-in Windows decompression would hit the fast path if decompression HW is available.
 
Oodle has nothing to do with zlib. Currently on PS4 and Xbox all data is compressed by zlib, which is ideal because both those machines have hardware zlib decompressors that can work on the fly, albeit at a rather reduced pace because they were first designed to decompress data from the disc.



https://www.gamasutra.com/view/feature/191007/inside_the_playstation_4_with_mark_.php?page=3

From Oodle site http://www.radgametools.com/oodlekraken.htm
Kraken is designed to run at blazing speeds on modern CPUs. It's great on the AMD Jaguar chip in the PS4 and Xbox One, which is a platform most compressors struggle on. Kraken achieves its amazing performance from new ideas on how to do LZ compression, and carefully optimized low level routines for x86, x64, Jaguar and ARM.

I read that as RAD want you to use Oodle Kraken instead of zlib and that it will be competitive, probably/maybe even better than using the zlib HW on the PS4/XBO.
Why would they highlight AMD Jaguar chip in PS4 and XBO if it was "worse" than the builtin stuff?


Going on, if we assume that we are looking at it from a total cost pov, and with even more assumptions.

1. With Kraken you create smaller files
2. With Kraken on the CPU you decode a file quicker than LZ on the CPU if the files are the same size, but HW LZ is quicker per 1MB by some ratio.

Now say the Kraken file is 1MB and the LZ file is 2MB due to #1.
And you decode 1MB in 8s with Kraken on the CPU and in 6s on the LZ HW (silly numbers, but easy to follow).
Then Kraken takes 8s in total while the LZ HW takes 12s, so even if the CPU is slower per MB on PS4/XBO, the overall win still goes to Kraken.
Of course you need to factor in other resources like CPU time (multi-threaded), bandwidth etc., but Kraken on CPU can then be better than LZ HW.
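
The arithmetic above can be sketched as a tiny cost model in Python. It uses the post's own deliberately silly numbers; the optional read-rate term and the assumption that reading and decoding do not overlap are additions for illustration, not something the post states.

```python
from typing import Optional


def total_load_time_s(file_mb: float,
                      decode_s_per_mb: float,
                      read_mb_per_s: Optional[float] = None) -> float:
    """Seconds to read (optional) and then decode a compressed file.

    Assumes reading and decoding happen one after the other.
    """
    decode_s = file_mb * decode_s_per_mb
    read_s = file_mb / read_mb_per_s if read_mb_per_s else 0.0
    return read_s + decode_s


# Numbers from the post: Kraken file 1 MB at 8 s/MB on the CPU,
# LZ file 2 MB at 6 s/MB on the hardware decoder.
kraken_cpu_s = total_load_time_s(file_mb=1.0, decode_s_per_mb=8.0)
lz_hw_s = total_load_time_s(file_mb=2.0, decode_s_per_mb=6.0)

print(f"Kraken on CPU: {kraken_cpu_s:.0f} s")
print(f"LZ on HW:      {lz_hw_s:.0f} s")
```

Passing a `read_mb_per_s` value adds the HDD transfer time, which also favours the smaller file.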
 
I wasn't talking about compatibility but efficiency. In that post you can see the block limit (I referred to it as dictionary size) and also the throughput limit. Both things that may make using Kraken on cpu a better solution.

I'm lost. Efficiency of zlib PS4/XBO hardware compressor vs CPU, or zlib decompressor (hardware/CPU) vs software kraken on PS4/XBO? Or something else? :runaway:

I can well believe software Kraken decompression is faster in the round than software LZ decompression because LZ is ancient. For previous-gen consoles I think the key comparison is software Kraken decompression versus hardware LZ decompression. And even if software Kraken is faster, is there sufficient CPU headroom to shift from hardware LZ decompression to software Kraken decompression?

And to be clear, I don't know the answer to any of these questions! :nope:
 