Current Generation Games Analysis Technical Discussion [2022] [XBSX|S, PS5, PC]

Their subsequent PC versions won't disappoint either ;)
DirectStorage is going to have a very positive effect on future Sony games on PC I believe.

Jurjen Katsman: I am not certain if decompression is actually the bottleneck for load times as we did do some things to make decompression go faster, but we backed out on some of those that were hurting the in-game loading (streaming). As in-game (as in streaming while in game, moving around) with those decompression speed-ups in place, we were taking too much CPU away from the game. So we backed that away, and it didn't really meaningfully impact loading screens.
The upfront loading is already quite fast in the Spider-Man games, and I'm sure DirectStorage will cut that down even more, thanks to the I/O stack improvements and files remaining compressed across PCIe.

But it's the in-game streaming where I feel like it's going to make a massive difference. Even in that quote they talk about how increasing the decompression rate was taking too many resources away from the CPU, causing performance issues.

Spider-Man and Miles Morales may be last gen PS4-based games... but they are still taking advantage of the I/O stack and hardware inside the PS5.
 

Ratchet and Clank: Rift Apart goes further than this, and the I/O bottleneck is on the CPU side. As I said before, the CPU side of Insomniac's engine architecture can be improved a lot: they are bottlenecked by a single thread, here the main thread, during entity initialization in a level. I hope we will see a big jump with Spider-Man 2 on this side.

The uncompressed data speed is lower than the SSD speed, 5 GB/s. For the Demon's Souls remake it is 3 to 4 GB/s. The bottleneck is somewhere else for the moment.

EDIT: Devs only see the uncompressed data speed in the PS5 performance analyser; an Insomniac dev said so when talking about Ratchet and Clank.
 

That's typically how things go. New bottlenecks only reveal themselves as you begin to remove old ones, and in the end you just make changes and compromises where you can/have to. I bet it's going to be stunning though, if the teaser trailer is any indication.

How does Demon's Souls stream during gameplay? Is it massive, chunk based level loads which happen only a few times per area, or is it more granular than that? Because with those figures, I'm assuming they're from upfront loads between areas. Streaming in-game is probably far lower. Maybe 1GB/s or so.
 
Who'd have thought SSD hype discussions would still emerge. Yeah, maybe things were kinda dry due to cross-gen or whatever; however, there are games like Rift Apart that supposedly take full advantage of the I/O and NVMe tech, according to the game's developers. I have no doubt that no PS5 game, be it Rift Apart or a future title, will be a problem for a modern PC system. On the contrary, you can probably scale things up in that regard if developers want to.
 

They are from upfront loads. Average streaming speed is probably lower than 1 GB/s.
 
Again, talk to me about performance when we see a real game. I don't care about demos. The advantage of consoles is that they don't need to use any CPU or GPU power for I/O (just a little bit on the Xbox Series side).

And they talk about using an ASIC* to do the decompression on PC in the future too. ;)

* A hardware decompressor

The advantage of a hardware ASIC, as I see it, would be that if it were part of the CPU I/O die, it could handle decompression of CPU-destined data as well as GPU data. That's cool and all, but such a solution would also use nearly double the PCIe bandwidth to the GPU compared with the current solution. As you imply, it would also eliminate any decompression load on the GPU, but from what we've seen so far that load is not significant, and it is only relevant to the GPU as opposed to the CPU, which is what we were originally discussing.
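
To put rough numbers on the "near double" point, here is a minimal sketch. The 2:1 compression ratio and the 4 GB/s of uncompressed GPU data are illustrative assumptions, not measurements, and the function name is made up for this example.

```python
def pcie_traffic_gbps(uncompressed_gbps: float, ratio: float,
                      decompress_on_gpu: bool) -> float:
    """PCIe traffic needed to deliver `uncompressed_gbps` of usable data to the GPU.

    If the GPU decompresses, only compressed bytes cross the bus.
    If a CPU-side ASIC decompresses, the full uncompressed stream crosses it.
    """
    if decompress_on_gpu:
        return uncompressed_gbps / ratio
    return uncompressed_gbps


# Assumed figures: 4 GB/s of uncompressed data, 2:1 compression.
gpu_path = pcie_traffic_gbps(4.0, 2.0, decompress_on_gpu=True)
asic_path = pcie_traffic_gbps(4.0, 2.0, decompress_on_gpu=False)
print(gpu_path, asic_path, asic_path / gpu_path)
```

With a 2:1 ratio the ASIC path moves exactly twice the bytes over PCIe; a lower real-world ratio would shrink that gap.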

This is for the PC version specifically? Or PS5? I know PS4 can only do about 1/10th that due to how they had to arrange things to account for HDD.

Did they really rearrange their streaming system to such a degree for the remastered? Hard to believe that there are still basic pop in issues at that speed

PC. Alex's Digital Foundry video showed the data streaming throughput in real time. As you rightly point out it must be much lower on the PS4 due to the physical limitations of the HDD which is proof that we can't use the PC/PS5 version of Spiderman as an example of "last gen" streaming throughput in an attempt to claim that "current gen" will stream much more data. In fact, Spiderman already streams much more data than the previous gen was ever capable of (thanks to higher res textures and higher frame rates).

Ok. Well, I guess we'll just have to see how things unfold. Should be pretty exciting as I really can't wait to see what the real next gen benchmark set by Sony is going to be. I fully believe Spider-Man 2 is going to look unbelievable, and will be the best looking game out regardless of platform. I'm pretty confident in that haha.

But I'm feeling pretty good about PC with DirectStorage for the future as well.

Callisto Protocol is my current "Next Gen" WTF moment, I think. Looks like a current gen Doom 3 to me.
 

The bottleneck on the CPU side of Insomniac's game engine is due to technical debt and the fact that they release games very fast. For example, ND solved the problem back in 2014, but they release far fewer games, and often with delays.
 
UE5 is notoriously light on streaming requirements. If I recall correctly the Matrix demo was something like 200 MB/s? I might be overestimating that. Low latency is the main requirement for Nanite.

Matrix demo is 300 MB/s.

Spider-Man 2's number will probably be bigger than this, but when people talk about a visually impressive game, that has nothing to do with I/O. From a visual point of view, I think the next WTF moment of the gen will be the first AAA release of a UE5 game.
 
I want to see what devs can do with that kind of tech.

I think every platform will benefit when they don't have to worry about the same kind of bottlenecks from the storage they have historically had to plan their entire engines around.

I don't think we have actually seen the true benefits of it yet tho.

The SSD will be nowhere near as fast as RAM, but it should have a big impact as is.

I am speaking directly from hype of course, so I don't know anything, but it makes sense.
 
There is always going to be some CPU overhead with data movement that isn't exclusive to the GPU. The CPU either has to be informed that the data move has completed or has to check for itself. Either way it costs cycles.
 

300 MB/s provides little context. PS5 games may never get anywhere near saturating the PS5 SSD over long periods of time, but may still see significant benefits from the bandwidth offered.
 
Can you provide a source for this as this is new information to me?
 


It seems the Xbox Series consoles use this DRAM-less controller, which has one ARM Cortex-R5 CPU. The PS5 configuration is a bit different. There is a custom Marvell flash controller with some DRAM, which was a surprise. And on the SoC there is an I/O complex with two ARM CPUs (probably R5), one DMA controller, a coherency engine, the hardware decompressor and some SRAM. The SRAM is used for a translation table and as a cache for compressed data. In a tweet, Fabian Giesen (RAD Game Tools, now part of Epic, the team behind Kraken) said that when devs package data, it needs to be in chunks of at most 256 KB to fit inside the SRAM and be accessible to the I/O complex. Because of this cache, there is no buffer for compressed data in PS5 RAM, only the final uncompressed data.


32-bit ARM Cortex R5 (Single CPU)
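
The 256 KB packaging constraint mentioned above can be sketched roughly as follows. This is only an illustration: it assumes the limit applies to each uncompressed chunk, it uses zlib from the Python standard library as a stand-in for Kraken, and the function names are made up. It is not how Sony's or RAD's tools actually work.

```python
import zlib

# Assumed limit, taken from the 256 KB figure quoted in the post.
CHUNK_SIZE = 256 * 1024


def pack_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into pieces of at most chunk_size bytes and compress each one
    independently, so any chunk can be decompressed without its neighbours."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def unpack_chunks(chunks: list[bytes]) -> bytes:
    """Decompress every chunk and join the results back into the original data."""
    return b"".join(zlib.decompress(c) for c in chunks)


asset = bytes(range(256)) * 4096  # 1 MiB of sample data
chunks = pack_chunks(asset)
print(len(chunks))  # 4 chunks of 256 KiB each before compression
```

Compressing each chunk independently is what lets a fixed-size cache hold one whole unit of work at a time, at the cost of a slightly worse compression ratio than compressing the file in one go.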
 
Question: Is the PS5 solution over-engineered?
 

I don't know. Devs seem happy, and it is very easy to use. We just need to wait for exclusives that push the envelope. It leaves more processing power available on the CPU side, and it helps make RAM usage more efficient, between the SRAM and the coherency engine with cache scrubbers inside the GPU. I forgot to mention the coherency engine in the I/O complex. With the coherency engine and cache scrubbers, there is no need for a buffer for new data from the SSD, because the new data can replace existing data in place, and only the deleted data will be flushed from the GPU cache. Data in memory will continue to be coherent with the data in the GPU cache.

Devs don't have any access to the SRAM, I/O complex, or flash controller DRAM. Everything is managed by Sony's API; they just make API calls with the uncompressed files and a priority level. The constraint comes from the way they package the data with Oodle Kraken and Oodle Texture, or zlib/deflate. The hardware decompressor supports zlib too.

And from a cost perspective, it looks like this is OK. Sony doesn't lose money on the PS5 with the Blu-ray disc drive. ;)

EDIT: Naughty Dog said they don't use the sophisticated streaming engine they designed on PS5 because they don't need it. This is from an interview about the Uncharted: Legacy of Thieves Collection on PS5, by a French YouTuber. They load files in batches and manage priorities, and it all works.

EDIT2: In some patents they explain how they set up a priority system, and because of the very low latency they are able to load data within the current frame. There is a worst-case latency, and devs are guaranteed that data won't take longer than that to become available.
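
As a rough illustration of "API calls with a priority level", here is a minimal sketch of a priority-ordered request queue. Everything in it (the `ReadQueue` class, the file names, the convention that 0 is most urgent) is hypothetical and is not Sony's API; it only shows the general idea that urgent requests jump ahead of background ones.

```python
import heapq
import itertools


class ReadQueue:
    """Hypothetical read-request queue: a lower priority number is served first."""

    def __init__(self) -> None:
        self._heap: list[tuple[int, int, str]] = []
        # Monotonic counter used as a tie-breaker so that requests with the
        # same priority are served in submission order.
        self._order = itertools.count()

    def submit(self, path: str, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._order), path))

    def drain(self) -> list[str]:
        """Return all pending paths in the order they would be served."""
        served = []
        while self._heap:
            _, _, path = heapq.heappop(self._heap)
            served.append(path)
        return served


q = ReadQueue()
q.submit("distant_lod.bin", priority=2)
q.submit("texture_needed_this_frame.bin", priority=0)
q.submit("audio_bank.bin", priority=1)
q.submit("another_urgent.bin", priority=0)
order = q.drain()
print(order)
```

A real system with a guaranteed worst-case latency would also need to bound how much lower-priority work can be in flight ahead of an urgent request; this sketch does not model that.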

EDIT3: I forgot that the SRAM cache helps save some RAM bandwidth. If the compressed data were buffered in RAM instead, in the most extreme case that would mean 5.5 GB/s of compressed data written into RAM, 5.5 GB/s read back from RAM by the hardware decompressor, and 11 GB/s of uncompressed data written to memory. That is 22 GB/s instead of 11 GB/s, so double the memory bandwidth usage.
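
The arithmetic in that last edit can be checked with a few lines. The 5.5 GB/s raw rate and the 2:1 ratio (giving 11 GB/s uncompressed) are the extreme-case figures from the post; the variable names are just for this example.

```python
RAW_GBPS = 5.5  # compressed data coming off the SSD (extreme case from the post)
RATIO = 2.0     # compression ratio implied by 5.5 GB/s -> 11 GB/s

uncompressed = RAW_GBPS * RATIO

# With the SRAM cache: only the final uncompressed data is written to RAM.
with_sram = uncompressed

# With a compressed buffer in RAM: write compressed data in, read it back
# for the decompressor, then write the uncompressed result.
with_ram_buffer = RAW_GBPS + RAW_GBPS + uncompressed

print(with_sram, with_ram_buffer, with_ram_buffer / with_sram)
```

The factor of two only holds at a 2:1 ratio; with better compression the extra traffic from the compressed buffer would be a smaller share of the total.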
 