Digital Foundry Article Technical Discussion [2023]

Status
Not open for further replies.
Didn't Digital Foundry manage to do something like taping over a pin on an SSD to hamstring its performance to below 2.5GB/s, and it didn't change Rift Apart's loading or performance at all? I can't for the life of me find the video, and I'm thinking it may have been a small talking point in a DF Direct. I remember watching Rich talking about it and how a Patreon member brought it to his attention. Someone with some younger grey matter might be able to remember where exactly this was from.
Yeah, I remember that too. Like you said, it indicated that the game is streaming less than 2.5GB/s. Insomniac actually pointed out that they did not max out the PS5 with their first exclusive and would have to further improve their engine to use more of the PS5's resources.

I think the results of that DF video were even discussed on Twitter. I can't remember exactly and cannot find it for now, but I think it was a tech guy from Insomniac and one from Oodle.
They found that the Kraken compression must have made it possible, since they knew the game must be pulling more data than the hamstrung SSD was able to deliver.
Something something... I hope I find it...
 
None of the tests so far are valid ones, since we've had no real stress test from a game that demands continuous streaming at the high GB/s numbers the PS5 is capable of.

That's because no game is ever going to behave in that way. Multi-GB/s operations will be very short bursts at initial loading screens. Games are only in the tens to a couple of hundred GB at most on disk, so continuous multi-GB/s levels of streaming would result in the entire game content being exhausted in a matter of seconds or minutes.
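The arithmetic behind that point can be sketched in a few lines of Python (the game sizes and streaming rate below are illustrative round numbers, not measured figures for any real title):

```python
# Back-of-envelope check: how long would a game's entire on-disk
# content last if streamed continuously at multi-GB/s rates?
def seconds_to_exhaust(game_size_gb: float, stream_rate_gbps: float) -> float:
    """Time until every byte on disk has been streamed exactly once."""
    return game_size_gb / stream_rate_gbps

# A 100 GB game streamed continuously at 5.5 GB/s (PS5's raw rate):
print(seconds_to_exhaust(100, 5.5))   # ~18.2 seconds
# Even a 500 GB install would last only about a minute and a half:
print(seconds_to_exhaust(500, 5.5))   # ~90.9 seconds
```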

We need to wait until Rift Apart lands on PC.

Well, thanks to @HolySmoke we already have an idea of Rift Apart's streaming requirements, and they're nowhere near as high as you seem to be expecting:

Post in thread 'Next-Generation NVMe SSD and I/O Technology [PC, PS5, XBSX|S]' https://forum.beyond3d.com/threads/...o-technology-pc-ps5-xbsx-s.62015/post-2284368

So around 50MB/s on average, up to around 500MB/s during the portal transitions. For reference, Digital Foundry and others have shown Forspoken transferring that much data a lot of the time using CPU decompression (for reasons unknown, which may be a bug). DF also showed us that Spider-Man can hit close to that in its heaviest moments, while we've been told by Epic that the streaming requirement of the UE5-based Matrix demo is only around 150MB/s.
 
None of the tests so far are valid ones, since we've had no real stress test from a game that demands continuous streaming at the high GB/s numbers the PS5 is capable of.
We need to wait until Rift Apart lands on PC.
Or, now that we have with the TLOU PC port a PS5 exclusive that ignores (for now) your holy grail DirectStorage, and that therefore fails to even come close to the PS5 in terms of streaming, we could have the phenomenal chance that they include DirectStorage support with a patch. Then we would see what kind of difference it can make. If that scenario actually happens, then the ignorant PC port was actually worth something, if for nothing else than at least supplying us with an additional (and by then still the best) data point.
Btw, is it already known how much data is actually streamed at any given time on PC? Has someone measured this?
Would be interesting to know.
Technically, things should only be streamed in and out as the scenes change. If your player character can't move quickly and the scene is moving relatively slowly, the stream in and out will be minimal. Turning a corner, dropping stuff in, or loading levels will result in the largest bursts, but there's no way you're going to see high throughput the whole way. The game is only so big, and you cannot render from the SSD. The bandwidth is not there.
 
...you cannot render from the SSD. The bandwidth is not there.

This is the thing that a lot of people still don't really understand when it comes to fast I/O. BTW, many here do know these things, but considering how people constantly bang on about it, there are obviously a lot of people that still don't.

At an extremely simplistic level, rendering a scene requires many things.
  • Read into memory
    • Fast I/O can make this much faster and allow for lowered VRAM use as the textures can be loaded just prior to needing them rather than seconds/minutes prior to needing them or allow for streaming in of larger textures.
  • Rendering
    • This generally involves multiple reads and writes into memory as assets are "modified" multiple times prior to the final output render being sent to the display.
    • Fast I/O cannot help here because it is massively slower than memory, with massively more latency, which is prohibitive if you need to make multiple changes to an asset prior to rendering the final output frame.
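To put rough numbers on that latency gap, here's a back-of-envelope sketch (the latencies and pass count are assumed order-of-magnitude values for illustration, not figures for any specific hardware):

```python
# Why multiple dependent read/modify/write passes per frame are fine
# against VRAM but hopeless against an SSD (illustrative numbers).
FRAME_BUDGET_S = 1 / 60          # ~16.7 ms per frame at 60 fps
VRAM_LATENCY_S = 200e-9          # ~200 ns, a typical GDDR access
SSD_LATENCY_S = 100e-6           # ~100 us, a typical NVMe access

passes = 1000  # dependent accesses while building one frame (assumed)

vram_cost = passes * VRAM_LATENCY_S   # 0.2 ms -> fits easily in budget
ssd_cost = passes * SSD_LATENCY_S     # 100 ms -> roughly 6 frames late

print(vram_cost < FRAME_BUDGET_S)  # True
print(ssd_cost < FRAME_BUDGET_S)   # False
```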
There are multiple factors that will make it difficult to "fully" utilize the (IMO) excessive bandwidth we have on PS5 and PC.

How much space do you want your game to take up? 100 GB? 500 GB? 1 TB? 5 TB? This is important as it means the opportunities to constantly stream in new "unique" assets will be limited by how much disk space you want to allocate to your asset pool.

How much RAM do you have available? If you have enough RAM that you can do all of your render operations and still have space left over, it's wasteful to evict all currently loaded assets if you think you'll need them in the next few seconds. Sure, you "can" evict them and then load them back in, but that's a waste of energy (electrical power) and time.

So, yeah, we can have better looking portal transitions (2021's Rift Apart versus 2006's Prey) because we don't have to pre-load assets into memory for the transition, but how often are you going to need to swap in an entire new location with, say, a 90% change in all assets? Even quickly turning your head will still result in most assets remaining the same. More importantly, how much disk space do you want to dedicate to different and unique assets? And how much memory do you have available to render that one output frame, considering you'll need more RAM than the size of your assets, as you potentially have to hold multiple versions of a modified asset prior to the render out? That last point will be an absolute limit on how much you can load for any given frame, depending on what you plan to output.

I've been a huge proponent of fast I/O even before the current consoles (PS5 and XBS-S/X) were announced to have SSDs. There are some threads here where I talked about the need to move away from mechanical HDDs at the start of the PS4/XBO generation. But it's not a magic bullet and still has limitations. Used within those limitations, it can certainly enable greater visual fidelity than would be possible with a mechanical HDD (assuming you can't have unlimited RAM), or in the most simplistic case, make it not take forever and a day to load a level. :p

Regards,
SB
 
Didn't Digital Foundry manage to do something like taping over a pin on an SSD to hamstring its performance to below 2.5GB/s, and it didn't change Rift Apart's loading or performance at all? I can't for the life of me find the video, and I'm thinking it may have been a small talking point in a DF Direct. I remember watching Rich talking about it and how a Patreon member brought it to his attention. Someone with some younger grey matter might be able to remember where exactly this was from.

Yes, they did do this comparison, and I applaud them for including the very important reminder that even though the lower bandwidth SSD was sufficient, you still have to consider that the PS5 I/O is still at play here. The hardware decompression, I/O co-processors, and cache scrubbers are still playing their part. If you stuck that same SSD into another console or PC without the other, far more important parts of the PS5 I/O system, it wouldn't work.

That's because no game is ever going to behave in that way. Multi-GB/s operations will be very short bursts at initial loading screens. Games are only in the tens to a couple of hundred GB at most on disk, so continuous multi-GB/s levels of streaming would result in the entire game content being exhausted in a matter of seconds or minutes.

100%. The SSD isn't the key part of the PS5, which is why when others claim the PS5 is eclipsed because of higher bandwidth SSDs on the market, or that 5GB/s is overkill because a game will never need to stream 5GB of data in a second, I shake my head. Think of it this way: if you need a total of 90MB of data to be streamed into video memory within the next second of a 60fps game, why might a 100MB/s SSD not suffice? Even though I only need a total of 90MB over the next second, what if the 90MB is actually needed to render the next frame? In that case, that would require a ~5.4GB/s transfer speed, not 100MB/s.
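The back-of-envelope math in that scenario works out like this (a quick sketch using the 90MB figure from the example above):

```python
# 90 MB needed before the *next frame* renders, not just within the
# next second -- the averaging window changes everything.
data_mb = 90
fps = 60
frame_time_s = 1 / fps                    # ~16.7 ms per frame

# Averaged over a full second, a slow drive looks sufficient:
avg_rate_mb_s = data_mb / 1.0             # 90 MB/s

# Delivered within a single frame, the required burst rate explodes:
burst_rate_gb_s = (data_mb / 1000) / frame_time_s

print(avg_rate_mb_s)    # 90.0
print(burst_rate_gb_s)  # ~5.4
```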

Even still, focusing on bandwidth alone misses the entire point Cerny was trying to make. It is the entire I/O block that surrounds the SSD that massively reduces latency, i.e. removing the bottlenecks of the end-to-end data movement process, independent of the secondary storage and its bandwidth capabilities.

[Attached slides: old I/O process vs. new I/O process]
 
Yes, they did do this comparison, and I applaud them for including the very important reminder that even though the lower bandwidth SSD was sufficient, you still have to consider that the PS5 I/O is still at play here. The hardware decompression, I/O co-processors, and cache scrubbers are still playing their part. If you stuck that same SSD into another console or PC without the other, far more important parts of the PS5 I/O system, it wouldn't work.

The "far more important" parts? :p

Keep in mind that hardware decompression is just another way to increase effective bandwidth. It isn't any more or less important than raw bandwidth, which is the most obvious way to increase effective bandwidth. Also keep in mind that all consoles (even PS4/XBO and PS3/X360) had hardware decompression. Now, the PS5's hardware decompressor is capable of decompressing more heavily compressed data at higher rates than past consoles, so it is certainly an improvement in that area, but it also requires developers to use it. And if the latest reports from the maker of the algorithm are correct, most developers still aren't using Kraken.

The cache scrubbers are there just to evict things from memory. Sure, it helps, but its impact is going to be significantly less than raw bandwidth, since its entire purpose is to help the system utilize the raw bandwidth provided by the SSD. I'd struggle to take seriously any claim that it's one of the "far more important parts of the PS5 I/O system". Similarly, the I/O processors are there to facilitate moving the data so that the PS5 can actually process the incoming stream of raw data from the SSD. In other words, it is purely dependent on the raw bandwidth.

Basically, most of those PS5 I/O subsystems that you mentioned are there to help the PS5 process and move data at the rates its provisioned SSD allows. They don't amplify it in any way. So if the SSD is only required to read X amount of data in Y amount of milliseconds, all of those subsystems don't change that one single bit. The exception is the hardware decompressor, but that only amplifies data relative to past consoles if the game is using the newer compression algorithm, which, according to the algorithm's maker, most developers still aren't doing despite it being free for them to do so.

Regards,
SB
 
Basically, most of those PS5 I/O subsystems that you mentioned are there to help the PS5 process and move data at the rates its provisioned SSD allows.

You said a lot just to agree with me. The I/O stack allows SSD bandwidth to become more realized and less theoretical. That has always been the difficult part. It is why your (or anyone else's) PC doesn't offer much benefit as you increase speed / upgrade your SSD. Again, refer to the slides I posted. Referring to the different I/O features as "just" this and "just" that is very strange.

[Attached slides: old I/O process vs. new I/O process]
 
You said a lot just to agree with me. The I/O stack allows SSD bandwidth to become more realized and less theoretical. That has always been the difficult part. It is why your (or anyone else's) PC doesn't offer much benefit as you increase speed / upgrade your SSD. Again, refer to the slides I posted. Referring to the different I/O features as "just" this and "just" that is very strange.

[Attached slides: old I/O process vs. new I/O process]

Yes, so, regardless of what is there, the same SSD in a PC can still achieve the same results due to the PC being able to brute force things, whereas the PS5 requires those bits because it has to fit within a certain price range and power envelope. I.e., your assertion that the same SSD put into a PC wouldn't operate as well is false.

And what was the point even mentioning the rest of the PS5 IO subsystem in the first place? Regardless of whether it was there or not Rift Apart didn't need more than 2.5 GB/s of bandwidth. Those IO subsystems don't make better use of that bandwidth. Other than the hardware decompression, they're purely there for the PS5 to process data at the higher bandwidths the SSD is capable of.

Regards,
SB
 
You said a lot just to agree with me. The I/O stack allows SSD bandwidth to become more realized and less theoretical. That has always been the difficult part. It is why your (or anyone else's) PC doesn't offer much benefit as you increase speed / upgrade your SSD. Again, refer to the slides I posted. Referring to the different I/O features as "just" this and "just" that is very strange.

[Attached slides: old I/O process vs. new I/O process]
If there is ever a day that there is significant enough bandwidth to render from the SSD, you would be right. But the realities are that L0 is much faster than L1, which is much faster than L2, which is much faster than L3, which is orders of magnitude faster than GDDR, which is orders of magnitude faster than an SSD.
A faster SSD relieves footprint pressure on GDDR, but it's not a replacement for it. Removing latency is a nice trick for when you are trying to cut things as close as possible while using the least amount of footprint possible. Everyone else would just use a little more footprint, or slow things down just enough to mask the latency.
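As an illustration of just how far apart those tiers are, here's how much data each could move within one 60fps frame (the bandwidth figures are round, commonly quoted numbers used only for scale):

```python
# Data each tier can move within a single 16.7 ms frame at 60 fps.
FRAME_TIME_S = 1 / 60

GDDR6_GBPS = 448   # e.g. the PS5's quoted memory bandwidth, GB/s
SSD_GBPS = 5.5     # the PS5's raw SSD read rate, GB/s

gddr_per_frame_gb = GDDR6_GBPS * FRAME_TIME_S  # ~7.47 GB per frame
ssd_per_frame_gb = SSD_GBPS * FRAME_TIME_S     # ~0.092 GB per frame

print(round(gddr_per_frame_gb, 2))   # ~7.47
print(round(ssd_per_frame_gb, 3))    # ~0.092
print(round(GDDR6_GBPS / SSD_GBPS))  # ~81x gap between the two tiers
```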
 
You said a lot just to agree with me. The I/O stack allows SSD bandwidth to become more realized and less theoretical. That has always been the difficult part. It is why your (or anyone else's) PC doesn't offer much benefit as you increase speed / upgrade your SSD. Again, refer to the slides I posted. Referring to the different I/O features as "just" this and "just" that is very strange.

[Attached slides: old I/O process vs. new I/O process]
Here's the thing though.. and let's see if you'll admit this. Your entire premise of Sony's superiority.. lies in the fact that it depends on developers being able to ignore the other platforms and code the games specifically around the PS5's I/O and memory architecture.. right?

Well then I hate to break it to you... but if I said the same thing and developers were instead able to code specifically around PC's I/O and memory architecture, then our raw SSD bandwidth, and bigger memory pools.. becomes even more realized and less theoretical than the PS5's. Developers could pre-load and cache far more assets in memory up front if they wanted to, and during gameplay streaming they could pre-fetch much further in advance because they can hold many more assets resident in memory.

The entire argument about PS games requiring so much RAM and VRAM on PC is entirely due to how they were coded... and not what they would actually require, if the same games were developed around PC instead. Same goes for PS5 games and their "bandwidth requirements"...
 
Yes, so, regardless of what is there, the same SSD in a PC can still achieve the same results due to the PC being able to brute force things, whereas the PS5 requires those bits because it has to fit within a certain price range and power envelope. I.e., your assertion that the same SSD put into a PC wouldn't operate as well is false.

You mentioned "the PC" can brute force and achieve similar results, and I agree with you, sometimes. But which PC are you referencing? Or is "the PC" a monolith?

You see, I'm noticing a pattern with some here having a difficult time maintaining apples to apples comparisons. Using your logic, what makes NVidia RT and Tensor cores special when a compute cluster can achieve far greater results by "brute forcing"?

And what was the point even mentioning the rest of the PS5 IO subsystem in the first place? Regardless of whether it was there or not Rift Apart didn't need more than 2.5 GB/s of bandwidth.

You keep getting tripped up because you are emphasizing bandwidth when the real winner here is low latency.

Those IO subsystems don't make better use of that bandwidth. Other than the hardware decompression, they're purely there for the PS5 to process data at the higher bandwidths the SSD is capable of.

Ok. Do any of my previous comments contradict what you're saying here? In fact, I don't know why you're excluding the decompression chip, as it is chiefly responsible for allowing the effective bandwidth targets to be met.
 
You keep getting tripped up because you are emphasizing bandwidth when the real winner here is low latency.

Which is still the latency of the drive itself. None of those things make the access latency of the drive any lower. If it takes X time units in order to access data on the SSD, it will at best take X time units to access data on the SSD. All of those IO subsystems don't change that.

Regards,
SB
 
100%. The SSD isn't the key part of the PS5, which is why when others claim the PS5 is eclipsed because of higher bandwidth SSDs on the market, or that 5GB/s is overkill because a game will never need to stream 5GB of data in a second, I shake my head. Think of it this way: if you need a total of 90MB of data to be streamed into video memory within the next second of a 60fps game, why might a 100MB/s SSD not suffice? Even though I only need a total of 90MB over the next second, what if the 90MB is actually needed to render the next frame? In that case, that would require a ~5.4GB/s transfer speed, not 100MB/s.

Even still, focusing on bandwidth alone misses the entire point Cerny was trying to make. It is the entire I/O block that surrounds the SSD that massively reduces latency, i.e. removing the bottlenecks of the end-to-end data movement process, independent of the secondary storage and its bandwidth capabilities.

[Attached slides: old I/O process vs. new I/O process]

As others have already noted, the SSD itself is absolutely the most important component of the PS5's I/O system. It is that which is primarily responsible for the massive decrease in latency the new consoles have brought over the previous generation. Epic have talked at length about how the use of an SSD, with its associated low latency (vs an HDD), is what makes Nanite possible.

As to the rest of it:

  • The "IO co-processors" aren't even unique to the PS5. Every off-the-shelf SSD ships with ARM processors to help manage the IO operations.
  • The decompression unit simply moves a task that would be done on the CPU in a PC (or the GPU using DirectStorage 1.1) onto a dedicated hardware unit. It does nothing to actually increase the throughput or decrease the latency of that decompression step, but it does mean you can get by with much less CPU power for decompression compared to a PC doing this on the CPU. That's why PC GPU decompression is being introduced to address this.
  • The cache scrubbers are a genuine innovation, but nothing that's likely to be in any way game changing. They increase the efficiency of overwriting cache lines, which can have potential CPU and GPU load benefits under the right circumstances. But those benefits are likely to be marginal to final performance.
  • The whole firmware/software stack on the PS5 pulling all of this together is obviously very efficient and easy to use from a developer standpoint. That's arguably the system's biggest leg up over PCs, but again, that's what DirectStorage was designed to at least partially address. As noted previously, the PS5 is always going to have some level of efficiency advantage (in terms of being able to do the same task with less CPU power). But with all the latest innovations on the PC side, that efficiency advantage is likely to be fairly modest now, and of course, with sufficiently powerful hardware a PC would be able to overcome that efficiency deficit to achieve results that go beyond what the PS5 is capable of. How much additional CPU power that takes is likely to vary from implementation to implementation, so it will never be something we can pin down precisely.
 
I had Duckstation, Flycast, PPSSPP, XBSX2, Xenia, and Xenia Canary on my Series X and used them once for a comparison video for Chrono Cross. Usually, what they did before was take down the emulators once Microsoft found them in the store, but if you were lucky enough to grab them in time and made sure that you opened them at least once, they would remain playable. If Microsoft could have found some way to ensure that those with the emulators could keep using them, the move would have been less controversial. For now, Dev Mode is an OK compromise, though I don't like how it alters the way the console works, which is why I have stayed away from it. So if you purchased your Series console strictly for emulation, the change should not be as big of a deal; the changes Dev Mode makes to the console should not be that detrimental. There would be a higher learning curve in comparison to "download the emulator from the store and play games from a USB stick."

For those of us that are heavily into the Xbox ecosystem, this sucks. This sucks even more because we got Xenia not too long ago, allowing us to play games that did not make the BC list. Bringing Xenia to the Xbox did not make sense until Microsoft gave up on adding more games to the BC list. Xenia wasn't perfect, and I would have much rather had the games on the BC list, even if the only way to play those games would be to own a disc, but it worked.
 
I don't know why you're excluding the decompression chip, as it is chiefly responsible for allowing the effective bandwidth targets to be met.

It's more accurate to say "the decompression chip is chiefly responsible for reducing CPU requirements while achieving effective bandwidth targets". It doesn't in itself do anything to increase effective bandwidth; the compression scheme is doing that. The decompression unit is just the tool (one of several options) being used to decompress the data stream.
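That distinction can be expressed in a one-liner: effective bandwidth is raw bandwidth times the compression ratio, regardless of which processor does the decode. The ratios below are illustrative round numbers, not measured values for any title (Sony's own "typical" figure for the PS5 was in the 8-9GB/s range):

```python
def effective_bandwidth_gb_s(raw_gb_s: float, compression_ratio: float) -> float:
    """Data landed in memory per second after decompression.
    The decompression *unit* only decides who pays for the decode;
    the compression *ratio* is what multiplies the raw rate."""
    return raw_gb_s * compression_ratio

raw = 5.5  # the PS5's raw SSD read rate in GB/s

# Uncompressed data gains nothing from the decompression hardware:
print(effective_bandwidth_gb_s(raw, 1.0))   # 5.5
# An assumed ~1.6:1 ratio lands in the range Sony quoted as typical:
print(effective_bandwidth_gb_s(raw, 1.6))   # ~8.8
```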
 