Velocity Architecture - Limited only by asset install sizes

If you could fill the RAM pool in 0.5 seconds instead of 1.0 seconds it would be better. If you could fill the RAM in 0.1 seconds instead of 1 second that would be even better still. The faster it is, the more developers can depend on having assets streamed into the GPU/RAM in time for what they want to do.

The SSD speed chosen wasn't limited in any way by the available RAM pools in the consoles.

The SSD speeds chosen were a compromise between the cost and tech available.

Regards,
SB
I'm not saying faster SSD speeds are not good. I'm saying if you have a higher effective throughput (in GB/s) than the size of the available RAM pool (in GB), it's a waste! Why would you spend money on an expensive SSD when you could fill up RAM with a combination of a slower but cheaper SSD plus a decompression block? That's what Sony and MSFT had to look at when determining the speed of the SSD. In any case RAM is a memory cache, so it's not like you're going to be trying to refill it constantly. That's not how RAM works. The best-case scenario is a system where the combination of the SSD speed and the decompression block results in an effective throughput (in GB/s) that matches the size of RAM (in GB), i.e. the whole pool could be refilled in about a second. (For example, a PS6 or next-gen Xbox with 32GB of unified memory, 12GB/s SSD speeds and a 2.5:1 decompression ratio, resulting in 30GB/s effective throughput. This would be equivalent to getting an SSD with 32GB/s speeds, but with a huge saving.)

As an example, take two systems where everything is the same except the speed of the SSD (so same memory controllers, OS, etc.):
System 1: 8GB RAM, SSD speed 5.5GB/s, decompression block capable of 2:1. The result is an effective throughput of 11GB/s for a system that only has 8GB of RAM.
System 2: 8GB RAM, SSD speed 3.2GB/s, decompression block capable of 2:1. The result is an effective throughput of 6.4GB/s for a system that has 8GB of RAM.

System 2, ceteris paribus, is more cost-effective and will perform practically the same as the first one. You only need a certain working set in RAM; you can't have all of it being saturated with new data all the time. That would be wasteful. The main benefit of SSDs is enabling a larger working set in RAM.

The SSD speeds were chosen based on cost and the amount of RAM required for next-gen games. If the size of RAM had been kept constant at 8GB, you would definitely not be seeing a 5.5GB/s SSD in the PS5. It would be a complete waste of money; something like 3.2GB/s would be much more cost-effective. The processors are limited by what is in memory, and you want to cache some data or keep soon-to-be-needed data resident in RAM. There's no need to refill it all constantly.
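To make the arithmetic concrete, here's a minimal back-of-envelope sketch in Python, using only the hypothetical figures from this post (not measured hardware numbers):

```python
# Back-of-envelope: effective SSD throughput and time to fill the RAM pool.
# All figures are the hypothetical ones quoted above, not measured numbers.

def effective_throughput(raw_gbps: float, decomp_ratio: float) -> float:
    """Raw SSD speed (GB/s) multiplied by the decompression ratio."""
    return raw_gbps * decomp_ratio

def ram_fill_seconds(ram_gb: float, raw_gbps: float, decomp_ratio: float) -> float:
    """Seconds to fill the whole RAM pool at the effective rate."""
    return ram_gb / effective_throughput(raw_gbps, decomp_ratio)

# Hypothetical PS6 / next-gen Xbox example from the post:
print(effective_throughput(12.0, 2.5))    # 30.0 GB/s effective
print(ram_fill_seconds(32.0, 12.0, 2.5))  # ~1.07 s to refill 32GB

# The two 8GB example systems:
print(ram_fill_seconds(8.0, 5.5, 2.0))    # System 1: ~0.73 s
print(ram_fill_seconds(8.0, 3.2, 2.0))    # System 2: 1.25 s
```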
 
@rntongo I think the mistake you're making is thinking that seconds are the best unit of time for thinking about games. What you actually want is data per millisecond, data per microsecond or data per nanosecond. If you want to read 4k or 8k textures off a disc on the fly, for example, even into a tile cache, there are benefits in being able to read many smaller files/blocks faster.
 

I don't think I get what you're saying. You're most likely going to demand-page smaller data sizes; I took that into consideration. Thanks to the near-instant seek times of SSDs you can demand-page data from any part of the 100GB game install that's part of the virtual address space, for example.

If you're requesting large amounts of data it's going to help when you're loading a game; after that you'll likely have a larger working set in RAM with the faster SSD. Then you can demand-page any data you need as the player progresses through the game. So that's the PS5's advantage: faster load times and a larger working set in RAM. The real-world performance is what I want to see, though, because if the PS5 cannot demand-page textures at the same granularity as the SFS-enabled Series X, it will be interesting.

@rntongo If you want to read 4k or 8k textures off a disc on the fly, for example, even into a tile cache, there are benefits in being able to read many smaller files/blocks faster.

That's why you won't always be refilling RAM. Sometimes you just need to update a few textures, some geometry, etc., not everything. It doesn't invalidate anything I've said; if anything it supports it. For example, demand-paging 120MB worth of 4K textures won't come close to needing even 2.4GB/s, let alone 5.5GB/s. With such small data sizes it doesn't matter. That's why the Dirt 5 dev mentioned figuring out ways of requesting large amounts of data in order to fully utilize the SSDs.
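To put rough numbers on that 120MB example, a small sketch (transfer time only; it ignores seek, queuing and decompression overheads, so treat it as a lower bound):

```python
# Transfer time for a small demand-paged batch of textures.
# Sketch only: ignores seek latency, queuing and decompression overhead.

def transfer_ms(size_mb: float, throughput_gbps: float) -> float:
    """Milliseconds to move size_mb at throughput_gbps (1 GB/s = 1 MB/ms)."""
    return size_mb / throughput_gbps

print(transfer_ms(120, 2.4))  # 50.0 ms at 2.4 GB/s -- a few frames
print(transfer_ms(120, 5.5))  # ~21.8 ms at 5.5 GB/s
```

Either drive moves that batch in tens of milliseconds, which is the sense in which small requests don't saturate the link.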
 
@rntongo The PS5 drive will load smaller files faster: 5.5 MB/ms vs 2.4 MB/ms. Not exactly that, because it depends on sequential reads etc., but those are the types of reads you'd be doing if you were loading texture data on demand in a virtual texturing system. That is my point: over a smaller segment of time the PS5 drive will be faster, and in a useful way. Not to mention any other type of data that would be streamed in or out.
 

how do you arrive at 5.5MB/ms for example?
 

Divide by 1,000. I'm not an expert in SSD I/O by any means, but it really boils down to operations per second, which the PS5 drive can do more of. It can do more operations per millisecond or whatever period of time you want to think in. If you think of it per frame, the PS5 will be able to read more data per frame, or the same amount of data faster. Either scenario is useful.
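Spelling the conversion out (a sketch using the headline raw-throughput figures; sustained real-world rates will differ):

```python
# GB/s -> MB/ms, and how much data fits into one frame's time budget.
# Headline raw-throughput figures; sustained real-world rates will differ.

def mb_per_ms(gbps: float) -> float:
    # 1 GB/s = 1000 MB per 1000 ms = 1 MB/ms, so the number is unchanged.
    return gbps

def mb_per_frame(gbps: float, fps: float) -> float:
    """Data that can be read during a single frame at a given framerate."""
    return mb_per_ms(gbps) * (1000.0 / fps)

print(mb_per_ms(5.5))         # 5.5 MB/ms (PS5 raw 5.5 GB/s)
print(mb_per_ms(2.4))         # 2.4 MB/ms (Series X raw 2.4 GB/s)
print(mb_per_frame(5.5, 60))  # ~91.7 MB per 16.6 ms frame
print(mb_per_frame(2.4, 60))  # 40.0 MB per 16.6 ms frame
```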
 
Time Units from Wiki

A millisecond (from milli- and second; symbol: ms) is one thousandth (0.001, 10^-3, or 1/1000) of a second.

A microsecond is an SI unit of time equal to one millionth (0.000001, 10^-6, or 1/1,000,000) of a second. Its symbol is μs, sometimes simplified to "us" when Unicode is not available.

A nanosecond (ns) is an SI unit of time equal to one billionth of a second, that is, 1/1,000,000,000 of a second, or 10^-9 seconds.
 
If you could fill the RAM pool in 0.5 seconds instead of 1.0 seconds it would be better. If you could fill the RAM in 0.1 seconds instead of 1 second that would be even better still. The faster it is, the more developers can depend on having assets streamed into the GPU/RAM in time for what they want to do.

The way the vast majority of data is stored on disc requires some work by the CPU and/or GPU before it is usable. I.e. you don't just load the 'game world' into RAM and have it ready to render; collectively this data comprises a lot of different types, like geometry/models, textures, sounds, AI scripts, shaders, BVHs and lightmaps. Game engines are complex, and keeping track of all these assets in RAM is itself complex and requires a lot of work, so there is a [CPU/GPU] computational limit to how fast you can put data into RAM in a usable state even if you can double the I/O.

There certainly are cases where faster is better. I'd imagine sleep/resume (game switching) would benefit quite a lot, as I'm quite sure this is just a memory dump/restore operation: what is loaded is the data exactly as it was already laid out in memory.

It's early days though; it'll take some time and experience to eke more performance out of the new consoles' I/O. We're seeing pretty good game-switching times on Series S|X, but we're also seeing games like Miles Morales just load in 6-7 seconds, which is nuts when you remember this is a complex, dense, open-world superhero game.
 
I'm not saying faster SSD speeds are not good. I'm saying if you have a higher effective throughput (in GB/s) than the size of the available RAM pool (in GB), it's a waste! Why would you spend money on an expensive SSD when you could fill up RAM with a combination of a slower but cheaper SSD plus a decompression block? That's what Sony and MSFT had to look at when determining the speed of the SSD. In any case RAM is a memory cache, so it's not like you're going to be trying to refill it constantly. That's not how RAM works. The best-case scenario is a system where the combination of the SSD speed and the decompression block results in an effective throughput (in GB/s) that matches the size of RAM (in GB), i.e. the whole pool could be refilled in about a second. (For example, a PS6 or next-gen Xbox with 32GB of unified memory, 12GB/s SSD speeds and a 2.5:1 decompression ratio, resulting in 30GB/s effective throughput. This would be equivalent to getting an SSD with 32GB/s speeds, but with a huge saving.)

As an example, take two systems where everything is the same except the speed of the SSD (so same memory controllers, OS, etc.):
System 1: 8GB RAM, SSD speed 5.5GB/s, decompression block capable of 2:1. The result is an effective throughput of 11GB/s for a system that only has 8GB of RAM.
System 2: 8GB RAM, SSD speed 3.2GB/s, decompression block capable of 2:1. The result is an effective throughput of 6.4GB/s for a system that has 8GB of RAM.

System 2, ceteris paribus, is more cost-effective and will perform practically the same as the first one. You only need a certain working set in RAM; you can't have all of it being saturated with new data all the time. That would be wasteful. The main benefit of SSDs is enabling a larger working set in RAM.

The SSD speeds were chosen based on cost and the amount of RAM required for next-gen games. If the size of RAM had been kept constant at 8GB, you would definitely not be seeing a 5.5GB/s SSD in the PS5. It would be a complete waste of money; something like 3.2GB/s would be much more cost-effective. The processors are limited by what is in memory, and you want to cache some data or keep soon-to-be-needed data resident in RAM. There's no need to refill it all constantly.

The bigger the pool of RAM, the less impact the bandwidth of the SSD has on the overall system, since a larger pool mitigates the need to call data from the SSD at the moment it is needed.

The size of RAM becomes almost irrelevant the moment the APU/GPU/CPU requests data that's not in it and must be called from the SSD.

Bandwidth between the SSD and RAM is one of the variables that dictate how quickly calls to the SSD can be serviced. The higher the bandwidth of the SSD, the more robust the virtual memory system becomes.

It's hard to imagine the concept when you are talking about 32GB of GDDR while we are still in the midst of games built to contend with a fraction of that memory. But in a reality where games move enough data that a pool of 32GB of RAM becomes a constraint, the bandwidth of the SSD will have more influence on the overall system. The higher the better, unless the increase in bandwidth somehow comes at the cost of latency.
 
Time Units from Wiki

A millisecond (from milli- and second; symbol: ms) is one thousandth (0.001, 10^-3, or 1/1000) of a second.

A microsecond is an SI unit of time equal to one millionth (0.000001, 10^-6, or 1/1,000,000) of a second. Its symbol is μs, sometimes simplified to "us" when Unicode is not available.

A nanosecond (ns) is an SI unit of time equal to one billionth of a second, that is, 1/1,000,000,000 of a second, or 10^-9 seconds.
Thank you

millisecond. A frame will be 33.3 ms for 30 fps or 16.6 ms for 60 fps.

Okay, so my issue was with trying to use these other units of measurement to determine SSD performance, when typically you're supposed to use input/output operations per second (IOPS). The only times I've seen nanoseconds and microseconds used are in comparing the latency of different memories in the memory hierarchy. Can a dev request data mid-frame? Yes, very possible; the Dirt 5 dev has spoken about doing so on the Series X. But when a game is running it all boils down to determining at runtime which data to demand-page into RAM, and that is not determined by the speed of the SSD but by the CPU/GPU. Is having a 2x faster SSD helpful? Absolutely. But if the SSD is still sending unneeded textures into RAM then you're wasting resources. It all comes down to texture streaming as well, since textures make up the bulk of a game's data.
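For what it's worth, relating the throughput figures to IOPS is simple arithmetic once you assume a read size (the 64 KiB here is purely illustrative; neither console's real request size or queue behaviour is public):

```python
# Rough IOPS implied by a given throughput at a fixed read size.
# The 64 KiB block size is an assumed, illustrative figure, not a console spec.

def implied_iops(throughput_gbps: float, read_kib: float) -> float:
    """Reads per second if every request is read_kib kibibytes."""
    bytes_per_second = throughput_gbps * 1e9
    return bytes_per_second / (read_kib * 1024)

print(implied_iops(5.5, 64))  # ~84,000 reads/s at 5.5 GB/s
print(implied_iops(2.4, 64))  # ~37,000 reads/s at 2.4 GB/s
```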


Divide by 1,000. I'm not an expert in SSD I/O by any means, but it really boils down to operations per second, which the PS5 drive can do more of. It can do more operations per millisecond or whatever period of time you want to think in. If you think of it per frame, the PS5 will be able to read more data per frame, or the same amount of data faster. Either scenario is useful.

The part I've bolded isn't as straightforward as that! The PS5 CPU makes a request for data whose transfer is fulfilled by the DMA controller. We don't know which system can make better asynchronous I/O requests! If anything the Series X has a higher (and sustained) CPU clock, so more cycles for I/O operations. It comes down to the DMA controllers and the CPUs, and you can bet both companies are going to be super quiet about how their DMA controllers work. Look at games like NBA 2K21 and FIFA 21, which have been optimized to use the SSDs: you don't see a 2x difference in load times. It's a toss-up.

But I do agree the PS5 will be able to send in more data per frame; I think that's unequivocal. The question, though, is which system streams textures into RAM more efficiently. If the PS5 doesn't have the same hardware/software for texture streaming, it could be sending in some unused textures, for example, and thus wasting its I/O throughput.

That's why I don't think we're going to see a gap equivalent to 2x in things like asset streaming. The PS5 definitely has the faster SSD and can send more data into RAM, though.
 
The bigger the pool of RAM, the less impact the bandwidth of the SSD has on the overall system, since a larger pool mitigates the need to call data from the SSD at the moment it is needed.

The size of RAM becomes almost irrelevant the moment the APU/GPU/CPU requests data that's not in it and must be called from the SSD.

Bandwidth between the SSD and RAM is one of the variables that dictate how quickly calls to the SSD can be serviced. The higher the bandwidth of the SSD, the more robust the virtual memory system becomes.

It's hard to imagine the concept when you are talking about 32GB of GDDR while we are still in the midst of games built to contend with a fraction of that memory. But in a reality where games move enough data that a pool of 32GB of RAM becomes a constraint, the bandwidth of the SSD will have more influence on the overall system. The higher the better, unless the increase in bandwidth somehow comes at the cost of latency.

That doesn't invalidate my point. All I was saying is that MSFT and Sony looked at how much memory was needed or possible in next-gen systems, looked at the cost of SSDs, and chose to use a combination of faster SSDs and better hardware decompression to send data into RAM. They'll do the same for 10th-gen systems.

The size of RAM becomes almost irrelevant the moment the APU/GPU/CPU requests data that's not in it and must be called from the SSD.

I was talking about the size of RAM with regard to disk I/O, not paging. You're talking about the size of RAM with regard to paging!

The size of RAM absolutely has an effect on the disk I/O throughput required. We had RAM increase 16-fold from 7th to 8th gen (512MB to 8GB) while HDD speeds remained below 140MB/s. Hence long load times and inefficient asset streaming; with larger sizes of RAM, it would have been worse. Now we have 2x the RAM at 16GB, and the effective disk I/O throughput is 4.8-9GB/s. That is more than sufficient for the size of RAM. That's all.
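A quick sketch of that generational math (figures as quoted in this thread; the 9th-gen rates are effective, i.e. post-decompression):

```python
# Seconds to fill the entire RAM pool, generation by generation.
# Figures as quoted in the thread; 9th-gen rates include decompression.

def fill_seconds(ram_gb: float, io_gbps: float) -> float:
    return ram_gb / io_gbps

print(fill_seconds(0.5, 0.14))  # 7th gen: 512MB over a ~140MB/s HDD  -> ~3.6 s
print(fill_seconds(8.0, 0.14))  # 8th gen: 8GB over the same HDD      -> ~57 s
print(fill_seconds(16.0, 4.8))  # 9th gen: 16GB at 4.8 GB/s effective -> ~3.3 s
print(fill_seconds(16.0, 9.0))  # 9th gen: 16GB at 9.0 GB/s effective -> ~1.8 s
```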
 
MSFT and Sony looked at how much memory was needed or possible in next-gen systems

Well yeah, makes sense, doesn't it? They struck the most reasonable balance between cost and performance, and found that 16GB and a 2.4GB/s or 5.5GB/s SSD, respectively, suited their machines best. It's all about (the right) compromises. If cost etc. wasn't an issue there would probably be different numbers, but that's never the case for a $500 console.
 
The size of RAM absolutely has an effect on the disk I/O throughput required. We had RAM increase 16-fold from 7th to 8th gen (512MB to 8GB) while HDD speeds remained below 140MB/s. Hence long load times and inefficient asset streaming; with larger sizes of RAM, it would have been worse. Now we have 2x the RAM at 16GB, and the effective disk I/O throughput is 4.8-9GB/s. That is more than sufficient for the size of RAM. That's all.
This is not 100% correct. Yes, the memory size increased a lot (5.5 GiB available to games), but the I/O also heavily increased. Games had to be read from optical discs on PS3/360. The bandwidth wasn't that charming, and the latency ... oh boy ...
Yes, HDD installations were also possible, but ultimately games had to work from disc, so they couldn't really rely on the HDD's much better speed and latency.

With PS4 and XB1 the HDD became the new norm for games. And the games that got patches to load quicker just before the PS5 launch showed that lower loading times were achievable. Loading times were just not a priority before, but it was possible to get them done much quicker; it was just "good enough" for last gen.


On the I/O bandwidth topic:
Both consoles have good enough I/O to get more or less the same work done. Yes, in some edge cases the PS5 might have an edge because of its generally higher I/O bandwidth, but that is only important up to a certain point. Still, it is always better to have headroom there.
At the same time, the Xbox has lower latencies in its I/O solution: there is only one memory chip, so data is not split between several chips, which means the clock speed must be much higher than that of any one chip in the PS5's system. That should lead to reduced latencies, but not higher I/O. E.g., this could mean you can call data in (at least a small chunk) closer to the point where you actually need it, at least in theory. It might be just a coincidence that SFS is also more or less optimized to read only as much data as needed, so many small read requests come in, and that's where the lower latency could actually mean something. I'm not saying that one solution is better than the other, but this one just seems optimized to reduce latency wherever possible.

In the end, both solutions should deliver enough speed that main memory shouldn't become a problem too soon. The biggest problem I see is simply the size of the SSD; games tend to grow in size over a generation ...
 
I suppose if the SSDs can organize and retrieve assets without redundantly writing them into streams, there'd be some games that could share a large pool of asset resources, especially with things being so dynamic and not needing to bake so much into the textures (like ray-tracing, lighting, reflections, etc.) or to model things specifically into character models (a character can wear X clothing and have it drape and move naturally).
So an extreme example would be Ocarina of Time and Majora's Mask: Majora's Mask reuses a good chunk of Ocarina's assets, with additional resources on top. Their install size could be much smaller if they drew from the same pool.

At some point, you might not even need things like normal/parallax maps and other hand-tuned models. So the effort is in designing things rather than hand-tuning them to squeeze into a game world.

*Actually, seeing that game Dreams, they're probably doing a lot of that already (and Mario Maker, etc.).
 
how do you arrive at 5.5MB/ms for example?

They converted seconds into milliseconds, I'm guessing to keep in line with frame-timing measurements for framerates (8.3 ms, 16.6 ms, 33.3 ms, etc.).

So it's basically looking at the transfer rate in relation to how it might impact frame times.
 
That doesn't invalidate my point. All I was saying is that MSFT and Sony looked at how much memory was needed or possible in next-gen systems, looked at the cost of SSDs, and chose to use a combination of faster SSDs and better hardware decompression to send data into RAM. They'll do the same for 10th-gen systems.



I was talking about the size of RAM with regard to disk I/O, not paging. You're talking about the size of RAM with regard to paging!

The size of RAM absolutely has an effect on the disk I/O throughput required. We had RAM increase 16-fold from 7th to 8th gen (512MB to 8GB) while HDD speeds remained below 140MB/s. Hence long load times and inefficient asset streaming; with larger sizes of RAM, it would have been worse. Now we have 2x the RAM at 16GB, and the effective disk I/O throughput is 4.8-9GB/s. That is more than sufficient for the size of RAM. That's all.

I don't think RAM size is the only factor, or possibly even the biggest one, though. IMO RAM bandwidth and latency have the bigger impact; RAM capacity is likely more of a secondary factor.

Take one system with 4 GB of HBM2E RAM at 1 TB/s and another system with 8 GB of GDDR6 RAM at 224 GB/s; the disk I/O for the former is going to need to be a lot faster than that of the latter to refresh the HBM2E quickly enough to keep the GPU fed with data. If the disk I/O is too slow and the GPU is looking for new data in the HBM2E that isn't there, that's a bottleneck.

This of course assumes you have appropriately powerful GPUs in both of those examples. Say the system with the weaker GPU has the faster RAM and vice versa; well, now they're both bottlenecked. The former because the GPU might not even be able to access the RAM that frequently if it takes longer to process the read data for output, the latter because the GPU won't be serviced quickly enough.

There are always a few ways around that type of stuff; for example, it doesn't matter as much if the more powerful GPU has slower RAM and even slow disk I/O if it has a fat amount of cache on the die. But calling that a balanced design would probably be stretching it.
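One way to sketch that balance is to compare how quickly the GPU could read through all of RAM against how quickly the disk could refill it (illustrative numbers only, pairing the RAM examples above with effective disk rates from earlier in the thread; real workloads reuse most of what's resident, so this is an upper bound on the mismatch):

```python
# GPU drain time vs disk refill time for a RAM pool.
# Illustrative numbers; real workloads reuse resident data heavily.

def drain_vs_refill(ram_gb: float, ram_bw_gbps: float, disk_gbps: float):
    drain = ram_gb / ram_bw_gbps   # GPU touching every byte once
    refill = ram_gb / disk_gbps    # disk replacing every byte
    return drain, refill

print(drain_vs_refill(4.0, 1000.0, 9.0))  # HBM2E system: 0.004 s vs ~0.44 s
print(drain_vs_refill(8.0, 224.0, 2.4))   # GDDR6 system: ~0.036 s vs ~3.3 s
```

The wider that gap, the more the design leans on caching and data reuse rather than streaming.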
 
They converted seconds into milliseconds, I'm guessing to keep in line with frame-timing measurements for framerates (8.3 ms, 16.6 ms, 33.3 ms, etc.).

So it's basically looking at the transfer rate in relation to how it might impact frame times.
Thank you.

I don't think RAM size is the only factor, or possibly even the biggest one, though. IMO RAM bandwidth and latency have the bigger impact; RAM capacity is likely more of a secondary factor.

Take one system with 4 GB of HBM2E RAM at 1 TB/s and another system with 8 GB of GDDR6 RAM at 224 GB/s; the disk I/O for the former is going to need to be a lot faster than that of the latter to refresh the HBM2E quickly enough to keep the GPU fed with data. If the disk I/O is too slow and the GPU is looking for new data in the HBM2E that isn't there, that's a bottleneck.

This of course assumes you have appropriately powerful GPUs in both of those examples. Say the system with the weaker GPU has the faster RAM and vice versa; well, now they're both bottlenecked. The former because the GPU might not even be able to access the RAM that frequently if it takes longer to process the read data for output, the latter because the GPU won't be serviced quickly enough.

There are always a few ways around that type of stuff; for example, it doesn't matter as much if the more powerful GPU has slower RAM and even slow disk I/O if it has a fat amount of cache on the die. But calling that a balanced design would probably be stretching it.

I'm not against what you're saying. But at the end of the day you want a large amount of RAM so that data sits closer to the CPU/GPU. Disk storage is just an I/O device, while RAM is the actual global memory of the CPU. That's the most important thing, alongside enough memory bandwidth for whatever processor will be accessing the device. You want higher memory bandwidth for highly parallel workloads on GPUs, hence more memory bytes per cycle on GPUs than on CPUs.

But I was focused on what developers wanted after MSFT & Sony ascertained the amount of RAM they needed (at least 16GB), i.e. the data path between memory and disk I/O, so memory bandwidth didn't really play a role in this argument. We know disk I/O was definitely becoming a bottleneck because of the use of HDDs. The aim of adding SSDs is not to have SSDs replace RAM or constantly fill up RAM; it's simply to have them fast enough that devs can utilize the RAM more efficiently: larger working sets in RAM and much better demand paging. At the end of the day it is much more cost-effective to use an SSD with decompression hardware to augment the disk I/O than to chase the fastest possible SSD. And that's what they did, and what they'll do with 10th-gen hardware as well.
 