Next-Generation NVMe SSD and I/O Technology [PC, PS5, XBSX|S]

You're forgetting the single most important number here: memory bandwidth.
32GB at the same bandwidth as the 16GB gets you almost nothing.

My favorite way to think about these things is actually memory bandwidth per frame.
Let's take the PS5 as an example.

Mem BW = 448 GB/s.
At 60fps, that means a single frame can access a total of about 7.47 GB of data.
That's plenty of textures, models, render targets etc.
This also means that if you have a total of 16GB of RAM, you're NEVER going to access your entire RAM in the
process of rendering a single frame.
Now a 30fps game can access pretty close to the entire 16GB (about 14.9 GB) in the process of rendering a frame,
but even at 30fps, with 32GB of RAM over half would be wasted.
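
The per-frame arithmetic above can be checked with a quick sketch. The 448 GB/s figure is the one from this post; the function name and the pool sizes passed in are only illustrative, and the numbers are theoretical peaks that real workloads won't reach.

```python
def per_frame_budget_gb(bandwidth_gb_s: float, fps: float) -> float:
    """Upper bound on the data (GB) the memory bus can move during one frame."""
    return bandwidth_gb_s / fps


PS5_BW_GB_S = 448.0  # peak bandwidth figure quoted in the post

for fps in (60, 30):
    budget = per_frame_budget_gb(PS5_BW_GB_S, fps)
    for pool_gb in (16, 32):
        # Fraction of the pool that could be touched once per frame, capped at 100%.
        touched = min(1.0, budget / pool_gb)
        print(f"{fps} fps: {budget:.2f} GB/frame, at most {touched:.0%} of a {pool_gb} GB pool")
```

Even in the best case, a 32GB pool at this bandwidth stays under half touched per frame at 30fps, which is the "over half would be wasted" point.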

Now let's think about what a fast SSD gives us.
1. The ability to refresh our smaller pool of RAM faster.
2. The ability to cache more game state at any single time.

The faster refilling of actual RAM is good because it simplifies the entire process.
HDDs, or even slow SSDs, are not quick enough to enable on-demand paging systems for textures or other data structures.

The ability to cache more game state is a HUGE step forward, imho.
Richer, more living worlds, where game state only needs to be updated on a 1-10 fps basis (i.e. "the living world" that isn't immediately on screen),
will see a huge step forward this gen.
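
To make the "1-10 fps" idea concrete, here is a minimal sketch of throttled background simulation. Everything in it (the `Region` class, the 5 Hz background rate, the region names) is made up for illustration and is not taken from any real engine.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    on_screen: bool
    ticks: int = 0  # how many times this region's state has been updated


def update_interval(region: Region, render_fps: int, background_hz: int) -> int:
    """Frames between updates: every frame if visible, otherwise throttled."""
    return 1 if region.on_screen else max(1, render_fps // background_hz)


def simulate(regions, frames: int, render_fps: int = 60, background_hz: int = 5):
    """Step the world for a number of rendered frames, updating each region at its own rate."""
    for frame in range(frames):
        for region in regions:
            if frame % update_interval(region, render_fps, background_hz) == 0:
                region.ticks += 1
    return regions


world = simulate([Region("town square", True), Region("distant farm", False)], frames=60)
for r in world:
    print(r.name, r.ticks)
```

Over one second at 60fps the visible region is updated every frame while the off-screen one is updated only a handful of times, so its state can sit in a slower tier between updates.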

Now if you're talking about 32GB @ 2 x 448GB/s, yeah sure, double the RAM starts to make sense,
but now you're no longer at price parity, even with no SSD. That's 3090 prices.

Overall I think almost all developers would opt for the 16GB + SSD option.
Modern computing is well versed in the techniques for handling multi-level caching, and addressing across
memories of different BW and latency. But that only works once you get to decent SSD speeds; it all falls apart with a HDD.

I think the MS approach of letting the SSD act as sort of a victim cache for the RAM is a very smart idea,
and does in theory give them access to 100+GB of total memory, which is all HW managed for best performance.
But it's a somewhat unique mechanism, and may not be fully exploited.
(I dunno if you can use the PS5 the same way.)
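
For anyone unfamiliar with the term, a victim cache is just a second tier that catches whatever the first tier evicts. The sketch below is a toy software model of that idea under my own assumptions (the class name, LRU eviction, and slot counts are all hypothetical); it says nothing about how the Xbox hardware actually does it.

```python
from collections import OrderedDict


class TwoTierCache:
    """Small fast tier (stand-in for RAM) backed by a larger victim tier (stand-in for SSD)."""

    def __init__(self, ram_slots: int, ssd_slots: int):
        self.ram = OrderedDict()  # key -> value, least recently used first
        self.ssd = OrderedDict()  # entries evicted from the RAM tier
        self.ram_slots = ram_slots
        self.ssd_slots = ssd_slots

    def put(self, key, value):
        self.ssd.pop(key, None)    # a key lives in at most one tier
        self.ram[key] = value
        self.ram.move_to_end(key)  # mark as most recently used
        while len(self.ram) > self.ram_slots:
            old_key, old_val = self.ram.popitem(last=False)  # evict LRU entry from RAM
            self.ssd[old_key] = old_val                      # and park it in the victim tier
            while len(self.ssd) > self.ssd_slots:
                self.ssd.popitem(last=False)                 # oldest victim is dropped for good

    def get(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)
            return self.ram[key], "ram"
        if key in self.ssd:
            value = self.ssd.pop(key)
            self.put(key, value)   # promote back into RAM, possibly evicting something else
            return value, "ssd"
        return None, "miss"


cache = TwoTierCache(ram_slots=2, ssd_slots=4)
for k in "abc":
    cache.put(k, k.upper())
print(cache.get("a"))  # "a" was pushed out of RAM by "c", so it comes back from the victim tier
print(cache.get("a"))  # now resident in RAM again
print(cache.get("z"))  # never stored
```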
What?

Memory bandwidth needs to scale with GPU performance and not necessarily memory size. Since we're not increasing GPU performance in my scenario then memory bandwidth doesn't really matter.

Having more RAM would most definitely help offset slower SSD performance with intelligent loading/caching techniques.
 
Of course RAM bandwidth needs to scale with actual GPU perf,
exactly why more RAM is a bad design decision for this gen.

More RAM makes no sense without a significantly more powerful GPU AND more memory bandwidth.
The decision to invest in high-speed SSDs is by far the smarter choice.
Especially in a console ecosystem, where there is no "throw more HW at the problem to make things go faster".

Sorry if I got a bit focused on 1 specific reason why.
 
Wouldn't that in itself require more bandwidth as you're going to be accessing more data?
No. Most of the time (in last-gen games) data in memory was just cached data that might have been needed but never was used. That is why techniques were developed to load only the things that are really needed.
But having data cached in RAM solves many problems without the need to always use it. It wouldn't be the most efficient way of using the storage, but it's the easiest, and it doesn't need higher bandwidth.
You have fewer problems caching last-frame info, ...

More bandwidth is always good, but so is more memory.
 
Sebbbi is a huge advocate of and expert in streamed data though! In designing a console you need to factor in what all the devs can and will do for all the different games. I expect in their ideal platform Bethesda would want more RAM, and then more, and then double that...

His reasoning was you can only access... in fact, it's just easier to paste his reply.

More bandwidth. You don't have the memory bandwidth to access most of the memory anyways (not even once per frame). The extra memory just sits there for future needs. If you use techniques such as DirectStorage for fast streaming, you don't need a lot of excess memory

Future needs = other areas of the same level. For example 400GB/s bandwidth at 60 fps allows you to access 6.6GB of memory per frame. You can't access more than that. The bus would starve. And this is theoretical max, real bandwidth is around 5GB/frame.
 
I messaged the wonderful Sebbbi on Twitter with this same question and he went with more memory bandwidth over more physical RAM.
More bandwidth is never wrong and always helps the GPU/CPU.
More memory is also almost always a good idea (but expensive). Yes, you need bandwidth so the GPU doesn't starve, but at a certain point both are good enough.
Just look at the smaller GPUs in the PC market. They don't really have high bandwidth, but most of the time they have more than enough memory, e.g. to use high-res textures. Those don't need much more bandwidth but do need a lot of memory, as they are decompressed in memory.
Just think of it as having something like a JPG on the SSD, and when you put it into memory you have it there as a bitmap (not exactly like that, but just to give you a direction).
To stay with that example, the faster the SSD is, the quicker you get it into memory. But at a certain point it doesn't matter anymore, as the compressed file itself is small but the full picture needs much more memory and therefore bandwidth to write. If you have enough memory you can cache the picture there until it is needed again. If it is needed, it saves the bandwidth that would otherwise have been required to transfer (and decompress) the image. The image can be used directly, so you save at least one full image write of bandwidth.
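
The JPG-vs-bitmap trade-off can be sketched in a few lines. This is only a toy model under my own assumptions: `zlib` stands in for whatever texture compression a game really uses, a dict stands in for the SSD, and the class name and sizes are invented for the example.

```python
import zlib
from collections import OrderedDict


class DecompressedCache:
    """Keeps recently used decompressed blobs in RAM so they are not decompressed again."""

    def __init__(self, storage: dict, capacity_bytes: int):
        self.storage = storage          # name -> compressed bytes (stand-in for the SSD)
        self.capacity = capacity_bytes  # RAM budget for decompressed data
        self.cache = OrderedDict()      # name -> decompressed bytes, least recently used first
        self.used = 0
        self.decompressions = 0         # each one costs a read plus a full write into RAM

    def get(self, name: str) -> bytes:
        if name in self.cache:
            self.cache.move_to_end(name)  # hit: no SSD read, no decompress, no rewrite
            return self.cache[name]
        data = zlib.decompress(self.storage[name])
        self.decompressions += 1
        self.cache[name] = data
        self.used += len(data)
        # Evict least recently used entries until we fit (always keep the newest one).
        while self.used > self.capacity and len(self.cache) > 1:
            _, evicted = self.cache.popitem(last=False)
            self.used -= len(evicted)
        return data


MIB = 1024 * 1024
raw = bytes(MIB)  # 1 MiB of zeros; compresses to a tiny fraction of its size
storage = {"rock": zlib.compress(raw), "grass": zlib.compress(raw)}

for budget in (1 * MIB, 4 * MIB):
    cache = DecompressedCache(storage, budget)
    for name in ("rock", "grass", "rock"):
        cache.get(name)
    print(f"{budget // MIB} MiB budget: {cache.decompressions} decompressions")
```

With the larger budget the second request for "rock" is served straight from RAM; with the smaller one it has to be fetched and decompressed again, which is exactly the bandwidth the extra memory saves.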
 
It kinda goes without saying that you need more bandwidth to use more memory in a game. I don’t think the argument for more memory ignores that but is rather that these consoles were released somewhat prematurely. Not that it actually matters now and I’m sure MS/Sony both considered that.

That said, PS5 has 2.5x more bandwidth than PS4, so it kinda makes sense that it would also need 2.5x more memory. PS4 games had access to 5GB, which translates to 12.5GB on PS5 and lines up with what Richard Leadbetter said regarding game allocations. Doubling that would need GDDR6 twice as fast, and Samsung only announced 24Gbps variants earlier this summer.
 
Not if you're streaming data more. PS4 had to preload and cache larger amounts of data per level as it couldn't fetch said data JIT. With fast enough storage and a perfect streaming architecture, you could even get away with less RAM. Well, for the graphics. Game worlds will hopefully require more to represent bigger, more complex worlds.
 
Need more bandwidth as the CUs will starve. Memory bandwidth scales with CU amount, it usually doesn’t scale with memory amount. I think this is typical for any type of processor design and in terms of cost; compute has always scaled better than bandwidth.

A PS5 with half the memory plus the SSD would not be able to produce PS5-level performance. That’s too optimistic. The assets don’t represent a significant portion of the VRAM footprint.

Wrt space, I would disagree that you could get away with less without annoyance; see the challenges of the split-pool configuration on the Series consoles.
 

Need more bandwidth as the CUs will starve. Memory bandwidth scales with CU amount, it usually doesn’t scale with memory amount. I think this is typical for any type of processor design and in terms of cost; compute has always scaled better than bandwidth.

The PS4 with an SSD would not be able to produce PS5-level performance.

Obviously 32GB + NVMe would have allowed for more than the (somewhat) lacking 16GB + NVMe. There's allegedly around 12GB for games, which translates into 8 to 10 for graphics... There are previous-gen titles that exceed that kind of VRAM usage on a dedicated GPU with dedicated VRAM and bandwidth. Yeah, optimization and all, but we're also going into another generation now.
SSDs just don't fully substitute for GDDR or even DDR RAM; they're too slow for that, and latency is much worse as well. What the SSD does is remove one of the many bottlenecks. The CPU/GPU and RAM configuration are still the most important components for what you finally see on-screen.

The consoles going with the hardware they got (low to midrange GPU, 2x RAM increase and a mid-end CPU) has a lot more to do with cost/balance/value at the time than with what developers would have dreamed of having. It's the best they could do in 2019/2020 for a 500 to 600 dollar consumer-priced box.

And yes, bandwidth/VRAM size often do scale with GPU performance. Having, say, a full 16GB on a 10TF GPU wouldn't be all that efficient; the whole GPU would need to scale up to something RX 6800/XT class. Then in turn the CPU would need an upgrade as well, since that 3.5GHz Zen 2 would actually bottleneck a 6800 XT class GPU.
 
I don’t think there is enough CU power to be able to make use of so much memory. As per sebbbi, 5GB of data per frame can be used at that particular bandwidth. To use anywhere close to 32GB of space per frame, you would need bandwidth nearly 4-6 times higher, or 1600GB/s - 2400GB/s. The GPU needed to justify such bandwidth would be enormous; at least 40-48TF of power.
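
As a rough sanity check on that multiplier, here is one way to land in the same range. The 400 GB/s base and the roughly 75% efficiency (5GB achieved out of about 6.67GB theoretical per frame) are read off sebbbi's quoted example; the function name is illustrative, and this is not necessarily how the figure above was derived.

```python
def required_bandwidth_gb_s(data_per_frame_gb: float, fps: float, efficiency: float = 1.0) -> float:
    """Peak bandwidth needed to move data_per_frame_gb every frame at a given bus efficiency."""
    return data_per_frame_gb * fps / efficiency


BASE_BW_GB_S = 400.0  # round figure from sebbbi's example

for label, efficiency in (("theoretical peak", 1.0), ("about 75% achieved", 0.75)):
    need = required_bandwidth_gb_s(32, 60, efficiency)
    print(f"{label}: {need:.0f} GB/s, {need / BASE_BW_GB_S:.1f}x the base bandwidth")
```

Touching 32GB once per frame at 60fps needs 1920 GB/s at theoretical peak and noticeably more at realistic efficiency, so the multiplier comes out between roughly 5x and 6.5x.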

The consoles have done it correctly. This is the best setup they could have. The split pool is less ideal, but it’s a cost-saving measure, as Shifty refers to. The 6700+ series also employs a similar strategy.
 
I think this is a really great explanation. Still, I guess more RAM with the same speed SSD and memory bandwidth would be advantageous, because you can cache more game data in RAM, thus negating the need to read from the SSD as often and presumably saving memory bandwidth from all those saved writes. Plus the data in RAM is more readily available to the game (more bandwidth/lower latency) than even the fastest NVMe drives can offer, which presumably opens up further game design options. In the extreme case you could potentially fit the entire game, or at least all assets from an entire level, into RAM and thus eliminate the need for streaming at all.
 
From my working out, PS5 should have just enough bandwidth to access all its RAM at 30fps and a little over half of it at 60fps (based on 12GB of usable developer RAM).

So giving it extra RAM (say bumping it to 32GB) wouldn't really make much difference, as it couldn't access it with the bandwidth it currently has, and it would merely be 16GB of cache for future data.
 
It's what I'm saying: this 'if it had more memory' idea wouldn't be effective. A 10TF GPU usually isn't paired with more than, say, 8 to 12GB of VRAM at most. Especially when it's lacking Infinity Cache. You would need a higher-class GPU, which in turn would be more effective with a stronger/faster CPU, etc. etc. Also, with more RAM you want more bandwidth.
As I said, it's the best they could do at the time for the BOM they had.
The closest to PS5 (6600 XT) is paired with 8GB of GDDR6; that's probably enough VRAM for the entire generation (as per sebbbi) for that class of GPU and what you do with it settings-wise. The 6600 XT does outperform PS5 in most games, even Spider-Man.
 
I’m curious where you’re getting 4-6x from? PS5 only has 2.5x more bandwidth than PS4 but with twice the memory. Less if we’re counting PS4 Pro.
 
I mean the base consoles are fine. The real question isn't what we have today and whether it's good enough. I think this discussion has done a good job at showcasing this was the right solution. The problem I see is where we go from here.
For the next generation of consoles or a mid-gen refresh, it's not like getting even more SSD will make things even better. We will need more compute and more bandwidth to increase graphical fidelity. This SSD imo is a one-time boost. I think PS5 getting a larger step forward on this front likely means it can sit at this particular speed for a generation or two, maybe more. Xbox may find this solution inadequate by next generation, I dunno.

But now that the I/O bottleneck is alleviated, the next generation of consoles is just going to push more on compute and bandwidth until I/O is a problem again, and then we'll fix it and the cycle continues.
The challenge I see for the next consoles is OT, however: I just don't see how we can get a mid-gen refresh hitting 20TF of power with the appropriate supporting bandwidth for $499. That's one hell of a chip, and we'd likely need some sort of on-chip cache; I don't see another way around it.
 
It's always enough; it has been every generation. MS/Sony do the most with the budgets they have in their respective eras. If they wanted more, then the whole system would need to scale up. Last gen didn't get SSDs because they were too expensive at the time. We will most likely see another generation of consoles; storage will get faster, but so will RAM/VRAM and everything else... Guess NVMe will just scale along. We're already at 7.5GB/s today with PCIe 4 before compression, and PCIe 5 is supposedly going to bring us much faster speeds still, so there's room. Whether there's any need for 7+GB/s drive speeds? Probably not much.
Also, it's hard to say yet whether the MS or Sony solution is ahead of the other; they're approaching IO differently.
 