Next-Generation NVMe SSD and I/O Technology [PC, PS5, XBSX|S]

They are owned by Epic now, so I'm no longer sure what the licensing looks like. I do wonder if they make them free for UE4/5 games.
 
PC NVMe exceeding 13 GB/s.
And that's roughly a fifth of the theoretical bandwidth of a 16-lane PCIe Gen 5 bus. The PCIe Gen 6 specification was finalised in October and roughly doubles those numbers.

You don't want to be buying a PCIe 4.0 motherboard right now. :nope:
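
As a rough sanity check on those numbers, here's a back-of-the-envelope sketch (it assumes 32 GT/s per lane with 128b/130b encoding for Gen 5 and ignores protocol overhead, so real-world throughput will be lower still):

```cpp
#include <cstdio>

int main() {
    // PCIe 5.0: 32 GT/s per lane with 128b/130b line coding (same coding as Gen 3/4).
    constexpr double lane_gen5_GBs = 32.0 * (128.0 / 130.0) / 8.0; // ~3.94 GB/s per lane, per direction
    constexpr double x16_gen5_GBs  = 16 * lane_gen5_GBs;           // ~63 GB/s for a x16 link
    constexpr double ssd_GBs       = 13.0;                         // the drive quoted above

    std::printf("PCIe 5.0 x16 ~ %.0f GB/s; a 13 GB/s SSD uses ~%.0f%% of that\n",
                x16_gen5_GBs, 100.0 * ssd_GBs / x16_gen5_GBs);     // ~63 GB/s, ~21%
    // PCIe 6.0 doubles the per-lane signalling rate again to 64 GT/s (PAM4 + FLIT encoding).
    return 0;
}
```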
 
And that's roughly a fifth of the theoretical bandwidth of a 16-lane PCIe Gen 5 bus. The PCIe Gen 6 specification was finalised in October and roughly doubles those numbers.

You don't want to be buying a PCIe 4.0 motherboard right now. :nope:
Depends on what you're planning on doing. Let's be real... Gen 5 and 6 aren't going to be utilized by games any time soon. We're still waiting for developers to start making use of 2.4 GB/s of raw bandwidth first and foremost. Let's pass that barrier before we ask for more lol.
 
And that's roughly a fifth of the theoretical bandwidth of a 16-lane PCIe Gen 5 bus. The PCIe Gen 6 specification was finalised in October and roughly doubles those numbers.

You don't want to be buying a PCIe 4.0 motherboard right now. :nope:

If you want to use a GPU alongside that PCIe 5.0 SSD, you'll also need to make sure you get a mobo with two 8x PCIe 5.0 slots rather than just one 16x slot. It will limit the GPU bandwidth, but you'll still have as much as a 16x PCIe 4.0 slot, so it shouldn't be a problem.

It's a shame Intel chose to limit its dedicated 4x storage link on Alder Lake to PCIe 4.0.
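
To put rough numbers on that bifurcation trade-off (same back-of-the-envelope assumptions as the sketch above):

```cpp
#include <cstdio>

int main() {
    // Approximate per-lane bandwidth with 128b/130b encoding, per direction.
    constexpr double gen4_lane_GBs = 16.0 * (128.0 / 130.0) / 8.0; // ~1.97 GB/s
    constexpr double gen5_lane_GBs = 32.0 * (128.0 / 130.0) / 8.0; // ~3.94 GB/s

    std::printf("PCIe 4.0 x16: ~%.1f GB/s\n", 16 * gen4_lane_GBs); // ~31.5 GB/s
    std::printf("PCIe 5.0 x8:  ~%.1f GB/s\n",  8 * gen5_lane_GBs); // ~31.5 GB/s - same ballpark
    return 0;
}
```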
 
What speed becomes excessive/redundant/wasteful though? Video editing of multiple 8K streams - yep, very useful. But as talked about with the PS5's paltry 5.5 GB/s (even 10 GB/s compressed, yada yada), that much unique content is too expensive to even make, filling RAM requires too much expensive RAM, and using streamed assets instead reduces the BW requirements to a minimum. I can't think of any application beyond high-def video editing where that much BW is valuable and usable (and you still need the processing power to be able to process all that content too). For anything else you do with a computer, including gaming, it seems to me overkill.
 
What speed becomes excessive/redundant/wasteful though?
For gaming, essentially anything beyond the baseline of what games on console are doing this gen, I'd say.

It won't be like last generation, where consoles had HDDs and we could run SSDs on PC to cut loading times dramatically. Ultra-fast loading is going to become the norm on consoles this generation, so the advantages of minimizing loading times further will be minuscule.

Pop-in is also something that's going to be heavily addressed this generation through the use of fast SSDs in the consoles. It's not impossible that faster SSDs could reduce this further, but I'd again bet that the advantages will be minimal thanks to diminishing returns.

I suppose you could argue the ability to maybe push higher texture resolutions and whatnot, but *again* - diminishing returns. Such diminishing returns were already pretty noticeable this past generation in terms of texture resolution, where it honestly stopped being a limiting factor to the impressiveness of graphics in lots of bigger games, especially in the back half of the gen.

I feel like even a decent PCIe 3.0 NVMe will likely be enough for the whole generation, with extremely minimal advantages to going beyond that. Maybe later on, Sony ports over some blockbuster PS5 exclusives where using the highest settings requires something faster, but I'd bet you still wouldn't even need a top-end 4.0 drive for this.

Theoretically, there are definitely things game developers could do with even faster SSDs, but I just don't see them going out of their way to accommodate this just for high-end SSD users on PC. It's going to be one of those 'good enough' things, and PC settings scaling beyond console versions of games will be directed at simpler GPU-heavy options as usual.
 
What speed becomes excessive/redundant/wasteful though? Video editing of multiple 8K streams - yep, very useful. But as talked about with the PS5's paltry 5.5 GB/s (even 10 GB/s compressed, yada yada), that much unique content is too expensive to even make, filling RAM requires too much expensive RAM, and using streamed assets instead reduces the BW requirements to a minimum. I can't think of any application beyond high-def video editing where that much BW is valuable and usable (and you still need the processing power to be able to process all that content too). For anything else you do with a computer, including gaming, it seems to me overkill.

I would think future games beyond "The Medium" (i.e., rendering separate/multiple worlds within the same screen space or scene, each having its own unique high-frequency detailed textures/materials) should benefit from such high SSD/NVMe bandwidth speeds.
 
What speed becomes excessive/redundant/wasteful though? Video editing of multiple 8K streams - yep, very useful. But as talked about with the PS5's paltry 5.5 GB/s (even 10 GB/s compressed, yada yada), that much unique content is too expensive to even make, filling RAM requires too much expensive RAM, and using streamed assets instead reduces the BW requirements to a minimum. I can't think of any application beyond high-def video editing where that much BW is valuable and usable (and you still need the processing power to be able to process all that content too). For anything else you do with a computer, including gaming, it seems to me overkill.

For gaming, essentially anything beyond the baseline of what games on console are doing this gen, I'd say.

It won't be like last generation, where consoles had HDDs and we could run SSDs on PC to cut loading times dramatically. Ultra-fast loading is going to become the norm on consoles this generation, so the advantages of minimizing loading times further will be minuscule.

Pop-in is also something that's going to be heavily addressed this generation through the use of fast SSDs in the consoles. It's not impossible that faster SSDs could reduce this further, but I'd again bet that the advantages will be minimal thanks to diminishing returns.

I suppose you could argue the ability to maybe push higher texture resolutions and whatnot, but *again* - diminishing returns. Such diminishing returns were already pretty noticeable this past generation in terms of texture resolution, where it honestly stopped being a limiting factor to the impressiveness of graphics in lots of bigger games, especially in the back half of the gen.

I feel like even a decent PCIe 3.0 NVMe will likely be enough for the whole generation, with extremely minimal advantages to going beyond that. Maybe later on, Sony ports over some blockbuster PS5 exclusives where using the highest settings requires something faster, but I'd bet you still wouldn't even need a top-end 4.0 drive for this.

Theoretically, there are definitely things game developers could do with even faster SSDs, but I just don't see them going out of their way to accommodate this just for high-end SSD users on PC. It's going to be one of those 'good enough' things, and PC settings scaling beyond console versions of games will be directed at simpler GPU-heavy options as usual.

Yes, I agree - halving load times from, say, 2 seconds to 1 second is hardly game-changing. It's also likely that games will be bottlenecked by CPU speed at load time rather than raw SSD throughput anyway.

Certainly higher-resolution textures and more detailed models could use up some of that additional bandwidth, but aside from the diminishing returns there, it also presupposes that the PS5, for example, was being pushed to its I/O limits with console-level assets, which seems unlikely outside of very rare corner cases.

I think the biggest challenge on PC right now isn't the raw hardware speed, which is already more than fast enough; it's getting DirectStorage rolled out and then developers making good use of it - something we've not really seen much of even on the consoles so far (outside of quick resume).
 
I think the biggest challenge on PC right now isn't the raw hardware speed, which is already more than fast enough; it's getting DirectStorage rolled out and then developers making good use of it - something we've not really seen much of even on the consoles so far (outside of quick resume).

Yea, I just skimmed through the MGS presentation on DirectStorage again and, you know, it's an absolutely perfect example of the industry simply needing sufficient motivation to improve and integrate things. The changes they are making, both to Windows (the storage stack) and to the end-to-end path for bandwidth-heavy data (like textures and geometry), are all very logical things that have been thought of and could have been done before. Looking back on the ideas they've proposed with DirectStorage, it's just the obviously correct way to do things, considering the circumstances. I guess you could argue that they needed NVMe SSD tech to really make it practical, however. The current implementation of DirectStorage (GPU-based decompression) is actually quite an elegant solution to the PC industry's problem that hardware adoption takes ages. So while they build and integrate real hardware-based decompression into PC silicon, we have a workable solution in the meantime. The optimizations to the Windows storage stack don't require any work from the developer to implement (afaik), and the decompression tech will work on all modern GPUs, so all the hardware is already there, and most modern PCs that would actually run next-gen games likely already have NVMe drives in them.

Considering that, the only real hurdle for developers is to implement DirectStorage in their engines, which basically entails optimizing the engine to only load what it requires in a very fine-grained fashion (SFS) and then utilize the decompression tech. Hopefully that's happening, and I'm fairly confident it is... the only reason we're not seeing the effects of it yet being that we're still finishing a cross-gen period.
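
For anyone curious what that looks like from the engine side, here's a minimal sketch of a single DirectStorage read with GPU (GDeflate) decompression, loosely following the public dstorage.h header. The function name, file name, buffer and sizes are placeholders, error handling is omitted, and exact struct fields can differ between SDK versions:

```cpp
#include <dstorage.h>
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

// Assumes an existing D3D12 device, a destination buffer and a fence (all hypothetical here).
void LoadAssetViaDirectStorage(ID3D12Device* device, ID3D12Resource* destBuffer,
                               ID3D12Fence* fence, UINT64 fenceValue,
                               UINT32 compressedSize, UINT32 uncompressedSize)
{
    ComPtr<IDStorageFactory> factory;
    DStorageGetFactory(IID_PPV_ARGS(&factory));

    ComPtr<IDStorageFile> file;
    factory->OpenFile(L"assets.pak", IID_PPV_ARGS(&file));     // hypothetical pack file

    DSTORAGE_QUEUE_DESC queueDesc{};
    queueDesc.SourceType = DSTORAGE_REQUEST_SOURCE_FILE;
    queueDesc.Capacity   = DSTORAGE_MAX_QUEUE_CAPACITY;
    queueDesc.Priority   = DSTORAGE_PRIORITY_NORMAL;
    queueDesc.Device     = device;

    ComPtr<IDStorageQueue> queue;
    factory->CreateQueue(&queueDesc, IID_PPV_ARGS(&queue));

    DSTORAGE_REQUEST request{};
    request.Options.SourceType        = DSTORAGE_REQUEST_SOURCE_FILE;
    request.Options.DestinationType   = DSTORAGE_REQUEST_DESTINATION_BUFFER;
    request.Options.CompressionFormat = DSTORAGE_COMPRESSION_FORMAT_GDEFLATE; // decompressed on the GPU
    request.Source.File.Source        = file.Get();
    request.Source.File.Offset        = 0;
    request.Source.File.Size          = compressedSize;
    request.Destination.Buffer.Resource = destBuffer;
    request.Destination.Buffer.Offset   = 0;
    request.Destination.Buffer.Size     = uncompressedSize;
    request.UncompressedSize            = uncompressedSize;

    queue->EnqueueRequest(&request);
    queue->EnqueueSignal(fence, fenceValue); // the fence tells us when the data is ready to use
    queue->Submit();
}
```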

I honestly can't wait to start seeing some of these upcoming GDC talks. :)
 
What speed becomes excessive/redundant/wasteful though? Video editing of multiple 8K streams - yep, very useful. But as talked about with the PS5's paltry 5.5 GB/s (even 10 GB/s compressed, yada yada), that much unique content is too expensive to even make, filling RAM requires too much expensive RAM, and using streamed assets instead reduces the BW requirements to a minimum. I can't think of any application beyond high-def video editing where that much BW is valuable and usable (and you still need the processing power to be able to process all that content too). For anything else you do with a computer, including gaming, it seems to me overkill.
Yea, for my application it would be useful; its bottleneck is the PCIe speed. One of the challenges is that not all codecs can be decoded in hardware on the GPU, so you'll be using the CPU to do some work, firing it over to the GPU to do more work, and firing it back.

If you had PCIe speeds approaching TB+ bandwidth, that would be a very interesting time in programming, because discrete CPUs and GPUs would behave similarly to a unified SoC. DS models would be super interesting.
 
What speed becomes excessive/redundant/wasteful though?
PCIe is a bit of a mixed bag in terms of deliverable bandwidth. Because it focuses on reducing latency, the theoretical numbers are rarely ever achieved. It's worth keeping in mind that the PCIe bus loses a lot of its 'theoretical' performance balancing CPU - GPU, VRAM - DRAM, audio, I/O and a bunch of other things.
If you had PCIe speeds approaching TB+ bandwidth, that would be a very interesting time in programming, because discrete CPUs and GPUs would behave similarly to a unified SoC.

Thunderbolt combines PCIe, DP and power. Is there a hardware configuration where TB exceeds PCIe?
 
Thunderbolt combines PCIe, DP and power. Is there a hardware configuration where TB exceeds PCIe?
In my mind, I don’t believe so. Thunderbolt is often rated in Gbps, not the GB/s that PCIe is. PCIe is significantly faster than Thunderbolt IIRC.

So when you see something like 80 Gbps for Thunderbolt, that is still slower than a 16x PCIe port.
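
For reference, the unit conversion being talked about (just dividing by 8; this ignores encoding and protocol overhead on both links, and the 80 Gbps figure is the hypothetical one above):

```cpp
#include <cstdio>

int main() {
    constexpr double tb_Gbps       = 80.0;             // hypothetical Thunderbolt-class link
    constexpr double tb_GBs        = tb_Gbps / 8.0;    // 10 GB/s
    constexpr double pcie4_x16_GBs = 16 * 16.0 / 8.0;  // ~32 GB/s raw for a 16x PCIe 4.0 link

    std::printf("80 Gbps = %.0f GB/s vs PCIe 4.0 x16 ~ %.0f GB/s\n", tb_GBs, pcie4_x16_GBs);
    return 0;
}
```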
 
Thunderbolt is often rated in Gbps, not the GB/s that PCIe is. PCIe is significantly faster than Thunderbolt IIRC. So when you see something like 80 Gbps for Thunderbolt, that is still slower than a 16x PCIe port.

The conversion is trivial, but this is definitely not correct in general terms; it sounds like you have had experience with a poorly designed device. Thunderbolt is PCIe, so it's not faster or slower - which is why I queried the earlier post.

What you're describing above is the typical shenanigans of marketing cheaper products, where they quote the maximum bandwidth of the port (but not the controller), or the maximum bandwidth of one particular port, or of any one port when used in isolation. Outside of Apple's hardware, it feels like Thunderbolt is a mess, and it's common to see it implemented on Windows machines where only one of multiple Thunderbolt ports is able to achieve the maximum throughput and if you use another it'll be half-speed, or where all the ports are capable of delivering full speed but not simultaneously, because there aren't sufficient controllers for all ports to reach maximum bandwidth.

Trust me, Intel know what they're doing with Thunderbolt, and they tried to solve most of these issues (particularly branding and marketing) with Thunderbolt 4 certification, but it seems like only Apple bothered with the thunderbolt_readme.txt before implementing and marketing the ports. ¯\_(ツ)_/¯
 
Pop-in is also something that's going to be heavily addressed this generation through the use of fast SSDs in the consoles. It's not impossible that faster SSDs could reduce this further, but I'd again bet that the advantages will be minimal thanks to diminishing returns.
Pop-in is (now) most of the time an engine-related problem and not really a bandwidth-related thing. Even in titles where we throw massive bandwidth at the games, they still have pop-in problems. Those objects are often just not considered to be in the scene by the engine (e.g. we see this in racing games, even for objects that should already be in memory). I don't mean higher-res texture pop-in like the famous door in FF7 Remake.
A problem we get with better and better graphics is that pop-in can be really visible, because it is now one of the biggest anomalies in a scene. Graphics are just so good that your attention is immediately caught by those (sometimes even small) pop-ins. The problem here is that the engine must also request the object fast enough and try to get it into the scene without the pop-in effect: it must slowly enter the scene as if it was always there, just not visible from a millimetre further away.
So I really don't think the pop-in effect will go away, but it might get reduced a bit.

The raw bandwidth isn't really going to change that much.
 
The conversion is trivial, but this is definitely not correct in general terms; it sounds like you have had experience with a poorly designed device. Thunderbolt is PCIe, so it's not faster or slower - which is why I queried the earlier post.

What you're describing above is the typical shenanigans of marketing cheaper products, where they quote the maximum bandwidth of the port (but not the controller), or the maximum bandwidth of one particular port, or of any one port when used in isolation. Outside of Apple's hardware, it feels like Thunderbolt is a mess, and it's common to see it implemented on Windows machines where only one of multiple Thunderbolt ports is able to achieve the maximum throughput and if you use another it'll be half-speed, or where all the ports are capable of delivering full speed but not simultaneously, because there aren't sufficient controllers for all ports to reach maximum bandwidth.

Trust me, Intel know what they're doing with Thunderbolt, and they tried to solve most of these issues (particularly branding and marketing) with Thunderbolt 4 certification, but it seems like only Apple bothered with the thunderbolt_readme.txt before implementing and marketing the ports. ¯\_(ツ)_/¯
I mean, I don't think Thunderbolt currently exceeds 4x PCIe lanes. With respect to my issue, I need more bandwidth to the GPU, which is already using 16x lanes. If Thunderbolt could do it faster and better I would switch, but I've not seen anything like that come up as an ideal setup for a desktop PC.
 
I mean, I don't think Thunderbolt currently exceeds 4x PCIe lanes. With respect to my issue, I need more bandwidth to the GPU, which is already using 16x lanes. If Thunderbolt could do it faster and better I would switch, but I've not seen anything like that come up as an ideal setup for a desktop PC.
Thunderbolt 4 mandates exactly (no more, no less than) four PCIe 3.0 lanes (so 32 Gbit/s).

How else would you be connecting an external GPU if not over Thunderbolt? Some kind of bespoke dock?
 
Thunderbolt 4 mandates exactly (no more, no less than) four PCIe 3.0 lanes (so 32 Gbit/s).

How else would you be connecting an external GPU if not over Thunderbolt? Some kind of bespoke dock?
Not sure; I wasn't sure if there was a way to bypass PCIe and go straight to Thunderbolt, but I guess you confirmed you cannot.
So I'm not understanding why Thunderbolt came into the discussion around bandwidth when it's clearly going to have less than PCIe?
 