Next-Generation NVMe SSD and I/O Technology [PC, PS5, XBSX|S]

So yeah, I read the linked thread and I still wouldn't take everything Jensen says on faith. The idea that a GPU will outperform the PS5 with 2 TF is a statement from Nvidia. I'll believe it when I see it.
We have people posting their DirectStorage results using GPU decompression, and the GPU usage from the decompression doesn't even register in monitoring software because it's so low.

So you don't have to believe Jensen, you just need to test it yourself.
Btw, I'm talking here about RTX 2xxx cards and maybe 3xxx cards. A PS5 doesn't need to be better than an RTX 4xxx card.
PS5 isn't even better than half the 2000 and 3000 series cards.
A GPU with 100+ TF will of course have enough resources to do the decompression. But again, decompression is one thing; the PS5 has a lot more custom tech to account for the bottlenecks that would follow the decompression. I'm not seeing the PS5 I/O block in its entirety being mimicked by DirectStorage.
The Forspoken developers claiming the game loads in one second even without GPU decompression shows it won't be a problem on PC.
And another thought (I mentioned this before):
If decompressing on the GPU were such a logical and viable thing to do, why didn't they choose that path with the PS5?
They do; they also do it on PS4 and Xbox One.
They could have avoided all the custom I/O shenanigans and had a beefier GPU instead. But they went with the heavily customized I/O solution anyway.
Something is not adding up. There's more to the PS5's I/O, and lovely Jensen flipping a switch and saying "just leave all the work to the GPU" is not going to cut it.

Cost, power, heat... to name a few.
 
If decompressing on the GPU were such a logical and viable thing to do, why didn't they choose that path with the PS5? They could have avoided all the custom I/O shenanigans and had a beefier GPU instead. But they went with the heavily customized I/O solution anyway.

Apples and oranges... sort of.

On PS5, Sony get to determine how the OS works, which compression formats can be used, how files are presented to the game, etc. They can build custom hardware for this with a tiny die cost, far smaller than getting the equivalent performance from a more general-purpose compute shader solution. Sony are using a single-silicon solution.

On PC, flexibility is essential, along with the ability to support hardware from before December 2020. PC can be, and increasingly will be, far more performant, but it might use more power and transistors to achieve this.

On PC, decompressing on the GPU directly into GPU memory (on the far side of the relatively narrow PCIe bus) ultimately makes the most sense. Formats can adapt and change because they aren't hardwired into silicon, and different hardware vendors can choose their own approach. DirectStorage will work with a sizeable array of hardware configurations and at least two major OS revisions.
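For anyone curious what that path looks like in practice, here's a minimal sketch of a DirectStorage 1.1-style request, assuming the DirectStorage SDK headers/libs, an already-created D3D12 device, a destination buffer in VRAM, and a GDeflate-compressed file. The file name, sizes, and fence handling are placeholders of mine, not production code:

```cpp
// Sketch only: enqueue one DirectStorage read that the GPU decompresses
// (GDeflate) straight into a D3D12 buffer. Error handling omitted.
#include <dstorage.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void LoadCompressedAsset(ID3D12Device* device, ID3D12Resource* destBuffer,
                         ID3D12Fence* fence, UINT64 fenceValue,
                         UINT32 compressedSize, UINT32 uncompressedSize)
{
    ComPtr<IDStorageFactory> factory;
    DStorageGetFactory(IID_PPV_ARGS(&factory));

    ComPtr<IDStorageFile> file;
    factory->OpenFile(L"asset.gdeflate", IID_PPV_ARGS(&file)); // placeholder path

    DSTORAGE_QUEUE_DESC queueDesc{};
    queueDesc.Capacity   = DSTORAGE_MAX_QUEUE_CAPACITY;
    queueDesc.Priority   = DSTORAGE_PRIORITY_NORMAL;
    queueDesc.SourceType = DSTORAGE_REQUEST_SOURCE_FILE;
    queueDesc.Device     = device;

    ComPtr<IDStorageQueue> queue;
    factory->CreateQueue(&queueDesc, IID_PPV_ARGS(&queue));

    DSTORAGE_REQUEST request{};
    request.Options.SourceType        = DSTORAGE_REQUEST_SOURCE_FILE;
    request.Options.DestinationType   = DSTORAGE_REQUEST_DESTINATION_BUFFER;
    request.Options.CompressionFormat = DSTORAGE_COMPRESSION_FORMAT_GDEFLATE;
    request.Source.File.Source        = file.Get();
    request.Source.File.Offset        = 0;
    request.Source.File.Size          = compressedSize;       // bytes on disk
    request.Destination.Buffer.Resource = destBuffer;         // lives in VRAM
    request.Destination.Buffer.Offset   = 0;
    request.Destination.Buffer.Size     = uncompressedSize;
    request.UncompressedSize            = uncompressedSize;

    queue->EnqueueRequest(&request);
    queue->EnqueueSignal(fence, fenceValue); // fence signals completion
    queue->Submit();                         // kick off the batch
}
```

Note that the compression format is just a per-request field here, which is the flexibility point above: nothing is baked into the silicon, so formats can evolve.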

Something is not adding up. There's more to the PS5's I/O, and lovely Jensen flipping a switch and saying "just leave all the work to the GPU" is not going to cut it.

When you think of it in terms of "fixed platform vs PC with a crap load of hardware and software vendors", it starts adding up!
 
If decompressing on the GPU were such a logical and viable thing to do, why didn't they choose that path with the PS5? They could have avoided all the custom I/O shenanigans and had a beefier GPU instead. But they went with the heavily customized I/O solution anyway.
Something is not adding up. There's more to the PS5's I/O, and lovely Jensen flipping a switch and saying "just leave all the work to the GPU" is not going to cut it.
Completely different markets and ways of solving the same problem. The way the consoles do it is extremely cost-efficient, but it requires the entire vertical stack to work together. That cannot happen on the PC side.

PC is just parts working together, but the raw performance of decompression via the GPU can easily surpass a fixed-function unit, and because the vertical integration found in consoles doesn't exist there, it is the only way to do it.

Is it better than vertical integration? No. But that's the difference between a console and a PC. It can, however, be more performant than vertical integration, just at a price point that will never come anywhere close to what a console offers.

The only secret sauce is how low they can get this price point.
 
They didn't do it with the PS5 because, firstly, they're trying to be as cost-effective as possible. Also, the GPU inside the PS5 is fixed, which means that "cost" you were talking about, no matter how minuscule it might be on the PC side, is a precious resource on console. PCs continually improve over the course of a generation.

Sony essentially went for the cheapest solution that gets them the results/benefits they wanted from it. Their idea was to have a balanced system where their developers didn't have to concern themselves with moving goalposts regarding I/O; they could essentially guarantee a certain level of performance without any impact on their CPU and GPU performance. For a developer, it's undoubtedly the best overall solution. If you look at any console generation, it's usually never the "most powerful" device that wins; it's always the best-balanced device that is the easiest to exploit. Which is why the "time to triangle" metric from Cerny is so important to him.

Since the consoles are so well balanced and are fixed platforms, of course they can be more efficient and easier to develop for; I don't think anyone will argue against that. However, on PC, it appears to me at least that Microsoft/Nvidia/AMD/Intel/etc. are doing a very commendable job of integrating a much more efficient I/O pipeline, one which is supported by most modern hardware and Windows OSes (Win 10, 11), and of making it familiar enough for developers to exploit. The entire pipeline is absolutely much, MUCH more efficient than it was just one year ago.

Given the other strengths of the PC architecture which developers can lean on if they choose to (such as more RAM for slower drives, or less RAM for faster drives), there's a ton of capability there now. Games are also designed to scale, which means that if RAM capacity or storage bandwidth is an issue, texture quality can be reduced for lower-end systems, dramatically reducing the required bandwidth (a rough sketch of the numbers follows below). On consoles that isn't possible or necessary, but it is an option on PC. Some developers will implement DirectStorage and take full advantage of it, others will use it as a marketing tool where it's just slightly faster, and others will have no need, just as with any other technology out there. Remember, MOST PS5 games are not using that I/O capability at all...
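To put a rough number on "dramatically": each mip level you drop quarters the texel count, so texture data shrinks by roughly 4x per step. A little arithmetic sketch (my own illustrative figures, assuming one BC7-compressed 4096x4096 texture; nothing here comes from the thread):

```cpp
#include <cstdio>

// How much data one BC7-compressed 4096x4096 texture needs when the top
// N mip levels are dropped. BC7 packs 16 bytes per 4x4 block, i.e. about
// 1 byte per texel.
int main() {
    for (int dropped = 0; dropped <= 3; ++dropped) {
        int top = 4096 >> dropped;          // edge length of the largest mip kept
        double bytes = 0.0;
        for (int d = top; d >= 4; d /= 2)   // sum the remaining mip chain
            bytes += static_cast<double>(d) * d;
        std::printf("drop %d mip(s): top mip %4dx%-4d ~%5.1f MiB streamed\n",
                    dropped, top, top, bytes / (1024.0 * 1024.0));
    }
    return 0;
}
```

Dropping just one mip takes that texture from roughly 21 MiB to about 5 MiB, which is why texture scaling is such an effective bandwidth lever on lower-end PCs.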

So in the end, I believe that DirectStorage in its current form, now with GPU-based decompression, gives developers far more than enough throughput to handle anything they'll throw at it from an I/O perspective. There are other engine bottlenecks now which will impede developers from pushing further and which need addressing long before storage bandwidth does. Luckily, Sony's developers, who are the most likely to be building games that require anything close to the PS5's I/O capabilities during gameplay, are already improving their engines, and they, along with the porting studios bringing their games to PC, will ensure that DirectStorage is utilized if/when it is necessary.
 
I don't think it's been overestimated; I think it just hasn't been used. Fast I/O resolves game and level design issues that have largely plagued titles since we got into the high-fidelity era.

There is a reason the terms "corridor shooter" and "walking simulator" were created. They described games that were heavily streamlined in exchange for graphical fidelity. Lots of QTEs to hide loading screens, and the available weapons and items were limited, because on-demand asset recall was out of scope for slow spinning HDDs.

The real question that needs to be asked is whether it's over-engineered, which it may be, sure, but having more than you need means it won't be the bottleneck. It's a damn good upgrade, and as we finally leave last gen behind in 2023, I think we will finally get to see the fruits of fast I/O and some other marketed features.
I tend to agree. There are people who are almost misterxmedia-tier about the second GPU in the SSD that will allow the PS5 to defeat high-end PC components 😂

But I think realism is best. It will allow for certain things and, most of all, make devs' lives easier. Does the PS5 need an SSD twice the speed of the Xbox Series X's? Possibly not, but it fully eliminates any chance of a bottleneck and can perhaps take stress off the console's other parts. The PS5, while still having many years left to give, is nowhere near the most powerful thing in consumer computing, and the same goes for the Series X.

The real test for Xbox and PlayStation is dumping the PS4 and Xbox One. Devs can't fully focus on what is in the new boxes because they are still largely occupied with just getting games out on last gen.

This is partly why I don't think pro consoles are even necessary. We haven't seen what the base machines can do, and they are far from being fully exploited yet.
 
Spider-Man 2 from Insomniac will probably look better than anything else and give a proper "next gen" moment.

When that lands on PC, we will get a firm idea of what is really needed to match the PS5.
 
Spider-Man 2 from Insomniac will probably look better than anything else and give a proper "next gen" moment.

When that lands on PC, we will get a firm idea of what is really needed to match the PS5.

What makes you say that? When Spider-Man released on PS4, it didn't look better than anything else.

And PS5's Spider-Man runs better on PC graphics cards that released two years before the PS5 did.

We've also already had a proper 'next gen' moment on PC.
 
I think real PS5 exclusives pose a challenge for PC ports, though. The custom hardware in the PS5 is massive.

Cerny made it very clear that ALL steps need to be frictionless, not only the decompression. DMA, check-in and load management are done on the PS5 with other custom units, equivalent to another two Zen 2 cores. The added latency on PC for all those steps will eventually cause friction.

It's a fallacy that the PS5 has lots of custom hardware outside of the decompression unit for handling the I/O. Sony are great at labelling pre-existing hardware components with marketing names to make people think it's something they've added, but in fact every CPU (and indeed most computing devices) has DMA engines for offloading data-transfer processing, and every SSD has its own (usually ARM-based) CPU cores for I/O processing.

Sony simply took the hardware that was already there and wrote custom firmware for it, better suited to a closed environment. Fabian Giesen from RAD Game Tools talks about it here:


From a hardware perspective, with DirectStorage, PCs can exceed the PS5 in pretty much every way (faster SSD, wider buses, faster decompression throughput, more raw CPU and GPU power, etc.); where the PS5 excels is in its custom software stack pulling all of that together very efficiently. DirectStorage on the PC addresses that to a degree, but as others have stated, it's never going to be as tight as what can be achieved in a closed environment like a console. In terms of end results, though, there's simply no reason why it wouldn't be possible to achieve on a properly equipped PC anything that can be achieved on the PS5, and much more.

So yeah, I read the linked thread and I still wouldn't take everything Jensen says on faith. The idea that a GPU will outperform the PS5 with 2 TF is a statement from Nvidia. I'll believe it when I see it.

As others have stated, faith isn't required. There are publicly available benchmarks which show the actual performance. Here's a test Intel did on the Arc A770 (a GPU roughly in line with a 2070/3060 in performance terms) achieving almost 23 GB/s of throughput. And that's probably limited by the SSD speed rather than by the GPU's decompression throughput.

https://www.intel.com/content/www/us/en/developer/articles/news/directstorage-on-intel-gpus.html

If decompressing on the GPU were such a logical and viable thing to do, why didn't they choose that path with the PS5? They could have avoided all the custom I/O shenanigans and had a beefier GPU instead. But they went with the heavily customized I/O solution anyway.

In addition to what others have already said around this, there is one very important reason why Sony didn't take the GPU decompression route with the PS5: it didn't exist. GPU decompression is traditionally very difficult to achieve because general-purpose formats are hard to parallelize, and for years it existed only in research papers. Nvidia had to literally invent a new compression/decompression format (GDeflate) to make it feasible on GPUs. CPU/ASIC-based decompression, on the other hand, has been around for years, and Sony have a partnership with RAD Game Tools, who specialise in it. So it's a no-brainer that they would choose that tried and tested route rather than try to develop a brand new, untested technology to do the same job. The PC market had no choice but to do that. But now that it's developed and proven, it'll be interesting to see whether the next-gen consoles utilise GPU decompression rather than a custom ASIC.
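Some context on why that was hard: a standard deflate stream is one long serial dependency chain (every symbol depends on the decoder state before it), which is exactly what GPUs are bad at. GDeflate, as described in Microsoft's DirectStorage material, gets around this by splitting the payload into independently compressed 64 KiB tiles, so each tile can be handed to its own GPU workgroup. Here's a CPU-side analogy of that tiling idea, using threads and a hypothetical decompress_tile stand-in rather than a real deflate decoder:

```cpp
#include <cstdint>
#include <cstring>
#include <thread>
#include <vector>

// One independently compressed tile (64 KiB of uncompressed data in GDeflate).
struct Tile {
    const uint8_t* src;     // compressed bits for this tile
    size_t         srcSize;
    uint8_t*       dst;     // where the tile's uncompressed data lands
    size_t         dstSize;
};

// Hypothetical stand-in for a real per-tile inflate routine. An actual
// decoder would parse the tile's deflate-style bitstream here; a memcpy
// just keeps this sketch runnable.
void decompress_tile(const Tile& t) {
    std::memcpy(t.dst, t.src, t.srcSize < t.dstSize ? t.srcSize : t.dstSize);
}

// Because no tile depends on another, every tile can run concurrently.
// A GPU implementation maps this loop to one workgroup per tile instead
// of one CPU thread per tile.
void decompress_all(const std::vector<Tile>& tiles) {
    std::vector<std::thread> workers;
    workers.reserve(tiles.size());
    for (const Tile& t : tiles)
        workers.emplace_back(decompress_tile, t);
    for (std::thread& w : workers)
        w.join();
}
```

As I understand it, the bitstream inside each real GDeflate tile is additionally rearranged so a 32-wide SIMD group can cooperate on it, but the tile-level independence is what makes the stream parallel in the first place.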
 
In addition to what others have already said around this, there is one very important reason why Sony didn't take the GPU decompression route with the PS5: it didn't exist. GPU decompression is traditionally very difficult to achieve because general-purpose formats are hard to parallelize, and for years it existed only in research papers. Nvidia had to literally invent a new compression/decompression format (GDeflate) to make it feasible on GPUs.

GPU decompression has been around for years with many PS4 and Xbox One games using it.

I'm sure Cerny even spoke about it in the Road to PS4 talk.
 
I think it can be safely said that the current generation of consoles will hold up much better than previous generations did, and this is precisely due to hardware solutions such as the decompression blocks, SFS and, ultimately, solutions built on SSD I/O speed. The question here is not whether it will be possible to match all these capabilities on PCs; of course it is possible, but it requires much more expensive hardware components than those found in consoles since 2020. It is clear that hardware decompression blocks and hardware-integrated SFS are a much cheaper, more elegant (and forward-looking!) technical solution than simply entrusting the work to the GPU or CPU, which requires far more resources.

Anyway, these console capabilities couldn't really be taken advantage of until there were enough high-performance PCs on the market, precisely because of multiplatform development.
 
GPU decompression has been around for years with many PS4 and Xbox One games using it.

I'm sure Cerny even spoke about it in the Road to PS4 talk.

Both the PS4 and Xbox One had hardware decompression ASICs for decompressing zlib - although I understand they were rarely used in favour of more advanced compression routines like Kraken running on the CPU.

GDeflate is the first commercially used GPU decompression routine as far as I'm aware. If you search online you can see this was basically just a research topic only a couple of years ago.
 
Both the PS4 and Xbox One had hardware decompression ASICs for decompressing zlib - although I understand they were rarely used in favour of more advanced compression routines like Kraken running on the CPU.

GDeflate is the first commercially used GPU decompression routine as far as I'm aware. If you search online you can see this was basically just a research topic only a couple of years ago.

Yes, Kraken began to be used after 2016 because running it on the Jaguar CPU was faster than the hardware decompression ASICs in the PS4 and Xbox One. Everyone talks about Kraken's better compression ratio, but the main advantage is how fast its decompression is compared to zlib/deflate.

In addition to what others have already said around this, there is one very important reason why Sony didn't take the GPU decompression route with the PS5: it didn't exist. GPU decompression is traditionally very difficult to achieve because general-purpose formats are hard to parallelize, and for years it existed only in research papers. Nvidia had to literally invent a new compression/decompression format (GDeflate) to make it feasible on GPUs. CPU/ASIC-based decompression, on the other hand, has been around for years, and Sony have a partnership with RAD Game Tools, who specialise in it. So it's a no-brainer that they would choose that tried and tested route rather than try to develop a brand new, untested technology to do the same job. The PC market had no choice but to do that. But now that it's developed and proven, it'll be interesting to see whether the next-gen consoles utilise GPU decompression rather than a custom ASIC.

If it were the best solution for a console, Sony and Microsoft would have used it inside the PS5 and Xbox Series. From a performance-per-watt and cost perspective, hardware ASICs are better. This is the same reason Sony uses the Tempest Engine for 3D audio: a modified CU with DMA-style programming like the CELL, meaning the CU is more efficient for the task than the same GPU CU without modification. Or why MS uses a hardware ASIC for audio.

Specialized hardware isn't a good thing on PC. Every PC has a CPU and a GPU (discrete or integrated). That is a known quantity.
 