General Next Generation Rumors and Discussions [Post GDC 2020]

Indeed but isn't DirectStorage designed to mitigate much of that?
DirectStorage, like macOS's CoreStorage, is an API. It won't negate the chain of I/O that exists in a PC. You're still limited by the bandwidth of the transfers between SSD/controller, controller/southbridge, southbridge/RAM, potentially RAM/CPU for unpacking/conversion, and if the data is for the GPU then add in DDR/PCIe/GDDR.

That's not to say there aren't still advantages to the PS5's customizations, there obviously are, but it's not like comparing against how today's SSDs perform on PCs that don't benefit from DirectStorage.
I wouldn't want PS5's architecture in my PC; it would remove too much flexibility.
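To make the chain-of-hops point concrete, here's a toy Python sketch of why a multi-link path is capped by its slowest link. Every bandwidth figure below is a rough placeholder of mine for illustration, not a measured number.

```python
# Toy sketch: effective streaming throughput across a chain of hops is
# capped by the slowest link. All figures are rough placeholders.
links_gb_s = {
    "SSD flash -> SSD controller": 7.0,
    "SSD controller -> southbridge (PCIe 4.0 x4)": 7.9,
    "southbridge -> system RAM": 25.0,
    "system RAM -> GPU RAM (PCIe 4.0 x16)": 31.5,
}

# The whole chain can move data no faster than its narrowest hop.
bottleneck = min(links_gb_s, key=links_gb_s.get)
effective = links_gb_s[bottleneck]
print(f"Chain is capped by '{bottleneck}' at ~{effective} GB/s")
```

With these placeholder numbers the drive itself is the cap, but the point stands regardless of where the narrowest hop sits: an API can't raise the minimum of that chain.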
 
DirectStorage, like macOS's CoreStorage, is an API. It won't negate the chain of I/O that exists in a PC. You're still limited by the bandwidth of the transfers between SSD/controller, controller/southbridge, southbridge/RAM, potentially RAM/CPU for unpacking/conversion, and if the data is for the GPU then add in DDR/PCIe/GDDR.


I wouldn't want PS5's architecture in my PC; it would remove too much flexibility.


DirectStorage – DirectStorage is an all new I/O system designed specifically for gaming to unleash the full performance of the SSD and hardware decompression. It is one of the components that comprise the Xbox Velocity Architecture. Modern games perform asset streaming in the background to continuously load the next parts of the world while you play, and DirectStorage can reduce the CPU overhead for these I/O operations from multiple cores to taking just a small fraction of a single core; thereby freeing considerable CPU power for the game to spend on areas like better physics or more NPCs in a scene. This newest member of the DirectX family is being introduced with Xbox Series X and we plan to bring it to Windows as well.

I'm very curious to see how it works and what kind of latency it provides. We won't know how it compares to regular filesystem access until it shows up on pc.
 
I'm very curious to see how it works and what kind of latency it provides. We won't know how it compares to regular filesystem access until it shows up on pc.

Yup. But folks should not expect miracles on PC. The PC's flexibility exists because its extensibility comes from having a bunch of buses all tangled together. Next-gen consoles are SSD-to-controller-to-RAM. BAM! That's hard to beat. Even if Microsoft makes no other changes to XSX's filesystem, that's just so much less moving of data around than on your typical PC.

You're going from SSD -> controller -> south bridge -> RAM -> CPU/RAM for decompression.
 
I'm very curious to see how it works and what kind of latency it provides. We won't know how it compares to regular filesystem access until it shows up on pc.

Curious to see PCIe 5 with those monstrous speeds. That, in combination with Zen 3, Ampere/RDNA2 with possible HBM at the higher end at 18TF or more, and DDR5 on the way. I think I'll wait and get a whole new PC by 2021, the perfect machine for XSX and PS5 games that may appear on PC. HZD with mods.... :D
DX12 Ultimate and DirectStorage are a nice welcome.
 
Yup. But folks should not expect miracles on PC. The PC's flexibility exists because its extensibility comes from having a bunch of buses all tangled together. Next-gen consoles are SSD-to-controller-to-RAM. BAM! That's hard to beat. Even if Microsoft makes no other changes to XSX's filesystem, that's just so much less moving of data around than on your typical PC.

You're going from SSD -> controller -> south bridge -> RAM -> CPU/RAM for decompression.
Does the XSX have a DMAC? I don't remember seeing one during the presentation.

EDIT: There is one, logically.
 
DirectStorage, like macOS's CoreStorage, is an API. It won't negate the chain of I/O that exists in a PC. You're still limited by the bandwidth of the transfers between SSD/controller, controller/southbridge, southbridge/RAM, potentially RAM/CPU for unpacking/conversion, and if the data is for the GPU then add in DDR/PCIe/GDDR.

I totally get what you're saying and this is what interests me. For example what part of that path is likely to constrain the bandwidth? Or are you talking more in terms of latency due to the additional steps? As far as I know all of the steps in that chain easily exceed the hypothetical 10GB/s we're discussing meaning the bandwidth bottleneck is the SSD/controller itself.

Looking at an AMD system, for example, the path is SSD->Controller->PCIe 4.0 4x->Infinity Fabric->RAM, or PCIe 4.0 16x->Graphics RAM.

Given that the PS5 is an AMD APU-based, PCIe 4.0 system, which presumably also uses IF for communication between its constituent parts just like a regular AMD APU, surely it would go through all of those steps as well, aside from the last one between main memory and graphics memory (I'm not sure if PCs can read directly from the SSD/HDD into graphics memory without going via main memory first)?

Or is it that the PS5 controller sits on the APU rather than on the SSD itself and therefore we need to move that to the other side of the PCIe 4.0 4x step?
 
I totally get what you're saying and this is what interests me. For example what part of that path is likely to constrain the bandwidth? Or are you talking more in terms of latency due to the additional steps? As far as I know all of the steps in that chain easily exceed the hypothetical 10GB/s we're discussing meaning the bandwidth bottleneck is the SSD/controller itself.

Looking at an AMD system, for example, the path is SSD->Controller->PCIe 4.0 4x->Infinity Fabric->RAM, or PCIe 4.0 16x->Graphics RAM.

Given that the PS5 is an AMD APU-based, PCIe 4.0 system, which presumably also uses IF for communication between its constituent parts just like a regular AMD APU, surely it would go through all of those steps as well, aside from the last one between main memory and graphics memory (I'm not sure if PCs can read directly from the SSD/HDD into graphics memory without going via main memory first)?

Or is it that the PS5 controller sits on the APU rather than on the SSD itself and therefore we need to move that to the other side of the PCIe 4.0 4x step?
There is also the custom filesystem, combined with very fast caches tailored for read speed; the data goes directly to RAM without any CPU I/O management. It's not only about how many intermediaries there are.
 
I totally get what you're saying and this is what interests me. For example what part of that path is likely to constrain the bandwidth? Or are you talking more in terms of latency due to the additional steps? As far as I know all of the steps in that chain easily exceed the hypothetical 10GB/s we're discussing meaning the bandwidth bottleneck is the SSD/controller itself.

Looking at an AMD system, for example, the path is SSD->Controller->PCIe 4.0 4x->Infinity Fabric->RAM, or PCIe 4.0 16x->Graphics RAM.

Given that the PS5 is an AMD APU-based, PCIe 4.0 system, which presumably also uses IF for communication between its constituent parts just like a regular AMD APU, surely it would go through all of those steps as well, aside from the last one between main memory and graphics memory (I'm not sure if PCs can read directly from the SSD/HDD into graphics memory without going via main memory first)?

Or is it that the PS5 controller sits on the APU rather than on the SSD itself and therefore we need to move that to the other side of the PCIe 4.0 4x step?

The controller is on the SSD, but in the patent the decompressors are connected to the RAM. This is probably more complicated than the diagram Mark Cerny showed, since the SSD controller has an optimized read unit. And there is a DMAC in the patent too.

 
I'm very curious to see how it works and what kind of latency it provides. We won't know how it compares to regular filesystem access until it shows up on pc.

DirectStorage – DirectStorage is an all new I/O system designed specifically for gaming to unleash the full performance of the SSD and hardware decompression. It is one of the components that comprise the Xbox Velocity Architecture. Modern games perform asset streaming in the background to continuously load the next parts of the world while you play, and DirectStorage can reduce the CPU overhead for these I/O operations from multiple cores to taking just a small fraction of a single core; thereby freeing considerable CPU power for the game to spend on areas like better physics or more NPCs in a scene. This newest member of the DirectX family is being introduced with Xbox Series X and we plan to bring it to Windows as well.

NVMe performance allows a system to drive I/O operation rates so high, with such low latency, that it can bog down CPU cores. I wonder if MS's DirectStorage ditches async interrupts to accommodate NVMe SSDs.

https://pdfs.semanticscholar.org/def2/9d202e537d026b8d3ed91655b540ef86cceb.pdf

The paper shows that the common method of using async interrupts doesn't work as well with NVMe SSDs. Requests to the SSDs are serviced so quickly that the CPU core spends more time swapping states than it does working on other tasks. It's cheaper for the CPU to poll and wait for an I/O request than to switch to a different task while waiting for the request to be fulfilled.

The paper recommends a hybrid system that uses async interrupts for larger transfers but switches to polling for smaller transfers.
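A minimal Python sketch of that hybrid idea: busy-poll for small requests (they complete too quickly for a context switch to pay off) and do an interrupt-style sleeping wait for large ones. The 32 KiB cutoff and all the names here are placeholders of mine, not values from the paper.

```python
import time

# Placeholder cutoff: below this, a transfer is assumed to finish
# faster than a context switch would cost, so we spin-poll instead.
POLL_THRESHOLD_BYTES = 32 * 1024

def wait_for_io(request_bytes, is_complete, sleep_s=0.0001):
    """Block until is_complete() returns True; strategy depends on size."""
    polled = request_bytes <= POLL_THRESHOLD_BYTES
    while not is_complete():
        if not polled:
            # Interrupt-like path: yield the core while the transfer runs.
            time.sleep(sleep_s)
        # else: spin, re-checking completion immediately.
    return "polled" if polled else "slept"
```

Real kernels make this decision with device queues and interrupt coalescing rather than a sleep loop, but the size-based split is the shape of the trade-off the paper describes.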
 
Another (lengthy) take, mostly on PS5 with some Series X. Also some strange comments that Sony has not shown the console yet for fear that people might think it could not be mass produced, well..... always with a grain of salt, folks.

 
Another (lengthy) take, mostly on PS5 with some Series X. Also some strange comments that Sony has not shown the console yet for fear that people might think it could not be mass produced, well..... always with a grain of salt, folks.


RGT is probably the best YouTuber for console tech, although he goes to too great a length to appear unbiased.
 
36 CUs would be easier to double, though there's less room below since the Series X would be at an intermediate position between the PS5 and a doubled Pro. Sony may need to consider if doubling is enough, especially if there were to be a Pro variant of the Series X.

Doubling to 72 CUs would likely still place it as less powerful than an equally hypothetical XSXXX, true, but I wonder how high RDNA2 can scale. I've heard a few people theorise that its max is 80 CUs (although I've not done any digging to determine whether that's nonsense), which would leave both Sony and Microsoft with 80CU GPUs, likely with 8 disabled for yields, leaving only clock speeds as the differentiator. The X1X has shown that Microsoft can push clock speeds; the PS5 shows that Sony can too.

It'll be interesting to see how that shakes out, and if we end up with two virtually identical Pro models.

Going from 14 to 16 Gbps would be a scant upgrade, and proportionally weaker than the PS4 to Pro transition with a ~14.3% bandwidth improvement stretched over 2x the CU. Perhaps there would be an even faster interface speed, or a change in width, such as at least matching 320-bits, if not going wider.

What are the most likely options here, do you reckon? Because every cost-effective option I can think of seems to be a bit shit:
320-bit+16gbps = 640GB/s
384-bit+16gbps = 768GB/s
320-bit+18gbps = 720GB/s
384-bit+18gbps = 864GB/s

Given that I've gone as high as a 384-bit bus and 24GB of 18gbps memory, I'm being quite loose with my usage of cost-effective too, but even that fails to increase the PS5's meagre bandwidth in line with a doubling of the GPU.
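For reference, the four options above follow from the usual formula: GB/s = (bus width in bits ÷ 8 bits per byte) × Gbps per pin. A quick sketch to reproduce them:

```python
# GDDR6 bandwidth: (bus width in bits / 8 bits per byte) * Gbps per pin.
def bandwidth_gb_s(bus_bits, gbps_per_pin):
    return bus_bits / 8 * gbps_per_pin

for bits in (320, 384):
    for pin in (16, 18):
        print(f"{bits}-bit + {pin}gbps = {bandwidth_gb_s(bits, pin):.0f}GB/s")
```

So even the priciest of those combinations, 384-bit at 18gbps, lands at 864GB/s, short of a clean doubling of the PS5's 448GB/s.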

Sony's variable clock solution might have some kind of impact on a future Pro, since we'd assume Sony wouldn't want to drop the clock. Raising the clocks could be interesting, though the current clocks are described as already being in an inefficient region. 72 or more CUs may be interesting versus a competing xPro if they are both much larger in CU count but one is still striving for constant clocks. There may be some load scenarios where it's costlier or more difficult to hold a constant clock with many more active units.

I'd really like to see the two different approaches in action, assuming each of them just increases clocks by 15% in line with TSMC's 5nm predictions. Although I suppose it depends on the success of Sony's constant-power approach whether we see that same approach taken with the XSXXX.

It strikes me that the high clocks of the X1X gave Sony's engineers a bit of an "awww now why didn't I think of that?" moment, and if the PS5's design is successful, I expect to see a similar riposte from Microsoft.

What else could be scaled with a Pro console, like the CPUs, might be an interesting question. Zen 2 seems to be a more successful initial implementation than Jaguar was, so the clocks currently given aren't artificially low. A 33% jump would give clocks of ~4.7GHz, and node jumps at that clock range are often threatened with clock regressions. I don't know if they'd try for a clock bump, or if a non-standard number of additional cores could be an option.

This is an area that excites me the most, although my imagination's probably being a bit too fertile, and I'd be really grateful if you could set me straight on this: the thing I would most like to see from the PS5Pro is a literal doubling of everything but the ODD. Is it technically possible, or am I being a raving lunatic?

For clarity, I'm imagining this hypothetical PS5ProDuo (see what I did there? :cool: ) to consist of the following :
  • Able to function as two separate PS5's and output as either splitscreen or two different screens.
  • Able to function as a single device, replete with a 4GHz 16C/32T CPU, a 2.5GHz 72CU GPU, an 11GB/s 1650GB SSD, and 32GB of HBM (or maybe 32GB of 16gbps GDDR6, but I put HBM anyway because I'm a big, raging HBM fanboy.)
  • Given that the memory in particular would make it quite an expensive device, perhaps ameliorate that by mandating PS6 games must play on it.
My apologies if this is all a bit messy. My laptop's broken, so I've written this on my phone, which isn't the best for scansion. For you or for me.
 
I expect to see a similar riposte from Microsoft.

Probably not in its current form.
The PS5 isn't even out yet, and you already want the successor ;) Most likely there won't be any mid-gens; what are they going to advertise with? PS5/XSX are already '8K capable', and a substantial upgrade seems further away.
Besides that, games are going to be designed for base hardware anyway; you'll only get resolution upgrades, which can be upscaled with smarter tech like a DLSS 2 console variant.
 
Probably not in its current form.

How do you mean?

The PS5 isn't even out yet, and you already want the successor ;)


Most likely there won't be any mid-gens; what are they going to advertise with? PS5/XSX are already '8K capable', and a substantial upgrade seems further away.

I'm not so sure about an absence of mid-gen consoles this generation. I do agree that a substantial upgrade seems further away though. But that's kind of why I would like to see mid-gen consoles: release a substantial upgrade in 3-5 years, that can tide over core gamers for the next 3-5. The longer the better.

Besides that, games are going to be designed for base hardware anyway; you'll only get resolution upgrades, which can be upscaled with smarter tech like a DLSS 2 console variant.

True, but there are things like fluid simulation, animation quality, NPC quantity that could all be dialled up on mid-gens. And, of course, ray tracing. 3-5 years of architectural improvements from AMD would yield us all some satisfying improvements. Even 52CU'ers like yourself.
 
I'm not so sure about an absence of mid-gen consoles this generation. I do agree that a substantial upgrade seems further away though. But that's kind of why I would like to see mid-gen consoles: release a substantial upgrade in 3-5 years, that can tide over core gamers for the next 3-5. The longer the better.
Sony specifically called out PC hardware progression as a reason for their PS4 Pro and PC hardware progression ... continues :LOL:

True, but there are things like fluid simulation, animation quality, NPC quantity that could all be dialled up on mid-gens. And, of course, ray tracing. 3-5 years of architectural improvements from AMD would yield us all some satisfying improvements. Even 52CU'ers like yourself.
Ray tracing alone is a reason for a Pro model from both consoles. In 4 or 5 years there may be a crazy distance between consoles and PCs if ray tracing becomes a hit.
 
A mid-gen is easily feasible for next gen; a full suite of ray tracing (GI, shadows, reflections) at native 4K alone would eat up all the resources. Throw in some complex fluid sim and you'll be begging for a PS6 in no time. They're for uncompromising hardcore fans like me and a bunch of others. The only people who wouldn't want a mid-gen upgrade this time are the die-hard Xbox fans, really :LOL:, they would pray Series X keeps the Tflops advantage for 7 years straight without a possible PS5 Pro that out-specs a Series XXX 3-4 years down the line.
But in the end gamers still win. Like the wise man Serizawa once said: "Let them fight"

MOD edit: Meme removed. B3D doesn't do GIFs. Please stick to words
 
Last edited by a moderator:
A mid-gen is easily feasible for next gen; a full suite of ray tracing (GI, shadows, reflections) at native 4K alone would eat up all the resources. Throw in some complex fluid sim and you'll be begging for a PS6 in no time. They're for uncompromising hardcore fans like me and a bunch of others. The only people who wouldn't want a mid-gen upgrade this time are the die-hard Xbox fans, really :LOL:, they would pray Series X keeps the Tflops advantage for 7 years straight without a possible PS5 Pro that out-specs a Series XXX 3-4 years down the line.
But in the end gamers still win. Like the wise man Serizawa once said: "Let them fight"

MOD edit: Meme removed. B3D doesn't do GIFs. Please stick to words
Yes, if there weren't a 12 TFLOPS Xbox to compete with, we would have gotten a PS5 with Jaguar cores and an 8 TFLOPS GPU.
 