PlayStation 5 [PS5] [Released November 12, 2020]

The flash controller is on a different, smaller chip by the flash modules. I'm not sure if the photographer would consider it interesting, but maybe someone could request a picture.

Think it might be interesting anyway, if just to see what specific DRAM they are using for the cache. Actually, it would also be interesting if they or someone could provide a photograph of the Toshiba NAND devices and the SK Hynix one in the Series X's SSD.

Clearer photos would make it easier to track down documentation for those...if there's any online (I tried doing this for the SK Hynix one but kept running into blank wholesaler pages and AliExpress entries for unrelated things).

Am I the only one noticing the "butterfly" design of PS5's GPU (like the PS4 Pro's) compared with the more monolithic style of the Series X?

That seems like a carry-over from the PS4 Pro, which was also a butterfly design. Sony (well, more specifically Cerny) seems to like butterfly designs.

I dunno how well that will carry forward in the future, though, while maintaining hardware-based BC. Suppose costs for a 3nm or 2nm PS6 APU end up being very expensive. Then let's say they decide to make it a chiplet because they want to use one of those for some handheld-like device (they may have to do this for Japan and a few other regions). Scaling-wise they'd then be locked to steps of 36 CUs (40 physical), which would mean a wider GPU, which also means more cost.

Interestingly, Cerny mentioned a 48 CU theoretical design at Road to PS5, so I'm guessing they could technically do 18 CU increments, meaning they may not be absolutely married to a butterfly model. Which is probably for the best if they still want to maintain hardware BC while keeping APU costs manageable.
 

48 CU would need to be 3 shader engines (like the RX 6800) for BC reasons. Turn off two for PS4, one for Pro. It would have been a large SoC, bigger than the XSX.
 
Do you need numbers more specific than the iFixit teardown had?
https://www.ifixit.com/Teardown/PlayStation+5+Teardown/138280

There may not be much motivation for public documentation for components meant for large customer orders.


Quite a few of AMD's GPUs have to split their CUs across the command processor fixed-function block. The larger sizes seem to be mostly butterfly-like. The PS4 Pro's somewhat odd layout may be related to how heavily Sony fights for die area savings.
GPU data paths are very broad and routing through the chip can be congested. Putting a lot of arrays on one side constrains what direction additional CUs can be added as well.
Splitting across the middle reduces the average length signals have to travel from the midline where the caches and control hardware are placed, and if there's a lot of links to the memory subsystem they don't all have to make the same turn into the CU arrays.

Looking at the PS5 APU, it looks like Sony may have had a certain ceiling for the width of the chip, since the CPU block is packed so tightly and may have sacrificed FPU area to keep it within the bounds set by the GPU. If the GPU didn't split its CUs, it might have required orienting the GPU in a way that would make it wider.

As long as raw unit counts are higher than what came before, the GPU can hide any extras. We know it can, because it's already hiding that there are 40 CUs physically on-chip, even though at some early validation stage the GPU was running those units as part of its bring-up process.
 
48 CU would need to be 3 shader engines (like the RX 6800) for BC reasons. Turn off two for PS4, one for Pro. It would have been a large SoC, bigger than the XSX.

That'd definitely be a waste of CUs, so maybe that makes Cerny's hypothetical example kind of odd. If they ever seriously considered a 48 CU design, I wonder what the reasoning was, if it'd mean 12 disabled CUs.

Do you need numbers more specific than the iFixit teardown had?
https://www.ifixit.com/Teardown/PlayStation+5+Teardown/138280

There may not be much motivation for public documentation for components meant for large customer orders.

I somehow missed this xD; thanks for the link, that saves any needless hunting.

Actually, speaking of the PS5 GPU's shape, something came to my attention. I've seen a few photos breaking down the allocation of blocks on the PS5 APU from the x-rays:

[image: oyOpkt1.jpg]


and then I came across this GPU ref image for Navi 21:

[image: z1k7j.jpg]


Now, I can't see an Infinity Fabric Cache Controller in the PS5 shot like Navi 21's, but is it possible they have extremely cut-down slices of cache sandwiched between the PHYs, with a cache controller as the small (maybe unmarked) grey block between the two pairs of memory controllers interfacing with the L2$?

I'm just asking because it's genuinely slightly perplexing, and I feel like I might've made an analysis on this elsewhere off a bad read. Even now, looking at it, I can't see how any theoretical IC slice sandwiched between the PHYs would be even 2 MB in size, going by the size of the actual 2 MB (or more specifically, 256 KB) L2$ blocks on the GPU. Thin them and lengthen them a bit to fit in that space and you get maybe six in each slice, or 6 MB in all. That wouldn't sound worth it, in all honesty.

Though, just looking again at the Navi 21 picture, I should probably figure the actual cell size for the IC is a lot smaller than that for the L2$ (cell density per cache level is something I've only just started looking into in terms of process nodes). That's why I was thinking that if there's actually room there, it could be closer to 12 MB - 24 MB? (I can't explain where the Infinity Fabric Cache Controller would be, or why it'd be a small block between the GDDR6 memory controllers, since that looks contrary to how the cache controller is actually supposed to function.)

Honestly, I'd just enjoy some concise clarification so the book on "does it have IC or doesn't it" can be closed once and for all; these die x-rays should be enough evidence anyway, IMO.
 
It does not have Infinity Cache.

I mean, that's what I'm assuming is the case, but I had to make absolutely sure. For example, just going from what I mentioned earlier, there's no Infinity Fabric Cache Controller to handle the (hypothetical) L3$, which should be the strongest indicator it's not present.

At least it should be clear now that it's not there, because critical parts of the IC are completely absent. So if people still try to argue otherwise, pointing to the missing IFCC should be enough to dissuade them (if they aren't stubborn).
 
FADD is one of the most fundamental operations, and is used very widely in gaming. The PS5 will absolutely need to be able to do it.

However, the reason people are speculating that the FADD unit was cut from PS5 is that the Zen2 core actually has two different places that can do FADD. The FMA pipes can do FADD with a throughput of 2 per clock and latency of 5 cycles, but in addition to this there are FADD pipes that can also do FADD with a throughput of 2 per clock, but with a latency of 3 cycles.

That is, FADD is in general such an important instruction that they added completely separate execution units just to cut 2 cycles of latency from it. It would appear that Sony felt this was a waste, and just has the FMA units calculate it instead.

This is definitely a downgrade, but it might be a very small one.

Where did you get those values for FADD? They don't check out with the ones in the article below:

https://www.anandtech.com/show/1621...e-review-5950x-5900x-5800x-and-5700x-tested/6

There they claim Zen 2 FADD throughput is 1 per clock with 5-cycle latency.
 
Well, MLID is finally addressing the SoC images; still listening. I can tell he's flustered, and I find it both very funny and I also empathize, if he genuinely feels like people were misquoting him.

2021's a good year so far.
 
Also, quickly: I think guys like MLID, RGT, etc. need to keep in mind that speculating on what might've been in the PS5 was never really the issue. The problem is they seemed to afford Sony leeway in certain design decisions that, for whatever reason, they didn't afford Microsoft, and this was before Hot Chips (I phrase it this way because I genuinely don't think these guys have actual insider contacts with in-depth, hands-on access to this hardware).

Also, reaching for phantom customizations to explain why the PS5 performs better than the Series X in certain third-party cross-gen titles was never necessary; there are a handful of very simple potential answers based on info Sony and Microsoft already provided. Profiling issues with the segmented memory (Series X), certain engines benefiting from the higher GPU clock (PS5), tool immaturity or unfamiliarity (Series X), or simply better/more familiar tools for handling data I/O (PS5) would all have been obvious answers to pick from.

Instead they kept leaving the door open for Infinity Cache, Zen 3 unified cache, Geometry Engine customizations (though the x-ray shots don't necessarily debunk anything about the GEs, AFAIK) and other esoteric things, and they let it build up, so this is basically the result.
 
I'll go with Occam's Razor: it's the Navi 10 layout + Zen 2.

Makes sense, because AMD is not stupid; they will design a 40 CU chip as efficiently as possible for yields.
 
Where did you get those values for FADD?

The only place you go to look up x86 instruction latencies: Agner Fog's Software Optimization Resources
Specifically, volume 4.

Fog measures all the values empirically instead of trusting what the manufacturers state. His values occasionally differ from those published by Intel or AMD, and when they do, the manufacturers typically go and fix their manuals.

They don't check out with the ones in the article below:

https://www.anandtech.com/show/1621...e-review-5950x-5900x-5800x-and-5700x-tested/6

There they claim Zen 2 FADD throughput is 1 per clock with 5-cycle latency.

That's for the x87 instructions, which are obsolete. Look for the correct modern SSE/AVX ones in Fog's tables (search for Zen2, then search for ADDSS).
 
The downside is that MS has to be able to deal with the high thermal density of AVX-256 no matter what, while staying almost silent. A tiny bit more die, too.
The flip side is that if it's not used a lot, you get lower thermals.
Unless things have changed, which could very well be the case, AVX wasn't used as much, mostly due to mixed support across the CPU generations that games needed to support.
So the benefit wasn't there to make much use of it. That doesn't mean it will stay that way going forward.

All these things are features that may get leveraged more in the future, but I still don't expect most people to see the difference or care.
Interesting for us regarding engine implementation and gdc talks though.
 
IMO, I don't think there is a mid-gen refresh coming. The node shrink would not be significant enough to warrant a refresh while keeping the price points where they are today. The next generation after this one will be interesting, however. Curious to see how they intend to tackle it.
https://www.gameinformer.com/feature/2019/12/03/the-first-25-years
MASAYASU ITO [Executive Vice President, Hardware Engineering and Operation, Sony Interactive Entertainment]
Indeed, in the past, the cycle for a new platform was 7 to 10 years, but in view of the very rapid development and evolution of technology, it’s really a six to seven year platform cycle. Then we cannot fully catch up with the rapid development of the technology, therefore our thinking is that as far as a platform is concerned for the PS5, it’s a cycle of maybe six to seven years. But doing that, a platform lifecycle, we should be able to change the hardware itself and try to incorporate advancements in technology. That was the thinking behind it, and the test case of that thinking was the PS4 Pro that launched in the midway of the PS4 launch cycle.
This rather suggests that Sony has plans for a PS5 Pro.
 
Yeah, it's a good way to look at it. It's like asking a Zen 2 motherboard to support a Zen 3 CPU; the Zen 3 support would have to be present in the microcode for it to be able to do that.

Sorry, I phrased matters poorly. I meant that I'm completely unfamiliar with the term "no microcode reference" in the context of game development. I'm vaguely familiar with it in the context of general computing.

I've dabbled in some C++ and HTML in my time, and I always assumed that dev kits were somewhat similar, inasmuch as they're a significant number of steps away from microcode. Am I mistaken on that front?

And being able to make your own versions of these features doesn't necessarily imply you can outperform the hardware versions of these functions.

Oh, absolutely. I recently finished Death Stranding, and its checkerboard solution is substantially worse than that in God of War or Horizon Zero Dawn. As mentioned in this Digital Foundry video, Kojima's studio decided to checkerboard purely in software. And it's to the detriment of its presentation.

IMO, I don't think there is a mid-gen refresh coming. The node shrink would not be significant enough to warrant a refresh while keeping the price points where they are today. The next generation after this one will be interesting, however. Curious to see how they intend to tackle it.

I disagree there. I don't think we'll see a PS4-to-Pro power increase, and certainly not an XB1-to-X1X increase, but I do anticipate a tentative step into chiplet territory.

I initially thought the PS5's 36 CU design would lend itself well to doubling the GPU to 72 CUs, but I now think the mid-gen refreshes are going to bring the likes of RDNA1-to-RDNA2 IPC improvements (that's more of a Microsoft move, IMO), tensor cores, more bandwidth, and a clock speed increase, with the core/CU counts staying the same.

In essence: shrink the current SoCs to 5nm versions, use the lower-performing ones on their own in slim versions of the PS5/XSX, and stick some chiplets onto the higher-performing versions for the "Pro/X" models.
 