That's because the PS5's SoC "basically runs at almost full power during gaming," he says. As a result, its TDP (Thermal Design Power) value and the amount of heat generated during gaming are "about the same." By contrast, it is rare for the PS4's SoC to operate at the very edge of its TDP; even when gaming, it generates only a few percent of its TDP.
Perhaps something was lost in translation, misquoted, or he misspoke about the PS4 SoC generating only a few percent of its TDP. Having something like a 90%+ power margin would have had significant implications for how far Sony could have pushed clocks or how quiet the PS4 could have been. That text doesn't mesh with the PS4 having been measured pulling 100-150W in games.
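As a quick sanity check on that quote (a minimal back-of-envelope sketch; the ~100W SoC TDP is my assumption, since Sony never published one):

```python
# Back-of-envelope sanity check of the "few percent of TDP" quote.
# The PS4 SoC's TDP was never published; ~100 W is an assumed ballpark here.
assumed_ps4_soc_tdp_w = 100   # assumption, not an official figure
few_percent = 0.05            # "a few percent"

soc_heat_w = assumed_ps4_soc_tdp_w * few_percent
print(f"Implied SoC heat during gaming: ~{soc_heat_w:.0f} W")
# ~5 W of SoC heat is hard to square with 100-150 W measured at the wall,
# even after subtracting PSU losses, GDDR5, HDD, fan, and the optical drive.
```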
Interview with Otori VP in Japanese (2/2)
To cool both sides of the main board, the PS5's cooling fan is 45 mm thick, thicker than those in the current PS4 and PS4 Pro. If we call the SoC-mounted side of the PS5's main board "Side A" and the back "Side B," then the heat emitted from Side B is "equivalent to that of the PS4's SoC," according to Mr. Otori. Therefore, air is drawn in from both sides of the cooling fan to cool the A and B sides of the main board.
I'm not sure about this figure for the side opposite the SoC emitting as much heat as the PS4's SoC. It may depend on what figure he's using for the PS4, but since the PS4 has been measured pulling 100-150W, the upper end of that range could leave 100W+ for the SoC, while the lower end could leave 50-60W. The elements with the most obvious thermal compound are the GDDR6 modules and the DC converters, plus a few other components (3 NAND modules, and a few ICs near the IO ports with silver "towers" that likely put them in contact with the shield).
From the following, we could probably assume ~2.5W per GDDR6 module, or ~20W for 8 modules.
https://www.eeweb.com/high-bandwidth-memory-ready-for-ai-prime-time-hbm2e-vs-gddr6/
That leaves 30-80W for everything else on that side, which I cannot account for. Power conversion losses for the whole console should probably not be worse than roughly 30W at 10% inefficiency, and that's including the power supply itself and the more substantial number of conversion ICs on the SoC side.
The big glob of thermal compound over much of the converter ICs also makes me think that something at the upper end of that estimate would have closer contact with the heatpipe.
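Putting those estimates together (a rough sketch; the Side B range, the 8-module count on that side, and the ~300W wall draw are all my assumptions, and the ~2.5W per module comes from the linked eeweb article):

```python
# Rough power budget for "Side B" of the PS5 board, using the figures above.
side_b_w = (50, 100)      # "equivalent to the PS4's SoC" -> assumed 50-100+ W
gddr6_w = 8 * 2.5         # 8 modules on Side B, ~2.5 W each (eeweb estimate)
remainder = tuple(w - gddr6_w for w in side_b_w)

print(f"GDDR6 total: ~{gddr6_w:.0f} W")
print(f"Left for everything else on Side B: ~{remainder[0]:.0f}-{remainder[1]:.0f} W")

# Conversion losses for the whole console shouldn't plausibly exceed ~30 W:
wall_draw_w = 300         # assumed worst-case PS5 wall draw
inefficiency = 0.10       # ~10% loss across the PSU and the VRM stages
print(f"Upper bound on total conversion losses: ~{wall_draw_w * inefficiency:.0f} W")
```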
There were other structural features as well. One example is the thermal interface between the GDDR6 memory mounted on Side B of the main board and the shield plate. Instead of the so-called "stick-on" sheet-type thermal conductor, it is coated with a liquid material that hardens like rubber after a short time. This is a measure to increase productivity through automation.
In the case of the stick-on sheet-type heat-conductive materials, it is difficult for an automated machine to peel them from their backing sheet, so they have to be applied manually. The PS5 uses the liquid type for almost all of its thermal conductive material, whereas the PS4 series used it only in some places.
The thick material over the GDDR6 also hints that it's likely not dissipating that much heat. The reference to this choice allowing for more automation may also be a data point for my perception that the PS5's physical design placed a higher emphasis on mass production and compatibility with tooling.
When Sony said in the teardown video that work on a metal TIM began about two years ago, it did occur to me that that would have been around when they were having frequency issues with fixed clocks and moved to take advantage of boosting and AMD SmartShift. It's interesting how one decision may drive innovation in another area.
Two years could also point to when they had to commit to a physical SoC and node. Prior to 2018, perhaps there was some uncertainty about the 7nm node variant, since it seems AMD has at this point used the same N7P node for all its chips, despite all the speculation at the time about 7nm variants that could have provided additional performance.
The smaller-die strategy would have been a candidate the whole time, and I think the characterization of the processes at the time would have given Sony a decent idea of the risks in terms of power density and whatnot.
Perhaps it came down to cost in the end for any alternatives, be it the Oberon SOC on a different node variant or a larger SOC with lower clocks.
I think it's still worth lending some credence to Cerny's claim that they historically had trouble predicting power consumption and that power demands were spiky. At least it would have been more expensive to get the SOC to similar levels of performance without falling back to the silicon's self-management.
Official numbers are 560 GB/s vs 448 GB/s.
The whole debate around split pools I would generally ignore; it's largely speculation and not representative of what Scarlett is doing, which is asymmetrical memory sizes that allow the CPU and GPU to access memory at different speeds.
Memory size doesn't really govern the access speed of either the GPU or CPU. The CPU likely can only physically consume the same amount of bandwidth regardless of which slice of the memory space it is reading from, whereas the GPU has a substantially broader interconnect, so it can be constrained if its accesses fall outside of the GPU-optimized zone.
Don't forget the specific memory configuration on the XSX. When the CPU is in use (accessing the slower pool of memory), it will reduce the total available bandwidth, on top of the regular memory contention.
Memory is memory. If the CPU consumes X amount of bandwidth, it's X amount from the total available for either console.
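To put numbers on that simple subtraction model (a sketch only; the 48 GB/s CPU figure is an arbitrary example, and real-world contention overheads and the XSX's 560/336 GB/s regions are ignored):

```python
# Simplistic "memory is memory" model: whatever bandwidth the CPU consumes
# comes straight out of the total on either console.
XSX_TOTAL_GBPS = 560
PS5_TOTAL_GBPS = 448
cpu_demand_gbps = 48  # arbitrary example figure, not a measured number

for name, total in (("XSX", XSX_TOTAL_GBPS), ("PS5", PS5_TOTAL_GBPS)):
    left = total - cpu_demand_gbps
    print(f"{name}: {total} - {cpu_demand_gbps} = {left} GB/s left for the GPU")
```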
I think you mean L1, since that's what's shared by the CUs in a shader array.
L2 is matched to bus width on RDNA. So there's going to be much more L2 available. L2 is 5MB on XSX.
The number of L2 slices is generally matched to the bus width, but the capacity per slice is not. In theory, a narrower bus could have more L2 if the design opted to increase slice capacity. That could offset some of the downsides of running higher clocks relative to memory, and some of the bandwidth consumption, but the cost argument would go against it.
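A rough illustration of that point (assumptions: 4 L2 slices per 64-bit memory controller and 256 KB per slice, based on Navi 10; the actual slice sizes for either console aren't confirmed):

```python
# L2 capacity = (slice count tied to bus width) x (capacity per slice).
# Slice count scales with bus width, but slice capacity is a design choice.
def l2_capacity_mb(bus_width_bits, kb_per_slice, slices_per_64bit=4):
    slices = (bus_width_bits // 64) * slices_per_64bit
    return slices * kb_per_slice / 1024

print(l2_capacity_mb(320, 256))  # 5.0 MB -> matches the quoted XSX figure
print(l2_capacity_mb(256, 256))  # 4.0 MB -> 256-bit bus with Navi 10-style slices
print(l2_capacity_mb(256, 512))  # 8.0 MB -> same bus, doubled slice capacity
```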