PlayStation 5 [PS5] [Release November 12 2020]

Cerny said that heat management was an important aspect of the hardware design, but that we will have to wait for the official teardown to find out the details.

Yeah, no kidding. All these discussions we’ve seen claim that power shifting and such was just meant to let Sony claim a higher theoretical speed, while Cerny in the presentation was very clearly (to me) trying to explain how they set out to make peak power consumption more predictable. That's an important part of reducing, or maybe even eliminating, the situations where the PS5’s fan has to go into full jet engine mode, as happens relatively frequently with the PS4.
 
Sony has extra logic in the CPU and GPU to detect load. This load detection is then used to drive power and clock speed via SmartShift. Sony's solution is more than SmartShift and isn't something that could just be enabled at the last moment. It must have been planned from the beginning, as it requires extra transistors to implement the load counters.

If Sony didn't have the load detection in the CPU and GPU, it wouldn't be possible to make every console behave the same. If the load counters/detection logic weren't extensive enough, it could potentially lead to hardware bugs where excessive power draw (load) crashes the hardware. PC-side implementations are much simpler and are based on just power draw and temperature. The mileage a PC user gets depends on ambient temperature, cooling and the silicon lottery. The PS5 is deterministic; PC SmartShift solutions are not.

It's important for a console to be deterministic so every user gets the same performance despite quality differences in silicon, and despite some users having cool rooms and others having higher ambient temperatures.
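
To make the difference concrete, here's a toy sketch in Python of the two approaches. Everything in it is an assumption for illustration: the power budget, the model coefficient, the cubic power-vs-clock relationship, the throttle temperature, and the function names `deterministic_clock` and `pc_style_clock` are all made up and are not Sony's or AMD's actual logic. The only point is that the first function depends on the workload alone, while the second also depends on the individual chip and the room it sits in.

```python
# Toy illustration only: constants and the power model are invented.
MAX_CLOCK_GHZ = 2.23      # PS5 GPU clock cap mentioned in this thread
POWER_BUDGET_W = 180.0    # hypothetical fixed power budget
MODEL_COEFF = 20.0        # hypothetical watts at full activity and 1 GHz
THROTTLE_TEMP_C = 90.0    # hypothetical thermal throttle point


def deterministic_clock(activity: float) -> float:
    """Pick a clock from workload activity alone, using one shared model.

    `activity` (0..1] stands in for what on-chip load counters would report.
    No temperature or per-chip measurement is used, so every console
    returns the same clock for the same workload.
    """
    if activity <= 0:
        return MAX_CLOCK_GHZ
    # Assumed model: power = MODEL_COEFF * activity * clock**3.
    clock = (POWER_BUDGET_W / (MODEL_COEFF * activity)) ** (1 / 3)
    return min(MAX_CLOCK_GHZ, clock)


def pc_style_clock(activity: float, chip_leakage: float, temp_c: float) -> float:
    """Pick a clock from measured power and temperature.

    `chip_leakage` scales real power draw relative to the model
    (silicon lottery), and `temp_c` depends on cooling and the room.
    """
    if activity <= 0:
        return MAX_CLOCK_GHZ
    effective = MODEL_COEFF * activity * chip_leakage
    clock = min(MAX_CLOCK_GHZ, (POWER_BUDGET_W / effective) ** (1 / 3))
    if temp_c > THROTTLE_TEMP_C:
        clock *= 0.9  # hypothetical thermal throttle
    return clock


if __name__ == "__main__":
    print(deterministic_clock(1.0))        # same on every console
    print(pc_style_clock(1.0, 0.9, 60.0))  # good chip, cool room
    print(pc_style_clock(1.0, 1.1, 95.0))  # leaky chip, hot room
```

With the same workload, the deterministic version always returns the same clock, while the PC-style version returns a different clock for a leaky chip or a hot room.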
 
I thought the deterministic side meant that any calculations on performance were totally separate from local hardware monitoring. Instead, a model APU is used, which is the same in all retail consoles.

This model monitors the incoming work queue and sets the frequency, and SmartShift diverts power to enable this. SmartShift may use local hardware monitoring to ensure the right power goes to that specific APU.

If the model is badly made, or a chip is poor, it absolutely could fail, and this must be tested for when validating usable APUs.

I am a layperson, but that is how Cerny's talk came across to me.
 
Also important to note that the sheer size of the thing should point to an effort to keep it quiet. I can't see Sony building something so massive and still having it sound like a jet engine. Might as well keep it small if that were the case.
Yeah, there is no way Mark Cerny would have mentioned heat and noise so prominently if the PS5 wasn't significantly better. I don't expect it to be silent, but it can't be another jet engine console.
 
Do you really think 36 CUs@2.23 GHz can be faster than 44 CUs@1.825 GHz with 10.28TFs?

44 CUs 1.825 GHz can be easily fixed frequency.
Also many say RDNA 2 doesn’t scale very much with frequency.
 
Having more CUs may have meant too many ripples in their SDK, for instance to account for which CU to use, etc. They had to make that adjustment when going to the PS4 Pro with 36 CUs. I don't know if they wanted to expand those fields even further for the PS5 SDK.
 
Do you really think 36 CUs@2.23 GHz can be faster than 44 CUs@1.825 GHz with 10.28TFs?

44 CUs 1.825 GHz can be easily fixed frequency.
Also many say RDNA 2 doesn’t scale very much with frequency.
In some scenarios yes, and in others no.
The front end is the same, so that's going to give the advantage to the super high clock speed. But if you run into a part of the code where you are shader bound, then the 44 CUs will perform better.
It's not exactly clear-cut because of that, unfortunately. It's going to depend on where the engine is spending more of its time and what it's doing.

The answer below mine is better.
 
Do you really think 36 CUs@2.23 GHz can be faster than 44 CUs@1.825 GHz with 10.28TFs?
It would depend on what the workload is limited by.
Other parts of the GPU that don't scale with CU count would be clocked 22% faster, and if they are the greater bottleneck, the higher clock speed can provide a bigger benefit.
Workloads with less parallelism or with small batch sizes can also leave CUs underutilized.
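
For anyone who wants to check the numbers, here's a quick Python sketch. It uses the usual peak-FLOPS formula for these GPUs (64 shader lanes per CU, 2 FLOPs per lane per clock for a fused multiply-add); the function name `peak_tflops` is just something I made up for the example. It only shows that the two configurations land on the same paper figure while the clocks differ by about 22%, nothing about real game performance.

```python
def peak_tflops(cus: int, clock_ghz: float) -> float:
    """Peak FP32 throughput: CUs * 64 lanes * 2 FLOPs per clock (FMA) * clock."""
    return cus * 64 * 2 * clock_ghz / 1000.0


narrow = peak_tflops(36, 2.23)    # 36 CUs at 2.23 GHz
wide = peak_tflops(44, 1.825)     # hypothetical 44 CUs at 1.825 GHz
clock_ratio = 2.23 / 1.825        # how much faster everything clock-bound runs

if __name__ == "__main__":
    print(f"36 CUs @ 2.23 GHz:  {narrow:.2f} TF")
    print(f"44 CUs @ 1.825 GHz: {wide:.2f} TF")
    print(f"clock advantage:    {(clock_ratio - 1) * 100:.0f}%")
```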

Also many say RDNA 2 doesn’t scale very much with frequency.
I don't think there's been any kind of comparison made in public for clock scaling for an unreleased architecture.
 
Do you really think 36 CUs@2.23 GHz can be faster than 44 CUs@1.825 GHz with 10.28TFs?

44 CUs 1.825 GHz can be easily fixed frequency.
Also many say RDNA 2 doesn’t scale very much with frequency.
In the PS5 presentation Cerny said that there are benefits to higher clock speeds compared to more CUs. Can't remember the exact quote. But I wish we got another teardown that better explains the PS5 hardware.
 
In the PS5 presentation Cerny said that there are benefits to higher clock speeds compared to more CUs. Can't remember the exact quote. But I wish we got another teardown that better explains the PS5 hardware.

Some hardware units might not scale with flops. It could be, for example, that the PS5 and Xbox have the same number of ROPs, TMUs and geometry units. In that case the higher clock speed would provide more of those resources to use. Also, if some loads don't fully utilize the GPU, then a narrower, higher-clocked GPU might get better performance, as the wider GPU is not fully utilized and is running at a lower clock speed. But all of this is conjecture. The Xbox for sure has the better GPU, but how much better we cannot deduce from flops alone.
 
The CU count was obviously the best balance between performance and as close as possible hardware backwards compatibility with PS4 (Pro).
 
A rising tide lifts all boats?

Yes. What Mark Cerny said was:

I like running the GPU at a higher frequency. Let me show you why. Here's two possible configurations for a GPU roughly of the level of the PlayStation 4 Pro. This is a thought experiment, don't take these configurations too seriously. If you just calculate teraflops you get the same number, but actually the performance is noticeably different, because teraflops is defined as the computational capability of the vector ALU.

That's just one part of the GPU. There are a lot of other units, and those other units all run faster when the GPU frequency is higher. At 33% higher frequency, rasterization goes 33% faster, processing the command buffer goes that much faster, the L2 and other caches have that much higher bandwidth, and so on. About the only downside is that system memory is 33% further away in terms of cycles, but the large number of benefits more than counterbalanced that. As a friend of mine says, a rising tide lifts all boats. Also, it's easier to fully use 36 CUs in parallel than it is to fully use 48 CUs. When triangles are small it's much harder to fill those CUs with useful work. So there's a lot to be said for faster, assuming you can handle the resulting power and heat issues.
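
The thought experiment in that quote can be put into a few lines of Python. Note the assumptions: the quote only gives the CU counts (36 and 48) and the 33% frequency gap, so the 1.0 GHz and 0.75 GHz clocks below are illustrative values picked so the teraflops match, and the 200 ns memory latency is purely hypothetical. The helper names are mine.

```python
# Two configurations with equal vector ALU throughput (clocks are assumed values).
WIDE = {"cus": 48, "clock_ghz": 0.75}
NARROW = {"cus": 36, "clock_ghz": 1.00}

MEMORY_LATENCY_NS = 200.0  # hypothetical fixed DRAM latency in nanoseconds


def vector_alu_gflops(cfg: dict) -> float:
    """Peak FP32 GFLOPS: CUs * 64 lanes * 2 FLOPs per clock * clock in GHz."""
    return cfg["cus"] * 64 * 2 * cfg["clock_ghz"]


def latency_cycles(cfg: dict, latency_ns: float) -> float:
    """A fixed latency in nanoseconds costs more cycles at a higher clock."""
    return latency_ns * cfg["clock_ghz"]


# Units that don't scale with CU count (rasterizer, command processor, caches)
# speed up in proportion to the clock.
clock_speedup = NARROW["clock_ghz"] / WIDE["clock_ghz"]

if __name__ == "__main__":
    print(vector_alu_gflops(WIDE), vector_alu_gflops(NARROW))  # same paper flops
    print(f"fixed-function speedup: {clock_speedup:.2f}x")
    print(latency_cycles(WIDE, MEMORY_LATENCY_NS),
          latency_cycles(NARROW, MEMORY_LATENCY_NS))
```

Same flops on paper, about 1.33x on everything tied to the clock, and the same 1.33x more cycles spent waiting on memory, which is the one downside he mentions.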
I've thought about the real-world implications of PS5 vs XSX a few times, but each time I get a bit of a headache, and things like the advantage of L3 cache, which are going to be significant, are also highly variable. Whatever anyone expects to see in the coming Digital Foundry videos, I think there will be more than a few surprises, specifically performance being better or worse than expected, depending on the title.
 