In his presentation, Mark Cerny said the "constant power" approach can push the GPU clock to 2.23GHz, while a "constant frequency" approach would
have trouble maintaining 2GHz.
Is there any reason why that happens? If the cooling solution can handle 2.23GHz under heavy workload, why would there be a problem running a
"constant 2GHz" even if there are some power spikes?
Hmm, the reason is that there is a limit to how much power the silicon can safely handle before it runs into thermal problems, whether that power comes from the wall or from a battery (laptops). I also think there is a power budget shared between the CPU and GPU, and that holds whether or not either one is saturated.
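Here's a rough sketch of how I read the question, not anything from the presentation itself. It assumes voltage tracks frequency, so power goes roughly as activity times frequency cubed, and every constant (`POWER_BUDGET`, `K`, the activity levels) is made up and picked only so the numbers land near the ones quoted above. The idea is that a fixed clock has to be set low enough for the worst-case workload, while a variable clock only drops when a workload actually hits the budget.

```python
# Illustrative only: constants are assumptions, not PS5 figures.
POWER_BUDGET = 100.0   # arbitrary power units
K = 10.0               # hypothetical scaling constant
F_MAX = 2.23           # GHz, the clock cap quoted in the question


def power(freq_ghz, activity):
    """Assumed model: voltage tracks frequency, so power ~ activity * f^3."""
    return activity * K * freq_ghz ** 3


def fixed_clock(worst_activity):
    """A constant clock must stay inside the budget even for the worst workload."""
    return min(F_MAX, (POWER_BUDGET / (worst_activity * K)) ** (1 / 3))


def variable_clock(activity):
    """A constant-power design picks the highest clock the current workload allows."""
    return min(F_MAX, (POWER_BUDGET / (activity * K)) ** (1 / 3))


WORST = 1.25  # hypothetical worst-case activity factor
for activity in (0.6, 0.8, 1.0, WORST):
    print(f"activity {activity:.2f}: fixed {fixed_clock(WORST):.2f} GHz, "
          f"variable {variable_clock(activity):.2f} GHz")
```

Under these assumptions the variable clock is never lower than the fixed one, and it only matches it on the rare worst-case workload.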
I think traditionally on APUs, power priority went to the GPU. So if the GPU started ramping up really hard, the CPU would start to get starved of power. In this scenario the shift is two-way, so if the CPU needs that power, it can borrow it from the GPU.
Both CPUs and GPUs have their own specific power curves. It should look something like this, taken from AMD SmartShift. You'll see wattage on the X axis and GPU performance on the Y axis.
There is a sweet spot of performance that hugs the upper left corner; this is where you get the most performance for the least wattage. As you want more performance, you incur diminishing returns to get it.
As you can see in the image, with each additional 25W the GPU gets, it gains a little less performance than it did from the previous 25W.
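To put rough numbers on the diminishing returns, here is a small sketch. The curve shape (a saturating exponential) and the `peak` and `scale` constants are my own assumptions, not values read off AMD's graph; the point is only that each extra 25W step buys less than the one before.

```python
import math


def gpu_performance(watts, peak=100.0, scale=60.0):
    """Hypothetical saturating curve: performance flattens as wattage grows."""
    return peak * (1.0 - math.exp(-watts / scale))


def marginal_gains(start_w, step_w, steps):
    """Performance gained by each successive step_w increase in power."""
    gains = []
    for i in range(steps):
        lo = start_w + i * step_w
        hi = lo + step_w
        gains.append(gpu_performance(hi) - gpu_performance(lo))
    return gains


gains = marginal_gains(start_w=50, step_w=25, steps=4)
for i, gain in enumerate(gains):
    lo = 50 + i * 25
    print(f"{lo}W -> {lo + 25}W: +{gain:.1f} performance units")
```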
So with SmartShift, you're basically setting a power bound and shifting the power around to suit the task. I see SmartShift as a way to gain performance, but I don't think it necessarily avoids the issue when the CPU and GPU are in heated contention for power.
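A toy version of the budget idea, to show where contention bites. This is not AMD's actual SmartShift algorithm; `split_budget` and the proportional scaling rule are hypothetical, and the wattages are invented.

```python
def split_budget(total_w, cpu_demand_w, gpu_demand_w):
    """Share a fixed power budget between CPU and GPU (hypothetical policy)."""
    demand = cpu_demand_w + gpu_demand_w
    if demand <= total_w:
        # No contention: both get what they ask for.
        return cpu_demand_w, gpu_demand_w
    # Contention: scale both requests down so the total fits the budget.
    scale = total_w / demand
    return cpu_demand_w * scale, gpu_demand_w * scale


# Light CPU load: the GPU can take most of the budget.
print(split_budget(200, 40, 150))
# Both want a lot: neither gets its full request.
print(split_budget(200, 80, 180))
```

Shifting helps in the first case, but in the second case both sides still end up short, which is the contention problem.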
If you look at this Radeon tool that allows you to set the frequency/wattage for your RDNA card:
You'll easily see that the PS5 sits just outside the axis of this graph at its max cap.
But if you look at how the graph curves upward almost exponentially, it's eating a lot more power to keep gaining frequency once you get past the middle dot, which I think is around the 1350MHz mark on this graph. Ironically, it says 1825 for the limit set here. If we were to follow an imaginary line to 2200, the voltage would likely be over 1400 mV. That's nearly 2x the baseline level, and power draw rises faster than voltage does.
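For a feel of why the top end gets so expensive, here is a sketch using the usual rule of thumb that dynamic power scales with voltage squared times frequency. The 1825 and 2200 MHz figures are from above, but the voltages are hypothetical placeholders, not readings from the tool.

```python
def relative_power(freq_mhz, volts, base_freq_mhz, base_volts):
    """Dynamic power relative to a baseline, using P ~ V^2 * f."""
    return (freq_mhz / base_freq_mhz) * (volts / base_volts) ** 2


# Hypothetical operating points: the voltages are assumptions for illustration.
base_freq, base_volts = 1825, 1.0
high_freq, high_volts = 2200, 1.2

ratio = relative_power(high_freq, high_volts, base_freq, base_volts)
clock_gain = high_freq / base_freq
print(f"clock: x{clock_gain:.2f}, power: x{ratio:.2f}")
```

With these made-up voltages, the power multiplier comes out well above the clock multiplier, because the voltage increase is squared.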
The PS5 graph will look something like this. There will be a sweet spot somewhere in there where giving back some clock rate will be a super boon for the CPU to do its work.
And somehow, if you're running the GPU hard at its max clock cap, it must be pulling a lot of resources away from the CPU. The graph will be different with RDNA 2, of course, but I have a hard time believing it's going to be that different.
Otherwise you've just got this super beefy PSU pumping out huge amounts of power to both the CPU and GPU.