Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

The question that hasn't really been answered comes down to what the clock rate looks like under sustained load. Because it blips up and down very quickly, we have no real idea of the effective computational output except to benchmark.
I believe Digital Foundry had an interview with Mark Cerny which stated that devs could cap either the GPU or the CPU if they wanted to. In other words, if they want the max GPU frequency 100% of the time, they could get it by sacrificing CPU frequency. Apparently the power budget is generous enough that both the CPU and GPU should be at max frequency under typical circumstances.
https://www.eurogamer.net/articles/digitalfoundry-2020-playstation-5-the-mark-cerny-tech-deep-dive
 

That's what Mark Cerny said:

"The CPU and GPU each have a power budget, of course the GPU power budget is the larger of the two. If the CPU doesn't use its power budget - for example, if it is capped at 3.5GHz - then the unused portion of the budget goes to the GPU. That's what AMD calls SmartShift. There's enough power that both CPU and GPU can potentially run at their limits of 3.5GHz and 2.23GHz, it isn't the case that the developer has to choose to run one of them slower."
That said, on both the CPU and GPU there are some functions that consume more power (Mark Cerny gave the example of 256-bit instructions on the CPU), and there will be power-hungry GPU functions as well.
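Purely to illustrate the budget-shifting idea (a toy model with made-up numbers, not Sony's actual algorithm): the CPU's unused budget gets handed to the GPU, and a block only downclocks when its allocation can't cover its demand.

```python
# Toy model of a shared CPU/GPU power budget with SmartShift-style
# reallocation. All wattages are illustrative, not PS5 measurements.

CPU_MAX_GHZ = 3.5
GPU_MAX_GHZ = 2.23
CPU_BUDGET_W = 40.0    # hypothetical per-block budgets
GPU_BUDGET_W = 160.0

def allocate(cpu_demand_w, gpu_demand_w):
    """Return (cpu_ghz, gpu_ghz) given each block's power demand at max clock."""
    cpu_used = min(cpu_demand_w, CPU_BUDGET_W)
    spare = CPU_BUDGET_W - cpu_used            # unused CPU budget shifts to the GPU
    gpu_allowance = GPU_BUDGET_W + spare
    # Crude assumption: clock scales linearly with available power when capped.
    cpu_ghz = CPU_MAX_GHZ * min(1.0, CPU_BUDGET_W / cpu_demand_w)
    gpu_ghz = GPU_MAX_GHZ * min(1.0, gpu_allowance / gpu_demand_w)
    return round(cpu_ghz, 2), round(gpu_ghz, 2)

print(allocate(cpu_demand_w=30, gpu_demand_w=170))  # both fit -> (3.5, 2.23)
print(allocate(cpu_demand_w=40, gpu_demand_w=200))  # GPU over its allowance -> GPU downclocks
```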
 

Isn’t the following the more important statement?
"Regarding locked profiles, we support those on our dev kits, it can be helpful not to have variable clocks when optimising. Released PS5 games always get boosted frequencies so that they can take advantage of the additional power," explains Cerny.

It sounds like the PS5 uses some form of AMD's AVFS but still has modes in its dev kits to make the optimization process similar to current consoles.
 
Yes, that is right!
So it is true that they can keep max GPU clocks, but not by downclocking the CPU. They can keep max GPU frequency either by underutilizing the CPU or by profiling their workload and optimizing it for cheaper (in electrical power terms) GPU usage, e.g. not having a lot of low-triangle geometry (Cerny stated that contributes to increased power consumption on the PS4 Pro in Horizon Zero Dawn).
 
Power consumption =/= frequency. Cerny explained this during the conference; that's why both the CPU and GPU can hit their max frequencies at the same time. I don't think developers are going to think in terms of "frequency" but rather in terms of how much power a particular effect costs them.
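To make that concrete, here is a hypothetical sketch of what "thinking in power rather than frequency" might look like in a profiling workflow: attributing measured power draw to individual render passes instead of watching the clock. The helper names (read_soc_power_watts, PowerScope) are invented for illustration; a real devkit would expose its own telemetry.

```python
# Hypothetical profiling helper: attribute measured power draw to named
# render passes, so optimisation targets the most power-hungry effects
# rather than chasing clock speed.

import time
from collections import defaultdict

power_by_pass = defaultdict(float)

def read_soc_power_watts():
    # Placeholder; a real devkit would report live telemetry here.
    return 180.0

class PowerScope:
    """Context manager accumulating approximate energy (watt-seconds) per named pass."""
    def __init__(self, name):
        self.name = name
    def __enter__(self):
        self.t0 = time.perf_counter()
        self.p0 = read_soc_power_watts()
        return self
    def __exit__(self, *exc):
        dt = time.perf_counter() - self.t0
        avg_w = (self.p0 + read_soc_power_watts()) / 2
        power_by_pass[self.name] += avg_w * dt  # roughly joules per pass

with PowerScope("geometry_pass"):
    pass  # render work would go here
with PowerScope("post_fx"):
    pass  # post-processing work would go here

print(sorted(power_by_pass.items(), key=lambda kv: -kv[1]))
```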
 
Presumably Sony's devkit has optimisation tools focussed on monitoring power draw. Plenty of devs have commented that the PS5 is easy to develop for, and Shuhei Yoshida has said that many developers are telling Sony they have never worked on a console as easy to develop for as the PS5.
 
It sounds like the developers could monitor the power consumption over time, but I don't think that's necessary. I just skimmed the optimization manual for the Zen CPU, where AVFS was introduced, and it doesn't seem like there is anything major to be considered. There are minor recommendations, like having fewer, longer NOP loops instead of many short ones. However, we still don't know if Sony is using the equivalent of AVFS or something similar.
https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf
 
"AVFS" is a bullet point in the circuitry design to enable "pay as you go" power-performance scalability in the IP blocks. The overall platform-level control and orchestration of power/frequency/voltage/etc happen in a programmable thing called System Management Unit (SMU), that talks to all the IP blocks via the Infinity Fabric control plane.

The difference you are looking for is unlikely anything more drastic than a special brew of the SMU firmware, carrying algorithms & profiles to work in accordance to PS5 spec.
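As a very rough sketch of the kind of control loop such firmware might run (purely illustrative; the real SMU algorithms, telemetry and numbers are proprietary): estimate power for each operating point from the reported activity and pick the fastest state that fits the budget.

```python
# Rough, purely illustrative sketch of an SMU-style DVFS decision:
# pick the highest frequency/voltage state whose estimated power fits
# the budget, given an activity factor reported by the IP block.
# None of these constants are real PS5 or AMD numbers.

DVFS_STATES = [(2230, 1.10), (2100, 1.05), (2000, 1.00), (1800, 0.95)]  # (MHz, V), fastest first
POWER_BUDGET_W = 180.0

def estimated_power(freq_mhz, volt, activity):
    # Dynamic power ~ activity * C * V^2 * f, with capacitance folded into a made-up constant.
    k = 7.4e-5
    return k * activity * volt * volt * freq_mhz * 1000

def pick_state(activity):
    """Return the fastest DVFS state whose estimated power fits the budget."""
    for freq, volt in DVFS_STATES:
        if estimated_power(freq, volt, activity) <= POWER_BUDGET_W:
            return freq, volt
    return DVFS_STATES[-1]  # fall back to the lowest state

print(pick_state(activity=0.6))  # light load -> stays at the top state
print(pick_state(activity=1.0))  # heavy load -> steps down one state
```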
 
God I hope not; every SDR-to-HDR routine I've ever seen looks like garbage. Leave the damn colourspace alone. Who knows what choices an artist might have made with a few billion extra colours and a way of setting luminance directly? Certainly not an engineer working to create a generic solution that can run in real time.
 
Their solution seems to be focused on rebalancing the luminance and less so on remapping colors.
 
There's no way to do one without the other, though. For the TV to show that broader luminance range it needs to be fed an HDR signal, so you have to translate the RGB colour values to one of the HDR signal formats to be able to specify luminance. It's one of the things that annoys me about Netflix on my Xbox: it flips into HDR mode regardless of the content, making SDR content look dull as all hell.
 
But you have control over the ultimate image by tailoring the metadata being fed to the TV to control how it tonemaps.
 
This is my issue with it, though: you are tone mapping from an 8-bit colour format to a 10-bit colour format to get the luminance channel you need, and there is no 1:1 mapping, because even if the colour value is highly similar the new luminance data is going to alter the final image. SDR games are SDR; they were designed that way from the ground up, and unless you dedicate a team to redoing every art asset in HDR you are just applying fancy filters.
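For what it's worth, the kind of generic expansion being criticised here can be sketched in a few lines (my own illustration, not any console's actual auto-HDR path): decode the 8-bit sRGB value to linear light, decide how bright SDR white should be, and re-encode with the PQ (ST 2084) transfer function used by HDR10. The artistic choices the post mentions are exactly what a formula like this cannot recover.

```python
# Minimal sketch of a "generic" SDR -> HDR10 luminance expansion. It just
# re-encodes existing 8-bit values with more headroom; it cannot invent
# artistic intent. Not any console's actual auto-HDR algorithm.

def srgb_to_linear(c8):
    """8-bit sRGB component -> linear light in [0, 1]."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def pq_encode(nits):
    """Linear luminance in nits -> PQ (SMPTE ST 2084) signal in [0, 1]."""
    m1, m2 = 2610 / 16384, 2523 / 4096 * 128
    c1, c2, c3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32
    y = min(nits, 10000.0) / 10000.0
    return ((c1 + c2 * y ** m1) / (1 + c3 * y ** m1)) ** m2

def sdr_to_hdr10(c8, paper_white_nits=200.0):
    """Map an 8-bit SDR component to a 10-bit PQ code value (chosen peak is arbitrary)."""
    linear = srgb_to_linear(c8)              # 1.0 == SDR reference white
    return round(pq_encode(linear * paper_white_nits) * 1023)

print(sdr_to_hdr10(255))  # SDR white lands around 200 nits, well below the top of the PQ range
print(sdr_to_hdr10(128))  # mid-grey is remapped proportionally into the PQ curve
```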
 
If this rumour is true about big Navi:

"Navi21 is really like 2x5700 which is 40 CU"

https://videocardz.com/newz/amd-sienna-cichlid-navi-21-big-navi-to-feature-up-to-80-compute-units

That sort of puts the XSX in a bit of no man's land. Not sure how they got to 52 CUs unless of course it's 2x2x14, which seems like the only possible configuration. But if you look at the column called Max_CU_per_SE, the number always stops at 10, which is 5 WGPs, not the 7 WGPs that MS has.
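Spelling out the arithmetic behind that 2x2x14 guess (the layout itself is the speculation above; the only hard inputs are the 52 active CUs in the XSX spec and RDNA's 2 CUs per WGP):

```python
# Working through the 2x2x14 guess above. The layout is speculation from the
# post; the hard inputs are the 52 active CUs of the XSX spec and the fact
# that one RDNA WGP is 2 CUs.

shader_engines = 2
shader_arrays_per_se = 2
cus_per_array = 14

physical_cus = shader_engines * shader_arrays_per_se * cus_per_array  # 56
active_cus = 52                                                       # XSX spec
disabled_for_yield = physical_cus - active_cus                        # 4

print(physical_cus, active_cus, disabled_for_yield)   # 56 52 4
print("WGPs per shader array:", cus_per_array // 2)   # 7, vs 5 on Navi 10
```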

I wonder if there will be a PC counterpart here.
 
That should mean that Navi21 isn't Sienna Cichlid, since we already have a Linux patch saying Sienna Cichlid has HBM2?
Also, Max_CU_per_SE doesn't always stop at 10: Navi 14 is at 12, Vega 10 at 16, Picasso/Raven at 11, etc.
 
That sort of puts the XSX in a bit of no man's land. Not sure how they got to 52 CUs unless of course it's 2x2x14, which seems like the only possible configuration. But if you look at the column called Max_CU_per_SE, the number always stops at 10, which is 5 WGPs, not the 7 WGPs that MS has.

Perhaps the Max_CU_per_SE is simply referencing that particular chip, and any products derived from it using disabled elements. NAVI14 lists 12, and the older VEGA10 lists 16 (there are others > 10 too). So it appears to be configurable from Navi all the way backwards, based on requirements.

And if you're expecting to be compute heavy (like Vega was), packing in CUs might not even be an issue anyway. L0 cache scales with CUs, and L2 scales with the number of memory channels (at least as far as I can tell from the RDNA whitepaper). The graphics command processor, the ACEs, the DMA block and all that seem to be separate from the number of shader engines anyway. Plus, 4 shader engines would probably have required 8 redundant CUs instead of 4 (wasting a good chunk of die area), and would also have meant doubling up the L1, rasterizer, primitive unit and RBs (costing space you could use on CUs).

I think it's 2x2x14 as you say, and if so I've no doubt it's by far the best use of the silicon budget that MS have!

The 4 RBs per SE (down from 8) is interesting though. So 4 x 4 x 16? 64 ROPs on top end AMD again? :runaway:

Maybe there really are some aspects of RDNA1 that console makers chose to keep. ;)
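Unpacking that "4 x 4" ROP maths above as back-of-the-envelope arithmetic, under the usual assumption that each RB handles 4 pixels per clock (as on RDNA1); this is speculation about a rumoured part, not a spec:

```python
# Back-of-the-envelope ROP count for the rumoured layout, assuming the
# usual 4 pixels per clock per RB (as on RDNA1). Speculation, not a spec.

shader_engines = 4
rbs_per_se = 4          # "down from 8", per the driver table being discussed
pixels_per_rb = 4

total_rbs = shader_engines * rbs_per_se   # 16
rops = total_rbs * pixels_per_rb          # 64

print(total_rbs, rops)  # 16 64
```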
 
That should mean that Navi21 isn't Sienna Cichlid, since we already have a Linux patch saying Sienna Cichlid has HBM2?
Also, Max_CU_per_SE doesn't always stop at 10: Navi 14 is at 12, Vega 10 at 16, Picasso/Raven at 11, etc.
Indeed. I suppose Max CU really means the max for the chip, and the lowered variants of it refer to redundancy, etc.
 
Of course it means the max for the chip, but my point was that 10 isn't any magical ceiling, so you shouldn't assume that just because Navi21's limit is (probably) 10 it applies to all RDNA2 products.
 