Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

Some workloads benefit from a higher clock. Some workloads benefit from a lower clock and higher energy-density instructions.

The alternative would be to make a more expensive console, or to always run at a lower clock. When optimizing for a fixed price point it makes sense to maximize what you can get out of that hardware, and Sony did just that. To avoid ever having to lower the clock, Sony would have to add a better power supply and better cooling, which would add to the console's price, and/or use a bigger, lower-clocked chip, again adding to the price.

Even MS had to do a similar (not identical) thing: using SMT lowers the CPU clock. For AVX2 on Xbox we don't know, but it would be reasonable to assume it will not lower the clock.

That is true.

All in all, every topic around PS5's boost limitations comes down to their budget for the cooling solution, which we don't know. But we can infer from Cerny's words that it's not a whole lot, or else they would not have had trouble holding the fixed frequencies he mentioned (3.0/2.0 GHz).
 
Yes, and for XSX, for example, the only custom hardware is the data decompressor. The remaining parts seem to be stock RDNA2 and DX12 Ultimate API things.

Aside from the comment @mrcorbo made, don't you think this interpretation might be a bit uncharitable?

DX12U *is* Microsoft's API after all, and the first AMD GPU [edit: generation] to support it fully just happens to be the one Microsoft's own Xbox GPU is based on.

It's quite reasonable that MS should want DX12U and their console hardware to mirror each other as much as possible.
 
A game struggling to hit 60 FPS on PS5:

- Devs are limited by TDP.
- Increasing workload per cycle increases TDP, reduces MHz = 60 FPS unreachable.
- Reducing workload per cycle decreases TDP, increases MHz = 60 FPS unreachable.
- Devs are forced to optimize code without increasing workloads, or make concessions to picture quality.
Have you thought about the fact that TDP is a pointless engineering metric if it is a moving goalpost?

Do you have any idea that a rendering pipeline, eh, comprises multiple stages that stress different graphics subsystems over the course of a frame? Throwing async compute into the picture makes the load mix even more application dependent, and even async compute can't be the perfect gap filler all the time, soaking up all the "under-utilised" resources. Blanket generalisations like this can only hold true when naive/impractical assumptions like "all software uses all hardware resources at 100% all the time" are made.
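To put rough (entirely made-up) numbers on that load-mix argument, here's a toy calculation showing how the average utilisation of any single subsystem over a frame stays well below 100% when different stages stress different units; the stage names and percentages are invented purely for illustration:

```python
# Toy illustration of a frame split into stages that stress different GPU
# subsystems. All stage timings and utilisation figures are invented.
stages = [
    # (name, duration in ms, ALU utilisation, bandwidth utilisation)
    ("shadow maps",  2.0, 0.25, 0.90),
    ("g-buffer",     3.0, 0.55, 0.80),
    ("lighting",     4.0, 0.90, 0.60),
    ("post-process", 2.5, 0.70, 0.85),
    ("UI + misc",    1.0, 0.20, 0.30),
]

frame_ms = sum(ms for _, ms, _, _ in stages)
avg_alu = sum(ms * alu for _, ms, alu, _ in stages) / frame_ms
avg_bw = sum(ms * bw for _, ms, _, bw in stages) / frame_ms

print(f"frame time: {frame_ms:.1f} ms")
print(f"average ALU utilisation:       {avg_alu:.0%}")   # ~62%
print(f"average bandwidth utilisation: {avg_bw:.0%}")    # ~72%
# Neither average reaches 100% even though individual stages peak on one
# resource or the other -- that gap is what async compute tries to fill,
# and it is also why "all units at 100% all the time" is unrealistic.
```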
 
A game struggling to hit 60 FPS on PS5:

- Devs are limited by TDP.
- Increasing workload per cycle increases TDP, reduces MHz = 60 FPS unreachable.
- Reducing workload per cycle decreases TDP, increases MHz = 60 FPS unreachable.
- Devs are forced to optimize code without increasing workloads, or make concessions to picture quality.

We will never have 60fps games on PS5. :runaway:
 
So then these are neither peak nor boost clock speeds as we know them.

If the clocks are absolutely 2.23 and 3.5... then what's the use of SmartShift?

What power could be "sent" to the GPU if it's already at the full 2.23 GHz?

This doesn't seem like a real thing.

Running at high clocks doesn't mean you're maxing out power draw. It's the nature of the instructions you're using and the operations you're performing which matters most.
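For what it's worth, here's a toy sketch of the idea behind a shared power budget. This is not AMD's actual SmartShift algorithm; every wattage figure and the proportional-scaling policy are assumptions invented for illustration:

```python
# Hypothetical sketch of a shared CPU+GPU power budget; NOT AMD's SmartShift
# implementation. All wattage figures and the arbitration policy are invented.
TOTAL_BUDGET_W = 200.0  # assumed combined CPU+GPU budget

def allocate(cpu_demand_w: float, gpu_demand_w: float) -> tuple[float, float]:
    """Grant each block what it asks for; if the sum exceeds the budget,
    scale both grants down proportionally (one possible policy)."""
    total = cpu_demand_w + gpu_demand_w
    if total <= TOTAL_BUDGET_W:
        return cpu_demand_w, gpu_demand_w
    scale = TOTAL_BUDGET_W / total
    return cpu_demand_w * scale, gpu_demand_w * scale

# Light CPU frame: the GPU can be granted more of the budget even though it is
# already at its top clock, because clock alone doesn't determine its draw.
print(allocate(cpu_demand_w=30.0, gpu_demand_w=160.0))  # (30.0, 160.0)
# Heavy frame on both sides: the budget is exceeded, so something gives.
print(allocate(cpu_demand_w=60.0, gpu_demand_w=180.0))  # scaled down to fit
```

The point being that "sending power to the GPU" is about headroom within the shared budget, not about pushing the clock past 2.23 GHz.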
 
Running at high clocks doesn't mean you're maxing out power draw. It's the nature of the instructions you're using and the operations you're performing which matters most.
Filling all 8 cores with work does mean power draw, though. He's right. Unless you intend to be sitting with a single core running at the max cap, there needs to be a reduction as the core count in use increases.
Look at how they took the TLOU remaster and immediately filled all the cores on PS4 to reach 60 fps.
[Image: TLOU Remastered PS4 CPU core utilisation graph]
 
Speculation time - what cooling tech could Sony be employing that we'll like the look of, as per Cerny's expectations, and that can deal with the predicted thermals? Is there anything new? Will their double-sided cooling come into play? Will the board be mounted centrally in the case with space either side, and a fan drawing air over both front and back? Any new magical techs?
Thermal guidelines for CPUs can be fluid. While not directly comparable, I did look at some values that were apparently sourced from AMD documentation in https://www.gamersnexus.net/guides/...lained-deep-dive-cooler-manufacturer-opinions.

If it were just the CPU, something along the lines of a Wraith Stealth might work for the CPU portion, assuming something like a combination of the Ryzen 3600 and 3700. Sony's target clock is a notch below the 3700's, although the PS5 has 8 cores versus the 65W 3600's 6. Odds are one core in the PS5 is OS-reserved, so Sony may be able to be more sure about what sort of instructions that one uses.
The main value I'm curious about is the P0 value (non-boost all-core base power), which can be unexpectedly high for the desktop processors. We can also see how widely power can vary with a modest change in clocks once you get into the boost ranges.

I think a console could get away with similar temp values and a cooler with capabilities in the same range as the Series X without going with double-sided cooling. I'm still not sure what the gain is here, and I think Sony might be going for less power.

Random speculation:
The 3600 seems to be the one that behaves itself, but how should we interpret the fact that the PS5's top-clock is below the base clock of a 65W CPU?
I don't recall seeing a PS4 power breakdown, but there was some speculation based on existing Jaguar chips that there might have been ~30W or so for 8 cores. How much lower than 65W is the ceiling for the PS5's CPU, and what does that mean for either the GPU or the total console power?
Microsoft could conceivably be allocating a value close to the desktop P0 rating for its CPU, if the power ratings from Digital Foundry are representative.
Is there a good reference for the GPU-only consumption of Navi 10? I thought I saw a reference to 180W for the 5700 XT. Let's say we toss on a 50% gain in efficiency for RDNA2 and could get ~120W for the same performance. If there were a guaranteed 90W for the GPU, 30W for the CPU, and 30W being sloshed about, that gives a 150W APU consumption and 30-50W for the rest of the console, for a Pro-like power budget. Maybe that's too conservative in terms of power savings?
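Spelling that arithmetic out (all inputs here are the speculative figures from the paragraph above, not measurements):

```python
# Back-of-envelope numbers from the speculation above; nothing here is measured.
navi10_gpu_w = 180.0                      # referenced GPU-only draw for the 5700 XT
rdna2_perf_per_watt_gain = 1.5            # the assumed "50% gain in efficiency"
rdna2_same_perf_w = navi10_gpu_w / rdna2_perf_per_watt_gain   # ~120 W

gpu_guaranteed_w = 90.0                   # guessed guaranteed GPU slice
cpu_w = 30.0                              # guessed CPU slice
sloshed_w = 30.0                          # power shifted between CPU and GPU
apu_w = gpu_guaranteed_w + cpu_w + sloshed_w                  # 150 W

rest_of_console_w = (30.0, 50.0)          # guessed memory, SSD, I/O, fan, PSU losses

print(f"RDNA2 at Navi 10 performance: ~{rdna2_same_perf_w:.0f} W")
print(f"APU: {apu_w:.0f} W, console total: "
      f"{apu_w + rest_of_console_w[0]:.0f}-{apu_w + rest_of_console_w[1]:.0f} W")
```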


To elaborate, what use is there in reaching 3.5/2.23 GHz if it means developers will have to cap their code to obey the power draw envelope, a.k.a. reduce work per cycle?
I think they'd profile their code for performance and see less performance than the raw numbers would suggest. That already happens since real life doesn't give perfect scaling, but it would be a more complex thing to profile and opens up a new class of interactions between workloads and threads as far as adverse effects are concerned.
 
Filling all 8 cores with work does mean power draw, though. He's right. Unless you intend to be sitting with a single core running at the max cap, there needs to be a reduction as the core count in use increases.
Look at how they took the TLOU remaster and immediately filled all the cores on PS4 to reach 60 fps.
[Image: TLOU Remastered PS4 CPU core utilisation graph]

But you would have to factor in that Cerny specifically mentioned expected heavy(ier) use of AVX2 instructions as a reason for having to reduce the clock speed.

This is all so speculative. We don't even know by how much the clocks will be reduced. It could be half, it could be 200 MHz. And it could be that it's really not that usual for it to happen, or it could happen all the time.
 
Even MS had to do a similar (not identical) thing: using SMT lowers the CPU clock. For AVX2 on Xbox we don't know, but it would be reasonable to assume it will not lower the clock.
We don't actually know that the PS5 doesn't have a different max clock speed when SMT is enabled either. All we really know is that it will adjust clocks based on load/power requirements, and having SMT enabled changes the power and load. That's the issue many of us are having with Sony's reveal: it only provided best-case performance figures, with a guarantee that those numbers are fluid and a confident "at or close to them most of the time" assurance. Things would be clearer if they gave us a range of performance. If they said the CPU will never drop below 3 GHz and the GPU never below 2 GHz, then we would know there isn't going to be a ton of range. But if the GPU is clocking down to 800 MHz, that would be a really noticeable drop, I think.
 
But you would have to factor in that Cerny specifically mentioned expected heavy(ier) use of AVX2 instructions as a reason for having to reduce the clock speed.

This is all so speculative. We don't even know by how much the clocks will be reduced. It could be half, it could be 200 MHz. And it could be that it's really not that usual for it to happen, or it could happen all the time.
You are right that we don't know, because the characteristics of the chip aren't known.
But we do know that the power/frequency relationship looks nearly exponential for a lot of chips (some chips may appear more linear in nature), meaning a significant amount of additional power draw is needed to marginally increase frequency.

Everyone thought 2.0 GHz was high; they said it wasn't achievable using normal methods. But using boost they got it to 2.23 GHz. On an exponential-like curve, that's a significant amount of additional power required to hold that frequency. If the workload increases while maintaining that clock, the power draw must be greater. Once the remaining power capacity is eaten up by the GPU and it requires even more power than that, it must withdraw power from the CPU or downclock itself.
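To put a number on "exponential": dynamic power scales roughly as P ∝ C·V²·f, and near the top of the curve voltage typically has to rise with frequency, so a crude but common approximation is power growing with the cube of clock. Under that assumption (and it is only an approximation; the real curve depends on the silicon):

```python
# Crude scaling sketch: dynamic power ~ C * V^2 * f, with the simplifying
# assumption that voltage rises roughly linearly with frequency near the top
# of the curve, giving power ~ f^3 in that region. Illustration only.
def relative_power(f_new_ghz: float, f_old_ghz: float, exponent: float = 3.0) -> float:
    return (f_new_ghz / f_old_ghz) ** exponent

print(f"clock: +{2.23 / 2.0 - 1:.1%}")                 # +11.5%
print(f"power: +{relative_power(2.23, 2.0) - 1:.1%}")  # roughly +39% under f^3
```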
 
Filling all 8 cores with work does mean power draw, though. He's right. Unless you intend to be sitting with a single core running at the max cap, there needs to be a reduction as the core count in use increases.
Look at how they took the TLOU remaster and immediately filled all the cores on PS4 to reach 60 fps.
[Image: TLOU Remastered PS4 CPU core utilisation graph]
I really wouldn't back "immediately filled all the cores" with this graph, which pretty much depicts the opposite story, i.e. the varying workload mix in reality that advanced power management techniques are designed to take advantage of. That is on top of the fact that they had already been actively seeking out major parallelisation wins (in this graph in particular: rendering a frame ahead of game logic).
 
Even MS had to do a similar (not identical) thing: using SMT lowers the CPU clock. For AVX2 on Xbox we don't know, but it would be reasonable to assume it will not lower the clock.
If MS can run AVX on 16 threads at 3.66 GHz without lowering clocks, why can't Sony run AVX on 16 threads at 3.5 GHz?
 
You are right that we don't know, because the characteristics of the chip aren't known.
But we do know that the power/frequency relationship looks nearly exponential for a lot of chips (some chips may appear more linear in nature), meaning a significant amount of additional power draw is needed to marginally increase frequency.

Everyone thought 2.0 GHz was high; they said it wasn't achievable using normal methods. But using boost they got it to 2.23 GHz. On an exponential-like curve, that's a significant amount of additional power required to hold that frequency. If the workload increases while maintaining that clock, the power draw must be greater. Once the remaining power capacity is eaten up by the GPU and it requires even more power than that, it must withdraw power from the CPU or downclock itself.

Max clock is not max power draw. Furmark is a good example of this: it heats up the GPU in a way no game does. In the console world it's probably possible to find Furmark-like cases, but not in all the games, all the time. We would really need some developer to break NDA and tell us how easy it is to create virus-like power loads, or whether it's difficult and the PS5 coasts at max clock most of the time.

edit: Cerny was specific in saying the max GPU clock is not limited by power/heat. There are parts of the GPU that just refuse to work at a higher clock no matter what.
 
I really wouldn't back "immediately filled all the cores" with this graph, which pretty much depicts the opposite story, i.e. the varying workload mix in reality that advanced power management techniques are designed to take advantage of. That is on top of the fact that they had already been actively seeking out major parallelisation wins (in this graph in particular: rendering a frame ahead of game logic).
You're right, it took them a while to get to this point, so it wasn't immediate in that sense. Their job functions had all sorts of gaps everywhere on all cores.
To me their original graphs looked like they had tons of breathing room for the CPU to sit idle. To me this is the type of load where the boost frequency will stay rather high.
[Image: the earlier, sparser core utilisation graph]

The original graphs were a lot less filled, and they couldn't even get to a 33 ms frame time. They had to overlap their rendering frame with the next frame's game logic.
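That overlap is basically pipelining: simulate frame N+1 while frame N is still being rendered, so the steady-state frame interval is bounded by the slower of the two stages instead of their sum. A simplified sketch with made-up timings (not Naughty Dog's actual job scheduler):

```python
# Simplified frame-pipelining sketch with invented timings; this is just the
# overlap idea, not Naughty Dog's actual job system.
game_logic_ms = 15.0   # simulate frame N+1
rendering_ms = 16.0    # render frame N

serial_ms = game_logic_ms + rendering_ms          # each frame pays both stages
pipelined_ms = max(game_logic_ms, rendering_ms)   # stages overlap across frames

print(f"serial:    {serial_ms:.1f} ms (~{1000 / serial_ms:.0f} fps)")
print(f"pipelined: {pipelined_ms:.1f} ms (~{1000 / pipelined_ms:.0f} fps)")
# The price is an extra frame of latency and cores that now have work queued
# from both stages -- which is why the utilisation graph ends up so full.
```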
 
Max clock is not max power draw. Furmark is a good example of this: it heats up the GPU in a way no game does. In the console world it's probably possible to find Furmark-like cases, but not in all the games, all the time. We would really need some developer to break NDA and tell us how easy it is to create virus-like power loads, or whether it's difficult and the PS5 coasts at max clock most of the time.
Is the assumption that developers can't stress a GPU and CPU unless it's a stress test?
I don't think that's reflective of the many current-generation consoles running their fans at full tilt.
 
If MS can run AVX on 16 threads at 3.66 GHz without lowering clocks, why can't Sony run AVX on 16 threads at 3.5 GHz?

My best guesses would be the power supply and/or cooling subsystem. Or it could be that MS did something specific in their chip design to allow for higher clocks.
 
Is the assumption that developers can't stress a GPU and CPU unless it's a stress test?
I don't think that's reflective of the many current-generation consoles running their fans at full tilt.

This is in no way contradicting what I claimed. I just said that, as per Cerny, max clock speed doesn't imply max power draw. Very well-optimized games would find the power draw limits, and we really don't know what that means. Cerny was very specific in mentioning AVX2 for the CPU side, and very specific in saying the GPU clock speed is not limited by thermals/power draw but by some internal implementation detail of the GPU; it just doesn't clock any higher no matter what.

Taking what Cerny said, it means that to hit those power draw limits one would have to have sufficiently well-optimized code. Again, what that really means we will not know until some developer spills the beans. Just hitting max clocks is not enough to cause throttling, though.
 
My best guesses would be the power supply and/or cooling subsystem. Or it could be that MS did something specific in their chip design to allow for higher clocks.

I find it strange that so many theories appear based on power problems with a 2.23 GHz, 36 CU APU and lower CPU clocks, while no one finds power problems with an APU that has higher CPU clocks (+5% with SMT) and 52 CUs (+44%) at 1,825 MHz (-18%), plus extra logic to route more data and a wider memory interface.

Why is that? What data do you have for that line of thought? Based only on these things, I would guess Microsoft's APU will consume more power than the PS5's.
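For reference, the announced figures side by side (FP32 peak being CUs × 64 lanes × 2 FLOPs/cycle × clock):

```python
# Announced specs for both APUs; FP32 peak = CUs * 64 lanes * 2 FLOP/cycle * clock.
def tflops(cus: int, clock_ghz: float) -> float:
    return cus * 64 * 2 * clock_ghz / 1000.0

ps5_cus, ps5_gpu_clk = 36, 2.23       # GHz, "up to"
xsx_cus, xsx_gpu_clk = 52, 1.825      # GHz, fixed
ps5_cpu_clk, xsx_cpu_clk = 3.5, 3.66  # GHz, both with SMT enabled

print(f"PS5: {tflops(ps5_cus, ps5_gpu_clk):.2f} TF")        # ~10.28 TF
print(f"XSX: {tflops(xsx_cus, xsx_gpu_clk):.2f} TF")        # ~12.15 TF
print(f"CUs:       {xsx_cus / ps5_cus - 1:+.0%}")           # +44%
print(f"GPU clock: {xsx_gpu_clk / ps5_gpu_clk - 1:+.0%}")   # -18%
print(f"CPU clock: {xsx_cpu_clk / ps5_cpu_clk - 1:+.0%}")   # +5%
```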
 