Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

For example, when the GPU is under high load, the CPU is not, and vice versa.
Assume that the power shift happens with a frequency of ~1000Hz, for example.
That only means that the system is working at its max (in this case, fixed) power (as in volts/amps), and not that the CPU and GPU are both under max load at max frequency.

I think you may be missing part of the formula there. CPU and GPU clocks could both be maxed at times when both aren't also at high load.
That could be true, but everything Cerny has said points to frequency being determined by load. I suppose clocks could be maxed when there is no load, but what difference would that make if, as soon as you start using the system, they drop in frequency? On a related note, I can turn invisible when no one is looking. Unfortunately, every time someone looks in my direction, I turn visible again and they see me.
 
I don't think it's going to be as simple as shifting CPU and GPU load. The impression I got from Cerny, if you listen carefully, was that it's just another tool to afford the GPU more power. But in general the CPU is relatively low-powered versus the GPU, and it has to function in some capacity, so SmartShift in and of itself is probably not enough. I'd expect additional GPU downclocking to be necessary, even when shifting some CPU power in high-load scenarios.
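To make the fixed-budget idea concrete, here's a toy allocation rule in Python. Every number, and the cube-root step, is my own assumption for illustration; this is not AMD's actual SmartShift algorithm or anything Sony has published.

Code:
# Toy model of a fixed SoC power budget with SmartShift-style reallocation.
# All figures are invented for illustration only.
TOTAL_BUDGET_W = 200.0   # assumed fixed SoC power budget
CPU_MAX_W = 60.0         # assumed CPU draw at max clock under full load
GPU_MAX_W = 180.0        # assumed GPU draw at max clock under full load

def allocate(cpu_load, gpu_load):
    """Given activity levels in [0, 1], return (cpu_watts, gpu_watts, gpu_clock_scale)."""
    cpu_w = CPU_MAX_W * cpu_load
    gpu_demand = GPU_MAX_W * gpu_load
    gpu_available = TOTAL_BUDGET_W - cpu_w   # whatever the CPU isn't using
    if gpu_demand <= gpu_available:
        return cpu_w, gpu_demand, 1.0
    # Power scales roughly with f^3 (f * V^2, with V tracking f), so the clock
    # only needs to drop by the cube root of the overshoot.
    scale = (gpu_available / gpu_demand) ** (1 / 3)
    return cpu_w, gpu_available, scale

for cpu_load, gpu_load in [(0.3, 1.0), (1.0, 1.0), (1.0, 0.5)]:
    cpu_w, gpu_w, clk = allocate(cpu_load, gpu_load)
    print(f"CPU load {cpu_load:.0%}, GPU load {gpu_load:.0%} -> "
          f"CPU {cpu_w:.0f} W, GPU {gpu_w:.0f} W, GPU clock at {clk:.1%}")

The point is just that the GPU only sheds clock when the CPU is actually eating into the shared budget; the rest of the time it sits at its cap.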

Not sure where Techpowerup gets their info, but everything looks pretty accurate, and they list the PS5 GPU base clock at 1750MHz and game clock at 1900MHz. https://www.techpowerup.com/gpu-specs/playstation-5-gpu.c3480
My guess, and we'll see when the dev docs are available, is that the scheduler is constrained in some way at higher clocks if Dictator is right with his reporting that there are power profiles the developer chooses from.
Techpowerup is very unreliable. I wouldn't read into this until we get closer to release and some devs give more info regarding it.
 
80 ROPs for XBSX? They are smoking a pretty good shit.

I guess we'll see; there is good evidence based on the latest die shot from MS that there are 5 asymmetrical shader arrays to match up with the 320-bit memory bus: 3 arrays with 10 CUs each and 2 arrays with 8 CUs each, with 4 in total disabled for yields.
 

Asymmetrical chip design?
We know that L2 and the MCs are independent, and shared. But 56 CUs leave no room for anything other than 4 SEs.
It's off-topic, though.
 

L2 is tied to each 64-bit channel. A 320-bit interface requires 5 x 64-bit channels. There is no requirement for a power-of-two number of shader arrays. Each array would contain 16 ROPs, and there are 5 arrays, just with an unequal number of CUs in them: 3 have 10, 2 have 8. Look at the die shot. I don't see how that leaves no chance.
 
No, they're not tied. AMD's GPU L2 connects to Infinity Fabric, not to the memory controllers.

(As a side note, I know everyone keeps referring to them as 64-bit memory controllers, but are they really? At least certain AMD slides suggest they're actually 16-bit controllers, which also fits the fact that GDDR6 uses 16-bit channels. The other option would be a 64-bit controller split into 4x16-bit "virtual" memory controllers, but why list them as 16 separate MCs (for a 256-bit interface) then?)
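However you count the controllers, the totals come out the same. A quick sanity check with the public bus widths and 14 Gbps GDDR6:

Code:
# Bus arithmetic only; the 16-bit vs 64-bit controller question doesn't change the totals.
def gddr6_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s for a GDDR6 interface."""
    return bus_width_bits * data_rate_gbps / 8

for name, bus, rate in [("Series X (320-bit)", 320, 14), ("PS5 (256-bit)", 256, 14)]:
    chips = bus // 32            # a GDDR6 device is 32 bits wide...
    channels = bus // 16         # ...organised as two independent 16-bit channels
    print(f"{name}: {chips} chips, {channels} x 16-bit channels "
          f"(= {bus // 64} x 64-bit groups), ~{gddr6_bandwidth_gb_s(bus, rate):.0f} GB/s at {rate} Gbps")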
 
Regarding BC for PS3, call me crazy, but couldn't Sony port the IP of the CELL BE to modern nodes? I understand the porting process itself may be pricey and there may also be some licensing concerns with IBM/Toshiba, but I'd think the eventual unit cost would be incredibly cheap, it would draw a tiny amount of power, and the die would be absolutely minuscule. It could simply be placed on the motherboard and, like any chip, it could continue to be shrunk with future nodes over time.

It'd be a fairly major undertaking, not least of which is porting from the original IBM SOI design rules while shifting to a much smaller node; consider, for example, the pipelining and stages needed to achieve a certain frequency with a particular transistor design. Everything in there changes from top to bottom.

:oops: :oops:

They may also have to do some special design work to make it slower, just like IBM had to do for Xenon when they did the 45nm revision.
 
Cerny stated it during his reveal. Here's a Eurogamer article with a quote:

"Rather than look at the actual temperature of the silicon die, we look at the activities that the GPU and CPU are performing and set the frequencies on that basis - which makes everything deterministic and repeatable," Cerny explains in his presentation. "While we're at it, we also use AMD's SmartShift technology and send any unused power from the CPU to the GPU so it can squeeze out a few more pixels."​

If power delivery is constant, clocks are known to be variable and not tied to thermals, and unused power from the CPU has to be shifted to the GPU to achieve the highest clocks, then the highest clocks for the GPU and CPU can't be achieved at the same time. Otherwise the clocks would already be at maximum and power wouldn't have to be shifted to the GPU to achieve those speeds.
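Written out as the constraint I think Cerny is describing (my formalisation, not anything from the presentation):

$$P_{\mathrm{CPU}}(f_{\mathrm{CPU}},\,\mathrm{load}) + P_{\mathrm{GPU}}(f_{\mathrm{GPU}},\,\mathrm{load}) \le P_{\mathrm{budget}} \quad (\mathrm{fixed})$$

If reaching $f_{\mathrm{GPU}}^{\max}$ under heavy load needs more than $P_{\mathrm{budget}} - P_{\mathrm{CPU}}(f_{\mathrm{CPU}}^{\max},\,\mathrm{heavy\ load})$, then by definition both chips cannot hold their top clocks through that load at the same time.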

Correct me if wrong, but this is how I see it!
As Cerny stated, cooling systems are designed around an estimated guess. Worst-case scenarios would go above the estimated power usage, creating overheating. This happens on a fixed-clocks, variable-power system, and also on this fixed-power, variable-clocks system.
If that were not the case, the unanticipated worst-case scenario Cerny mentions would never happen.
Remember that power usage depends on workload, so even with fixed power, an excessive, unpredicted workload would require more power.
It's in one of those cases that the GPU will try to use CPU power.
If available (CPU with a low workload), it will use power from the CPU and prevent the system from going over the expected power usage.
If not, it will downclock 2 or 3 percent, saving around 10% power, and keep inside the budget.
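For what it's worth, the "2 or 3 percent of clock for ~10% of power" ratio follows from the usual dynamic-power rule of thumb, ignoring leakage and assuming voltage roughly tracks frequency (both simplifications on my part):

$$P_{\mathrm{dyn}} \approx C\,V^{2}\,f,\qquad V \propto f \;\Rightarrow\; P \propto f^{3}$$
$$1 - 0.98^{3} \approx 5.9\%,\qquad 1 - 0.97^{3} \approx 8.7\%$$

so a 2-3% downclock buys back roughly 6-9% of the power, which is in the same ballpark.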
 
He also stated that it would maintain those clocks (the clocks referenced in the reveal) most of the time.

So he lied, is what you're saying?

He prefers to believe that the PS5 GPU base clock is well below the XSX target clock, despite the latter's bigger APU. What can you do?

There is a common opinion that the PS5 will need expensive cooling. What about the XSX's thermal output? It's the biggest AMD GPU so far and fairly highly clocked. I don't think MS can cut any corners there.
 
It's breathtaking.



I guess it makes sense that L1 would thrash more with hyperthreading? Then L2 is per core, but L3 is shared among all cores, and then DRAM gets hit harder.

Kind of curious to see how console Zen 2 will fare with the huge bandwidth available, along with not having to traverse the Rome I/O die. Infinity Fabric will be handling cache snooping etc. now too, instead of Garlic and Onion.

I'm sure betanumerica brought up that curiosity already.

From what they said briefly they did not care so much about cache hit rates vs just making sure work was spread evenly across all cores so all cores were working. So they may have lower cache hit rates than would be considered optimal and end up hitting DRAM more than they'd like. They did focus on data-oriented design, so cache and RAM were not an afterthought, just maybe second priority to keeping all cores busy and not having one core bottleneck the whole system.
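A minimal sketch of that "spread the work first, worry about cache hit rates second" approach, with invented task sizes (this is not anyone's actual engine code):

Code:
# Chop frame work into equal chunks and fan them out so no single core
# becomes the bottleneck, even if the chunk boundaries aren't the most
# cache-friendly split. Purely illustrative.
from concurrent.futures import ThreadPoolExecutor
import os

def simulate_chunk(entities):
    # Stand-in for per-entity update work (physics, animation, AI, ...).
    return sum(e * e for e in entities)

def run_frame(entities, workers=os.cpu_count() or 8):
    chunk = max(1, len(entities) // workers)
    batches = [entities[i:i + chunk] for i in range(0, len(entities), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(simulate_chunk, batches))

print(run_frame(list(range(100_000))))

In a real engine the chunking would follow the data layout, which is exactly the hit-rate trade-off they were describing.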
 
Correct me if wrong, but this is how I see it!
As Cerny stated, cooling systems are designed around an estimated guess. Worst-case scenarios would go above the estimated power usage, creating overheating. This happens on a fixed-clocks, variable-power system, and also on this fixed-power, variable-clocks system.
Remember that power usage depends on workload, so even with fixed power, an excessive, unpredicted workload would require more power.
It's in one of those cases that the GPU will try to use CPU power.
If available (CPU with a low workload), it will use power from the CPU and prevent the system from going over the expected power usage.
If not, it will downclock 2 or 3 percent, saving around 10% power, and keep inside the budget.

Then one can wonder why even bother. Just go with a 2% lower clock from the get-go.

If that were not the case, the unanticipated worst-case scenario Cerny mentions would never happen.

Or just leave out even mentioning it, if it never happens anyway. The XSX never downclocks, so they don't mention it; it's sustained then.
There's more to it, I'm sure.
 
There is a common opinion that the PS5 will need expensive cooling. What about the XSX's thermal output? It's the biggest AMD GPU so far and fairly highly clocked. I don't think MS can cut any corners there.

Microsoft has shown everything, the case and cooling. So have a look at the actual Xbox Series X cooling and make an informed judgment.
 
Cooling is interesting because Cerny alluded that in past consoles they were guessing what the peak power output would be and planning for a reserve, i.e. you get those games where the console is ridiculously loud.

The PS5 should have a fixed power budget which Sony can optimize against. The remaining question is ambient temperature. For ambient temp Sony can probably design against a specific maximum and just say the console will not work in anything hotter than, let's say, 40°C.

So if the above is true, it should be possible for Sony to design a very specific cooling system where all the important parameters are exactly known. Would it be expensive? Perhaps, but it would not be more expensive than exactly what is needed.
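As a rough example of what designing against exactly known parameters buys you: the required thermal resistance falls straight out of the power ceiling and the worst-case ambient. Every number below is an assumption for illustration, not a Sony spec.

Code:
# Rough heatsink sizing from a fixed power ceiling and a worst-case ambient.
SOC_POWER_W = 200.0        # assumed fixed SoC power budget
T_JUNCTION_MAX_C = 95.0    # assumed silicon limit
T_AMBIENT_MAX_C = 40.0     # the "won't work above 40C" ambient from the post above

theta = (T_JUNCTION_MAX_C - T_AMBIENT_MAX_C) / SOC_POWER_W
print(f"Required junction-to-ambient thermal resistance: {theta:.3f} C/W")

Anything the cooler does beyond that figure is margin Sony doesn't strictly have to pay for.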
 
Then one can wonder why even bother. Just go with a 2% lower clock from the get-go.



Or just leave out even mentioning it, if it never happens anyway. The XSX never downclocks, so they don't mention it; it's sustained then.
There's more to it, I'm sure.
Well, Cerny knows better than us that within a 33 ms frame time the CPU and the GPU are generally not fully used at once; the GPU usually kicks in once the CPU finishes its heavy jobs. So I suppose that normally both the CPU and GPU won't be fully utilized at the same time, allowing the clock reduction to be only around 2%, because the CPU at that moment is not at full occupancy. If both were fully active in all their transistors, Cerny said the speeds would be 3 GHz and 2 GHz respectively. He was clear on that point. He is quite smart indeed and has great low-level engineering knowledge.
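A toy 33 ms frame along those lines, with invented activity levels and wattages, just to show why the CPU and GPU peaks rarely line up against a fixed budget:

Code:
# Invented phase activity for one 33 ms frame; not measured data.
BUDGET_W, CPU_MAX_W, GPU_MAX_W = 200.0, 60.0, 180.0

phases = {                      # (cpu_activity, gpu_activity)
    "sim / game logic":       (0.9, 0.2),
    "command buffer build":   (0.6, 0.7),
    "GPU-bound rendering":    (0.2, 1.0),
}
for name, (cpu, gpu) in phases.items():
    draw = CPU_MAX_W * cpu + GPU_MAX_W * gpu
    print(f"{name}: {draw:.0f} W of a {BUDGET_W:.0f} W budget")

With these made-up numbers nothing ever exceeds the budget, so no downclock is needed; it's only when a phase maxes both chips at once that the couple-of-percent reduction kicks in.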
 
This is probably a fairly accurate assessment of what is to come.
At 33 ms frame times, I don't expect there to be much CPU usage, so for the most part I do expect the GPU to run at its capped rate.

I think this is where this setup will be fine; the setup for 60fps or greater titles is really the scenario I'm looking at.
I hope that most titles on PS5 are 60fps, though (personal preference).
 