Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

If PS5's GPU spends most of its time at 9.2TF then it would clearly contradict what Cerny is promising and be very disingenuous given the marketed 10.3TF. I can only imagine the thundering uproar from the core community and media alike; it could give Sony or PlayStation a bad name for next gen, which is the last thing the company wants. But if it hovers around 9.8-10TF under heavy load then it's gonna be totally fine. Also, CPU speed is gonna be a non-issue at 4K or close to 4K res, so a slight downclock would go unnoticed during gameplay.
In a multiplatform comparison with XSX rendering at native 4K, if PS5 stays at 1800p most of the time it would suggest the GPU clock is heavily dropped and 9.2TF is most likely the typical number, since 2160p has about 44% more pixels than 1800p. If it stays at ~2000p then the pixel difference would be under 20% and Cerny would be correct after all. 2000p vs 2160p would be virtually indiscernible at a normal viewing distance, or even with your face to the screen lol; it would require hardcore magic from DF to tell the story.
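For anyone who wants to sanity check those percentages, here's the back-of-envelope math (assuming 16:9 resolutions; this is just my own scratchpad, not anything from the video):

```python
# Rough pixel-count comparison at 16:9 (back-of-envelope only).
def pixels(height):
    width = round(height * 16 / 9)
    return width * height

print(pixels(2160) / pixels(1800))  # ~1.44 -> 2160p has ~44% more pixels than 1800p
print(pixels(2160) / pixels(2000))  # ~1.17 -> only ~17% more pixels than ~2000p
```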

This might become more complicated if the PS5 lacks VRS - it wouldn't be possible to know what was down to clocks, and what was down to efficiency.

Same for bandwidth. The gap in bandwidth available to the GPU is larger than the difference in teraflops.

I think we might need developers to leak us info on how the hardware behaves when they're profiling games.

Edit: Sony still haven't confirmed if they have VRS or sampler feedback, right?
 
I wonder if another reason for the overhead required on expansion drives is that games will likely be built around 12 channels of access, and with an 8-channel device they may have to interleave data access within those channels as if it were 12 channels.

Totally uninformed speculation, could be total bunk, but I thought I'd share the idea.. =s
 
Is there a transcript available for the Road to PS5 video? Found this at the 37 minute mark. Cerny: "running a GPU at a fixed 2GHz target was looking unreachable with the old fixed frequency strategy." He's talking about AMD SmartShift here. It seems pretty clear that for the GPU to maintain 2.23GHz, that power will have to come from decreasing the clock of the CPU. So again, as a lot have been saying, it depends on what balance developers want between CPU and GPU performance. It may be able to maintain that GPU clock speed, but we don't know what that's going to do to the CPU clock, as we don't know the base frequencies. Sony only says "up to" for both CPU and GPU frequencies. Cerny also mentions "running the CPU at 3GHz was causing headaches with the old strategy", meaning fixed clocks. So it sounds to me like Sony was unable to run the GPU at a fixed 2.0GHz while also having a fixed CPU clock of 3.0GHz. That means that for the GPU to reach its max frequency the CPU will be running somewhere below 3GHz.

The problem with fixed clocks had nothing to do with the clocks themselves but with fluctuating power usage. Inverting the scheme, unlocking clock speeds and locking power, gives much better control of thermals.
In one case you have fixed clocks and power changing with the workload. Here you have fixed power and clocks changing with the workload.
The second case allows for much better control of thermals, and as such you can go higher on clocks than with the other method, because you can keep your temperatures under control.
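A toy way to picture that inversion (purely illustrative, with made-up scaling; not how the actual silicon behaves):

```python
# Toy contrast of the two control schemes; the cubic scaling factor is an assumption.
def fixed_clock(workload_activity, clock_ghz):
    """Old approach: clock held constant, power swings with the workload."""
    return workload_activity * clock_ghz ** 3 * 10.0  # watts; cooling must be sized for the worst case

def fixed_power(workload_activity, power_budget_w, cap_ghz=2.23):
    """New approach: power budget held constant, clock adapts to the workload."""
    clock_ghz = (power_budget_w / (workload_activity * 10.0)) ** (1 / 3)
    return min(clock_ghz, cap_ghz)  # light workloads simply sit at the frequency cap
```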
 
Mark Cerny said:

Another issue with the GPU involves size and frequency. How big do we make the GPU, and what frequency do we run it at? This is a balancing act. The chip has a cost, and there is a cost for whatever we use to supply that chip with power and to cool it. In general I like running the GPU at a higher frequency. Let me show you why.

*shows an example of a hypothetical PS4 Pro-level GPU with 36 or 48 CUs at the same TF; the faster chip has all sections of the chip running faster, etc...*

It's easier to fully use 36 CUs in parallel than 48 CUs; when triangles are small it's much harder to fill all those CUs with useful work. So there's a lot to be said for faster, assuming you can handle the resulting power and heat issues, which frankly we haven't always done the best job at. Part of the reason for that is, historically, our process for setting CPU and GPU frequencies has relied on heavy-duty guesswork with regards to how much electrical power they will consume and how much heat will be produced as a result inside the console.

Power consumption varies A LOT from game to game. When I play GoW on PS4 Pro I know the power consumption is high just by the fan noise, but power isn't simply about engine quality, it's about the minutiae of what's being displayed and how. It's counter-intuitive, but processing dense geometry typically consumes less power than processing simple geometry, which is, I suspect, why Horizon's map screen makes my PS4 Pro heat up so much.

Our process on previous consoles has been to try to guess what the maximum power over the entire console lifetime might be. Which is to say, the worst case scene in the worst case game, and to prepare a cooling solution which we think will be quiet at that power level. If we get it right, fan noise is minimal. If we get it wrong, the console will be quite loud for the highest power games, and there's a chance it might overheat or shut down if we misestimated power too badly.

PS5 is especially challenging because the CPU supports 256-bit native instructions that consume a lot of power. These are great here and there but presumably only minimally used... Or are they? If we plan for major 256-bit instruction usage, we need to set the CPU clock substantially lower or noticeably increase the size of the power supply and fan. So after long discussions we decided to go in a very different direction for PS5.

*blah blah about gcn vs rdna cu sizes*

We went with a variable clock strategy for PS5, which is to say we continuously run the GPU and CPU in boost mode. We supply a generous amount of electrical power and then increase the frequency until it reaches the capability of the power and cooling solution. It's a completely different paradigm: rather than running at constant frequency and letting power vary based on the workload, we run at essentially constant power and let the frequency vary based on the workload.

We then tackle the engineering challenge of a cost-effective and high-performance cooling solution designed for that specific power level. It's a simpler problem because there are no more unknowns: no need to guess what power consumption the worst case game might have. As for the details of our cooling solution, we're saving them for the teardown. I think you'll be quite happy with what the engineering team came up with.

So how fast can we run the GPU and CPU with this strategy?

The simplest approach would be to look at the temperature of the silicon die and throttle frequencies on that basis. But that won't work; it fails to create a consistent PS5 experience. It wouldn't do to run a console slower simply because it was in a hot room.

So rather than look at the temperature, we look at the activities the CPU and GPU are performing and set the frequencies on that basis, which makes everything deterministic and repeatable. While we're at it we also use AMD SmartShift tech and send any unused power from the CPU to the GPU so we can squeeze out a few more pixels.

The benefits of this strategy are quite large. Running a GPU at 2GHz was looking like an unreachable target with the old fixed frequency strategy. With this new paradigm we're able to run way over that; in fact we have to cap the GPU at 2.23GHz so we can guarantee that the on-chip logic operates properly. 36 CUs at 2.23GHz is 10.3TF, and we expect the GPU to spend most of its time at or close to that frequency and performance. Similarly, running the CPU at 3GHz was causing headaches with the old strategy, but now we can run it as high as 3.5GHz. In fact, it spends most of its time at that frequency.

That doesn't mean ALL games will be running at 2.23GHz and 3.5GHz. When that worst case game arrives, it will run at a lower clock speed, but not too much lower. To reduce power by 10% it only takes a couple of percent reduction in frequency, so I would expect any downclocking to be pretty minor.

All things considered, the change to a variable frequency approach will show significant gains for PlayStation gamers.
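For what it's worth, the "couple of percent for 10% power" claim is roughly what you get if you assume dynamic power scales with something like the cube of frequency (since voltage tends to track frequency). That exponent is my assumption, not something stated in the talk:

```python
# Back-of-envelope check of the frequency-vs-power sensitivity (assumed P ~ f^3).
power_reduction = 0.10                        # target: 10% less power
freq_factor = (1 - power_reduction) ** (1 / 3)
print(f"frequency drop needed: {(1 - freq_factor) * 100:.1f}%")  # ~3.5%
print(f"2.23 GHz -> {2.23 * freq_factor:.2f} GHz")               # ~2.15 GHz
```

With a steeper exponent the required drop gets even smaller, so it's low single digits either way.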
 
... When I think about the use cases in which the SSD speed of PS5 brings substantial gains over existing SSD solutions, I'm trying to look at cases where it isn't tied to other parts of the system. Or put another way: there's the argument that the 4Pro and PS5 have better fillrate and rasterization than the X1X and XSX because of clock speed, but in reality they don't, because fillrate is tied to bandwidth, and the consoles are more bandwidth bound than ROP bound. So when I think about SSD speeds in general, I'm looking at the system and asking at what particular I/O speed they become I/O limited or system-performance limited...

Not really addressing your query, but it reminded me of this PS5 vs PS4Pro:

Can it be pushed much further with 4K@60 with all the bells and whistles? Hopefully, but we’ll see.
 
This might become more complicated if the PS5 lacks VRS - it wouldn't be possible to know what was down to clocks, and what was down to efficiency.

Same for bandwidth. The gap in bandwidth available to the GPU is larger than the difference in teraflops.

I think we might need developers to leak us info on how the hardware behaves when they're profiling games.

Edit: Sony still haven't confirmed if they have VRS or sampler feedback, right?
Is it even possible for PS5 to lack VRS since it's RDNA 2 already? Unless they have an even better proprietary solution?
The bandwidth difference is a curious case tho, it's not really an apples to apples comparison, is it? Let's say each party dedicates 3GB of GDDR6 to the OS; then you're left with 10GB at 560GB/s + 3GB at 336GB/s vs 13GB at 448GB/s. Depends how you see it, but if you average it out they have basically the same 13GB of 448GB/s left for gaming. So that's even steven right there.
But yes, we do need more info before coming to any conclusion.
 
Problem with fixed clocks had nothing to do with clocks but with fluctuating power usage. Inverting the scheme, by unlocking speeds and locking power gives much better control in thermals.
In one case you have clocks fixed and power changing with workloads. Here you have fixed power and clock changing with workloads.
The second case allow for much better control of thermals, and as such you can go higher on clocks than with the other method, because you can keep your temperatures under control.
You could also spend more $ on a better cooling solution, or use a larger enclosure, to ensure you keep temperatures under control. Ideally you want to be able to maintain maximum performance across the APU at all times. Of course, with consoles everything is a balancing act because the price point is extremely important. Sony must have felt this was the right trade-off between performance and cost.
 
Is it even possible for PS5 to lack VRS since it's RDNA 2 already? Unless they have an even better proprietary solution?
The bandwidth difference is a curious case tho, it's not really an apples to apples comparison, is it? Let's say each party dedicates 3GB of GDDR6 to the OS; then you're left with 10GB at 560GB/s + 3GB at 336GB/s vs 13GB at 448GB/s. Depends how you see it, but if you average it out they have basically the same 13GB of 448GB/s left for gaming. So that's even steven right there.
But yes, we do need more info before coming to any conclusion.
I don't think it's as simple as saying they both have "13GB of 448GB/s left for gaming". On XBSX you have 3.5GB of memory at 336GB/s that can be used for things that require less memory bandwidth. If you can fill all of those 3.5GB of available "standard memory" with things that don't need the full 560GB/s of the "GPU optimal memory" (or whatever they called it), then you have the remaining 10GB left at full speed. That's how MS described it in the DF article, but we'll have to wait and hopefully hear from developers what the difference between the consoles is like in real-world performance.
 
Well, to the right of the SHARE and SAVE buttons under the video itself there is a 3-dot overflow menu that has the option to Open Transcript, which puts an oddly shaped but functional transcript to the right of the video.
I'll be damned!
hahaha thanks man you just saved me so much trouble LOL.

All this time I was looking for a transcribing service
 
Is it even possible for PS5 to lack VRS since it's RDNA 2 already? Unless they have an even better proprietary solution?
The bandwidth difference is a curious case tho, it's not really an apples to apples comparison, is it? Let's say each party dedicates 3GB of GDDR6 to the OS; then you're left with 10GB at 560GB/s + 3GB at 336GB/s vs 13GB at 448GB/s. Depends how you see it, but if you average it out they have basically the same 13GB of 448GB/s left for gaming. So that's even steven right there.
But yes, we do need more info before coming to any conclusion.

It's certainly possible for PS5 to lack VRS - these are customised solutions. For example, MS added int8 and int4 support to their GPU. There's also the matter of licensing: if RDNA2 uses an MS-patented implementation of VRS, it's possible that won't translate to PS5.

As for bandwidth, why would you average the BW of the memory ranges on XSX? The accesses between the 10GB optimal ram and the 3.5GB slower range won't ever be evenly split.

If 50% of your accesses are to/from the 3.5GB non-optimal RAM, you've really messed everything up!
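A crude way to see how much the split matters (a toy model that assumes accesses to the two regions time-slice the bus rather than overlapping, which is a simplification of whatever the real memory controller does):

```python
# Toy XSX effective-bandwidth model: traffic split between the 10GB @ 560GB/s
# and 3.5GB @ 336GB/s regions, assuming the two kinds of access serialise.
def effective_bw(frac_slow, fast=560.0, slow=336.0):
    return 1.0 / ((1.0 - frac_slow) / fast + frac_slow / slow)

for frac in (0.0, 0.1, 0.25, 0.5):
    print(f"{frac:.0%} of traffic to the slow pool -> ~{effective_bw(frac):.0f} GB/s")
# 0% -> 560, 10% -> ~525, 25% -> 480, 50% -> 420
```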
 
No, again, you are misquoting him. If developers could design a game locked at 2.23GHz, and by Cerny's own admission a couple of percent (2-3% I guess) saves 10% of TDP, why did he say they couldn't hit a 2.0GHz target with the old way of doing things (i.e. no variable frequency)?

He spent several minutes prior to this (33:15) laying out the strategy they used to select their chip frequencies and fan sizes. They would have to predict how much heat the workloads would generate and then match a PSU and fan to those needs.

The fact that they had to predict a hypothetical worst-case workload means they would have to lock the frequency at something lower than the cap.

He gave examples of how rendering low triangle counts (for the GPU) and AVX workloads (for the CPU) were especially power hungry and as a result produce a lot more heat.

If the PS4 could hypothetically go to 2.23GHz while you only used the system in a way that is light on power draw (like HZD during gameplay), then as soon as you opened the map screen it would shut off the console, because that was an example Cerny gave of something extra heavy on power draw.
With the old strategy this scenario would necessitate a frequency lock lower than 2.23GHz, because the map screen would skyrocket the temperatures at such a frequency even though gameplay wouldn't. That means the lock would lower gameplay performance in favor of keeping the map screen cool enough to use.

By using their new solution they can cap the amount of power that goes into the system to a constant level, and whenever one of the processors draws too much power, that processor will have its clocks lowered slightly.

And a couple means 2, not 2-3. :D
 
If developers target anywhere near max frequencies for PS5's GPU, the XBSX is going to have a pretty massive CPU advantage. This also helps explain why MS went with a larger enclosure than normal for the XBSX.
The Series X CPU already has the advantage of running at 3.8GHz when using 8 threads (SMT disabled); if the PS5 were to drop to 3.0GHz in those situations then Series X would have an 800MHz advantage, which is large.

The use of SmartShift implies major shifts in clocks indeed, I fully expect the CPU to drop to 3.0GHz if the GPU is maintained at 2.2GHz.
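Just as a mental model of what SmartShift-style budgeting might look like (all names and numbers below are invented; the real mechanism works off on-chip activity monitoring):

```python
# Toy SmartShift-style budget split; purely illustrative, numbers invented.
TOTAL_BUDGET_W = 200.0   # hypothetical shared CPU+GPU power budget

def split_budget(cpu_demand_w, gpu_demand_w):
    """Give the CPU what it asks for (within budget), hand the leftover to the GPU."""
    cpu_w = min(cpu_demand_w, TOTAL_BUDGET_W)
    gpu_w = min(gpu_demand_w, TOTAL_BUDGET_W - cpu_w)
    return cpu_w, gpu_w

print(split_budget(40, 180))   # light CPU load: (40, 160) -> GPU gets the headroom
print(split_budget(90, 180))   # heavy CPU load (e.g. wide AVX): (90, 110) -> GPU power falls
```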

Also, CPU speed is gonna be a non-issue at 4K or close to 4K res, so a slight downclock would go unnoticed during gameplay.
Of course not: if you just fill the scenes with more objects and detail, or increase the draw distance/crowd size a bit, then CPU load will rise sharply, and that's without even mentioning the increased complexity of physics, game simulation, AI, level design, etc.
 
You could also spend more $ on a better cooling solution, or use a larger enclosure, to ensure you keep temperatures under control. Ideally you want to be able to maintain maximum performance across the APU at all times. Of course, with consoles everything is a balancing act because the price point is extremely important. Sony must have felt this was the right trade-off between performance and cost.

Of course you could. Liquid hydrogen would be nice. ;)
The problem is cost, noise, and case size. As you say, this was the right trade-off.
 
He spent several minutes prior to this (33:15) laying out the strategy they used to select their chip frequencies and fan sizes. They would have to predict how much heat the workloads would generate and then match a PSU and fan to those needs.

The fact that they had to predict a hypothetical worst-case workload means they would have to lock the frequency at something lower than the cap.

He gave examples of how rendering low triangle counts (for the GPU) and AVX workloads (for the CPU) were especially power hungry and as a result produce a lot more heat.

If the PS4 could hypothetically go to 2.23GHz while you only used the system in a way that is light on power draw (like HZD during gameplay), then as soon as you opened the map screen it would shut off the console, because that was an example Cerny gave of something extra heavy on power draw.
With the old strategy this scenario would necessitate a frequency lock lower than 2.23GHz, because the map screen would skyrocket the temperatures at such a frequency even though gameplay wouldn't. That means the lock would lower gameplay performance in favor of keeping the map screen cool enough to use.

By using their new solution they can cap the amount of power that goes into the system to a constant level, and whenever one of the processors draws too much power, that processor will have its clocks lowered slightly.

And a couple means 2, not 2-3. :D
I mean, we'll have to wait and see. This is the last piece of the puzzle we have related to the next-gen systems, so we're in for a few months of discussion. At one point we had people saying 2.0GHz was impossible and that Cerny was not an idiot who would design a fast and narrow console, yet here we are... :)
 
Of course not: if you just fill the scenes with more objects and detail, or increase the draw distance/crowd size a bit, then CPU load will rise sharply, and that's without even mentioning the increased complexity of physics, game simulation, AI, level design, etc.

CPU requirements also tend to rise with higher framerates, at least on PC that's the case.
 
If PS5 uses power consumption to determine clock speed, how is that not going to result in variability among different units?
 
If PS5 uses power consumption to determine clock speed, how is that not going to result in variability among different units?

As I understand it, it doesn't have variable power consumption, but variable clock rates for the CPU and GPU. The power budget is fixed, and the clocks are set from the workload activity rather than from measured temperature or power, hence the idea that it doesn't matter which unit you have or where you have it.
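To illustrate why that ends up unit-independent, here's a toy sketch of the idea Cerny describes, with invented numbers: clocks are derived from workload activity, not from a temperature (or per-unit power) measurement, so the same scene produces the same clocks on every console:

```python
# Toy sketch of deterministic, activity-based clock selection (numbers invented).
def clock_from_activity(activity):
    """activity: 0.0 (idle) .. 1.0 (worst case). Same input -> same clock on every unit."""
    max_ghz, min_ghz = 2.23, 2.0      # the 2.0 floor is a guess, not a confirmed spec
    return max_ghz - (max_ghz - min_ghz) * activity

def clock_from_temperature(die_temp_c):
    """What they say they avoided: this would vary with the room and the individual unit."""
    return 2.23 if die_temp_c < 85.0 else 2.0
```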
 