Next Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

Discussion in 'Console Technology' started by Proelite, Mar 16, 2020.

  1. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    12,794
    Likes Received:
    8,190
    Location:
    London, UK
    Apparently they can given Sony are aiming to ship 10m consoles by early 2021. So evidently not a problem. :nope:

    And yet we've heard zero noise, rumour or suggestion of trouble developing for PS5, only gushing happiness at the new architecture. Many developers are used to variable-clock hardware and to targeting literally hundreds or thousands of differing performance profiles, because this has been the norm for PC CPUs and GPUs for years.

    Mark Cerny said workload and activity are different. The whole GPU may be active but the workload may be light because of lots of use of 32-bit, 64-bit or 128-bit instructions and data. Equally, parts of the GPU may be inactive but the workload may be heavy because of a lot of use of 256-bit instructions and data, which was a scenario Mark Cerny mentioned. The workload determines the power draw, not the GPU's level of activity. It's a subtle, if arguably near-semantic, difference. :yes:
     
    egoless and Globalisateur like this.
  2. PSman1700

    Veteran Newcomer

    Joined:
    Mar 22, 2019
    Messages:
    2,743
    Likes Received:
    926
    Yea there's where i got it :p Another game with ray tracing at 60fps, think RT becomes the norm later on.
     
  3. chris1515

    Veteran Regular

    Joined:
    Jul 24, 2005
    Messages:
    4,786
    Likes Received:
    3,744
    Location:
    Barcelona Spain
    And it will arrive on Xbox console later.
     
  4. cheapchips

    Veteran Newcomer

    Joined:
    Feb 23, 2013
    Messages:
    1,175
    Likes Received:
    972
    "some use of ray tracing" suggests a somewhat limited use, which is what we're expecting right?
     
    PSman1700 likes this.
  5. chris1515

    Veteran Regular

    Joined:
    Jul 24, 2005
    Messages:
    4,786
    Likes Received:
    3,744
    Location:
    Barcelona Spain
    Maybe specular reflection like GT7...
     
  6. PSman1700

    Veteran Newcomer

    Joined:
    Mar 22, 2019
    Messages:
    2,743
    Likes Received:
    926
    Yes like we have seen so far, fully ray traced games aren't really going to happen anytime soon i think.
     
  7. fehu

    Veteran Regular

    Joined:
    Nov 15, 2006
    Messages:
    1,755
    Likes Received:
    747
    Location:
    Somewhere over the ocean
    I don't agree with you or anyone who thinks this, because Microsoft has already done it with the CPU presets (more threads / more frequency) and everyone is fine with it, to the point that nobody talks about it and it has slipped out of mind.
    Maybe next-next generation will allow *choosing* between balanced / send all dilithium energy to the GPU.
    The only reason we are still talking about Sony's implementation is that it's still not clear how it will auto-manage the frequency balancing.
     
  8. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    10,871
    Likes Received:
    10,965
    Location:
    The North
    Activity level for a specific chip is how many transistors are flipping states, so 1s to 0s and vice versa. When bits flip there is much more power draw. Usually this is associated with workload, but not always. Wider instructions flip way more bits with fewer instructions, which is why you’re seeing so much more power draw. You go from adding 2x32 to adding 8x32 in a single shot across all cores. You’re going to get massive activity across more transistors across more cores. That’s an easy way to see how activity level scales on a CPU.
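
    To put rough numbers on the bit-flipping argument, here is a minimal sketch of the textbook CMOS dynamic power relation, P = a * C * V^2 * f, where a is the fraction of the switchable capacitance toggling each cycle. The capacitance, voltage and activity factors below are made-up illustrative values, not PS5 or Zen 2 figures, and treating 8x32 as four times the activity of 2x32 is itself an assumption.

```python
def dynamic_power(activity, capacitance_f, voltage_v, frequency_hz):
    """Textbook CMOS dynamic power estimate: P = a * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical figures, chosen only to show the scaling.
C_SWITCHED = 1e-9  # farads of switchable capacitance in the block
VOLTAGE = 1.0      # volts
FREQ = 2.0e9       # hertz

# Assumption: four times the lanes toggling means four times the activity factor.
narrow = dynamic_power(0.10, C_SWITCHED, VOLTAGE, FREQ)  # e.g. 2x32-bit adds
wide = dynamic_power(0.40, C_SWITCHED, VOLTAGE, FREQ)    # e.g. 8x32-bit adds

ratio = wide / narrow
print(f"narrow: {narrow:.2f} W, wide: {wide:.2f} W, ratio: {ratio:.1f}x")
```

    Same clock, same voltage, same silicon: only the activity factor changes, and the estimated power scales with it.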
     
  9. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    12,794
    Likes Received:
    8,190
    Location:
    London, UK
    Before I write a page and a half here, let me ask you a question. Do you believe that all FinFET transistors across the APU die are equal in terms of use and power draw?
     
  10. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    10,871
    Likes Received:
    10,965
    Location:
    The North
    Certainly not in terms of use; transistors in a chip are not used equally all over. Some will definitely be used all the time, since they serve the functions most people require of the chip, and others less (functions and operations called less often). As for power draw, no: generally speaking they should operate within a close tolerance of each other, but you're going to get some differences when spread over billions of transistors.

    I'm not referring to idle power states when referring to activity.
     
  11. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    12,794
    Likes Received:
    8,190
    Location:
    London, UK
    This isn't the case. Ignoring FinFET memory transistors, there are different types of FinFET logic gates: some are optimised for performance (and use considerably more power) and others for lower leakage (and use considerably less power). This is why state flips are not useful for determining power draw.
     
  12. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    10,871
    Likes Received:
    10,965
    Location:
    The North
    But from a simplistic point of view it's what we need for dynamic power equations. It would be fairly challenging to broadly talk about chips, without benchmarking.
     
  13. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    12,794
    Likes Received:
    8,190
    Location:
    London, UK
    True, and we don't have much on this other than this statement from Digital Foundry's Road to PS5 analysis piece:

    An internal monitor analyses workloads on both CPU and GPU and adjusts frequencies to match. While it's true that every piece of silicon has slightly different temperature and power characteristics, the monitor bases its determinations on the behaviour of what Cerny calls a 'model SoC' (system on chip) - a standard reference point for every PlayStation 5 that will be produced.

    This is why I picked up on the workload/activity thing, because whatever this internal monitor is predicated on, it is workload rather than activity. What does this mean? Is there logic in PS5 profiling GPU/CPU/API workloads in realtime to make adjustments to power distribution? ¯\_(ツ)_/¯.

    I guess you do need a smart system if you are going to steal power from CPU or GPU, you have to understand the consequences otherwise it could be problematic.
     
    BRiT, pharma and iroboto like this.
  14. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    10,871
    Likes Received:
    10,965
    Location:
    The North
    Fair enough, I see your POV. It's likely monitoring the instructions that are coming in, and I guess they would in turn know the power levels for each type of instruction.
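
    As a purely speculative sketch of what "knowing the power level for each type of instruction" could look like, the snippet below estimates power from an instruction mix using a lookup table. The instruction classes, the per-instruction energy costs and the function name are all hypothetical; nothing here describes how the PS5 monitor actually works.

```python
# Hypothetical energy cost per instruction class, in nanojoules.
ENERGY_NJ = {"scalar32": 0.1, "vector128": 0.4, "vector256": 0.9}


def estimate_power_w(instruction_counts, window_s):
    """Estimate average power over a time window from an instruction mix."""
    total_nj = sum(ENERGY_NJ[kind] * count
                   for kind, count in instruction_counts.items())
    return total_nj * 1e-9 / window_s


WINDOW_S = 1e-3  # one millisecond sampling window (an assumption)

# Same number of instructions issued, very different estimated power.
light = estimate_power_w({"scalar32": 1_000_000}, WINDOW_S)
heavy = estimate_power_w({"vector256": 1_000_000}, WINDOW_S)
print(f"light mix: {light:.2f} W, heavy mix: {heavy:.2f} W")
```

    The point of the toy model is that the estimate depends on what is being executed rather than on how busy the chip looks, and that it gives the same answer on every console because it never reads a sensor.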
     
    BRiT and DSoup like this.
  15. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,367
    Likes Received:
    3,961
    Location:
    Well within 3d
    Every unit in the SOC needs to meet its performance target under the design's max transient power limit and the TDP. The parameters of the power delivery system and cooler would give limits to the acceptability of silicon, but the lower points would tend to be less extreme in their scaling than the points at the edge of the safety margin. Cerny made a reference to clock/power points intended to match the thermal density of the GPU and CPU sections, although I'm not clear on why that was emphasized given there seems to be no evidence of any other AMD products needing that, and they can experience more significant swings than the PS5's described method can.
    On the other hand, such a method could be simpler than the usual AMD production method, where the validation suites would be testing many more DVFS points and transition combinations than the PS5's design requires. Whatever the PS5 DVFS points are, the described system is consistent with using AMD's standard DVFS in a less challenging way than other consumer products.


    The validation process for the PS5 seems more complex than it was for the PS4. However, in terms of manufacturing it looks to me like it's within the limits of what AMD does routinely since there's a version of this DVFS in virtually every chip it makes.
    The system itself is using a model that is conservative in terms of what it calculates as a worst-case output, but the dynamic estimate is significantly closer to reality than the prior generation's design-time guard banding. The estimates the PS5 uses should be more conservative since every chip needs to meet the platform's model SOC standards, whereas AMD's many product bins and high-clocking SKUs can tweak parameters and make assumptions about silicon quality the console cannot.

    It's an apparently single-binned console SOC being built in the millions. For practical purposes, it is very important that most do. The CPU portion is significantly below the design max of the Zen 2 core, so I think that element is unlikely to be an obstacle. The GPU max clock is unusually high for prior GPU generations, but it seems reasonable that a pipeline specifically tailored for a higher clock target can hit a max clock that is modestly higher than the peak clocks of some RDNA products, especially since it doesn't need to be sustained.
    Whether taking the GPU clocks to this level will be the winning design philosophy remains to be seen, but it seems to me that it should at least be producible.

    AMD's DVFS has been described in other products as using activity monitors for functional elements of the pipeline. Later proposals and patents also included things like small blocks of redundant processing hardware that served as representative elements for the behavior of the most demanding silicon, such as dummy ALUs and registers running operations intended to give a worst-case figure for electrical and thermal performance. Then there's a significant number of thermal sensors and current monitors.
    The on-die voltage management and Vdroop protection indicate the hardware can manage and detect current and voltage changes at the microsecond or nanosecond scale. The activity monitors and thermal estimates work to gauge power consumption and die temperatures at microseconds up to a millisecond range, going by the power management described for various GPUs and Zen.
    I think AMD's described token-based power trading between chips or chip regions before, which may go into what SmartShift can rely upon for determining how much slack is left in the power budget.

    What the PS5 appears to be doing is taking all of this DVFS hardware, backing away from the highest CPU clock ranges, and picking a more conservative and fixed set of figures for the per-chip power model.
     
    function, DSoup, iroboto and 3 others like this.
  16. Ronaldo8

    Newcomer

    Joined:
    May 18, 2020
    Messages:
    233
    Likes Received:
    232
    The 8 thread vs 16 thread preset is chosen on a per title basis and anyway still does not imply a common power envelope with the GPU. So, not for the first time, I've got no inkling of what you are talking about.
     
    PSman1700 likes this.
  17. Kreten

    Newcomer

    Joined:
    Sep 28, 2013
    Messages:
    32
    Likes Received:
    23
    What really confuses me is that he said they were not able to keep the GPU stable at 2.0GHz using the traditional method. So if clock speeds are only reduced by 50MHz, the system is still at around 2.18GHz and well over the point that they couldn’t keep stable.


    Is anyone able to explain a reason to this?

    I didn’t see a mention of resolution there, where does 4k 60 come from?
     
    #3197 Kreten, Jul 20, 2020
    Last edited by a moderator: Jul 20, 2020
    PSman1700 likes this.
  18. chris1515

    Veteran Regular

    Joined:
    Jul 24, 2005
    Messages:
    4,786
    Likes Received:
    3,744
    Location:
    Barcelona Spain
    https://bethesda.net/en/article/7u3fdVVW7wfC5fhNyoeU2n/deathloop-gameplay-reveal-and-next-gen-details

    From a Bethesda blog post just after the PS5 reveal event

     
    Kreten likes this.
  19. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    10,871
    Likes Received:
    10,965
    Location:
    The North
    Cerny said that they weren't able to obtain 2.0 GHz with fixed clocks, not that the clocks weren't stable.
    The challenge with fixed clocks is that the frequency is never allowed to go down, so as activity level continues to increase the power draw must increase to match it as well. Which means that it must be able to survive worst case scenarios with respect to activity levels. You may also encounter some yield issues as you set a very high fixed clock because all your chips must be able to withstand the torture of running high power with high frequencies. Your cooling and power system must be matched for it.

    So with respect to looking at that entire system, they were unable to achieve 2.0 GHz.

    Variable clocks allow them to step around those issues: if the activity level spikes the power draw, the system can temporarily drop the frequency and the chip can carry on. You no longer need to worry as much about the absolute worst-case torture test, because the system can continually downclock and keep within the parameters of cooling and power.
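
    A toy governor makes the fixed-versus-variable difference concrete. This is only a sketch under loud assumptions: the 180 W budget, the 200 W worst case, the 10 MHz step and the cubic power model (voltage assumed to track frequency, so power goes roughly as activity * f^3) are all invented for illustration. Only the 2230 MHz peak comes from the thread.

```python
F_MAX_MHZ = 2230        # peak GPU clock quoted in this thread
STEP_MHZ = 10           # hypothetical frequency step
POWER_BUDGET_W = 180.0  # hypothetical power budget


def estimated_power_w(activity, freq_mhz):
    """Assumed model: voltage tracks frequency, so power ~ activity * f^3.

    Normalised so that activity 1.0 at F_MAX_MHZ draws a made-up 200 W.
    """
    return 200.0 * activity * (freq_mhz / F_MAX_MHZ) ** 3


def pick_frequency(activity):
    """Step the clock down from the peak until the estimate fits the budget."""
    freq = F_MAX_MHZ
    while freq > STEP_MHZ and estimated_power_w(activity, freq) > POWER_BUDGET_W:
        freq -= STEP_MHZ
    return freq


typical = pick_frequency(0.8)     # ordinary workload: stays at the peak clock
worst_case = pick_frequency(1.0)  # torture test: clock backs off a little
print(typical, worst_case)
```

    In this toy model a fixed-clock design would have to run at the `worst_case` frequency all the time, because it has to survive activity 1.0 without help. The variable design only pays that penalty while the spike lasts.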

    The setup they chose could reduce the yield because of the fixed power draw and the requirement that all chips must be able to hit the 2230MHz mark and hold it as per their workload rules. But as others have suggested, Sony is unlikely to have chosen a clock speed that it could not produce in fairly decent quantities.
     
  20. Kreten

    Newcomer

    Joined:
    Sep 28, 2013
    Messages:
    32
    Likes Received:
    23
    OK, so with a fixed clock the GPU is always at 2.0GHz and it doesn’t work, but with their power shift it can go from 2.23GHz down to 2.03GHz (a single-digit percentage drop, 9% at most)? In this case it still means that the GPU is operating the entire time at a frequency higher than 2.0GHz. Unless the GPU is allowed to swing much more and scale with workloads?
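
    The arithmetic behind those figures is easy to check. The helper name below is just for illustration, and the 9% is simply the largest value a "single digit percentage" can take, not a confirmed limit.

```python
PEAK_GHZ = 2.23  # peak GPU clock discussed in this thread


def clock_after_drop(peak_ghz, drop_fraction):
    """Clock remaining after a fractional reduction from the peak."""
    return peak_ghz * (1.0 - drop_fraction)


floor_ghz = clock_after_drop(PEAK_GHZ, 0.09)  # largest single-digit percentage
print(f"{floor_ghz:.2f} GHz")  # prints "2.03 GHz"
```

    So even at the largest single-digit drop, the clock stays just above 2.0GHz.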


    Also while they don’t want ambient temperatures to affect the performance of the chip, it still must have some type of thermal protection in case it can’t get enough airflow or something. It would probably just shut down with an error and not just 100% ignore temperatures of the chip and possibly damage it.
     
    PSman1700 likes this.