PlayStation 5 [PS5] [Release November 12 2020]

Discussion in 'Console Technology' started by BRiT, Mar 17, 2020.

  1. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,552
    Likes Received:
    4,713
    Location:
    Well within 3d
    The decompression logic should be on the APU, probably somewhere on the opposite side of the die from the CPU block.
    The flash controller is on a different, smaller chip by the flash modules. I'm not sure if the photographer would consider it interesting, but maybe someone could request a picture.

    There's already some evidence that the controller isn't quite like the Sony patents people had fixated on last summer. Those specifically mentioned not needing a DRAM buffer, but the PS5 has one.
     
  2. thicc_gaf

    Regular Newcomer

    Joined:
    Oct 9, 2020
    Messages:
    324
    Likes Received:
    246
    I think it might be interesting anyway, if only to see which specific DRAM they are using for the cache. It would also be interesting if someone could provide photographs of the Toshiba NAND devices and the SK Hynix one in the Series X's SSD.

    Clearer photos would make it easier to track down documentation for those...if there's any online (I tried doing this for the SK Hynix one but kept running into blank wholesaler pages and AliExpress entries for unrelated things).

    That seems like a carry-over from the PS4 Pro, which was also a butterfly design. Sony (well, more specifically Cerny) seems to like butterfly designs.

    I don't know how well that will carry forward while maintaining hardware-based BC, though. Suppose a 3nm or 2nm PS6 APU ends up being very expensive, and suppose they decide to make it a chiplet because they want to use one of those in some handheld-like device (they may have to do this for Japan and a few other regions). Scaling-wise they would then be locked to a 36 CU (40 CU) block, which means a wider GPU and therefore more cost.

    Interestingly, Cerny mentioned a theoretical 48 CU design at Road to PS5, so I'm guessing they could technically do 18 CU increments, meaning they may not be absolutely married to a butterfly model. That is probably for the best if they still want to maintain hardware BC while keeping APU costs manageable.
     
    #8082 thicc_gaf, Feb 16, 2021
    Last edited: Feb 16, 2021
  3. Proelite

    Veteran Regular Subscriber

    Joined:
    Jul 3, 2006
    Messages:
    1,564
    Likes Received:
    1,005
    Location:
    Redmond
    48 CUs would need to be 3 shader engines (like the RX 6800) for BC reasons: turn off two for PS4, one for Pro. It would have been a large SoC, bigger than the XSX's.
     
  4. 3dilettante

    Do you need numbers more specific than the iFixit teardown had?
    https://www.ifixit.com/Teardown/PlayStation+5+Teardown/138280

    There may not be much motivation for public documentation for components meant for large customer orders.


    Quite a few of AMD's GPUs have to split their CUs across the command processor fixed-function block. The larger sizes seem to be mostly butterfly-like. The PS4 Pro's somewhat odd layout may be related to how heavily Sony fights for die area savings.
    GPU data paths are very broad and routing through the chip can be congested. Putting a lot of arrays on one side constrains what direction additional CUs can be added as well.
    Splitting across the middle reduces the average distance signals have to travel from the midline, where the caches and control hardware are placed, and if there are a lot of links to the memory subsystem they don't all have to make the same turn into the CU arrays.
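
    As a rough illustration of the distance argument above, here is a toy one-dimensional model. It assumes CU columns sit at unit spacing away from a midline holding the caches and control hardware; the column counts (10 on one side versus 5 + 5) are made-up numbers for the sketch, not measurements from any die shot.

```python
def average_distance(units_per_side):
    """Mean distance from the midline, in column pitches.

    units_per_side lists how many CU columns sit on each side of the
    midline; column k on a side is assumed to be k pitches away.
    """
    distances = [pos for count in units_per_side for pos in range(1, count + 1)]
    return sum(distances) / len(distances)


# Hypothetical layouts with the same total of 10 columns.
one_sided = average_distance([10])    # everything on one side of the midline
butterfly = average_distance([5, 5])  # split evenly across the midline

print(one_sided)  # 5.5
print(butterfly)  # 3.0
```

    In this simplified model the split layout nearly halves the average run from the midline. Real routing congestion is two-dimensional, so treat this only as intuition.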

    Looking at the PS5 APU, it looks like Sony may have had a certain ceiling for the width of the chip, since the CPU block is packed so tightly and may have sacrificed FPU area to keep it within the bounds set by the GPU. If the GPU didn't split its CUs, it might have required orienting the GPU in a way that would make it wider.

    As long as raw unit counts are higher than what came before, the GPU can hide any extra. We know it can because it's already hiding that there are 40 CUs physically on-chip, even though at some early validation stage the GPU was running those units as part of its bring-up process.
     
  5. Vega86

    Newcomer

    Joined:
    Sep 25, 2018
    Messages:
    182
    Likes Received:
    123
    Does the APU have any dead CUs for yield purposes?
     
  6. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    18,768
    Likes Received:
    21,044
    4 of them, only 36 are active.
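
    To show why spare CUs help yield, here is a minimal binomial sketch. It assumes each CU fails independently with some probability and that any 4 of the 40 can be the disabled ones; both are simplifications, and the defect probability below is a made-up placeholder, not a real figure for this chip or process.

```python
from math import comb


def usable_fraction(total_cus, spare_cus, p_defect):
    """Probability that at most spare_cus of total_cus CUs are defective,
    assuming independent defects with probability p_defect per CU."""
    return sum(
        comb(total_cus, k) * p_defect**k * (1 - p_defect) ** (total_cus - k)
        for k in range(spare_cus + 1)
    )


P_DEFECT = 0.02  # hypothetical per-CU defect probability

no_spares = usable_fraction(40, 0, P_DEFECT)    # all 40 CUs must work
with_spares = usable_fraction(40, 4, P_DEFECT)  # up to 4 may be disabled

print(f"all 40 required: {no_spares:.3f}")
print(f"36 of 40 required: {with_spares:.3f}")
```

    Under these assumptions, tolerating up to four bad CUs can only raise the fraction of usable dies relative to requiring all 40 to work.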
     
  7. Tkumpathenurpahl

    Tkumpathenurpahl Oil Monsieur Geezer
    Veteran Newcomer

    Joined:
    Apr 3, 2016
    Messages:
    1,797
    Likes Received:
    1,800
    Does anyone know where the "Tempest Engine" fits into that? I recall Cerny stating that it's a modified CU. Is that modified CU in addition to the 40 "unmodified" CUs, or is it 1 or 2 of the 40 total CUs?
     
  8. thicc_gaf

    That'd definitely be a waste of CUs, which maybe makes Cerny's hypothetical example kind of odd. If they ever seriously considered a 48 CU design, I wonder why, if it would mean 12 disabled CUs.

    I somehow missed this xD; thanks for the link, that covers any needless hunting.

    Actually, speaking of the PS5 GPU's shape, something came to my attention. I've seen a few photos breaking down the allocation of things on the PS5 APU from the x-rays:

    [Image: annotated PS5 APU die x-ray]

    and then I came across this GPU ref image for Navi 21:

    [Image: Navi 21 GPU reference layout]

    Now, I can't see an Infinity Fabric Cache Controller in the PS5 shot like the one in Navi 21. But is it possible they have extremely cut-down slices of cache sandwiched between the PHYs, with the small (maybe unmarked) grey block between the two pairs of memory controllers interfacing to the L2$ acting as a cache controller?

    I'm asking because it's genuinely slightly perplexing, and I feel like I might have made an analysis of this elsewhere based on a bad read. Even looking at it now, I can't see how any theoretical IC slice sandwiched between the PHYs would be even 2 MB in size, going by the size of the actual 2 MB (or more specifically, 256 KB) L2$ blocks on the GPU. Thin them and lengthen them a bit to fit in that space and you get maybe six in each slice, or 6 MB in all. That honestly doesn't sound worth it.

    Then again, looking at the Navi 21 picture, I should probably assume the actual cell size for the IC is a lot smaller than for the L2$ (cell density per cache level is something I've only just started looking into in terms of process nodes). That's why I was thinking that, if there's actually room there, it could be closer to 12 MB - 24 MB? I still can't explain where the Infinity Fabric Cache Controller would be, or why it would be a small block between the GDDR6 memory controllers, since that looks contrary to how the cache controller is actually supposed to function.

    Honestly, I'd just enjoy some concise clarification so the book on "does it have IC or doesn't it" can be closed once and for all. These die x-rays should be enough evidence anyway, IMO.
     
  9. BRiT

    It does not have Infinity Cache.
     
  10. thicc_gaf

    I mean, that's what I assumed was the case, but I had to make absolutely sure. For example, going from what I mentioned earlier, there's no Infinity Fabric Cache Controller to handle the (hypothetical) L3$, which should be the strongest indicator that it's not present.

    At least it should be clear now that it's not there, because critical parts of the IC are completely absent. So if people are still trying to argue otherwise, pointing to the missing IFCC should be enough to dissuade them (if they aren't stubborn).
     
  11. Metal_Spirit

    Regular Newcomer

    Joined:
    Jan 3, 2007
    Messages:
    624
    Likes Received:
    395
    Where did you get those values for FADD? They don't match the ones in the article below:

    https://www.anandtech.com/show/1621...e-review-5950x-5900x-5800x-and-5700x-tested/6

    In there they claim Zen 2 FADD throughput is 1 per clock with 5 cycle latency.
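
    For anyone mixing up the two figures, here is a small sketch of how throughput and latency differ. It uses a toy pipeline model with the 1-per-clock, 5-cycle numbers quoted above as inputs; the function names and the 100-add workload are hypothetical, and real cores have more going on (multiple pipes, scheduling limits).

```python
def dependent_chain_cycles(n_adds, latency):
    """Each add waits for the previous result, so latency dominates."""
    return n_adds * latency


def independent_cycles(n_adds, latency, per_clock=1):
    """Adds do not depend on each other, so they issue back to back.

    Issue takes ceil(n_adds / per_clock) cycles, and the last result
    arrives latency - 1 cycles after the last add issues.
    """
    issue_cycles = -(-n_adds // per_clock)  # ceiling division
    return issue_cycles + latency - 1


print(dependent_chain_cycles(100, latency=5))  # 500
print(independent_cycles(100, latency=5))      # 104
```

    So a 5-cycle latency only hurts when each add needs the previous result; with independent adds the 1-per-clock throughput is what matters.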
     
    #8091 Metal_Spirit, Feb 17, 2021
    Last edited: Feb 17, 2021
  12. thicc_gaf

    Well, MLID is finally addressing the SoC images; still listening. I can tell he's flustered, and I find it very funny but also sympathize with him if he genuinely feels people were misquoting him.

    2021's a good year so far.
     
  13. 3dilettante

    There's the x87 opcode FADD, but I think the numbers originally given correspond to a vector floating-point addition, so FADD was being used as shorthand for the non-legacy variant.
     
  14. thicc_gaf

    Also, quickly: I think guys like MLID, RGT etc. need to keep in mind that their speculating on what might have been in the PS5 was never really the issue. The problem is that they seemed to afford Sony leeway on certain design decisions that, for whatever reason, they didn't afford Microsoft, and this was before Hot Chips came out (I phrase it this way because I genuinely don't think these guys have direct insider contacts with in-depth, hands-on experience of this hardware).

    Also, reaching for phantom customizations to explain why the PS5 is performing better than the Series X in certain 3P cross-gen titles was never necessary. There are a handful of very simple potential answers based on info Sony and Microsoft have already provided: profiling issues with the segmented memory (Series X), certain engines benefiting from the higher GPU clock (PS5), lack of tools, tool immaturity or tool unfamiliarity (Series X), or simply better or more familiar tools for handling data I/O (PS5). Any of those could have been the obvious answer to pick.

    Instead they kept leaving the door open for Infinity Cache, Zen 3 unified cache, Geometry Engine customizations (the x-ray shots don't necessarily debunk anything about the GEs, AFAIK) and other esoteric things, and they let it build up, so this is basically the result.
     
    #8094 thicc_gaf, Feb 17, 2021
    Last edited: Feb 17, 2021
  15. AbsoluteBeginner

    Regular Newcomer

    Joined:
    Jun 13, 2019
    Messages:
    960
    Likes Received:
    1,301
    I'll go with Occam's razor: it's a Navi 10 layout + Zen 2.

    Makes sense because AMD is not stupid; they will design a 40 CU chip to be as efficient as possible for yields.
     
  16. tunafish

    Regular

    Joined:
    Aug 19, 2011
    Messages:
    619
    Likes Received:
    395
    The only place you go to look up x86 instruction latencies: Agner Fog's Software Optimization Resources
    Specifically, volume 4.

    Fog measures all the values empirically instead of trusting what the manufacturers state. His values occasionally differ from those published by Intel or AMD, and when they do, the manufacturers typically go and fix their manuals.

    That's for the x87 instruction, which is obsolete. Look up the correct modern SSE/AVX one in Fog's tables (search for Zen2, then search for ADDSS).
     
  17. Tabris

    Newcomer

    Joined:
    Sep 24, 2011
    Messages:
    108
    Likes Received:
    166
  18. snc

    snc
    Regular Newcomer

    Joined:
    Mar 6, 2013
    Messages:
    805
    Likes Received:
    549
    Similar opinion to NXGamer, but interestingly it's a Microsoft dev, and AFAIK Microsoft didn't cut these 256-bit Zen 2 operations.
     
  19. Jay

    Jay
    Veteran Regular

    Joined:
    Aug 3, 2013
    Messages:
    3,456
    Likes Received:
    2,805
    Flip side of that is, if it's not used a lot you have lower thermals.
    Unless things have changed, which could very well be the case, AVX wasn't used much, mainly because of mixed support across the CPU generations that games needed to support.
    So the benefit wasn't there to make much use out of it. Doesn't mean that will be the case going forward.

    All these things are features that may get leveraged more in the future, but I still don't expect most people to see the difference or care.
    Interesting for us regarding engine implementation and GDC talks, though.
     
  20. snc

    https://www.gameinformer.com/feature/2019/12/03/the-first-25-years
    MASAYASU ITO [Executive Vice President, Hardware Engineering and Operation, Sony Interactive Entertainment]
    This rather suggests that Sony has plans for a PS5 Pro.
     