Next Generation Hardware Speculation with a Technical Spin [pre E3 2019]

Discussion in 'Console Technology' started by TheAlSpark, Dec 31, 2018.

Thread Status:
Not open for further replies.
  1. Jay

    Jay
    Veteran Regular

    Joined:
    Aug 3, 2013
    Messages:
    2,002
    Likes Received:
    1,130
    From what I remember from the Hot Chips presentation, the only changes were really to internal cache/memory sizes, nothing major.
    Mainly helped with running VMs.

    If it's a monolithic SoC then they may make minor tweaks again.
    Otherwise there'd be little reason to.
     
    Adonisds likes this.
  2. bgroovy

    Regular Newcomer

    Joined:
    Oct 15, 2014
    Messages:
    668
    Likes Received:
    521
    I don't think there is any interconnect with sufficient bandwidth for GPU chiplets. You'd have to put memory controllers on each and that would just recreate all the same multi-GPU issues with memory not being shared, data duplication and various inefficiencies.
     
    Metal_Spirit likes this.
  3. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    8,111
    Likes Received:
    6,388
    Depends what they are and why they are implemented.
     
  4. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    8,111
    Likes Received:
    6,388
    Hot chips slides don’t outline any customizations on the GPU or CPU. So I wouldn’t use that as a fully exhaustive source.
     
    #2104 iroboto, May 15, 2019
    Last edited: May 16, 2019
    BRiT likes this.
  5. TheAlSpark

    TheAlSpark Moderator
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    20,814
    Likes Received:
    5,916
    Location:
    ಠ_ಠ
    They increased the entries for the L2 Instruction & Data TLB (4KB pages) over standard Jaguar.

    ----

    GPU slide: https://images.anandtech.com/doci/11740/img_20170821_093653.jpg
    Nothing particularly out of the ordinary except for whatever bits they added to help facilitate backward compatibility (e.g. texture formats).

    Not sure what they're getting at with Conservative Occlusion Query or some of the other features listed.
     
    #2105 TheAlSpark, May 16, 2019
    Last edited: May 16, 2019
    Jay, Pixel and BRiT like this.
  6. Jay

    Jay
    Veteran Regular

    Joined:
    Aug 3, 2013
    Messages:
    2,002
    Likes Received:
    1,130
    Could have sworn I listened to a presentation by MS at Hot Chips also.
    But they did cover changes, and it was as @AlBran pointed out for the CPU. Basically they profiled it and realised upping those caches would improve VM performance.
    They are just bog standard Jaguar cores otherwise.
    So you wouldn't specifically take those changes across to Zen, but that doesn't mean they couldn't make similar changes.
     
  7. TheAlSpark

    TheAlSpark Moderator
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    20,814
    Likes Received:
    5,916
    Location:
    ಠ_ಠ
    The increased entries for the 4KB pages are kind of curious since the number of entries is otherwise identical to standard Jaguar for the rest of the scenarios (including the larger page sizes), so I'm not sure what's going on there or what the context is.

    At any rate, maybe the Azure folks will simply push for keeping the 16MB L3 per CCX as opposed to having the cut-down variant ala Raven Ridge. There's probably not a whole lot they can reasonably improve upon that the AMD engineers haven't already tried to do for performance. The rest would just have to be BC-related.
     
  8. bbot

    Regular

    Joined:
    Apr 20, 2002
    Messages:
    742
    Likes Received:
    10


    Yes, but not because of bandwidth.


    David Wang

    But realistically it’s more of a software problem than a hardware one.
    The Infinity Fabric interconnect should be able to provide an interface
    that is wide enough, and high-speed enough to deal with the communication
    to make it look and feel like one chip, but getting the OS and the
    applications to see it that way is a lot tougher.


    https://www.pcgamesn.com/amd-navi-monolithic-gpu-design
     
  9. HBRU

    Regular Newcomer

    Joined:
    Apr 6, 2017
    Messages:
    430
    Likes Received:
    47
    Especially true for Sony, because it is less capable of creating BC at the OS level...
    so I think PS5 will be wider and slower clocked...
     
  10. Jay

    Jay
    Veteran Regular

    Joined:
    Aug 3, 2013
    Messages:
    2,002
    Likes Received:
    1,130
    After going around the houses I've settled on what I think MS should do.
    MCM design with:
    • Off the shelf Zen chiplet CPUs (I don't think there is much to tinker with for blade usage)
    • GPU (not chiplet, just a module)
    • IO & memory controller module (allowing a different one to be used in blades, which will also allow board stacking via high speed interconnects for Azure users); it also might be better to use HBM on blades?
    • Accelerator module (possibly AI, RT for example)
    If the cost of MCM component construction was so prohibitive we wouldn't see it being moved down the stack.
    I think off the shelf Zen and the ability to swap IO dies would be a big benefit, as that's where the big change between console and server occurs, so overall it would be a reasonable and cost effective choice compared to single huge dies.
    It also helps yields, as if any component is defective you don't lose the whole chip. The CPU you wouldn't need to worry about, as it's off the shelf.

    Still think PS5 is fine with monolithic though
     
    #2110 Jay, May 16, 2019
    Last edited: May 16, 2019
  11. Metal_Spirit

    Regular Newcomer

    Joined:
    Jan 3, 2007
    Messages:
    398
    Likes Received:
    193
    Problems with multi-GPU

    ----

    Splitting GPUs into chiplets isn’t a new idea in the realm of ideas, however it is a concept that is difficult to conceive. One of the key areas of shuffling data around a GPU is bandwidth – the other is latency. In a graphics scenario, the race is on to get a low frame rendering time, preferably below 16.67 milliseconds, which allows for a refresh rate of 60 Hz to have a full display frame inserted on every refresh cycle. With the advent of variable refresh displays this has somewhat changed, however the main market for graphics cards, gamers, is heavily reliant on quick refresh rates and high frame rates from their graphics. With a multi-chip module, the manufacturer has to consider how many hops between dies the data has to perform from start to finish – is the data required found directly connected to the compute chip, or does it have to cross from the other side of the design? Is the memory directly stacked, or is there an intrapackage connection? With different memory domains, can the data retain its concurrency through the mathematical operations? Is there a central management die, or do each of the compute chiplets manage their own timing schema? How much of the per-chiplet design comes from connectivity units compared to compute units?

    Ultimately this sort of design will only win out if it can compete on at least two fronts of the triad of performance, cost, or power. We already know that multi-die environments typically require a higher power budget than a monolithic design due to the extra connectivity, as seen with multi-die CPU options in the market, so the chiplets will have to take advantage of smaller process nodes in order to eliminate that deficit. Luckily, small chiplets are easier to manufacture on small process nodes, making it a potential cost saving over big monolithic designs. Performance will depend on the architecture, both for raw compute, as well as the interconnect between the chips.

    https://www.anandtech.com/show/14211/intels-interconnected-future-chipslets-emib-foveros

    So, power usage is a problem... and since the idea is to go above 64 CUs, cost should be too.
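For reference, the 16.67 millisecond figure in the quoted passage is just the refresh interval at 60 Hz. A minimal sketch of that arithmetic follows; the function name and the extra refresh rates are only illustrative and are not taken from the article.

```python
def frame_budget_ms(refresh_hz):
    """Time available per frame, in milliseconds, at a fixed refresh rate."""
    return 1000.0 / refresh_hz

# 60 Hz is the rate named in the quoted article; 30 and 120 are added for comparison.
for hz in (30, 60, 120):
    print(f"{hz} Hz -> {frame_budget_ms(hz):.2f} ms per frame")
```

Every extra die-to-die hop has to fit inside that budget along with the rest of the frame's work, which is why the quote treats latency as a concern alongside bandwidth.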
     
    Jay likes this.
  12. Gubbi

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,533
    Likes Received:
    888
    People here are out of their goddamn minds.

    Next gen consoles will launch at $499 and, like the past generation, it is highly unlikely they will be sold at a loss.

    The console vendors will need to produce a system with CPU/GPU, DRAM, Storage, PSU and a controller and sell it at the same price as a RTX 2070 8GB.

    There is zero, ZERO!!!, chance that we will see a 64 CU GPU in next gen consoles: 1.) the cost of the die is too large, 2.) the power consumption is too large, 3.) the bandwidth demand on the memory subsystem, and consequently price, is too large.

    Both MS and Sony are going to compete against console as a service-providers next gen, that puts tremendous downward pressure on the purchasing price of physical consoles.

    I would expect a 48CU GPU die, with only 40 active to ensure as many usable dies as possible. I would expect MS to pair hot GPU dies with cool CPU dies to maximize the power/yield point. If they can hit 1.7GHz, then that's 8.7TFlops with FP32 and 17.4TFlops using packed FP16. I would expect it to be paired with 16GB GDDR6 on a 256bit bus running either 13 or 14Gbps (~400GB/s bandwidth).
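The numbers in this estimate can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the usual GCN layout of 64 FP32 lanes per CU with one fused multiply-add (2 FLOPs) per lane per clock, and it reads the 13 or 14 memory speed as a per-pin data rate in Gbps. The function names are made up for illustration.

```python
# Assumptions (not stated in the post): 64 FP32 lanes per GCN CU,
# and 2 FLOPs per lane per clock from a fused multiply-add.
LANES_PER_CU = 64
FLOPS_PER_LANE_PER_CLOCK = 2

def peak_tflops(active_cus, clock_ghz, packed_factor=1):
    """Theoretical peak throughput in TFLOPs; packed_factor=2 models packed FP16."""
    gflops = active_cus * LANES_PER_CU * FLOPS_PER_LANE_PER_CLOCK * clock_ghz * packed_factor
    return gflops / 1000.0

def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s for a given bus width and per-pin data rate."""
    return bus_width_bits * data_rate_gbps / 8.0

print(f"FP32: {peak_tflops(40, 1.7):.1f} TFLOPs")
print(f"Packed FP16: {peak_tflops(40, 1.7, packed_factor=2):.1f} TFLOPs")
for rate in (13, 14):
    print(f"256-bit @ {rate} Gbps: {bandwidth_gb_s(256, rate):.0f} GB/s")
```

With those assumptions, 40 CUs at 1.7GHz gives the 8.7 and 17.4 TFLOPs figures, and a 256-bit bus lands at 416 or 448 GB/s, a little above the rounded figure in the post.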

    What Lockhart is/isn't is just speculation at this point; everything I've read originates from a Reddit post in February AFAICT. If it isn't just a SKU with gimped storage (no optical, half the SSD), it might be a client to MS' console-in-the-cloud service. It could be an APU with limited capability. Enough to play existing XB1 titles, but everything more demanding would be streamed from a server.

    Cheers
     
    DavidGraham, PSman1700 and iroboto like this.
  13. Globalisateur

    Globalisateur Globby
    Veteran Regular

    Joined:
    Nov 6, 2013
    Messages:
    3,011
    Likes Received:
    1,727
    Location:
    France
    @Gubbi
    4/ GCN doesn't scale well above 56 CUs
     
    Heinrich4 likes this.
  14. Jay

    Jay
    Veteran Regular

    Joined:
    Aug 3, 2013
    Messages:
    2,002
    Likes Received:
    1,130
    See title of thread.

    Nothing at all to discuss otherwise, and I personally have enjoyed the discussions and the pros, cons and feasibility of people's ideas.

    So $499 8.7 TF 16GB consoles?
    Sounds a bit expensive to me, but that's your opinion/speculation.
     
    #2114 Jay, May 16, 2019
    Last edited: May 16, 2019
  15. function

    function None functional
    Legend Veteran

    Joined:
    Mar 27, 2003
    Messages:
    5,138
    Likes Received:
    2,253
    Location:
    Wrong thread
    With MS looking to use the silicon in Azure racks for "none entertainment purposes", you potentially have other considerations that could affect the chip. E.g. your good dies go in the server blades, your salvage dies go in the consoles. This could potentially lead to a wider chip than you'd otherwise see if only used for a console, IMO. The savings of 'cheap' console parts Vs buying enterprise parts at several $k a pop could affect the calculation on die area.

    I won't say 64 CUs, but I'm betting we'll see more than 40 active in Anaconda, and probably more for the cloud units than retail. X1X is already at 40 on 16nm.

    Power is also relative to clocks. Wider but slower can give better performance for less power.
     
    Heinrich4 and Picao84 like this.
  16. Picao84

    Veteran Regular

    Joined:
    Feb 15, 2010
    Messages:
    1,567
    Likes Received:
    717

    You seem to forget that Xbox One X already does >6TFlops on GPU. What would be the point in launching a console with only 33% more GPU power? Sure, the new beefier CPU probably occupies more die space and uses more power, but if they are not going to push GPU to at least 10 TF, what's the point? Go from a CPU starved console to a GPU starved one?

    Regarding the memory, 16GB is not as forward thinking as 8GB was on the PS4, plus the Xbox One X is already at 12GB. Again I fail to see the point of releasing a new console if the difference in hardware would be so small.

    About the supposed competition from streaming services (whose success really remains to be seen, and which I'm very, very sceptical about), you only talk about price, but what about experience? What would be the point of selling a cheaper console that gives you an experience that is barely improved on the previous generation, while streaming services can easily upgrade their hardware and give a better experience over the life of your standalone console? If game streaming is really a threat to the traditional console business, they need to go all in on hardware to show they can provide the better experience in the long run.
     
    #2116 Picao84, May 16, 2019
    Last edited: May 16, 2019
    Cuthalu and Heinrich4 like this.
  17. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    41,031
    Likes Received:
    11,647
    Location:
    Under my bridge
    TFs don't matter - it's what's on screen that matters. XB1X is hampered by running current-gen games not targeting it, firstly. Secondly, advanced features could make those 8 TFs do significantly more than 33% more than 8 TFs of last-gen GCN.

    Discussing TFs is interesting, but it doesn't tell us what the end results will look like and shouldn't be used for business considerations or design logistics.
     
    McHuj and OCASM like this.
  18. Picao84

    Veteran Regular

    Joined:
    Feb 15, 2010
    Messages:
    1,567
    Likes Received:
    717
    If this were a new non-GCN architecture, sure, TFlops would be as meaningless as they are for comparing AMD and NVIDIA GPUs.
    Since Navi is still GCN, I really doubt it can do significantly more than 33%, unless RPM would be heavily used. Each iteration of GCN was hyped as hell as bringing huge uplifts and all of them were wimps, with uplifts coming mostly from process and clock speeds.
    Vega 64 has 50% more TFlops than Fiji and yet it only delivered an average 30% uplift. Performance per TFlop has been going down, not up, so I highly doubt Navi reverses the trend.
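To put a number on the "performance per TFlop has been going down" point, here is a small sketch using only the two ratios given in the post (50% more TFlops, 30% average uplift). The function name is hypothetical and the ratios are the poster's estimates, not measured benchmark data.

```python
def relative_perf_per_tflop(tflops_ratio, perf_ratio):
    """How much performance each TFLOP delivers relative to the older part (1.0 = same)."""
    return perf_ratio / tflops_ratio

# Ratios as quoted in the post: Vega 64 vs Fiji.
ratio = relative_perf_per_tflop(tflops_ratio=1.5, perf_ratio=1.3)
print(f"Perf per TFLOP vs Fiji: {ratio:.2f}x ({(1 - ratio) * 100:.0f}% lower)")
```

On those figures each Vega 64 TFLOP delivers roughly 0.87 of what a Fiji TFLOP did, which is the trend the post is describing.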
     
    #2118 Picao84, May 16, 2019
    Last edited: May 16, 2019
    Heinrich4 likes this.
  19. Gubbi

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,533
    Likes Received:
    888
    You seem to forget that MS and Sony need to make money off of these consoles.

    Packed fp16 doubles the computational oomph where it is applicable.

    There are two reasons for the 12GB in Scorpio: 1.) The extra computational power of the GPU demanded more bandwidth and resulted in a 384 bit bus. 2.) Assets required for 4K rendering. On both the original Xbox 1 and on Scorpio, 3GB is reserved for the system/other apps. That leaves 5 and 9 GB for the active game.

    On a 16 GB machine you would have 13GB for a game, 2½ x the XB1 and almost 50% more than Scorpio. Because of the expected high performance mass storage, assets can be fetched on demand to a much larger degree without degrading the gaming experience.
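The memory comparison above is simple enough to check directly. A minimal sketch, assuming the same 3GB system reservation carries over to a 16GB machine (which is what the post assumes) and taking 8GB and 12GB as the totals for XB1 and Scorpio; the names are illustrative only.

```python
SYSTEM_RESERVE_GB = 3  # reservation quoted in the post; assumed unchanged next gen

def game_memory_gb(total_gb, reserve_gb=SYSTEM_RESERVE_GB):
    """Memory left for the active game after the system reservation."""
    return total_gb - reserve_gb

xb1 = game_memory_gb(8)        # 5 GB
scorpio = game_memory_gb(12)   # 9 GB
next_gen = game_memory_gb(16)  # 13 GB

print(f"vs XB1: {next_gen / xb1:.1f}x")
print(f"vs Scorpio: {(next_gen / scorpio - 1) * 100:.0f}% more")
```

That gives 2.6x the XB1 figure and about 44% more than Scorpio, in line with the "2½ x" and "almost 50%" in the post.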

    Fiji has 6% more bandwidth than Vega 64.

    Cheers
     
    #2119 Gubbi, May 16, 2019
    Last edited: May 16, 2019
    BRiT likes this.
  20. Picao84

    Veteran Regular

    Joined:
    Feb 15, 2010
    Messages:
    1,567
    Likes Received:
    717
    Historically most consoles were sold at a loss in the beginning.

    Exactly, and there were discussions already on this forum about that topic and I remember someone (Sebbi?) saying it is very limited. In the real world you might use it for a third of the rendering or less, depending on what you are rendering. It's not something you can put a lot of faith in.

    Fair enough, but the memory is not only used for rendering. You will have beefier CPUs for something (e.g. more believable worlds, better AI), are you going to have them compete with GPUs for that increase of only 50%?
     
    Heinrich4 likes this.