Next Generation Hardware Speculation with a Technical Spin [2018]

Discussion in 'Console Technology' started by Tkumpathenurpahl, Jan 19, 2018.

Thread Status:
Not open for further replies.
  1. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,834
    Likes Received:
    18,634
    Location:
    The North
    It largely depends on the quality of HRT titles. If it’s massive and it generates demand, then everyone should be offering it at all price points. But then again, how RT acceleration is handled is the big question.

    Having seen only Nvidia’s implementation, we are anchoring our opinion of what RT hardware looks like to it.
     
    OCASM, vipa899 and BRiT like this.
  2. beyondtest

    Newcomer

    Joined:
    Jun 3, 2018
    Messages:
    58
    Likes Received:
    13
    That's a bummer. I hope someone can investigate whether, for example, Ryzen 7 is the exact same chip as Ryzen 5, except that the 5 has its SMT hardware disabled through other means like firmware, software or the motherboard.

    Is it true that SMT affects clockspeeds? Does this apply to Ryzen?
     
  3. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    20,516
    Likes Received:
    24,424
    It's not affecting clockspeeds so much as it's affecting thermals: do more work and you produce more heat or consume more power. This indirectly impacts clockspeed because today's CPUs are throttled based on thermal targets or power consumption.
     
    mrcorbo, function and beyondtest like this.
  4. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    AMD's stated there's just one Zen chip for the 1x00 Ryzen products and a different revision for the 2x00 products, though if you want to see Zen without SMT that's Ryzen 3. There's no motivation to make a different chip, as removing the elements specific to SMT provides negligible savings (and would cost millions of dollars in engineering and compromised binning to create two almost identical products). Outside of a small set of context elements, the rest of the hardware is no different and the impact of the SMT-specific elements is unlikely to show up over a wide range of more pressing bottlenecks. SMT is readily disabled by software for the chips that come with SMT available, and is likely disabled by some kind of blown fuse for the products sold without it. The chip just leaves the SMT-specific elements unused or ignored.

    The extra tracking and logic in SMT by itself is dwarfed by the rest of a massive OoO engine, many deep pipelines, and wide execution and memory resources. It's not likely that those small elements in the simplest 2-thread case would become the critical path when there are so many more large elements with more pressing scaling challenges. As noted by BRiT, the higher utilization that SMT can allow by filling in stall cycles in parts of the chip can increase power consumption, which can make a core hit limits more readily. If there are enough cores with sufficient resource utilization gaps, this may allow a chip to meet the TDP specifications of a slightly higher-clocked bin when it is tested during manufacturing.
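
    A rough sketch of the utilization-versus-clock effect described above. It assumes a toy power model (static power plus a dynamic term proportional to utilization times clock cubed, i.e. voltage scaling with frequency). The function name, the constants and the utilization figures are all made up for illustration and are not measurements of any Ryzen part.

```python
def sustainable_clock_ghz(power_limit_w, utilization, k=1.2, static_w=10.0):
    """Highest clock (GHz) that keeps modeled power at or under the limit.

    Assumed model: power = static_w + k * utilization * f**3
    (k and static_w are illustrative constants, not real silicon data).
    """
    dynamic_budget = power_limit_w - static_w
    if dynamic_budget <= 0 or utilization <= 0:
        raise ValueError("need a positive dynamic power budget and utilization")
    # Solve k * utilization * f**3 = dynamic_budget for f.
    return (dynamic_budget / (k * utilization)) ** (1.0 / 3.0)


# Same hypothetical 95 W limit; SMT fills stall cycles, so utilization is higher.
no_smt = sustainable_clock_ghz(95.0, utilization=0.70)
with_smt = sustainable_clock_ghz(95.0, utilization=0.90)

print(f"SMT off: {no_smt:.2f} GHz sustainable")
print(f"SMT on:  {with_smt:.2f} GHz sustainable")
```

    In this model the busier core hits the same power limit at a lower clock, which is the indirect link between SMT and clockspeed; the real relationship depends on the voltage/frequency curve of the actual chip.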
     
  5. beyondtest

    Newcomer

    Joined:
    Jun 3, 2018
    Messages:
    58
    Likes Received:
    13
    Would you happen to have a reference for that statement about there being one chip? It makes sense to me though.

    Btw, is it true that Intel also uses the same chips for desktop and laptops?

    Basically just one or few designs then they just "artificially nip and tuck" for their desired price/performance segmentation?

    Very interesting, thanks. So things can cost the same and just need a better cooler. Then again, there's binning, right? Even if the chips cost nearly the same to build from high to low, say Ryzen 3 and 7, AMD will have to charge Sony to use Ryzen 7 features.

    Btw sorry guys, I'm fairly new to these kinds of details and I still appreciate it staying here as I'm more interested in the context of consoles.
     
  6. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    The specifics were announced as far back as the initial launch of the various products.
    Epyc's launch discussed how it used the same chip as Ryzen, and various product reviews of the Ryzen families show how the lower products have progressively more cores and L3 cache capacity disabled.
    https://www.pcper.com/reviews/Proce...cessor-Launch-Gunning-Xeon/Architectural-Outl

    https://www.anandtech.com/show/1124...0x-vs-core-i5-review-twelve-threads-vs-four/2


    The APU is a different Raven Ridge die, and the Ryzen 2 non-APU chips are a port of the original Ryzen's Summit Ridge chip to a slightly more refined process.

    The same chip can go into multiple product lines, though the highest end desktop/enthusiast ones are derived from higher-end server chips that generally aren't portable outside of a niche like workstation laptops that are mobile only in the sense that they can be carried from wall plug to wall plug with little battery life.

    Generally, this is the case. Silicon chips are individually cheap due to mass-production, but the up-front costs of engineering a new implementation are massive. Tweaking the same chip saves those costs and allows less than perfect chips to be salvaged. AMD's current line is somewhat unusual in the number of markets covered by virtually the same chip, but even though Intel has a fair number of different dies they go into a vast and confusing number of different product lines.
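
    A minimal sketch of why salvaging pays off, assuming each core on an 8-core die is good independently with the same probability (a simplification of real defect behaviour). The function names, yield, engineering cost and volumes below are hypothetical, chosen only to show the shape of the argument.

```python
from math import comb


def core_bin_distribution(cores, p_core_good):
    """P(exactly g working cores), assuming independent per-core defects."""
    return {
        g: comb(cores, g) * p_core_good**g * (1 - p_core_good) ** (cores - g)
        for g in range(cores + 1)
    }


def sellable_fraction(dist, min_good_cores):
    """Fraction of dies that fit some SKU needing at least min_good_cores."""
    return sum(p for g, p in dist.items() if g >= min_good_cores)


def cost_per_unit(nre_usd, units_sold, silicon_cost_usd, sellable):
    """Amortized up-front engineering cost plus silicon cost per sellable die."""
    return nre_usd / units_sold + silicon_cost_usd / sellable


# Hypothetical inputs: 95% per-core yield, $300M engineering, 10M units, $40 per die.
dist = core_bin_distribution(cores=8, p_core_good=0.95)
perfect_only = sellable_fraction(dist, 8)  # only fully working dies are sold
with_salvage = sellable_fraction(dist, 4)  # 4-, 6- and 8-core SKUs share one die

one_die_many_skus = cost_per_unit(300e6, 10e6, 40.0, with_salvage)
perfect_dies_only = cost_per_unit(300e6, 10e6, 40.0, perfect_only)

print(f"sellable dies, perfect only: {perfect_only:.1%}")
print(f"sellable dies, with salvage: {with_salvage:.1%}")
print(f"cost per unit, salvaging:    ${one_die_many_skus:.2f}")
print(f"cost per unit, perfect only: ${perfect_dies_only:.2f}")
```

    A second die design would add its own engineering cost on top, split over fewer units, which is the other half of the argument for tweaking one chip.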
     
    TheAlSpark, iroboto, mrcorbo and 2 others like this.
  7. Jay

    Jay
    Veteran

    Joined:
    Aug 3, 2013
    Messages:
    4,033
    Likes Received:
    3,428
    It may be in AMD's interest to pretty much just bin different chips, but that doesn't mean MS or Sony buy a single design and can bin in the same way.

    So from a manufacturing standpoint there may be little reason to produce chips that don't have SMT; just disable it at the hardware level for binning purposes or to meet a quota.
    But AMD would be able to sell SMT as a feature of the design, so that would cost MS and Sony more to use it.

    So for MS and Sony there would be a cost to SMT, above and beyond thermals etc.
    That's how I would expect AMD to sell design features anyway.
    Not replying to anyone in particular, just adding my 2 cents.
     
    vipa899 likes this.
  8. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    967
    Likes Received:
    1,223
    Location:
    55°38′33″ N, 37°28′37″ E
    DXR 1.0 does not include geometry LOD or geometry shaders (though these features are considered for the future), probably because raytracing was intended as an addition to the rasterization and computing pipelines.

    You can build/modify the command list exclusively for the separate reflection pass (but AFAIK that would require multi-GPU support to run in parallel to AO and GI).

    You can't. You have access to command lists and bundles, but you cannot access acceleration structures after the video driver builds them from your geometry (AFAIK, you can physically access the memory region, but the data format is proprietary and not disclosed).

    It is not certain that next-gen consoles will get RTRT in the first place - this could only happen once it reaches the mid-range $200 GPUs typical for consoles. For all we know, NVidia's implementation is not coming to consoles (not at the GeForce RTX 20xx price point either) and AMD is seemingly taking a different, software-based approach with 'Radeon Rays'.

    By the time hardware RTRT trickles down to mid-range, there will be DXR tier 2.0 with more shader types and processing functions.

    Exactly. DXR 1.0 lets hardware developers choose a subdivision algorithm which is best for available memory/cache bandwidth and their specific software or fixed-function implementation of ray-intersection search.

    For your game engine needs, you can ship pre-computed BSP/BVH structures with your game assets and use whatever algorithm would be most efficient or full-featured for your particular task.
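
    To illustrate the engine-side option, here is a minimal BVH sketch: a median split over axis-aligned boxes with a slab-test ray query. It is unrelated to the driver-built DXR acceleration structures, whose format is not public. All names (AABB, Node, build, query) are hypothetical, and a real engine would use a better split heuristic and a flattened array layout.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec = Tuple[float, float, float]


@dataclass
class AABB:
    lo: Vec
    hi: Vec

    def union(self, other: "AABB") -> "AABB":
        return AABB(tuple(min(a, b) for a, b in zip(self.lo, other.lo)),
                    tuple(max(a, b) for a, b in zip(self.hi, other.hi)))

    def centroid(self, axis: int) -> float:
        return 0.5 * (self.lo[axis] + self.hi[axis])

    def hit(self, origin: Vec, direction: Vec) -> bool:
        """Slab test: does the ray origin + t * direction (t >= 0) touch the box?"""
        t_near, t_far = 0.0, float("inf")
        for o, d, lo, hi in zip(origin, direction, self.lo, self.hi):
            if d == 0.0:
                # Ray is parallel to this slab: it must start inside it.
                if o < lo or o > hi:
                    return False
                continue
            t0, t1 = (lo - o) / d, (hi - o) / d
            if t0 > t1:
                t0, t1 = t1, t0
            t_near, t_far = max(t_near, t0), min(t_far, t1)
            if t_near > t_far:
                return False
        return True


@dataclass
class Node:
    box: AABB
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prims: Optional[List[int]] = None  # set on leaves only


def build(boxes: List[AABB], ids: List[int], leaf_size: int = 2) -> Node:
    """Recursive median split along the longest axis; ids must be non-empty."""
    bounds = boxes[ids[0]]
    for i in ids[1:]:
        bounds = bounds.union(boxes[i])
    if len(ids) <= leaf_size:
        return Node(bounds, prims=list(ids))
    extent = [h - l for l, h in zip(bounds.lo, bounds.hi)]
    axis = extent.index(max(extent))
    ordered = sorted(ids, key=lambda i: boxes[i].centroid(axis))
    mid = len(ordered) // 2
    return Node(bounds,
                left=build(boxes, ordered[:mid], leaf_size),
                right=build(boxes, ordered[mid:], leaf_size))


def query(node: Node, boxes: List[AABB], origin: Vec, direction: Vec) -> List[int]:
    """Indices of primitive boxes the ray touches; misses prune whole subtrees."""
    if not node.box.hit(origin, direction):
        return []
    if node.prims is not None:
        return [i for i in node.prims if boxes[i].hit(origin, direction)]
    return (query(node.left, boxes, origin, direction)
            + query(node.right, boxes, origin, direction))


# Four unit cubes spaced along the x axis.
boxes = [AABB((x, 0.0, 0.0), (x + 1.0, 1.0, 1.0)) for x in (0.0, 3.0, 6.0, 9.0)]
root = build(boxes, list(range(len(boxes))))

along_x = query(root, boxes, (-1.0, 0.5, 0.5), (1.0, 0.0, 0.0))
along_y = query(root, boxes, (3.5, -1.0, 0.5), (0.0, 1.0, 0.0))
```

    The tree here is built at load time for brevity; the point of shipping it with the assets is that the same build step can run offline and the result can be serialized in whatever format suits the engine.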
     
    #3528 DmitryKo, Nov 11, 2018
    Last edited: Nov 12, 2018
    OCASM, mrcorbo, BRiT and 2 others like this.
  9. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    967
    Likes Received:
    1,223
    Location:
    55°38′33″ N, 37°28′37″ E
    It requires extensions to the instruction decoder and micro-op scheduler.

    Superscalar ALUs are built to have several blocks operating in parallel - such as memory access, register file access, integer operations, floating-point operations, whatever. Each complex 'macro' command - i.e. x86/x64 instruction - is implemented with a sequence of proprietary VLIW microcode which runs very simple operations for each block, the micro-ops.

    Thus it is possible to implement several front-ends in each ALU (i.e. several instruction decoders and schedulers) which share execution blocks (the back-end), and present these front-ends as separate cores. But this will only work when the threads have orthogonal workloads which consume different blocks, so that re-ordering and parallelization of micro-ops is possible. Thermal restrictions would also apply, as modern CPUs are very actively restricting the workloads to remain within their specified TDP level.

    I don't think so. The "simple" ALU only supports a subset of the "full" instruction set - but the register width is fixed and FP32 most likely works by chaining two FP16 blocks in the ALU.
     
    #3529 DmitryKo, Nov 11, 2018
    Last edited: Nov 12, 2018
  10. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    967
    Likes Received:
    1,223
    Location:
    55°38′33″ N, 37°28′37″ E
    These could have 65/70 or 130/140 shader blocks per each CU, for a total of 2600/2800 or 5200/5600 shader processors - if these are indeed 'simple' cores, you can put more of them on the same area...
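
    The totals above work out if the chip has 40 CUs. That count is inferred from the arithmetic here (40CU is also the figure quoted for Navi 12 later in the thread); it is not stated in this post. A quick check:

```python
# Assumed CU count, inferred from the quoted totals rather than stated outright.
ASSUMED_CUS = 40


def total_shader_processors(blocks_per_cu, cus=ASSUMED_CUS):
    """Total shader processors for a given number of shader blocks per CU."""
    return blocks_per_cu * cus


totals = {per_cu: total_shader_processors(per_cu) for per_cu in (65, 70, 130, 140)}
for per_cu, total in totals.items():
    print(f"{per_cu:>3} per CU x {ASSUMED_CUS} CUs = {total}")
```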

    I'm not sure how you would derive the above from the actual citation:

    "The Navi 12 is not going to be the GPU that gets featured in PS5, its a derivative of the actual Navi die and has been created specifically so AMD can get it to market for the PC audience primarily."

    Yes, in comparison to NES Classic/SNES Classic, the Switch is a full featured console... a lot of innovation here, almost like the Sony PlayStation® of 1995 :twisted:
     
    #3530 DmitryKo, Nov 11, 2018
    Last edited: Nov 12, 2018
    iroboto and vipa899 like this.
  11. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    19,426
    Likes Received:
    10,320
    Demand is only one part of the equation.

    The other, and arguably far more important part of the equation is how much does it cost in terms of silicon real estate to enable performant RT that will blow people away?

    Right now, with NV's RT hardware the lower bound is a 2070 that offers performance impacted RT that some people find impressive and others less so. It's certainly doing RT faster than non-RT hardware, but is it doing it fast enough for games at a quality level that is an overall improvement in the game, versus just an improvement in specific areas of a game at the expense of other areas of a game?

    In 2 years, will acceptable quality and performance for games be available in a hardware component (not even a console, just a component of a console) for under 500 USD? Again, just for the hardware component, not even talking about a full blown console at this point.

    And additionally will it be flexible enough that developers can adapt and use it in their games, regardless of the types of games they are making?

    I'd argue that it isn't very likely. At some point in the future we all hope RT will be viable in a console, but next generation is not likely to be it for reasons of cost versus performance versus quality versus flexibility.

    More than happy to be proven wrong, of course. :) But I just don't see it, at this point in time.

    Regards,
    SB
     
    AstuteCobra, milk and DmitryKo like this.
  12. vipa899

    Regular

    Joined:
    Mar 31, 2017
    Messages:
    922
    Likes Received:
    354
    Location:
    Sweden
    Maybe the current console generation should last longer than 2020, maybe to 2022?
     
  13. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,245
    Likes Received:
    4,465
    Location:
    Finland
    This quote makes absolutely no sense whatsoever.
     
  14. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    967
    Likes Received:
    1,223
    Location:
    55°38′33″ N, 37°28′37″ E
    Indeed, but then you can imply anything out of it!
     
  15. ultragpu

    Banned

    Joined:
    Apr 21, 2004
    Messages:
    6,242
    Likes Received:
    2,306
    Location:
    Australia
    It does all sound like PS5's custom Navi is based on KUMA of the new uArch, so no more limitation to the 64cu count of GCN which is good. Does this not change the expectation of the reasonable teraflops we might end up getting? 14-15tf used to be the higher end but with KUMA this could be well within reach I hope.
     
  16. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,834
    Likes Received:
    18,634
    Location:
    The North
    I don’t think it would change things. If the goal is to increase compute power by going wide, everything else in the pipeline needs to be equally increased or you’ll hit bottlenecks elsewhere.

    If the goal is to increase compute power using clockspeed, the entire pipeline moves together but the amount of power and heat goes up.

    There will be an optimum ratio of CU to clockspeed to generate maximum power for the least amount of silicon and power but a 64CU limitation was not likely to be a factor there.
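
    For a sense of the numbers, here is a sketch of the wide-versus-fast trade-off. The throughput formula is the usual GCN-style peak figure (64 shader processors per CU, 2 FLOPs per clock for a fused multiply-add). The relative power function is a crude assumption (power proportional to CU count times clock cubed), and the two configurations are hypothetical, not leaked specs.

```python
LANES_PER_CU = 64         # shader processors per CU on GCN-style parts
FLOPS_PER_LANE_CLOCK = 2  # one fused multiply-add counts as 2 FLOPs


def peak_tflops(cus, clock_ghz):
    """Peak FP32 throughput in TFLOPS."""
    return cus * LANES_PER_CU * FLOPS_PER_LANE_CLOCK * clock_ghz / 1000.0


def relative_power(cus, clock_ghz):
    """Unitless power estimate, assuming power ~ CUs * clock**3 (illustrative)."""
    return cus * clock_ghz ** 3


narrow_fast = (64, 1.80)  # hypothetical: fewer CUs, higher clock
wide_slow = (80, 1.44)    # hypothetical: more CUs, lower clock

for name, (cus, ghz) in (("64 CU @ 1.80 GHz", narrow_fast),
                         ("80 CU @ 1.44 GHz", wide_slow)):
    print(f"{name}: {peak_tflops(cus, ghz):.2f} TF, "
          f"relative power {relative_power(cus, ghz):.0f}")
```

    Both configurations land on the same peak throughput; under this model the wide one draws less power but spends more silicon, and it only delivers if the rest of the pipeline is scaled up to feed the extra CUs.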
     
    vipa899 likes this.
  17. ultragpu

    Banned

    Joined:
    Apr 21, 2004
    Messages:
    6,242
    Likes Received:
    2,306
    Location:
    Australia
    But isn't this where 7nm comes in presumably? I'm hoping the efficiency of the new process node would negate most of that, of course to a reasonable degree. Also we really don't know what's in the pipeline exactly do we? Would 24gig GDDR6 be ample already?
     
  18. msia2k75

    Regular

    Joined:
    Jul 26, 2005
    Messages:
    326
    Likes Received:
    29
    It doesn't sound anything like that... We have no idea (and especially wccftech doesn't) what Navi will be, nor whether the 64cu limitation will still be there.
     
    vipa899 likes this.
  19. ultragpu

    Banned

    Joined:
    Apr 21, 2004
    Messages:
    6,242
    Likes Received:
    2,306
    Location:
    Australia
    Well it sure as hell ain't gonna be Navi 12 which is only 40CU, so what else is out there? Navi 20 custom it is.
     
  20. vipa899

    Regular

    Joined:
    Mar 31, 2017
    Messages:
    922
    Likes Received:
    354
    Location:
    Sweden
    For a dreamer it may sound like that. In truth we don't know anything; if history repeats itself there will be some midrange hardware for the time in there.
     