AMD: Navi Speculation, Rumours and Discussion [2019]

Discussion in 'Architecture and Products' started by Kaotik, Jan 2, 2019.

  1. xEx

    Regular Newcomer

    Joined:
    Feb 2, 2012
    Messages:
    934
    Likes Received:
    395
    It is; they are obviously trying to fool people into thinking that the Xbox hardware, and what enables its RT, is Nvidia's. AMD should respond.
     
    Cuthalu likes this.
  2. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,702
    Likes Received:
    2,430
    #RTXOn is simply a contest to win prizes by leaving comments on social media about which game the user wants to see RTX integrated into. It has nothing to do with Xbox or PlayStation or whatever. It's a running theme through all E3 content regardless of vendor.
     
    A1xLLcqAgt0qc2RyMz0y, pharma and BRiT like this.
  3. snarfbot

    Regular Newcomer

    Joined:
    Apr 23, 2007
    Messages:
    509
    Likes Received:
    177
    I dunno, look at Pascal ray tracing: it doesn't have any dedicated hardware at all and runs it acceptably, imo. At least for the methods implemented in actual games, like Tomb Raider or Battlefield.

    Also, from what I've gathered, its async compute performance is what holds it back in demanding scenes.

    AMD's async compute is much better, so it should perform better, right?

    I think it would be a mistake not to support it in some form; even software would be sufficient. Sure, they would get destroyed in benchmarks, but they're getting beaten pretty badly anyway, so why not.
     
  4. xEx

    Regular Newcomer

    Joined:
    Feb 2, 2012
    Messages:
    934
    Likes Received:
    395


    0.25 a first view.
     
    sonen likes this.
  5. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,098
    Likes Received:
    2,814
    Location:
    Well within 3d
    Nvidia's way of handling compute and graphics in parallel had issues in past generations. Nvidia's expectations seemed to have the two types more separate, or there was a conflict in how it tracked the two that forced broader stalls and context switches at first, and also less flexibility in allocating SMs to one type or the other. That improved with each generation, such that by Pascal I'd say feature-wise it was in the ballpark of that of some GCN implementations, though some of the later GCN additions might not have directly corresponding features.
    The differences are less stark with modern hardware, and the benefits of GCN's implementation were sometimes debatable because its graphics context processing could bottleneck frame times to the point where an Nvidia GPU could get close or do better running the compute synchronously.

    There are some possible caveats with ray-tracing, at least going by initial Turing implementations. Ray-tracing on an SM seemed to place a barrier that restricted it from also launching compute, though it's not clear if that was a long-term architectural or implementation limit or a case of teething issues.

    While I won't try to interpret the nature of the various rectangles too much, at a glance it seems like this GPU does move more elements into the center of the die, with the shader core section ringing what might be the command processors and part of the cache hierarchy.
    While not a perfect match to other layouts, this may mean the L2, or whatever the upper-level cache is, came in from the sides of the die. That, and the way the compute section appears to have its blocks flip orientation twice as often as prior GCN arrays, without the same symmetry of clusters across all of the new mid-lines, is another difference. (Not quite what Nvidia does, but it's closer than before.)
    How the hardware for compute and graphics is distributed, or what all of the area in the middle strip and around the center portends, should be interesting to hear more on.
     
  6. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,098
    Likes Received:
    2,814
    Location:
    Well within 3d
    If the video is supposed to be of a 40 CU GPU, the arrangement has 20 blocks in the shader array section. Not sure which blocks to interpret as the front end or the L/S and cache blocks. Since the LLVM changes mention a mode where a workgroup can span two CUs, it's possible that a pair of CUs sharing a front end, and possibly other hardware, is what was called a workgroup processor.
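
    The pairing guess above is just a division, sketched here in Python. The function name is made up for illustration, and the even split of CUs across the visible blocks is an assumption, not something the die shot confirms.

```python
def cus_per_block(total_cus: int, visible_blocks: int) -> int:
    """Return how many CUs each block would hold if CUs split evenly.

    Assumption: every visible block in the shader array holds the same
    number of CUs. Raises ValueError if that cannot be true.
    """
    if visible_blocks <= 0:
        raise ValueError("visible_blocks must be positive")
    if total_cus % visible_blocks != 0:
        raise ValueError("CUs do not divide evenly across the blocks")
    return total_cus // visible_blocks


# Figures from the post: a 40 CU GPU drawn as 20 blocks.
print(cus_per_block(40, 20))
```

    With the post's figures this gives two CUs per block, which is what would fit the "workgroup processor" reading.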
     
    AlBran likes this.
  7. yuri

    Newcomer

    Joined:
    Jun 2, 2010
    Messages:
    175
    Likes Received:
    146
  8. Arnold Beckenbauer

    Veteran

    Joined:
    Oct 11, 2006
    Messages:
    1,408
    Likes Received:
    347
    Location:
    Germany
    More pins - more power!
     
    digitalwanderer likes this.
  9. Pressure

    Veteran Regular

    Joined:
    Mar 30, 2004
    Messages:
    1,317
    Likes Received:
    242
    So they keep the tradition alive by clocking it out of the sweet spot. VEGA 10 was pretty good at only 1.0V instead of 1.2V.
     
    A1xLLcqAgt0qc2RyMz0y likes this.
  10. anexanhume

    Veteran Regular

    Joined:
    Dec 5, 2011
    Messages:
    1,470
    Likes Received:
    651
    Rough math says they're now at 80-90% of Nvidia in how well FLOPs translate into game performance. Much better than Vega.
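
    For anyone wanting to redo that kind of rough math, here is a minimal Python sketch. It assumes the usual peak-FP32 formula (lanes x 2 FLOPs per FMA x clock) and GCN-style 64 lanes per CU. Every number in the example calls is a placeholder, not a measured result or a confirmed spec.

```python
def peak_fp32_tflops(cus: int, lanes_per_cu: int, clock_ghz: float) -> float:
    """Peak FP32 throughput in TFLOPS, counting one FMA as 2 FLOPs."""
    flops_per_lane_per_clock = 2
    gflops = cus * lanes_per_cu * flops_per_lane_per_clock * clock_ghz
    return gflops / 1000.0


def perf_per_flop_ratio(perf_a: float, tflops_a: float,
                        perf_b: float, tflops_b: float) -> float:
    """How much game performance card A gets per TFLOP, relative to card B.

    perf_a and perf_b can be any like-for-like score (average fps, a
    relative index); only their ratio matters.
    """
    return (perf_a / tflops_a) / (perf_b / tflops_b)


# Placeholder inputs only: a 40 CU part at a made-up 1.9 GHz clock.
print(round(peak_fp32_tflops(40, 64, 1.9), 3))

# Placeholder inputs only: equal scores, card A needing more TFLOPS.
print(round(perf_per_flop_ratio(100.0, 10.0, 100.0, 8.5), 2))
```

    A ratio below 1.0 means card A needs more peak FLOPs than card B for the same result.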

    edit: This seems new.

    https://videocardz.com/81012/amd-radeon-rx-5700-xt-and-radeon-rx-5700-final-specs

     
    #751 anexanhume, Jun 10, 2019
    Last edited: Jun 10, 2019
  11. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    So this is what I think we might have:

    [IMG]

    In summary, I think the RF and LDS are shared by two compute units, each with 2x SIMD-32, SALU and TMU.

    Each "quarter" is two shader engines, with 2 sets of 2xROP-4s, 64 ROPs in total.
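
    The ROP total can be checked with a quick multiplication, sketched in Python below. It assumes the die is read as four quarters (implied by the wording, not stated) and that "ROP-4" means a block of four ROPs. The constant names are made up for illustration.

```python
QUARTERS = 4           # assumption: the die is read as four "quarters"
SETS_PER_QUARTER = 2   # "2 sets" in each quarter
BLOCKS_PER_SET = 2     # "2xROP-4" in each set
ROPS_PER_BLOCK = 4     # assumption: a ROP-4 block holds four ROPs


def total_rops(quarters: int = QUARTERS,
               sets_per_quarter: int = SETS_PER_QUARTER,
               blocks_per_set: int = BLOCKS_PER_SET,
               rops_per_block: int = ROPS_PER_BLOCK) -> int:
    """Multiply out the block hierarchy to get the ROP count."""
    return quarters * sets_per_quarter * blocks_per_set * rops_per_block


print(total_rops())  # 64
```

    Counting only one set per quarter would give 32, so the total depends on how many ROP blocks are actually visible in the picture.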
     
    Lightman, Newguy, BRiT and 6 others like this.
  12. Triskaine

    Newcomer

    Joined:
    Mar 28, 2010
    Messages:
    59
    Likes Received:
    56
    Okay, time for a slight revision: AMD's next Mid-Range GPU is a 225 Watt Mini-Housefire with a blower fan. Going by AMD's SOP, it's also overvolted as hell to keep the yield up. In terms of process-normalized performance per watt, it's still behind Turing by ~50%. With the HBM joker, AMD can get to something High-End'ish next year, maybe around the 2080 Ti, but overall nothing that in any way threatens nVidia's dominant position. I'll probably buy one anyway, because I'm a sucker for housefire silicon.
     
  13. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    https://videocardz.com/81012/amd-radeon-rx-5700-xt-and-radeon-rx-5700-final-specs

    with typos fixed:
    This refers to SIMD32s :)

    The scalar ALUs have been really beefed up.
     
  14. Per Lindstrom

    Newcomer Subscriber

    Joined:
    Oct 16, 2018
    Messages:
    18
    Likes Received:
    13
    Interesting! What about the central area?
     
  15. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,011
    Likes Received:
    110
    Does that mean wavefront size can be either 32 or 64, or do I misunderstand this...
    Oh, and apparently full-speed fp16 texture filtering, catching up with Nvidia (since Fermi, IIRC) there...
     
  16. Per Lindstrom

    Newcomer Subscriber

    Joined:
    Oct 16, 2018
    Messages:
    18
    Likes Received:
    13
    I only see 8 * 4 ROPs; what am I missing?
     
  17. anexanhume

    Veteran Regular

    Joined:
    Dec 5, 2011
    Messages:
    1,470
    Likes Received:
    651
    I think so. AMD had a patent on variable wavefront sizing.
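
    To make the 32-vs-64 question concrete, here is a small Python sketch of how many passes a wavefront would need on a SIMD of a given width. It assumes a wavefront wider than the SIMD is simply issued over several passes, which is how GCN runs wave64 on its SIMD16 units. Whether Navi treats wave64 the same way on a SIMD32 is a guess, and the function name is made up.

```python
def issue_passes(wave_size: int, simd_width: int) -> int:
    """Number of passes needed to push one wavefront through one SIMD.

    Assumption: the wavefront is split into simd_width-sized slices and
    each slice takes one pass (ceiling division).
    """
    if wave_size <= 0 or simd_width <= 0:
        raise ValueError("sizes must be positive")
    return -(-wave_size // simd_width)


print(issue_passes(64, 16))  # GCN: wave64 on a SIMD16
print(issue_passes(32, 32))  # guess for Navi: wave32 on a SIMD32
print(issue_passes(64, 32))  # guess for Navi: wave64 on a SIMD32
```

    Under that assumption, a wave32 on a SIMD32 would go through in a single pass, where GCN needs four for a wave64.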
     
  18. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    Stuff! Too coarse-grained to say much about those blocks.

    Fascinating that work can be issued in 32-work item hardware threads. I was expecting 64 and 128...

    I've added extra yellow blocks to the picture. Hopefully the picture will update soon to show them.

    I also added L1, just for the sake of it.
     
    BRiT and Per Lindstrom like this.
  19. Nemo

    Newcomer

    Joined:
    Sep 15, 2012
    Messages:
    125
    Likes Received:
    23
    So Navi has 512 KB L1? :shock:
    Vega 10 has 16 KB L1.
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.