AMD: Navi Speculation, Rumours and Discussion [2019]

Discussion in 'Architecture and Products' started by Kaotik, Jan 2, 2019.

  1. tEd

    tEd Casual Member
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,104
    Likes Received:
    70
    Location:
    switzerland
    Those 225 W are TBP, not TDP, according to AMD: https://www.amd.com/en/products/graphics/amd-radeon-rx-5700-xt
    (Scroll all the way down for the specs)
     
  2. yuri

    Newcomer

    Joined:
    Jun 2, 2010
    Messages:
    246
    Likes Received:
    230
    Finally, clocks which make sense!

    Navi21 XT
    • Base clock 1450-1500MHz
    • Game clock 2000-2100MHz
    • Boost clock 2200-2400MHz
    Lower base clock than Navi10 but higher boost. It seems RDNA2 (design + DVFS) makes it possible to scale the frequency to fit the load much better.

     
  3. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,100
    Likes Received:
    1,186
    Location:
    London
    Theory: the large range in clocks hints that big Navi is 160 CUs, not 80.
     
    PSman1700 likes this.
  4. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    11,477
    Likes Received:
    6,216
    There are really no rumors or insiders suggesting 160 CUs.

    Plus, 4x Navi10's CU count at 2.1 GHz wouldn't merely compete with a 3080/3090, it'd be considerably faster.
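    Quick back-of-envelope on that (a sketch only: it assumes 64 FP32 lanes per RDNA CU at 2 FLOP/clock and NVIDIA's official boost clocks, whereas real boost behaviour differs):

    ```python
    # Back-of-envelope FP32 throughput. Assumptions: 64 FP32 lanes per RDNA CU,
    # 2 FLOP/clock per lane (FMA), and NVIDIA's official boost clocks for Ampere.
    def tflops(fp32_lanes, clock_ghz, flop_per_clock=2):
        return fp32_lanes * flop_per_clock * clock_ghz / 1000.0

    print(f"160 CU @ 2.1 GHz: {tflops(160 * 64, 2.1):.1f} TF")          # ~43.0
    print(f"RTX 3080 (8704 @ 1.71 GHz): {tflops(8704, 1.71):.1f} TF")   # ~29.8
    print(f"RTX 3090 (10496 @ 1.695 GHz): {tflops(10496, 1.695):.1f} TF")  # ~35.6
    ```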
     
  5. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,847
    Likes Received:
    1,044
    Location:
    New York
    160 CUs or more SIMDs per CU? Both would be monstrous but I don’t see how the former is possible unless they’ve hit theoretical peak density numbers on 7nm.
     
    Lightman likes this.
  6. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    1,795
    Likes Received:
    713
    Location:
    msk.ru/spb.ru
    How much power would a 160 CU chip require?
     
  7. A1xLLcqAgt0qc2RyMz0y

    Veteran Regular

    Joined:
    Feb 6, 2010
    Messages:
    1,528
    Likes Received:
    1,331
    160 CUs, 2.4 GHz clocks.

    Silly season is in full force.
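    To put a number on DegustatoR's power question above, a toy estimate (purely illustrative: it scales the 5700 XT's whole-board 225 W by CU count and a rough f·V² factor, and only credits the +50% perf/W AMD has claimed for RDNA2):

    ```python
    # Toy power scaling only: P ~ N_CU * f * V^2, with V assumed to rise roughly
    # linearly with f near the top of the DVFS curve, so roughly P ~ N_CU * f^3.
    # Baseline is the whole-board 225 W of the RX 5700 XT (40 CUs, ~1.9 GHz boost),
    # which overstates the scalable part, but the order of magnitude is the point.
    BASE_CU, BASE_F_GHZ, BASE_W = 40, 1.9, 225.0

    def est_power(n_cu, f_ghz, perf_per_watt_gain=1.0):
        return BASE_W * (n_cu / BASE_CU) * (f_ghz / BASE_F_GHZ) ** 3 / perf_per_watt_gain

    print(f"{est_power(160, 2.4):.0f} W naive")                        # ~1810 W
    print(f"{est_power(160, 2.4, 1.5):.0f} W with +50% perf/W")        # ~1210 W
    ```

    Even with generous assumptions it lands far beyond anything a consumer board dissipates, which is presumably the point.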
     
  8. PSman1700

    Veteran Newcomer

    Joined:
    Mar 22, 2019
    Messages:
    3,445
    Likes Received:
    1,364
    Nothing compared to the RAMdisk, 24 GB HBM3, Zen 3 and Navi 3 (>13.1 TF) rumors for a 2020 console at 400/500 bucks. People will speculate in a speculation thread ;)
     
  9. Rootax

    Veteran Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    1,710
    Likes Received:
    1,072
    Location:
    France
    I'm really curious how AMD will market RDNA2 on PC. If rasterisation performance is on par with the 3080 but RT is two steps behind, do you sell it at the same price as the 3080? How do you communicate about RT in that situation? In 2021, I don't see myself buying a card that doesn't let me play with at least a little RT (Cyberpunk 2077 and the Witcher 3 remaster are the two main upcoming games for me).

    Or, a good scenario would be: yes, RT is not at Ampere level, but it sits somewhere between Turing and Ampere?
     
    Lightman likes this.
  10. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,847
    Likes Received:
    1,044
    Location:
    New York
    A good scenario would be RT that kicks Ampere’s ass.
     
    Lightman, Rootax and DegustatoR like this.
  11. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,100
    Likes Received:
    1,186
    Location:
    London
    True. AMD's counter-espionage has hit truly spectacular levels though. NVidia has done much the same, though it seems AMD has won this round.

    My die analyses of Navi variants are trivial things for NVidia to have done too (and NVidia has had a very long time to do them). I believe this is why NVidia has marketed "value" so heavily, because it doesn't take a rocket scientist to see that GPU compute is uber-cheap now.

    Honestly, I'm gobsmacked by the weak compute of the two consoles (16% of the XSX die is for CUs). I see this as a major fail. Or, built-in extreme obsolescence. Perhaps the next console gen will consist of 2 refreshes?

    Why would 6900XTX with ~41TF (2GHz) be on a completely different level from 3090 at 36TF? Only NVidia is now allowed to have huge amounts of compute?

    I've contemplated more SIMDs per CU. ALU:TEX could double, sure, but I wonder about LDS space and LDS versus VGPR/lane mechanics too. To be honest, I have no theory for or against.

    If both XSX and PS5 have RDNA 1 CUs, then yes, PC Navi could be a 4-SIMD per CU "monster", being the only "RDNA 2" GPU that has RDNA 2 CUs.

    Also, I think a patent that talks about CUs sharing L1 encourages lots of CUs per L1. One thing I haven't been able to work out is whether an L1 is per shader engine or per shader array.

    Because an L0 (and LDS) is shared by two CUs in a WGP, the patent should probably be read with WGPs in mind, not CUs. The WGP is the real unit of compute in RDNA, not a CU.

    Additionally there are vague rumours saying that RDNA 2 is real RDNA, not the GCN/RDNA hybrid seen in Navi 1x. This could be interpreted to mean that any rumours that talk about Navi 2x CUs should be re-interpreted with WGP replacing CU.

    I don't understand what you mean by theoretical peak density numbers and why hitting them is relevant.

    A Navi 10, 14 or XSX CU is ~2mm². A 5xxmm² die with ~150mm² of "cache" doesn't make sense to me. No matter how exciting the idea of a solid 128MB lump of last level cache, I can't take it seriously. I believe that a cache is a cache precisely because it's a small, efficient, block of memory.
    We've seen a die that's considerably larger than 500mm². So take your pick:
    1. massive last level cache
    2. lots of CUs, about the same FLOPS as GA102
    3. HBM 4096-bit bus plus 512-bit GDDR6 bus
    4. some combination of these
    Navi 21 scaled up from Navi 10 with 80 CUs and 4 shader engines is only about 360mm².
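    As a sketch of where a number like that comes from (taking the ~2mm² per CU figure above and Navi 10's ~251mm²; the 20% uncore growth is a pure guess):

    ```python
    # Rough area scaling for a hypothetical 80 CU / 4 SE Navi 21, using the
    # ~2 mm^2-per-CU figure above and Navi 10's ~251 mm^2; the 20% uncore growth
    # (doubled shader-engine front ends, wider L2, etc.) is a pure guess.
    NAVI10_MM2, NAVI10_CU, CU_MM2 = 251.0, 40, 2.0

    uncore = NAVI10_MM2 - NAVI10_CU * CU_MM2            # ~171 mm^2 of non-CU area
    navi21_est = NAVI10_MM2 + 40 * CU_MM2 + 0.2 * uncore
    print(f"~{navi21_est:.0f} mm^2")                    # ~365 mm^2, near the ~360 above
    ```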

    I would expect the full die to run at lower clocks.
     
    #3771 Jawed, Oct 18, 2020 at 3:04 PM
    Last edited: Oct 18, 2020 at 3:42 PM
    entity279, Lightman and PSman1700 like this.
  12. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    17,275
    Likes Received:
    17,678
    I think it would be 160 HCUs. Not CUs. Not DCUs. ...
     
  13. xEx

    xEx
    Regular Newcomer

    Joined:
    Feb 2, 2012
    Messages:
    962
    Likes Received:
    426
    I was referring to the leak that said the 225W was the TGP. IIRC the 5700 XT has a 225W TGP with a 180W TDP.

    Tbh, if we believed all these leaks we would expect Navi to have a fully functional AI chip that can upscale images from 240p to 4K at 99% quality while consuming 150W...

    I will stay a little skeptical until I see the final product. I don't care about the top of the line, so for me price/performance in the mid-range is more important, and I know AMD will at least have that.

    I don't think AMD will price Navi based on RT. They don't have the "brand" to do so, it will be slower than NV, and only a handful of people care that much about it. So far AMD has marketed its RT implementation as "functional", i.e. not losing 50% of performance when enabling it, so I expect their marketing will go in that direction: something like "less brute force, but more intelligent and useful". We will see. I am also waiting for CP2077 with RT.
     
  14. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,847
    Likes Received:
    1,044
    Location:
    New York
    Btw, I don't think we've seen any data that shows Ampere's RT to be any better than Turing's. The relative perf hit with RT on vs. off is about the same.
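    To make the metric explicit, with made-up numbers: a GPU that is uniformly faster lifts RT-on framerates without the relative hit changing, so absolute fps alone can't show better RT units.

    ```python
    # Made-up numbers, just to make the metric explicit: a uniformly faster GPU
    # raises RT-on fps, but the *relative* cost of enabling RT stays the same.
    def relative_hit(fps_rt_off, fps_rt_on):
        return 1.0 - fps_rt_on / fps_rt_off

    print(relative_hit(100, 60))   # hypothetical Turing card: 40% hit
    print(relative_hit(150, 90))   # 1.5x faster everywhere: still a 40% hit
    ```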
     
  15. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,331
    Likes Received:
    3,310
    Location:
    Finland
    IIRC someone calculated that a theoretical 80 CU part @ 2.x GHz (can't remember the exact number) would be clearly above anything NVIDIA has in ray-box tests, but slower than the 3080 (or was it the 2080 Ti? 2080?) in ray-triangle tests.
     
    Lightman likes this.
  16. andermans

    Joined:
    Sep 11, 2020
    Messages:
    6
    Likes Received:
    6
    I think for raytracing the reality is that performance isn't going to depend on the raytracing hardware itself at all: the RT HW on both sides is probably going to be fast enough that the memory/cache hierarchy and divergence become the bottleneck. Those see stark differences between raytracing and "traditional" workloads, in ways that are much harder to optimize for.

    E.g. walking the BVH for raytracing is going to be way more latency-sensitive than most GPU workloads we've seen, and both the tree walk and the rays scattering around are horrible for shader divergence.
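    A minimal sketch of why that is (a generic stack-based BVH walk, not any vendor's implementation): every iteration needs the previous node fetch to complete before the next one can start, and which nodes each ray visits is data-dependent, so lanes tracing different rays stop agreeing on control flow.

    ```python
    # Minimal, self-contained sketch (not any vendor's implementation) of a
    # stack-based BVH walk over axis-aligned boxes. Two things fall out of the
    # structure: every iteration is a dependent node fetch (latency bound), and
    # the set of nodes visited is data-dependent per ray (SIMD lanes diverge).
    from dataclasses import dataclass

    @dataclass
    class Node:
        lo: tuple                     # AABB min corner
        hi: tuple                     # AABB max corner
        children: tuple = ()          # indices of child nodes; empty means leaf

    def hits_aabb(origin, inv_dir, lo, hi):
        """Slab test: does the ray (origin, 1/direction) intersect the box?"""
        tmin, tmax = 0.0, float("inf")
        for o, d, l, h in zip(origin, inv_dir, lo, hi):
            t0, t1 = (l - o) * d, (h - o) * d
            tmin, tmax = max(tmin, min(t0, t1)), min(tmax, max(t0, t1))
        return tmin <= tmax

    def walk(nodes, origin, direction):
        inv_dir = tuple(1.0 / d for d in direction)
        stack, visited = [0], 0
        while stack:                              # trip count differs per ray
            node = nodes[stack.pop()]             # dependent fetch: next load waits on this one
            visited += 1
            if hits_aabb(origin, inv_dir, node.lo, node.hi):
                stack.extend(node.children)       # which subtrees survive depends on the ray
        return visited

    # Two rays against the same tiny tree visit different numbers of nodes.
    tree = [Node((0, 0, 0), (4, 4, 4), (1, 2)),   # root
            Node((0, 0, 0), (2, 4, 4)),           # left leaf
            Node((2, 0, 0), (4, 4, 4))]           # right leaf
    print(walk(tree, (0.5, 0.5, -1.0), (0.1, 0.1, 1.0)),   # hits: visits 3 nodes
          walk(tree, (9.0, 9.0, -1.0), (0.1, 0.1, 1.0)))   # misses the root: visits 1
    ```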
     
  17. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,847
    Likes Received:
    1,044
    Location:
    New York
    How would you even begin to calculate that? AMD's patent suggests each intersection engine can do 4 boxes or 1 triangle per clock. Even if you assume that's true, what numbers would you use for Turing and Ampere?
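    Taking the patent's figure at face value and assuming one intersection engine per CU (an assumption), the AMD side is easy enough to write down; the sticking point is exactly this: there's no published per-clock figure to plug in for Turing or Ampere.

    ```python
    # AMD side only, taking the patent's 4 box / 1 triangle tests per clock per
    # intersection engine at face value and assuming one engine per CU (an
    # assumption). There is no comparable published per-clock number for Turing
    # or Ampere RT cores to put on the other side of the comparison.
    def rt_rates_gps(n_cu, clock_ghz, box_per_clk=4, tri_per_clk=1):
        return n_cu * box_per_clk * clock_ghz, n_cu * tri_per_clk * clock_ghz

    boxes, tris = rt_rates_gps(80, 2.2)
    print(f"80 CU @ 2.2 GHz: {boxes:.0f} G ray-box/s, {tris:.0f} G ray-triangle/s")
    # -> 704 G ray-box/s, 176 G ray-triangle/s
    ```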
     
  18. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,331
    Likes Received:
    3,310
    Location:
    Finland
    Couldn't find the image. I think it was explained at the time where they got their numbers for the GeForces. I'll try to dig it up later today if I have the time, unless someone else has it at hand (I'm pretty sure it was in this particular thread).
     
  19. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,847
    Likes Received:
    1,044
    Location:
    New York
    If it's the image I think you're talking about, I thought that was made-up nonsense. Nvidia hasn't shared any details about its RT units.
     
  20. yuri

    Newcomer

    Joined:
    Jun 2, 2010
    Messages:
    246
    Likes Received:
    230
    RT perf is irrelevant. AMD simply has to be cheaper, considering their weaker brand recognition and, mainly, their lacking SW: the utterly horrid last-gen driver experience, possibly broken OpenCL, lack of DLSS and CUDA, "driver utils" like Ansel, etc.
     