AMD: RDNA 3 Speculation, Rumours and Discussion

Discussion in 'Architecture and Products' started by Jawed, Oct 28, 2020.

  1. Leoneazzurro5

    Regular

    Joined:
    Aug 18, 2020
    Messages:
    335
    Likes Received:
    348
    The answer to that tweet referred to Nvidia first, then to all next-gen. And IIRC no one said Q1/22 for RDNA3. I'd say everyone pointed to 3rd/4th quarter 2022 since the beginning.
     
  2. LordEC911

    Regular

    Joined:
    Nov 25, 2007
    Messages:
    877
    Likes Received:
    208
    Location:
    'Zona
    The only talk of Q4 '21 or Q1 '22 I can remember is a few months back when the regular clickbait rumor sites were posting about tapeouts.

    Edit- Example from wccftech
     
    Lightman likes this.
  3. Ferman

    Joined:
    Sep 30, 2018
    Messages:
    6
    Likes Received:
    1
    With so many not able to buy current generation GPUs, it is going to be very hard to release a refreshed lineup. Especially if it will be just as hard to buy the new GPUs.
     
  4. Leoneazzurro5

    Regular

    Joined:
    Aug 18, 2020
    Messages:
    335
    Likes Received:
    348


    Basically another confirmation of the specifications known so far
     
    Jawed and pjbliverpool like this.
  5. xpea

    Regular

    Joined:
    Jun 4, 2013
    Messages:
    551
    Likes Received:
    783
    Location:
    EU-China
    n33 6nm 128bit gddr6 perf>6900xt
    With 128 bit bus thus half bandwidth? Hard to believe...
     
  6. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    Yes
    The magick of gigacache!
    Granted not much more perf than 6900XT, but the wattage is also way way lower.
     
    Man from Atlantis and Lightman like this.
  7. Leoneazzurro5

    Regular

    Joined:
    Aug 18, 2020
    Messages:
    335
    Likes Received:
    348
    So in the same discussion the leaker says that the low end cards may appear as a N2X 6nm refresh, due in 2023. If this is true, I'd say that will include only the smallest cards.
     
  8. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    No such thing.
     
  9. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,708
    Likes Received:
    2,132
    Location:
    London
    Would Navi 33 need 256MB of Infinity Cache to exceed 6900XT and compensate for half the GDDR bus width? I don't think so. 128MB, the same as Navi 21, merely clocked higher should be dominant in terms of overall memory efficiency. And if we assume this GPU is aimed at 1440p gaming (instead of 4K) then we could say that this amount of Infinity Cache is comfortable at this performance level.

    6600XT is essentially the same performance as 5700XT. The smaller bus (50%) and die (94%) along with 7% more transistors gives this performance despite using 71% of the power. Infinity Cache is somewhere in the region of 2% of the transistors if we use the 6 transistor per bit rule of thumb and add a bit for supporting hardware.

    We could conclude that the narrower memory bus is allowing Navi 2x to spend more power on die for actual graphics work and that the power consumption of Infinity Cache is practically negligible, since most of those transistors are "idle" at any one time.

    But Navi 3x won't get this Infinity Cache boost. So we come back to 128MB, on its own, as determining the performance of Navi 33.

    I think it's reasonable to expect Navi 33 to be about 320-350mm². This article about TSMC 6nm:

    TSMC Reveals 6 nm Process Technology: 7 nm with Higher Transistor Density (anandtech.com)

    implies 15% more transistors. I suppose that puts it in the region of 21-23B transistors, assuming that some of those extra transistors would be available because of a reduction in GDDR PHY size from 192-bit to 128-bit (saves about 16mm²?). That's about 5B short of the transistor count of Navi 21 (6900XT).

    So the extra transistors available for GPU work, versus Navi 22 (6700XT), is about 5B. So that's not going to take Navi 33 to the equivalent of 64 CUs - only about 52. So that implies much higher clocks, e.g. 3.2GHz+.

    I do expect the ALU:TMU ratio to get doubled in Navi 3x, but that's not going to save a huge amount of transistors. 2% of an "equivalent-CU" saving?

    It seems reasonable that ALU:RA will get halved or quartered in Navi 3x, but I wouldn't be surprised if that eats up all the savings of reduced TMU count. I'm assuming RA math doesn't use TMU math units, merely that data-paths and scheduling (queuing and coalescing memory operations) are largely common. If the TMU math units have been generalised to allow them to also do ray-box and ray-triangle tests, then I guess texturing throughput will continue rising to beyond-crazy levels, just to advance ray-acceleration throughput.

    The big unknown is still the hints of a radically different WGP design. AMD's trend with RDNA has been to increase the quantity of "uncore" per WGP, so I think that would put a strong limit on the increase in "equivalent-CU" count.

    In other words I think brute clocks are going to be more significant than brute SIMD count, in getting to 6900XT performance at 350mm² or less. Extra uncore may well unlock yet better "SIMD IPC".

    We may also see that ">6900XT" actually only refers to ray-tracing performance. In pure rasterisation workloads Navi 33 might fall far short of 6900XT when ALU-limited.
     
    Lightman likes this.
  10. LordEC911

    Regular

    Joined:
    Nov 25, 2007
    Messages:
    877
    Likes Received:
    208
    Location:
    'Zona
    I find it hard to believe that they would cut down the bus on N32.


    So what is the minimum price now? $199 or $299?
    We have obviously seen the last of <$150 DGPUs that are current generation and in production.
     
  11. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    ?
    Should be like $450 or so.
    A wee bit more.
    ?
    The ALU count is identical to N21, which is 5120.
     
  12. I'm only assuming N3x is Q2 2022 because @Bondrewd claimed AMD is on a 6 quarter cadence between GPU families. I don't follow random rumors from wccftech.



    Going by AMD's famous graph of cache usage by target resolution, it does look like 256MB would fit some >75% of memory requests at 4K.

    If 75% of the requests are done at ~2TB/s, then the remaining 25% can probably come at 256GB/s because the effective bandwidth will still end at ~1.6TB/s.
     
    #852 Deleted member 13524, Sep 19, 2021
    Last edited by a moderator: Sep 19, 2021
    Lightman likes this.
  13. no-X

    Veteran

    Joined:
    May 28, 2005
    Messages:
    2,451
    Likes Received:
    471
    One thing may be the targeted cadence, the other is the real situation affected by Covid, overload of TSMC, mining, etc. That could easily cause a 4-month delay (Navi 23 was delayed by several months already).
     
  14. Then unless nvidia is getting special treatment by TSMC they're getting Lovelace pushed to 2023.
     
    Lightman likes this.
  15. Qesa

    Newcomer

    Joined:
    Feb 23, 2020
    Messages:
    57
    Likes Received:
    107
    If there's a 25% miss rate and DRAM is 256 GB/s, the effective bandwidth can be no higher than 1 TB/s, no matter how fast the bandwidth is on hits.

    If the cache is serving up 2 TB/s from hits and 0.25 TB/s from misses, by definition that's an 89% hit rate.
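    The two ways of combining hit and miss bandwidth discussed above can be compared with a minimal sketch. The function names are made up for illustration, and the model assumes that every miss has to be served from DRAM and that nothing else limits throughput.

```python
def blended_bandwidth(hit_rate, cache_bw, dram_bw):
    """Request-weighted average of the two bandwidths (the ~1.6 TB/s estimate)."""
    return hit_rate * cache_bw + (1.0 - hit_rate) * dram_bw


def dram_limited_bandwidth(hit_rate, dram_bw):
    """Upper bound on total bandwidth when every miss must come from DRAM."""
    miss_rate = 1.0 - hit_rate
    if miss_rate <= 0.0:
        return float("inf")  # no misses, so DRAM is never the limit
    return dram_bw / miss_rate


def implied_hit_rate(hit_traffic, miss_traffic):
    """Hit rate implied by a given split of traffic between cache and DRAM."""
    return hit_traffic / (hit_traffic + miss_traffic)


# Figures from the posts above, in GB/s (TB/s for the last call).
naive = blended_bandwidth(0.75, 2000.0, 256.0)
bound = dram_limited_bandwidth(0.75, 256.0)
hit = implied_hit_rate(2.0, 0.25)

print(f"weighted average: {naive:.0f} GB/s")
print(f"DRAM-limited bound: {bound:.0f} GB/s")
print(f"implied hit rate: {hit:.1%}")
```

    The weighted average reproduces the ~1.6 TB/s figure, while the DRAM-limited bound comes out at about 1 TB/s, because 256 GB/s of miss traffic can only be a quarter of the total.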
     
  16. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,055
    Likes Received:
    3,111
    Location:
    New York
    128-bit gddr6 > 6900xt. That would be some impressive voodoo if it’s true at higher resolutions.
     
    DegustatoR and Lightman like this.
  17. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    Still has the N21 limitation of sorta dies at 4k.

    Either way you're not running higher resolutions off an 8GB framebuffer.
     
    trinibwoy likes this.
  18. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,055
    Likes Received:
    3,111
    Location:
    New York
    Makes sense.
     
  19. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,708
    Likes Received:
    2,132
    Location:
    London
    Sigh, when I was writing that I felt something was wrong, but couldn't put my finger on it. Bed resolved the problem: I hadn't accounted for bytes! So 8x 2%. ARGH.
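    The bits-versus-bytes slip can be reproduced with a rough sketch. The 32MB Infinity Cache size and the 11.06B transistor count for Navi 23 (6600XT) are not stated in this thread; they are assumed here from public specifications, and the 6 transistors per bit is the rule of thumb mentioned earlier, with no allowance for supporting hardware.

```python
# Assumed inputs (not from the thread): Navi 23 public specifications.
CACHE_BYTES = 32 * 1024 * 1024      # 32MB Infinity Cache
TOTAL_TRANSISTORS = 11.06e9         # Navi 23 transistor count
TRANSISTORS_PER_BIT = 6             # 6T SRAM cell rule of thumb


def cache_fraction(cells):
    """Fraction of the die's transistors spent on `cells` SRAM cells."""
    return cells * TRANSISTORS_PER_BIT / TOTAL_TRANSISTORS


# Mistaken version: treats each byte as a single cell.
frac_bytes = cache_fraction(CACHE_BYTES)
# Corrected version: 8 cells (bits) per byte.
frac_bits = cache_fraction(CACHE_BYTES * 8)

print(f"counting bytes: {frac_bytes:.1%}")
print(f"counting bits:  {frac_bits:.1%}")
```

    Under these assumptions the cache array alone is roughly a seventh of the transistor budget rather than about 2%.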

    Well, we can subtract 32mm² off for 128-bit GDDR6 and save ~15% area to translate 520mm² on 7nm to about 424mm².

    Perhaps the "new" WGP arrangement means there's only two shader engines, not four. This would reduce the count of ROPs, e.g. to 64. That would save a fair amount of die space... I reckon 32 ROPs are about the same area as a WGP, so just 64 ROPs saves in the region of 10mm². Perhaps 32mm² total saving with only two shader engines?

    So (520-64)/1.15 takes us to 396mm² (ignoring the non-scaling of 128-bit GDDR6).
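    The two area estimates above follow the same arithmetic, sketched here. The helper name is hypothetical, and it applies the ~15% density gain uniformly to the whole die, which ignores the non-scaling of the GDDR6 PHY just as the estimate above does.

```python
def shrunk_area(area_7nm_mm2, removed_mm2, density_gain=1.15):
    """Remove some 7nm area, then scale the rest by the 6nm density gain."""
    return (area_7nm_mm2 - removed_mm2) / density_gain


# 520mm² starting point and the two savings figures from the post.
bus_only = shrunk_area(520.0, 32.0)            # narrower GDDR6 bus only
bus_and_engines = shrunk_area(520.0, 64.0)     # plus two fewer shader engines

print(f"bus saving only: {bus_only:.0f} mm²")
print(f"bus plus shader engine saving: {bus_and_engines:.1f} mm²")
```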

    If this is really an 8GB card then it seems as if it would need to be positioned as a 1080p card, "7600XT".

    I think power consumption is actually a bigger problem than performance, if this is really a 1080p card and around 150W.
     
    Lightman likes this.
  20. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    Yep!
    Sorta-kinda there.
    Unfortunately yes, clamshells are wildly impractical for anything resembling a mainstream GPU and 24Gb DRAMs are DDR5 only for now.
     