AMD: RDNA 3 Speculation, Rumours and Discussion

Discussion in 'Architecture and Products' started by Jawed, Oct 28, 2020.

  1. xpea

    xpea Regular

    Usually tape out is when the design is complete and you upload the files to the foundry's FTP/portal/cloud.
    Then the foundry generates the photomasks and tooling to build the silicon. For a 7-die chiplet design, with extra time needed for interconnect and packaging, I expect 10 to 12 weeks before the customer gets the silicon back from the packaging house (Amkor, ASE, etc.).
    Then qualification will take at least another 6 months.
    I think one year from tape out to HVM is realistic, taking into account one minor revision (metal mask).
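
    The arithmetic behind that estimate can be sketched as below. The 10 to 12 weeks and the 6 months come from the post; the 12 weeks assumed for a metal-mask respin is a placeholder of mine, not a figure anyone quoted, and the function name is hypothetical.

```python
# Rough tape-out-to-HVM estimate using the figures quoted in the post above.
# The respin duration is NOT from the post; it is an assumed placeholder.

WEEKS_PER_MONTH = 52 / 12  # average number of weeks in a month


def tapeout_to_hvm_weeks(first_silicon_weeks, qualification_months, respin_weeks=0.0):
    """Sum the phases between tape out and high-volume manufacturing, in weeks."""
    return first_silicon_weeks + qualification_months * WEEKS_PER_MONTH + respin_weeks


# 10 to 12 weeks to packaged silicon, then at least 6 months of qualification.
low = tapeout_to_hvm_weeks(10, 6)
high = tapeout_to_hvm_weeks(12, 6)

# Hypothetical: one metal-mask revision costing about 12 extra weeks.
ASSUMED_RESPIN_WEEKS = 12
with_respin = tapeout_to_hvm_weeks(12, 6, ASSUMED_RESPIN_WEEKS)

print(f"no respin: {low:.0f} to {high:.0f} weeks")
print(f"with one assumed respin: {with_respin:.0f} weeks")
```

    Since the qualification figure is a lower bound ("at least"), these totals are lower bounds too; with the assumed respin the total lands close to the one-year mark.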

    Edit: Lately, Nvidia has been extremely efficient and fast from tape out to HVM. Most of their first silicon revisions (A0) went into HVM.
     
    Last edited: Apr 28, 2022
  2. trinibwoy

    trinibwoy Meh Legend

    That would mean N31 is right on time.
     
  3. TopSpoiler

    TopSpoiler Newcomer

  4. xpea

    xpea Regular

    Yes, and more things that they don't disclose in public. The obvious one is the use of their Selene supercomputer for intensive algorithm/RTL/floor plan verification. Nvidia spends much more time in simulation than before, and their silicon verification lab is now a huge department, all in order to shorten the time from tape out to HVM. For small dies like GA106 and GA107, tape out to HVM took less than 5 months, and it should be even less for Ada.
     
    nnunn, DavidGraham, PSman1700 and 2 others like this.
  5. Granath

    Granath Newcomer

  6. JasonLD

    JasonLD Regular

    I expect both N31 and AD102 to deliver less than 2x the performance of current top-end cards. The hype is getting beyond ridiculous at this point.
     
    techuse likes this.
  7. PSman1700

    PSman1700 Legend

    Amazing performance ahead.
     
    Lightman likes this.
  8. Granath

    Granath Newcomer

  9. xpea

    xpea Regular

    They talk as if AMD (or Nvidia, for that matter) can change the hardware after tape out. These leakers should learn a bit about how an ASIC is designed and when a uarch is frozen. It will help them to improve their lies :roll:
     
  10. That's just a daft assessment. Most of these leaks come from firmware-level, software or vendor leaks, and the information is only as good as what AMD gave out at some point in time. What was leaked a month ago might be year-old information and out of date. As release approaches, more material things come out.
     
    Silent_Buddha likes this.
  11. trinibwoy

    trinibwoy Meh Legend



    6 SEs and double-wide CUs are claimed for Navi 31. This would put a single GCD at 15360 FP32 per clock.

    I like how both AD102 and Navi 31 “got stronger” in a matter of days. They probably ate their spinach.

    Actually that would be 30720 FP32. Maybe it’s 3 SEs per GCD and 6 SEs per package. Big numbers.
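
    One possible reading of those numbers, as a back-of-envelope sketch. It assumes the 15360 figure corresponds to 6 SEs (so 2560 FP32 lanes per SE) and that the package carries two GCDs; both are assumptions drawn from the rumours, not confirmed specs, and `fp32_lanes` is a hypothetical helper.

```python
# Back-of-envelope FP32 lane counts for the rumoured Navi 31 configurations.
# Assumption: 15360 FP32 per clock corresponds to 6 shader engines (SEs),
# which gives 15360 / 6 = 2560 FP32 lanes per SE.
# Assumption: a package holds two GCDs (rumour, not a confirmed spec).

LANES_PER_SE = 15360 // 6  # 2560


def fp32_lanes(se_per_gcd, gcds_per_package):
    """FP32 lanes per clock for a whole package."""
    return LANES_PER_SE * se_per_gcd * gcds_per_package


single_gcd_6se = fp32_lanes(6, 1)  # 6 SEs on one GCD
dual_gcd_6se = fp32_lanes(6, 2)    # 6 SEs per GCD, two GCDs
dual_gcd_3se = fp32_lanes(3, 2)    # 3 SEs per GCD, 6 SEs per package

print(single_gcd_6se, dual_gcd_6se, dual_gcd_3se)
```

    Under these assumptions, 6 SEs per GCD doubles the package total to 30720, while 3 SEs per GCD and 6 per package keeps it at 15360.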
     
    Last edited: Apr 30, 2022
  12. Jawed

    Jawed Legend

    Four SIMDs sharing a single TMU/RA implies AMD thinks ray acceleration is fast enough and ray tracing performance is now down solely to traversal and hit/miss shader throughput.

    Well, that assumes RA hasn't been changed and that traversal is still in software...
     
    Lightman and Man from Atlantis like this.
  13. TopSpoiler

    TopSpoiler Newcomer

    +1. There haven't been any notable patents published about traversal hardware for RDNA in the past 2 years.
     
  14. trinibwoy

    trinibwoy Meh Legend

    More SIMDs aren’t a solution for divergence, though. If AMD continues to run traversal on the SIMDs, they must either not be serious about RT performance or be doing some funky sorting in software as well.
     
    Lightman likes this.
  15. TopSpoiler

    TopSpoiler Newcomer


    I barely trust him.
     
    pharma and PSman1700 like this.
  16. PSman1700

    PSman1700 Legend

    Sums up leakers in general.
     
  17. trinibwoy

    trinibwoy Meh Legend

    Nothing new from Nvidia either aside from the AI stuff.

    From AMD's patent on the design of the ray accelerator: "One purpose of using a merged data path unit is to reduce the amount of silicon area that is used by only a single type of instruction, because doing so reduces the total amount of silicon for a chip. This particular merged data path unit is capable of outputting results for four box tests per cycle or one triangle test per cycle."

    So the RA isn't just re-using the TMU L1 memory pipeline. It's also sharing silicon between the box and triangle intersection logic. Very elegant and area-efficient, but one triangle per clock probably isn't going to cut it for RDNA 3.
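
    To put the quoted rates in perspective, here is a minimal sketch of a cycle lower bound for one ray. The four-boxes-or-one-triangle rates come from the patent quote; the ray's node and triangle counts are made up for illustration, and `ra_cycles` is a hypothetical helper that ignores memory latency entirely.

```python
import math

BOX_TESTS_PER_CYCLE = 4  # from the patent quote
TRI_TESTS_PER_CYCLE = 1  # from the patent quote


def ra_cycles(box_tests, triangle_tests):
    """Lower bound on intersection cycles for one ray.

    The merged data path does either box tests or a triangle test in a
    given cycle, so the two kinds of work add up instead of overlapping.
    """
    box_cycles = math.ceil(box_tests / BOX_TESTS_PER_CYCLE)
    tri_cycles = math.ceil(triangle_tests / TRI_TESTS_PER_CYCLE)
    return box_cycles + tri_cycles


# Hypothetical ray: visits 20 internal nodes with 4 child boxes each,
# then tests 6 triangles at the leaves.
cycles = ra_cycles(box_tests=20 * 4, triangle_tests=6)
print(cycles)  # 20 box cycles + 6 triangle cycles = 26
```

    In this toy case the triangle tests already account for almost a quarter of the cycles, and that share grows with leaf size.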
     
    Lightman and xpea like this.
  18. Jawed

    Jawed Legend

    Divergence literally lengthens the wall-clock time of a shader. If a WGP (or CU, as the rumour suggests CUs are still a thing in RDNA 3) provides for more shader cycles per RA cycle, then that reduces the wall-clock time of divergent shaders.

    This is similar to ALU:TEX ratio. Over time that ratio has increased.

    So if AMD is doubling down on software traversal, it's reasonable to expect that more compute is matched by better local resources to increase the "traversal work-items per clock", such as a bigger LDS, a bigger register file and higher-capacity crossbars (or a ring bus?).

    The side-effect of "increased traversal WIPC" is that it helps with hit/miss shader WIPC too, since those shaders are also the victims of divergence. All GPUs have this problem and sorting is part of the solution.

    Of course AMD might do nothing in terms of local resources to help with RT WIPC - such changes might be years off. The fact that patent documents for hardware traversal haven't appeared as yet seems to indicate it's off the cards for a very long time.
     
  19. trinibwoy

    trinibwoy Meh Legend

    Sure, but in the best case it reduces that time by 2x. The worst case for divergence is 32x.

    Right, AMD needs to solve for both traversal divergence and shading divergence. Intel and Nvidia are tackling the former by avoiding SIMD altogether. It will be interesting to see how AMD's gamble works out on next-generation RT workloads. They don't have enough control over the market to slow down RT adoption, so hopefully that's not their game plan.
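
    The 2x versus 32x comparison can be illustrated with a toy model. It assumes a 32-lane wave that executes each distinct control-flow path serially under a lane mask, with every path costing the same; that is a simplification for illustration, not a description of any real scheduler, and the function names are hypothetical.

```python
WAVE_SIZE = 32  # lanes per wave, matching the 32x worst case above


def divergence_factor(lane_paths):
    """Number of distinct paths the wave must execute one after another."""
    return len(set(lane_paths))


def relative_time(lane_paths, compute_scale=1.0):
    """Wave time relative to a fully coherent wave.

    compute_scale models extra shader throughput per RA, e.g. 2.0 for
    twice as many SIMDs sharing one ray accelerator (best case).
    """
    return divergence_factor(lane_paths) / compute_scale


coherent = [0] * WAVE_SIZE         # every lane follows the same path
worst = list(range(WAVE_SIZE))     # every lane follows its own path

print(relative_time(coherent))     # 1.0
print(relative_time(worst))        # 32.0
print(relative_time(worst, 2.0))   # 16.0
```

    Doubling the compute halves the fully divergent case to 16x in this model, which still leaves it far from the coherent case.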
     
  20. TopSpoiler

    TopSpoiler Newcomer

    pharma and xpea like this.