AMD: RDNA 3 Speculation, Rumours and Discussion

Discussion in 'Architecture and Products' started by Jawed, Oct 28, 2020.

  1. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    3,244
    Likes Received:
    3,408
    Games will be mostly CPU limited in anything lower than 4K. But that's hardly new, 1024x768 was "high resolution" some time ago.
     
    DavidGraham and PSman1700 like this.
  2. PSman1700

    Legend

    Joined:
    Mar 22, 2019
    Messages:
    7,119
    Likes Received:
    3,093
    NV is ahead in the ray tracing game; it has nothing to do with conspiracy or them 'fiddling with developer input'. Their hardware is better specced, not just in ray tracing but in reconstruction and compute as well.
     
  3. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    I can imagine AMD has variations on traversal. E.g. a shadow ray needs no front-to-back ordering when visiting children, and then there are variants with a short stack, or stackless (which would need an extra pointer in the BVH), etc.
    So they might at least test big AAA games and select the fastest variant.
    Really not sure if additional patching in ISA from custom shaders made for console would fit into such a workflow. Likely not. May depend on other low-level console extensions. (Edit: or just overall differences to the console implementation, so custom traversal alone would break.)
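To illustrate the first distinction above, here is a minimal sketch of two stack-based BVH traversal variants: a closest-hit loop that visits the nearer child first, and an any-hit (shadow ray) loop that ignores ordering and stops at the first hit. It is not AMD's implementation. The `Node` layout, the function names, and the use of axis-aligned boxes as the primitives are all assumptions made to keep the example short, and the short-stack and stackless variants are not shown.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

Vec = Tuple[float, float, float]

@dataclass
class Node:
    """Hypothetical BVH node; a leaf (no children) treats its own box as the primitive."""
    lo: Vec
    hi: Vec
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def ray_box(origin: Vec, direction: Vec, lo: Vec, hi: Vec, t_max: float) -> Optional[float]:
    """Slab test: return the entry distance along the ray, or None on a miss."""
    t0, t1 = 0.0, t_max
    for o, d, l, h in zip(origin, direction, lo, hi):
        if d == 0.0:
            # Ray is parallel to this slab: it misses unless the origin lies inside it.
            if o < l or o > h:
                return None
            continue
        ta, tb = (l - o) / d, (h - o) / d
        if ta > tb:
            ta, tb = tb, ta
        t0, t1 = max(t0, ta), min(t1, tb)
        if t0 > t1:
            return None
    return t0

def closest_hit(root: Node, origin: Vec, direction: Vec) -> Optional[float]:
    """Ordered traversal: the nearer child is popped first so far subtrees can be culled."""
    t_root = ray_box(origin, direction, root.lo, root.hi, math.inf)
    if t_root is None:
        return None
    best: Optional[float] = None
    stack = [(t_root, root)]
    while stack:
        t_entry, node = stack.pop()
        if best is not None and t_entry >= best:
            continue  # something nearer was already found
        if node.left is None:
            best = t_entry
            continue
        limit = math.inf if best is None else best
        hits = []
        for child in (node.left, node.right):
            t = ray_box(origin, direction, child.lo, child.hi, limit)
            if t is not None:
                hits.append((t, child))
        hits.sort(key=lambda hit: hit[0], reverse=True)  # push far first, pop near first
        stack.extend(hits)
    return best

def any_hit(root: Node, origin: Vec, direction: Vec, t_max: float) -> bool:
    """Shadow-ray traversal: child order is irrelevant, the first hit ends the search."""
    stack = [root]
    while stack:
        node = stack.pop()
        if ray_box(origin, direction, node.lo, node.hi, t_max) is None:
            continue
        if node.left is None:
            return True
        stack.append(node.left)
        stack.append(node.right)
    return False

# Two boxes along +x, wrapped in one root node.
a = Node((2.0, -1.0, -1.0), (3.0, 1.0, 1.0))
b = Node((5.0, -1.0, -1.0), (6.0, 1.0, 1.0))
root = Node((2.0, -1.0, -1.0), (6.0, 1.0, 1.0), left=a, right=b)

print(closest_hit(root, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)))   # nearest box entry distance
print(any_hit(root, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 10.0))  # occluded within 10 units?
```

The any-hit loop skips the sort and the distance bookkeeping entirely, which is the kind of saving a dedicated shadow-ray variant could exploit.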

    Yes, but I think he meant it the other way around: 4K is too high, so the CPU can't deliver, and thus a monster GPU is pointless. Which does not make much sense to me.
     
  4. tsa1

    Newcomer

    Joined:
    Oct 8, 2020
    Messages:
    89
    Likes Received:
    97
    CP2077 runs at less than 30 fps @ 4K with all bells and whistles on (including all the RT gimmicks), so I don't see why anyone would be against getting a GPU that will allow you to play at a smooth 120 fps without an IQ-degrading crutch.
     
  5. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    But do those 30 fps come from a CPU limit? Why? Is the CPU involved in building the BVH?
     
  6. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    I for sure could make use of some 2x-3x graphics performance. There's also DRS/VRS for folks who don't want to buy into larger, higher res displays to easily burn GPU cycles and get better images in return.
     
    Lightman and PSman1700 like this.
  7. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
    Yeah I certainly have no concerns about using up that much performance. Cyberpunk at 4K with Ultra RT gets you 11fps on a 6900XT for example. So even on this new monster it would still be unplayable at that resolution without some upscaling in play.

    This is an extreme case, but there will be other similar situations where RT is involved. And then, as you say, 120fps at 4K (even upscaled) requires much more than what today's top-end GPUs can output in many titles.

    https://www.kitguru.net/gaming/dominic-moass/cyberpunk-2077-ray-tracing-on-amd-gpus-benchmarked/
     
    Lightman, xpea and PSman1700 like this.
  8. PSman1700

    Legend

    Joined:
    Mar 22, 2019
    Messages:
    7,119
    Likes Received:
    3,093
    I sure as hell wouldn't say the PS5 is 'too powerful'. If I want to play Rift Apart in the performance RT mode, my resolution gets dropped to anywhere between 1080p and 1440p, along with reduced fidelity. And that's for a game that really looks amazing, up there with the best, but it ain't that large of a leap coming from the best the PS4 had to offer in the end. The game's lacking dynamic GI, water looks.... ye, and there's one ray tracing effect.

    That's if we assume AMD manages to go another GPU generation without dedicated hardware support for ray tracing, which I can't imagine they will.
     
  9. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
    Yes, fingers crossed that if the 2.7x raster performance is real, it translates into something bigger for RT.
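As a rough sanity check on the numbers quoted in this thread, the sketch below scales the 11 fps Cyberpunk result for the 6900XT by the rumoured 2.7x. It assumes frame rate scales linearly with the rumoured raster uplift, which is purely an illustrative assumption, and `projected_fps` is a hypothetical helper.

```python
def projected_fps(baseline_fps: float, speedup: float) -> float:
    """Scale a measured frame rate by a speedup factor (assumes linear scaling)."""
    return baseline_fps * speedup

baseline = 11.0  # 6900XT, Cyberpunk at 4K with Ultra RT, as quoted in the thread
rumoured = 2.7   # rumoured raster uplift, as quoted in the thread

fps = projected_fps(baseline, rumoured)
print(f"{fps:.1f} fps")  # prints "29.7 fps"
```

Under that assumption the result stays below 30 fps, so RT would have to scale by more than the raster figure to change the picture.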
     
    PSman1700 likes this.
  10. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    It would be quite pointless to invest all monster power into just compensating RT shortcomings.
    So that remains the price question. Otherwise so far it looks like a killer arch.
     
    pjbliverpool and PSman1700 like this.
  11. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    What kind of spin is this?
     
  12. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,716
    Likes Received:
    2,137
    Location:
    London
    From AMD PowerPoint- White Template (gpuopen.com) page 18:

    Implies there's at the very least a strong bias towards wave64 being solely for pixel shading.

    So the gotcha (for my argument that the hardware really only has 32 work item hardware threads) is the idea that compute shaders can be issued as 64 work item hardware threads. And, indeed, that 128 work item workgroups could be issued as two 64 work item hardware threads, instead of four 32 work item hardware threads.

    I can't think of a time when "Allows higher occupancy (# threads per lane)" would apply to compute and improve performance. This is the only stated benefit (in the list of two benefits) that applies to compute under the Wave64 column of the table.

    In compute, work items sharing a SIMD lane is not part of the programming model. The closest you can get is with chip-specific data parallel processing (DPP) instructions that work on sub-sets of 8 or 16 work items. And that won't share data across the boundary between work items 0:31 and 32:63.

    So the only scenario where work items sharing a lane is effectively exposed is pixel shading attribute interpolation. Maybe someone can think of something else?

    So how would compute get higher occupancy with Wave64 versus Wave32 and be faster (i.e. worth doing)? Is there a mix of workgroup size combined with VGPR and LDS allocations that does this?
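One way to explore that question is a small occupancy model. The sketch below is a simplification under stated assumptions, not a description of what the driver does: it assumes 1024 wave32-wide VGPR slots per SIMD (128 KB at 32 lanes of 4 bytes), a limit of 20 resident waves per SIMD regardless of wave size (the RDNA 1 figure; RDNA 2 documents 16), and that a wave64 VGPR occupies two wave32-wide slots. It ignores LDS and register allocation granularity. All function and constant names are hypothetical.

```python
SIMD_VGPR_SLOTS = 1024   # assumption: wave32-wide register slots per SIMD
MAX_WAVES_PER_SIMD = 20  # assumption: wave slot limit, independent of wave size

def waves_per_workgroup(workgroup_size: int, wave_size: int) -> int:
    """Hardware threads needed to cover one workgroup (rounded up)."""
    return -(-workgroup_size // wave_size)

def resident_waves(vgprs_per_work_item: int, wave_size: int) -> int:
    """Waves that fit on one SIMD, limited by registers or by wave slots."""
    slots_per_vgpr = wave_size // 32  # wave64 needs two wave32-wide slots per VGPR
    by_registers = SIMD_VGPR_SLOTS // (vgprs_per_work_item * slots_per_vgpr)
    return min(MAX_WAVES_PER_SIMD, by_registers)

def work_items_per_lane(vgprs_per_work_item: int, wave_size: int) -> int:
    """Occupancy expressed as work items sharing each SIMD lane."""
    return resident_waves(vgprs_per_work_item, wave_size) * (wave_size // 32)

print(waves_per_workgroup(128, 32), waves_per_workgroup(128, 64))
for vgprs in (24, 32, 64, 128):
    print(vgprs, work_items_per_lane(vgprs, 32), work_items_per_lane(vgprs, 64))
```

In this model wave64 only raises work items per lane when the wave slot limit, rather than the register file, is what caps wave32, i.e. for shaders with low VGPR counts. Whether that extra occupancy makes a compute shader faster is a separate question the model does not answer.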

    Yes, that's why AMD PowerPoint- White Template (gpuopen.com) seems to be more precise (yet remains vague, citing "heuristics"). In truth in PC gaming we can never really exclude heuristics, because drivers create too much distance to the metal.

    Locality is explicitly relevant for attribute interpolation (pixels that share a triangle can share parts of the LDS data). And locality affects texture filtering. So both of these relate specifically to pixel shading.

    Developers probably can't access the wave32/64 decision, so it "doesn't matter to them". Well, you could argue that the more dedicated might complain to AMD that the driver stinks for their game and AMD makes a decision in the driver for them.

    In trying to understand the hardware, and why "CUs" are still a part of RDNA, wrapped inside a WGP, the count of hardware thread sizes might be informative.

    CUs might be present simply to soften the complexity of getting RDNA working. Drivers were very troublesome for quite a while after 5700XT released, despite the helping hand of the CU mode. And perhaps would have been worse if there was only a single TMU per WGP, with LDS being a single array, not the two that we currently have.

    CU mode combined with wave64 mode looks intentional as the softest complexity: the backstop when the driver team is struggling to adapt to a new architecture. G-buffer fill might be the perfect use-case, but that's export bandwidth bound these days, isn't it?

    If RDNA 3 has no CUs does it still need wave64 mode? Is wave64 mode more important in that case?
     
    Lightman likes this.
  13. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    RDNA2 copium still lingers on harshly.
     
  14. Rootax

    Veteran

    Joined:
    Jan 2, 2006
    Messages:
    2,401
    Likes Received:
    1,845
    Location:
    France
    They already have hardware RT... I guess they will try to accelerate more "stages" in the future for sure, but saying they don't have hardware RT is wrong.
     
    Lightman and PSman1700 like this.
  15. PSman1700

    Legend

    Joined:
    Mar 22, 2019
    Messages:
    7,119
    Likes Received:
    3,093
    Ye, I dunno what they're planning. Just increasing the compute by, say, 3 times is impressive, but if the competition's doing that as well plus dedicated hardware RT, it would be the same situation as today, which is kinda boring I think.

    Fully agree, that's why I wrote dedicated.
     
  16. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    nope
     
    Lightman and PSman1700 like this.
  17. PSman1700

    Legend

    Joined:
    Mar 22, 2019
    Messages:
    7,119
    Likes Received:
    3,093
    And you know that because?
     
  18. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    144SM clocked to hell is not quite 240SM.
    But maybe that's just me.
     
    PSman1700 likes this.
  19. Rootax

    Veteran

    Joined:
    Jan 2, 2006
    Messages:
    2,401
    Likes Received:
    1,845
    Location:
    France
    You mean put the Ray Accelerators outside the CU? My take is they can beef up the RA and still have them in the CUs. It's still dedicated hardware.
     
  20. PSman1700

    Legend

    Joined:
    Mar 22, 2019
    Messages:
    7,119
    Likes Received:
    3,093
    I don't care how they do it, as long as they drive up competition, which hopefully both drives prices down and puts more pressure on the market to innovate, with generally even larger leaps in performance.
    Maybe that happens when Apple joins the game, dunno.
     