AMD: RDNA 3 Speculation, Rumours and Discussion

Discussion in 'Architecture and Products' started by Jawed, Oct 28, 2020.

Tags:
  1. no-X

    Veteran

    Joined:
    May 28, 2005
    Messages:
    2,451
    Likes Received:
    471
As for the APUs and bandwidth: RDNA by itself is more bandwidth-efficient than the currently used Vega+, let's say 1.25×. Moving from DDR4 to DDR5 doubles raw bandwidth (2×). An Infinity Cache / SLC could roughly double effective bandwidth again (2×). So from the bandwidth perspective it would be possible to build integrated graphics (1.25 × 2 × 2 =) ~5 times faster than the current Vega+ used in Cezanne, using RDNA2/3 and standard dual-channel DDR5. I think such a configuration would be more TDP-limited than bandwidth-limited.
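The multipliers above compound as a simple product; a minimal sketch, where every factor is this post's assumption rather than a measured figure:

```python
# Back-of-the-envelope bandwidth headroom for a DDR5 RDNA APU vs. Cezanne's Vega.
# All three multipliers are the post's assumptions, not measured numbers.

ARCH_EFFICIENCY = 1.25  # assumed RDNA vs. Vega bandwidth efficiency
DDR5_VS_DDR4    = 2.0   # raw bandwidth doubling from DDR4 to DDR5
INFINITY_CACHE  = 2.0   # assumed effective-bandwidth gain from an SLC

headroom = ARCH_EFFICIENCY * DDR5_VS_DDR4 * INFINITY_CACHE
print(f"Effective bandwidth headroom: {headroom:.1f}x")  # 5.0x
```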
     
  2. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,244
    Likes Received:
    4,465
    Location:
    Finland
Technically they will all be quad channel: DDR5 puts two 32-bit channels on each DIMM instead of the current single 64-bit channel per DIMM (or rather, 2×40-bit vs. 72-bit, with 8 bits of ECC per channel).
    And I don't see mainstream platforms getting more channels beyond the natural doubling DDR5 brings.
     
    Lightman likes this.
  3. Frenetic Pony

    Regular

    Joined:
    Nov 12, 2011
    Messages:
    807
    Likes Received:
    478
    I'm wondering when chiplet APUs will be feasible, which would make this seem far more imminent. Drop in an IO die, an 8 core CPU die, and a smallish GPU, for example.

Let's say, for AMD in the next year or two: a 20 CU GPU; with TSMC's ultra-high-density SRAM libraries you could have 64 MB of LLC, and DDR5 is fast enough to provide the rest. Clocked relatively high you're looking at 5-6 teraflops, enough to keep up with minimum requirements thanks to the Series S. Then a 6-core Zen 4, where high clocks and higher efficiency should keep it level with a Zen 2 3700. All you'd need is an NVMe drive and you've got a game-ready SFF box, at a final cost of what, $650 maybe?
     
  4. Esrever

    Regular

    Joined:
    Feb 6, 2013
    Messages:
    846
    Likes Received:
    647
It would be economically limited at that point. Say they put in 20 RDNA2 CUs plus 64 MB of Infinity Cache to compensate, plus 8 Zen 3 cores. The chip would be around 300mm^2; with that much silicon, it's going to be expensive. Cost per good die rises faster than linearly with area, because yield drops as dies get bigger. It seems counterproductive to do this when even discrete CPUs and GPUs are moving towards chiplets.

A 100mm^2 CPU plus a 200mm^2 GPU will be cheaper to produce than a 300mm^2 APU by a long shot. I actually expect AMD to move in the other direction regarding APUs: separate GPU and CPU dies make sense economically even in small APUs. Two 80mm^2 chiplets would probably be cheaper than their 154mm^2 Cezanne die, though the power consumption and initial design investment have to be there.
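The yield argument above can be illustrated with a toy Poisson defect model; a minimal sketch, where the defect density D0 is purely hypothetical:

```python
import math

# Toy yield model (Poisson defect model) illustrating why two small dies
# can be cheaper per good die than one big one. D0 is illustrative only.
D0 = 0.001  # assumed defects per mm^2

def cost_per_good_die(area_mm2: float) -> float:
    """Relative cost: wafer area consumed divided by die yield."""
    yield_frac = math.exp(-area_mm2 * D0)
    return area_mm2 / yield_frac

monolithic = cost_per_good_die(300)
chiplets   = cost_per_good_die(100) + cost_per_good_die(200)
print(f"300 mm^2 monolithic      : {monolithic:.1f}")
print(f"100 mm^2 + 200 mm^2 dies : {chiplets:.1f}")
```

With these numbers the split dies come out cheaper, and the gap widens as D0 or die area grows; packaging and power costs, which the post notes, pull the other way.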
     
    CarstenS likes this.
  5. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,213
No, I mean that with 3X more raster performance, a future RTX 4090 / 7900 XT will definitely be CPU-limited at 4K.
    No, a 3090 / 6900 XT is twice as fast as a PS5 / Series X; add 3X on top of that, and current-generation games will become CPU-limited quickly. Which is why I suspect that 3X figure is for resolutions of 4K and up. There is no way in hell we can achieve 3X raster performance in most current games using any CPU we have today.

During the Xbox One / PS4 era we didn't suffer much from the stagnation of CPU performance: those consoles had very weak CPUs, and games didn't put much load on the CPU. Now things will change. CPUs will be used to run more complex simulations, so games will rely more on the CPU, especially given the largely single-threaded nature of games and their tendency not to scale well across many cores. With these super-powerful new GPUs, games will be more CPU-limited through the combination of complex simulation and high fps.
     
    PSman1700 likes this.
  6. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,714
    Likes Received:
    2,135
    Location:
    London
    An RDNA 2 CU is about 2mm².


An RDNA 3 WGP with 8 SIMDs on 5nm would be around 4mm², I suppose.
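The estimate follows from a quick scaling calculation; a minimal sketch, where the N7-to-N5 density factor is an assumption:

```python
# Back-of-the-envelope area scaling behind the ~4 mm^2 WGP estimate.
# The N5 logic shrink factor is an assumed ~2x density gain, not a measured one.

CU_AREA_N7      = 2.0  # mm^2, RDNA 2 CU on N7 (from the die shot above)
SIMDS_PER_CU    = 2    # RDNA 2 pairs two SIMD32 units per CU
N5_LOGIC_SHRINK = 0.5  # assumed logic area scaling from N7 to N5

simds = 8
area_n7 = (simds / SIMDS_PER_CU) * CU_AREA_N7  # 8 mm^2 worth of N7 silicon
area_n5 = area_n7 * N5_LOGIC_SHRINK
print(f"Estimated 8-SIMD WGP on N5: ~{area_n5:.0f} mm^2")  # ~4 mm^2
```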
     
    T2098, Lightman and BRiT like this.
  7. Digidi

    Regular

    Joined:
    Sep 1, 2015
    Messages:
    428
    Likes Received:
    239
That’s the question: why build a pipeline with such a big imbalance? When we think about micro-polygons, the second scan converter always runs empty. It only makes sense when you have polygons bigger than 16 pixels…
     
  8. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,058
    Likes Received:
    3,116
    Location:
    New York
Not sure how you arrive at that conclusion from that math. Let’s use your 6x multiplier: 4K @ 120fps is 9x the pixel rate of 1440p (upscaled to 4K) @ 30fps. So still GPU limited.
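The 9x figure decomposes into a resolution ratio times a frame-rate ratio; a quick check:

```python
# Pixel-throughput comparison behind the "9x" figure.
px_4k    = 3840 * 2160  # 4K render target
px_1440p = 2560 * 1440  # 1440p internal resolution (pre-upscale)

rate_target  = px_4k * 120     # 4K @ 120 fps
rate_console = px_1440p * 30   # 1440p @ 30 fps

print(f"Resolution ratio : {px_4k / px_1440p}x")            # 2.25x
print(f"Frame-rate ratio : {120 / 30}x")                    # 4.0x
print(f"Pixel-rate ratio : {rate_target / rate_console}x")  # 9.0x
```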
     
  9. Qesa

    Newcomer

    Joined:
    Feb 23, 2020
    Messages:
    57
    Likes Received:
    107
I think the other part of his argument is that the massive disparity in CPU performance from last gen is no longer present. Skylake was vastly faster than Jaguar, so running at 4x the frame rate wasn't an issue on PC; now that consoles are on Zen 2, a developer making full use of it at 30fps is going to make 100+ fps a challenge.
     
    DavidGraham likes this.
  10. Putas

    Regular

    Joined:
    Nov 7, 2004
    Messages:
    738
    Likes Received:
    355
    Imagine the slowdown on macro polygons.
     
  11. tsa1

    Newcomer

    Joined:
    Oct 8, 2020
    Messages:
    89
    Likes Received:
    97
CPU performance is not something that is set in stone. Some engines (SoTTR, for example) are mostly GPU-limited even in extreme scenarios (25% scaling, 800x600), while decrepit things like Dunia in all the Ubisoft games are mostly CPU-limited even at Full HD on low / mid-range GPUs. And even in that case you will see _some_ (if not most) of the performance increase with a beefier GPU. For example, with 2x-3x more GPU dakka we'll be able to run DX:MD at 4K with 2x MSAA or whatever type of anti-aliasing is used there (it's brutal on FPS atm).

People get caught up in absolutes for some reason and start worrying about a bit of "missing performance" (due to GPU or CPU) instead of just playing the games and noting what changed or not. I'm pretty sure I did not get a 25-50% fps increase in all games after switching from a 3900X to a 5900X (apart from e-sports titles, where it actually happened), but everything still got a lot smoother. I fully expect something of the sort from a similar GPU upgrade.
     
  12. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,058
    Likes Received:
    3,116
    Location:
    New York
It would be fantastic if developers found a way to make full use of the CPU such that an 8-core is required for 30fps. It’s highly unlikely this will happen though, as if there were such a workload we would have seen it in some form already - a demo, an academic paper, etc. Yes, 8th-generation console CPUs were weak, but that’s not the main reason for the lack of innovation in CPU usage.

    I would love to see high fidelity clothing simulation but that’s probably better suited for GPUs anyway. Maybe there’ll be a revolution in NPC AI. We can only hope.
     
  13. troyan

    Regular

    Joined:
    Sep 1, 2015
    Messages:
    605
    Likes Received:
    1,126
So the 6600XT, with 10 TFLOPs and a 128-bit interface, needs 160W, of which the GPU alone should be over 100W. I hope that shows how ridiculous these RDNA3 rumors are. To deliver 2.7x more performance than that, the RDNA3 GPU would have to deliver 25 TFLOPs within ~100W of GPU power.
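Taking the post's figures at face value, the implied efficiency jump can be sketched like this:

```python
# Implied perf/W jump if the rumoured part hit 25 TFLOPs at the same GPU power.
# All figures are this post's estimates, taken at face value.
tflops_6600xt = 10.0
gpu_power_w   = 100.0  # estimated GPU-only share of the 160 W board power
tflops_rumour = 25.0

print(f"6600 XT : {tflops_6600xt / gpu_power_w:.2f} TFLOPs/W")
print(f"Rumour  : {tflops_rumour / gpu_power_w:.2f} TFLOPs/W "
      f"({tflops_rumour / tflops_6600xt:.1f}x the efficiency)")
```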
     
  14. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    Good news!
    It's both a shrink and a new uArch.
     
  15. Leoneazzurro5

    Regular

    Joined:
    Aug 18, 2020
    Messages:
    335
    Likes Received:
    348
    If you understood the thread at this point, you should have considered that:

- RDNA3 is a new architecture on a new process node (5nm), where only the cache part is on 6nm (and caches are not the bulk of the power consumption)
    - The RX6600XT is clocked quite high, whereas the N31 clocks are supposed to be more conservative, hence at a much better point of the voltage/frequency curve, especially considering a new process that TSMC says can reach higher speeds.
    - N31 is supposed to have a higher power consumption than N21 anyway, while recent leaks point to the next high-end Nvidia card going over 400W, and that is the competition this MCM GPU will face
     
    Lightman and BRiT like this.
  16. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,213
In the latest UE5 demo, a Series X achieves 1080p30, while a 3090 / 6900XT achieves 1080p60, with unoptimized PC code.

They will populate the screen with more characters, props and details, draw distance will be expanded, and physics will get more complex; this should bring a modern 8-core CPU to its knees even at 30fps.
     
    PSman1700 and DegustatoR like this.
  17. troyan

    Regular

    Joined:
    Sep 1, 2015
    Messages:
    605
    Likes Received:
    1,126
RDNA2 is a new architecture, too. Yet with 70% more transistors than RDNA1 it is only ~30% more efficient. The 6600XT has a 128-bit interface but uses 160W. A 3060 has 40% more off-chip bandwidth and nearly the same efficiency.
    Navi23 is optimized for 1080p without raytracing and yet it is only slightly better than a 3060. Even the PS5 SoC is more efficient and overall a better chip.

    Don't believe it. Even the 350W of the 3090 and 3080 Ti is way too high.
     
    PSman1700 likes this.
  18. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    Wut.
    Hugging the fmax is very, very nice.
    Who cares, AD102 is >450W.
     
  19. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
N23 over N10 is 760M more transistors, or +7.3%.
    Power is down from 225W to 160W, or -29%.
     
  20. Leoneazzurro5

    Regular

    Joined:
    Aug 18, 2020
    Messages:
    335
    Likes Received:
    348
Frankly, you are comparing apples with oranges. RDNA2 is 30% (or MORE, as the comparison between N21 and N10 should have shown) more efficient than RDNA1 on the very same process. If you want perf/W, you need to spend something for it; nothing is free in the engineering world. Also, the 6600XT and the PS5 SoC have different clocks, different targets and different performance, even if the PS5 integrates a (mobile) Zen2 CPU. If the 6600XT had been clocked lower, it would have had way lower power consumption, and they are based on the same architecture. So you are saying that RDNA2 is better than RDNA2. Rigghttt. To me, it seems you are only trying to bash AMD without a minimal understanding of tech and tech compromises.

Lol, so Nvidia is able to put 144 SMs on their next gen, but these will magically consume much less because? Leakers are quite uniform on this point.
     
    #680 Leoneazzurro5, Jul 30, 2021
    Last edited: Jul 30, 2021


  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.