AMD Radeon RDNA2 Navi (RX 6700 XT, RX 6800, 6800 XT, 6900 XT) [2020-10-28, 2021-03-03]

Discussion in 'Architecture and Products' started by BRiT, Oct 28, 2020.

  1. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    11,467
    Likes Received:
    2,271
    Location:
    New York
    Are you sure about that Rage mode point? They showed numbers before introducing rage mode.
     
    Lightman and no-X like this.
  2. arandomguy

    Newcomer

    Joined:
    Jul 27, 2020
    Messages:
    132
    Likes Received:
    197
    It's also a bit interesting that they used "best API" as opposed to being specific on which API. I'm also wondering if this means their numbers are possibly derived from different APIs for each GPU? This may lead to some interesting divergence when review testing gets done.

    Overall I'd say at the moment this is tracking slightly higher than my expectations. It's good enough and priced just better enough to give people pause but unless some more dramatics get revl

    But I have to go back to something I keep bringing up with every AMD launch in that people should not expect a Terascale vs Tesla.

    I would expect RDNA2 cards to support "full RT" (well at least DXR "full RT") as ultimately it's a driver support question as DXR doesn't define what type of hardware needs to be present.

    Of course the practical issue will be how capable RDNA2 is performance-wise once you "ramp up" the RT effects.

    This could be another repeat of the tessellation situation, except this time the higher fidelity settings will be considerably more impactful.

    It also would lead to some interesting benchmark messaging/interpretation as reviews tend to want to benchmark apples to apples via "max settings."

    I thought DirectStorage support was mentioned?

    https://www.tomshardware.com/news/a...-with-ryzen-5000-cpus-via-smart-memory-access

    The other new feature is "Smart Memory Access", which is maybe what they're pushing more, overshadowing the above.

    I'm guessing Smart Memory Access won't be an "open" solution. Interesting how leverage changes viewpoints on "open" vs "walled garden" approaches, isn't it?
     
    Cyan likes this.
  3. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,378
    Likes Received:
    1,687
    Location:
    London
    [IMG]

    Is that an HBM PHY (x2) at the "south west" side?
     
    Lightman and ethernity like this.
  4. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,562
    Likes Received:
    4,739
    Location:
    Well within 3d
    I think these days it's generally on-die fuses or BIOS inactivation. Some of the recent code changes for shader array inactivation specifically reference one or both being able to inactivate a resource.
    The number of places that might need to be lasered and how finely structured everything is likely made laser-cutting fall out of favor some time ago.

    It should reduce active power consumption due to memory traffic. Mobile might still be concerned about static leakage, since even with FinFETs a cache of this size may leak enough power to concern a mobile product where the power budget might be 1-2 orders of magnitude lower. Smaller caches or aggressive power gating might take care of some of that.

    That makes sense. Seemed odd to me that it would be depicted like AMD physically blanked out those areas of the die.
     
    Shortbread likes this.
  5. ToTTenTranz

    Legend Veteran

    Joined:
    Jul 7, 2008
    Messages:
    12,504
    Likes Received:
    7,495
    I think we can actually calculate how much bandwidth there is in the Infinity Cache, from this slide:


    [IMG]



    If we assume they're talking about 16Gbps, then the "1.0X 384bit G6" means 768GB/s and the "256bit G6" is 512GB/s.
    If the Infinity Cache is 2.17x the 384bit G6, then its output is 1666.56GB/s. Take away the 512GB/s from the 256bit G6 and we get 1154.56GB/s for the Infinity Cache alone.
    I'm guessing this is an odd number because this LLC is working at the same clocks as the rest of the GPU. Maybe they're using the 2015MHz game clock.
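
    The estimate above can be reproduced with a few lines of Python. This is only a sketch of the post's arithmetic: it assumes, as the post does, 16 Gbps GDDR6 and that the slide's 2.17x figure is relative to the 384-bit configuration. The function and variable names are illustrative.

```python
def gddr6_bandwidth_gbs(bus_width_bits, gbps_per_pin=16.0):
    """Peak bandwidth in GB/s: pin count * Gbit/s per pin / 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8.0

# Assumption from the post: both configurations run 16 Gbps GDDR6.
bw_384 = gddr6_bandwidth_gbs(384)  # 768.0 GB/s
bw_256 = gddr6_bandwidth_gbs(256)  # 512.0 GB/s

# The slide's 2.17x multiplier, applied to the 384-bit baseline.
effective = 2.17 * bw_384

# Subtract the 256-bit DRAM share to isolate the cache contribution.
cache_only = effective - bw_256

print(f"effective: {effective:.2f} GB/s, cache alone: {cache_only:.2f} GB/s")
```

    Because 2.17 is itself a rounded figure, the cache-only result is approximate to within a few GB/s.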
     
    Lightman, Pete and Cyan like this.
  6. manux

    Veteran Regular

    Joined:
    Sep 7, 2002
    Messages:
    3,034
    Likes Received:
    2,273
    Location:
    Self Imposed Exhile
    Infinity Cache is super interesting. It will be great to see that get dissected. Which things will it make super fast? Are there edge cases where performance might fall off a cliff, and how will game developers handle that?
     
  7. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    18,127
    Likes Received:
    8,403
    So, no real details on RT yet, unfortunately. AMD is working on some form of DLSS-style upsampling, but no details yet.

    On the plus side, performance when not using RT rivals NV's equivalently placed cards and on the top end only 2x8 pin power connectors are used. So, power draw when the card is really pushed should be lower than the competition.

    I do wonder if AIBs will add a 3rd 8-pin power connector? Not that this is relevant to me as I stopped overclocking cards back with the Radeon x1800xt (whew that's still a mouthful).

    I'm leaning towards giving the red team a shot again as I've been less than impressed with the driver quality for the 1070 in my machine, but I want to see what RT perf. is like and see anything at all about AMD's DLSS style upsampling. I expect the RT to be slower than NV, but if it's good enough that would be fine.

    Much like I didn't universally allow shadows to be enabled in games until the 1070 due to a combination of performance and shadow quality (wonkiness), I feel it'll be a few hardware generations before I commit to allowing RT to be enabled universally in games. For example, RT in Control is relatively good, but RT in Metro: Exodus was horrible.

    Regards,
    SB
     
    Lightman likes this.
  8. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    4,027
    Likes Received:
    90
    Your question contained a false premise. I challenged it by offering you a chance to think through the premise further. Your dismissal is unwarranted, should you wish to engage in honest discussion of the subject matter.
     
  9. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,960
    Likes Received:
    4,144
    Location:
    Finland
    Huh? AMD had all the same capabilities (and more) with HBCC already long before NVIDIA. And yes, they specifically mentioned that the RX 6000 series supports DirectStorage, too.
     
  10. Esrever

    Regular Newcomer

    Joined:
    Feb 6, 2013
    Messages:
    822
    Likes Received:
    616
    6900XT is going to have a hard time matching the 3090 if the 6800XT matches the 3080; there's just not that much 8 more CUs can do. Their own comparison had to have both performance-enhancing techs on to match it. Not even sure what the 6900XT is for, because the 6800XT is pretty much the same.

    Also a bigger gap between 6800XT and 6800 than I expected, but price seems firmly in favor of the 6800XT. Seems weird. AMD might just be pushing everyone to buy the 6800XT.

    Also no 6700s, which is kind of disappointing. Really looking forward to seeing how these perform, since they are probably going to be priced better than these high end cards.
     
  11. P_EQUALS_NP

    Newcomer

    Joined:
    Jun 17, 2020
    Messages:
    14
    Likes Received:
    3
    Pardon my ignorance, but I fail to see how the AMD 6800xt beats the NVidia 3080, considering that the 6800xt has a peak performance of 20.74 TFLOPS vs the 3080's 29.77 TFLOPS. The difference is almost 10 teraflops wide!
     
  12. arandomguy

    Newcomer

    Joined:
    Jul 27, 2020
    Messages:
    132
    Likes Received:
    197
    Strategically, if RT acceleration is worse, not enabling DXR support at all until well after launch is a risky play that could pan out.

    Reviewers tend to want to bench apples to apples. This means that if DXR were enabled they'd use RT settings (with a tendency to max settings), while with no support it'll be done sans RT. This means launch reviews will show a more favourable performance comparison.

    Post-launch benchmarks with RT support can then be massaged to show that it can be done, framed from a user-experience standpoint instead of as a direct performance comparison.
     
  13. manux

    Veteran Regular

    Joined:
    Sep 7, 2002
    Messages:
    3,034
    Likes Received:
    2,273
    Location:
    Self Imposed Exhile
    The performance difference between the 3080 and 3090 is minuscule. The 3090 is mainly for creatives who use Blender-type apps or machine learning requiring a massive amount of memory. There is no good use case for the 3090 on the gaming side when considering the tiny perf uplift and huge uplift in price.
     
  14. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,534
    Likes Received:
    495
    Location:
    Varna, Bulgaria
    Lightman likes this.
  15. manux

    Veteran Regular

    Joined:
    Sep 7, 2002
    Messages:
    3,034
    Likes Received:
    2,273
    Location:
    Self Imposed Exhile
    FLOPS is a very poor measure of gaming performance. It might apply to pure compute loads, but even there Infinity Cache could be a game changer.
     
    Lightman likes this.
  16. arandomguy

    Newcomer

    Joined:
    Jul 27, 2020
    Messages:
    132
    Likes Received:
    197
    TFLOPs by itself is just a measure of how many FPUs you have x clock speed. If TFLOPs were the end differentiator with respect to real performance, all designs would just be as many FPUs as you can cram in at as high a clock speed as possible. It's a useful technical marketing term, as people can deal with the number more easily, but it has severe limitations when comparing across architectures.
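
    As a rough illustration of "FPUs x clock speed", the two figures quoted earlier in the thread (20.74 and 29.77 TFLOPS) can be reproduced as below. The shader counts and boost clocks are assumptions taken from public spec sheets, not from this thread, and the factor of 2 assumes one fused multiply-add (two floating-point operations) per FP32 lane per clock.

```python
def peak_fp32_tflops(fp32_lanes, clock_ghz, flops_per_lane_per_clock=2):
    """Theoretical peak: lanes * FLOPs per lane per clock * clock (GHz) / 1000."""
    return fp32_lanes * flops_per_lane_per_clock * clock_ghz / 1000.0

# Assumed spec-sheet values (not stated in the thread):
# RX 6800 XT: 72 CUs * 64 lanes = 4608 lanes at a 2.25 GHz boost clock.
rx_6800_xt = peak_fp32_tflops(4608, 2.25)
# RTX 3080: 8704 FP32 lanes at a 1.71 GHz boost clock.
rtx_3080 = peak_fp32_tflops(8704, 1.71)

print(f"RX 6800 XT: {rx_6800_xt:.2f} TFLOPS, RTX 3080: {rtx_3080:.2f} TFLOPS")
```

    The number says nothing about how often those lanes can actually be kept fed with data, which is why it compares poorly across architectures.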
     
    Cyan likes this.
  17. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,960
    Likes Received:
    4,144
    Location:
    Finland
    Because FLOPS are just the theoretical maximum output of the ALUs at FP32. And in the case of Ampere, the raw TFLOPS number is misleading for game performance compared to Turing.
     
  18. hkultala

    Regular

    Joined:
    May 22, 2002
    Messages:
    296
    Likes Received:
    38
    Location:
    Herwood, Tampere, Finland
    Nothing odd in this number, it's 2.25 GHz clock and 4096-bit total bus width to the cache.

    2.25 GHz * 512 bytes = 1152 GB/s.

    1152 GB/s + 512 GB/s = 1664 GB/s

    1664 GB/s / 768 GB/s = 2.16666 ~ 2.17
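
    The same chain of numbers, written out as a short Python check. The 2.25 GHz clock and 4096-bit width are the post's own figures; the 512 GB/s and 768 GB/s DRAM values assume 16 Gbps GDDR6, as earlier in the thread. Variable names are illustrative.

```python
clock_ghz = 2.25          # cache clock stated in the post
bus_width_bits = 4096     # total bus width to the cache stated in the post

bytes_per_clock = bus_width_bits // 8       # 512 bytes per clock
cache_gbs = clock_ghz * bytes_per_clock     # 1152.0 GB/s

dram_256_gbs = 512.0      # 256-bit GDDR6 at an assumed 16 Gbps
dram_384_gbs = 768.0      # 384-bit GDDR6 at an assumed 16 Gbps

combined_gbs = cache_gbs + dram_256_gbs     # 1664.0 GB/s
ratio = combined_gbs / dram_384_gbs         # rounds to the slide's 2.17x

print(f"cache: {cache_gbs} GB/s, combined: {combined_gbs} GB/s, ratio: {ratio:.2f}x")
```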
     
    fellix, BRiT, NightAntilli and 12 others like this.
  19. CarstenS

    Legend Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,546
    Likes Received:
    3,475
    Location:
    Germany
    If I had to guess based on the artist's impression(tm), I'd say off-chip IF links.
     
    tsa1 and Lightman like this.
  20. pTmdfx

    Regular Newcomer

    Joined:
    May 27, 2014
    Messages:
    390
    Likes Received:
    355
    If you assume the cache is memory-side, the upper bound would be 32 byte/clk * 2 (bidirectional) * 16 channels, based on Navi 10 data points on L2-to-MC bandwidth.
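
    A minimal sketch of that upper bound in Python. The per-clock figure comes straight from the post; the clock frequency is not given there, so the 2.25 GHz used below is purely an assumption borrowed from the earlier estimate in this thread, and the real clock domain of that interface may differ.

```python
BYTES_PER_CLK_PER_DIRECTION = 32   # from the post (Navi 10 L2-to-MC data point)
DIRECTIONS = 2                     # bidirectional
CHANNELS = 16

bytes_per_clock = BYTES_PER_CLK_PER_DIRECTION * DIRECTIONS * CHANNELS  # 1024

def upper_bound_gbs(clock_ghz):
    """Upper-bound bandwidth in GB/s for a given (assumed) clock in GHz."""
    return bytes_per_clock * clock_ghz

# Hypothetical clock, not stated in the post.
print(f"{bytes_per_clock} bytes/clk -> {upper_bound_gbs(2.25):.0f} GB/s at 2.25 GHz")
```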
     