AMD: Navi Speculation, Rumours and Discussion [2019]

Discussion in 'Architecture and Products' started by Kaotik, Jan 2, 2019.

  1. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    11,031
    Likes Received:
    5,573
    Regardless, when AMD mentioned the 50% power efficiency uplift on RDNA2 they were definitely talking about Big Navi (which they mentioned by name).
    So if it's not a 300W GPU with 25% higher performance than the 2080 Ti, then it's a 250W part with ~5% higher performance.

    I don't think AMD is launching what they'd call "Big Navi" with a 225W TDP like Navi 10 XT.
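For anyone who wants to check the arithmetic behind the 300W and 250W scenarios above, here is a rough sketch. It assumes performance scales linearly with board power times perf/W, takes the 225W Navi 10 XT as the baseline, and uses the claimed +50% efficiency figure. The function and variable names are made up for illustration, and the 2080 Ti ratio is only what the "+25%" figure implies, not a measured value.

```python
def projected_perf(power_w, base_power_w=225.0, efficiency_gain=0.50):
    """Performance relative to the baseline card, assuming perf = power x perf/W.

    Linear scaling with power is an assumption, not a measured behaviour.
    """
    return (power_w / base_power_w) * (1.0 + efficiency_gain)


# Hypothetical Big Navi at two power targets, relative to a 225W 5700 XT.
big_navi_300w = projected_perf(300.0)
big_navi_250w = projected_perf(250.0)

# 2080 Ti performance relative to the 5700 XT, as implied by "300W = +25%".
implied_2080ti = big_navi_300w / 1.25

print(f"300W part vs. 5700 XT: {big_navi_300w:.2f}x")
print(f"250W part vs. 5700 XT: {big_navi_250w:.2f}x")
print(f"Implied 2080 Ti vs. 5700 XT: {implied_2080ti:.2f}x")
print(f"250W part vs. 2080 Ti: {big_navi_250w / implied_2080ti - 1.0:+.0%}")
```

Under these assumptions the 250W case lands around 4% above the 2080 Ti, which lines up with the "~5%" quoted above.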
     
  2. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,934
    Likes Received:
    2,264
    Location:
    Germany
    I know what he wrote, no need to get overexcited. :)

    I did this fun exercise of yours with Vega56/64 and RX 5700/5700XT respectively, using the TFlops-data from Techpowerup and the percentages from your screenshot.

    Vega 56 -> 64: +20 % TFlops, +9 % performance (I say again: in the screenshot you posted)
    RX 5700 -> 5700 XT: +23 % TFlops, +13 % Performance. (I say again: in the screenshot you posted)
    edit: Don't think it's an AMD bashing:
    RTX 2070 -> 2080 Super: +49 % TFlops, +32 % Performance (I say again: in the screenshot you posted)
    Please ignore the above; I meant to use the 2070 Super, which is also based on TU104:
    RTX 2070 Super -> 2080 Super: +23 % TFlops and +15 % performance
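To put the same-chip pairs above on one scale, a small sketch like this divides the performance gain by the TFlops gain. The percentages are the ones quoted in the post; the `scaling_ratio` helper and the dictionary layout are just illustrative.

```python
def scaling_ratio(tflops_gain_pct, perf_gain_pct):
    """Share of the added theoretical throughput that shows up as performance."""
    return perf_gain_pct / tflops_gain_pct


# (TFlops gain in %, performance gain in %) as quoted in the post.
pairs = {
    "Vega 56 -> Vega 64": (20, 9),
    "RX 5700 -> RX 5700 XT": (23, 13),
    "RTX 2070 Super -> RTX 2080 Super": (23, 15),
}

ratios = {name: scaling_ratio(t, p) for name, (t, p) in pairs.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} of the extra TFlops shows up as performance")
```

With these inputs, roughly 45% to 65% of the extra TFlops turns into performance within the same chip.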


    I'll take all of these gladly, once they manifest. Hey, I pay for my graphics card too and I would love stiffer competition and lower prices as well as the next dude.

    I couldn't care less about clock speeds as a single number on a sheet of (virtual) paper. On the contrary: usually, seemingly underwhelming clocks can indicate that an arch is not pushed to or beyond its breaking point.
     
    #2142 CarstenS, May 26, 2020
    Last edited: May 26, 2020
    Konan65, DavidGraham, pharma and 3 others like this.
  3. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,080
    Likes Received:
    2,949
    Location:
    Finland
    It all comes down to what you decide to pick as comparison points.
    For example, using TechPowerUp data:
    5500 XT > 5700 XT: +87% TFLOPS, +90% performance
    Or, if you want an even prettier picture, you could pick 5500 XT > 5700: +53% TFLOPS, +68% performance
     
    ethernity, ToTTenTranz and Tarkin1977 like this.
  4. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,934
    Likes Received:
    2,264
    Location:
    Germany
    I was picking[1] three points with the fewest variables (i.e. same underlying chips), as common sense would dictate. And it was not me who brought this kind of math in here.

    [1] Not even picking; those three were the first that came to mind.
     
  5. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,080
    Likes Received:
    2,949
    Location:
    Finland
    I wasn't trying to suggest you "picked" them because they support some specific point, just pointing out that they don't necessarily tell the whole story (also, RTX 2070 > 2080 Super are different chips).
    Sticking to the same underlying chip isn't necessarily the best way to see how a specific architecture scales either, at least when one is trying to guess the performance of an unreleased, different chip.
     
  6. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,934
    Likes Received:
    2,264
    Location:
    Germany
    Oh, damn, you're right. I meant to use the upgraded 2070 Super. Just a second.
    That's +23 % TFlops and +15 % performance

    At least it introduces the fewest variables, given the framing set out earlier in the thread. Apart from that, I'll let your point speak for itself.
     
    #2146 CarstenS, May 26, 2020
    Last edited: May 26, 2020
  7. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    11,031
    Likes Received:
    5,573
    I'm not excited.. Are you?


    This is not the fun exercise I did at all.
    The data point I used was not theoretical TFLOPs vs. effective gaming performance. I used power efficiency because that's what AMD has been using for RDNA (or ever since Raja left).



    [​IMG]



    For RDNA1 Navi 10, they claimed 50% power efficiency over Vega 10, which is what they delivered:

    [​IMG] [​IMG]

    1/0.64 = 1.56 = 56% higher power efficiency for 5700 XT vs. Vega 64, and for 5700 vs. Vega 56
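The conversion in the line above can be sketched as follows. It assumes the charts (not reproduced here) put the older card at 64% of the newer card's performance per watt, with the newer card normalised to 100%; the function name is just illustrative.

```python
def efficiency_uplift(relative_perf_per_watt):
    """Uplift of the reference card over a card sitting at the given fraction of it.

    Example: a card at 0.64 of the reference means the reference is
    1 / 0.64 times as efficient.
    """
    return 1.0 / relative_perf_per_watt - 1.0


uplift = efficiency_uplift(0.64)
print(f"Power efficiency uplift: {uplift:.0%}")
```

Note the direction of the division: a card at 64% does not mean a 36% uplift for the other one, it means 1/0.64, i.e. about 56%.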


    They did not lie about the power efficiency of Navi 10 over Vega 10. I'm not assuming AMD is lying about the power efficiency increase of Navi 2x over Navi 10, but you're free to, obviously.




    I wonder if you thought the same of Vega 10's underwhelming clocks vs. Pascal.
    Regardless, nvidia has a new 7nm chip of roughly the same die size as its 12nm predecessor, and they decreased its clocks despite having a 33% larger power budget.
    In fact, according to nvidia's own "virtual paper sheets", power efficiency per theoretical FP32 and FP64 throughput actually decreased, which is another oddity considering the supposed process gains.
    I mentioned several times that this could mean nothing to Ampere's consumer GPUs, but someone seems pretty eager to throw nvidia's own official data out the window.
     
    w0lfram and no-X like this.
  8. pharma

    Veteran Regular

    Joined:
    Mar 29, 2004
    Messages:
    3,532
    Likes Received:
    2,217
    From TechPowerUp's conclusion
    https://www.techpowerup.com/review/amd-radeon-rx-5700-xt/35.html
     
  9. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,080
    Likes Received:
    2,949
    Location:
    Finland
    ToTTenTranz, BRiT and CarstenS like this.
  10. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,934
    Likes Received:
    2,264
    Location:
    Germany
    Then I did misinterpret your starting point "It's not hard to assume a 80 CU Navi 2x will be 20-30% over the 2080 Ti. Just do Navi 10 x2 that's where you stand." from which you expanded into power consumption as well.


    I am not assuming anyone lies until I see tangible proof. But I don't buy any single hype-building marketing slide either. I just wait until the product arrives and see.

    To be brutally honest, this has been my conviction since the days of AMD's K5, and it has been proven time and again since then.
    I even bought (yes, my own money given to a graphics card company) a Vega 56 for my gaming machine - and surely not to troll around forums about how disappointed I was, which I wasn't.

    Who would do that? Ampere's book still has some leaves left unturned, I guess.
     
  11. Benetanegia

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    288
    Likes Received:
    189
    Could you point me in the right direction? The only official number I've seen in regards to power is the TDP itself, which says nothing at all about FP32 and FP64 efficiency.
     
  12. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    11,031
    Likes Received:
    5,573
    Care to point out where in that post I mention TFLOPs numbers, which you used in your comparison?
    80 CU Big Navi is simply the CU count that's been up on the rumor mill. You were the first one to bring TFLOPs to the table and CU count isn't enough to determine TFLOPs.

    All I wrote was that a 300W Big Navi would be up to 30% faster than a 2080 Ti, if Big Navi has a 300W TDP and AMD's claim of a 50% jump in power efficiency is as true as the 50% jump in efficiency they claimed for Navi 10 vs. Vega 10 (which turned out to be true).

    It says FP32 and FP64 throughput in regards to TDP. It also suggests a 1425MHz core clock for the 400W GA100, in contrast to the 1455MHz clocks for the 300W GV100.
    A lower clock on a similar sized chip with higher TDP, despite the jump to a (supposedly) significantly improved process node, together with a modest increase in FP32 and FP64 throughput.

    I could repeat the "this might have nothing to do with consumer Ampere though" disclaimer but somehow that keeps getting ignored...
     
  13. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,278
    Likes Received:
    3,523
    Which 5500XT? 4GB? 8GB? And at which resolution?
     
    pharma likes this.
  14. Bondrewd

    Regular Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    656
    Likes Received:
    309
    New TCs are very green, and very mean.
    I doubt it hits 400W workload power in generic non-GEMM FP32/64/you name it.
    Yeah the client Volta3 is a bit less impressive than some (ergo plebbitors) dream of.
    Still real solid product so.
     
  15. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,934
    Likes Received:
    2,264
    Location:
    Germany
    Yeah, my bad. By "80 CU Navi 2x will be 20-30% over the 2080 Ti. Just do Navi 10 x2 that's where you stand." you obviously meant something totally not connected to the number of CUs and frequency (which would be TFlops for instance), which is why you recommended "just to do Navi 10 x2".

    You seem to mistake me for someone who is contesting the possibility of this becoming reality. Instead I wrote "After 2+ years and on a full node advantage, I'm really looking forward to your math coming true."
     
    #2155 CarstenS, May 26, 2020
    Last edited: May 26, 2020
  16. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,934
    Likes Received:
    2,264
    Location:
    Germany
    1410 MHz, FWIW. And what GA100 seemingly has done is invest a large portion of "7nm goodness" into more transistors. They won't switch free of charge, and they won't come free in terms of clock speeds either, considering how tightly they are packed.
     
    pharma and DavidGraham like this.
  17. Bondrewd

    Regular Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    656
    Likes Received:
    309
    More specialization.
    You can also wait till Graphcore shrinks their vaporware for real many dumb xtors@mm^2 goodness.
     
  18. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,278
    Likes Received:
    3,523
    AMD measured perf/w on a standard test of The Division 2 running 1440p Ultra details.

    Your TPU aggregate perf/w chart is not accurate: it just combines performance numbers with official TDP claims and is not based on actual measurements. You would have to measure power consumption in each game, then do an aggregate chart.
    Also, NVLink requires significant power consumption.

    V100 SXM2 NVLink operates @1540MHz and 900GB/s HBM2 with a TDP of 300W
    V100S PCIe operates @1610MHz and 1100GB/s HBM2 with a TDP of 250W

    [​IMG]

    The V100S PCI-E shaves 50W of power despite running at faster clocks for the core and memory just because it dumps the NVLink for PCIe.
     
    pharma likes this.
  19. Benetanegia

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    288
    Likes Received:
    189
    It says, where? All your claims seem to be referencing the specs sheet, where we can see various throughput numbers and TDP. But it is you who's making the link 19.5 TFlops @ 400w, or so it seems. If not, if you've seen it somewhere, that's what I'm asking for. Personally, I've yet to see that TDP linked to any specific task, but it's immediately obvious to me that it probably refers to the TDP required for the 320 TFlops in tensor cores, which is a 2.5x increase over Volta. There's literally no logical reason to believe that Ampere FP32 "cores" are somehow less efficient despite a new node, while at the same time Ampere Tensor Cores are >2x as efficient, while also providing much more functionality at the same time.

    EDIT: Basically V100 did 16 FP32 TFlops and 130 TC FP16 Tflops. 130 / 16 = 8 times more.
    A100 does almost 20 and 320, so 16 times more. If the TCs were not the most power-consuming units in V100, they most definitely are in A100.
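As a rough sketch of the two calculations being argued over here: the first part divides peak FP32 throughput by TDP, which is the naive reading of the spec sheet, and the second part reproduces the tensor-to-FP32 ratios from the edit above. It uses the rounded figures quoted in this thread (16 and 19.5 FP32 TFlops, 130 and 320 tensor TFlops, 300W and 400W); whether the TDP is actually reached in plain FP32 work is exactly what is disputed, so treat the first number with care.

```python
def tflops_per_watt(tflops, tdp_w):
    """Peak throughput divided by TDP (a spec-sheet figure, not a measurement)."""
    return tflops / tdp_w


# Naive FP32 "efficiency" from the spec sheet numbers quoted in the thread.
v100_fp32_per_watt = tflops_per_watt(16.0, 300.0)
a100_fp32_per_watt = tflops_per_watt(19.5, 400.0)
fp32_change = a100_fp32_per_watt / v100_fp32_per_watt - 1.0

# Tensor core throughput relative to FP32 throughput on each chip.
v100_tc_ratio = 130.0 / 16.0
a100_tc_ratio = 320.0 / 20.0

print(f"FP32 TFlops per TDP watt, A100 vs. V100: {fp32_change:+.1%}")
print(f"V100 tensor/FP32 ratio: {v100_tc_ratio:.1f}x")
print(f"A100 tensor/FP32 ratio: {a100_tc_ratio:.1f}x")
```

The first figure comes out at roughly -9%, which is the "decrease" being claimed. The second pair shows the tensor cores going from about 8x to 16x the FP32 rate, which is the argument for the 400W being a tensor-workload number.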

    It's probably ignored because it's irrelevant until it is resolved whether or not AI/HPC has anything to "fix". I don't think there's anything suspect or out of place at all with GA100's "normal FP32 cores", so why would I discuss "something" being "different" in consumer Ampere?
     
    #2159 Benetanegia, May 26, 2020
    Last edited: May 26, 2020
    pharma and DavidGraham like this.
  20. neckthrough

    Newcomer

    Joined:
    Mar 28, 2019
    Messages:
    14
    Likes Received:
    31
    Seems reasonable if you're trying to do a perf comparison at roughly iso-power.

    5700XT = 225W
    Vega56 = 210W
    Vega64 = 295W

    5700XT vs. Vega56 is a 7% difference. Vega64 vs. 5700XT is a 31% difference.
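The two percentages above are just ratios of the listed board power figures; a minimal sketch (helper name is illustrative):

```python
def pct_more_power(higher_w, lower_w):
    """How much more power the first card draws than the second, in percent."""
    return (higher_w / lower_w - 1.0) * 100.0


# Board power figures as listed in the post.
tdp = {"5700XT": 225, "Vega56": 210, "Vega64": 295}

xt_vs_vega56 = pct_more_power(tdp["5700XT"], tdp["Vega56"])
vega64_vs_xt = pct_more_power(tdp["Vega64"], tdp["5700XT"])

print(f"5700XT vs. Vega56: {xt_vs_vega56:.0f}% more power")
print(f"Vega64 vs. 5700XT: {vega64_vs_xt:.0f}% more power")
```

So the 5700XT vs. Vega56 pairing is the one closer to iso-power.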
     
    pharma likes this.