Speculation: GPU Performance Comparisons of 2020 *Spawn*

Discussion in 'Architecture and Products' started by eastmen, Jul 20, 2020.

Thread Status:
Not open for further replies.
  1. Benetanegia

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    376
    Likes Received:
    385
    ModEdit: removed unnecessary descriptor

    The extreme examples aren't meant to be taken literally. BTW, Dreams for PS4 is a fully compute-shader-based game; I don't know its FP:INT mix. But while not 2.7x faster, as it would be with pure FP32, Ampere is still 2x faster on compute with the current mix.

    In the near future, for various games, the ratio could shift from what you mention to something like 2.1:1 to around 3.1:1, which is a minor change, yet on performance charts the 3080 would appear roughly 15% faster than it does now. Even more so if more compute is used in general instead of relying on dozens of render targets. That's how aggregate performance works. There are already very real "gaming workloads" where Ampere is over 2x faster...
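
    To make that aggregate-throughput arithmetic concrete, here is a minimal per-SM, per-clock sketch of my own, assuming a Turing SM can issue 64 FP32 + 64 INT32 ops per clock and an Ampere SM has 128 FP32-capable lanes of which 64 can alternatively run INT32; it ignores clocks, SM counts, bandwidth and scheduling details.

    Code:
    # Minimal per-SM, per-clock throughput model (my own sketch, not vendor data).
    # Assumed: a Turing SM issues up to 64 FP32 + 64 INT32 ops per clock; an
    # Ampere SM has 128 FP32-capable lanes, 64 of which can alternatively run INT32.

    def turing_ops_per_clock(fp_frac):
        int_frac = 1.0 - fp_frac
        # Throughput T is capped so that T*fp_frac <= 64 and T*int_frac <= 64.
        return 64.0 / max(fp_frac, int_frac)

    def ampere_ops_per_clock(fp_frac):
        int_frac = 1.0 - fp_frac
        # Total ops capped at 128/clk, INT32 ops capped at 64/clk.
        if int_frac == 0.0:
            return 128.0
        return min(128.0, 64.0 / int_frac)

    for fp, integer in [(2, 1), (3, 1), (1, 0)]:
        fp_frac = fp / (fp + integer)
        t = turing_ops_per_clock(fp_frac)
        a = ampere_ops_per_clock(fp_frac)
        print(f"FP:INT = {fp}:{integer}  Turing {t:5.1f}  Ampere {a:5.1f}  ratio {a / t:.2f}x")

    Under those assumptions a 2:1 mix gives about 1.33x per SM, a 3:1 mix about 1.5x, and pure FP32 2x, so a mix shift of that size would move the charts by roughly 10-15%, in the same ballpark as the figure above.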
     
    #701 Benetanegia, Oct 6, 2020
    Last edited by a moderator: Oct 7, 2020
    PSman1700, pharma and DegustatoR like this.
  2. pTmdfx

    Regular Newcomer

    Joined:
    May 27, 2014
    Messages:
    340
    Likes Received:
    278
    Without knowing what the circuitry actually looks like, I would refrain from drawing any "hardware is wasted" conclusion:

    1. Nvidia has apparently been giving their integer SIMD (and now a 2nd FP SIMD) an actual hardware pipeline and datapath (operand forwarding/routing + RF) for dual-issue arithmetic — GCN/RDNA doesn't have that.
    2. Floating-point and integer arithmetic units may share circuitry, and there is no way to know unless AMD releases the details publicly. Nvidia's choice to keep them separate does not imply AMD made the same choice.
    3. Maybe more variety of ops in the same pipeline is cheaper than dual-issue. Maybe not. I can't tell.

    This topic is likely an endless merry-go-round with the information currently available on this forum. It is kinda like arguing about which CPU cores "waste more hardware" by simply extrapolating from execution unit distributions across issue ports. :p
     
    #702 pTmdfx, Oct 6, 2020
    Last edited: Oct 6, 2020
  3. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    1,795
    Likes Received:
    713
    Location:
    msk.ru/spb.ru
    Not my choice of words. Should've used quotes on "wasting" in the previous post.
    I (and, I presume, NV and AMD too) don't care about "wasted" h/w as long as it helps provide competitive performance per transistor.
    Which is why this whole topic of how Ampere is "wasting" h/w seems completely pointless. There are tensor cores in Ampere which are about 99% wasted everywhere outside of games with DLSS - does that make them bad? Should they remove them and deprecate DLSS in future h/w?

    This is not what is shown in the schematics and written in the whitepaper, though. They clearly state that separate sets of "CUDA cores" (i.e. ALUs) are used on the same datapath. This datapath can lead to one SIMD with two sets of ALUs - just like in all other GPUs on the market.

    What does "share circuitry" even mean here? NV can't use these ALUs in parallel, which obviously means that they "share circuitry" too. The amount of such reuse may of course differ between architectures - but again, does it even matter?
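
    If it helps, here is a toy in-order issue model of how I read those diagrams (my interpretation, not confirmed low-level behaviour): datapath A is FP32-only, datapath B carries both an FP32 and an INT32 ALU set but accepts only one warp instruction per clock, so its two ALU sets never run in parallel; shared_datapath=False is a purely hypothetical design where they could.

    Code:
    # Toy in-order issue model of one SM sub-partition (an assumption, not
    # confirmed behaviour): datapath A is FP32-only; datapath B holds an FP32
    # and an INT32 ALU set but accepts only one warp instruction per clock.
    # shared_datapath=False models a hypothetical design where B's two ALU
    # sets could be fed in the same cycle.
    from collections import deque

    def cycles_to_drain(instrs, shared_datapath=True):
        pending = deque(instrs)  # stream of 'FP' / 'INT' warp instructions
        cycles = 0
        while pending:
            cycles += 1
            a_busy = b_fp_busy = b_int_busy = False
            while pending:
                op = pending[0]
                if op == 'FP' and not a_busy:
                    a_busy = True
                elif op == 'FP' and not b_fp_busy and not (shared_datapath and b_int_busy):
                    b_fp_busy = True
                elif op == 'INT' and not b_int_busy and not (shared_datapath and b_fp_busy):
                    b_int_busy = True
                else:
                    break  # in-order: stop at the first instruction that cannot issue
                pending.popleft()
        return cycles

    stream = ['FP', 'FP', 'INT'] * 100  # roughly a 2:1 FP:INT mix
    print("one datapath, two ALU sets :", cycles_to_drain(stream, shared_datapath=True), "cycles")
    print("hypothetical parallel issue:", cycles_to_drain(stream, shared_datapath=False), "cycles")

    At this mix the shared case takes about 1.5x the cycles of the hypothetical parallel one, which is the sense in which the second set of ALUs shares issue bandwidth rather than adding it.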

    Agreed.
     
  4. Leoneazzurro5

    Newcomer

    Joined:
    Aug 18, 2020
    Messages:
    84
    Likes Received:
    119
    ModEdit: Removed unnecessary bits

    I not only agree that Ampere has 2x the PEAK FP32 performance, I've written it myself. What I don't agree with is that this is representative of a typical gaming workload, and no, there is NO game out there with a 2x performance improvement over Turing at the same SM count. You are probably comparing the 3080 to the 2080, but the 3080 has the same SM count as the 2080 Ti, which has more bandwidth and lower base clocks, making that comparison useless. I will believe that 2x when it shows up in independent benchmarks, just as I wouldn't buy a claimed 2x of Navi 21 over Navi 10 without any real-world benchmarks.
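
    For what it's worth, a back-of-the-envelope normalisation like the one below is how I'd frame that comparison; the SM counts and reference boost clocks are the public spec-sheet figures as I recall them, and the fps values are placeholders to be replaced with real, independent benchmark results.

    Code:
    # Per-SM, per-clock normalisation of benchmark numbers. SM counts and
    # reference boost clocks are spec-sheet values as I recall them; the fps
    # figures are PLACEHOLDERS, not real measurements.
    CARDS = {
        #                SMs, boost GHz, measured fps (placeholder)
        "RTX 2080":     (46, 1.710,  60.0),
        "RTX 2080 Ti":  (68, 1.545,  75.0),
        "RTX 3080":     (68, 1.710, 100.0),
    }

    for name, (sms, ghz, fps) in CARDS.items():
        print(f"{name:12s} {fps / sms:6.2f} fps/SM  {fps / (sms * ghz):6.2f} fps/(SM*GHz)")

    Dividing out SM count and clock is what makes a 3080-vs-2080 number comparable to a 3080-vs-2080 Ti one, though it still ignores bandwidth and power limits.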
     
    #704 Leoneazzurro5, Oct 7, 2020
    Last edited by a moderator: Oct 7, 2020
    w0lfram likes this.
  5. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    17,275
    Likes Received:
    17,678
    I suggest everyone take a break from the discussion and remember that this is supposed to be a technical forum for open, positive discussion. Cooler and more logical heads should prevail.

    This is a complete mess and is going to take a long time to sort out how to get everything back on track. Le Sigh.
     