Nvidia Ampere Discussion [2020-05-14]

Discussion in 'Architecture and Products' started by Man from Atlantis, May 14, 2020.

  1. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,055
    Likes Received:
    3,111
    Location:
    New York
    Most thorough review I've seen this round. Will have to remember to check them out for RDNA2 as well.
     
  2. Rootax

    Veteran

    Joined:
    Jan 2, 2006
    Messages:
    2,400
    Likes Received:
    1,845
    Location:
    France
    Maybe a stupid question, but is the driver a key component in "fully" utilizing the doubled FP32 units (when they're not being used for INT32), or is it mostly a hardware thing?
     
  3. RedVi

    Regular

    Joined:
    Sep 12, 2010
    Messages:
    407
    Likes Received:
    59
    Location:
    Australia
    The 2000 series doesn't have HDMI 2.1 output, so it can only do 4K G-Sync up to 60 Hz. Not exactly that special, unfortunately.
     
  4. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,708
    Likes Received:
    2,132
    Location:
    London
    The driver contains the shader compiler. So it can make a difference. We could expect that NVidia will improve the shader compiler. On the other hand shader compilation is something you can refine years ahead of the silicon arriving.

    I found an analysis I did 12 years ago on the compilation of Perlin Noise. Interestingly, on AMD's VLIW-5 GPUs, the utilisation was about 89%. G80 was about 5% more efficient. This means that instruction dependency was not that significant.

    So that makes me even more puzzled why Ampere is "slow". The texturing workload is not substantial, so I don't believe that's relevant.
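
To put rough numbers on the "double FP32 when it's not used for INT32" point, here is a toy issue model in Python. It is only a sketch under stated assumptions: each SM partition is treated as two datapaths, instruction dependencies, register bandwidth, scheduler limits and memory are ignored, and the function names and the 100:36 instruction mix are illustrative choices, not anything measured in this thread.

```python
def cycles_turing_style(fp_ops: int, int_ops: int) -> float:
    """Toy model: one FP32-only path and one INT32-only path run side by side."""
    return max(fp_ops, int_ops)


def cycles_ampere_style(fp_ops: int, int_ops: int) -> float:
    """Toy model: path A is FP32-only, path B issues either FP32 or INT32.

    INT32 work can only go to path B, so it sets a lower bound; otherwise the
    total work is split evenly across the two paths.
    """
    return max(int_ops, (fp_ops + int_ops) / 2)


def speedup(fp_ops: int, int_ops: int) -> float:
    """Ideal speedup of the shared-path layout over the split-path layout."""
    return cycles_turing_style(fp_ops, int_ops) / cycles_ampere_style(fp_ops, int_ops)


if __name__ == "__main__":
    # Hypothetical instruction mixes: (FP32 ops, INT32 ops)
    for fp_ops, int_ops in [(100, 0), (100, 36), (50, 50)]:
        print(fp_ops, int_ops, round(speedup(fp_ops, int_ops), 2))
```

Even in this idealised model the 2x figure only appears for pure FP32 code; any integer work on the shared path eats into it, and that is before the compiler or the rest of the pipeline gets a say.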
     
    CarstenS and Rootax like this.
  5. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    15,134
    Likes Received:
    7,679


    Looks like you can get a very minor overclock in the 50-70 MHz range with a slight undervolt, which actually beats trying to overclock with a +100 MHz core offset on this Gigabyte card. I think the best play will be to set the power limit as high as the BIOS allows in MSI Afterburner or EVGA Precision X1 and then find the highest frequency you can maintain under the power limit with undervolting. A 100% stable clock is much better than having the clock jumping around. Interesting that there are some comments saying ray tracing is more sensitive to undervolting, and you may find undervolts that are stable in raster games but crash in ray-traced games. It'll probably be necessary to use something like Port Royal to check overclock and undervolt results.
     
  6. NightAntilli

    Newcomer

    Joined:
    Oct 8, 2015
    Messages:
    104
    Likes Received:
    131
    The 3080 really reminds me of the Vega 56/64 cards.
     
  7. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    15,134
    Likes Received:
    7,679
    Yep.
     
  8. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,055
    Likes Received:
    3,111
    Location:
    New York
    Lodix and pharma like this.
  9. SimBy

    Regular

    Joined:
    Jun 21, 2008
    Messages:
    700
    Likes Received:
    391
    We began shipping GPUs to our partners in August, and have been increasing the supply weekly.

    So how long does it take for, first, partners to actually get the chips; second, partners to make the cards; and third, partners to ship those cards to retailers around the world? Shipping anything around the world right now is a nightmare.

    I don't see any real supply until 2021. Same goes for AMD probably.
     
  10. yuri

    Regular

    Joined:
    Jun 2, 2010
    Messages:
    283
    Likes Received:
    296
    The RTX 3000 cards are undoubtedly the most powerful cards on the market. Unlike the Vegas.

    However, statements like "wait for the games to catch up" or "this is not the full potential" bring back memories. HD 2900 was like: "This is a DX11 card, wait for games to catch up!". GTX 480 was: "Wait for games to finally utilize all the geometry stuff!". Vega was like: "Wait for the drivers to utilize DSBR, NGG, HBCC and games to use FP16!"...
     
    DavidGraham and sonen like this.
  11. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    15,134
    Likes Received:
    7,679
    I think the difference is that all of the features for Ampere are standardized in DirectX Ultimate and are available on Xbox Series X. You don't have to optimize for Ampere specifically. It's just the standard feature set for D3D. Example: Mesh Shaders will leverage the compute power of Ampere, Xbox Series X and the upcoming RDNA2 GPUs.
     
    PSman1700 and sonen like this.
  12. Clukos

    Clukos Bloodborne 2 when?
    Veteran

    Joined:
    Jun 25, 2014
    Messages:
    4,688
    Likes Received:
    4,353
    I ordered a 3080 TUF from Overclockers UK within the first 3 hours and I'm probably going to get it in November (if even that). Worst product launch I've witnessed in the past decade :lol:

    At least I can cancel and get an RDNA2 GPU if AMD delivers the goods.
     
    #1772 Clukos, Sep 22, 2020
    Last edited: Sep 22, 2020
    Lightman likes this.
  13. Rootax

    Veteran

    Joined:
    Jan 2, 2006
    Messages:
    2,400
    Likes Received:
    1,845
    Location:
    France
    Well, if Big Navi is awesome, I can see the same kind of shortage...
     
  14. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    Yep, except HD2900 was a DX10 card.
     
    Lightman, yuri and PSman1700 like this.
  15. yuri

    Regular

    Joined:
    Jun 2, 2010
    Messages:
    283
    Likes Received:
    296
    Of course, you are right. Time flies...
     
    Frenetic Pony likes this.
  16. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    832
    Likes Received:
    505
    From the iXBT review (BTW, very welcome to see this kind of low-level feature benchmarking again, brings back memories of hardware.fr), the TMUs have reportedly been upgraded, doubling texel read speed when filtering is not used. These kinds of TMU reads are often used in compute shaders. That is pretty cool.
     
    #1776 Voxilla, Sep 22, 2020
    Last edited: Sep 22, 2020
  17. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,708
    Likes Received:
    2,132
    Location:
    London
    Why doesn't GA100 have 128x FP32 per SM like GA102? Why does it only have 64? For a compute card that seems like a major omission.
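
For scale, the usual peak-FP32 arithmetic (SMs x FP32 lanes per SM x 2 FLOPs per FMA x clock) can be sketched in Python. The SM counts and boost clocks below are the published figures for the RTX 3080 (GA102) and A100 (GA100) as I recall them, not numbers taken from this thread, so treat them as assumptions; the function name is made up for illustration.

```python
def peak_fp32_tflops(sm_count: int, fp32_lanes_per_sm: int, boost_clock_ghz: float) -> float:
    """Theoretical peak FP32 rate, counting one fused multiply-add as 2 FLOPs."""
    gflops = sm_count * fp32_lanes_per_sm * 2 * boost_clock_ghz
    return gflops / 1000.0


# Assumed published specs (not from the thread):
# RTX 3080 (GA102): 68 SMs, 128 FP32 lanes per SM, about 1.71 GHz boost
# A100 (GA100):     108 SMs, 64 FP32 lanes per SM, about 1.41 GHz boost
rtx_3080 = peak_fp32_tflops(68, 128, 1.71)
a100 = peak_fp32_tflops(108, 64, 1.41)

if __name__ == "__main__":
    print(f"RTX 3080: {rtx_3080:.2f} TFLOPS, A100: {a100:.2f} TFLOPS")
```

On paper the much smaller gaming part comes out well ahead on plain FP32, which is what makes the 64-lane SM look odd for a compute card.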
     
  18. Cat Merc

    Newcomer

    Joined:
    May 14, 2017
    Messages:
    161
    Likes Received:
    179
    Probably just die size reasons. NVIDIA couldn't make the die bigger even if they really wanted to, so they'd have to cut out other parts to do it. As for why not reduce SM count to fit double ALU per SM, balance of resources is the logical answer here.
     
  19. troyan

    Regular

    Joined:
    Sep 1, 2015
    Messages:
    603
    Likes Received:
    1,123
    FP32 doesn't matter for GA100. For training they will use TF32 by default.
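
For anyone unfamiliar with TF32: it keeps FP32's 8-bit exponent but only 10 mantissa bits. The Python sketch below emulates that precision loss by clearing the low 13 mantissa bits of a float32 value. It is an illustration only: the helper name is made up, and plain truncation is an assumption here, since the rounding done by the actual hardware may differ.

```python
import struct


def to_tf32(x: float) -> float:
    """Reduce a value to TF32-like precision (8-bit exponent, 10-bit mantissa).

    Assumption: the low mantissa bits are simply truncated.
    """
    # Reinterpret the value as the 32 raw bits of an IEEE 754 float32.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # float32 has 23 mantissa bits and TF32 keeps 10, so drop the low 13.
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (1.0, 1.0 + 2**-10, 1.0 + 2**-11, 3.14159265):
        print(value, to_tf32(value))
```

The range is the same as FP32 while the precision is close to FP16, which is why it can stand in for FP32 in training without the FP32 ALU count mattering much.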
     
    LeStoffer likes this.
  20. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    3,240
    Likes Received:
    3,395
    Another possible scenario is that GA100 was made considerably earlier than GA10x and the updated FP32/INT h/w wasn't ready for it. We've seen something similar between Volta and Turing previously.
     