Nvidia Ampere Discussion [2020-05-14]

Discussion in 'Architecture and Products' started by Man from Atlantis, May 14, 2020.

  1. JasonLD

    Regular

    Joined:
    Apr 3, 2004
    Messages:
    421
    Likes Received:
    68
    It doesn't matter. It isn't like the Xbox/PS5 is going to turn into a full-fledged Windows 10/Linux machine. It is still going to be playing games and streaming media content.
    There is nothing about next-generation consoles that will disrupt anything beyond the console gaming market.
     
    xpea and PSman1700 like this.
  2. Benetanegia

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    394
    Likes Received:
    425
    That makes the most sense to me. I had already thought about it several months back, when the 2x FP32 rumor started, and I think it is the most efficient way of doing it without increasing register size/bandwidth or wrecking it (Kepler). Plus, 10k FP32 + 5k INT32 units sounds like a ridiculously high number. 10k units, where half of them share FP and INT ops, seems more reasonable.

    Edit: Either way fun times ahead with TFLOP reporting. lol
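To put rough numbers on the TFLOP reporting point, here is a minimal sketch of the arithmetic. It assumes the usual marketing convention of 2 FLOPs per lane per clock (one fused multiply-add), the rounded "10k" lane count from the post, and a made-up 1.5 GHz clock. The function names and the `int_share` parameter are hypothetical, and the 50/50 split between FP32-only and shared FP32/INT32 lanes is the layout speculated above, not a confirmed design.

```python
def peak_fp32_tflops(fp32_lanes: int, clock_ghz: float) -> float:
    """Headline TFLOPS: every lane retires one FMA (2 FLOPs) each clock."""
    return 2 * fp32_lanes * clock_ghz / 1000.0


def effective_fp32_tflops(total_lanes: int, clock_ghz: float, int_share: float) -> float:
    """FP32 rate when half the lanes are shared between FP32 and INT32 work.

    int_share is the fraction of time the shared lanes spend on INT32
    (0.0 = always FP32, 1.0 = always INT32). Purely illustrative.
    """
    if not 0.0 <= int_share <= 1.0:
        raise ValueError("int_share must be between 0 and 1")
    dedicated = total_lanes // 2           # FP32-only lanes
    shared = total_lanes - dedicated       # lanes that do FP32 or INT32
    fp_lanes = dedicated + shared * (1.0 - int_share)
    return 2 * fp_lanes * clock_ghz / 1000.0


if __name__ == "__main__":
    lanes, clock = 10_000, 1.5  # assumed values, not leaked specs
    print("headline:", peak_fp32_tflops(lanes, clock))
    for share in (0.0, 0.5, 1.0):
        print("int_share", share, "->", effective_fp32_tflops(lanes, clock, share))
```

Under these assumptions the headline figure only holds when the shared lanes see no integer work at all, which is why the reported number and the achieved number could drift apart.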
     
  3. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,590
    Likes Received:
    4,317
    It's not a dual GPU. NVIDIA dumped those a long time ago, alongside their SLI initiative. I have a strong feeling it's something far more sinister, given NVIDIA's marketing push and the comparison to the first industry GPU.

    This whole launch screams secrets and surprises, first GDDR6X out of the blue, then a sudden imminent launch, now 2X FP32 units, and who knows what's there to uncover?
     
    PSman1700 likes this.
  4. Benetanegia

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    394
    Likes Received:
    425
    I didn't say that it is dual-GPU, yes or yes, based on the points I argued*. And I certainly didn't say or think of SLI. Maybe the surprise is they finally figured out multi-GPU rendering. I mean, they must have, because Hopper is supposed to be a chiplet design, but maybe they can actually do it now, and that is the surprise.

    *I don't think you guys understand what Occam's Razor means:

    https://en.wikipedia.org/wiki/Occam's_razor
     
    nnunn likes this.
  5. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,891
    Likes Received:
    4,079
    Location:
    Finland
    This one, it's the same cooler without the shroud, fans or PCB
    upload_2020-8-15_4-7-13.png
     
  6. Malo

    Malo Yak Mechanicum
    Legend Veteran Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    8,377
    Likes Received:
    4,815
    Location:
    Pennsylvania
    I think it's rather disappointing that the apparent halo 3090 has only 12GB of VRAM.
     
  7. Benetanegia

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    394
    Likes Received:
    425
    That's the one. In my mind the PCB was sandwiched in between, somehow. But like I said, that cooler is for the 3080 anyway, as per your other pic, and goes alongside a V-shaped PCB which barely has space for 1 GPU, let alone 2. So I was wrong on that.
     
  8. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    4,027
    Likes Received:
    90
    You and me both. How many years is it now with 12GB at the top of the product stack? Yeah, I know the Titan RTX has 24GB but I don’t consider a $3000 GPU a consumer product.

    Also, I can’t believe the GDDR6X rumor was true. Sounded like complete bs. I hope the double FP throughput rumor turns out to be true. 2080 Ti going up on eBay!
     
  9. troyan

    Regular Newcomer

    Joined:
    Sep 1, 2015
    Messages:
    353
    Likes Received:
    692
    Didn't they do the same with GDDR5X? It was only accessible to and used by nVidia.
     
    yuri, chris1515 and PSman1700 like this.
  10. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    11,314
    Likes Received:
    1,947
    Location:
    New York
    You're right, I never noticed that. Polaris was GDDR5 and all the high-end stuff after that was HBM.
     
  11. Bondrewd

    Veteran Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    1,242
    Likes Received:
    581
    No, G5X was an actual JEDEC spec.
     
  12. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,414
    Likes Received:
    1,963
    Location:
    msk.ru/spb.ru
    Do we have any idea on how FP64, FP32 and INT32 are scheduled on GA100? Can they all run in parallel?

    For reference:

    [IMG]
     
  13. troyan

    Regular Newcomer

    Joined:
    Sep 1, 2015
    Messages:
    353
    Likes Received:
    692
    A100 can schedule two FP16 vec2 operations concurrently on the FP32 and TensorCores.
     
    DavidGraham and Bondrewd like this.
  14. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,414
    Likes Received:
    1,963
    Location:
    msk.ru/spb.ru
    It can? Turing used TCs for all FP16 math AFAIK; there was no FP16 capability in the main FP32 SIMDs. It also can't run FP32+INT32 and TCs concurrently.
     
  15. Samwell

    Newcomer

    Joined:
    Dec 23, 2011
    Messages:
    134
    Likes Received:
    158
    Yes, because of that, A100 has a 4x FP16 rate compared to FP32. Could you maybe use the normal FP32 units and somehow the FP32 mode of the tensor cores to double the theoretical throughput?
     
  16. Qesa

    Newcomer

    Joined:
    Feb 23, 2020
    Messages:
    28
    Likes Received:
    48
    The tensor core has double the throughput of Volta all on its own; vector fp16 also doubling isn't surprising.
     
    PSman1700 likes this.
  17. Bondrewd

    Veteran Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    1,242
    Likes Received:
    581
    Isn't FP32 from A100 TCs a non-IEEE one?
     
  18. Qesa

    Newcomer

    Joined:
    Feb 23, 2020
    Messages:
    28
    Likes Received:
    48
    Yeah, missing 13 bits of mantissa
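For anyone who wants to see what losing 13 mantissa bits does to a value, here is a minimal sketch. FP32 has a 23-bit mantissa, so dropping 13 leaves 10 (the format NVIDIA calls TF32). The helper name is hypothetical, and it simply truncates toward zero; how the actual hardware rounds its inputs is an assumption I'm not making here.

```python
import struct

FP32_MANTISSA_BITS = 23
DROPPED_BITS = 13  # per the post above; 23 - 13 = 10 bits remain


def drop_low_mantissa_bits(x: float, dropped_bits: int = DROPPED_BITS) -> float:
    """Round-trip x through FP32 and zero the lowest mantissa bits.

    This is plain truncation (round toward zero), used only to
    illustrate the reduced precision, not to model real hardware.
    """
    if not 0 <= dropped_bits <= FP32_MANTISSA_BITS:
        raise ValueError("dropped_bits must be between 0 and 23")
    bits = struct.unpack("<I", struct.pack("<f", x))[0]   # FP32 bit pattern
    mask = 0xFFFFFFFF ^ ((1 << dropped_bits) - 1)         # clear the low bits
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


if __name__ == "__main__":
    for value in (1.0, 1.0 + 2**-10, 1.0 + 2**-11, 3.14159265):
        print(value, "->", drop_low_mantissa_bits(value))
```

The smallest step above 1.0 that survives is 2^-10; anything finer, such as 1 + 2^-11, collapses back to 1.0.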
     
  19. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,414
    Likes Received:
    1,963
    Location:
    msk.ru/spb.ru
    This rate is maintained on all TC precision modes, though, which means that it's not coming from the FP32 SIMDs, no?

    I see two possibilities for gaming Ampere here:

    1. Double width FP32 SIMDs which will likely lead to a double width of INT32 SIMD as well. They've done this previously between GP100 and GP10x.

    2. A second 16-wide FP32 SIMD in place of the FP64 one of GA100. But for that to work well they'll need to be able to schedule FP32+FP32+INT32 or it will be either FP32+FP32 or FP32+INT32 per clock which will result in utilization issues.
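The utilization concern in the second possibility can be sketched with a toy issue model. Everything here is an assumption for illustration: two FP32 pipes and one INT32 pipe, no dependencies or latencies, and a scheduler that issues either up to three instructions per clock (FP32+FP32+INT32) or only two (FP32+FP32 or FP32+INT32). The function names and the 100/50 instruction mix are made up.

```python
from math import ceil


def cycles_three_issue(fp32_ops: int, int32_ops: int) -> int:
    """FP32+FP32+INT32 can all issue in the same clock."""
    return max(ceil(fp32_ops / 2), int32_ops)


def cycles_two_issue(fp32_ops: int, int32_ops: int) -> int:
    """Only two instructions per clock: FP32+FP32 or FP32+INT32.

    Limits per clock: at most 2 instructions in total, at most 1 INT32.
    """
    return max(ceil((fp32_ops + int32_ops) / 2), int32_ops)


def fp32_utilization(fp32_ops: int, cycles: int) -> float:
    """Fraction of the two FP32 pipes' issue slots that did useful work."""
    return fp32_ops / (2 * cycles) if cycles else 0.0


if __name__ == "__main__":
    fp, integer = 100, 50  # hypothetical instruction mix
    for name, fn in (("three-issue", cycles_three_issue),
                     ("two-issue", cycles_two_issue)):
        cycles = fn(fp, integer)
        print(name, cycles, "cycles, FP32 utilization",
              round(fp32_utilization(fp, cycles), 3))
```

In this toy model the two-issue scheduler leaves one FP32 pipe idle on every clock that carries an INT32 instruction, which is the utilization issue described above.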
     
  20. troyan

    Regular Newcomer

    Joined:
    Sep 1, 2015
    Messages:
    353
    Likes Received:
    692