Nvidia Ampere Discussion [2020-05-14]

Discussion in 'Architecture and Products' started by Man from Atlantis, May 14, 2020.

  1. JasonLD

    JasonLD Regular

    It doesn't matter. It isn't like the Xbox/PS5 is going to turn into a full-fledged Windows 10/Linux machine. It is still going to be playing games and streaming media content.
    There is nothing about the next-generation consoles that will disrupt anything beyond the console gaming market.
     
    xpea and PSman1700 like this.
  2. Benetanegia

    Benetanegia Regular

    That makes the most sense to me. I had already thought about it several months back, when the 2x FP32 rumor started, and it still seems like the most efficient way of doing it without increasing register size/bandwidth or wrecking it (as with Kepler). Plus, 10k FP32 + 5k INT32 units sounds like a ridiculously high number. 10k units, half of which share FP and INT ops, seems more reasonable.

    Edit: Either way fun times ahead with TFLOP reporting. lol
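
    To illustrate why TFLOP reporting gets murky with shared FP/INT lanes, here is a rough back-of-envelope sketch. It assumes the usual convention of counting one fused multiply-add per lane per clock as 2 FLOPs. The 1.7 GHz clock and the function name `peak_tflops` are made up for illustration and are not known specs.

```python
# Back-of-envelope peak FP32 throughput.
# Convention: each lane retires one FMA per clock, counted as 2 FLOPs.

def peak_tflops(fp32_lanes: int, clock_ghz: float) -> float:
    """Theoretical peak in TFLOPS = lanes * 2 FLOPs * clock (GHz) / 1000."""
    return fp32_lanes * 2 * clock_ghz / 1000.0

# Hypothetical numbers for illustration only.
LANES = 10_000      # the "10k" figure from the post above
CLOCK_GHZ = 1.7     # assumed clock, not a known spec

# Headline number: every lane is doing FP32 work.
all_fp = peak_tflops(LANES, CLOCK_GHZ)

# If the shared half of the lanes is busy with INT32 work,
# only the other half contributes FP32 FLOPs.
shared_half_on_int = peak_tflops(LANES // 2, CLOCK_GHZ)

print(f"all lanes on FP32:    {all_fp:.1f} TFLOPS")
print(f"shared half on INT32: {shared_half_on_int:.1f} TFLOPS")
```

    The headline figure would only be reached when the shader mix contains no INT32 work at all, which is the crux of the reporting question.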
     
  3. DavidGraham

    DavidGraham Veteran

    It's not a dual GPU. NVIDIA dumped those a long time ago, alongside their SLI initiative. I have a strong feeling it's something far more sinister, given NVIDIA's marketing push and the comparison to the industry's first GPU.

    This whole launch screams secrets and surprises, first GDDR6X out of the blue, then a sudden imminent launch, now 2X FP32 units, and who knows what's there to uncover?
     
    PSman1700 likes this.
  4. Benetanegia

    Benetanegia Regular

    I didn't say that it is definitely dual-GPU based on the points I argued*. And I certainly didn't say or think of SLI. Maybe the surprise is that they finally figured out multi-GPU rendering. I mean, they must have at some point, because Hopper is supposed to be a chiplet design, but maybe they can actually do it now, and that is the surprise.

    *I don't think you guys understand what Occam's Razor means:

    https://en.wikipedia.org/wiki/Occam's_razor
     
    nnunn likes this.
  5. Kaotik

    Kaotik Drunk Member Legend

    This one. It's the same cooler without the shroud, fans or PCB.
    [Attachment: upload_2020-8-15_4-7-13.png]
     
  6. Malo

    Malo Yak Mechanicum Legend Subscriber

    I think it's rather disappointing that the apparent halo 3090 has only 12GB of VRAM.
     
  7. Benetanegia

    Benetanegia Regular

    That's the one. In my mind the PCB was sandwiched in between, somehow. But like I said, that cooler is for the 3080 anyway, as per your other pic, and goes alongside a V-shaped PCB which barely has space for 1 GPU, let alone 2. So I was wrong on that.
     
  8. ShaidarHaran

    ShaidarHaran hardware monkey Veteran

    You and me both. How many years is it now with 12GB at the top of the product stack? Yeah, I know the Titan RTX has 24GB but I don’t consider a $3000 GPU a consumer product.

    Also, I can’t believe the GDDR6X rumor was true. It sounded like complete bs. I hope the double FP throughput rumor turns out to be true. 2080 Ti going up on eBay!
     
  9. troyan

    troyan Regular

    Didn't they do the same with GDDR5X? It was only accessible to and used by nVidia.
     
    yuri, chris1515 and PSman1700 like this.
  10. trinibwoy

    trinibwoy Meh Legend

    You're right, I never noticed that. Polaris was GDDR5 and all the high-end stuff after that was HBM.
     
  11. Bondrewd

    Bondrewd Veteran

    No, G5X was an actual JEDEC spec.
     
  12. DegustatoR

    DegustatoR Veteran

    Do we have any idea on how FP64, FP32 and INT32 are scheduled on GA100? Can they all run in parallel?

    For reference:

    [Image]
     
  13. troyan

    troyan Regular

    A100 can schedule two FP16 vec2 operations concurrently, one on the FP32 units and one on the Tensor Cores.
     
    DavidGraham and Bondrewd like this.
  14. DegustatoR

    DegustatoR Veteran

    It can? Turing used TCs for all FP16 math AFAIK; there was no FP16 capability in the main FP32 SIMDs. It also can't run FP32+INT32 and TCs concurrently.
     
  15. Samwell

    Samwell Newcomer

    Yes, and because of that A100 has 4x the FP16 rate compared to FP32. Could you maybe use the normal FP32 units and somehow FP32 from the tensor cores to double the theoretical throughput?
     
  16. Qesa

    Qesa Newcomer

    The tensor core has double the throughput of Volta all on its own; vector fp16 also doubling isn't surprising.
     
    PSman1700 likes this.
  17. Bondrewd

    Bondrewd Veteran

    Isn't FP32 from A100 TCs a non-IEEE one?
     
  18. Qesa

    Qesa Newcomer

    Yeah, missing 13 bits of mantissa
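
    A small sketch of what losing those 13 bits looks like in practice. IEEE FP32 has 23 explicit mantissa bits, while the TF32 tensor core format keeps 10 alongside the same 8-bit exponent. The helper name `truncate_to_tf32` is made up for illustration, and plain truncation is a simplifying assumption here; this is not a claim about how the hardware rounds.

```python
import struct

def truncate_to_tf32(x: float) -> float:
    """Zero the low 13 mantissa bits of x after converting it to FP32.

    Truncation is an assumption made for simplicity; it only shows
    how much precision a 10-bit mantissa retains.
    """
    # Reinterpret the FP32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Keep sign (1 bit) + exponent (8 bits) + top 10 mantissa bits.
    bits &= 0xFFFFE000
    # Reinterpret the masked bits as an FP32 value again.
    return struct.unpack("<f", struct.pack("<I", bits))[0]

for value in (1.0, 1.0 + 2.0**-10, 1.0 + 2.0**-11, 3.14159265):
    print(f"{value!r:>22} -> {truncate_to_tf32(value)!r}")
```

    Near 1.0 the smallest step that survives is 2^-10, so a 2^-11 increment is lost entirely.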
     
  19. DegustatoR

    DegustatoR Veteran

    This rate is maintained in all TC precision modes though, which means that it's not coming from the FP32 SIMDs, no?

    I see two possibilities for gaming Ampere here:

    1. Double-width FP32 SIMDs, which will likely lead to a double-width INT32 SIMD as well. They've done this previously between GP100 and GP10x.

    2. A second 16-wide FP32 SIMD in place of the FP64 one of GA100. But for that to work well, they'll need to be able to schedule FP32+FP32+INT32 per clock. Otherwise it will be either FP32+FP32 or FP32+INT32 per clock, which will result in utilization issues.
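
    The utilization concern in option 2 can be sketched with a toy issue model. Everything here is an assumption for illustration: one "op" occupies one pipe for one clock, there are no dependencies or memory stalls, and the 100:36 FP32-to-INT32 mix is a made-up example. It says nothing about how a real scheduler behaves.

```python
from math import ceil

def cycles_separate_pipes(fp_ops: int, int_ops: int) -> int:
    """One FP32 pipe plus one INT32 pipe, both issuable every clock."""
    return max(fp_ops, int_ops)

def cycles_dual_issue(fp_ops: int, int_ops: int) -> int:
    """Two FP32 pipes and one INT32 pipe, but only two issue slots
    per clock: either FP32+FP32 or FP32+INT32."""
    return max(ceil((fp_ops + int_ops) / 2), int_ops)

def cycles_triple_issue(fp_ops: int, int_ops: int) -> int:
    """Two FP32 pipes and one INT32 pipe, all three issuable every clock."""
    return max(ceil(fp_ops / 2), int_ops)

# Hypothetical instruction mix, chosen only to show the gap.
FP_OPS, INT_OPS = 100, 36

for name, model in (
    ("FP32 + INT32 (separate pipes)", cycles_separate_pipes),
    ("FP32+FP32 or FP32+INT32", cycles_dual_issue),
    ("FP32+FP32+INT32", cycles_triple_issue),
):
    print(f"{name:32s} {model(FP_OPS, INT_OPS)} cycles")
```

    In this model the two-slot variant always needs at least as many cycles as the three-slot one, and the gap grows with the share of INT32 work, which is the utilization issue described above.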
     
  20. troyan

    troyan Regular
