Nvidia Ampere Discussion [2020-05-14]

Discussion in 'Architecture and Products' started by Man from Atlantis, May 14, 2020.

Tags:
  1. Benetanegia

    Benetanegia Regular

    There's a lot more to it than just FP or INT...

    And for what it's worth it's 55% faster on TPU:

[IMG: TPU benchmark chart]

I don't know if it's a system bottleneck for some strange reason or just scene selection, which actually makes a huge difference; reviewers need to take it into account and can't always get it right.
     
    PSman1700 likes this.
  2. CarstenS

    CarstenS Legend Subscriber

That's assuming you are fully limited by FP32 throughput in The Witcher 3. You can only expect linear scaling with FP32 throughput when you're limited by it all the way. Apparently there are other limitations at play here as well: the RTX 2070 and 1080 perform identically.
     
  3. DegustatoR

    DegustatoR Veteran

    Well, let's see.
The 2080 Ti has ~16 TFLOPS FP32 at 1.8 GHz boost.
Let's say about 17% of TW3's math on it is handled by the INT h/w; this results in ~18.7 TFLOPS in Ampere metrics.
A 3080 @ 1.8 GHz is about 31.3 TFLOPS, which is about 167% of 18.7.
So in the absolute best case of scaling you should be getting +67%, but in practice it's closer to 3/5ths of that.
(And the +55% from above is actually pretty close to the +67% theoretical maximum.
Edit: Actually, scratch that, it's +32% for the 3080 there, not +55%, which is for the 3090 OC card.)
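    The arithmetic above can be sketched like this (all figures are the post's assumed numbers, not official specs):

    ```python
    # Back-of-the-envelope sketch of the best-case scaling estimate.
    # All inputs are the assumptions made in this post, not measured specs.
    turing_fp32_tflops = 16.0  # ~2080 Ti at an assumed 1.8 GHz boost
    int_fraction = 0.17        # share of TW3 math assumed to run on Turing's INT pipes

    # In Ampere's counting, Turing's INT work would also count toward FP32
    # throughput, so scale the Turing figure up for an apples-to-apples metric.
    turing_in_ampere_metric = turing_fp32_tflops * (1 + int_fraction)  # ~18.7 TFLOPS

    ampere_fp32_tflops = 31.3  # ~3080 at the same assumed 1.8 GHz
    theoretical_gain = ampere_fp32_tflops / turing_in_ampere_metric - 1

    print(f"best-case scaling: {theoretical_gain:+.0%}")  # roughly +67%
    ```

    Anything below that +67% ceiling points at a non-FP32 limiter (bandwidth, CPU, etc.), which is the post's conclusion.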

    Why? Who knows. Maybe it's limited by memory bandwidth or the CPU or the data loads aren't fast enough for TW3 or something else.
    This isn't that surprising for a game from 2015 running on DX11 really.

Not in any official capacity, since it's D3D11. But I doubt it would be of much help: if it's not compute limited already, then moving compute to async won't improve anything.
     
    Last edited: Oct 1, 2020
    PSman1700 likes this.
  4. Scott_Arm

    Scott_Arm Legend

    @DegustatoR Yah, I understand why older games get benchmarked if they're popular. People want to know that the game they play will run faster. But they're not a particularly good way to analyze newer gpu architectures in terms of scaling and performance.
     
    nnunn and PSman1700 like this.
  5. Scott_Arm

    Scott_Arm Legend



tldr: There's nothing wrong with the capacitor configurations. The new driver has the same performance, but eliminates the crashes. The clock frequency vs voltage curve is nearly identical and power consumption is nearly identical. There were minor tweaks, probably to the boosting algorithm so that clocks wouldn't change quite as rapidly. The Linux driver was always stable; it was the Windows driver that had crashing problems.
     
    Last edited: Oct 1, 2020
    Pete, Cyan, Lightman and 7 others like this.
  6. trinibwoy

    trinibwoy Meh Legend

    Why would anyone expect games to scale perfectly with FP32? Did something happen recently where bandwidth, fillrate, geometry, texturing etc doesn't matter any more?
     
    Lightman, xpea, nnunn and 8 others like this.
  7. Digidi

    Digidi Regular

@trinibwoy From what you hear from the experts, everybody is saying that we are heavily shader bound. That's why I was surprised that the real-world scaling wasn't as good as the data looked on paper.
     
  8. Scott_Arm

    Scott_Arm Legend

I think people expect that from generation to generation GPUs will increase performance in a particular ratio, i.e. if you double the ALUs you also double the ROPs and texture units. The problem is memory bandwidth. GPUs are going through what CPUs have been going through for a long time: advances in processor performance are significantly outpacing memory performance. At some point GPU advancements are going to get very hard unless there's a memory breakthrough.

The ROPs are high-bandwidth consumers, so it'll get harder to keep adding ROPs without faster memory. Maybe shaders start to get longer and more complex simply because writing short shaders will bottleneck other parts of the GPU. Right now game engines are transitioning away from object-oriented designs that are not cache friendly, purely to get around how slow memory is. I'm not as knowledgeable about how shaders tend to be written, but I imagine they're already largely performance focused in that way.

People's expectations will have to adjust to the reality that future GPUs will probably not scale the way past GPUs have.
     
    pharma and iroboto like this.
  9. DegustatoR

    DegustatoR Veteran

    "Shader bound" isn't the same as "FP32 math bound" though. Shaders can be bandwidth limited and in case of simpler shaders from a 2015 engine this is the most likely scenario.
     
    Picao84, pharma and PSman1700 like this.
  10. Scott_Arm

    Scott_Arm Legend

    Yah, my understanding is instruction cache is small so most games have short shaders as an optimization, which tends to lead to them being bandwidth bound. Ampere doubled L1 cache, but I'm not sure if that's data or both data and instruction.
     
  11. Well in some console forums, these two apparently don't matter anymore.
    j/k
    :)
     
    Lightman likes this.
  12. CarstenS

    CarstenS Legend Subscriber

Even if you go just a nuance above what the electrical design of a card can handle, it crashes. If you dial back that very nuance and keep the card inside its safety margins, that means you were too optimistic with the combination of your v/f curve and the card's electrical properties in the first place.

Good for Nvidia and their customers that it apparently was just a nuance too much and they could fix it without perceptible performance regression. The fact that some cards were more prone to crashing than others suggests that there was an electrical problem in the first place, and hints at what it was related to.
     
    Cyan, Lightman, Ext3h and 2 others like this.
  13. Rootax

    Rootax Veteran

From what I've watched on YouTube, I'm not sure I believe that. Some people had Asus cards crashing a lot, others FE cards, etc. So in the end I'm not sure that some models are more impacted than others. Maybe it was just the cards that sold more / were more widely available...
     
    PSman1700 likes this.
  14. Digidi

    Digidi Regular

    Lightman likes this.
  15. DegustatoR

    DegustatoR Veteran

    I think it's more than just a card model, PSUs and other system components play their role too here. Which is why some models crashed for some people while being rock stable for others.
     
    Rootax and PSman1700 like this.
  16. Scott_Arm

    Scott_Arm Legend

    Watch the hardware unboxed video. A crashing card would not crash in Linux. The windows driver had boosting behaviour issues that could cause power spikes that would crash the card. Changes to the frequency vs voltage curve are negligible. The cards are now stable in windows with corrections to the boosting behaviour with essentially zero performance loss.
     
    Cuthalu and PSman1700 like this.
  17. Digidi

    Digidi Regular

@Scott_Arm Linux drivers, I think, have less performance than Windows drivers. And how many people use Linux with a gaming card? I think we're talking about a low percentage. All benchmarks are done on Windows, and Nvidia wanted to shine; that's why they went to the limit of the silicon.

And in this case I believe Igor much more than Hardware Unboxed. Igor analyzes everything with expensive test equipment, and as an electrical engineer who does specialized work for an electrical company, he knows what he is talking about.
     
    Last edited: Oct 1, 2020
    Lightman likes this.
  18. pharma

    pharma Veteran

Seeing is believing; I suggest you watch the video. At the time people were having crashes, someone using the Quadro driver experienced none.
     
    Cuthalu and PSman1700 like this.
  19. Digidi

    Digidi Regular

@pharma Quadro drivers are not made for high performance. Quadro was always made for stability.

In Igor's Lab's findings you can clearly see that they lowered the peak power consumption and that they also lowered the voltage/clock curve.
     
    Lightman likes this.
  20. Scott_Arm

    Scott_Arm Legend

All cards were affected to some degree, whether they used MLCC capacitors or not. They came up with a zero-cost fix in software without making any noticeable adjustments to the voltage vs frequency curve. The issue was not seen in their Linux drivers. This looks like the Windows driver was pushing the boosting behaviour a little too far. You design the software around the hardware, not the other way around.
     
    Cuthalu, PSman1700 and pharma like this.