Tensors! *spawn*

Discussion in 'Architecture and Products' started by 3dilettante, Jun 2, 2017.

  1. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    I am sure I read a document or paper on this, and if I get the time I will try to find it. I am pretty sure I mentioned it on here some time ago, but it is going to be a pain to track down.
     
    #41 CSI PC, Mar 23, 2018
    Last edited: Mar 23, 2018
    Arun likes this.
  2. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    302
    Location:
    UK
    Do you mean that each sub-core is basically divided into 4 groups of 4 FMAs? I would be very interested if there was anything hinting at that!

    If you mean the Hot Chips presentation on Volta, that splits each SM into 4 sub-cores but doesn't split the sub-cores any further, IIRC.
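
    For scale, here is a back-of-envelope sketch of how the per-sub-core counts add up to the headline tensor throughput. The figures are assumptions taken from NVIDIA's public V100 specifications rather than from this thread (80 SMs, 2 tensor cores per sub-core, one 4x4x4 matrix FMA per tensor core per clock, 1.53 GHz boost), and the arithmetic says nothing about whether a sub-core is subdivided any further.

```python
# Back-of-envelope peak tensor throughput for a V100-class part.
# All constants are assumed from NVIDIA's published V100 figures.
SMS = 80                          # SMs enabled on Tesla V100
SUB_CORES_PER_SM = 4              # processing blocks ("sub-cores") per SM
TENSOR_CORES_PER_SUB_CORE = 2     # 8 tensor cores per SM in total
FMAS_PER_TENSOR_CORE_CLK = 4 * 4 * 4  # one 4x4x4 matrix FMA per clock
BOOST_CLOCK_HZ = 1.53e9           # SXM2 boost clock


def tensor_core_count() -> int:
    """Total tensor cores on the chip."""
    return SMS * SUB_CORES_PER_SM * TENSOR_CORES_PER_SUB_CORE


def fmas_per_sub_core_clk() -> int:
    """FMAs issued per sub-core per clock by its tensor cores."""
    return TENSOR_CORES_PER_SUB_CORE * FMAS_PER_TENSOR_CORE_CLK


def peak_tensor_tflops() -> float:
    """Peak throughput, counting each FMA as 2 floating-point operations."""
    flops_per_clk = tensor_core_count() * FMAS_PER_TENSOR_CORE_CLK * 2
    return flops_per_clk * BOOST_CLOCK_HZ / 1e12


print(tensor_core_count(), fmas_per_sub_core_clk(), round(peak_tensor_tflops(), 1))
```

    Under those assumptions each sub-core issues 128 FMAs per clock through its tensor cores, and the total lines up with the roughly 125 TFLOPS NVIDIA quotes for V100.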
     
  3. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Hot Chips comes to mind, but there was also another document or paper on the subject. I have started trawling through what I was given and the papers and presentations I have read, but sadly it is like looking for a needle in a haystack.
    It was more about the operation/instruction behaviour you describe, with data written and read twice. I would not like to say how much implementation detail it went into until I can find it. It is bugging me :)
     
  4. iMacmatician

    Regular

    Joined:
    Jul 24, 2010
    Messages:
    786
    Likes Received:
    215
    NVIDIA didn't announce any new architectures at GTC 2018, but I'm still interested in what analysis and speculation you can come up with.
     
    ImSpartacus likes this.
  5. Tarkin1977

    Joined:
    Mar 10, 2018
    Messages:
    3
    Likes Received:
    6
  6. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    It does not seem to be using the Tensor cores, judging by the results between V100 and the Pascal GPUs. That is understandable, as not everything can be reduced or optimised to fp16, but it is worth noting that Google's TPU2 is also mixed precision only, so the direction and momentum are going that way.
    In other TensorFlow results the gap is much larger.
    The table at the bottom has results for some general testing without optimisation: https://www.pugetsystems.com/labs/h...s-and-Testing-of-FP16-for-Deep-Learning-1141/

    But yeah, there are caveats to getting the most out of the Tensor cores and to how they are used. It is nice to see AMD improving from TF 1.1; the crux is how quickly they can get up to TF 1.6 support.
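
    To illustrate why "mixed precision" (fp16 inputs, fp32 accumulation) is the usual compromise rather than pure fp16, here is a minimal sketch that emulates the rounding in software. It assumes `struct`'s "e" and "f" formats are an adequate stand-in for hardware fp16 and fp32 rounding; the helper names are hypothetical and nothing here touches a GPU.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_fp16(a, b):
    """Dot product with fp16 inputs, fp16 products and an fp16 accumulator."""
    acc = 0.0
    for x, y in zip(a, b):
        prod = to_fp16(to_fp16(x) * to_fp16(y))
        acc = to_fp16(acc + prod)
    return acc


def dot_mixed(a, b):
    """Dot product with fp16 inputs but an fp32 accumulator."""
    acc = 0.0
    for x, y in zip(a, b):
        prod = to_fp16(x) * to_fp16(y)  # inputs are rounded to fp16 first
        acc = to_fp32(acc + prod)       # running sum kept in fp32
    return acc


ones = [1.0] * 4096
print(dot_fp16(ones, ones), dot_mixed(ones, ones))
```

    With 4096 products of 1.0, the pure fp16 accumulator stalls at 2048, because 2049 is not representable in fp16 and the sum rounds back down, while the fp32 accumulator reaches 4096.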
     
    OlegSH likes this.
  7. Nate etc.

    Joined:
    Feb 15, 2018
    Messages:
    1
    Likes Received:
    1
    I think you're referring to work by a research team at Citadel, who tracked down the mapping between tensor core threads, matrices, and registers (for their part, NV describes the fragments' identity and locality as undefined). They presented it at one of the recent GTCs, though not at Hot Chips, IIRC.

    I tried extrapolating from that paper and NVIDIA's various dev documentation on the topic, though I'm not confident in the resulting accuracy.
     
    pharma likes this.
  8. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Thanks for the link Nate.
    I recognise the names and have seen some of their work before; maybe what I read was one of their earlier pre-publication papers without the geometry aspects *shrug*.
     
  9. pharma

    Veteran Regular

    Joined:
    Mar 29, 2004
    Messages:
    3,528
    Likes Received:
    2,215
    Accelerating Reduction and Scan Using Tensor Core Units
    November 23, 2019
    https://arxiv.org/pdf/1811.09736.pdf
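
    The general idea of expressing a reduction as a matrix multiply can be sketched in a few lines. This is only an illustration of the concept in plain Python, not the paper's kernel: the 16x16 tile size is an assumption chosen to match the 16x16x16 WMMA shape, and `matmul` and `reduce_segments` are hypothetical helper names.

```python
TILE = 16  # assumed tile edge, matching the 16x16x16 WMMA shape


def matmul(a, b):
    """Plain matrix multiply standing in for one tensor core tile operation."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][p] * b[p][j] for p in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def reduce_segments(values):
    """Sum each consecutive TILE-element segment of a TILE*TILE-long list."""
    if len(values) != TILE * TILE:
        raise ValueError("expected exactly TILE * TILE values")
    # Column j of A holds segment j of the input.
    a = [[values[j * TILE + i] for j in range(TILE)] for i in range(TILE)]
    # P has ones in its first row and zeros elsewhere, so row 0 of P @ A
    # contains the column sums of A, i.e. the per-segment sums.
    p = [[1 if i == 0 else 0 for _ in range(TILE)] for i in range(TILE)]
    return matmul(p, a)[0]


sums = reduce_segments(list(range(TILE * TILE)))
print(sums[:4])
```

    One multiply produces 16 segment sums at once; a full reduction would apply the same step again to those partial sums.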
     
    Lightman likes this.