Nvidia Turing Product Reviews and Previews: (2080TI, 2080, 2070, 2060, 1660, etc)

Discussion in 'Architecture and Products' started by Ike Turner, Aug 21, 2018.

  1. Frenetic Pony

    Regular Newcomer

    Joined:
    Nov 12, 2011
    Messages:
    291
    Likes Received:
    74
    Ow, those price points. That's really what Nvidia should be concentrating on for the next arch, rather than new features. There's so much overlap of function in the silicon here.

    But, well, at least it's something in the $2XX price range. So they've got that going for them.
     
    vipa899 likes this.
  2. vipa899

    Regular Newcomer

    Joined:
    Mar 31, 2017
    Messages:
    922
    Likes Received:
    354
    Location:
    Sweden
    Agreed, price is the only problem with Nvidia's GPUs. They need competition from AMD and Intel; if those two come out with products at about the same performance and features for reasonable prices, Nvidia will have to adjust.
     
  3. Ryan Smith

    Regular Subscriber

    Joined:
    Mar 26, 2010
    Messages:
    596
    Likes Received:
    941
    Location:
    PCIe x16_1
    Thanks for pointing that out. I forgot to edit that after NVIDIA confirmed the dedicated FP16 cores and how they work.

    There are numerous good reasons to have the FP16 rate be 2x the FP32 rate, even when using tensor cores. This includes register file bandwidth and pressure, and consistency with Turing parts that don't get tensor cores (since NV has to lay down dedicated FP16 cores on those parts).

    IMO, the whitepaper didn't do a very good job of explaining it. But according to NVIDIA, for TU102/104/106, general (non-tensor) FP16 operations are definitely done on the tensor cores. They are part of the SMs, after all.
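    For a concrete picture of what a 2x FP16 rate looks like from the software side, here is a minimal CUDA sketch using packed half2 math; __hfma2 performs two FP16 fused multiply-adds per instruction, which is how the doubled rate is normally exposed to code. This only illustrates the programming model, not whether the hardware routes it through the tensor-core datapaths (as on TU102/104/106) or through the dedicated FP16 cores on the parts without tensor cores.

        #include <cuda_fp16.h>

        // Packed FP16: each __hfma2 does two FP16 fused multiply-adds in one
        // instruction, so FP16 FMA throughput can be double the FP32 FMA rate.
        __global__ void fma_half2(const __half2* a, const __half2* b,
                                  __half2* acc, int n)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n)
                acc[i] = __hfma2(a[i], b[i], acc[i]);  // two FP16 FMAs per thread
        }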
     
  4. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,004
    Likes Received:
    109
    Big Turing doing fp16 as part of the tensor core is rather intriguing, but makes sense, I suppose. It's just a bunch of fp16 multipliers and adders, after all. For non-matrix operations you basically only need 1/4 of them, without any complex cross-lane wiring.
    In that sense, dedicated fp16 cores would really be just the remains of the tensor cores.
    I'm wondering, though, what fp16 operations Turing can actually do at twice the single-precision rate; that is, can they do more than mul/add/fma? Obviously for the tensor operations you don't really need anything else, but otherwise things like comparisons would be quite desirable.
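    To put rough numbers on that 1/4, assuming the commonly cited Turing SM figures (64 FP32 FMA lanes and 8 tensor cores per SM, each tensor core doing 64 FP16 FMAs per clock in matrix mode):

        matrix-mode FP16:   8 tensor cores x 64 FMA/clk = 512 FMA/clk per SM  (8x the FP32 rate)
        linear FP16 at 2x:  2 x 64 FP32 lanes            = 128 FMA/clk per SM  = 512 / 4

    So driving plain FP16 at twice the FP32 rate only keeps a quarter of the tensor cores' multiplier/adder array busy, which is also the 8x / 4 = 2x arithmetic that comes up later in the thread.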
     
    Heinrich4 likes this.
  5. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,613
    Likes Received:
    2,222
    I am still quite lost on this. Let's take an example: Far Cry 5 supports RPM; Vega does it on the ALUs, and the 2080 Ti does it on the tensor cores? If so, then how is it able to maintain a 2x FP32 rate? Are the tensor cores capable of such a feat?
     
  6. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    9,547
    Likes Received:
    4,210
    The tensor cores in "Big Turing" can do "linear" (non-matrix) FP16 at 1/4th their matrix op rate.
    It looks like the dedicated FP16 units in TU116 are stripped-down tensor units.

    AFAIK Far Cry 5 doesn't support RPM per se; it just uses FP16 pixel shaders. Vega (and GP100/GV100) use RPM to process FP16 at 2x the FP32 rate; Turing does it differently.
     
    entity279, pharma and Ryan Smith like this.
  7. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,613
    Likes Received:
    2,222
    Thanks. But if Big Turing uses only the tensor cores for FP16, and the tensor cores do it at a quarter of their matrix capability, then Turing isn't really capable of 2x FP32.
     
  8. entity279

    Veteran Regular Subscriber

    Joined:
    May 12, 2008
    Messages:
    1,210
    Likes Received:
    403
    Location:
    Romania
    Depends on just how many tensor cores there are, right?
     
  9. ninelven

    Veteran

    Joined:
    Dec 27, 2002
    Messages:
    1,693
    Likes Received:
    107
    Yeah, seems like a decent enough card, but nobody wants to buy a 6 GB card in 2019 for $280 regardless of what benchmarks show. Pretty out of touch...
     
  10. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,613
    Likes Received:
    2,222
    Precisely my point.
     
  11. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,010
    Likes Received:
    1,709
    Location:
    Finland
    How so? Big Turing's tensor OPS rate is 8x FP32; doing FP16 on those at a quarter of the matrix speed would result in 2x FP32.
     
    DavidGraham likes this.
  12. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    15,570
    Likes Received:
    4,469
    Hmmm, so the 1660 Ti is basically similar to a 1070 in performance (sometimes a little faster, sometimes a little slower), with slightly lower power consumption and slightly higher noise levels? Oh, and 2 GB less memory (6 GB vs. 8 GB).

    Not bad, although you can still occasionally find 1070s at 299 USD (one on Newegg right now), which may or may not be a better deal. Of course, eventually those will all disappear, leaving just the 1660 Tis.

    Regards,
    SB
     
    BRiT likes this.
  13. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,613
    Likes Received:
    2,222
    It seems I somehow missed that fact. Though this has the implication of limiting DLSS performance in games that heavily utilize FP16 shaders.
     
    #753 DavidGraham, Feb 23, 2019
    Last edited: Feb 23, 2019
  14. troyan

    Newcomer

    Joined:
    Sep 1, 2015
    Messages:
    119
    Likes Received:
    179
    No, tensor operations will always run alone. DLSS is a post-processing AA, which runs after the creation of the frame.
     
  15. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,010
    Likes Received:
    1,709
    Location:
    Finland
    I believe this would be correct.
    Not sure how that changes anything, though; during the time spent on DLSS as post-processing, the tensor cores could already be crunching FP16 shaders for the next frame. It all depends on the loads.
     
  16. troyan

    Newcomer

    Joined:
    Sep 1, 2015
    Messages:
    119
    Likes Received:
    179
    That future workload would then overlap with the current frame's creation.
     
  17. jlippo

    Veteran Regular

    Joined:
    Oct 7, 2004
    Messages:
    1,276
    Likes Received:
    358
    Location:
    Finland
    Didn't Jensen imply that the rest of the GPU would idle when the tensor cores are active?
     
  18. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,010
    Likes Received:
    1,709
    Location:
    Finland
    If my memory serves me correctly, this only applies to DXR denoising, not tensor cores in general?
     