Nvidia Pascal Reviews [1080XP, 1080ti, 1080, 1070ti, 1070, 1060, 1050, and 1030]

Discussion in 'Architecture and Products' started by Love_In_Rio, May 17, 2016.

  1. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    4,003
    Likes Received:
    51
    It's true that we don't have all the data for many of these reviews, but I don't think the boost clock differential can account for this much variability, unless a reviewer disabled boost on their 1070 sample altogether and did not do the same for their 1080. 33% is the maximum difference in any functional unit between the 1070 and the 1080. A few MHz here or there can't account for the remaining 17-18% seen in some of those other tests; that would take quite a large clock difference (several hundred MHz).
     
  2. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,935
    Likes Received:
    2,268
    Location:
    Germany
    I can only speak for our review, where we made sure the clocks were as consistent as the benchmark runs themselves (i.e. very low variability). From what I saw at a quick glance (I didn't do all the Excel math), we have no case among the gaming tests where the 1080 is more than 29% faster than the 1070. In synthetics, I think 33% occurs once.

    We also have two outliers. One is the higher texture bandwidth rate for the 1070 with 8 black textures (repeatable, might be cache-line related). The other, for which I am awaiting feedback from Nvidia, is that the 1070 is faster than the 1080 in some Luxmark scenes (from 3.0 and 3.1, not in the online review), which normally should not be possible but is also repeatable.

    But yes, boost clock difference should not account for massive performance differences.
     
    pharma likes this.
  3. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    384
    Likes Received:
    389
    Why shouldn't that be possible? More favorable ratio of L2 size to active threads possibly, coincidentally hitting a sweet spot.
     
  4. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,935
    Likes Received:
    2,268
    Location:
    Germany
    Should've written: Normally not possible.
     
  5. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    4,003
    Likes Received:
    51
    You're with PCGH? Great work you guys do :yes: I'd seen that Luxmark anomaly in another thread where I posted this question, on Tech Report's forums; a user there cited HotHardware's Luxmark results. Strange indeed.
     
    CSI PC and CarstenS like this.
  6. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    4,003
    Likes Received:
    51
    Is the L2 in Pascal located outside of the GPCs, and is none of it disabled on the GP104 found in the 1070?
     
  7. Ryan Smith

    Regular

    Joined:
    Mar 26, 2010
    Messages:
    623
    Likes Received:
    1,095
    Location:
    PCIe x16_1
    L2 is located with the ROPs and memory controllers. 1080 and 1070 are both fully enabled in that respect, so both get 2MB of L2.
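
    A rough sketch of the "L2 size to active threads" ratio raised earlier in the thread, using the 2MB L2 figure from this post. The SM counts (20 for the 1080, 15 for the 1070) and the limit of 2048 resident threads per SM are assumptions taken from public spec sheets, not from this thread, and the function name is made up for illustration.

```python
# Rough L2-per-thread estimate. All constants except the 2 MB L2 figure
# are assumptions from public spec sheets, not from this thread.
L2_BYTES = 2 * 1024 * 1024        # 2 MB of L2 on both cards (per the post above)
MAX_THREADS_PER_SM = 2048         # assumed resident-thread limit per SM


def l2_bytes_per_thread(sm_count):
    """L2 capacity divided by the maximum number of resident threads."""
    return L2_BYTES / (sm_count * MAX_THREADS_PER_SM)


gtx1080 = l2_bytes_per_thread(20)  # assumed: 20 SMs enabled
gtx1070 = l2_bytes_per_thread(15)  # assumed: 15 SMs enabled

print(f"1080: {gtx1080:.1f} B/thread, 1070: {gtx1070:.1f} B/thread")
print(f"1070 has {gtx1070 / gtx1080 - 1:.0%} more L2 per resident thread")
```

    Under these assumptions the 1070 has about a third more L2 per resident thread at full occupancy, which is at least consistent with the "sweet spot" idea, though it proves nothing about the Luxmark result.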
     
    ShaidarHaran likes this.
  8. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    4,003
    Likes Received:
    51
    Thanks, makes sense. How's that AT 1080 review coming along? I'd probably have known the answer to my question already if it was here... ;)
     
  9. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    4,003
    Likes Received:
    51
    Update: looks like I don't know how to calculate. 1 GPC being disabled means the shader and geometry processing differences between 1070 and 1080 are identical, unlike the figures I was working with (28.75% and 37% differences, respectively - it should be 37% across the board).
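
    For reference, a minimal sketch of the corrected calculation. The unit counts and reference boost clocks below are assumptions taken from Nvidia's public spec sheets (they are not stated in this thread), and real cards often boost higher than the reference figure.

```python
# Theoretical 1080-vs-1070 throughput ratio: units x clock.
# Spec numbers are assumptions from public spec sheets, not from this thread.
SPECS = {
    "1080": {"gpcs": 4, "cuda_cores": 2560, "boost_mhz": 1733},
    "1070": {"gpcs": 3, "cuda_cores": 1920, "boost_mhz": 1683},
}


def advantage(unit_key):
    """Fractional advantage of the 1080 over the 1070 for one unit type."""
    hi, lo = SPECS["1080"], SPECS["1070"]
    return (hi[unit_key] * hi["boost_mhz"]) / (lo[unit_key] * lo["boost_mhz"]) - 1


shader = advantage("cuda_cores")   # shader throughput
geometry = advantage("gpcs")       # geometry throughput, assuming it scales per GPC

print(f"shader: {shader:.1%}, geometry: {geometry:.1%}")
```

    Both ratios come out the same because the core count and the GPC count each differ by exactly 4:3. The reference boost clock gap then lifts the 33% unit difference to roughly 37%.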
     
  10. Clukos

    Clukos Bloodborne 2 when?
    Veteran Newcomer

    Joined:
    Jun 25, 2014
    Messages:
    4,513
    Likes Received:
    3,871
    pharma likes this.
  11. xEx

    xEx
    Regular Newcomer

    Joined:
    Feb 2, 2012
    Messages:
    939
    Likes Received:
    399
  12. Clukos

    Clukos Bloodborne 2 when?
    Veteran Newcomer

    Joined:
    Jun 25, 2014
    Messages:
    4,513
    Likes Received:
    3,871
    Physics is always CPU

    i7-4790K vs i7-5960X
     
    I.S.T., CSI PC, Razor1 and 2 others like this.
  13. pharma

    Veteran Regular

    Joined:
    Mar 29, 2004
    Messages:
    3,538
    Likes Received:
    2,226
    A1xLLcqAgt0qc2RyMz0y and Razor1 like this.
  14. homerdog

    homerdog donator of the year
    Legend Veteran Subscriber

    Joined:
    Jul 25, 2008
    Messages:
    6,270
    Likes Received:
    1,038
    Location:
    still camping with a mauler
    The 1070 looks like a really great card and a solid upgrade from my 970. Still, I'll probably wait for Volta (1170?)
     
  15. Rys

    Rys PowerVR
    Moderator Veteran Alpha

    Joined:
    Oct 9, 2003
    Messages:
    4,164
    Likes Received:
    1,461
    Location:
    Beyond3D HQ
    I've been playing around with FP16 throughput in CUDA with a willing 1080 owner, in a little benchmark thing I've been working on for a couple of days. Might release the source, might not, but here's the quick documentation I wrote.

    https://gist.github.com/rys/f427c0a85fcc367087c40fd8ffbdccb7

    Nothing really new, other than the performance data near the end.
     
    Lightman likes this.
  16. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    So that means most upper Maxwell 2 cards are faster at FP16 than the GTX 1080?
    That is, I thought they had a 1:1 FP16:FP32 relationship, or was that just certain Maxwell models?

    Rather strange logic from Nvidia if older models outperform this generation, assuming it's right that some of the Maxwell models could do FP16 at 1:1.
    I guess we will not know for sure what is going on until GP102 is released for each sector: Tesla, Quadro, Titan.
    Any chance you can get your hands on a Tegra X1?

    Cheers
     
  17. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,875
    Likes Received:
    767
    Location:
    London
    Since I'm on a roll

    https://forum.beyond3d.com/posts/1886240

     
  18. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,380
    Rys' code mentions compute level 5.3 (Tegra X1) and higher.
    Maxwell 2 is compute level 5.2.

    IOW: Maxwell 2 doesn't have native FP16 at all.

    At least that's how I understand it.
     
  19. pixelio

    Newcomer

    Joined:
    Feb 17, 2014
    Messages:
    47
    Likes Received:
    75
    Location:
    Seattle, WA
    No... sm_50 and sm_52 do not have this capability.

    If so you would just be able to access half-words and would achieve 64 ops/clock/SMM.

    So 2*M*N fp16 vs. 1*M*N fp32 takes 6x more time?

    A simulated (necessary for pre-sm_53) half2 FMA takes 11 instructions but you get 2 fp16 FMAs per thread.

    Sounds like 6 ops per fp16 to me.

    One way to discover more about what's going on under the hood is to not perform FMAs but just MULs.

    If F2F conversion is happening then you'll be skipping unpacking/packing of the addend.
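
    To make the packing/unpacking overhead concrete, here is a small sketch of what a simulated half2 FMA has to do: unpack both fp16 lanes of each operand (including the addend), do the math, and repack. It is written in Python for readability; it is not CUDA and not the actual instruction sequence the compiler emits. The helper names are hypothetical; the 11-instruction figure is the one quoted above.

```python
import struct

# Emulated half2 FMA: two fp16 values packed in one 32-bit word.
# Illustration only; this is not the instruction sequence the compiler emits.


def pack_half2(lo, hi):
    """Pack two floats as fp16 into one 32-bit integer (lo in the low half)."""
    return struct.unpack("<I", struct.pack("<2e", lo, hi))[0]


def unpack_half2(word):
    """Unpack a 32-bit integer into its two fp16 values as Python floats."""
    return struct.unpack("<2e", struct.pack("<I", word))


def half2_fma(a, b, c):
    """Unpack a, b and the addend c, compute a*b+c per lane, repack as fp16."""
    (a0, a1), (b0, b1), (c0, c1) = map(unpack_half2, (a, b, c))
    return pack_half2(a0 * b0 + c0, a1 * b1 + c1)


result = unpack_half2(half2_fma(pack_half2(1.5, 2.0),
                                pack_half2(2.0, 0.5),
                                pack_half2(0.25, 1.0)))
print(result)  # (3.25, 2.0)

# Cost figures quoted in the post above.
INSTRUCTIONS_PER_SIMULATED_HALF2_FMA = 11
FMAS_PER_HALF2 = 2
print(INSTRUCTIONS_PER_SIMULATED_HALF2_FMA / FMAS_PER_HALF2)  # 5.5 per fp16 FMA
```

    A MUL-only variant would drop the unpack of the addend from this sequence, which is why comparing MUL against FMA timings could hint at whether conversions are happening.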
     
    #279 pixelio, Jun 1, 2016
    Last edited: Jun 1, 2016
    CSI PC and spworley like this.
  20. homerdog

    homerdog donator of the year
    Legend Veteran Subscriber

    Joined:
    Jul 25, 2008
    Messages:
    6,270
    Likes Received:
    1,038
    Location:
    still camping with a mauler