Nvidia BigK GK110 Kepler Speculation Thread

Discussion in 'Architecture and Products' started by A1xLLcqAgt0qc2RyMz0y, Apr 21, 2012.

  1. Tridam

    Regular Subscriber

    Joined:
    Apr 14, 2003
    Messages:
    541
    Likes Received:
    47
    Location:
    Louvain-la-Neuve, Belgium
    The 192-core SMX configuration for GK110 has been confirmed by Nvidia.
     
  2. AnarchX

    Veteran

    Joined:
    Apr 19, 2007
    Messages:
    1,559
    Likes Received:
    34
    Who says GK110 is half-rate?

    http://www.nvidia.com/content/tesla/pdf/NV_DS_TeslaK_Family_May_2012_LR.pdf

    PCGH reports GK110 reaches 80-85% efficiency in DGEMM: http://www.pcgameshardware.de/aid,8...Us-auf-GTC-2012-vorgestellt/Grafikkarte/News/

    14 SMX * 192 SPs * 2 FLOPs * 0.85 * 880 MHz / 4 = ~1000 GFLOPS DP
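
The estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the post: the function and parameter names are made up for illustration, and reading the "/4" as a 1/4 DP:SP rate is an interpretation, not something Nvidia has stated.

```python
def sustained_dp_gflops(smx_count, sp_per_smx, clock_mhz, dp_ratio, efficiency):
    """Estimate sustained double-precision GFLOPS.

    All inputs are speculative figures from the post, not official specs.
    """
    # One FMA counts as 2 FLOPs per SP per clock; MHz / 1000 gives GHz.
    peak_sp_gflops = smx_count * sp_per_smx * 2 * clock_mhz / 1000.0
    # Scale by the assumed DP:SP rate and the reported DGEMM efficiency.
    return peak_sp_gflops * dp_ratio * efficiency


# 14 SMX, 192 SPs each, 880 MHz, assumed 1/4 DP rate, 85% DGEMM efficiency
estimate = sustained_dp_gflops(14, 192, 880, 1 / 4, 0.85)
print(f"{estimate:.1f} GFLOPS")  # prints 1005.3 GFLOPS
```

Changing `dp_ratio` to 1/2 or 1/3 shows how much lower the clock could be for the same ~1 TFLOPS result.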
     
    #122 AnarchX, May 15, 2012
    Last edited by a moderator: May 15, 2012
  3. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,245
    Likes Received:
    4,465
    Location:
    Finland
  4. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    The M2090 is rated at 665 GFLOPS double precision. If it only achieves around half of that in DGEMM, Kepler doubling that efficiency would be a remarkable improvement.
     
  5. AnarchX

    Veteran

    Joined:
    Apr 19, 2007
    Messages:
    1,559
    Likes Received:
    34
    With half-rate they could market a much higher gain over Fermi.
    Maybe it is 1/3 of the SP rate per SMX, with only 4 of the 6 ALUs processing it: 64 DP-FMA per SMX.
     
  6. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
  7. AnarchX

    Veteran

    Joined:
    Apr 19, 2007
    Messages:
    1,559
    Likes Received:
    34
    Yeah, but the result includes the 2 FLOPs per SP.

    But I think 1/3 seems more likely. It would be processed like on GF100/110; the third, super-scalar executed ALU would be without work.
    So a bit over 1 TFLOPS DP could be reached with only ~700 MHz to hold power down.
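
A rough sketch of that 1/3-rate estimate in Python follows. The 64 DP-FMA per SMX and ~700 MHz come from the posts above. The 14 active SMX and the 85% DGEMM efficiency are borrowed from the earlier estimate and are assumptions here, as are the function and variable names.

```python
def peak_dp_gflops(smx_count, dp_fma_per_smx, clock_mhz):
    """Peak double-precision GFLOPS for a speculative SMX configuration."""
    # Each DP FMA unit delivers 2 FLOPs per clock; MHz / 1000 gives GHz.
    return smx_count * dp_fma_per_smx * 2 * clock_mhz / 1000.0


# Assumed: 14 active SMX, 64 DP-FMA units per SMX (1/3 of 192), ~700 MHz
peak = peak_dp_gflops(14, 64, 700)
# Assumed: the 85% DGEMM efficiency reported for GK110 applies here too
sustained = peak * 0.85
print(f"peak {peak:.1f} GFLOPS, sustained {sustained:.1f} GFLOPS")
```

Under these assumptions the sustained figure lands a little above 1 TFLOPS, in line with the guess in the post.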
     
  8. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    Big iron, as in, not my MacBook Air 11" running MS Word. :wink:
     
  9. A1xLLcqAgt0qc2RyMz0y

    Veteran

    Joined:
    Feb 6, 2010
    Messages:
    1,589
    Likes Received:
    1,490
    GK110 White paper

    Has a white paper been released for the GK110?
     
  10. lanek

    Veteran

    Joined:
    Mar 7, 2012
    Messages:
    2,469
    Likes Received:
    315
    Location:
    Switzerland
  11. dkanter

    Regular

    Joined:
    Jan 19, 2008
    Messages:
    360
    Likes Received:
    20
    No offense, but you should get your facts straight:

    1. GK104 could have ECC on DRAM
    2. GK104 *does not* have ECC on the register files, L1 or L2, go check NV's website

    #2 is actually a very significant problem. Soft errors are much more problematic for on-chip SRAM than for DRAM. So having ECC on the DRAM really is just a marketing ploy.

    Also, I'd point out that GK104 still sucks for general purpose workloads. If you talk to anyone at Nvidia with half a brain and a shred of honesty, they will readily admit that for quite a few workloads, Fermi is better than GK104.

    GK110 is meant for computing, GK104 isn't.

    DK
     
  12. dkanter

    Regular

    Joined:
    Jan 19, 2008
    Messages:
    360
    Likes Received:
    20
    Very nice work : )

    DK
     
  13. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    If true then it's a pretty weird layout; 6 GPCs with 15 SMX?
     
  14. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    Well, it's all a matter of good guesses at this point, without any official spec information. If nothing else, the two additional setup pipes are a welcome increase in fragment output capacity, considering the vastly improved shader throughput provided by all 15 chubby multiprocessors.
     
  15. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,245
    Likes Received:
    4,465
    Location:
    Finland
    Just like 192 was confirmed for GK104, while it was really 192+8? Is this also 192+something, or just 192, of which x can do FP64?
     
  16. Man from Atlantis

    Regular

    Joined:
    Jul 31, 2010
    Messages:
    961
    Likes Received:
    855
  17. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    I must be missing something, but why exactly is it 192+8 on GK104?



    http://www.behardware.com/articles/857-2/review-nvidia-geforce-gtx-680.html
     
  18. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    It won't take long until the funky go merry rumors appear that the chip actually has N clusters, yet they hammered out X of them so that only 15 remain in the end :twisted:
     
  19. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,245
    Likes Received:
    4,465
    Location:
    Finland
    And according to several other sites like http://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/2
     
  20. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    The omission of on-die ECC was made more obvious when Nvidia listed the feature set of the GK110 against GK104. There must be a subset of HPC that can tolerate transient errors at the rate to be expected of gamer GPU SRAM, which may not be held to the same error rates as a chip like Opteron.

    Is this a reactionary move to guard big Kepler's underside from Tahiti or its successor?


    A fair number of the features outlined are understandably included in GCN's roadmap for now, soon, or very soon, though it could be just one of a number of instances where AMD gets to its own starting line later.
     
    #140 3dilettante, May 16, 2012
    Last edited by a moderator: May 16, 2012
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.