NVIDIA Maxwell Speculation Thread

Discussion in 'Architecture and Products' started by Arun, Feb 9, 2011.

  1. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    Not really what? Sure, over 50% of what I write in speculation threads always ends up wrong.
    I was thinking that in the past, GF100 ended up in laptops.
     
  2. tviceman

    Newcomer

    Joined:
    Mar 6, 2012
    Messages:
    191
    Likes Received:
    0
    GK104 had 4x the cores of GK107. I think it's as good a guess as any to say GM104 will follow that pattern. Not sure about the memory bus and ROPs though... GK104 really needed 7 GHz VRAM to shine, and even then it was still somewhat ROP-limited. Maxwell's bus config may also depend on whether GDDR6 is ready for use.
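
As a rough illustration of the guess above, here is a minimal sketch of the arithmetic. The Kepler core counts are the full-chip configurations; the GM107 figure (640, i.e. 5 SMMs of 128 lanes) is an assumption not stated in this post, the 256-bit bus is GK104's, and the GM104 result is pure extrapolation, not a known spec.

```python
# Back-of-the-envelope numbers for the "4x the small chip" guess.
# GM107_CORES is an assumption (5 SMMs x 128 lanes); the GM104 figure
# derived from it is speculation, not a specification.

GK107_CORES = 384
GK104_CORES = 1536
GM107_CORES = 640  # assumed full GM107 configuration


def scale_factor(big: int, small: int) -> float:
    """Ratio of core counts between a big chip and a small chip."""
    return big / small


def peak_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s: per-pin data rate times bus width, in bytes."""
    return data_rate_gbps * bus_width_bits / 8


kepler_ratio = scale_factor(GK104_CORES, GK107_CORES)
gm104_guess = int(GM107_CORES * kepler_ratio)

print(f"GK104 / GK107 core ratio: {kepler_ratio}")
print(f"GM104 guess at the same ratio: {gm104_guess} cores")
print(f"256-bit bus at 7 Gbps: {peak_bandwidth_gb_s(7.0, 256)} GB/s")
print(f"256-bit bus at 6 Gbps: {peak_bandwidth_gb_s(6.0, 256)} GB/s")
```

The bandwidth helper shows why the memory clock matters so much on a fixed 256-bit bus: every extra 1 Gbps per pin adds 32 GB/s of peak bandwidth.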
     
  3. Picao84

    Veteran Regular

    Joined:
    Feb 15, 2010
    Messages:
    1,528
    Likes Received:
    687
    For what it's worth, I found that speculation rather unexciting given the kind of efficiency being hyped for Maxwell. If GM206, GM204 and GM200 were 28nm parts, sure, it would look nice, but with a die shrink to 20nm, not so much. Unless they are a complete revolution in power consumption (say, GM200 with a 200W TDP), having a GM200 performing between GTX Titan SLI and GTX 770 SLI would be par for the course, comparable to the jump between GF110 and GK110. The same goes for a GM204 performing like GK110. After all, GM107 is only being deemed impressive because of its efficiency. If that efficiency does not scale up to the higher tiers, as the table seems not to expect, Maxwell will be nothing special from a performance point of view.
     
    #923 Picao84, Feb 14, 2014
    Last edited by a moderator: Feb 14, 2014
  4. DSC

    DSC
    Banned

    Joined:
    Jul 12, 2003
    Messages:
    689
    Likes Received:
    3
  5. iMacmatician

    Regular

    Joined:
    Jul 24, 2010
    Messages:
    771
    Likes Received:
    200
    "Maxwell 1st Generation"? So maybe the core configuration will change in the 2nd generation?
     
  6. DSC

    DSC
    Banned

    Joined:
    Jul 12, 2003
    Messages:
    689
    Likes Received:
    3
    Yeah, so Maxwell on 20nm/16nm FinFET might have more performance per watt.

    [IMG]

    [IMG]

     
    #926 DSC, Feb 14, 2014
    Last edited by a moderator: Feb 14, 2014
  7. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,486
    Likes Received:
    397
    Location:
    Varna, Bulgaria
    So, the (sub)multiprocessor configuration is confirmed to be 128 ALU lanes?
     
  8. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,491
    Likes Received:
    909
    Or maybe just the process, which would still increase performance per watt.

    Yes, at most (and it's unlikely to be less).
     
  9. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,749
    Likes Received:
    2,515
    The Maxwell block shows 128 cores all right, but the Kepler block shows 256! Why is that? I thought it should be 192. Are they counting special units?
     
  10. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    "Displayport 1.2 (Optional)"
    I hope it gets common.
     
  11. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,416
    Likes Received:
    178
    Location:
    Chania
    192 SPs FP32 + 64 SPs FP64 (GK110) = 256
     
  12. DSC

    DSC
    Banned

    Joined:
    Jul 12, 2003
    Messages:
    689
    Likes Received:
    3
    Kepler already supported DisplayPort 1.2, but for these cards NVIDIA is leaving it up to the AIBs instead of making it mandatory, sigh.

    I would rather have 3 DP 1.2 + 1 HDMI 1.4b (not sure if Maxwell has HDMI 2.0 support) as standard than the outdated DL DVI-D and VGA.
     
  13. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    One DL-DVI-I is nice to have, as you can use 2.5K DVI displays and VGA displays (CRT, LCD, projector) with no adapter or with a cheap passive adapter. Then DP can be used for a second such monitor (or even a third one with an MST hub).
     
  14. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    In a scaled-down mobile device, make one SMM with two sub-blocks instead of four?
    I'm picturing such a GPU with a dual-core Cortex-A12 and 16-bit LPDDR4 or LPDDR3.

    Maybe it doesn't make sense, because at that point the "control logic" and the front-end before it already use a great deal of area. But anyway, some < 1W stuff for embedded and low-end phones would be interesting and useful.
    ROFL, I guess that eventually something can sit on a bicycle handlebar, displaying Google Earth on an OLED or other display while being powered by the bike itself.
     
  15. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,749
    Likes Received:
    2,515
    Thanks. One more thing: the Maxwell block shows what seems to be an increase in control logic area. We know Kepler had 66% hardware scheduling and 33% software scheduling (just like GF104/114). Does that mean Maxwell will restore that to 100%?
     
  16. Novum

    Regular

    Joined:
    Jun 28, 2006
    Messages:
    335
    Likes Received:
    8
    Location:
    Germany
    Your percentages are weird. Kepler had four warp schedulers per SMX, and each one could issue two instructions per clock if they were independent. So how it gets 6 instructions to issue is flexible.

    Anyway, yes, it seems that Maxwell does away with co-issue and each CU has one scheduler for 32 ALUs.
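
One way to read the numbers in this exchange is sketched below. It assumes the "66%" refers to the share of FP32 lanes that four schedulers can feed with one warp-wide instruction each, which is an interpretation rather than anything confirmed in the thread; the Maxwell figures follow the speculation above (four schedulers, 32 ALUs each, no co-issue). The function names are made up for illustration.

```python
from fractions import Fraction

WARP_SIZE = 32  # threads per warp, so one warp instruction fills 32 lanes


def single_issue_coverage(schedulers: int, fp32_lanes: int) -> Fraction:
    """Share of FP32 lanes fed if every scheduler issues one warp instruction per clock."""
    lanes_fed = min(schedulers * WARP_SIZE, fp32_lanes)
    return Fraction(lanes_fed, fp32_lanes)


def warp_instructions_to_fill(fp32_lanes: int) -> int:
    """Warp-wide instructions needed per clock to keep every FP32 lane busy."""
    return fp32_lanes // WARP_SIZE


# Kepler SMX: 4 schedulers, 192 FP32 lanes, up to 2 issues per scheduler.
kepler_coverage = single_issue_coverage(schedulers=4, fp32_lanes=192)
kepler_needed = warp_instructions_to_fill(192)
kepler_issue_slots = 4 * 2

# Speculated Maxwell SMM: 4 schedulers, 32 lanes each, single issue.
maxwell_coverage = single_issue_coverage(schedulers=4, fp32_lanes=128)

print(f"Kepler lanes fed without co-issue: {kepler_coverage}")
print(f"Kepler needs {kepler_needed} of {kepler_issue_slots} issue slots filled")
print(f"Maxwell lanes fed without co-issue: {maxwell_coverage}")
```

Under these assumptions Kepler needs independent instruction pairs from at least two of its four schedulers to reach full FP32 throughput, while the speculated Maxwell layout reaches it with plain single issue.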
     
  17. spworley

    Newcomer

    Joined:
    Apr 19, 2013
    Messages:
    146
    Likes Received:
    190
    NVIDIA released CUDA 6.0 RC this week. Allanmac from the NVIDIA CUDA forums and I were poking through the new include files. An interesting addition is in the new cuda_occupancy.h include.

    It shows that sm_50 increases the maximum number of resident blocks per SM to 32 (from sm_35's 16). We don't know if the upcoming GM107 is sm_50 or not, but it doesn't seem likely.

    The more interesting detail is the reveal of a new architecture type, sm_37, with a different minimum shared memory size per SM of 80K (81920 bytes). Current sm_30 and sm_35 maximum shared memory is only 48K (and its minimum is 16K when you configure it to prefer L1). This sm_37 device is labeled as "GK210".

    Finally, sm_50 may not have an L1/shared memory split. This is a tenuous conclusion based on the fact that in this include file, sm_50 does not use the L1/shared cache hints at all, unlike older architectures.

    sm_50's shared memory size is not listed in the occupancy include file.

    It's hard to interpret this, but such tenuous clues are great fodder for speculation threads such as this. And any sm_50 predictions are likely especially shaky.
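
To show why the resident-block limit matters, here is a minimal sketch of a block-limited occupancy estimate. The block limits (16 and 32) come from the post above; the 2048-thread-per-SM cap is the sm_35 value and is only assumed to carry over to sm_50. Register and shared memory limits are ignored, and the function names are hypothetical, not taken from cuda_occupancy.h.

```python
SM35_MAX_BLOCKS = 16   # resident blocks per SM on sm_35 (from the post)
SM50_MAX_BLOCKS = 32   # resident blocks per SM on sm_50 (from the post)
MAX_THREADS_PER_SM = 2048  # sm_35 value; assumed unchanged for sm_50


def resident_blocks(threads_per_block: int, max_blocks_per_sm: int,
                    max_threads_per_sm: int = MAX_THREADS_PER_SM) -> int:
    """Blocks that fit on one SM, limited by block count and thread count only."""
    if threads_per_block <= 0:
        raise ValueError("threads_per_block must be positive")
    by_threads = max_threads_per_sm // threads_per_block
    return min(max_blocks_per_sm, by_threads)


def occupancy(threads_per_block: int, max_blocks_per_sm: int,
              max_threads_per_sm: int = MAX_THREADS_PER_SM) -> float:
    """Resident threads as a fraction of the SM's thread capacity."""
    blocks = resident_blocks(threads_per_block, max_blocks_per_sm,
                             max_threads_per_sm)
    return blocks * threads_per_block / max_threads_per_sm


for tpb in (32, 64, 128, 256):
    old = occupancy(tpb, SM35_MAX_BLOCKS)
    new = occupancy(tpb, SM50_MAX_BLOCKS)
    print(f"{tpb:>3} threads/block: sm_35 {old:.0%}, sm_50 {new:.0%}")
```

In this simplified model the higher limit only helps small blocks: kernels launched with 128 or more threads per block already hit the thread cap under the 16-block limit.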

     
  18. itaru

    Newcomer

    Joined:
    May 27, 2007
    Messages:
    156
    Likes Received:
    15
    GK210 = GK20A = Tegra K1??
     
  19. DSC

    DSC
    Banned

    Joined:
    Jul 12, 2003
    Messages:
    689
    Likes Received:
    3
    Does this mean Maxwell will have 96KB or 128KB configurable L1 cache?

    GM107 is Maxwell, the slide clearly states it.
     
  20. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,416
    Likes Received:
    178
    Location:
    Chania
    What is really awkward in those charts is that they're comparing a GK110 cluster with a GM107 cluster; I do get the point the slide is trying to make, even with 192 vs. 4*32. Are there no dedicated FP64 units in Maxwell, or were some of the marketing guys just too overeager and thought 256 looked "prettier" on the left?
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.