Nvidia Post-Volta (Ampere?) Rumor and Speculation Thread

Discussion in 'Architecture and Products' started by Geeforcer, Nov 12, 2017.

  1. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    Neither of those two families had yet another execution resource coupled to the data fetch and delivery. At some point you will certainly cross the threshold where the functional overhead in the command-and-control section costs more to carry than the effort to design a distinct one for less-featured mass products. Maybe that point has just been reached?
     
    Kaotik likes this.
  2. Infinisearch

    Veteran Regular

    Joined:
    Jul 22, 2004
    Messages:
    739
    Likes Received:
    139
    Location:
    USA
    That's true, but weren't the FP32/64 units still present in both products? It's just the ratio that was different. No longer having to route data to an execution unit that isn't there is different from just changing ratios. Well, we'll find out eventually. FWIW, on this page: https://devblogs.nvidia.com/parallelforall/inside-volta/ in the diagram for an SM, the tensor cores take up a lot of space and seem to count 64 'squares'... I don't know if that translates to a large die area. Finally, why throw out all that R&D money by not basing a product on the basic premise of Volta? I mean, you've already invested the R&D and claim gains over your previous architecture, so why skip a product based on it?
     
    ImSpartacus likes this.
  3. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    It also needs to be put into context that Tesla is made up of more than just the P100 and V100, both of which are much more distinct from the rest of the family, where Tesla/Quadro/Geforce resemble each other more closely (the exception being the Quadro GP100).
     
  4. McHuj

    Veteran Regular Subscriber

    Joined:
    Jul 1, 2005
    Messages:
    1,442
    Likes Received:
    560
    Location:
    Texas
    I doubt it. I expect both Apple chips in 2018 to be 10nm.
     
  5. Bondrewd

    Regular Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    523
    Likes Received:
    239
    Yeah, there's also P40 and P4, but people forget about them.
    N7SOC goes HVM H1 2018.
    So yeah, your expectations are weird.
     
  6. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,496
    Likes Received:
    910
    2.2/2.6 GHz base/turbo on the Centriq 2460 also sounds (more than) adequate for a high-end GPU.
     
  7. Bondrewd

    Regular Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    523
    Likes Received:
    239
    That's a CPU with an unknown number of pipeline stages (QC is pretty cagey on the details of Falkor (or not, I forgot the HotChips slide deck)).
    You need to fab something high-performance, like a desktop CPU, to find the fmax.
     
    #47 Bondrewd, Nov 18, 2017
    Last edited: Nov 18, 2017
  8. el etro

    Newcomer

    Joined:
    Mar 9, 2014
    Messages:
    95
    Likes Received:
    12
    But big GPUs/FPGAs and phone SoC CPU cores are very different. Bigger GPUs/FPGAs beg for a true HP node.
     
  9. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,496
    Likes Received:
    910
    Apparently Falkor has 10 to 15 stages. That's admittedly slightly vague, but Qualcomm provides some latency figures for the various caches, and they're quite tight.

    I'm not saying the 10nm process (Samsung's, in this case) is appropriate for desktop GPUs; I'm saying I have yet to see any justification for why it isn't that amounts to more than hand-waving.
     
  10. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    A ratio of 1:2 vs 1:32 still means that the SM layout will be very different.

    It’s only a tiny little bit different, not very different. :)

    The size of a multiplier does not scale linearly with its operand width. FP64 is not just 2x FP32 either.

    Agreed.

    It’d be kinda fun to see a bunch of people lose their minds over a 7nm chip still just being Maxwell-based (Ampwell!) with some minor changes to address a few flaws, and still see it beat the competition. The Maxwell design was really very good.

    But I hope we’ll see something based on Volta. It’s just more interesting.
     
    A1xLLcqAgt0qc2RyMz0y likes this.
  11. Infinisearch

    Veteran Regular

    Joined:
    Jul 22, 2004
    Messages:
    739
    Likes Received:
    139
    Location:
    USA
    Fair enough.
    I just meant that having a huge gap of unused real estate in the middle of an SM would be quite silly, and moving everything around to get maximum occupancy can be quite tricky.
    I know it's not linear, although I haven't got around to studying Dadda multipliers yet.
    Were the higher clocks in Pascal purely due to process differences? If not, I'd say Maxwell was good and Pascal was very good.
     
  12. Bondrewd

    Regular Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    523
    Likes Received:
    239
    Ye. Moving from planar to FinFETs was surely nice.
     
  13. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    There are a lot of different multiplier architectures, but if you’re not doing it iteratively, their gate count is quadratic in the size of the mantissa.
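
As a rough illustration of the quadratic point above, here is a minimal sketch that only counts the partial-product bits of a non-iterative (array-style) significand multiplier. The significand widths are the IEEE 754 ones (including the implicit leading bit); the function names and the n*n cost model are assumptions for illustration, not a gate count for any real Nvidia design.

```python
# Significand widths in bits, including the implicit leading 1 (IEEE 754).
SIGNIFICAND_BITS = {"fp16": 11, "fp32": 24, "fp64": 53}


def partial_product_bits(fmt: str) -> int:
    """Number of 1-bit partial products in an n x n array multiplier.

    Assumption: cost is modelled as n * n AND terms only; real designs
    (Booth encoding, Wallace/Dadda reduction) change the constant factors.
    """
    n = SIGNIFICAND_BITS[fmt]
    return n * n


def relative_cost(fmt: str, baseline: str = "fp32") -> float:
    """Cost of one `fmt` multiplier relative to one `baseline` multiplier."""
    return partial_product_bits(fmt) / partial_product_bits(baseline)


if __name__ == "__main__":
    for fmt in SIGNIFICAND_BITS:
        print(f"{fmt}: {partial_product_bits(fmt)} partial-product bits, "
              f"{relative_cost(fmt):.2f}x fp32")
```

Under this simplified model an FP64 significand multiplier comes out at close to 5x an FP32 one rather than 2x, which is the sense in which FP64 is "not just 2x FP32".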

    Process is the biggest part of it. And, per Nvidia, “path optimization”. (See https://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-1080-and-1070-founders-edition-review/6 )
     
  14. el etro

    Newcomer

    Joined:
    Mar 9, 2014
    Messages:
    95
    Likes Received:
    12
    I expect Ampere based on Volta.
     
  15. Infinisearch

    Veteran Regular

    Joined:
    Jul 22, 2004
    Messages:
    739
    Likes Received:
    139
    Location:
    USA
    BTW, I remember reading that the Drive PX Pegasus was shown at GTC Europe 2017. It was said to be based on a post-Volta architecture... so does anyone think that is, or is related to, Ampere?
     
  16. iMacmatician

    Regular

    Joined:
    Jul 24, 2010
    Messages:
    773
    Likes Received:
    200
    According to Fudzilla, the Volta successor is for machine learning while the GeForce line will be targeted by a separate architecture, both set for 2018.

     
  17. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    299
    Location:
    UK
    Irrespective of codenames, I wonder how many new GPUs we're getting on 12nm, versus parts of the line-up waiting for 7nm? And whether 7nm will start on a monster GPU (like 16nm started on P100 and 12nm on V100) or if NVIDIA will handle that differently this time?

    I think they believe deep learning and HPC will be a more competitive market than gaming, and more power-sensitive (rather than area-sensitive), which gives a greater incentive to put 7nm on the monster GPU first. But I don't know whether it's worth refreshing the entire line-up on 12nm when 7nm is coming relatively soon?
     
    Lightman and CarstenS like this.
  18. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    I think Nvidia may soon be caught between nodes if they delay 12nm for their next gen much longer *shrug*.
    The V100 and Titan needed launching within a specific window, and still early enough to make sense; it could be that Nvidia does not want to get caught splitting a model range between nodes across the whole Tesla/Quadro/Geforce line-up, which has synergy.
    But tbh Volta was always presented primarily as an HPC-DL model, at least in the presentations I have; and from a quick search, all the roadmaps on tech sites that include Volta show it with, or as part of, the SGEMM/DP context slides.

    It may be semantics, but one could argue the Tesla P100 and Quadro GP100 are a distinct architecture from the rest of the Pascal line; so the talk of a different architecture to Volta may refer to this differentiation and/or possibly a FinFET node change. There is also the looming shadow of the new architecture, but that feels too early to me (I have not seen any notable reference in Tesla presentations).

    And any successor to Volta in the HPC/scaled-out DL space will IMO still be a collaboration with IBM, albeit implemented in distinct platforms such as Tegra, just as we see now.
    Another change could be the launch cycle for the platforms: in the recent past, Tegra followed the accelerator/GPU by 6 months, and even then it was usually development-kit sampling with select clients.
     
    #58 CSI PC, Dec 22, 2017
    Last edited: Dec 22, 2017
  19. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    Given the competitive landscape, a GP102 shrunk to 12nm (and possibly outfitted with GDDR6, depending on the rework necessary for the memory controllers) should carry them just fine from a business perspective until 7nm reaches a yield level viable for price-sensitive consumer markets.
     
  20. Bondrewd

    Regular Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    523
    Likes Received:
    239
    I have a weird feeling that won't happen until EUV insertion.
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.