Nvidia Volta Speculation Thread

Discussion in 'Architecture and Products' started by DSC, Mar 19, 2013.

  1. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Volta was always meant to be 2017.
    The 2014 'Whitepaper' had it as 2017 (I still have a copy). Presentations from IBM have it as 2017, HPC presentations from Nvidia have it as 2017, and the announcements for several of the first Volta supercomputers had it as 2017. One of the project workflow-milestones has 2017 (not saying it will complete then, as these are massive projects with much training, code optimisation, and multiple diverse technologies to be implemented), and one of those labs (Oak Ridge National Labs), even in 2016 just after the launch of Pascal, said 2017 for their Volta implementation in the description section of their YouTube video.
    We have also now been given the actual performance specs, revised from the initial estimates, for both Xavier and 2 of the supercomputers that Nvidia is contractually obliged to deliver.
    Xavier, the 'Tegra' version of Volta, will be in limited manufacturing and sampling status very early in Q4; Nvidia launches the Tegra model after the dGPU but usually announces it first.

    The 2018 date came about because of rumours it would be 10nm, and because others looked at the more general product roadmap slide and took it as explicitly saying it must launch in 2018.

    Here is the Oak Ridge National Labs release, importantly from after Pascal launched.
    June 28th 2016:
    You need to go to the YouTube link itself to see the description, as they only talk generally about Summit in the video.

    3,400 nodes is over 20,000 'V100' accelerators.
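    As a quick sanity check on that figure, a sketch assuming six V100s per Summit node (a number from public Summit descriptions, not stated in this thread):

    ```python
    # Rough sanity check on the Summit accelerator count.
    # gpus_per_node = 6 is an assumption from public Summit materials.
    nodes = 3_400
    gpus_per_node = 6
    total_gpus = nodes * gpus_per_node
    print(total_gpus)  # 20400, i.e. "over 20,000" V100 accelerators
    ```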
    Cheers
     
    #101 CSI PC, Jan 19, 2017
    Last edited: Jan 19, 2017
    nnunn and pharma like this.
  2. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Well, if they cannot do it, then there will be problems providing sampling-status Volta Xavier to automobile manufacturers in very late Q3/early Q4 this year :)
    The Tegra design comes after the dGPU/Tesla.
    But that does not mean it will be available generally, the same approach we saw with P100; although we were offered a full DGX-1 in mid-to-late Q3 2016 with only a 1-week wait for guaranteed delivery, and I know of one Nvidia Elite Solutions Provider who had a certified 'node' (their own box with 8x P100) available in under a week by early Q4 2016.
    It could slip, I agree, but that would be the surprise, rather than the launch process of P100 being repeated a bit later in 2017 with Volta 'V100' (sometime mid to maybe late summer).
    Cheers
     
    #102 CSI PC, Jan 19, 2017
    Last edited: Jan 19, 2017
  3. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,420
    Likes Received:
    179
    Location:
    Chania
    I'm still trying to figure out how one can get to 20 DL TOPs; going up to 30 is too advanced a lesson for the layman here :p
     
  4. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Hehe, I must say his claims for Drive PX2 at CES last year were rather misleading (the 20 DL TOPs back then was a mix of the 2x 'Parker' SoCs and the dual GPUs, but that was not made clear until full details came out; it was the full PX2 solution on offer rather than the smaller modules). So I am not putting too much weight on the latest claims of Xavier performance, because they seem too radical even with the doubled cores and the new custom ARM. But the 2 supercomputers with their final specs released by Nvidia/IBM are a different matter, as these are contractual obligations with serious ramifications in many ways, not just for those 2 projects but also for others and for both companies' reputations in the science-analytics-AI markets.
    Cheers
     
  5. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,224
    Likes Received:
    1,895
    Location:
    Finland
    Yet on their roadmap they have it in 2018, not 2017.
    I'm aware of the supercomputers supposedly being ready towards the end of 2017, but that doesn't necessarily make Volta a "2017 product" really.
     
  6. xpea

    Regular Newcomer

    Joined:
    Jun 4, 2013
    Messages:
    372
    Likes Received:
    309
    The latest rumors point to a custom TSMC 12nm process for Volta, with a lot of wafers already allocated

    most of Xavier's TOPS come from the CVA (Computer Vision Accelerator), not Volta
     
    no-X, pharma and iMacmatician like this.
  7. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    It is a general product roadmap-cycle, not an explicit timeline of launches - this is what catches people out with AMD launches too. Read the actual HPC presentations that have specific dates regarding Volta, including those from IBM.
    As I mentioned, every one of them has 2017.
    Cheers
     
  8. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    All nV generations have launched on their one-and-a-half-year cycles, which as always would put this one at the end of 2017 or early (Q1) 2018. To expect anything else would mean a delay or something unexpected.
     
  9. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Yeah, that could be possible, as it is not much of a node-shrink risk; wasn't it originally meant to be part of TSMC's group of 16nm options?
    As for Xavier, after Drive PX2 I am going to wait and see this time how it all pans out in terms of what complete architecture-solution is required for the 30 DL TOPS.
    Thanks
     
  10. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Just to add, the supercomputers will probably not go live until early 2018; the amount of work on these projects is massive, and the project workflow-update I saw quite a while ago covers 12 months of activity involving all parties.
    Just 3 supercomputers (3 project-linked labs stated for 2017, but as I mention not necessarily going live until early 2018) require, I think, just over 40,000 'V100' dGPUs, along with the diverse range of advanced tech that goes beyond Volta, and critically the training-code transition, some of which will require large-scale nodes.
    But to be clear, IMO November-December is when GV100 will be in ramped-up manufacturing to meet all core client demands (large-scale client projects directly involving Nvidia with IBM/Cray/etc), though it will be supplied in a similar way to the NVLink P100 from mid-to-late summer.
    But IMO this will be a 2017 product the same way P100 was a Q2-Q3 2016 product, even though you could not purchase an individual PCIe P100 until this year (still not sure you can even yet).
    Some sites were using the fact you cannot buy an individual PCIe P100 as evidence that Nvidia is having issues manufacturing the P100, or that it should not be classified as in manufacturing status, but as I mentioned I have not seen evidence of that, even back in Q3, if you were looking to buy a full node.
    What is less clear is whether Nvidia will stay with the normal linked schedule for 1-2 consumer GPUs or hold them back until early next year from a business/sales product-strategy perspective, but even if they maintain some trend of the Pascal cycle it would not be until mid-to-late Q4 at the earliest.
    I doubt Nvidia has decided themselves just yet, but if they did release, say, a 1080 replacement, they could hold back the lower GPUs for 4 months like they did with Maxwell.
    The 980 launched 19th September 2014 while the 960 launched 22nd January 2015; with Pascal, the gap between the 1080 and 1060 was reduced to 7 weeks.
    Against this decision, though, a Volta 'V1080' would be competing with the 1080 Ti, but then it would be price/margin-competitive (both in manufacturing and at retail) against Vega.
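    Those gaps can be checked with a quick date calculation; the 980/960 dates are given above, while the 1080 (May 27th 2016) and 1060 (July 19th 2016) launch dates are my own assumption, not stated in this post:

    ```python
    from datetime import date

    # 980/960 dates are from the post; the Pascal launch dates
    # (1080 on 2016-05-27, 1060 on 2016-07-19) are assumed, not from the post.
    gap_maxwell = date(2015, 1, 22) - date(2014, 9, 19)
    gap_pascal = date(2016, 7, 19) - date(2016, 5, 27)

    print(round(gap_maxwell.days / 7, 1))  # 17.9 weeks, the ~4-month Maxwell gap
    print(round(gap_pascal.days / 7, 1))   # 7.6 weeks, the "7 weeks" Pascal gap
    ```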

    Cheers
     
    #110 CSI PC, Jan 20, 2017
    Last edited: Jan 20, 2017
  11. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,420
    Likes Received:
    179
    Location:
    Chania
    If you want to compare funky TOP numbers, Mobileye claims 15 TOPs out of a 5W SoC for its 2nd-generation PMAs (EyeQ5).
     
    Razor1 likes this.
  12. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    That's impressive if true and without caveats; I guess it also depends on whether it has all the functions the Drive PX2 has.
    Cheers
     
    pharma and Razor1 like this.
  13. xpea

    Regular Newcomer

    Joined:
    Jun 4, 2013
    Messages:
    372
    Likes Received:
    309
    From what I heard, 12nm is an improved 16nm for high-performance GPUs (NVDA didn't want to use 10nm as it's purely a SoC node for Apple/QC, like 20nm was, and 7nm is too far away)
    Xavier's CVA may be the commercial version of the MIT Eyeriss project:
    http://people.csail.mit.edu/emer/slides/2016.02.isscc.eyeriss.slides.pdf
     
    nnunn likes this.
  14. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    939
    Likes Received:
    35
    Location:
    LA, California
  15. Samwell

    Newcomer

    Joined:
    Dec 23, 2011
    Messages:
    112
    Likes Received:
    129
    I expected more from Mobileye. The Q5 is on a 7nm process. Put Xavier on that too and you'd get 30 TOPs in 15W, as it's two node jumps. So in theory it would be roughly a 7.5W Nvidia chip vs a 5W Mobileye at 15 TOPs if they were on the same node. Of course the Q5 will be more size-efficient, but Xavier will have a lot of other stuff and will be more general-purpose. A specialized chip should be able to do more, like Google's deep learning chip, which had ~10x the efficiency of GPUs.
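    That scaling argument can be sketched numerically; the "two node jumps halve power" factor is the post's implicit assumption, not a measured figure:

    ```python
    # Xavier as announced: 30 TOPs @ 30W on 16nm.
    xavier_tops, xavier_watts = 30, 30
    power_scale_two_jumps = 0.5  # assumed: 16nm -> 7nm halves power at iso-perf
    watts_7nm = xavier_watts * power_scale_two_jumps   # 15W @ 30 TOPs
    watts_at_15_tops = watts_7nm * (15 / xavier_tops)  # 7.5W @ 15 TOPs
    print(watts_7nm, watts_at_15_tops)  # 15.0 7.5, vs Mobileye's claimed 5W
    ```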

    Where are the 12nm rumours coming from? I actually believe this will be another mobile-optimized node, like 28nm HPM. GPUs have only launched on the major high-performance nodes, and so I think it'll be 16FF+ again.
     
  16. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,420
    Likes Received:
    179
    Location:
    Chania
    Not that it really matters, but I wouldn't be so fast to count 10FF as a node in that reasoning, especially since by all signs both GPU IHVs will skip 10FF for a very good reason. Currently the EyeQ4, with 2 PMAs at 2.5 TOPs under 28nm FD-SOI at 3W, will launch in cars by the end of this year, while Xavier will only be sampling in that timeframe. The EyeQ5 is slated for 2020, so it's more a plus/minus speculation about when each solution will end up in cars than about what each process can or cannot deliver in theory.

    I posted it only because NV obviously increased from 20 TOPs @ 20W to 30 TOPs @ 30W, which most likely comes mostly from the CVA additions, since power doesn't scale linearly with frequency increases; IHVs don't pump up specifications without a reason. I didn't know it myself, but BMW has struck a deal with both Mobileye and Intel for fully autonomous driving, and Intel lately sounds quite determined to break into the automotive market as well: http://s2.q4cdn.com/670976801/files/doc_news/2017-BMW-Intel-Mobileye-release_CES.pdf
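    The point that power doesn't scale linearly with frequency can be illustrated with the usual first-order CMOS dynamic-power model, P ~ C * V^2 * f; the specific scale factors below are illustrative assumptions, not Xavier data:

    ```python
    # First-order dynamic power model: P ~ C * V^2 * f.
    # Illustrative assumption: reaching 1.5x clock needs ~1.2x voltage.
    freq_scale = 1.5
    volt_scale = 1.2
    power_scale = freq_scale * volt_scale ** 2
    print(power_scale)  # ~2.16x the power for only 1.5x the throughput
    ```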
     
  17. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Yeah, that is what I heard as well, although it seemed the decision to split 12nm from 16nm came from TSMC.
    Thanks for the link, really interesting.
    Are you sure that has anything to do with Xavier?
    It looks to be a very specific CNN and chip design at MIT, with partial Nvidia involvement in some way, as they do a lot with MIT.
    It sort of reminds me of when Krashinsky was researching Temporal SIMT while at MIT and then went on to work for Nvidia (some of his work was presented in collaboration with Nvidia), or of Jan Lucas's Temporal SIMD, which he built on Nvidia hardware to prove his model, although this MIT tech is seriously much further along than either of those.
    Thanks.
    Edit:
    Reading more about Eyeriss, it is funded to some extent by DARPA, which makes sense.
     
    #117 CSI PC, Jan 21, 2017
    Last edited: Jan 21, 2017
  18. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    716
    Likes Received:
    283
    So much blah, blah about the need for another 10x in compute power.
    Nobody talks about efficient algorithms; these can often solve problems much faster on slower hardware than straightforward algorithms can on fast hardware.
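    To illustrate the point with a toy example of my own (not from the post), counting equal pairs in a list: the hash-map version wins asymptotically, so no constant-factor hardware speedup saves the naive quadratic scan once n grows:

    ```python
    from collections import Counter

    def count_pairs_naive(xs):
        # O(n^2): compare every pair directly.
        return sum(xs[i] == xs[j]
                   for i in range(len(xs))
                   for j in range(i + 1, len(xs)))

    def count_pairs_fast(xs):
        # O(n): count occurrences, then c-choose-2 per value.
        return sum(c * (c - 1) // 2 for c in Counter(xs).values())

    xs = [1, 2, 2, 3, 3, 3]
    print(count_pairs_naive(xs), count_pairs_fast(xs))  # 4 4
    ```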
     
  19. entity279

    Veteran Regular Subscriber

    Joined:
    May 12, 2008
    Messages:
    1,235
    Likes Received:
    424
    Location:
    Romania
    Which would also make the efficient algorithms run 10x faster
     
    Razor1 and xpea like this.
  20. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    It's a little bit ridiculous to assume that those who rent (or are given) precious supercomputer time, and often have to wait in line for quite a while to get it, would waste that resource by throwing code at it that only runs at a fraction of its potential.

    My experience with people who use this kind of thing is that they spend a lot of time optimizing.

    But even if the code wasted 80% of the cycles, the new computer would still run 2x faster than the previous one theoretically could.
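    The arithmetic behind that 2x figure, taking the thread's assumed ~10x generational jump at face value:

    ```python
    raw_speedup = 10.0   # assumed new/old peak ratio from the thread
    utilization = 0.2    # code "wastes 80% of the cycles"
    effective = raw_speedup * utilization
    print(effective)  # 2.0: still twice the old machine's theoretical peak
    ```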
     
    pharma and Razor1 like this.