Nvidia Volta Speculation Thread

Discussion in 'Architecture and Products' started by DSC, Mar 19, 2013.

  1. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    More like 300W per GPU and 300W for the rest of the system?
     
    DavidGraham, CSI PC, pharma and 3 others like this.
  2. xpea

    Regular Newcomer

    Joined:
    Jun 4, 2013
    Messages:
    399
    Likes Received:
    416
    In 12FFN, the N stands for Nvidia. It's a custom node for the green team ;-)
     
  3. iMacmatician

    Regular

    Joined:
    Jul 24, 2010
    Messages:
    781
    Likes Received:
    211
    Looks like you're right, since NVIDIA mentions in the devblog that the V100 has a 300 W TDP.
     
  4. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    So a bit like Samsung's 14nm LPE and 10nm LPE, but nvidia bought everything?
     
  5. gamervivek

    Regular Newcomer

    Joined:
    Sep 13, 2008
    Messages:
    732
    Likes Received:
    223
    Location:
    india
    An 800mm² chip is clocked at 1.4-1.5GHz, so I think it's not out of the realm of reason to expect desktop chips at 2.0GHz stock. If Nvidia put out a gaming chip with the same number of CUDA cores as the behemoth, it should do 20 TFLOPS, about 70% more than the big gaming Pascal. Hopefully, the architectural improvements add another 10-20% and we have a pretty decent gaming card by next year's end.
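The 20 TFLOPS estimate follows from the usual peak-FP32 rule of thumb: each CUDA core retires one fused multiply-add (2 FLOPs) per clock. A minimal sketch, assuming 5120 CUDA cores for the big Volta chip and 3840 cores at roughly 1.58GHz boost for the full GP102; those inputs come from NVIDIA's published specs rather than from this post, and the function name is made up for illustration.

```python
def peak_fp32_tflops(cuda_cores, clock_ghz):
    """Peak FP32 throughput, assuming one FMA (2 FLOPs) per core per clock."""
    return cuda_cores * 2 * clock_ghz / 1000.0

# Assumed inputs: a hypothetical gaming Volta with GV100's core count at 2.0GHz,
# compared with a full GP102 (3840 cores) at about 1.58GHz boost.
volta_gaming = peak_fp32_tflops(5120, 2.0)
big_pascal = peak_fp32_tflops(3840, 1.582)
uplift_pct = (volta_gaming / big_pascal - 1) * 100

print(f"{volta_gaming:.1f} TFLOPS vs {big_pascal:.1f} TFLOPS: +{uplift_pct:.0f}%")
```

Under those assumptions the uplift comes out a little under 70%, before any per-core architectural gains.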
     
  6. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,247
    Likes Received:
    3,447
    They'll have to discard the Tensor cores and the DP units (like they usually do). Also, Volta has new scheduling hardware to handle all of these cores. We could see a reduction in that section as well.

    https://devblogs.nvidia.com/parallelforall/inside-volta/?ncid=so-twi-vt-13918
     
  7. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,346
    Likes Received:
    3,864
    Location:
    Well within 3d
    The thread scheduling method now tracks thread context per work item, and allows for instructions belonging to both paths to issue rather than the hardware running down one path until it reaches its end and then starting on the other. I'll have to take some time to digest the information. Nvidia seems to be describing one benefit of their solution as removing the deadlock threat of synchronization operations being split between diverged paths.
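The deadlock point can be illustrated with a toy model. This is only a sketch in plain Python, not a model of the real hardware: each diverged path is a generator, one `yield` stands for one issued instruction, and the two scheduler functions (hypothetical names) either run one path to its end before the other or let both paths issue in turn.

```python
def waiter(state):
    # Diverged path A: spin until the other path sets the flag.
    while not state["flag"]:
        yield  # one issued instruction

def setter(state):
    # Diverged path B: do some work, then release the waiter.
    yield
    state["flag"] = True
    yield

def run_serial(paths, budget=1000):
    """Run each path to its end before starting the next one."""
    steps = 0
    for path in paths:
        for _ in path:
            steps += 1
            if steps >= budget:
                return False  # budget exhausted: treated as a deadlock
    return True

def run_interleaved(paths, budget=1000):
    """Let every live path issue one instruction per round."""
    live = list(paths)
    steps = 0
    while live:
        for path in list(live):
            try:
                next(path)
            except StopIteration:
                live.remove(path)
            steps += 1
            if steps >= budget:
                return False
    return True

state = {"flag": False}
serial_ok = run_serial([waiter(state), setter(state)])

state = {"flag": False}
interleaved_ok = run_interleaved([waiter(state), setter(state)])

print("serial finishes:", serial_ok)
print("interleaved finishes:", interleaved_ok)
```

In this toy, the serial scheduler never leaves the spinning path, so the path that would set the flag never runs; the interleaved scheduler finishes after a few steps.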
     
    Alexko, pharma and DavidGraham like this.
  8. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Yeah.
    More impressive, though, is staying within 300W with the FP64 units while also expanding NVLink 2 performance; that is where a lot of the power demand/TDP comes from (FP64 more specifically, but the NVLink mezzanine is pretty demanding too).

    It is a very interesting design, and the spec is impressive too. I mentioned to someone else a while ago that it is a bit like Kepler->Maxwell repeated, this time as Pascal->Volta.
    They increase the die size by 33.6% while impressively keeping to the same 300W, and yet they go further:
    FP32 compute increases by 41.5% or 2x (yeah, depends upon function with Tensor).
    FP64 compute increases by 41.5%.
    FP16 compute increases by 41.5% or 4x (yeah, depends upon function with Tensor).
    They squeeze an extra 41% CUDA cores into that 33.6% die increase, importantly along with additional functions/units.
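The die-size and compute percentages can be reproduced from the headline specs. A minimal sketch, assuming the Tesla P100 and V100 figures as NVIDIA published them (610mm² vs 815mm²; 10.6/5.3/21.2 vs 15/7.5/30 TFLOPS for FP32/FP64/FP16, non-Tensor); these inputs are not quoted in the thread, so treat them as assumptions.

```python
def pct_increase(old, new):
    """Percentage increase going from old to new."""
    return (new / old - 1) * 100

# (P100, V100) pairs; assumed from NVIDIA's published spec tables.
specs = {
    "die area (mm^2)": (610, 815),
    "FP32 (TFLOPS)": (10.6, 15.0),
    "FP64 (TFLOPS)": (5.3, 7.5),
    "FP16 (TFLOPS)": (21.2, 30.0),
}

gains = {name: pct_increase(p100, v100) for name, (p100, v100) in specs.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain:.1f}%")
```

With those inputs the die area grows by about 33.6% and each non-Tensor compute rate by about 41.5%.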

    Other important aspects include heavily revised thread scheduling and cache behaviour.
    The relevant devblog sections are Independent Thread Scheduling, and for the L0/L1 cache both Volta SM (Streaming Multiprocessor) and Enhanced L1 Data Cache and Shared Memory:
    https://devblogs.nvidia.com/parallelforall/inside-volta/

    More of a monster than I expected TBH, but it fits with what was being said quite a while ago about how it is another jump from Pascal with arch changes (and also, critically, efficiency, looking at those specs).
    It will be interesting to see how GV100 pans out as a Quadro in the 2nd half of next year. Shame no-one has yet tested the Quadro GP100 with dual NVLink to see how well it works with the professional applications whose devs Nvidia works closely with for Quadros.
    Cheers

    Edit:
    Sorry Graham, I did not read your post before posting; I see you also reference the additional info on the devblog.
    But I think you will find a version of the Tensor cores on certain other CUDA/Volta GPUs.
    Also forgot to say: as expected, NVLink 2 increases the number of links supported from 4 to 6, now at 50GB/s each rather than 40GB/s.
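For the NVLink numbers, the aggregate per-GPU bandwidth is just links times per-link rate. A small sketch using only the link counts and per-link figures from this post; the helper name is made up.

```python
def aggregate_gb_per_s(links, per_link_gb_per_s):
    """Total NVLink bandwidth per GPU when all links are in use."""
    return links * per_link_gb_per_s

nvlink1_total = aggregate_gb_per_s(4, 40)  # first-generation NVLink figures quoted above
nvlink2_total = aggregate_gb_per_s(6, 50)  # NVLink 2 figures quoted above
nvlink_gain_pct = (nvlink2_total / nvlink1_total - 1) * 100

print(f"{nvlink1_total} GB/s -> {nvlink2_total} GB/s (+{nvlink_gain_pct:.1f}%)")
```

That is 160GB/s going to 300GB/s in aggregate.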

    Edit2:
    Was tired; just corrected the Tensor specifics on a proofread.
     
    #188 CSI PC, May 10, 2017
    Last edited: May 10, 2017
  9. pharma

    Veteran Regular

    Joined:
    Mar 29, 2004
    Messages:
    3,412
    Likes Received:
    2,070
    [IMG]
     
    xpea likes this.
  10. itsmydamnation

    Veteran Regular

    Joined:
    Apr 29, 2007
    Messages:
    1,309
    Likes Received:
    407
    Location:
    Australia
    Have they confirmed this is one die? To me, at 800mm² it makes far more sense for this to be two dies. Given that it's using HBM, there is already an interposer, so having two 400mm² chips with a high-bandwidth fabric between L2 slices* makes far more sense to me.

    *or similar position in the architecture
     
  11. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    They stated it is at the reticle limit, so one die.
     
    Lightman and pharma like this.
  12. manux

    Veteran Regular

    Joined:
    Sep 7, 2002
    Messages:
    1,938
    Likes Received:
    809
    Location:
    Earth
    pharma and Razor1 like this.
  13. itsmydamnation

    Veteran Regular

    Joined:
    Apr 29, 2007
    Messages:
    1,309
    Likes Received:
    407
    Location:
    Australia
    Well, it truly is insane, gotta love the ambition. Poor old Knights-*.
     
    Razor1 likes this.
  14. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY

    AnandTech article, pretty good. Not sure why they mentioned less flexibility vs more performance. I know it was talked about in the presentation when comparing to other chip types (FPGAs and CPUs), but I don't think Volta's architecture is going to be less flexible than past GPU architectures from NV. Maybe Ryan can explain.
     
  15. itsmydamnation

    Veteran Regular

    Joined:
    Apr 29, 2007
    Messages:
    1,309
    Likes Received:
    407
    Location:
    Australia
    He was talking only about the Tensor Cores, which are less flexible; they are targeting a specific subset of workloads.
     
    ToTTenTranz and Razor1 like this.
  16. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    ah ok thx!
     
  17. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,247
    Likes Received:
    3,447
    Maybe they can retain some of them in a GV102 core for the Titan crowd? Judging by previous trends, a GV104 core will most likely discard them completely. Speaking of which, I think we can expect a full GV104 to be roughly 20~30% faster than a Titan Xp (full GP102), if NV manages high enough clocks.
     
  18. Ryan Smith

    Regular

    Joined:
    Mar 26, 2010
    Messages:
    623
    Likes Received:
    1,095
    Location:
    PCIe x16_1
    Yep, one die. It's just insane. And that doesn't even get into the interposer (you can't get a traditional interposer large enough).

    In 5 years when everyone starts throwing these out, I'm going to have to get one to add to the GPU collection...
     
    CSI PC, Lightman, pharma and 2 others like this.
  19. shiznit

    Regular Newcomer

    Joined:
    Nov 27, 2007
    Messages:
    334
    Likes Received:
    84
    Location:
    Oblast of Columbia
    Is there a game rendering application for the Tensor units?
     
  20. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    this kinda shows how much they are expecting in sales from DL and HPC to push the limits like this though.
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.