Nvidia Post-Volta (Ampere?) Rumor and Speculation Thread

Discussion in 'Architecture and Products' started by Geeforcer, Nov 12, 2017.

  1. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    843
    Well, going down that path, when did Nvidia state "There’s definitely functionality in Volta that accelerates" for Amber?
    I have posted the FP32 Amber results in the past, and they are shockingly good relative to Pascal, partly due to the cache and the V100 structure.
    They do state that it accelerates Amber.
    But they do not say "functionality in Volta".
     
    #161 CSI PC, Mar 20, 2018
    Last edited: Mar 20, 2018
  2. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,737
    Likes Received:
    1,970
    Location:
    Germany
    While having nothing to do with a broader press briefing, in the context of which Nvidia's representative said the part in question, but apparently was not asked about Amber and Volta....
    edit: That sentence made no sense. What I meant was: Amber is not related to the question asked and debated. Tamasi said this in a broader press briefing (maybe not exclusively, but also) in order to answer the question without being specific. I was merely pointing out that what satisfies this quote is such a basic bit of information that it does not move the discussion forward.
     
    #162 CarstenS, Mar 20, 2018
    Last edited: Mar 20, 2018
  3. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    843
    OK maybe I am missing something.
    You stated the quote should be ignored because, as part of marketing, when a technical VP says "functionality in Volta to accelerate Raytracing" he probably means cache.
    However, in the various marketing of V100 with Amber and some other applications that also benefit strongly from the V100, they only say "accelerated".
    There is no reference to "functionality in Volta" in that instance, and that is marketing documentation as well as Volta-related documentation.

    So from what I understand, marketing to date has not used the term "functionality in Volta" in the context of cache and the acceleration of applications/solutions.
     
    #163 CSI PC, Mar 20, 2018
    Last edited: Mar 20, 2018
  4. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,737
    Likes Received:
    1,970
    Location:
    Germany
    Just to end this from my side: When an IHV talks to press, it's always marketing.
     
    Silent_Buddha, Malo and Lightman like this.
  5. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    843
    Well, that would mean we could not treat anything Timothy Prickett Morgan at TheNextPlatform says as worthwhile, considering his contacts and sources at IHVs and that he has spoken to them about their launch technologies. I could name other reputable tech journalists/analysts as well.
    Seems a flippant response, Carsten.
     
    #165 CSI PC, Mar 20, 2018
    Last edited: Mar 20, 2018
  6. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    843
    Last post, looking at this objectively.
    The gain from a DGX-1 P100 GPU to a DGX-1 V100 GPU in their early raytracing demo that I posted (without AI) is comparable to the gain seen with Amber going from DGX-1 P100 to Titan V; the gain is around 1.8x in both tests/demos.
    Amber is not the only application well suited to the acceleration the V100 design provides, in part due to its architecture in terms of cache, SM, and registers, but in all the various Nvidia launch snippets/interviews I have looked at, I cannot find marketing putting this in the context of "functionality in Volta" in any way.

    The context is with the AI-tensor disabled for raytracing.
    If it is a marketing gimmick in terms of "functionality in Volta", then it is one they have not done in the past for Volta.
     
    #166 CSI PC, Mar 20, 2018
    Last edited: Mar 20, 2018
  7. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,737
    Likes Received:
    1,970
    Location:
    Germany
    Well, since you seem to like to go word by word on everything, let's start with this:
    He is dodging the question, filling in something that is cleared to state publicly while giving the impression to have positively answered the question. Media training, single-digit lesson (I guess).

    I seem not to have made it clear enough, so: I was referring to broader press briefings in the context of this discussion, which I wrote in the post before. Tamasi said as much in the one I was listening in on as well. Further, "an IHV" was meant in the sense of official communication via the aforementioned broad press briefings and the like, not individuals talking over a beer, maybe revealing confidential information. The latter, I would strongly assume, is not the case here, since a) no good journalist would expose his sources in this way if they were unofficial and b) no friendly contact at an IHV would give you such an obvious marketing line for an answer if he wants to keep you as a friend.

    Apart from that, I don't know Timothy Prickett Morgan, so I cannot comment on what he says, has said or will say in any qualified manner.

    edit: spelling.
     
    #167 CarstenS, Mar 20, 2018
    Last edited: Mar 21, 2018
  8. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    843
    So "functionality in Volta" seems to be directed at a recent update regarding raytracing and the originally shown chart, with the context now being OptiX for CUDA, which was not part of the earlier slides (nor, if you are interested, were the ray tracing extensions for Vulkan):

    [​IMG]

    The reason it fits the earlier narrative is that Nvidia states:
    So it is an evolution of OptiX that, while working with older architectures, Nvidia says works best with Volta, not just in performance but also in other enhancements, including traversal being integral to and controlled by OptiX.
    https://devblogs.nvidia.com/nvidia-optix-ray-tracing-powered-rtx/

    Separately, with the ray tracing extension for Vulkan coming soon, it will be interesting to see whether any games will look to use it with that API, albeit with tempered expectations, as it is early days.

    Edit:
    This was the original slide presented back in mid-March, when there was mention of further functionality within Volta.
    It makes it a bit clearer how this narrative seems to relate to OptiX in the April slides, and how the detail about compiler optimisation/traversal aligns strongly with Volta.
    [​IMG]
     
    #168 CSI PC, Apr 4, 2018
    Last edited: Apr 4, 2018
    pharma likes this.
  9. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    691
    Likes Received:
    243
    Was about to start a new thread for "Next HPC Nvidia GPU speculation (Ampere?)", but this existing thread could serve just as well.
    The Ampere name would not be illogical, as the volt and the ampere are closely related, being the units of electric potential and current.
    Some speculation:
    - 7 nm (not much doubt about that)
    - huge increase in tensor cores, like x2 (hardware for AI NN training is still much in demand)
    - 6 stacks of HBM2, for 48 GB (it cannot be less than the max for Turing, can it?)
    - no RT cores (not much use beyond raytracing)
     
    #169 Voxilla, Sep 16, 2018
    Last edited: Sep 16, 2018
    pharma and Heinrich4 like this.
  10. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,405
    Likes Received:
    401
    Location:
    New York
    I expect nVidia’s first foray into 7nm will be a straight shrink of Volta/Turing plus higher clocks, similar to Maxwell -> Pascal. The 12nm Volta and Turing chips are huge, and they’ll want to scale that back on an expensive new process, especially for gaming SKUs.

    The monster compute chip will likely still be very large though as the target market supports the required pricing.
     
  11. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    691
    Likes Received:
    243
    Not much room for higher clocks this time, it looks; the jump in clock from 28nm to 16nm was unusual and was partly due to the move to FinFET.
    A 600mm2, 7 nm GPU would be reasonable; that was also the size of the P100.
    7nm SoCs like the Kirin 980 have 7 billion transistors on ~100mm2.
    So that would be ~42 billion transistors for the new HPC GPU, twice that of the V100.
    (Suddenly big Turing TU102, with 18.6 billion transistors on 754mm2, does not look that 'big'.)
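
A minimal sketch of the arithmetic behind that ~42 billion figure, using only the numbers quoted in this post. The function and variable names are my own, and the key assumption (that a large GPU die would reach the same density as a 7nm mobile SoC) is exactly the point that is open to debate.

```python
# Back-of-the-envelope projection using only the figures quoted in the post.
# Assumption: a large GPU die reaches the same density as a 7nm mobile SoC.

KIRIN_980_TRANSISTORS = 7e9   # ~7 billion transistors, as quoted
KIRIN_980_AREA_MM2 = 100.0    # ~100 mm2, as quoted
HPC_GPU_AREA_MM2 = 600.0      # hypothetical die size, same class as P100


def projected_transistors(area_mm2, ref_transistors, ref_area_mm2):
    """Scale a reference chip's transistor count linearly with die area."""
    density_per_mm2 = ref_transistors / ref_area_mm2
    return density_per_mm2 * area_mm2


estimate = projected_transistors(
    HPC_GPU_AREA_MM2, KIRIN_980_TRANSISTORS, KIRIN_980_AREA_MM2
)
print(f"Projected: {estimate / 1e9:.0f} billion transistors")
```

Changing the reference density is the quickest way to see how sensitive the projection is to the SoC-versus-GPU assumption.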
     
    #171 Voxilla, Sep 16, 2018
    Last edited: Sep 18, 2018
  12. Bondrewd

    Regular Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    346
    Likes Received:
    139
    N7 HPC is considerably less dense than the SoC variant.
     
  13. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    691
    Likes Received:
    243
    Any more hard data/references to back up that statement?
    The best I could find:
    "N7 PPA (versus 16) is 3X density improvement, 35% speed gain, 65% power reduction. The N7 HPC track provides a further 13+% speed gain over N7 mobile."
     
    #173 Voxilla, Sep 18, 2018
    Last edited: Sep 18, 2018
  14. Bondrewd

    Regular Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    346
    Likes Received:
    139
     
  15. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    691
    Likes Received:
    243
    67 MTr/mm2 (for HPC, as you believe) equates to 6.7 B transistors for 100mm2, which is close to my previous statement of 7nm SoCs having 7 billion transistors on ~100mm2.
     
  16. Samwell

    Newcomer

    Joined:
    Dec 23, 2011
    Messages:
    108
    Likes Received:
    123
    You can't take it this way. These are marketing numbers for special cases: of the great 96 MTr/mm² for N7 Mobile, just 70 MTr/mm² is reached in real products (Huawei Kirin). If we apply the same marketing-to-real-world scaling to the 67 MTr/mm² HPC figure, we get just 4.9 billion transistors on 100mm². We shouldn't calculate with much more at the moment. 7nm HPC is just a doubling of transistors/mm² vs 16nm.
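
The scaling argument can be sketched as below, as a rough check only. The names are my own; the 67 MTr/mm² HPC figure comes from the post above, and using TU102 (18.6 B transistors on 754 mm², from post #171) as a stand-in for 16nm-class GPU density is my assumption, since TU102 is a 12nm part.

```python
# Sketch of the marketing-to-real-world scaling argument, using thread figures.
MARKETING_N7_MOBILE = 96.0  # MTr/mm2, quoted peak for N7 mobile
REAL_N7_MOBILE = 70.0       # MTr/mm2, reached by a shipping SoC (Kirin)
MARKETING_N7_HPC = 67.0     # MTr/mm2, HPC figure from the post above

# Assumption: HPC parts fall short of the marketing figure by the same ratio.
realisation_ratio = REAL_N7_MOBILE / MARKETING_N7_MOBILE
real_n7_hpc = MARKETING_N7_HPC * realisation_ratio

# MTr/mm2 * 100 mm2 gives millions of transistors; /1000 converts to billions.
billions_per_100mm2 = real_n7_hpc * 100.0 / 1000.0

# Assumed proxy for 16nm-class GPU density: TU102, 18.6 B on 754 mm2.
tu102_density = 18.6e9 / 754.0 / 1e6
scaling_vs_16nm_class = real_n7_hpc / tu102_density

print(f"Realisation ratio: {realisation_ratio:.2f}")
print(f"Estimated real N7 HPC density: {real_n7_hpc:.1f} MTr/mm2")
print(f"Transistors on 100 mm2: {billions_per_100mm2:.1f} billion")
print(f"Scaling vs TU102 density: {scaling_vs_16nm_class:.2f}x")
```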
     
    #176 Samwell, Sep 28, 2018
    Last edited: Sep 28, 2018
  17. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    691
    Likes Received:
    243
    Apple A12 is 6.9 B transistors on 83.27 mm2, or 8.3 B transistors / 100mm2, or 83 MTr/mm2.
    Turing is only 2.47 B transistors / 100mm2, so for a 7 nm GPU, 6 B tr/100mm2 should not be too far off.
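
For anyone who wants to check these densities, here is a small sketch. The helper name is hypothetical, the A12 numbers are the ones quoted in this post, and the TU102 count and area are taken from post #171 on the assumption that "Turing" here means TU102.

```python
def density_mtr_per_mm2(transistors, area_mm2):
    """Return transistor density in millions of transistors per mm2."""
    return transistors / area_mm2 / 1e6


# Figures as quoted in the thread.
a12 = density_mtr_per_mm2(6.9e9, 83.27)     # Apple A12, 7nm mobile SoC
tu102 = density_mtr_per_mm2(18.6e9, 754.0)  # Turing TU102, 12nm GPU

# The guess in the post: 6 B transistors per 100 mm2 = 60 MTr/mm2.
guess_7nm_gpu = 60.0

print(f"A12:   {a12:.1f} MTr/mm2")
print(f"TU102: {tu102:.2f} MTr/mm2")
print(f"Guess is {guess_7nm_gpu / tu102:.2f}x TU102 "
      f"and {guess_7nm_gpu / a12:.2f}x A12")
```

So the 6 B tr/100mm2 guess sits between the shipping 12nm GPU density and the shipping 7nm SoC density.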
     
    McHuj likes this.
  18. ECH

    ECH
    Regular

    Joined:
    May 24, 2007
    Messages:
    682
    Likes Received:
    7
    Have there been any benchmarks done between Turing and Volta?
    How would one gauge a new arch that replaces Volta if there haven't been any recent tests done?
     
  19. del42sa

    Newcomer

    Joined:
    Jun 29, 2017
    Messages:
    59
    Likes Received:
    39
    Lightman and nnunn like this.
  20. Rootax

    Regular Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    930
    Likes Received:
    426
    Location:
    France