Nvidia Post-Volta (Ampere?) Rumor and Speculation Thread

He is not inferring anything, but giving a marketing answer that's within the scope of his briefing while at the same time giving the impression of having addressed the question asked.

Well, going down that path, when did Nvidia state "There’s definitely functionality in Volta that accelerates" for Amber?
I have posted results in the past for FP32 Amber, and they are shockingly good relative to Pascal, partially due to the cache and the V100's structure.
They do state that it accelerates Amber.
But they do not say "functionality in Volta".
 
Well, going down that path, when did Nvidia state "There’s definitely functionality in Volta that accelerates" for Amber?
While that has nothing to do with the broader press briefing in whose context Nvidia's representative said the part in question, he apparently was not asked about Amber and Volta....
edit: That sentence made no sense. What I meant was: Amber is not related to the question asked and debated. Tamasi said this in a broader press briefing (maybe not exclusively, but there as well) in order to answer the question without being specific. I was merely pointing out that what satisfies this quote is such a basic bit of information that it does not move the discussion forward.
 
While that has nothing to do with the broader press briefing in whose context Nvidia's representative said the part in question, he apparently was not asked about Amber and Volta....
OK, maybe I am missing something.
You stated the quote should be ignored because, as part of marketing, when a technical VP says "functionality in Volta to accelerate raytracing" he probably means cache.
However, in the various marketing of the V100 with Amber and other applications that also benefit strongly from the V100, they only say "accelerated".
There is no reference to "functionality in Volta" in that instance, and that is marketing documentation as well as Volta-related documentation.

So from what I understand, marketing to date has not used the term "functionality in Volta" in the context of cache and acceleration for applications/solutions.
 
Well, that would mean we could not take anything Timothy Prickett Morgan at TheNextPlatform says as worthwhile, considering his contacts and sources at IHVs when he has spoken to them about their launch technologies; I could name other reputable tech journalists/analysts as well.
Seems a flippant response, Carsten.
 
Last post looking at this objectively:
The gain from a DGX-1 P100 GPU to a DGX-1 V100 GPU in their early raytracing demo that I posted (without AI) is comparable to the gain seen with Amber going from a DGX-1 P100 to a Titan V; it is around 1.8x for both tests/demos.
Amber is not the only application well suited to the acceleration the V100 design provides, in part due to its cache/SM/register architecture, but in all the various Nvidia launch snippets/interviews I have looked at, I cannot find marketing putting this in the context of "functionality in Volta" in any way.

The context is raytracing with the AI/tensor cores disabled.
If it is a marketing gimmick in terms of "functionality in Volta", then it is one they have not used for Volta in the past.
 
Well, that would mean we could not take anything Timothy Prickett Morgan at TheNextPlatform says as worthwhile, considering his contacts and sources at IHVs when he has spoken to them about their launch technologies; I could name other reputable tech journalists/analysts as well.
Seems a flippant response, Carsten.

Well, since you seem to like to go word by word on everything, let's start with this:
If it were cache, then they would not be so hesitant to comment on that being the functionality, as it is already a known factor.
He is dodging the question, filling in something that is cleared for public statement while giving the impression of having positively answered the question. Media training, one of the first lessons (I guess).

Well, that would mean we could not take anything Timothy Prickett Morgan at TheNextPlatform says as worthwhile, considering his contacts and sources at IHVs when he has spoken to them about their launch technologies, […]
I seem not to have made it clear enough, so: I was referring to broader press briefings in the context of this discussion, which I wrote in the post before. Tamasi said as much in the one I was listening in on as well. Further, "an IHV" was meant in the sense of official communication via the aforementioned broad press briefings and the like, not individuals talking over a beer, perhaps revealing confidential information. The latter, I would strongly assume, is not the case here, since a) no good journalist would expose his sources this way were they unofficial, and b) no friendly contact at an IHV would give you such an obvious marketing line for an answer if he wanted to keep you as a friend.

Apart from that, I don't know Timothy Prickett Morgan, so I cannot comment on what he says, has said or will say in any qualified manner.

edit: spelling.
 
So "functionality in Volta" seems to be directed to a recent update with regards to Raytracing and the originally shown chart with context now around OptiX for CUDA that was not part of the the earlier slides (nor if interested the Ray tracing extensions Vulkan):

[Image: NVIDIA ray tracing hierarchy diagram]


The reason it fits the earlier narrative is that Nvidia states:
The core of OptiX is a domain-specific just-in-time compiler. The compiler generates custom ray-tracing kernels by combining user-supplied programs for ray generation, material shading, object intersection, and scene traversal. High performance is achieved by using a compact object model and ray-tracing compiler optimizations that map efficiently to the new RTX Technology and Volta GPUs.
So it is an evolution of OptiX that, while it works with older architectures, Nvidia says works best with Volta, not just in performance but also in other enhancements, including traversal being integral to and controlled by OptiX.
https://devblogs.nvidia.com/nvidia-optix-ray-tracing-powered-rtx/
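
To make the "user-supplied programs" part concrete, here is a minimal sketch of the classic host-side OptiX flow, assuming the pre-RTX OptiX 5.x C++ wrapper; the PTX module name and program names are hypothetical stand-ins:

Code:
// Minimal pre-RTX OptiX host-side setup (a sketch, assuming OptiX 5.x).
// The application supplies small programs (ray generation, intersection,
// shading) as PTX; OptiX's JIT compiler fuses them into one ray-tracing
// kernel, and traversal/acceleration stays under OptiX's control.
#include <optixu/optixpp_namespace.h>

int main()
{
    const unsigned width = 1280, height = 720;

    optix::Context ctx = optix::Context::create();
    ctx->setRayTypeCount(1);     // a single radiance ray type
    ctx->setEntryPointCount(1);  // a single ray-generation entry point

    // Hypothetical PTX module holding the user-supplied programs.
    optix::Program raygen = ctx->createProgramFromPTXFile("programs.ptx", "raygen");
    optix::Program miss   = ctx->createProgramFromPTXFile("programs.ptx", "miss");
    optix::Program isect  = ctx->createProgramFromPTXFile("programs.ptx", "sphere_intersect");
    optix::Program bounds = ctx->createProgramFromPTXFile("programs.ptx", "sphere_bounds");
    optix::Program chit   = ctx->createProgramFromPTXFile("programs.ptx", "closest_hit");

    ctx->setRayGenerationProgram(0, raygen);
    ctx->setMissProgram(0, miss);

    // One procedural primitive with its own intersection/bounds programs.
    optix::Geometry geom = ctx->createGeometry();
    geom->setPrimitiveCount(1u);
    geom->setIntersectionProgram(isect);
    geom->setBoundingBoxProgram(bounds);

    optix::Material mat = ctx->createMaterial();
    mat->setClosestHitProgram(0, chit);

    optix::GeometryInstance gi = ctx->createGeometryInstance(geom, &mat, &mat + 1);
    optix::GeometryGroup group = ctx->createGeometryGroup();
    group->addChild(gi);
    // The BVH build and traversal are owned by OptiX, not the application
    // (the traverser argument is ignored since OptiX 4).
    group->setAcceleration(ctx->createAcceleration("Trbvh", "Bvh"));
    ctx["top_object"]->set(group);

    optix::Buffer out = ctx->createBuffer(RT_BUFFER_OUTPUT, RT_FORMAT_FLOAT4, width, height);
    ctx["output_buffer"]->set(out);

    ctx->validate();
    ctx->launch(0, width, height);  // JIT compilation + trace happen here
    return 0;
}

The relevant point for the thread is that acceleration-structure build and traversal sit entirely behind createAcceleration()/launch(), which is exactly the layer where RTX/Volta-specific paths could be slotted in without any API change.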

Separately, with the raytracing extension for Vulkan coming soon, it will be interesting to see whether any games look to use this with that API, albeit with tempered expectations as it is early days.

Edit:
This was the original slide presented back in mid-March, when there was mention of further functionality within Volta.
It makes it a bit easier to see how this narrative relates to OptiX in the April slides, with the detail about compiler optimisation/traversal aligning strongly with Volta.
[Image: GDC 2018 pre-briefing deck slide]
 
Was about to start a new thread for "Next HPC Nvidia GPU speculation (Ampere?)", but this existing thread could serve just as well.
The Ampere name would not be illogical, as the volt and the ampere are closely related units of electric potential and current.
Some speculation:
- 7nm (not much doubt about that)
- a huge increase in tensor cores, something like 2x (hardware for AI/NN training is still much in demand)
- 6 stacks of HBM2, for 48 GB (it cannot be less than the max for Turing, can it?); see the rough sanity check below
- no RT cores (not much use beyond raytracing)
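
As a rough sanity check on the memory line above, a sketch assuming 8 GB (8-Hi) stacks as on the 32 GB V100; the per-stack bandwidth is my own inference from the V100's 4 stacks at ~900 GB/s, not a spec:

Code:
// Rough capacity/bandwidth estimate for a speculative 6-stack HBM2 part.
// Assumptions: 8 GB (8-Hi) stacks as on the 32 GB V100, and per-stack
// bandwidth inferred from the V100's 4 stacks at ~900 GB/s aggregate.
#include <cstdio>

int main()
{
    const int    stacks       = 6;
    const double gb_per_stack = 8.0;          // 8-Hi HBM2
    const double bw_per_stack = 900.0 / 4.0;  // GB/s, from V100

    std::printf("capacity : %.0f GB\n", stacks * gb_per_stack);    // 48 GB
    std::printf("bandwidth: ~%.0f GB/s\n", stacks * bw_per_stack); // ~1350 GB/s
    return 0;
}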
 
I expect nVidia's first foray into 7nm will be a straight shrink of Volta/Turing plus higher clocks, similar to Maxwell -> Pascal. The 12nm Volta and Turing chips are huge, and they'll want to scale that back on an expensive new process, especially for gaming SKUs.

The monster compute chip will likely still be very large, though, as the target market supports the required pricing.
 
I expect nVidia's first foray into 7nm will be a straight shrink of Volta/Turing plus higher clocks, similar to Maxwell -> Pascal. The 12nm Volta and Turing chips are huge, and they'll want to scale that back on an expensive new process, especially for gaming SKUs.

The monster compute chip will likely still be very large, though, as the target market supports the required pricing.

Not much room for higher clocks this time, it seems; the jump in clocks from 28nm to 16nm was unusual, partly due to the move to FinFET.
A 600mm², 7nm GPU would be reasonable; that was also roughly the size of the P100.
7nm SoCs like the Kirin 980 have ~7 billion transistors on ~100mm².
So that would be ~42 billion transistors for the new HPC GPU, twice that of the V100.
(Suddenly the big Turing TU102, at 18.6 billion transistors on 754mm², doesn't look that 'big'.)
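
Put as explicit arithmetic (the ~70 MTr/mm² figure is the real-product density assumed from the Kirin numbers above; the published V100 and TU102 counts are used for comparison):

Code:
// Back-of-envelope transistor budget for a 600 mm^2 7 nm HPC GPU, using
// the ~7 billion transistors per ~100 mm^2 (~70 MTr/mm^2) assumed above.
#include <cstdio>

int main()
{
    const double density = 70.0;   // MTr/mm^2, assumed real-product 7 nm
    const double die     = 600.0;  // mm^2, P100-class die size
    const double v100    = 21.1;   // B transistors (815 mm^2, 12FFN)
    const double tu102   = 18.6;   // B transistors (754 mm^2)

    const double budget = density * die / 1000.0;  // ~42 B transistors
    std::printf("budget: ~%.0f B transistors (%.1fx V100, %.1fx TU102)\n",
                budget, budget / v100, budget / tu102);
    return 0;
}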
 
67 MTr/mm² (for HPC, as you believe) equates to 6.7 billion transistors per 100mm²; that is close to my previous statement of 7nm GPUs having 7 billion transistors on ~100mm².

You can't take it that way. These are marketing numbers for special cases: of the great 96 MTr/mm² for N7 Mobile, just 70 MTr/mm² is reached with a real product (Huawei Kirin). If we take the same marketing-to-real-world scaling into account, we get just 4.9 billion transistors on 100mm². We shouldn't calculate with much more at the moment; 7nm HPC is just a doubling of transistors/mm² vs 16nm.
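
The derating in that argument, worked through (the only inputs are TSMC's headline figures and the Kirin-class ~70 MTr/mm²; applying the same ratio to N7 HPC is the poster's assumption):

Code:
// Derate TSMC's marketing density by the ratio observed on a shipping
// N7 Mobile product, then apply the same ratio to the N7 HPC figure.
#include <cstdio>

int main()
{
    const double mobile_marketing = 96.0;  // MTr/mm^2, TSMC headline N7 Mobile
    const double mobile_real      = 70.0;  // MTr/mm^2, Kirin-class product
    const double hpc_marketing    = 67.0;  // MTr/mm^2, N7 HPC

    const double derate   = mobile_real / mobile_marketing;  // ~0.73
    const double hpc_real = hpc_marketing * derate;          // ~48.9 MTr/mm^2
    std::printf("expected N7 HPC density: ~%.1f MTr/mm^2 "
                "(~%.1f B transistors per 100 mm^2)\n",
                hpc_real, hpc_real / 10.0);
    return 0;
}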
 
You can't take it that way. These are marketing numbers for special cases: of the great 96 MTr/mm² for N7 Mobile, just 70 MTr/mm² is reached with a real product (Huawei Kirin). If we take the same marketing-to-real-world scaling into account, we get just 4.9 billion transistors on 100mm². We shouldn't calculate with much more at the moment; 7nm HPC is just a doubling of transistors/mm² vs 16nm.

The Apple A12 is 6.9 billion transistors on 83.27mm², or 8.3 billion transistors per 100mm², or 83 MTr/mm².
Turing is only 2.47 billion transistors per 100mm²; for a 7nm GPU, 6 billion transistors per 100mm² should not be too far off.
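
The same comparison as code, using the published die figures (the ~6 billion per 100mm² target above is an estimate, not a measurement):

Code:
// Measured densities of shipping dies, for comparison with the estimates
// above (B transistors per 100 mm^2).
#include <cstdio>

int main()
{
    struct Die { const char* name; double btr; double mm2; };
    const Die dies[] = {
        { "Apple A12 (N7)",  6.9,  83.27 },  // ~8.3 per 100 mm^2
        { "TU102 (12FFN)",  18.6, 754.0  },  // ~2.5 per 100 mm^2
    };
    for (const Die& d : dies)
        std::printf("%-15s %.1f B transistors / 100 mm^2\n",
                    d.name, d.btr / d.mm2 * 100.0);
    return 0;
}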
 
Have there been any benchmarks run between Turing and Volta?
How would one gauge a new arch that replaces Volta if there haven't been any recent tests done?
 