Nvidia Pascal Announcement

A total guess to explain the extra GP107 variant: supporting a single HBM2 module would make for a powerful 4GB discrete mobile GPU, and HBM2 would make the package tiny, lower wattage, and mechanically easier to cool. Not sure the HBM2 cost for a mobile part makes economic/business sense, but it'd be very attractive for a high-end but compact laptop. It'd also be a nice part for a dense 4x or even 8x GRID card using an onboard PLX switch.
I'd think such a GPU would get its own name, rather than just "f", since it would be a completely different chip with a different memory controller
 
A total guess to explain the extra GP107 variant: supporting a single HBM2 module would make for a powerful 4GB discrete mobile GPU, and HBM2 would make the package tiny, lower wattage, and mechanically easier to cool. Not sure the HBM2 cost for a mobile part makes economic/business sense, but it'd be very attractive for a high-end but compact laptop. It'd also be a nice part for a dense 4x or even 8x GRID card using an onboard PLX switch.

Your post made me wonder whether using HBM and an interposer results in a significantly taller package, or whether those things aren't significant relative to e.g. the thickness of the heat spreader. (Thinness seems to be a thing in the laptop market these days, so that might be a non-cost-related negative.)
 
The interposer is a fraction of a mm thick, and the RAM stacks are probably no taller than a typical flash device, which IIRC also uses chip stacking (only using wire bonding to connect the dies to the substrate rather than through-silicon vias...)
 
I'd think such a GPU would get its own name, rather than just "f", since it would be a completely different chip with a different memory controller

What I was thinking: on top of the development cost of this SKU, and the questionable use of HBM on an entry-level chip like GP107, there's the production cost. High-end laptops will use high-end mobile GPUs if they are aimed at gaming or mobile workstations. If they are just aimed at being ultra-thin notebooks, they don't really need a bandwidth increase; you save far more space and watts by using a CPU with integrated graphics (as every ultra-thin notebook does today) than by adding a discrete mobile GPU. A GP107+HBM part would cost a bit too much for the maybe one laptop that would use it (and more as a marketing point than an efficient one). It looks to me like a real curiosity.

On the other hand, HBM could well be more worth it on an AMD-style APU system, with DDR4 for the CPU memory system.
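For scale, a quick back-of-envelope comparison (my own numbers, not from any announcement: I'm assuming a single 1024-bit HBM2 stack, 1.4-2.0Gbps per pin, and GP107's 128-bit GDDR5 bus at 7Gbps):

# Rough bandwidth comparison: one HBM2 stack vs. GP107's 128-bit GDDR5.
# Assumed figures: 1024-bit HBM2 interface at 1.4-2.0 Gbps/pin, 7 Gbps GDDR5.
def bandwidth_gb_s(bus_width_bits, pin_rate_gbps):
    return bus_width_bits * pin_rate_gbps / 8  # GB/s

print(bandwidth_gb_s(1024, 1.4))  # ~179 GB/s, one HBM2 stack at 1.4 Gbps
print(bandwidth_gb_s(1024, 2.0))  # ~256 GB/s, one HBM2 stack at 2.0 Gbps
print(bandwidth_gb_s(128, 7.0))   # ~112 GB/s, 128-bit GDDR5 at 7 Gbps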
 
I'd think such a GPU would get its own name, rather than just "f", since it would be a completely different chip with a different memory controller
GK110B (née GK180) is different from GK110, with improved L1 and GPU Boost. But that was an upgrade that replaced GK110, not a parallel chip line.
 
GK110B (née GK180) is different from GK110, with improved L1 and GPU Boost. But that was an upgrade that replaced GK110, not a parallel chip line.
Also, didn't the Tesla model jump from K20 to K40 with the GK110B?
That also meant more cores, but did they still manufacture the K20?
Anyway, checking: the K20C was still spec'd with GK110, and while it has more CUDA cores than the K20, that's still fewer than you get with the GK110B and the K40.

Cheers
 
PCI Express version of Tesla P100 was announced today...

http://www.anandtech.com/show/10433/nvidia-announces-pci-express-tesla-p100

Will be shipping in Q4 2016

Two versions were announced: 16GB and 12GB (with one HBM2 stack disabled). The PCIe cards must fit in a lower power envelope, at 250W, so they will have lower clocks compared to the mezzanine card version. Nvidia also confirmed that the Piz Daint supercomputer in Switzerland will swap out its 1000 Tesla K20 cards for P100 cards. So I guess that will be the first supercomputer with Pascal cards.
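If I have the stack arithmetic right (my own numbers, assuming 4GB per HBM2 stack and simply scaling the ~720GB/s P100 figure by enabled stacks), the 12GB card should land around 540GB/s:

# 12GB variant: one of the four HBM2 stacks disabled.
# Assumes 4 GB per stack and scales the ~720 GB/s quoted for four stacks.
stacks_full, stacks_enabled = 4, 3
gb_per_stack = 4
full_bandwidth = 720  # GB/s with all four stacks
print(stacks_enabled * gb_per_stack)                  # 12 GB
print(full_bandwidth * stacks_enabled / stacks_full)  # 540.0 GB/s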
 
PCI Express version of Tesla P100 was announced today...

http://www.anandtech.com/show/10433/nvidia-announces-pci-express-tesla-p100

Will be shipping in Q4 2016

Two versions were announced: 16GB and 12GB (with one HBM2 stack disabled). The PCIe cards must fit in a lower power envelope, at 250W, so they will have lower clocks compared to the mezzanine card version. Nvidia also confirmed that the Piz Daint supercomputer in Switzerland will swap out its 1000 Tesla K20 cards for P100 cards. So I guess that will be the first supercomputer with Pascal cards.

Was this expected? I feel like it came out of left field a little.

Nvidia could take a GP100 variant with 12GB of VRAM and slightly fewer compute resources and make a "Titan P" relatively soon (i.e. early 2017), no?

I was expecting a gp102-based titan.
 
Was this expected? I feel like it came out of left field a little.

Nvidia could take a GP100 variant with 12GB of VRAM and slightly fewer compute resources and make a "Titan P" relatively soon (i.e. early 2017), no?

I was expecting a gp102-based titan.
Maybe a bit sooner than expected.
Which suggests they are producing a fair number of P100s, possibly pushed by clients that wanted PCIe rather than the NVLink/mezzanine form factor. I would expect quite a few of the core clients they sell to directly to fit into this category, along with those buying through sales channels in the future.

Cheers

Edit:
Just found the Register/NextPlatform articles that explain it a bit more; yeah, as I thought, it is also directed at the sales channel (not just their core clients who buy from Nvidia), and that is being brought forward.
http://www.nextplatform.com/2016/06/20/nvidia-rounds-pascal-tesla-accelerator-lineup/
http://www.theregister.co.uk/2016/06/20/nvidia_tesla_p100_pcie_card/

Looks like the price of the cheapest model is going to be pretty competitive, and I am surprised; it raises questions about GP102 if those prices are correct, but it could still fit, as it would be less DP-focused.
 
Was this expected? I feel like it came out of left field a little.

Nvidia could take a GP100 variant with 12GB of VRAM and slightly fewer compute resources and make a "Titan P" relatively soon (i.e. early 2017), no?

I was expecting a gp102-based titan.
They likely wouldn't call it a Titan, as it would be focused on existing scientific environments. That's not the sort of card you'd want to game on, assuming it even has display outputs. The timeframe isn't really a surprise: they already had P100, and sticking it on a board was the next logical step and not overly complicated. No NVLink, and a lower TDP to fall within the PCIe specs.
 
PCI Express version of Tesla P100 was announced today...

http://www.anandtech.com/show/10433/nvidia-announces-pci-express-tesla-p100

Will be shipping in Q4 2016

Two versions were announced: 16GB and 12GB (with one HBM2 stack disabled). The PCIe cards must fit in a lower power envelope, at 250W, so they will have lower clocks compared to the mezzanine card version. Nvidia also confirmed that the Piz Daint supercomputer in Switzerland will swap out its 1000 Tesla K20 cards for P100 cards. So I guess that will be the first supercomputer with Pascal cards.

There was something that had puzzled me since the P100 "launch": why only 720GB/s at 1.4Gbps memory speed? Shouldn't HBM2 allow 1000GB/s?

Availability for consumers was set at Q1 2017, or the start of 2017 to be more exact, so starting to ship these in Q4 2016 (i.e. December, if my info is right) doesn't make much of a difference. It's not a sector with big volume.

One example with supercomputer centers: they never deploy all the chips at the same time. There are a lot of test phases before moving on, and it can take a good while to finish, so you don't need to send them 4500 working chips at once.
 
There was something that had puzzled me since the P100 "launch": why only 720GB/s at 1.4Gbps memory speed? Shouldn't HBM2 allow 1000GB/s?
Tesla cards in general are more "conservative" on the memory clock compared to a Titan or a GeForce.
The Tesla K40 has a bandwidth of 288GB/s, compared to the Titan Black at 336GB/s.
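The 720GB/s also just falls straight out of the pin rate, if I have the interface width right (assuming four 1024-bit HBM2 stacks); 1000GB/s would need the full 2.0Gbps the HBM2 spec allows:

# 4096-bit HBM2 interface (four 1024-bit stacks) at different per-pin rates.
bus_bits = 4 * 1024
print(bus_bits * 1.4 / 8)  # 716.8 GB/s -> the quoted ~720 GB/s
print(bus_bits * 2.0 / 8)  # 1024.0 GB/s at HBM2's maximum 2.0 Gbps/pin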
 
There was something that had puzzled me since the P100 "launch": why only 720GB/s at 1.4Gbps memory speed? Shouldn't HBM2 allow 1000GB/s?

Availability for consumers was set at Q1 2017, or the start of 2017 to be more exact, so starting to ship these in Q4 2016 (i.e. December, if my info is right) doesn't make much of a difference. It's not a sector with big volume.

One example with supercomputer centers: they never deploy all the chips at the same time. There are a lot of test phases before moving on, and it can take a good while to finish, so you don't need to send them 4500 working chips at once.
I think the 720GB/s was to ensure they hit 300W with the NVLink P100 model.

Regarding shipping, and why this shows it has been brought forward (or at least that production is going moderately well): NextPlatform and The Register mention this, and the links in my earlier post above provide great information:
The PCI-Express variants of the Tesla P100 cards are beginning production now and are expected to ship in volume in the fourth quarter of this year, with Cray, Dell, Hewlett-Packard Enterprise, and IBM being at the point of the spear for pushing them into the market.
.....
Nvidia could have probably cut back on the number of SMs and therefore CUDA cores and FP64 cores supported in the PCI-Express versions of the Pascal Tesla cards to provide a differentiated product, but it looks like yields are good enough that it doesn’t have to do this.
Even the price-competitive lower model P100 still has 3584 cores, and I am surprised they are going to do it at a price comparable to the K80 (even a bit higher is still relatively cheap, so yields look good given they kept the same CUDA core count at a very competitive price).
One of the links I included earlier has a chart with the various Tesla parts and rough prices.
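For what it's worth, 3584 works out to 56 SMs, so even the cheaper card only has four SMs fused off a full GP100 die (my arithmetic, assuming the 64 FP32 cores per SM and 60-SM full die from Nvidia's GP100 material):

# CUDA cores as a function of enabled SMs (GP100: 64 FP32 cores/SM, 60 SMs on a full die).
cores_per_sm = 64
full_die_sms = 60
enabled_sms = 3584 // cores_per_sm
print(enabled_sms)                 # 56 SMs enabled
print(full_die_sms - enabled_sms)  # 4 SMs disabled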

The core clients who spend multi-millions on a project buy directly from Nvidia, bypassing the sales channel; many tech manufacturers operate in a similar way.
Yeah, supercomputer deployment is rather complex, and not just because of replacing thousands of cards but also the coding changes/staff training/etc; that is a massive part of the schedule for the Swiss National Supercomputing Centre.
More so than swapping the cards over.
I used to love the old days when one could swap out a super/HPC-datacenter mainframe over a weekend (yeah, a lot of work, I agree, plus many months of pre-planning/blueprints/etc, and the excitement and pressure :) ), even more fun when the system had only ever been proven inside IBM's research labs before the client implementation.
Cheers
 
I'd ask, "why cap the chip at 250W for the PCIe version when they happily allowed the GTX 480 to suck down some 400 watts (and sometimes perhaps even north of that)?", but I assume this announcement is for the Tesla product line only for now? No GeForce version (yet)?

Probably for the best... I wouldn't want to feel annoyed that I couldn't afford $3.5k for a 600mm² 16nm FinFET NV GPU. :LOL:
 
They likely wouldn't call it a Titan, as it would be focused on existing scientific environments. That's not the sort of card you'd want to game on, assuming it even has display outputs. The timeframe isn't really a surprise: they already had P100, and sticking it on a board was the next logical step and not overly complicated. No NVLink, and a lower TDP to fall within the PCIe specs.

So you don't see a gp100-based prosumer titan even though gp100 is on pcie now?

I guess I had assumed that a gp100-based titan would functionally be the same as a gp102-based titan but with more dp performance on tap (ignoring memory differences and other pro features). It sounds like it's more complex than that.
 
So you don't see a gp100-based prosumer titan even though gp100 is on pcie now?

I guess I had assumed that a gp100-based titan would functionally be the same as a gp102-based titan but with more dp performance on tap (ignoring memory differences and other pro features). It sounds like it's more complex than that.

That's if the Titan was really aimed at compute performance (at an affordable price), but I think they're now aiming it more at the top-notch price point. First, I'm pretty sure they will limit FP16 so as not to step on Tesla's toes, and while they're at it, I'm pretty sure they will disable FP64 too (if not fully, then with a ratio around 1/8-1/16). Differentiation could come from memory, as has already been the case, and from full SP being enabled.

So GP100 vs GP102, I don't know; I don't really see what could make a GP102 so different from a GP100, especially when we look at those Tesla PCIe cards.

Looking at the smaller one, I won't be surprised if the 1080 Ti comes with 12GB and the Titan with 16GB (or, who knows, maybe a 1070 Ti).
 
So you don't see a gp100-based prosumer titan even though gp100 is on pcie now?

I guess I had assumed that a gp100-based titan would functionally be the same as a gp102-based titan but with more dp performance on tap (ignoring memory differences and other pro features). It sounds like it's more complex than that.
I'm not sure what practical applications that would benefit. There would be a lot of useless hardware, and I'm not sure where the overlap would occur. For rendering, FP16/FP64 are largely pointless, and for computation an IGP with dedicated card(s) is likely better. I'm not sure the chip was actually designed with any of the typical display/video hardware either; at least I haven't seen any mention of supported display outputs, ROPs, etc. for a full GP100.
 
So you don't see a gp100-based prosumer titan even though gp100 is on pcie now?

I guess I had assumed that a gp100-based titan would functionally be the same as a gp102-based titan but with more dp performance on tap (ignoring memory differences and other pro features). It sounds like it's more complex than that.
GP100 is so different from the other Pascals (this obviously assumes GP102 shares GP104 DNA, but why would they build two more or less similarly sized chips [GP100, GP102] if it didn't?) that I, at least, don't see even prosumer versions of GP100 happening.
 
That's if the Titan was really aimed at compute performance (at an affordable price), but I think they're now aiming it more at the top-notch price point. First, I'm pretty sure they will limit FP16 so as not to step on Tesla's toes, and while they're at it, I'm pretty sure they will disable FP64 too (if not fully, then with a ratio around 1/8-1/16). Differentiation could come from memory, as has already been the case, and from full SP being enabled.

So GP100 vs GP102, I don't know; I don't really see what could make a GP102 so different from a GP100, especially when we look at those Tesla PCIe cards.

Looking at the smaller one, I won't be surprised if the 1080 Ti comes with 12GB and the Titan with 16GB (or, who knows, maybe a 1070 Ti).

I'm not sure what practical applications that would benefit. There would be a lot of useless hardware, and I'm not sure where the overlap would occur. For rendering, FP16/FP64 are largely pointless, and for computation an IGP with dedicated card(s) is likely better. I'm not sure the chip was actually designed with any of the typical display/video hardware either; at least I haven't seen any mention of supported display outputs, ROPs, etc. for a full GP100.

GP100 is so different from the other Pascals (this obviously assumes GP102 shares GP104 DNA, but why would they build two more or less similarly sized chips [GP100, GP102] if it didn't?) that I, at least, don't see even prosumer versions of GP100 happening.

I appreciate the feedback.

So it sounds like GP100 really is destined to stay in the Tesla lineup, and a GP102 truly is necessary for Pascal's high-end GeForce lineup.

I'm interested in seeing how this segmentation plays out in future generations.
 
GP100 is so different from the other Pascals (this obviously assumes GP102 shares GP104 DNA, but why would they build two more or less similarly sized chips [GP100, GP102] if it didn't?) that I, at least, don't see even prosumer versions of GP100 happening.
Do we know that GP100 and GP102 have similar sizes? I was expecting GP102 to be about 1.5x GP104, which would put it at around 450 mm^2.
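Taking GP104's ~314mm^2 as the baseline (my assumption), 1.5x actually works out a bit above that 450 figure:

# GP102 die-size guess: 1.5x GP104, assuming GP104 is ~314 mm^2.
gp104_mm2 = 314
print(1.5 * gp104_mm2)  # 471 mm^2, so "about 450" is in the right ballpark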
 