Nvidia Pascal Speculation Thread

Alexko · Oct 16, 2015

silent_guy said:
I don't think that market warrants a completely different piece of silicon.

It might be the other way around: for small to mid-sized GPUs, laptops might represent the largest volume share by now. Plus, margins should be higher, so if they don't represent the largest volume share, they might still be bigger than their desktop counterparts in revenue or profit.

silent_guy · Oct 17, 2015

Voxilla said:
I tried to explain that in a post above (I reckon you might need to be an electrical engineer to understand that).
Differential signalling uses 2 PCB wires in a pair, which enables higher frequencies and less interference.

I understand differential signaling well enough. And DDR already uses differential signals for the strobes (though the reason for this may be to make it easier to clock on both edges of the clock).

But differential signaling also means that you'll have 2 IOs and wires toggling at crazy rates instead of one, which is going to make power consumption significantly higher.
Furthermore, differential signaling is especially useful for longer cables, or longer traces that cross connectors, where its primary benefit comes from twisted pairs cancelling out common mode noise or reflections. When you use a non-twisted pair, as would be the case for a PCB, and very short traces, differential signaling won't be as effective.

So for now, I'm guessing they're still using single-ended.
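
The common-mode argument above can be illustrated with a toy model. This is only a sketch under idealized assumptions: the noise is taken to couple identically onto both wires of the pair, and the function names and millivolt values are made up for illustration. On a real PCB with short, untwisted traces the coupling is not perfectly matched, which is exactly the caveat raised in the post.

```python
def single_ended_receive(signal_mv, noise_mv):
    """One wire: the receiver sees the signal plus whatever noise coupled in."""
    return signal_mv + noise_mv


def differential_receive(signal_mv, noise_mv):
    """Two wires carry +signal/2 and -signal/2; the same noise lands on both.

    The receiver takes the difference, so the shared (common-mode) noise
    term cancels and only the signal swing remains.
    """
    positive_wire = signal_mv // 2 + noise_mv
    negative_wire = -(signal_mv // 2) + noise_mv
    return positive_wire - negative_wire


if __name__ == "__main__":
    # Hypothetical values: 400 mV swing, 150 mV of coupled noise.
    print(single_ended_receive(400, 150))
    print(differential_receive(400, 150))
```

The price of the cancellation is visible in the model itself: two wires toggle for every bit instead of one.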

Voxilla · Oct 17, 2015

RAMBUS seems to manage low power differential signaling for memory IO

silent_guy · Oct 17, 2015

Voxilla said:
RAMBUS seems to manage low power differential signaling for memory IO

Interesting article. I guess I should stay in the digital domain and leave the analog to others!

Ext3h · Oct 17, 2015

To be honest, 10 Gbit/s, i.e. 10 gigabaud, still sounds reasonable for GDDR5X. Sure, EMI gets worse. And the skin effect for silicon is slowly taking its toll, and the routing of these traces isn't going to be fun. Still, possible.

But they actually don't even need to increase the baud rate any further. Adding more symbols to the encoding by using 2 additional voltage levels would be sufficient as well. That is still perfectly doable, especially over such short distances, and it doesn't require any fundamentally new PCB or chip layouts, only 1-2 additional comparators per line on each end.

Forget about differential signaling. Doubling from 384 (well, actually a few more, since there are also 3 clock lines per channel and some control signals) to 768 signal lines?
Where is that supposed to fit? Two additional layers to the PCB and larger package for the BGA? Mind the costs, that would be madness.
Stuff like that only works with more integrated technologies like HBM where you don't have to worry about routing your traces since you are staying on silicon.
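
The multi-level idea can be put into numbers with a short sketch. It assumes a plain scheme where every symbol is one of N evenly used voltage levels and the receiver needs one comparator per threshold between adjacent levels; the 7 Gbaud example rate is an assumption for illustration, not a figure from any GDDR5X document.

```python
import math


def bits_per_symbol(levels):
    """Each symbol picks one of `levels` voltages, so it carries log2(levels) bits."""
    return math.log2(levels)


def comparators_needed(levels):
    """One comparator per decision threshold between adjacent levels."""
    return levels - 1


def data_rate_gbps(gigabaud, levels):
    """Per-line data rate: symbols per second times bits per symbol."""
    return gigabaud * bits_per_symbol(levels)


if __name__ == "__main__":
    for levels in (2, 4):
        print(levels, comparators_needed(levels), data_rate_gbps(7, levels))
```

Going from 2 to 4 levels doubles the data rate at the same baud rate and adds two comparators per receiver, with the line count unchanged.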

Voxilla · Oct 18, 2015

Ext3h said:
To be honest, 10 Gbit/s, i.e. 10 gigabaud, still sounds reasonable for GDDR5X.

Forget about differential signaling. Doubling from 384 (well, actually a few more, since there are also 3 clock lines per channel and some control signals) to 768 signal lines?
Where is that supposed to fit?

The rumors are talking about 448 GB/s with a 256-bit bus, thus requiring 14 Gb/s per pin, twice the current max.
Differential could deliver that, without routing issues.
Strange there is so little to be found about this GDDR5X, as it should be in products within half a year.
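
The per-pin figure follows directly from the rumored numbers. A minimal sketch of the arithmetic, where the function name is just for illustration:

```python
def required_pin_rate_gbps(bandwidth_gbytes_per_s, bus_width_bits):
    """Per-pin data rate in Gb/s needed to reach a total bandwidth in GB/s.

    Total bits per second = bandwidth * 8, spread evenly over the data pins.
    """
    return bandwidth_gbytes_per_s * 8 / bus_width_bits


if __name__ == "__main__":
    # Rumored configuration from the post: 448 GB/s on a 256-bit bus.
    print(required_pin_rate_gbps(448, 256))
    # Half that bandwidth on the same bus needs half the pin rate.
    print(required_pin_rate_gbps(224, 256))
```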

mczak · Oct 19, 2015

Voxilla said:
Strange there is so little to be found about this GDDR5X, as it should be in products within half a year.

I think that's a bit optimistic. "Expected in products H2/2016" looks like marketing speech for "not before 2017 even in your wildest dreams" to me...
(And FWIW even if you'd believe that official H2/2016 quote, that still would not be "within half a year".)

Voxilla · Oct 19, 2015

I assumed the article was talking about upcoming consumer Pascal based GPUs.
These should arrive in H1/2016. So clearly these will not use GDDR5X.

Maybe the new mid-range/high-end Pascal (faster than the current high-end Maxwell) will launch first,
using 2 stacks of HBM2, for 512 GB/s.

lanek · Oct 19, 2015

Voxilla said:
I assumed the article was talking about upcoming consumer Pascal based GPUs.
These should arrive in H1/2016. So clearly these will not use GDDR5X.

Maybe the new mid-range/high-end Pascal (faster than the current high-end Maxwell) will launch first,
using 2 stacks of HBM2, for 512 GB/s.

Well, I will try to stay realistic and assume that the high-end GP100 will use HBM2. But for the low end to midrange (same for AMD), I really don't think we will see HBM on them.

Blazkowicz · Oct 19, 2015

Ailuros said:
That bloke has "sources"?

While I'm sure that GP100 will have HBM and anything <GP104 not, I'm leaving specifically GP104 with a big question mark for the memory until further notice.

I have a question mark for the existence of GP104 and lower itself. They would rely on a cheaper full GM204 card, cut down GM200 etc., then the first GPU introduced after GP100 would be a "first gen Volta".

tunafish · Oct 19, 2015

Blazkowicz said:
I have a question mark for the existence of GP104 and lower itself. They would rely on a cheaper full GM204 card, cut down GM200 etc., then the first GPU introduced after GP100 would be a "first gen Volta".

Traditionally, the top of midrange cards have been what they wanted to move to new processes ASAP, but if 14/16nm processes are sufficiently more expensive than 28nm, it might make sense to rely on huge 28nm models instead.

Frenetic Pony · Oct 20, 2015

tunafish said:
Traditionally, the top of midrange cards have been what they wanted to move to new processes ASAP, but if 14/16nm processes are sufficiently more expensive than 28nm, it might make sense to rely on huge 28nm models instead.

It isn't. TSMC's 16nm doesn't bring the same price drop per transistor that was traditional until 28nm, but per transistor it's still cheaper, so all cards, for both NVIDIA and AMD, should be on the new process and nothing older. The actual price drop per transistor of course depends on the specific design; each IHV will have to balance transistor density, clock speed, and power usage for each chip, as there are tradeoffs between all of these.

And contrary to some claims in this thread, 16nm does offer "up to" two times the density of 28nm. But that of course comes at the cost of the aforementioned heat, etc. Regardless, the point that all cards next year will end up on a new process doesn't really change.

Voxilla · Oct 20, 2015

lanek said:
Well, I will try to stay realistic and assume that the high-end GP100 will use HBM2. But for the low end to midrange (same for AMD), I really don't think we will see HBM on them.

We know GP100 will have 4 stacks of HBM2 (comes in 4 and 8 GB per stack) for 16/32 GB.
It seems it could make sense for a GPU half its size, e.g. a GP104, to have 2 stacks of HBM2 for 8 GB.
(still good for 512 GB/s)
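
The stack arithmetic can be checked with a few lines. The sketch assumes each HBM2 stack has a 1024-bit interface running at 2 Gb/s per pin, i.e. 256 GB/s per stack; that per-stack figure is an assumption taken from the HBM2 specification, not from the post, and the function name is hypothetical.

```python
# Assumed per-stack HBM2 interface: 1024 data bits at 2 Gb/s per pin.
HBM2_BUS_BITS_PER_STACK = 1024
HBM2_PIN_RATE_GBPS = 2


def hbm2_config(stacks, gbytes_per_stack):
    """Return total capacity (GB) and peak bandwidth (GB/s) for a stack count."""
    per_stack_bandwidth = HBM2_BUS_BITS_PER_STACK * HBM2_PIN_RATE_GBPS // 8
    return {
        "capacity_gb": stacks * gbytes_per_stack,
        "bandwidth_gb_s": stacks * per_stack_bandwidth,
    }


if __name__ == "__main__":
    print(hbm2_config(4, 4))  # GP100 with 4 GB stacks
    print(hbm2_config(4, 8))  # GP100 with 8 GB stacks
    print(hbm2_config(2, 4))  # speculated half-size GPU
```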

Grall · Oct 20, 2015

Frenetic Pony said:
And contrary to some claims in this thread, 16nm does offer "up to" two times the density of 28nm. But that of course comes at the cost of the aforementioned heat, etc.

I think you merely re-phrased what others have already said on this topic. I doubt anyone actually claimed 16nm can't do up to twice the density, but rather that you never actually hit that point in a real-world scenario. "Up to" is a well-known marketing/PR term, which most people know to automatically take with n spoonfuls of salt...

Frenetic Pony · Oct 20, 2015

Grall said:
I think you merely re-phrased what others have already said on this topic. I doubt anyone actually claimed 16nm can't do up to twice the density, but rather that you never actually hit that point in a real-world scenario. "Up to" is a well-known marketing/PR term, which most people know to automatically take with n spoonfuls of salt...

Merely pointing out that there's little question that, of course, all the new cards are on the new node. Heck, half the reason a new node exists at all is because it's cheaper per transistor than the last one. That's slowly becoming less and less true, as each node gets exponentially harder to shrink than the last one, and going below 28nm seems to have hit the proverbial "hockey stick" portion of an exponential graph.

Ailuros · Oct 23, 2015

Alexko said:
That bloke is Gian Maria Forni from Bitsandchips.it, for what it's worth.

I just called the Pope's exorcist just in case....

Blazkowicz said:
I have a question mark for the existence of GP104 and lower itself. They would rely on a cheaper full GM204 card, cut down GM200 etc., then the first GPU introduced after GP100 would be a "first gen Volta".

IMHO the entire "Pascal" core family is in relative terms just a bunch of Maxwell cores on steroids (or for GP100 = Maxwell@steroids + HBM). If you consider how quickly they changed their roadmap, postponed anything Volta probably for 10FF or alike, then it doesn't take a crystal ball to see that there simply wasn't enough time for major architectural changes for anything Pascal.

Not necessarily a bad thing either, since both Maxwell and the competition's Pirate Islands family (or whatever they're called...) are efficient enough. It was just a misfortune that it didn't make sense for either to appear on anything else but 28nm.

For me personally as long as perf/W can rise significantly with each "generation" it might be interesting to analyze matters from a technical perspective, but I'm still satisfied as a consumer.

Deleted member 2197 · Oct 26, 2015

Micron slides leaked ...

AMD and NVIDIA plan to use GDDR5X for mass graphics cards

AMD and NVIDIA are working on GPUs with GDDR5X support. Some media reported earlier this month that graphics cards based on the NVIDIA GP104 processor (Pascal architecture) may use GDDR5X. In addition, sources familiar with AMD's plans say that some GPUs based on the GCN 1.3 architecture will also rely on GDDR5X. Unfortunately, details of the GPUs are not known, but it is logical to expect them to appear in the second half of 2016.
....
Apparently, GDDR5X was developed by Micron at its graphics memory development center in Munich, Germany (Munich Graphics Design Center), which the company obtained through its purchase of Elpida Memory, which in turn had acquired it from Qimonda, bankrupt for many years. Currently, GDDR5X is not an industry standard in the full sense of the word, but the company is working with the JEDEC committee on formal standardization of GDDR5X, which would allow other companies to produce compatible chips.

http://www.unlockpwd.com/amd-and-nvidia-plan-to-use-gddr5x-for-mass-graphics-cards/

GDDR5X is your standard GDDR5 memory; however, as opposed to delivering 32 bytes per access to the memory cells, this is doubled to 64 bytes per access. In theory, that could double graphics card memory bandwidth. Early indications according to the presentation show the memory capable of up to 10 to 12 Gbps, and in the future 16 Gbps. So your high-end graphics cards these days hover at, say, 400 GB/s. With GDDR5X that could increase to 800~1000 GB/s, which is a very significant improvement and actually competitive with HBM.

http://www.guru3d.com/news-story/graphics-card-memory-gddr5x-to-make-an-appearance.html
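
To see what bus widths the quoted pin rates would have to be paired with, the peak bandwidth can be tabulated. The 256, 384 and 512-bit widths below are assumptions chosen for illustration; the article does not say which widths its 800~1000 GB/s estimate refers to.

```python
def peak_bandwidth_gbytes(pin_rate_gbps, bus_width_bits):
    """Peak bandwidth in GB/s: per-pin rate times number of data pins, over 8."""
    return pin_rate_gbps * bus_width_bits / 8


if __name__ == "__main__":
    pin_rates = (10, 12, 16)       # Gb/s, from the quoted article
    bus_widths = (256, 384, 512)   # bits, assumed for illustration
    for width in bus_widths:
        row = [peak_bandwidth_gbytes(rate, width) for rate in pin_rates]
        print(width, row)
```

Under these assumptions, only the 16 Gbps rate on a 384-bit or wider bus gets close to the upper range the article mentions.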

Deleted member 2197 · Nov 3, 2015

NVIDIA GeForce driver 358.66 adds Vulkan, Pascal and Volta support

Apparently, driver 358.66 for Windows 10 adds support for upcoming software and hardware. StefanG3D at LaptopVideo2Go found the following in the driver.

Driver branch: MS358_05-44
OpenCL runtime exposes new compute capabilities

Pascal
-D__CUDA_ARCH__=600
-D__CUDA_ARCH__=610
-D__CUDA_ARCH__=620

Volta
-D__CUDA_ARCH__=700

OpenGL runtime contains the following extensions and functions:

VK_EXT_KHR_device_swapchain
VK_EXT_KHR_swapchain

vkCreateInstance
vkEnumerateInstanceExtensionProperties
vkGetDeviceProcAddr
vkGetInstanceProcAddr
vkGetProcAddressNV

Extensions are not recognized by GPUCapsViewer yet.
GLEW based apps fail to launch.

This driver comes with a new runtime, "nv-vk32.dll", which exposes the following functions:

vkAcquireNextImageKHR
vkCreateDevice
vkCreateSwapchainKHR
vkDestroySwapchainKHR
vkEnumerateDeviceExtensionProperties
vkGetDeviceProcAddr
vkGetPhysicalDeviceSurfaceSupportKHR
vkGetSurfaceFormatsKHR
vkGetSurfacePresentModesKHR
vkGetSurfacePropertiesKHR
vkGetSwapchainImagesKHR
vkQueuePresentKHR
vkCreateInstance
vkEnumerateInstanceExtensionProperties
vkGetPhysicalDeviceMemoryProperties
vkGetInstanceProcAddr
vkEnumeratePhysicalDevices
vkCreateImage
vkDestroyImage
vkAllocMemory
vkFreeMemory
vkBindImageMemory
vkGetImageMemoryRequirements
vkQueuePresentNV

http://forums.laptopvideo2go.com/to...r-35866-adds-vulkan-pascal-and-volta-support/

fellix · Nov 4, 2015

Pascal
-D__CUDA_ARCH__=600
-D__CUDA_ARCH__=610
-D__CUDA_ARCH__=620

Looks like NV is prepping three architecture branches for Pascal -- SoC, consumer (mobile/desktop) and HPC?
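
The three-digit values encode a compute capability as major * 100 + minor * 10, so the defines can be decoded mechanically. A small sketch, with a helper name that is purely illustrative; which product segment each value maps to remains speculation.

```python
def compute_capability(cuda_arch):
    """Split a __CUDA_ARCH__ value such as 610 into (major, minor)."""
    major, remainder = divmod(cuda_arch, 100)
    minor = remainder // 10
    return major, minor


if __name__ == "__main__":
    # Values found in the driver, as listed in the post above.
    for value in (600, 610, 620, 700):
        major, minor = compute_capability(value)
        print(f"__CUDA_ARCH__={value} -> sm_{major}{minor}")
```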

xpea · Nov 4, 2015

Looks like I'm out of date. What is Vulkan?
And by now we should be close to the Pascal announcement, no?
