Nvidia Pascal Speculation Thread

Status
Not open for further replies.
Starting with CUDA 6, Unified Memory simplifies memory management by giving you a single pointer to your data, and automatically migrating pages on access to the processor that needs them. On Pascal GPUs, Unified Memory and NVLink will provide the ultimate combination of simplicity and performance. The full-bandwidth access to the CPU’s memory system enabled by NVLink means that NVIDIA’s GPU can access data in the CPU’s memory at the same rate as the CPU can. With the GPU’s superior streaming ability, the GPU will sometimes be able to stream data out of the CPU’s memory system even faster than the CPU.

So many years waiting for that.
Cool.
 
For Pascal only, or could we expect to see it earlier?

Pretty sure the latest roadmap from March 2014 showed Pascal introducing stacked DRAM in 2016.
PascalRoadmap.jpg
 
OK, completely forget this roadmap.. I was asking myself whether we could expect this for 2015 GPUs, but effectively the roadmap is 100% clear. (I was mixing it up with the old roadmap and forgot that plans had changed in between.)
 
It says 3D rather than stacked; is there a difference between the two, or is Nvidia just using a different term for the same thing?

I think it's basically the same thing, but with differences in architecture (i.e., NVLink).

Pascal (the subject of a separate discussion/article) has many interesting features, not the least of which is built-in, or rather I should say, built-on, memory. Pascal will have memory stacked on top of the GPU. That not only makes a tidier package; more importantly, it will give the GPU 4x higher bandwidth (~1 TB/s), 3x larger capacity, and 4x better energy efficiency per bit.

Basically the already high-speed GPU to video memory bandwidth will go up four orders of magnitude. That alone will help speed up things, but Nvidia took it one-step further and added GPU-to-GPU links that allow multiple GPUs to look like one giant GPU.

Today a typical system has one or more GPUs connected to a CPU using PCI Express. Even at the fastest PCIe 3.0 speeds (8 Giga-transfers per second per lane) and with the widest supported links (16 lanes) the bandwidth provided over this link pales in comparison to the bandwidth available between the GPU and its system memory.
NVLink addresses this problem by providing a more energy-efficient, high-bandwidth path between the GPU and the CPU at data rates 5 to 12 times that of the current PCIe Gen3. NVLink will provide between 80 GB/s and 200 GB/s of bandwidth.
http://www.eetimes.com/author.asp?section_id=36&doc_id=1321693&page_number=2
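As a sanity check on the quoted figures, here is a back-of-the-envelope calculation. It assumes PCIe 3.0's 128b/130b line encoding and one bit per transfer per lane (both standard for PCIe Gen3); the 5x and 12x multipliers come from the article above.

```python
# Rough check of the bandwidth figures quoted above.
# PCIe 3.0: 8 giga-transfers/s per lane, 128b/130b encoding, 1 bit per transfer per lane.
lanes = 16
gt_per_s = 8                      # giga-transfers per second per lane
encoding = 128 / 130              # PCIe 3.0 line-coding efficiency
pcie_gbps = lanes * gt_per_s * encoding / 8   # divide by 8 bits to get GB/s
print(f"PCIe 3.0 x16: ~{pcie_gbps:.2f} GB/s per direction")   # ~15.75 GB/s

# The article's "5 to 12 times PCIe Gen3" lines up with its 80-200 GB/s range:
print(f"5x  -> {5 * pcie_gbps:.0f} GB/s")    # ~79 GB/s
print(f"12x -> {12 * pcie_gbps:.0f} GB/s")   # ~189 GB/s
```

So ~15.75 GB/s per direction for a Gen3 x16 link, and the 5x/12x multipliers land close to the quoted 80-200 GB/s NVLink range.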
 
4x is "four orders of magnitude" :!:
I think that article was submitted a bit too quickly.
One order of magnitude, maybe (there's no universal definition of what one is, but at least 4 is bigger than 2, ln 10, or "e"!)
 
Where is the "3x larger capacity" coming from? Are the memory modules more dense somehow?
 
One order of magnitude, maybe (there's no universal definition of what one is, but at least 4 is bigger than 2, ln 10, or "e"!)

In the physical sciences I'm aware of, one order of magnitude is taken to mean a factor of ten. So four orders of magnitude would be 10000. A factor of 4 is just over half an order of magnitude.
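To put numbers on that, under the usual base-10 convention the "orders of magnitude" in a factor is just its base-10 logarithm:

```python
import math

# A factor of 10 is one order of magnitude (base-10 convention),
# so a factor of 4 works out to log10(4) orders of magnitude.
orders = math.log10(4)
print(f"4x = {orders:.2f} orders of magnitude")   # ~0.60

# Four orders of magnitude would instead be a factor of:
print(10 ** 4)   # 10000
```

So the article's "four orders of magnitude" overstates a 4x gain by a factor of 2500.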
 