NVIDIA Maxwell Speculation Thread

IF (a big one) the recent rumor about Maxwell is true, the specs of GM100 are a bit strange: how can downgrading the FP32:FP64 throughput ratio from Fermi's 2:1, to Kepler's 3:1, to Maxwell's 4:1 be considered an optimization of FP64 performance?

The only logic I can see behind this is that Nvidia is going the Kepler route again: adding more floating-point units at the expense of the device's integer performance.

So we can expect an FP32 monster that is pretty weak at integer/bit operations, even weaker than Kepler in terms of peak Giops vs. peak Gflops, etc.

I hope not, since many algorithms depend on integer ops, and Kepler's weak integer throughput has already become the performance bottleneck on GK110 for some of my applications, to the degree that I have to offload some of that work to the CPU.
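To make the concern concrete, here is a toy CUDA kernel of the kind that is bound by integer/bit throughput rather than FP32 throughput; the kernel and its mixing constants are purely illustrative and not taken from any of the applications mentioned above.

Code:
#include <cstdio>
#include <cstdint>

// Toy kernel dominated by 32-bit integer multiplies, shifts and XORs.
// On an architecture whose integer rate is a small fraction of its FP32
// rate, this kind of loop runs far below the advertised peak Gflops.
__global__ void int_mix(uint32_t *data, int n, int rounds)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    uint32_t x = data[i];
    for (int r = 0; r < rounds; ++r) {
        // Murmur-style integer mixing: all ALU work, no floating point.
        x ^= x >> 16;
        x *= 0x85ebca6bu;
        x ^= x >> 13;
        x *= 0xc2b2ae35u;
        x ^= x >> 16;
    }
    data[i] = x;
}

int main()
{
    const int n = 1 << 20;
    uint32_t *d;
    cudaMalloc(&d, n * sizeof(uint32_t));
    cudaMemset(d, 0x5a, n * sizeof(uint32_t));

    int_mix<<<(n + 255) / 256, 256>>>(d, n, 1000);
    cudaDeviceSynchronize();

    cudaFree(d);
    return 0;
}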
 
What is the difference to the "Unified Virtual Addressing" they have had so far? (Or was that a Win7/8 64-bit-only feature?)
 

No, UVA has worked on Linux for quite a while now. UVM lets the GPU page non-page-locked CPU memory. The first implementation, which NVIDIA demonstrated at GTC in the spring and called UVM-lite, works on Kepler as well, but requires that the memory be allocated through a special malloc that uses a kernel extension to handle page faults coming from the GPU. If you allocate memory with this allocator, you don't have to explicitly move data to and from the GPU; it gets paged back and forth as required by your program. The full UVM that Maxwell brings should remove the need for a special allocator, letting the GPU access any memory in the system.
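To illustrate, here is a rough sketch of what the "special allocator" flow looks like compared with explicit staging; cudaMallocManaged is used here as a stand-in name for that allocator, since the actual API had not been published at the time of this thread.

Code:
#include <cstdio>

__global__ void scale(float *v, int n, float s)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] *= s;
}

int main()
{
    const int n = 1 << 20;

    // Today: explicit staging through device memory.
    // float *h = (float *)malloc(n * sizeof(float));
    // float *d; cudaMalloc(&d, n * sizeof(float));
    // cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);
    // scale<<<(n + 255) / 256, 256>>>(d, n, 2.0f);
    // cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);

    // UVM-lite style: one pointer, no explicit copies; pages migrate
    // on demand when the GPU (or CPU) touches them.
    float *v;
    cudaMallocManaged(&v, n * sizeof(float));    // stand-in for the special allocator
    for (int i = 0; i < n; ++i) v[i] = 1.0f;     // touched on the CPU

    scale<<<(n + 255) / 256, 256>>>(v, n, 2.0f); // touched on the GPU
    cudaDeviceSynchronize();

    printf("%f\n", v[0]);                        // paged back to the CPU
    cudaFree(v);
    return 0;
}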
 
Thanks for the detailed explanation.
So, with Maxwell, the copies to the GPU won't be driven by software-emulated page faults any more, but by native HW access?

@MfA
I think if the GPU and CPU share the virtual address space, then it's the process space of your application, just like it's done now in the emulated way RecessionCone has described.
 
There will still be page faults, it's just that the GPU and CPU will share the same page tables with Maxwell. When you write an application using UVM, you won't need to copy data to the GPU explicitly. You just allocate data on the CPU as normal and run the program on the GPU using pointers from your CPU program. When the GPU accesses data that's sitting on the CPU, the GPU page faults and requests the page from the CPU. The CPU then pages the memory out as it normally would with virtual memory, except that instead of paging it to disk, it pages it to the GPU. When the CPU accesses memory that's sitting on the GPU, the CPU page faults and pages it back in from the GPU. For simple applications, you won't have to worry about where your memory is, which will make it easier to program GPUs.

As I said, I'd guess that this particular kernel extension is for UVM-lite, which is very similar, except it can only operate on memory allocated with NVIDIA's allocator: it can't access arbitrary memory in the process. But UVM-lite runs on Kepler and so it's a step towards the full UVM.
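For comparison with the UVM-lite sketch above, the full UVM described here would accept any process pointer; the sketch below is hypothetical, since no such driver had shipped at the time, but it shows the programming model the post describes.

Code:
#include <cstdio>
#include <cstdlib>

__global__ void scale(float *v, int n, float s)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] *= s;
}

int main()
{
    const int n = 1 << 20;

    // Full UVM as described above: plain malloc'd memory, no special
    // allocator. The GPU would page-fault on first touch and the pages
    // would migrate, driven by the shared page tables.
    float *v = (float *)malloc(n * sizeof(float));
    for (int i = 0; i < n; ++i) v[i] = 1.0f;

    scale<<<(n + 255) / 256, 256>>>(v, n, 2.0f);  // hypothetical: pages faulted over on demand
    cudaDeviceSynchronize();

    printf("%f\n", v[0]);                         // faulted back to the CPU on access
    free(v);
    return 0;
}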
 

Has UVM-lite already been released for Kepler GPUs in the Windows drivers? I never heard Nvidia make a big fuss about it. Or is this still to be released?
 

No, it hasn't been released yet. Nvidia demoed it at GTC 2013 running on Linux, but even the Linux version hasn't shipped, and I'd imagine Windows support will be more difficult for them to implement.
 
We got programmable. We got unified. We got general compute. We went from VLIW to "scalar". We have scalable tessellation.

What's the next frontier in graphics architecture? There are some rumblings about Maxwell adding chunkier caches, but that on its own isn't very exciting.
 

Au contraire, more caches, more coherency and more CPUs on die are very exciting.
 

Yeah, more is always good, but on the surface it doesn't seem that thrilling. More bandwidth and more flops are nice too, but that's standard fare.

Unified memory space. GPU can access system RAM and CPU can access video RAM.

This is definitely cool and will help developers write cleaner, faster code but it's still DMA over PCIe for discrete setups. It's probably a bigger win for HPC/Linux anyway. Does DirectCompute even have any kind of DMA support?
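For what it's worth, a discrete GPU can already reach system RAM today through mapped (zero-copy) host memory, with every access going over PCIe; a minimal CUDA sketch of that existing path, for contrast with the unified memory space discussed above:

Code:
#include <cstdio>

__global__ void inc(int *p, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) p[i] += 1;   // each access crosses PCIe: no migration, just bus reads/writes
}

int main()
{
    const int n = 1 << 20;

    cudaSetDeviceFlags(cudaDeviceMapHost);          // allow mapping host memory into the GPU's address space

    int *h, *d;
    cudaHostAlloc(&h, n * sizeof(int), cudaHostAllocMapped);  // pinned, GPU-visible system RAM
    cudaHostGetDevicePointer(&d, h, 0);             // device-side alias of the same buffer

    for (int i = 0; i < n; ++i) h[i] = i;

    inc<<<(n + 255) / 256, 256>>>(d, n);            // GPU touches system RAM directly over the bus
    cudaDeviceSynchronize();

    printf("%d\n", h[0]);                           // CPU sees the update without any cudaMemcpy
    cudaFreeHost(h);
    return 0;
}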
 
More cache (or global RAM, or global store, or whatever you call it) would be great: if even the low-end GPUs have at least, say, 1024K, you can plan to use algorithms in your game or app that need that amount of storage to work.

Little observation: the GeForce 700 series (barring laptop stuff) all have at least 512K of L2. Even the 750 Ti has a 256-bit GK104, which allows it to keep the full cache size, and the small GK208 (seen in a few all-in-ones, perhaps sold on future retail GT 7xx cards) has as much L2 cache.
You can thus run compute/shader stuff that works fine with 512K, but would absolutely tank on a GPU with less L2 because of all the traffic hitting the slow RAM.
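A sketch of how an application could act on that observation at runtime, picking a working-set size based on the L2 size the driver reports (the 512K cutoff simply mirrors the observation above and is otherwise arbitrary):

Code:
#include <cstdio>

int main()
{
    int dev = 0, l2_bytes = 0;
    cudaGetDevice(&dev);
    cudaDeviceGetAttribute(&l2_bytes, cudaDevAttrL2CacheSize, dev);

    // Pick an algorithm variant whose working set fits in L2.
    const int big_tile   = 512 * 1024;   // assumes at least 512K of L2, as observed for the 700 series
    const int small_tile = 128 * 1024;   // fallback for GPUs with smaller caches
    int tile = (l2_bytes >= big_tile) ? big_tile : small_tile;

    printf("L2 = %d bytes, using a %d-byte working set per pass\n", l2_bytes, tile);
    return 0;
}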
 