NVIDIA Maxwell Speculation Thread

I stand to be corrected, but I wouldn't estimate that an area-optimized Denver core on TSMC's 20SoC process would consume more than, say, 5mm2. If NV engineering has deemed having Denver cores onboard a necessity for the top dog and HPC, I don't see why even "just" 4 cores wouldn't be enough, and that would be about half the die area they'd need for the FP64 units to exceed 3 TFLOPs DP.
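For reference, a rough back-of-the-envelope on that 3 TFLOPs DP figure (the ~1 GHz clock below is purely an illustrative assumption, not a known Maxwell number):

```latex
% DP throughput of FMA-capable FP64 units:
%   FLOPS_DP = N_FP64 * 2 FLOPs per FMA * f_clock
% so to exceed 3 TFLOPs DP at an assumed ~1 GHz:
\[
  \mathrm{FLOPS_{DP}} = N_{FP64} \times 2 \times f_{clock}
  \quad\Rightarrow\quad
  N_{FP64} > \frac{3\times10^{12}}{2\times10^{9}\ \mathrm{Hz}} = 1500\ \text{units}
\]
```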
 
GPU Technology Conference (GTC) 2014

Will the GM110 be announced at this conference like the GK110 was announced at GTC 2012?

http://www.gputechconf.com/page/home.html

We first saw Tesla K20 at NVIDIA’s 2012 GPU Technology Conference, where NVIDIA first announced the K20 along with the already shipping K10. At the time NVIDIA was still bringing up the GPU behind K20 – GK110 – with the early announcement at GTC offering an early look at the functionality it would offer in order to prime the pump for developers. At the time we knew quite a bit about its functionality, but not its pricing, configuration, or performance.

http://www.anandtech.com/show/6446/nvidia-launches-tesla-k20-k20x-gk110-arrives-at-last
 
Unlikely; they only just launched the Tesla K40 late last year and won't replace it immediately. Not to mention a 500mm2 GPU on TSMC's 20nm process isn't gonna happen anytime soon; even GM107 didn't get 20nm.
 
Unlikely; they only just launched the Tesla K40 late last year and won't replace it immediately. Not to mention a 500mm2 GPU on TSMC's 20nm process isn't gonna happen anytime soon; even GM107 didn't get 20nm.

I don't even recall when they released the first GK110 whitepaper in 2012, but it was definitely before early September, when it went into production, and after the GK104 launch in April 2012.

If they were hypothetically to go into mass production with the Maxwell top dog in Q3 or Q4 this year on 20SoC, manufacturing costs and/or yields would be a major headache if they didn't address the HPC market first. Any reason why the successful Kepler recipe should change?

In any case, if their usual sentiments haven't changed, the likeliest timeframe for a Maxwell top-dog whitepaper introduction is after the Maxwell performance chip has been released, IMHO at least.
 
GK110 whitepaper was released at GTC 2012 in May 2012.

http://www.anandtech.com/show/5840/...gk104-based-tesla-k10-gk110-based-tesla-k20/2

http://www.geeks3d.com/20120517/nvi...r-2880-cuda-cores-and-compute-capability-3-5/

I guess it is possible they'll show a Big Maxwell GM110 at GTC 2014, but it won't arrive for months, and I don't see any sessions about new GPUs at GTC 2014; at GTC 2012 they did list a session about Big Kepler.

http://www.geeks3d.com/20120423/nvidia-gk110-the-true-7-billion-transistor-kepler-gpu/

In this talk, individuals from the GPU architecture and CUDA software groups will dive into the features of the compute architecture for Kepler – NVIDIA’s new 7-billion transistor GPU.
 
May was two months after its final tape out, meaning we still have time ahead ;)
 
Unlikely; they only just launched the Tesla K40 late last year and won't replace it immediately. Not to mention a 500mm2 GPU on TSMC's 20nm process isn't gonna happen anytime soon; even GM107 didn't get 20nm.
I stated announced, not released.

And the first GK110 products were the Tesla K20 and K20X that were announced at GTC 2012 and released November 2012.

If that timing holds true for the GM110 then an announcement could happen at GTC 2014, white paper a little later and product availability late this year.

Note the K40 replaced the K20X which itself was a year old. If Nvidia is trying to refresh the top Tesla on a yearly basis then this November we should expect the GM110.
 
I stated announced, not released.

And the first GK110 products were the Tesla K20 and K20X that were announced at GTC 2012 and released November 2012.

If that timing holds true for the GM110 then an announcement could happen at GTC 2014, white paper a little later and product availability late this year.

Note the K40 replaced the K20X which itself was a year old. If Nvidia is trying to refresh the top Tesla on a yearly basis then this November we should expect the GM110.

Because GK110 was already in mass production and they had shipped 18.7k of them in Sept 2012.

We should already have, or at least have heard of, 20nm products for your timeline to make sense.
 
Has there been any mention of the transistor count for GM107? Bus width and memory controller count are the same as GK107 while CUDA core count is up 2.5x, but the die size is estimated/rumored to be ~156mm^2, an increase of only 32% on the same process node. Interested to see if Nvidia will surpass AMD in having the most transistor-dense chip.
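For the 32% figure, assuming the commonly cited ~118 mm^2 for GK107 (my number, not stated above):

```latex
% die-area increase from GK107 (assumed ~118 mm^2) to GM107 (~156 mm^2)
\[
  \frac{156\ \mathrm{mm^2}}{118\ \mathrm{mm^2}} \approx 1.32
  \quad\Rightarrow\quad \text{roughly a } 32\%\ \text{increase}
\]
```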
 
Your safest bet is to wait for the final tape-out of a chip and then count on, in a best-case scenario, another 6 months after that for mass production to start. At which stage are we exactly?
 
Note the K40 replaced the K20X which itself was a year old. If Nvidia is trying to refresh the top Tesla on a yearly basis then this November we should expect the GM110.
Because GK110 was already in mass production and they had shipped 18.7k of them in Sept 2012.

We should already have, or at least have heard of, 20nm products for your timeline to make sense.
If big Maxwell in late 2014 is too early, then is there a possibility that we will see another GK110/GK180 Tesla refresh later this year? (Something like a "K40X" with 800-900 MHz core and > 6 Gbps memory.)
 
I really don't have high hopes for the Denver cores; the only good reason for Denver cores + GPU is that you can build a cluster with only GPUs, without the need for any x86.

But I don't believe any future supercomputer will be built solely on Maxwells. The most obvious reason is that a significant amount of the software used in HPC is based on x86; it takes huge investment and effort to port some of it to a system with a new ISA, and I don't believe Nvidia has the muscle to do that.

For desktop applications the reasoning is the same: the x86 market is so huge that any shift to a different ISA is costly to the degree that no one would ever consider it, let alone a relatively small player like Nvidia.

So at least for the Maxwell generation, it is safe to say the vast majority of them will remain in a PCIe slot.

And considering the serious PCIe bandwidth constraints and very capable x86 host CPUs, I seriously doubt the point of having a local CPU on the GPU.

Most of the algorithms I've written for hybrid computing systems divide the workload into two parts: one part is highly parallel and gets sent to the GPU, the other part is not easy to parallelize, so I let the CPU handle it.

With async data transfers and work, this mechanism works very well, utilizes the system's computing resources better, and, most importantly, lets each computing device handle only the tasks it is GOOD at. I seriously doubt a Denver ARM core could outperform the latest Xeon core on serial tasks, not to mention the added die size/silicon that would otherwise be more useful for the GPU's main job: highly parallelized computation.
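A minimal sketch of that split, assuming a plain CUDA setup with one stream; scale_kernel and serial_work are made-up placeholders, not anything from a real application:

```cuda
// The parallel half goes to the GPU via an async copy + kernel launch on a
// stream, while the host CPU runs the hard-to-parallelize half at the same time.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale_kernel(float* data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;                        // the highly parallel part
}

static void serial_work(double* acc, int steps)
{
    for (int s = 1; s <= steps; ++s)              // stand-in serial recurrence
        *acc = 0.5 * (*acc + (double)s / *acc);
}

int main()
{
    const int n = 1 << 20;
    float *h_data, *d_data;
    cudaStream_t stream;

    cudaMallocHost((void**)&h_data, n * sizeof(float)); // pinned memory so copies are truly async
    cudaMalloc((void**)&d_data, n * sizeof(float));
    cudaStreamCreate(&stream);

    for (int i = 0; i < n; ++i) h_data[i] = 1.0f;

    // enqueue the GPU side without blocking...
    cudaMemcpyAsync(d_data, h_data, n * sizeof(float), cudaMemcpyHostToDevice, stream);
    scale_kernel<<<(n + 255) / 256, 256, 0, stream>>>(d_data, n, 2.0f);
    cudaMemcpyAsync(h_data, d_data, n * sizeof(float), cudaMemcpyDeviceToHost, stream);

    // ...and let the CPU chew on the serial part while the GPU is busy
    double acc = 1.0;
    serial_work(&acc, 1000000);

    cudaStreamSynchronize(stream);                // wait for the GPU results
    printf("GPU result: %f, CPU result: %f\n", h_data[0], acc);

    cudaFreeHost(h_data);
    cudaFree(d_data);
    cudaStreamDestroy(stream);
    return 0;
}
```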
 
Your safest bet is to wait for the final tape-out of a chip and then count on, in a best-case scenario, another 6 months after that for mass production to start. At which stage are we exactly?

Well, if the cards are getting reviewed next week, I guess we're at the stage where we'll find out.
 
I really don't have high hopes for the Denver cores; the only good reason for Denver cores + GPU is that you can build a cluster with only GPUs, without the need for any x86.

The purpose of "Project Denver" is to extend the reach of ARM. Denver should bring very high CPU performance and very high energy efficiency, relatively speaking, compared to most other ARM processors. In order for Android to grow and evolve beyond simply smartphones and small tablets, this type of processor is required. Obviously there will be many [x86] applications where an Intel CPU + NVIDIA GPU will be required for very high CPU and very high GPU performance, respectively, (hence the reason Intel and NVIDIA will still be collaborating with each other in the future), but the Denver cores still have their place in any ecosystem where ARM is a viable option.
 
So at least for the Maxwell generation, it is safe to say the vast majority of them will remain in a PCIe slot.

And considering the serious PCIe bandwidth constraints and very capable x86 host CPUs, I seriously doubt the point of having a local CPU on the GPU.

But that's just the point: having a CPU local to the GPU lets you avoid having to transfer things over PCIe, since you just run the serial portion there.
 
But that's just the point: having a CPU local to the GPU lets you avoid having to transfer things over PCIe, since you just run the serial portion there.

It's not only that; I think many missed the reference that Intel's upcoming Knights Landing will also be sold as a standalone CPU.
 
Probably the GPU will run its own small OS, like Xeon Phi and VideoCore do, if Denver is incorporated.
And Denver would handle scheduling and management of the GPU, reducing the driver load on the main system.
 
Actually, CPU+GPU on the same die brings great value and new opportunities to both HPC and gaming applications.

In general computing, the lower latency and higher bandwidth mean that small parallel tasks that would previously have been deemed unfeasible due to the long, thin PCIe "pipe" can now also be offloaded to the GPU.

Furthermore, the opportunity arises to run the entire application on the heterogeneous system/APU, cutting PCIe out of the picture entirely.

In gaming, similar gains can be seen. Today data travels to the GPU in basically one direction, HostToDevice, and due to latency requirements it's hard to, for example, run a compute-intensive physics simulation on the GPU and then let it influence AI decisions (e.g. player positions); there simply isn't time. With such tight coupling it will be possible. To my understanding, the new FLEX unified physics solver will take advantage of this.
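A rough sketch of that tight-coupling idea, using CUDA managed memory as a stand-in for a shared CPU+GPU address space (physics_step and the Body struct are invented for illustration; on a true single-die part the implicit copy behind cudaMallocManaged would disappear altogether):

```cuda
// GPU "physics" results are read directly by CPU-side "AI" code with no
// explicit PCIe copy step appearing anywhere in the program.
#include <cuda_runtime.h>
#include <cstdio>

struct Body { float x, y, z; };

__global__ void physics_step(Body* bodies, int n, float dt)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        bodies[i].x += dt * (i % 7);              // trivially fake "physics" update
}

int main()
{
    const int n = 1024;
    Body* bodies;
    cudaMallocManaged((void**)&bodies, n * sizeof(Body)); // one pointer, visible to CPU and GPU

    for (int i = 0; i < n; ++i) {
        bodies[i].x = (float)i;
        bodies[i].y = 0.f;
        bodies[i].z = 0.f;
    }

    physics_step<<<(n + 255) / 256, 256>>>(bodies, n, 0.016f);
    cudaDeviceSynchronize();                      // on a single-die part this sync is the whole handoff

    // CPU-side "AI" immediately consumes the result; no cudaMemcpy anywhere
    float furthest = bodies[0].x;
    for (int i = 1; i < n; ++i)
        if (bodies[i].x > furthest) furthest = bodies[i].x;
    printf("furthest body x = %f\n", furthest);

    cudaFree(bodies);
    return 0;
}
```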

Furthermore, this extra processor can potentially offload the traditional system CPU in a manner similar to AMD's Mantle.

The gains are twofold, hence it will really make sense to have a smaller ARM core even in the lower-end GPUs.

Here's a question: can we expect to see these processors directly in a PC socket in the future (similar to KNL)?
 