NVIDIA Maxwell Speculation Thread

Well, we will know in 18 days when the 750/750 Ti is released, as it will be the first Maxwell part.

I seriously doubt that these two cards will have an on-die ARM processor. That said, I do believe that you are overstating the die size and cost impact of adding Denver CPU cores. NVIDIA midrange GPUs are already close to 200 mm^2, upper-midrange GPUs are close to 300 mm^2, and high-end GPUs are close to 550 mm^2. And considering that Tegra will be a building block for all future NVIDIA GPU architectures, it would not make sense to spend time and effort on selectively removing the built-in CPU cores, even if these cores don't get used for certain [x86] applications and scenarios.
 
Reread my post; I was asking about the rumor of the Denver ARM core being on DISCRETE add-in cards, not the HPC/server or Tegra parts.

I've understood your post perfectly well; now it's time YOU actually read and understand the text in the links provided.

Well, we will know in 18 days when the 750/750 Ti is released, as it will be the first Maxwell part.

If it happens, it is for the top dog only; when someone who is an NV employee says TESLA, take a long hard guess what he might mean. Neither Tegra nor any other Maxwell core outside the top dog is related to the latter.
 
A1xLLcqAgt0qc2RyMz0y said:
And I think you are missing the point of the cost disadvantage that Nvidia would be put under with a very large die on discrete laptop and desktop Maxwell GPUs that have no need for an on-die ARM processor.
Indeed, kind of like how I "missed" the disadvantages of Nvidia putting an SMX in Tegra...
 
Isn't that already common anyway with things like high DPFP rates, ECC, etc. on high end compute/server focused parts?
Within the same ASIC by enabling/disabling certain features, yes.
With separate ASICs? Not to my knowledge.

The Denver core is large, so it will take up valuable die area; unless it makes sense, it will not be on the silicon.

It makes sense for the HPC/Server and Tegra market but not for the discrete add-in market.

The HPC/server and Tegra parts will only have the Denver ARM as a CPU, whereas the discrete market will have other CPUs (Intel/AMD/IBM) as the main CPU, so the Denver core would be wasted on those systems.

That depends on how many Denver cores they plan to implement. I sort of doubt they plan to put 16 Denvers on the high-end Maxwell. 2/4/8 (low to high end) would seem more likely IMO.

While I agree this is obviously more geared towards the HPC and ultramobile markets, Tesla and Tegra, there are a lot of variables that Nvidia will consider (and most likely already has).

HPC still might require a CPU other than Denver. Titan nodes had quite a bit of host DRAM (about 5x that of the GPU); would 12-24 GB on the GPU be enough, if that is even possible?
A 16-core Opteron with 32 GB of DDR3 feeds a single Titan GPU.
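
For a rough sense of scale, here's a back-of-the-envelope check of my own (host-only CUDA C++, compiles with nvcc; the 6 GB figure is the K20X's public spec, and the 12-24 GB capacities are just the ones floated above, not anything confirmed):

#include <cstdio>

int main()
{
    // Each Titan node pairs 32 GB of host DDR3 with a 6 GB K20X, roughly 5:1.
    const double host_gb = 32.0;
    const double gpu_gb  = 6.0;
    printf("host:GPU memory ratio on Titan ~%.1f : 1\n", host_gb / gpu_gb);
    printf("total memory per Titan node     %.0f GB\n", host_gb + gpu_gb);

    // How would a self-hosted board with 12 or 24 GB compare to that total?
    const double onboard[] = { 12.0, 24.0 };
    for (int i = 0; i < 2; ++i)
        printf("%2.0f GB on-board = %.0f%% of a Titan node's total\n",
               onboard[i], 100.0 * onboard[i] / (host_gb + gpu_gb));
    return 0;
}

Even the optimistic 24 GB case lands well under what a Titan node has today, which is why the capacity question seems fair.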
 
I still have some trouble understanding how the software will communicate with the ARM cores... the x86 instruction set and memory model are completely different from ARM's.
 
I still have some trouble understanding how the software will communicate with the ARM cores... the x86 instruction set and memory model are completely different from ARM's.

Not that I am trying to answer that, but according to rumours Denver is not exactly an ARM core either, despite using some ARM IP. Speculation says it is actually very similar to a hypothetical x86 emulator they had planned before Intel axed it.
 
If you did, then why did your response not once show any links or information regarding whether or not the discrete Maxwell GPU add-in cards would have an on-board ARM processor?

Nothing that you couldn't read out of it. There will be CPU cores integrated into GPUs, whether they start with the coming generation or a future one, and yes, it's due to necessity and not some decorative twist an architect might have had.

It's not stated as clearly as in the marketing crap some love to parrot consistently? Never mind that that trash isn't exactly material for any research either.
 
Nothing that you couldn't read out of it. There will be CPU cores integrated into GPUs, whether they start with the coming generation or a future one, and yes, it's due to necessity and not some decorative twist an architect might have had.

It's not stated as clearly as in the marketing crap some love to parrot consistently? Never mind that that trash isn't exactly material for any research either.

What do you expect would be the primary benefit of integrating CPU cores with a GPU on an AIB type product? I'm sorry to ask something so rudimentary, and to possibly ask you to repeat yourself--I probably have read all 28 pages of this thread (not recently enough for it to matter), and I haven't seen a really concrete theory emerge. Some vague ones about texture compression (hmm), x86 emulator / driver replacement (hunh), some kind of flow control, score-boarding arrangement (mehnh)?
 
"You know, we were going to call it CC 4.0, but Maxwell's compute capabilities are just so much better, the performance improvements it enables with complex algorithms are just so amazing that it didn't feel right to call it that. So we went with CC 5.0 instead, which gives you a much better idea of what Maxwell can really do."

—Jen-Hsun Huang.

Seriously, I hate these shits.

Any speculation on CC 5?

My biggest disappointment with Dynamic Parallelism was that spawned child threads couldn't read the parent's shared memory allocations; let's hope they fixed that somehow.
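
For anyone who hasn't bumped into that restriction, here's a minimal sketch of my own (not an NVIDIA sample; kernel and variable names are purely illustrative) of what CC 3.5 Dynamic Parallelism forces today: a child grid can't see the parent's __shared__ memory, so the parent has to stage the data through global memory before launching the child. Builds with nvcc -arch=sm_35 -rdc=true -lcudadevrt.

#include <cstdio>

__global__ void child(const float* staged, int n)
{
    // The child grid can only read global memory, never the parent's shared memory.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        printf("child sees %f\n", staged[i]);
}

__global__ void parent(float* stage, int n)
{
    __shared__ float tile[256];          // visible to this block only
    int i = threadIdx.x;
    if (i < n)
        tile[i] = i * 2.0f;
    __syncthreads();

    // Passing &tile[0] to the child would be invalid, so copy it out first.
    if (i < n)
        stage[i] = tile[i];
    __syncthreads();

    if (i == 0)
        child<<<1, n>>>(stage, n);       // device-side launch (Dynamic Parallelism)
}

int main()
{
    const int n = 8;
    float* stage = 0;
    cudaMalloc(&stage, n * sizeof(float));
    parent<<<1, 256>>>(stage, n);
    cudaDeviceSynchronize();
    cudaFree(stage);
    return 0;
}

If a later compute capability let the child read the parent's shared memory directly, the whole staging copy in the middle would simply go away.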
 
What do you expect would be the primary benefit of integrating CPU cores with a GPU on an AIB type product? I'm sorry to ask something so rudimentary, and to possibly ask you to repeat yourself--I probably have read all 28 pages of this thread (not recently enough for it to matter), and I haven't seen a really concrete theory emerge. Some vague ones about texture compression (hmm), x86 emulator / driver replacement (hunh), some kind of flow control, score-boarding arrangement (mehnh)?

Intel didn't grant an x86 license, and that also covers x86 emulation, i.e. they're not allowed to emulate it either, AFAIK.

Speaking of Intel, which is NVIDIA's strongest competitor in the HPC market right now: I'm sure many here, if not all, have seen the upcoming Knights Landing preliminary data. Does that one ring a bell? http://regmedia.co.uk/2013/06/13/intel_xeon_phi_knights_landing.jpg

Tesla SKUs are not the typical AIB type of product, and that's the only place where I'd expect to see CPU cores integrated, at least at first.

CarstenS' speculative theory that the CPU cores could theoretically be used as SFUs inside the GPU sounds interesting too.

Other than that, as I said, Bill Dally's Project Echelon, which is some sort of prediction for a future exascale machine, would be something like this:

http://insidehpc.com/2009/11/10/nvidias-bill-dally-unveils-exascale-plans/

[Slide: NVIDIA exascale machine block diagram]


There are quite a few links/presentations about it on the net, but here's a quick summary of related pictures/slides: https://www.google.gr/search?q=NVID...iJKWa1AWExIHIBg&ved=0CCQQsAQ&biw=1280&bih=804

If those 13 TFLOPS DP sound like a lot: I'd expect the Maxwell top dog (in 2015?) to exceed the 3 TFLOPS mark by a bit, meaning the above slide might be a bit optimistic but not too far from future reality.
 
Einstein and everything else connected to Project Echelon is a research project, not a product roadmap.

OK, I thought Volta would still be a modified Maxwell architecture with stacked DRAM, and Einstein would be the clean-slate design (Tick/Tock-type refreshes).

Nevermind.
 
Volta should be a similar jump to what Kepler was over Fermi, so Tick/Tock should work there.

We haven't heard codenames for the chips after Volta.
They did reveal the Kepler and Maxwell names during or near the release of Fermi, so we might hear new codenames soon.
 
TSMC currently has 10nm FinFET projected for 2016, which sounds like a fine candidate for Volta. For the generation after that they either started development recently or will start soon. Considering we are still waiting for the first Maxwell benjamin to appear and it'll probably take at least another year for the entire product family to unfold, anything beyond Maxwell sounds extremely far-fetched.

As for any "jump" from architecture to architecture: looking at that roadmap, and since it concentrates exclusively on DP GFLOPS/W (on purpose), I can see an almost uniform >2x increase from generation to generation, from Fermi all the way up to Volta.
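
Just to make the compounding explicit, here's a trivial host-only CUDA C++ loop of my own; the values are normalized to Fermi = 1 and are purely illustrative, not read off the slide:

#include <cstdio>

int main()
{
    const char* gens[] = { "Fermi", "Kepler", "Maxwell", "Volta" };
    double perf_per_watt = 1.0;          // normalized DP GFLOPS/W
    for (int i = 0; i < 4; ++i) {
        printf("%-8s ~%.0fx\n", gens[i], perf_per_watt);
        perf_per_watt *= 2.0;            // the ">2x per generation" observation
    }
    return 0;
}

Even the conservative "exactly 2x each step" reading already lands Volta at roughly 8x Fermi's DP efficiency.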
 