NVIDIA Maxwell Speculation Thread

Well, it seems to keep the number of ROPs and memory bus width, while greatly increasing the number of TMUs from 32 to 80, if rumours are to be believed.
Where did you see the 80 TMUs? There's imho absolutely no way the TMU/ALU ratio is going to be increased. I could definitely see a decrease though (just as gk208/gk20a already do, only half the TMUs per SMX).
 
3GB of VRAM?

192-bit?

No, all the rumors so far point towards a 128-bit [quad-channel] memory bus interface. And those GPU-Z screenshots seem very incomplete (no mention of TMU count, no mention of CUDA core count, etc).
 
156mm^2 for GM107 vs. Bonaire's 160mm^2 is a very minimal difference. GM107 is looking to be ~10-12% faster than Bonaire's 7790. The big thing left now will be power consumption. Current rumors have it that a "stock" GTX 750 Ti will make do with PCIe slot power alone, and that 750 Ti cards will include a 6-pin adapter for overclocking. If that's the case, we're looking at something around GK107 power consumption (probably slightly more on average under load), which would make GM107 around 20-25% more efficient than AMD's current most-efficient chip on a similar process.
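To put rough numbers on that efficiency claim, here's a quick Python sketch. The 7790's 85W TDP is published; the GM107 power figure is purely a guess on my part:

```python
# Back-of-envelope check on the 20-25% perf/W claim. perf_ratio is the
# rumored figure; gm107_power_guess is an assumption, not a measurement.
perf_ratio = 1.11          # GM107 rumored ~10-12% faster than the HD 7790
hd7790_power = 85.0        # Bonaire (HD 7790) TDP, watts
gm107_power_guess = 75.0   # "slightly more than GK107 under load" guess

perf_per_watt_gain = perf_ratio * hd7790_power / gm107_power_guess - 1
print(f"~{perf_per_watt_gain:.0%} better perf/W")  # ~26% with these guesses
```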

Pretty damn impressive for being on the same node. TSMC has already stated that moving to 20nm from 28nm will result in either a 30% power reduction or a 30% performance improvement at the same power. I'm sure Nvidia will opt to keep TDP the same when they shrink GM107, so it's not hard to extrapolate what this chip should do at 20nm. Right now, though, GM107 is shaping up to be a staggering 60-75% faster than GK107 on what is basically the same node, at approximately the same power consumption, for a 32% increase in die size. Pretty amazing IMO. Nvidia has come a long way in efficiency, and then some, since Fermi. If Maxwell scales up in performance relative to die size like Kepler did, Nvidia could build a faster-than-GK110 Maxwell chip on 28nm at around 400mm^2.
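To make that extrapolation concrete, a minimal sketch under those assumptions (GK107's 118mm^2 die is known; the rest are rumored figures):

```python
# Rough extrapolation from the rumored numbers above; all speculation.
gk107_die = 118.0     # GK107 die size, mm^2 (known)
gm107_die = 156.0     # rumored GM107 die size, mm^2
perf_vs_gk107 = 1.70  # midpoint of the rumored 60-75% uplift

print(f"Die growth: {gm107_die / gk107_die - 1:.0%}")        # ~32%

# TSMC's 20nm option: ~30% more performance at the same power.
print(f"20nm shrink vs GK107: {perf_vs_gk107 * 1.30:.2f}x")  # ~2.2x

# If performance scales roughly with die area, as it did for Kepler,
# a ~400mm^2 Maxwell part on 28nm lands at about:
print(f"400mm^2 Maxwell vs GM107: {400 / gm107_die:.1f}x")   # ~2.6x
```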

I wonder if Nvidia is scrapping GK106. That would be odd, as there is a pretty big gap between the GTX 760 and GM107. I guess it would simplify orders and binning, though.
 
Where did you see the 80 TMUs? There's imho absolutely no way the TMU/ALU ratio is going to be increased. I could definitely see a decrease though (just as gk208/gk20a already do, only half the TMUs per SMX).

There was a chart posted a couple of pages back that referenced 80 TMUs for this purported [5 SMX] GTX 750 Ti. That would be consistent with the TMU/SMX ratio in various Kepler GPUs.

That said, it does appear that halving the TMU count per SMX is not unprecedented (see GK208 and Kepler.M). That would also help to explain the modest die size. In fact, if we assume that the GTX 750 Ti has 5 SMXs with 192 CUDA cores per SMX, 8 TMUs per SMX, and a 128-bit memory interface with 4 ROPs per 32-bit memory channel (which matches the leaked specs other than the TMU count), then there would be remarkable parallels to the Kepler.M GPU in Tegra K1. In comparison to Kepler.M, the purported GTX 750 Ti would have 5x more CUDA cores [shader/GFLOPS throughput], 5x more memory bandwidth, 5x more pixel fillrate [ROP throughput], and 5.5x more texture fillrate [TMU throughput]!
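A minimal sketch of those ratios; the unit counts follow the assumptions above, but both clocks are guesses (neither part's final clock is confirmed), so the exact fillrate multiples move with whatever clocks you plug in:

```python
# Kepler.M (Tegra K1 GPU) vs. the purported GTX 750 Ti.
# Clocks (GHz) and bandwidth (GB/s) are assumed/rumored values.
k1 = dict(cc=192, tmu=8,  rop=4,  clk=0.95, bw=17.0)   # Tegra K1 GPU
ti = dict(cc=960, tmu=40, rop=16, clk=1.05, bw=86.4)   # purported 750 Ti

print("CUDA cores: %.1fx" % (ti["cc"] / k1["cc"]))   # 5.0x
print("Bandwidth:  %.1fx" % (ti["bw"] / k1["bw"]))   # ~5.1x
print("Pixel fill: %.1fx" % (ti["rop"] * ti["clk"] / (k1["rop"] * k1["clk"])))  # ~4.4x
print("Texel fill: %.1fx" % (ti["tmu"] * ti["clk"] / (k1["tmu"] * k1["clk"])))  # ~5.5x
```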

In my opinion, the design and energy efficiency of GTX 750 Ti will be influenced by the breakthroughs made with Kepler.M, and this 750 Ti GPU will be a "bridge to Maxwell" so to speak.
 
A Sweclockers report (posted earlier in this thread) appears to say that neither the 750 Ti nor the 750 need external power connectors, but the translation or the report might be incorrect.

Here I will assume that this 768 number and the GPU-Z shot of the 750 Ti showing 960 CCs are both correct. From these values, gcd(768, 960) = 192 implies a very narrow set of possibilities for the number of CCs per SMX (unless Maxwell can somehow disable parts of an SMX). A Kepler SMX has 192 CCs, so unless Maxwell is taking a step back in CCs per SMX, I would guess a Maxwell SMX also has 192 CCs.
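The divisibility argument, spelled out in a few lines of Python (assuming, as above, that only whole SMXs can be disabled):

```python
from math import gcd

# The per-SMX core count must divide both rumored totals,
# so it must divide gcd(768, 960).
g = gcd(768, 960)
print(g)  # 192

# Plausible per-SMX sizes are therefore divisors of 192:
print([d for d in range(1, g + 1) if g % d == 0])
# [1, 2, 3, 4, 6, 8, 12, 16, 24, 32, 48, 64, 96, 192]

# A Kepler-like 192 CCs/SMX gives:
print(960 // 192, 768 // 192)  # 5 SMXs (750 Ti) and 4 SMXs (750)
```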

I think so too: 5 SMXs on the Ti and 4 on the regular 750.

Would that be measuring only the card or the full system?

The full system I guess
 
Will this be the best of it? From the GTX 750 Ti we get to see the new Maxwell core.

[attached image: 4222016_79022.jpg]


Is it less than 120W in FurMark?

Well, the CPU is different, but in the following AnandTech review of the GTX 650 Ti Boost (which the GTX 750 Ti looks to be close to in performance), the system was consuming 256W, more than double that value.

http://www.anandtech.com/show/6838/nvidia-geforce-gtx-650-ti-boost-review-/13
 
There was a chart posted a couple of pages back that referenced 80 TMUs for this purported [5 SMX] GTX 750 Ti. That would be consistent with the TMU/SMX ratio in various Kepler GPUs.
Ah yes, you're right. I missed that and thought it was only 4 SMXs. I think, though, that it's not really obvious at this point how similar Maxwell really is to Kepler, so this is all very speculative. Maybe Nvidia finally listens to me and kills off the separate SFUs :) which would also effectively increase the TMU/ALU ratio a bit, so all the more reason to halve it (I'm sure this could have some performance implications in some cases, but GK208 doesn't really seem to suffer all that much).
 
TSMC has already stated that moving to 20nm from 28nm will result in either a 30% power reduction or a 30% performance improvement at the same power. I'm sure Nvidia will opt to keep TDP the same when they shrink GM107, so it's not hard to extrapolate what this chip should do at 20nm.

Yes, it is hard, because as it stands they most likely won't be shrinking it "as is."
 
Here I will assume that this 768 number and the GPU-Z shot of the 750 Ti showing 960 CCs are both correct. From these values, gcd(768, 960) = 192 implies a very narrow set of possibilities for the number of CCs per SMX (unless Maxwell can somehow disable parts of an SMX). A Kepler SMX has 192 CCs, so unless Maxwell is taking a step back in CCs per SMX, I would guess a Maxwell SMX also has 192 CCs.

So the biggest news with Maxwell would be merely the integration of an ARM CPU (of which there seems to be no mention for GM107), and a different TMU/ALU ratio?
 
I wonder if Nvidia is scrapping GK106. That would be odd, as there is a pretty big gap between the GTX 760 and GM107. I guess it would simplify orders and binning, though.
I think they will "scrap" GK106 in the sense that I don't think there will be a retail desktop 700-series part using GK106. For the OEM space there is the 192-bit 760, which would fill the gap there.
 
So the biggest news with Maxwell would be merely the integration of an ARM CPU (of which there seems to be no mention for GM107), and a different TMU/ALU ratio?

No, the biggest breakthrough with Maxwell (other than unified virtual memory between CPU/GPU, and perhaps some IQ enhancements) is more energy efficient GPU computing. Maxwell is the first NVIDIA GPU architecture to be designed "mobile first".

To me, the GTX 750 Ti looks more like a Kepler.M derivative than a Maxwell derivative. The expectation is that Maxwell will use an even more advanced fabrication process, in addition to having a fully custom Denver CPU on board. That said, Kepler.M appears to be extremely energy efficient, so it is a great "bridge" to Maxwell, so to speak.

The TMU/ALU ratio of GTX 750 Ti will be no different than what we already see with Kepler (either 16/192 or 8/192).
 
No, the biggest breakthrough with Maxwell (other than unified virtual memory between CPU/GPU, and perhaps some IQ enhancements) is more energy efficient GPU computing. Maxwell is the first NVIDIA GPU architecture to be designed "mobile first".
Yup, when it comes to increasing performance, energy efficiency is the only way forward.

From my understanding, Maxwell should see the first 'big' changes to Nvidia's architecture since Fermi. (The big changes before that were G80, NV40, NV30... basically Nvidia's "tock".)
I expect changes in the TEX units, ROPs, etc., in addition to the possible Denver cores.
 
From my understanding, Maxwell should see the first 'big' changes to Nvidia's architecture since Fermi.
On the Videocardz comment sections I read that Kepler was the bigger change, and that Fermi and Maxwell are smaller jumps over their previous architectures.

I always thought that Fermi was the bigger change (I don't know much about GPU architectures), but which is true?
 
On the Videocardz comment sections I read that Kepler was the bigger change, and that Fermi and Maxwell are smaller jumps over their previous architectures.

I always thought that Fermi was the bigger change (I don't know much about GPU architectures), but which is true?

I would think Fermi would be the bigger change, with the modular GPC concept.
 
If Maxwell really has ARM-based CPU-IP inside it (which is probably reserved for and makes sense in the Big Iron anyway), I would count that as the biggest change chip-wide. As to how much the actual graphics architecture will change: If it's only a process change and the removal of the SFUs (in the Big Iron) in order to make room for the serial processing stuff, then that's not really major in my books.
 
Is it possible that the eventual ARM/Denver core would replace the command processor itself? Is such a general-purpose architecture powerful enough to also act as a machine-state processor, or must some dedicated logic be left in place?
 
On the Videocardz comment sections I read that Kepler was the bigger change, and that Fermi and Maxwell are smaller jumps over their previous architectures.

I always thought that Fermi was the bigger change (I don't know much about GPU architectures), but which is true?

I guess they're referring to the dropping of the so-called hot clock technique, where the shaders ran at twice the core frequency.
 