NVIDIA Maxwell Speculation Thread

Discussion in 'Architecture and Products' started by Arun, Feb 9, 2011.

  1. McHuj

    Veteran Regular Subscriber

    Joined:
    Jul 1, 2005
    Messages:
    1,055
    Location:
    Austin, Tx
    Just so I understand: GM200 would be the highest-end part? And would it even be slated for a consumer GPU at first?

    With GM107 out, I'm assuming there's an equivalent tier to what GK104 was, coming out before GM200 (hopefully in a GTX 870).

    I'm looking forward to upgrading my GTX 670, but so far there's nothing on the market worth upgrading to (for a reasonable price).
     
  2. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    6,533
    You sure you're not mixing that with GK210?
     
  3. tviceman

    Newcomer

    Joined:
    Mar 6, 2012
    Messages:
    191
    With 20nm offering "up to 1.9x transistor density" vs. 28nm, those costs actually make it look really beneficial to transition away from 28nm asap. Then again, yields might be terrible at first... but at similar yields, a company that can effectively shrink down their chip and get excellent transistor density scaling would be getting 75%+ more chips per wafer.
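    The chips-per-wafer argument above can be sketched numerically. A minimal, illustrative calculation — the die size and the effective density scaling factor are assumptions for the sake of the example, not TSMC figures:

    ```python
    # Rough sketch of the "more chips per wafer" argument.
    # All numbers are illustrative assumptions, not foundry data.

    WAFER_AREA_MM2 = 3.14159 * (300 / 2) ** 2  # 300 mm wafer, ignoring edge loss

    def chips_per_wafer(die_area_mm2):
        """Naive estimate: wafer area / die area (no edge or defect losses)."""
        return WAFER_AREA_MM2 / die_area_mm2

    die_28nm = 294.0                       # e.g. a GK104-sized die
    density_scaling = 1.75                 # assume ~1.75x effective density on 20nm
    die_20nm = die_28nm / density_scaling  # same transistor count, smaller die

    gain = chips_per_wafer(die_20nm) / chips_per_wafer(die_28nm) - 1
    print(f"~{gain:.0%} more chips per wafer")  # → ~75% more chips per wafer
    ```

    With an effective 1.75x density scaling (below the "up to 1.9x" marketing figure), the naive gain comes out to exactly the 75% the post mentions; real yields and edge losses would pull it down further.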
     
  4. silent_guy

    Veteran

    Joined:
    Mar 7, 2006
    Messages:
    3,419
    I think the real issue is that when 20nm becomes as cheap as 28nm per working transistor, 16nm may be readily available. At that point, why bother?
     
  5. A1xLLcqAgt0qc2RyMz0y

    Regular

    Joined:
    Feb 6, 2010
    Messages:
    812
    Don't forget the higher costs for the masks.
     
  6. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    587
    Apparently it was always intended to be on 28nm..the 20nm rumours were just that..rumours. If so..this would be a very interesting chip IMHO..they are bound by the reticle limit and probably would not be able to increase die size much beyond GK110. Given that they're on the same process, a GM200 v/s GK110 comparison would tell us exactly how good the architectural improvements are.
    Yes..GM200 would be the highest end part until Pascal. No idea but this time around there would be plenty of 28nm capacity available so they shouldn't be short of chips I would think.

    Yes..GM204..please read the last few pages of this thread.
    What is GK210?
    Only beneficial if you absolutely need higher density..not if you want lower cost... Yes..given initial yields..cost per transistor would be higher in the beginning. This slide by NV should give you a fair idea -

    [image: NVIDIA slide on 20nm vs. 28nm cost per transistor]

    Assuming they stick to the same transistor count of course. Traditionally..transistor count goes up every generation and the die size remains around the same. Eg..GF104 had 1.95B transistors on a 330mm2 die whereas GK104 had 3.5B transistors on a 294 mm2 die.

    I think what you're overlooking is that cost/mm2 goes up with every node.
     
  7. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,224
    Location:
    Chania
    Unless they went for some sort of insane transistor density, there's not much more than roughly 4x the GM107 transistor count that you can theoretically fit into ~550mm2.
     
  8. Dangerman

    Newcomer

    Joined:
    Apr 1, 2014
    Messages:
    25
     
  9. Picao84

    Regular

    Joined:
    Feb 15, 2010
    Messages:
    491
    Of course! Look at GM107.
     
  10. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    587
    Yep I get 4.35x if I extrapolate the numbers from GK107-GK110 to GM107-GM200 (Assuming the same 551 mm2 die).

    Numbers: GK107 - 11.02 million/mm2, GK110 - 12.89 million/mm2, i.e. an increase of 16.9%. GM107 is 12.63 million/mm2, so scaling by 16.9%, GM200 would be 14.78 million/mm2. At 551 mm2, it comes to about 8.14B transistors.

    Regarding performance, assuming that GM107 performs 78% better than GK107 (source:Techpowerup - http://www.techpowerup.com/reviews/NVIDIA/GeForce_GTX_750_Ti/25.html), I get roughly 40% higher performance for GM200 over GK110.

    I know scaling is not always linear but in the case of GK107 to GK110 it seems to be extremely linear. From the same review, GK110's performance is 4.66 times that of GK107..and we know that the die size is 4.67 times that of GK107. Very simplistic analysis I know..but gives us a ballpark.
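    The density extrapolation above can be replicated in a few lines. All figures are taken from the post itself; the 551 mm2 die is the stated assumption:

    ```python
    # Replicating the GM200 transistor-count extrapolation (figures from the post).
    gk107_density = 11.02   # Mtransistors / mm^2
    gk110_density = 12.89
    gm107_density = 12.63
    gm200_die     = 551.0   # mm^2, assumed same as GK110

    scaling = gk110_density / gk107_density        # ~1.17x small-die-to-big-die uplift
    gm200_density = gm107_density * scaling        # ~14.77 M/mm^2
    gm200_transistors = gm200_density * gm200_die  # in millions

    print(f"GM200 estimate: {gm200_transistors / 1000:.2f}B transistors")
    # → GM200 estimate: 8.14B transistors
    ```

    The key assumption is that the Kepler small-to-big density uplift carries over unchanged to Maxwell, which is a ballpark at best.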
    Should be I suppose. Maxwell is also more bandwidth efficient. However, 512-bit would allow them to put more memory on the cards..which matters in the professional/HPC segment. AMD's Firepro W9100 with 16 GB comes to mind.
    Why on 16FF+? Should be out on 16FF in 2H 2015 I hope.
    Yes..as mentioned..look at GM107. While there is a 25% increase in die size from GK107 to GM107, the transistor count increased by 44%. As a result, density has increased substantially from 11M/mm2 to 12.64M/mm2. However, this would also partly be attributable to the huge increase in L2 cache from 256 KB to 2 MB.
     
    #1690 Erinyes, May 25, 2014
    Last edited by a moderator: May 25, 2014
  11. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,224
    Location:
    Chania
    I estimated more in the 50% region but in the grander scheme of things it doesn't make much difference any more. I guess as long as they can double their DP GFLOP/W goal for HPC it won't be a problem. However if Intel meets its own 14-15 GFLOPs/W projections for Knights Landing with the first hw revision it might get tricky.

    Dunno what their plans are but I'd dedicate GM200 just to HPC, Quadros and thousand buck Titan successors and serve the enthusiast part of the market with GM204 mGPU. True and absolute GK110 successor = Pascal top dog.

    There's nothing to stop them if they want to have 4/8 GB framebuffers with a 384bit bus in theory.
     
  12. Arun

    Arun Unknown.
    Moderator Veteran

    Joined:
    Aug 28, 2002
    Messages:
    4,971
    Location:
    UK
    If GM2xx isn't 20nm then this would be the first time TSMC achieved >20% revenue from a process without either NVIDIA or AMD. It's definitely possible but the implications about TSMC's other customers are massive.
     
  13. Dangerman

    Newcomer

    Joined:
    Apr 1, 2014
    Messages:
    25
    Because I just think that by around 2016, 16FF+ will be easily available to Nvidia, and I doubt Nvidia would want to push higher-end Pascals a year or less after high-end Maxwell has been out (though I'd say that the higher-end Maxwells will last a shorter time than the higher-end Keplers did).

    Though does anyone have a clue why Nvidia has called these higher-end Maxwells "Second Generation"? It implies there are going to be more architectural improvements.

    Well, with one of the customers being Apple, it certainly helps to achieve >20% revenue from a process without either NVIDIA or AMD.
     
  14. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Location:
    /
    Interesting observation.
     
  15. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    3,699
    Location:
    Germany
    This being primarily an HPC part, I guess it'll depend on how long they expect it to lead their portfolio. Data sets in that space are constantly growing exponentially, and memory size is always one order of magnitude short.
     
  16. LiXiangyang

    Newcomer

    Joined:
    Mar 4, 2013
    Messages:
    69
    It's not just about Nvidia.

    Intel plans to release Knights Landing next year, with 3D-stacked RAM and ~3 TFLOPS.

    By then, if Nvidia cannot deliver anything competitive, they will lose a lot of share in the HPC market.

    I think the sudden appearance of Pascal in NV's plans implies it is the counter to Intel's next-generation accelerators, so regardless of when high-end Maxwell will be out, I think Pascal will be out in 2015.
     
  17. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    587
    Given that they're staying on the same process..anything in that range is impressive enough IMHO. You think they'd move to 1/2 DP rate like AMD has?

    I can't see GM204 completely replacing GK110 for gaming..not unless they increased the die size substantially.
    Of course..but if they want to hit 16 GB like the AMD part I mentioned..they'll have to go wider right?
    Arun, when do you think they will hit 20% revenue from 20nm? The article I linked to upthread mentions 60,000 WPM by Q4. Also couple of other points to add:-

    a) Nvidia's Erista is on 20nm
    b) We haven't heard any news yet regarding AMD's plans for the node though there are rumours they may be shifting some production to GF (At least for 28nm)
    c) As mentioned by others already, lots of rumours of Apple buying 20nm capacity
    d) Qualcomm's 20nm modems should be ramping soon and SoCs by Q4. They aren't really a small player nowadays
    If 16FF is mature by 2H15, they will definitely have a part ready by then. Not necessarily high end though. With Kepler and Maxwell they've set a trend of coming out with the lower end parts first..and I expect them to follow that with the 16nm transition.
    I don't believe Nvidia has called them anything yet..
     
  18. homerdog

    homerdog donator of the year
    Legend Veteran Subscriber

    Joined:
    Jul 25, 2008
    Messages:
    5,519
    Location:
    still camping with a mauler
    If Maxwell ends up being very good at coin mining, will we see a supply constraint like we saw with AMD stuff earlier this year?
     
  19. 3dilettante

    Legend

    Joined:
    Sep 15, 2003
    Messages:
    6,751
    Location:
    Well within 3d
    It would have to be significantly better than AMD's constrained cards, since they've returned to normal pricing.
    Additionally, enough people would need to be keeping the GPU mining demand significantly elevated, with multiple data points saying that time has passed.
     
  20. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,224
    Location:
    Chania
    Silly backwards speculative math could lead you to something like that, albeit I doubt they'll abandon the dedicated FP64 SPs approach that soon. If you'd go for 25 SMMs in theory, with 64 FP64 SPs/SMM and a 0.85GHz frequency, you get almost exactly 12 DP GFLOPs/W with a 225W TDP.
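    The back-of-the-envelope math above, spelled out (all figures are the post's speculative numbers, not a confirmed configuration):

    ```python
    # Speculative DP throughput estimate (numbers from the post, not confirmed specs).
    smms         = 25    # hypothetical SMM count
    fp64_per_smm = 64    # hypothetical dedicated FP64 SPs per SMM
    clock_ghz    = 0.85
    tdp_w        = 225

    # 2 FLOPs per FMA per clock per FP64 unit
    dp_gflops = smms * fp64_per_smm * 2 * clock_ghz  # 2720 GFLOPs
    print(f"{dp_gflops / tdp_w:.1f} DP GFLOPs/W")    # → 12.1 DP GFLOPs/W
    ```

    The 2 FLOPs/clock factor assumes fused multiply-add; the result lands just above 12 GFLOPs/W, matching the figure in the post.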

    Then they've most likely created an equally boring upgrade to the GK104, which would amount to a similar 40-50% increase in performance.

    My point was/is that they can go for even framebuffer GB amounts even with an "uneven" buswidth; they've done it before. Technically at least there's nothing that speaks against it afaik except probably that the driver guys would want to pull their own hair out.

    If you can in theory have 8 GB with a 384bit bus, what would then stop you from going to 16 GB, again in theory? :?:
     
