Nvidia Pascal Announcement

Why wouldn't nV release something that can make even more margins than a gp104?

nV will not put an AIO on GP104; there is no need to. They might be able to push it up to 275 without an AIO if leakage isn't an issue.

If there is no competition, they might do what you say, but are you expecting Vega not to be competitive with GP104? I expect it to be equal to or faster than GP104, at least at stock clocks, depending on which Vega we are talking about.
 


Interesting though, look at the overclock on the 8GB of GDDR5X: it's reaching 5650MHz (11.3 GT/s effective) with bandwidth of 361GB/s over its 256-bit memory bus. Pretty nice for GDDR5X.

http://www.guru3d.com/news-story/zo...inda-rips-off-the-msi-afterburner-design.html
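The quoted bandwidth figure checks out; here is a quick sketch of the arithmetic (all numbers are taken from the post above, not official specs):

```python
# Checking the quoted overclock numbers (values from the post, not
# official specs). The tool reports 5650 MHz, which is half the
# effective per-pin data rate for GDDR5X -> 11.3 GT/s.
reported_clock_mhz = 5650
effective_rate_gtps = reported_clock_mhz * 2 / 1000   # 11.3 GT/s per pin
bus_width_bits = 256

bandwidth_gbs = effective_rate_gtps * bus_width_bits / 8  # bits -> bytes
print(f"{bandwidth_gbs:.1f} GB/s")  # matches the quoted 361 GB/s
```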
 
Why not? Besides memory amount and price, Titan as a brand is a moving goalpost.

Because the resulting product would be very underwhelming for the name and that amount of goalpost movement would be quite unprecedented. If they were planning to use GP104 as Titan, they should have launched it already.

The first Titan was supposed to be the top offering for the Kepler generation along with fully enabled FP64 units (per SMX) so it was useful for DP compute too.
Half a year later the 780 Ti came out with higher clocks and more SMX units enabled and Titan wasn't the top offering anymore.

Around 9 months later and timed to counter the Hawaii launch. Titan was by far the top offering when it launched though.

Then came the Titan Black, with all SMX units enabled and higher clocks than the 780 Ti, but with higher-clocked custom 780 Ti boards available, the Titan Black still wasn't the top offering for gaming.
- It was then decided that Titan would be the brand name that carried a premium for larger memory amount and unlocked DP compute.

Well there was nowhere left to go... They were at max already. It can be argued that it was still the top dog. More memory and overclocking was enabled on that card as well.


Then came Maxwell, and the Titan X was again the top performance offering compared to the 980 Ti, but this time its chip came with a very low DP:SP ratio (there were no DP units to unlock this time around).
- It was then decided that Titan would be the brand name for top performance and more memory.
Then came the custom 980 Ti versions with higher clocks that easily surpassed the Titan X.
- It was then decided that Titan would be the brand name for.. erm.. more memory.

Titan X is the only fully enabled SKU of the GM200 chip, and it launched first again; it had its time to shine. And again, you seem to believe that people don't raise the clocks on these cards? Plus there are even different Titan X models, like the EVGA Hybrid and Hydro Copper, and even if there weren't, it's silly to compare 3rd-party models to reference ones.

Who knows what nvidia decides to call a Titan this time around...
And until (if ever) AMD decides to launch a GPU that directly competes with the GTX 1080 in performance, there's little reason for nvidia to launch a more powerful "prosumer" chip. Especially if the GP104 can clock as high and as easily as nvidia seems to claim. GP104 + >2GHz clocks + 16GB GDDR5X 12MT/s + AiO liquid cooler + $1200 price tag and there's your Titan.


It seems like getting over 2 GHz is no big deal with Pascal, so your suggestion would be really lame for a Titan. It can be argued that the Titan Black was a poor product, but typically when a Titan launches it's by far the best SKU out there, even if short-lived.

Now you are suggesting that nVidia will completely water the brand down by launching some clocked-to-the-max card, based on a chip that will have been on the market for who knows how long by then, when all the cards they have released for ages have offered excellent OC headroom and, once OC'd, deliver pretty much the same performance. It doesn't make much sense to me. A GP102-based Titan with even better margins, and a little later a cut-down model as a 1080 Ti or 1180, makes more sense imo.
 
Why wouldn't nV release something that can make even more margins than a gp104?
Why would this supposedly larger GP102 chip make more margins than a smaller GP104, if sold at the same potential price?
Better yet, why spend all the R&D to put out a chip that was never present in their GPU family to date, just to counter something that may not even exist, when their current GPU can supposedly be boosted to new heights and form a completely new product for the premium audience?

If there is not competition, they might do what you say, but you are expecting Vega to not be competitive with gp 104? I expect it to be equal or faster than gp 104 at least at stock clocks depending on which Vega we are talking about.
Interestingly, AMD's biggest marketshare gain in the last 12 years happened when they decided not to compete with nvidia at the top-end, with RV770.
Another interesting thing is that AMD decided not to kickstart their FinFET products with a top-end, or even a high-end, solution.

Perhaps they think they can't get marketshare by fighting in the high-end this time around?
They have all the consoles, so maybe mindshare through a halo product isn't that important this time around. Marketshare is.


Because the resulting product would be very underwhelming for the name and that amount of goalpost movement would be quite unprecedented.
A GP104 at 2.1GHz base / 2.4GHz boost core clock would be around 33% faster than the current vanilla GTX 1080.
Guess how much faster the Titan X is than the vanilla GTX 980..
So why is it underwhelming? Because you know it's the same chip inside?
Do you think 95% of the people buying a Titan care a lot about the name, die size, transistor count and amount of execution units of the chip in their card? No, they just go to the store and say "give me the best nvidia card you've got, no matter the price".
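The "~33% faster" figure can be sanity-checked against the GTX 1080's reference clocks, assuming (optimistically) that performance scales linearly with core clock:

```python
# Optimistic estimate: assume performance scales linearly with core
# clock (real gains are usually lower once memory bandwidth limits hit).
stock_base, stock_boost = 1607, 1733   # GTX 1080 reference clocks, MHz
hypo_base, hypo_boost = 2100, 2400     # the post's hypothetical Titan clocks

print(f"base-clock gain:  {hypo_base / stock_base - 1:.0%}")
print(f"boost-clock gain: {hypo_boost / stock_boost - 1:.0%}")
# Roughly 31-38%, bracketing the ~33% figure claimed in the post.
```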


If they were planning to use GP104 as Titan, they should have launched it already.
Unless a certain level of yields and further process optimization with the GP104 is still needed to mass produce graphics cards that will take the chip to >2.1GHz. Which would make sense, since at the moment nVidia has around zero 16FF chips on the shelves.
 
Why would this supposedly larger GP102 chip make more margins than a smaller GP104, if sold at the same potential price?
Better yet, why spend all the R&D to put out a chip that was never present in their GPU family to date, just to counter something that may not even exist, when their current GPU can supposedly be boosted to new heights and form a completely new product for the premium audience?

Err, we have had top-end chips and dual-GPU cards that were "halo products" priced much higher; that's why the enthusiast segment was created. If you are expecting enthusiast-level cards to be placed at the same price as high-end cards, I am not sure what you are thinking. Do you think AMD and nV are crazy enough to drop their margins if they don't have to?

Interestingly, AMD's biggest marketshare gain in the last 12 years happened when they decided not to compete with nvidia at the top-end, with RV770.

So how does that correlate to this launch? You know they did that with the chip before that too, the RV6xx series. RV7xx wasn't the start of that strategy, and the previous GPUs did not fare well. The RV7xx series did well because nV decided to go through another revision of Tesla; they usually don't keep an architecture for more than 2 generations, but this time they went 3, and we could start seeing the Tesla architecture losing its legs as AMD was able to catch up.


Another interesting thing is that AMD decided not to kickstart their FinFET products with a top-end, or even a high-end, solution.
Could this be due to the design choice of using HBM2 for Vega? If there is an exclusivity clause between Hynix and AMD for HBM, AMD has to procure certain amounts of HBM memory from Hynix, just as Hynix can only sell to AMD in that time frame. The contract goes both ways. Otherwise Hynix would be dropping their pants.
Perhaps they think they can't get marketshare by fighting in the high-end this time around?
They have all the consoles, so maybe mindshare through a halo product isn't that important this time around. Marketshare is.

And how did that work last round, they had all the consoles last round too......



A GP104 at 2.1GHz base / 2.4GHz boost core clock would be around 33% faster than the current vanilla GTX 1080.
Guess how much faster the Titan X is than the vanilla GTX 980..
So why is it underwhelming? Because you know it's the same chip inside?
Do you think 95% of the people buying a Titan care a lot about the name, die size, transistor count and amount of execution units of the chip in their card? No, they just go to the store and say "give me the best nvidia card you've got, no matter the price".

It's not underwhelming at all, but that would leave no room for error when validating those chips. If there are problems, they can't clock them that high at default; it's OK if the user does it, because then it's outside warranty specs. Also, we don't know what power consumption will be like at those clocks. If they get up to 250 watts for a 30% increase in performance, where a GP102 might be able to get to 225 watts for the same 30%, there are many things to look into...
 
Better yet, why spend all the R&D to put out a chip that was never present in their GPU family to date, just to counter something that may not even exist, when their current GPU can supposedly be boosted to new heights and form a completely new product for the premium audience?

Because this way they can create better professional and consumer GPU products. The codename GP102 was already spotted in nVidia drivers last year, and how much R&D cost does one additional chip configuration really add?


A GP104 at 2.1GHz base / 2.4GHz boost core clock would be around 33% faster than the current vanilla GTX 1080.
Guess how much faster the Titan X is than the vanilla GTX 980..
So why is it underwhelming? Because you know it's the same chip inside?
Do you think 95% of the people buying a Titan care a lot about the name, die size, transistor count and amount of execution units of the chip in their card? No, they just go to the store and say "give me the best nvidia card you've got, no matter the price".

Unless a certain level of yields and further process optimization with the GP104 is still needed to mass produce graphics cards that will take the chip to >2.1GHz. Which would make sense since at the moment nVidia has around zero 16FF chips in the shelves.

It is underwhelming because quite probably most GP104 chips would be able to clock close to the same numbers, and with 1080s starting at $599, a $1000-1200 card based on the same chip, with similar performance when taken to the limits, simply isn't that great.

I don't think there has ever been even a Ti card in nVidia's lineup without more execution units than the model lacking that moniker, and you are suggesting they are just going to overclock a chip and call it a Titan...

edit: also when you overclock a Titan X/980Ti and a 980 to their limits, the bigger chips pull away even further, because their stock clocks are further apart than their overclocked clocks. Your proposed scenario here would be the exact opposite.
 
Why not? Besides memory amount and price, Titan as a brand is a moving goalpost.



Who knows what nvidia decides to call a Titan this time around...
And until (if ever) AMD decides to launch a GPU that directly competes with the GTX 1080 in performance, there's little reason for nvidia to launch a more powerful "prosumer" chip. Especially if the GP104 can clock as high and as easily as nvidia seems to claim. GP104 + >2GHz clocks + 16GB GDDR5X 12MT/s + AiO liquid cooler + $1200 price tag and there's your Titan.
A GP104 at 2.1GHz base / 2.4GHz boost core clock would be around 33% faster than the current vanilla GTX 1080.
Guess how much faster the Titan X is than the vanilla GTX 980..
So why is it underwhelming? Because you know it's the same chip inside?
Do you think 95% of the people buying a Titan care a lot about the name, die size, transistor count and amount of execution units of the chip in their card? No, they just go to the store and say "give me the best nvidia card you've got, no matter the price".

People who buy a $1000+ SKU usually know their stuff and won't buy a premium-priced SKU which is barely faster than its smaller, overclocked siblings (assuming a GP104-based Titan wouldn't clock much further than a regular overclocked GP104). The Titan, or whatever it's going to be called, has to offer something more, i.e. ~40% more performance, not just base performance (and staying ahead once overclocked), HBM2, etc. Whether they manage it with a wider chip or a smaller one reaching insane clocks (newer revisions) doesn't matter, but the latter is less likely; they probably need to go wider and bigger, especially for the memory bus. The performance delta has to be there.

Also, you can expect a lot of factory-overclocked GP104s in the form of 1080s from various vendors at a lower price (I bet we'll see 2 GHz+ boost clocks; I've read even 2.4 GHz), making it even harder to make a premium card out of the GP104.
 
Tesla hasn't surpassed anything. It's a $100M/quarter business compared to $800M for GeForce. But the FP64 side of the Tesla business was getting very long in the tooth with K80, and this class of chips probably requires a lot more validation, so it made sense to move that up. Tactical considerations don't have to mean strategic changes.

While true that revenue is unlikely to ever surpass that which is generated by the consumer lineup, it's likely that profits for the Tesla division will exceed those of the Geforce division at some point. Profit margin for the initial wave of P100 products is going to be at least an order of magnitude larger than the consumer lineup. Volume will obviously also be significantly lower. And hence even if it becomes more profitable than the Geforce line, it won't diminish the importance of the Geforce line to the company as high volumes will help to keep shared costs down (wafers, for example).

It'd be nice if they broke down revenue and operating profit by division similar to Microsoft, Apple, etc., but they don't appear to do that, from a quick look at their yearly and quarterly financials. As of their FY2016 filings it's now ~705 million USD per quarter for Geforce and ~85 million USD per quarter for Tesla. I'd imagine Tesla volumes were impacted to some degree by how long in the tooth its current architecture is, and by anticipation for upcoming Pascal-based products, more so than the Geforce division was.

Also of personal interest, the assets (IP and whatever else remains) from the 3dfx acquisition is still valued at 50 million USD. :D

Regards,
SB
 
Why is the 1080 coming before the 1070 anyway? Are 1080 yields and GDDR5X supply really that good? No 1070s were given out, even though they use a cut-down chip and plain GDDR5? No full 1070 specs? Both AMD and Nvidia are being a bit strange.
 
And how did that work last round, they had all the consoles last round too......
Last round they were short-handed because of the consoles (not to mention AMD is always short-handed vs. nvidia; this time was just worse), and they tried to compete in all performance brackets at the same time. One would think they're less short-handed this time and/or they won't try to compete in all performance brackets.

It is underwhelming because quite probably most GP104 chips would be able to clock too close to the same numbers, and with 1080s starting at $599 a $1000-1200 based on the same chip with similar performance when taken to the limits simply isn't that great.
"Quite probably" most GP104 chips will allow a 33% overclock? Wow..

I don't think there has ever been even a Ti card in nVidia's lineup without it having more execution units than a model without that moniker and you are suggesting they are just going to overclock a chip and call it a Titan...
There's been 2 generations of Titan, which isn't enough to state exactly what a Titan is or should be. Furthermore, as I explained before the Titan moniker has been a moving goalpost with every single iteration. The only thing that is constant so far is memory amount and price.

People who buy a $1000+ SKU usually know their stuff and won't buy a premium-priced SKU which is barely faster than its smaller, overclocked siblings (assuming a GP104-based Titan wouldn't clock much further than a regular overclocked GP104).
If they used higher binned chips, better voltage regulation hardware and an AiO cooler, what makes you think it wouldn't be clocking much further?

Why is the 1080 coming before the 1070 anyway? Are 1080 yields and gddr5x that available? No 1070s were given out even though they use a cut down chip and gddr5? No full 1070 specs? both AMD and Nvidia are being a bit strange.
The 1070 isn't their halo product, which was the main point of the paper/soft-launch.
Besides, we're yet to see if the 1070 doesn't come with any further surprises (e.g. a castrated memory bus).
 
newb question:

On my 980ti, OC'ing the core leads to a much larger performance gain than oc'ing the memory.

Why is HBM2 so important if the memory jump doesn't mean much for gaming performance? Is the architecture difference between GDDR5(X) and HBM2 so significant that all of a sudden a new type of memory will lead to a huge boost in performance?
 
Why is HBM2 so important if the memory jump doesn't mean much for gaming performance? Is the architecture difference between GDDR5(X) and HBM2 so significant that all of a sudden a new type of memory will lead to a huge boost in performance?
HBM2 was never hyped or even marketed to the consumer space. And won't be for a long time.

GPUs have a lot of other applications, different from games, where fast on-board memory is critical.
 
Short answer: with 16nm you get twice the transistor density, so a healthy increase in raw compute/texturing/whatnot performance. All of this needs to be fed with data. A 50% performance increase needs roughly 50% more bandwidth, and an 80% increase needs 80% more. GDDR5 cannot keep up.
 
At least for Fiji, AMD's implementation of HBM cut power consumption by half or more, and it offered significantly more bandwidth. HBM2 tweaks efficiency numbers a bit as well.
For the same level of performance, it might be possible to free up the majority of the power budget dedicated to a GDDR5 interface. AMD stated 15-20% went into the interface, and it might be possible with HBM2 to get efficiency good enough to bring that percentage into the single digits for the same bandwidth demand.

If the stack count could be reduced with HBM2's higher density, that would save more die area. I'm not sure if Fiji's interface was a significant area saver versus Hawaii, but it might be fun to speculate on whether the interface coupled with Fiji's bandwidth-saving measures could have gotten acceptable performance from fewer stacks if HBM didn't kneecap capacity.

Then, it's more core you can overclock, and more power to do it with.
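The power-budget argument above can be sketched numerically. The 15-20% interface share is AMD's claim quoted above; the 250 W board power and the ~8% HBM share are hypothetical assumptions for illustration:

```python
# Hypothetical numbers: the 15-20% interface share is AMD's quoted
# claim; the 250 W board power and 8% HBM share are assumptions.
board_power_w = 250
hbm_share_pct = 8                      # assumed single-digit HBM/HBM2 share

for gddr5_share_pct in (15, 20):
    freed_w = board_power_w * (gddr5_share_pct - hbm_share_pct) / 100
    print(f"GDDR5 at {gddr5_share_pct}%: ~{freed_w:.0f} W freed for the core")
```

Roughly 18-30 W freed under these assumptions, which is a meaningful chunk of headroom for the core.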
 
Thanks guys, that makes a lot of sense. So it's more about HBM2 using less power, providing more breathing room for the core to "grow" and get even faster.
 
On my 980ti, OC'ing the core leads to a much larger performance gain than oc'ing the memory.

Why is HBM2 so important if the memory jump doesn't mean much for gaming performance?
HBM and HBM2 aren't important today for gaming. They are more important for scientific/compute applications. And they were very good for marketing last year, but that hype has died down a bit recently. (Edit: and, indeed, power, which helps if your core is not very power efficient.)

Is the architecture difference between GDDR5(X) and HBM2 so significant that all of a sudden a new type of memory will lead to a huge boost in performance?
No. There aren't many architectural differences between GDDR5/GDDR5X and HBM/HBM2. It's mostly just the interface that's faster.

If you look at the influence of GDDR5 memory overclocking on gaming performance for a GM204, you see a scaling ratio of roughly 50%. That means there are many cases where the GPU is not BW limited at all.
As you increase the memory clock, this ratio will drop further. Conversely, as you increase core speed/nr of cores, this ratio will increase.

There aren't many benchmarks available that have extensive results where core clock and memory clock are shmoo'd, so it's anybody's guess how scaling really evolves with changing clocks.
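The "scaling ratio" above can be made concrete: it's the fraction of a relative memory overclock that shows up as relative performance. A small sketch with invented (not measured) numbers:

```python
def scaling_ratio(base_fps, oc_fps, base_clk, oc_clk):
    """Fraction of a relative clock gain that shows up as performance."""
    perf_gain = oc_fps / base_fps - 1   # relative fps improvement
    clk_gain = oc_clk / base_clk - 1    # relative memory clock increase
    return perf_gain / clk_gain

# Invented example: a +10% memory clock yielding +5% fps gives a ratio
# of ~0.5, i.e. the GPU is only partly bandwidth-limited.
print(scaling_ratio(base_fps=60.0, oc_fps=63.0, base_clk=3500, oc_clk=3850))
```

A ratio near 1.0 would indicate a fully bandwidth-bound workload; near 0, no bandwidth limitation at all.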
 
At least for Fiji, AMD's implementation of HBM cut power consumption by half or more, and it offered significantly more bandwidth. HBM2 tweaks efficiency numbers a bit as well.
For the same level of performance, it might be possible to free up the majority of the power budget dedicated to a GDDR5 interface. AMD stated 15-20% went into the interface, and it might be possible with HBM2 to get efficiency good enough to bring that percentage into the single digits for the same bandwidth demand.

If the stack count could be reduced with HBM2's higher density, that would save more die area. I'm not sure if Fiji's interface was a significant area saver versus Hawaii, but it might be fun to speculate on whether the interface coupled with Fiji's bandwidth-saving measures could have gotten acceptable performance from fewer stacks if HBM didn't kneecap capacity.

Then, it's more core you can overclock, and more power to do it with.
And you can do what NVIDIA did with the P100 and clock it lower to improve TDP even more (I assume it was done due to being at the acceptable limits): 720GB/s rather than the 1TB/s spec for HBM2.
If they think HBM/HBM2 bandwidth is more than enough, why not lower clocks and improve efficiency further?

Cheers
 
And you can do what NVIDIA did with the P100 and clock it lower to improve TDP even more (I assume it was done due to being at the acceptable limits): 720GB/s rather than the 1TB/s spec for HBM2.
If they think HBM/HBM2 bandwidth is more than enough, why not lower clocks and improve efficiency further?

Cheers

You probably need to balance efficiency vs having enough performance for enthusiasts to cycle out their older cards.

For example, I have absolutely no reason to replace my 980Ti with a 1080Ti if it's only 20% faster but 60% more power efficient. It means nothing. For me to spend money on a new card, I'd rather they make it 60% faster and 20% more efficient.

While reduced power, cooling and all that is great to have, they take a backseat to outright performance at the enthusiast level. Since enthusiast level = highest margins, you need to ensure your performance bump is enticing enough for people to make the jump.
 
For example, I have absolutely no reason to replace my 980Ti with a 1080Ti if it's only 20% faster but 60% more power efficient. It means nothing. For me to spend money on a new card, I'd rather they make it 60% faster and 20% more efficient.
Though they didn't stress it during this launch, Nvidia has historically always targeted new launches at those who are 2 generations behind; in this case, that's a 780 Ti owner. It's because those are the most likely to upgrade.

That said, 60% faster and 20% more efficient means 30% more power. This would quickly make top-end GPUs consume too much power. So they should at least keep performance and power increases the same.
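The arithmetic behind that claim: efficiency is performance per watt, so power scales as performance divided by efficiency, and 60% faster at 20% better efficiency works out to roughly a third more power:

```python
# Power = performance / efficiency (efficiency being perf per watt).
perf_gain = 1.60         # 60% faster
efficiency_gain = 1.20   # 20% better perf/W

power_ratio = perf_gain / efficiency_gain
print(f"power draw: {power_ratio - 1:+.0%}")  # ~+33%, i.e. the ~30% above
```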
 