Nvidia Pascal Announcement

I don't know whether today's GP104 products have artificially throttled Int8; I haven't seen evidence either way. But GP104 and GP102 are the same architecture, just with different numbers of units. GP100, on the other hand, is unique.
Just to confirm: Scott Gray on the Nvidia devforums tested this, and GP104 (well, the 1080 anyway) has full throughput for Int8/dp4a.
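For anyone curious what that Int8 path looks like on the software side, here is a minimal sketch (a hypothetical example, not Scott Gray's actual test; it assumes CUDA 8 and an sm_61 part such as the 1080, since __dp4a is not exposed on GP100's sm_60):

```cuda
// dp4a: 4-way dot product of packed signed 8-bit values plus a 32-bit
// accumulator, so each lane retires four Int8 multiply-accumulates per
// instruction. Build with e.g.: nvcc -arch=sm_61 dp4a_sketch.cu
#include <cstdio>
#include <cuda_runtime.h>

__global__ void int8_dot(const int* a, const int* b, int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = __dp4a(a[i], b[i], 0);  // a0*b0 + a1*b1 + a2*b2 + a3*b3 + 0
}

int main()
{
    const int n = 256;
    int *a, *b, *out;
    cudaMallocManaged(&a, n * sizeof(int));
    cudaMallocManaged(&b, n * sizeof(int));
    cudaMallocManaged(&out, n * sizeof(int));
    for (int i = 0; i < n; ++i) {
        a[i] = 0x01010101;  // four packed int8 lanes, each = 1
        b[i] = 0x02020202;  // four packed int8 lanes, each = 2
    }
    int8_dot<<<(n + 127) / 128, 128>>>(a, b, out, n);
    cudaDeviceSynchronize();
    printf("out[0] = %d (expect 8)\n", out[0]);  // 4 * (1 * 2)
    cudaFree(a); cudaFree(b); cudaFree(out);
    return 0;
}
```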

Separately, and at the risk of sounding like a broken record because I cannot let go of it and need to get it off my chest lol, I really think Nvidia's strategy is opening them up to being squeezed by Intel with Knights Landing, which is already winning large HPC contracts through sales channels such as Cray (who also sell P100 and other Nvidia-based solutions). I can see the same happening on the deep learning side.
Cheers
 
Yeah, the whole "different chips with different capabilities" approach strikes me as a terrible long-term idea. Hopefully, Nvidia regains its senses with Volta...
 
Yeah, the whole "different chips with different capabilities" approach strikes me as a terrible long-term idea. Hopefully, Nvidia regains its senses with Volta...

This is not atypical for market segmentation purposes. Intel's been doing this since the introduction of the Xeon.
 
1) That is a poor analogy, and 2) Intel has zero competition.

How is it a poor analogy? They both make microprocessors, they just serve different markets for the most part. Are you telling me that one vendor disabling a feature in a particular SKU to push customers to a higher-margin SKU is abnormal, and that this is not precisely what NV is doing here? Also as you'll see I mentioned the Xeon, which was introduced in 1998. Intel did not have "zero competition" in 1998, particularly in the server market. In fact, Intel was the little fish in the big pond when it came to servers back then.
 
Are you telling me that one vendor disabling a feature in a particular SKU to push customers to a higher-margin SKU is abnormal, and that this is not precisely what NV is doing here?
That is not what I wrote, no. But if you desire to tilt at windmills, that is your business.
 
I'm trying to understand your statement. Do you wish to take part in a discussion?
No, you are not. If you were (genuinely), you would have asked a question, not made a statement.

Do you wish to take part in a discussion?
AFAICS, there is nothing to discuss. I made an observation. You disagree with it. OK.
 
No, you are not. If you were (genuinely), you would have asked a question, not made a statement.

ShaidarHaran said:
How is it a poor analogy?

ShaidarHaran said:
Are you telling me that one vendor disabling a feature in a particular SKU to push customers to a higher-margin SKU is abnormal, and that this is not precisely what NV is doing here?

By my count there are 2 question marks there.

AFAICS, there is nothing to discuss. I made an observation. You disagree with it. OK.

If that's how you feel.
 
Do you see the question mark at the end of that line? It's in the source. I didn't add it just now.
Well, if you think about it... that would obviously not be what I was referring to. Here...

No question mark (no question at all really):
This is not atypical for market segmentation purposes. Intel's been doing this since the introduction of the Xeon.
 
This is not atypical for market segmentation purposes. Intel's been doing this since the introduction of the Xeon.
That is not the same thing, though; here we are talking about native precision-compute support, and I doubt that differs between the various Xeon models in a range, especially Knights Landing, which is designed to target both HPC/scientific computing and deep learning.

Cheers
 
And the pressure on this strategy will come from Intel.
As Charlie Wuishpard, VP of Intel's Datacenter Group, says (with analysis from NextPlatform):
“There is machine learning training and there is inference; if I start with inference, the Xeon is still the most widely deployed landing space for that part of the workload and that has not been broadly advertised,” Wuishpard explains. “It’s an interesting and evolving area and we think Xeon Phi is going to be a great solution here and are in trials with a number of customers. We believe ultimately we’ll get a faster and more scalable result than GPUs—and we say GPUs because that’s gotten a lot of the attention.”
While the estimates about the majority of the inference side of the workload are based on Intel’s own estimates, this is a tough figure to poke holes in, in part because indeed, while the training end gets more attention, it is rarely a CPU-only conversation. “These codes tend to be tough to scale, they tend to live in single boxes. So people are buying these big boxes and chock them full of high power graphics cards, and there is an efficiency loss here,” he says, noting that users in this area desire to keep the entire workload on a single machine, have the ability to scale out, and use a cluster with a highly parallel implementation and without an offload model on the programming front. And it is in this collection of needs that the strongest case for Knights Landing is made—at least for deep learning training and inference.

So, why is Knights Landing a suitable competitor in the deep learning training and inference market? That answer kicks off with the fact that it’s a bifurcated workload, requiring two separate clusters for many users with limited scalability. In fact, scalability of deep learning frameworks on GPUs has been a challenge that many have faced for some time. The easy answer is to solve those underlying challenges using a common architecture that both scales and allows deep learning training and inference to happen on the same cluster using a simplified code base (i.e. not requiring offload/CUDA) and do so in a way that moves from beyond a single node for training.

As I said, Nvidia is complicating the research model with the way they are deliberately structuring their Pascal products, with no single model able to handle the whole research/deep learning requirement, compounded by the work they are now pushing for GP102 and Int8.
Intel's Knights Landing is already having success in HPC, and with Nvidia's strategy I can see the same happening in the research/deep learning segment as well.
Cheers
 
When is Volta scheduled for again? Because the real issue with all of Nvidia's recent announcements is availability, especially since all their dies are huge.

Despite all of Nvidia's PR claims that "there are no yield issues", well, let's face it, there are huge yield issues. If you are constantly sold out, that by definition is a damned yield issue. And both Nvidia and AMD are having huge yield issues, as I can't even find stock of an RX 480 or a GTX 1060 for my brother, who's been waiting for over a month now to get a new GPU. And the bigger dies are obviously going to have even more yield issues than their smaller relatives.

So right now, with all the questions about whether Nvidia's highly divided Pascal lineup is a good idea or not, the reality for sales at least is that it might not matter, provided Volta shows up around the time the manufacturing processes can churn out anything close to enough volume to meet demand. Of course, whether Nvidia continues its strategy into Volta is another question.
 
If you are constantly sold out, that by definition is a damned yield issue.

That tells you nothing about yield issues. All that says is that demand is greater than supply.

You could have 100% yield and still be supply constrained. OTOH, you could have 25% yield and not be supply constrained. While yield does influence the available supply, it isn't the only thing that affects it. And yield doesn't directly influence demand in the slightest.

In other words, it's entirely possible to have 100% yield, use 100% of a foundry's available per-month wafer production, and still be supply constrained if demand is high enough.
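To put purely made-up numbers on it: 10,000 wafer starts a month × ~300 die candidates per wafer at 100% yield is about 3 million good chips a month; if the market wants 4 million, you are sold out with perfect yield. Flip it around: at 25% yield on the same wafer starts you get roughly 750,000 good chips a month, and if demand is only 500,000 you are not supply constrained at all.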

Regards,
SB
 
Separately, and at the risk of sounding like a broken record because I cannot let go of it and need to get it off my chest lol, I really think Nvidia's strategy is opening them up to being squeezed by Intel with Knights Landing, which is already winning large HPC contracts through sales channels such as Cray (who also sell P100 and other Nvidia-based solutions). I can see the same happening on the deep learning side.
Cheers
From what I can see on the surface, I disagree. But I don't know how real neural-network practitioners use their machines: do they do training on one rig and inference on the other, so they can continue training a new iteration on the first one?

What I do know is, the harder the competition, the less able you will be to carry around dead weight, i.e. specialized products. With a chip nearing the size TSMC apparently can build comfortably, you probably would have to sacrifice something in order to expand the FP32 cores to 4×INT8 as well.
 
It's unclear to me why GP100 has dedicated double precision ALUs but not dedicated FP16 ALUs. Is it due to area, clock speeds, data routing, operand collection, register file organisation or something else?
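For reference, GP100's double-rate FP16 is exposed in CUDA as packed half2 operations, two FP16 values per 32-bit lane, rather than through a separate scalar FP16 path, which at least suggests the FP32 datapath is being reused. A minimal hypothetical sketch of that packed form (assuming CUDA 8 and an sm_60 part):

```cuda
// Packed FP16: each thread handles a pair of elements carried in one half2,
// and __hfma2 performs two FP16 fused multiply-adds per instruction.
// Build with e.g.: nvcc -arch=sm_60 fp16_sketch.cu
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// y = a*x + y, two FP16 elements per thread.
__global__ void axpy_fp16(const float* xf, float* yf, float af, int npairs)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < npairs) {
        __half2 a = __float2half2_rn(af);                     // {a, a}
        __half2 x = __floats2half2_rn(xf[2*i], xf[2*i + 1]);  // pack two elements
        __half2 y = __floats2half2_rn(yf[2*i], yf[2*i + 1]);
        y = __hfma2(a, x, y);                                 // two FP16 FMAs, one packed op
        yf[2*i]     = __low2float(y);
        yf[2*i + 1] = __high2float(y);
    }
}

int main()
{
    const int n = 256;  // scalar element count (kept even for the pairing)
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 0.5f; }
    axpy_fp16<<<(n / 2 + 63) / 64, 64>>>(x, y, 3.0f, n / 2);
    cudaDeviceSynchronize();
    printf("y[0] = %f (expect 3.5)\n", y[0]);
    cudaFree(x); cudaFree(y);
    return 0;
}
```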
 
It's unclear to me why GP100 has dedicated double precision ALUs but not dedicated FP16 ALUs. Is it due to area, clock speeds, data routing, operand collection, register file organisation or something else?

You mostly cover it in your question, I think, though I'm not a specialist.


Honestly, I'm a bit doubtful about these do-everything deep learning GPUs that are supposed to handle it all (from graphics to FP64 computing, to virtual reality, deep learning, etc.).
Wouldn't it be more interesting and efficient to have dedicated processors that cover only the deep learning instruction needs and do nothing else? That looks more efficient in every respect to me than trying to get there with multi-purpose processors (but that is certainly another debate), at least in a research lab.
 