Are their sales down?
If not, then people are really using the wrong metric.
> Well, Nvidia's stock price rises based on low-information emotion, so it makes sense that it's falling for the same reason. I'm sure it'll go back up when people realize the DeepSeek news is a nothingburger.

Is it really a nothingburger that a random Chinese firm can produce a working LLM for $5MM instead of tens of billions?

> Did they have the same starting point? I thought I read somewhere they used data that was already processed by another LLM.

Does it matter? All the LLMs are (at least partly) trained on stolen data, so why not steal other LLMs' data too?

The bigger downstream risk I'd be worried about, from a market perspective, is whether this escalates the trade war surrounding AI.
But in terms of the AI bubble popping, I think there needs to be some context around what people actually mean. Nvidia's valuation, like that of growth tech companies in general, is based essentially on expected future growth. Nvidia's actual AI sales and revenue can keep growing and the stock can still plummet if that growth rate is merely perceived to be lower. So the AI bubble popping from a company-valuation standpoint is very different from the AI bubble popping from a product standpoint.
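To make the "growth rate vs. revenue" point concrete, here's a toy back-of-the-envelope in Python. The earnings figure, growth rates, and discount rate are all made up for illustration; this is not how anyone actually values Nvidia, just a sketch of why a price can fall while revenue keeps growing.

```python
# Toy illustration (made-up numbers): a growth stock's price is mostly the
# discounted value of *future* earnings, so revenue can keep growing while
# the price falls if the assumed growth rate drops.

def discounted_earnings(current_earnings, growth_rate, discount_rate=0.10, years=10):
    """Sum of discounted earnings over a fixed horizon, assuming constant growth."""
    total = 0.0
    earnings = current_earnings
    for year in range(1, years + 1):
        earnings *= 1 + growth_rate
        total += earnings / (1 + discount_rate) ** year
    return total

before = discounted_earnings(100, growth_rate=0.40)  # market assumes 40% growth
after = discounted_earnings(100, growth_rate=0.20)   # repriced to 20% growth
print(f"value at 40% growth: {before:.0f}")
print(f"value at 20% growth: {after:.0f}")
print(f"drop: {1 - after / before:.0%}")  # earnings still grow every year; value falls anyway
```
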
> Does it matter? All the LLMs are (at least partly) trained on stolen data, so why not steal other LLMs' data too?

It does matter, because the dataset that makes the "cheap" training possible has to come from the "expensive" training.

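For what "getting the dataset from the expensive training" could look like in practice, here's a minimal sketch of collecting a supervised fine-tuning set from a bigger model's API. The teacher model name, prompts, and output file are placeholders, and nothing here is a claim about what DeepSeek actually did; it's just the generic recipe the comment is describing.

```python
# Hypothetical sketch: ask the "expensive" model for answers, save
# (prompt, answer) pairs, then fine-tune a cheaper model on them.
import json
from openai import OpenAI

client = OpenAI()  # the expensive teacher, behind an API

prompts = [
    "Explain why the sky is blue.",
    "Write a Python function that reverses a linked list.",
]

with open("distilled_sft_data.jsonl", "w") as f:
    for prompt in prompts:
        resp = client.chat.completions.create(
            model="gpt-4o",  # placeholder teacher model
            messages=[{"role": "user", "content": prompt}],
        )
        answer = resp.choices[0].message.content
        # Each line becomes one supervised fine-tuning example for the cheap model.
        f.write(json.dumps({"prompt": prompt, "response": answer}) + "\n")
```
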
> Did they "solve AI" in general, or did they share non-repeatable results from a niche experiment? It matters.

You can download DeepSeek today and see for yourself whether it's a non-repeatable experiment or a functional LLM made with 3% of the budget.

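If you want to try that without a data center, the practical route is one of the small distilled checkpoints rather than the full-size model, which is far too large to run locally. A minimal sketch with Hugging Face transformers, assuming the model ID below is the one you pick:

```python
# Minimal local smoke test; assumes `transformers` with a torch backend is
# installed and that the small distilled checkpoint below is the one you want.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",  # small distill, not the full model
)
out = pipe("Explain, step by step, why 17 is a prime number.", max_new_tokens=256)
print(out[0]["generated_text"])
```
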
> Did they have the same starting point? I thought I read somewhere they used data that was already processed by another LLM.

The market rarely cares whether they came from the same starting point. If they can do the same thing for 3% of the cost, the market will correct, regardless of whether they're riding on coattails.
In fact this happens again and again in tech: everyone thinks the moat is unbreakable until newcomers come along and use the work you've done to break it.
> Not sure what you mean. Do you believe that all workloads have already been encoded as neural networks, and therefore we're now in the second phase of the revolution that can run on a fraction of the hardware? Can DeepSeek repeat the outcome with other workloads, e.g. protein folding or weather simulation? I think I saw somewhere that it doesn't have ChatGPT's conversational chops. No idea what to believe, as there's an immense amount of FUD being tossed around right now.

How on earth did you get that from what I said at all?
DeepSeek is an LLM; why would it be used for protein folding or weather simulation?
(As an aside, I am skeptical American companies can successfully use AI for either of those in any meaningful way.)
> Is it really a nothingburger that a random Chinese firm can produce a working LLM for $5MM instead of tens of billions?

This is the dumbest number I have seen in this whole fiasco. $5 million is maybe, maybe, the cost of the final training run. It in no way encompasses the cost of the 50k-Hopper cluster they set up, or the cost of research, development, data collection, or the previous training runs.

> OK, I'm trying to follow your point. If DeepSeek is just an LLM, how is it a harbinger of the end of the AI arms race? Aren't we just skimming the surface of ML workloads?

Quote me where I called this the "harbinger of the end".

> This is the dumbest number I have seen in this whole fiasco. $5 million is maybe, maybe, the cost of the final training run.

Sure, $5MM probably isn't all of it, but I think it's nowhere near the billions spent by their American counterparts.

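To show why those are different quantities, here's a quick back-of-the-envelope with assumed numbers (the GPU-hour count, rental rate, and per-GPU price are illustrative guesses, not figures from DeepSeek or Nvidia):

```python
# Back-of-the-envelope with assumed numbers: the widely quoted ~$5MM is
# roughly final-run GPU-hours times a rental rate, not the cost of owning
# the hardware or of the research and failed runs that came before.
gpu_hours_final_run = 2.8e6      # assumed GPU-hours for one training run
rental_rate = 2.0                # assumed $/GPU-hour
final_run_cost = gpu_hours_final_run * rental_rate
print(f"final training run: ~${final_run_cost / 1e6:.1f}MM")

gpus_in_cluster = 50_000         # the "50k Hopper cluster" mentioned above
price_per_gpu = 30_000           # assumed purchase price per GPU
capex = gpus_in_cluster * price_per_gpu
print(f"hardware alone: ~${capex / 1e9:.1f}B")  # before research, data, prior runs
```
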
> ChatGPT and other large LLMs are like Ray Tracing. DeepSeek is baked lighting.
>
> DeepSeek could never have been created without the ground-truth tool, which is the larger LLMs. That also means that while DeepSeek can run much faster and cheaper than the larger LLMs, it's likely never going to produce a better result than they can. For most use cases that might be sufficient, but for others, like asking it to come up with solutions to PhD-level problems, DeepSeek may not be useful at all.

Something that is 95% as good for a fraction of the cost is basically catnip for investors.
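For what "using the larger LLMs as the ground-truth tool" means mechanically, here's a generic knowledge-distillation loss in PyTorch. Whether DeepSeek was actually built this way is exactly what's being argued in this thread, so treat it as a sketch of the technique, not of their pipeline; the shapes and temperature are arbitrary.

```python
# Generic distillation step (not DeepSeek's actual recipe): the student is
# trained to match the frozen teacher's output distribution token by token.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student token distributions."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)

# Toy shapes: 4 token positions over a 32k-token vocabulary.
student_logits = torch.randn(4, 32_000, requires_grad=True)
teacher_logits = torch.randn(4, 32_000)  # produced by the frozen "ground truth" model
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()  # gradients flow only into the student
print(loss.item())
```
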