NVIDIA discussion [2025]

It is decent, but DLSS still has superior upscaling, and it has Ray Reconstruction and Frame Generation.

What's more, the transition to the transformer model will boost DLSS quality in all categories even further.
XeSS has Frame Generation too.
 
NVIDIA has a big supercomputer running 24/7, 365 days a year improving DLSS. And it's been doing that for six years.

This is one of the reasons why, when I see people saying something like this should be open sourced or made vendor neutral, I ask them about specifics.

I know the consumer side might not want to hear this, but short of directly monetizing software improvements like this, the cost is going to need to be captured via the margins on hardware.
 
They have pulled their plans forward in a big way: Blackwell Ultra (B300) is coming roughly 6 months early, and Rubin is on the same track. It's the same situation as with the H100 and H200. NVIDIA keeps pumping out new hardware, and customers buy what's available according to their budget, order volume and development plans.

As for reports of delay:


The answer is pretty obvious - until whoever keeps leaking them covers their shorts! (Only half joking).
 
This is one of the reasons why, when I see people saying something like this should be open sourced or made vendor neutral, I ask them about specifics.

I know the consumer side might not want to hear this, but short of directly monetizing software improvements like this, the cost is going to need to be captured via the margins on hardware.
Whatever happened to the idea of just investing in something for long term benefits?
 
The crazy spending on AI continues: after Microsoft pledged $80 billion for AI in 2025, Meta is now pledging $65 billion for 2025.



Meta will set up a massive 2-gigawatt data center, which is "so large it would cover a significant part of Manhattan," according to Zuckerberg. Roughly 1 gigawatt of this computing capacity will be online by the end of this year, and it will use more than a whopping 1.3 million GPUs.

at least this is marginally more useful than mining crypto
 
The AI bubble seems to be deflating. It may seem brutal, but it is still a natural drop after wild speculation; that's what the stock market is like. China has shown that it is necessary to invest in better algorithms and code that make the most of the available hardware. With AI, they were flush with money thanks to the opportunity to sell souped-up (and inefficient) products, and another competitor has put them in their place.


They went through tougher times, though, very very recently.


Let's see how OpenAI can compete with DeepSeek.
 
All it took was one open-source model that needed just a relative handful of H100s to train, and NVIDIA's stock is down by over 15% so far today.
 
China has shown that it is necessary to invest in better algorithms or codes that make the most of the available hardware.
I would argue they have demonstrated the ability to distill ChatGPT models, using its answers for training and nothing more. The R1 preview even displayed OpenAI's prices for API calls and answered that it was created by OpenAI.

There's no other way to improve scores on the hard math, programming and agentic benchmarks except through inference-time search with reinforcement learning and follow-up training to retain the optimal paths. All of this requires a hell of a lot of accelerators.
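Roughly, that "search + retain the optimal paths" loop looks something like the toy sketch below: sample N candidate traces per prompt, score them with some verifier or reward model, and keep only the winners for follow-up training. The generator and scorer here are made-up stand-ins, not anyone's actual pipeline; the point is just that every prompt costs N full generations plus scoring, which is where the accelerator bill comes from.

```python
import random

# Stand-in "model": proposes n candidate reasoning traces for a prompt.
# In a real system each candidate is a full LLM generation.
def generate_candidates(prompt: str, n: int) -> list[str]:
    return [f"{prompt} -> candidate reasoning path #{i}" for i in range(n)]

# Stand-in verifier / reward model: scores how good a trace looks.
# Real setups use a trained reward model or an automatic checker (unit tests, math verifiers).
def score(trace: str) -> float:
    return random.random()

def best_of_n(prompt: str, n: int = 16) -> tuple[str, float]:
    candidates = generate_candidates(prompt, n)                       # n generations
    best_score, best_trace = max((score(c), c) for c in candidates)   # n scoring passes
    return best_trace, best_score

if __name__ == "__main__":
    prompts = ["hard math problem", "tricky coding task", "multi-step agent task"]
    finetune_set = []
    for p in prompts:
        trace, s = best_of_n(p, n=16)
        finetune_set.append((p, trace))  # "retain the optimal paths" for follow-up training
    # 3 prompts x 16 candidates = 48 generations just to keep 3 training examples.
    print(f"kept {len(finetune_set)} traces out of {len(prompts) * 16} generations")
```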
 
I would argue they have demonstrated the ability to distill ChatGPT models, using its answers for training and nothing more. The R1 preview even displayed OpenAI's prices for API calls and answered that it was created by OpenAI.
That's super interesting. So the GPT models trained themselves on human-generated data, while R1 trained itself on GPT-generated data. It's like a cheap knockoff of GPT.

If that's the case then it makes sense that it's so much easier to train. Abstracting the corpus of human knowledge into a math model is hard. Distilling that pre-abstracted model into a different model should be much easier.

I think it changes the AI monetization calculus in interesting ways. Smaller companies that cannot afford to train their own first-principles models will no longer have to license them from OpenAI or Meta or whatever. They can just create their own knockoff based on the outputs of the OG models.

These distilled models cannot advance the state-of-the-art in learning. For that you need grunt work based on human data and deep pockets. But how will the big gorillas be incentivized to push the frontier if their work can just be cheaply copied? (I do get the irony re artists/actors but let's ignore that for now).

Since this is an NVIDIA thread the impact is interesting -- democratization means everyone and their mother will now want to buy a GPU to train the copycat models. But that may be offset by a drop in demand from the gorillas if they cannot protect their investments.
 
I would argue they have demonstrated the ability to distill ChatGPT models, using its answers for training and nothing more.

It's more complicated than that. This wouldn't have worked in the past, as training against the network's output with no access to the reference corpus resulted in amplification of training artifacts. The much-touted "self-poisoning" issue would have rendered the "distilled" model significantly worse than the original.

These distilled models cannot advance the state-of-the-art in learning.
But with the addition of the iterative "reasoning" pass, the output of the discerning model is stable enough to better encode "logic" by example in the primary transformation, while also filtering out a lot of the garbage the input model was confronted with in its own training corpus.

While the derived model inevitably has a huge loss of "anecdotal" knowledge (which does not carry over without training against the original corpus), it can be expected to be more refined, cutting the number of reasoning loops required to achieve the same accuracy on logic tasks as the original model by a decent factor every generation - up to the point where we hit the limitations of the model and anecdotal knowledge is lost entirely.
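To make the distillation part less hand-wavy, here's a minimal toy sketch of the generic idea - a small "student" trained purely against a frozen "teacher's" outputs, never seeing the reference corpus - in plain PyTorch with made-up tiny networks. This is classic soft-target distillation for illustration only, not DeepSeek's or OpenAI's actual recipe (LLM distillation typically trains on generated text and reasoning traces rather than raw logits), but it shows why the student side is so much cheaper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-ins: a big frozen "teacher" and a small trainable "student".
teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10)).eval()
student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # temperature: softens the teacher distribution so more of its "knowledge" transfers

for step in range(200):
    x = torch.randn(64, 32)                  # unlabeled inputs; no original corpus needed
    with torch.no_grad():
        teacher_logits = teacher(x)          # the only supervision is the teacher's output
    student_logits = student(x)

    # KL divergence between softened teacher and student distributions
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final distillation loss: {loss.item():.4f}")
```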

Why is this all so worrying for Nvidia? Because it means that, at least for inference, peak computing power demand is over, or at least already in sight. While there is still some minor demand for "large" models that have been trained against the full corpus, derived "small" models are now obviously the superior choice for actual applications.

The real problem for Nvidia is that they don't have any answer at all for the question "how do we get that shit efficient" - only for "how can we throw more hardware at it". So if demand for computing power is stagnating and price-gouging based on the scarcity of computing resources is no longer viable, then power efficiency becomes a major concern again in order to remain competitive. And that in turn is usually solved by competitors - once computing models stay sufficiently immutable for a while - going for ASIC solutions that solve the issue at typically 10-50x lower power requirements.

Current estimates call for a 50-75% devaluation of Nvidia this year, even before the arrival of generally accessible ASICs from other vendors (and no, I don't count what Google is doing in-house as "accessible", nor do I count Intel occasionally acquiring start-ups only to release a butchered version of the bought-in product 1-2 years too late every single time). But don't be fooled - once the software model can be considered "stable", it's just 6 months until Huawei etc. can offer tailored hardware. And the efficiency gains to be expected from better tailoring still outweigh the lack of access to bleeding-edge manufacturing nodes.

That being said, we are now only an estimated 18 months out from China fully catching up with TSMC's tech (well, actually the tech they are buying), and that's another elephant in the room that's been chasing Nvidia for a while now. Because that manufacturing gap is what has so far protected Nvidia's market position in general-purpose accelerators, as only ASICs are competitive in terms of power efficiency.

Once manufacturing has caught up, it becomes worthwhile to invest in extensive reverse engineering to catch up on the architectural details too. And I'm not talking about a puny 10-20 person team maintaining a compatibility layer like AMD does, but a large-scale spin-up of companies via government grants. It really just turns into a numbers game to have at least a couple of highly functional start-ups emerge as a consequence.
 
Why is this all so worrying for Nvidia? Because it means that, at least for inference, peak computing power demand is over
Or Jevons paradox will kick in, and inference demand will increase significantly necessitating even more compute to be deployed.

While there is still some minor demand for "large" models that have been trained against the full corpus
Large corporations are still doubling down on compute. People are racing to build AGI; they are not building large clusters for inference alone. Even China has announced a $100 billion investment in AI.



The real problem for Nvidia is that they don't have any answer at all for the question "how do we get that shit efficient" - only for "how can we throw more hardware at it"
That's not NVIDIA's job; they didn't start this AI cycle. NVIDIA has one task: sell shovels, and that's it. They already have their hands full tuning hardware efficiency.

Let's not forget, DeepSeek was trained on 50k Hopper GPUs (and more will be needed if they want to scale the model up), and made extensive use of PTX (NVIDIA's intermediate assembly language). Let's give credit where credit is due.
 
It really is a panic over nothing. DeepSeek has shown that you can do the same thing more cheaply. If you maintain the current investment, you will end up getting more than expected, as AI scales with more compute.
 
Using a lower-level language near-always increases performance, at the cost of development time and the requisite talent to build and maintain the project. Higher-level languages, like CUDA, forfeit absolute performance for a (usually significant) reduction in complexity, thus enabling less expensive development and faster turnaround times for ever-larger projects. If your development org can afford the talent and time to write your new application in pure assembly, you're highly likely to have a very high-performing and lightweight product to deliver. You're also going to iterate more slowly, especially as staff move in and out of the org, as low-level languages (especially something like assembly) can be very tricky to navigate.

I did a tiny bit of assembly back in high school, purely for my own edification. It's rewarding when you get it right, but damned if getting it right takes quite a while. I'd much rather write in C, and of course in modern days for the random work I do, powershell and bash and python and even some Autoit v3 all get the job done.
 
Higher-level languages are more portable between architectures as well. PyTorch runs on many different IHVs' hardware. CUDA is officially Nvidia-only, but there are unofficial ways to run CUDA applications on non-Nvidia hardware (with potential hits to performance and compatibility). There are also various cross-IHV CUDA alternatives like OpenCL, SYCL, etc. Good luck running PTX on non-Nvidia HW.

It's also possible that higher-level code written today but recompiled with future compiler versions will run better on future architecture versions than assembly code written and optimized for the current architecture, because the future compiler can apply optimizations for the future architecture. This might not be a concern for an organization with the resources to update the assembly code base every time the HW is upgraded, but many organizations don't have that luxury.
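As a small illustration of the portability point: the same high-level PyTorch code can target whichever backend a machine actually has, while hand-tuned PTX is welded to Nvidia. The backend checks below are the standard PyTorch ones; which backends exist on your box obviously depends on the install.

```python
import torch

# Pick whichever accelerator backend this PyTorch build actually has.
if torch.cuda.is_available():             # NVIDIA (the ROCm build for AMD reuses the "cuda" device API)
    device = torch.device("cuda")
elif torch.backends.mps.is_available():   # Apple Silicon
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# The model/math code itself doesn't change per vendor.
x = torch.randn(1024, 1024, device=device)
w = torch.randn(1024, 1024, device=device)
y = x @ w
print(device, y.shape)
```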
 
It's actually expanding .. now AI is going to be available on even more devices.

This selloff is just a knee jerk panic reaction from the less educated masses. The whole semiconductor market is down as a result.
Yea, that's not it at all. Many companies like Microsoft are reducing their capex for the next fiscal year as investors are asking for returns on the AI spending. This will affect Nvidia's growth. Second of all, there's a move to limit Nvidia's sales to China, further expanding the current restrictions, which will limit growth. DeepSeek, while relevant, is mostly a smoke screen. Depending on how expansive the government's restrictions are, I wouldn't be surprised if Nvidia is sub-$100 at some point this year.
 
I was keeping an eye on the launches in Europe and Asia, which are where I'm from and where I live, respectively. It was kinda wild. Launch days aren't for the sane. I know, I know.
The actual release and sales (such as they were) of the 5090 and 5080 happened at the same exact moment globally, as far as I can tell, at 15:00 CET and 22:00 HKT (already the 31st in Australia :p).

In the hours before, it was pretty much the same everywhere: Nvidia's websites for Germany/France/Netherlands/Japan/Taiwan/Singapore just showed some placeholder Blackwell pages and under the 'Shop' sections they'd still list Ada stuff only. Also, the sites were erroring out constantly. It looked like they were suffering from significant load.

Then came reveal time, and updated sites, and eventually, some 'Buying Options' showed up (examples):
5080 from €1,190.00 ($1,240.26)
5090 from €2,369.00 ($2,468.91)

5080 from S$1,660.00 ($1,229.74)
5090 from S$3,320.00 ($2,459.55)

Not that far off from MSRP + VAT, I guess, but it turned out that the international Nvidia website shops didn't really have any ready-to-order Founders Editions.
Instead, these shops settle for links to retailers, some of which actually offer individual cards. Others sell prebuilt systems only.
Because hey why not sell some other components at the same time?

I'm not sure which genius came up with the idea to do the release on the day after Chinese New Year. It's not a great call. A lot of shops in Asia were not responding or updating their listings that day.
Some of those that were, though, could be seen jacking up prices in real time as the level of demand became clear. Initially, the cheapest 5090 I saw (for preorder!) was about $3,630.60. 15 minutes later it had gone up to $3,926.89, and 30 minutes later the listing was gone. Meanwhile, most 5080s ended up not all that far off from the $2,000 asking price that was supposed to get you a 5090. Nuts.

Watching people go mental on forums and chats was entertaining in its own right. I reckon that the actual ratio of 5090 to 5080 cards out there must've been something on the order of 1 to 20 or worse. Finding 5090s actually in stock was super rare. Some shops reported having received one or two samples only. As for Founders Editions - outside of the US you can forget about those altogether. At least anytime close to launch day. Which is a pity, as I kinda lust after that design.

Anyway, as much as some people like to go on and on about the sad state of videocard prices, and youtubers declare that the cards they hated yesterday are now sooo much better value than the latest batch, there sure seems to be a lot of pent-up demand, and excitement.

So.. did I buy anything? I actually don't really know. I'll have to see how that goes.
 