NVIDIA discussion [2024]

Also in Portal RTX and Portal RTX Prelude. SER is also in more games: Witcher 3, Sackboy, Portal RTX, Portal RTX Prelude, F1 2023, Cyberpunk, and UE5 in general.

RTX Remix titles could be using SER and OMM automatically.


Blackwell is rumored to have Ray Tracing Level 4 or Level 5.

There is also some speculation that NVIDIA will integrate generative AI for NPCs and characters in certain titles, and will rely on Blackwell to deliver responses with minimal latency.

Ah yes, the good old glory days of Ageia PhysX, Killer NIC, and AIseek's Intia AI processor.
I still remember AIseek's town demo even though it may have been lost from the internet. The Tank demo is on YouTube though.
The 2000s were such a wild time for gaming and the hardware market.
 
NVIDIA is opening up a new division focused on building custom chips for AI and video games.

Nvidia officials have met with representatives from Amazon.com (AMZN.O), Meta, Microsoft, Google and OpenAI to discuss making custom chips for them, two sources familiar with the meetings said. Beyond data center chips, Nvidia has pursued telecom, automotive and video game customers.

Dina McKinney, a former Advanced Micro Devices (AMD.O) and Marvell executive, heads Nvidia's custom unit and her team's goal is to make its technology available for customers in cloud, 5G wireless, video games and automotives, a LinkedIn profile said. Those mentions were scrubbed and her title changed after Reuters sought comment from Nvidia.

 
When I was working at AMD, I was always impressed with how the company managed to keep two teams in complete isolation from each other to protect client confidentiality. One team was designing the next chip for the Microsoft Xbox, while the other was designing a chip for the Sony PlayStation. Each client had their own gaming console intellectual property and requirements which had to be protected from the other team. It was a successful model for AMD, which still owns that market.

But all that secrecy can be difficult and expensive. And it is hard to scale that business. What if the chip vendor let the customer do more of the design work and provided its IP for inclusion in the customer's chips? And of course, the client could leverage the vendor's relationships with TSMC or Samsung for fabrication to lower costs and improve time to market.

So it should surprise exactly nobody that Nvidia has announced it formed a group tasked with forging this new business model, helping clients build their own solution using Nvidia IP or perhaps even chiplets. Nvidia is up yet another 3% on the news.

Perhaps Nvidia didn’t need to buy Arm after all. With this move, it is beginning to build an AI licensing giant.
...
Could these Nvidia custom-chip clients tap into Nvidia's in-house and AWS supercomputers to accelerate and optimize those design efforts? It would be a nice piece of additional revenue as well as an incredible differentiator. If so, this could be why Nvidia is hosting its latest "internal" supercomputer, Project Ceiba, at AWS data centers, where the infrastructure for secure cloud services is already available. Nvidia could make chip design optimization services available on Ceiba.

While this speculation may be a bridge too far, doing so would indicate that Nvidia sees the writing on the wall, and is already gearing up to transform the industry once again.
 
Has Nvidia started realizing revenue from GH200? It’s crazy how much hype and spending there is on “AI” based on just the first generation of A100/H100 products. It will be interesting to watch the upgrade cycles. I’m guessing the big cloud and supercomputer players won’t be swallowing the capex to switch to the latest and greatest every few years.
 
I’m guessing the big cloud and supercomputer players won’t be swallowing the capex to switch to the latest and greatest every few years.

Why would you guess that?

For the cloud and hyperscaler people, surely it must pay to have the cutting-edge tech, which they can rent out at a premium. But they are on such a scale that they don't have to replace old kit; they can just add the new stuff and still sell services based on the old stuff at the same time.

The old stuff can just atrophy and be disposed of whenever.
 
I can’t imagine demand will still be this insane in 2026 when Blackwell is ramping up.
 
I can’t imagine demand will still be this insane in 2026 when Blackwell is ramping up.
Blackwell for HPC will likely be announced this spring.
H100 isn't first-generation AI h/w either - GV100 had tensor cores, so that puts GH100 at 3rd gen, and GH200 is just GH100 with more memory.
 
I can’t imagine demand will still be this insane in 2026 when Blackwell is ramping up.
I'm not so sure the end of stupidly high growth is in sight just yet. Think of the smaller but perhaps more risky complaints about GenAI today: a company with proprietary internally-developed software assets can't just pipe their entire codebase into a community AI platform, because it may (and has already been shown to, in multiple cases) start leaking your software secrets to others as part of the inference and distillation cycles. Any sensible companies who want to leverage AI for NDA-level things will be hosting their own private GenAI / LLMs and will need compute to do it. This isn't about the total GPU cluster capacity though, it's about securely partitioning the data.

Great, so why don't they just buy their own compute cluster? Because they can't buy just one if it's business critical; they need to buy at least two and space them out geographically. Then you get into resiliency sets and replication and services routing and of course all the underlying infrastructure duplication. Depending on the geographic size of the company (e.g. a multinational business) they may need to buy more than two clusters and place them in more than two geographic regions. The more clusters we end up buying for resiliency and redundancy, the less likely they're all going to be heavily used. In fact, it's more likely those extra GPU clusters will spend a lot of time functionally idle. We've now burned a TON of investment in physical equipment, datacenter capacity, and staff to manage the lifecycle of the infrastructure for business continuity and disaster recovery, versus just "renting" the time from a cloud provider who could provide the same.

Even if our hypothetical company ends up spending 75% of their projected total cost of ownership on renting instead of buying, they've still demonstrated a 25% reduction in cost for the same performance. And if the business demands they reduce cost? Then it's a clear line from reducing consumption to getting the expected cost reduction.

I'm not telling anyone cloud is "cheap" (because holy hell, it's not), but for boutique use cases like this one, it might be the better alternative for small to midsize companies -- and there are a LOT of those out there. Self-hosted LLMs are just now starting to take off; I have one in my house now and I can see how this could immediately grow for other companies.
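To put rough numbers on the rent-versus-buy and resiliency argument above, here's a minimal back-of-the-envelope sketch in Python. Every figure in it (cluster price, opex, rental rate) is a made-up placeholder chosen only to illustrate a 75%-of-TCO scenario, not a quote from any vendor.

# Hypothetical rent-vs-buy comparison for a business-critical GPU cluster.
# Buying implies at least two geographically separated clusters for resiliency,
# even though the standby cluster sits mostly idle.

def buy_tco(cluster_capex, clusters_for_resiliency, annual_opex_per_cluster, years):
    # Total cost of owning N clusters over the planning horizon.
    capex = cluster_capex * clusters_for_resiliency
    opex = annual_opex_per_cluster * clusters_for_resiliency * years
    return capex + opex

def rent_tco(annual_rental_cost, years):
    # Total cost of renting equivalent capacity from a cloud provider.
    return annual_rental_cost * years

# Placeholder numbers purely for illustration.
buy = buy_tco(cluster_capex=30e6, clusters_for_resiliency=2,
              annual_opex_per_cluster=4e6, years=4)
rent = rent_tco(annual_rental_cost=17.25e6, years=4)

print(f"buy:  ${buy / 1e6:.0f}M")                                 # $92M for two clusters
print(f"rent: ${rent / 1e6:.0f}M ({rent / buy:.0%} of buying)")   # $69M, i.e. 75% of buy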
 
Blackwell for HPC will likely be announced this spring.
H100 isn't first-generation AI h/w either - GV100 had tensor cores, so that puts GH100 at 3rd gen, and GH200 is just GH100 with more memory.

First generation of the AI hype.

Blackwell may be announced this year but GH200 has barely started shipping. General availability is expected in Q2.

I'm not so sure the end of stupidly high growth is in sight just yet.
...

The premise is that demand for AI compute will continue to drive datacenter expansion. If we’re lucky that demand will be driven by newly discovered useful things to do with transformer models (or some hot new thing will emerge). What’s happening right now is everyone is stocking their armories but the war may be over before it starts.
 
First generation of the AI hype.
I doubt that this affects much in how they approach the roadmap.

Blackwell may be announced this year but GH200 has barely started shipping. General availability is expected in Q2.
GB100 should follow the usual rhythm - announcement in spring, select availability in Q3, general availability in Q4-Q1. It will also probably cost more than H200 so these can co-exist.
 
Workstations equipped with GH200 are available for small to mid-size companies and AI enthusiasts.
So far, the NVIDIA GH200 Grace Hopper Superchip has only been available in servers, data centers, and cloud systems, but certain manufacturers are now buying these accelerators and putting them inside pre-built workstation PCs.

GPTshop is offering four models of its Workstation PCs which include a GH200 576 GB variant, a GH200 624 GB variant, and two Special Edition variants of the previously mentioned models that feature "early bird" pricing. The models start at $41,500 US for the 576 GB and $48,500 for the 624 GB models.
...
Coming to the specifications, the NVIDIA GH200 Grace Hopper Superchip comes with the H100 Tensor Core GPU and a 72-core NVIDIA Grace CPU. The 576 GB model comes with 480 GB of LPDDR5X memory for system RAM and 96 GB of HBM3 memory for the GPU. The 624 GB model comes with 480 GB of LPDDR5X for system RAM and 144 GB of HBM3 memory. Both models are equipped with a 900 GB/s NVLink-C2C interconnect and have a TDP that can be adjusted from 450W to 1000W.
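As a quick sanity check on those numbers, here is a small Python sketch that tabulates the two quoted configurations; the memory and price figures are taken straight from the article above, and the total and rough price-per-GB columns are just derived arithmetic, not vendor data.

# GPTshop GH200 workstation configurations as quoted above.
# Total "superchip" memory = Grace LPDDR5X (system RAM) + Hopper HBM3 (GPU).
configs = [
    # (name, LPDDR5X GB, HBM3 GB, price in USD)
    ("GH200 576 GB", 480, 96, 41_500),
    ("GH200 624 GB", 480, 144, 48_500),
]

print(f"{'model':<14} {'LPDDR5X':>8} {'HBM3':>6} {'total':>6} {'price':>9} {'$/GB':>6}")
for name, lpddr5x_gb, hbm3_gb, price_usd in configs:
    total_gb = lpddr5x_gb + hbm3_gb  # 576 GB and 624 GB respectively
    print(f"{name:<14} {lpddr5x_gb:>8} {hbm3_gb:>6} {total_gb:>6} "
          f"{'$' + format(price_usd, ','):>9} {price_usd / total_gb:>6.1f}")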
 
Anandtech is quoting Jon Peddie saying that NVIDIA is trying to expand heavily into consoles.

"NVIDIA is of course interested in expanding its footprint in consoles – right now they are supplying the biggest selling console supplier, and are calling on Microsoft and Sony every week to try and get back in," Peddie said. "NVIDIA was in the first Xbox, and in PlayStation 3. But AMD has a cost-performance advantage with their APUs, which NVIDIA hopes to match with Grace. And since Windows runs on Arm, NVIDIA has a shot at Microsoft. Sony's custom OS would not be much of a challenge for NVIDIA."

 
I think Sony is in pretty deep with AMD. But it would be VERY cool to see more diversity of hardware between the big 2 consoles.
 
Pretty sure Sony will stick with AMD due to RDNA backward compatibility. MS otoh has more flexibility with its DX12 API.
 
If Nvidia takes care of the s/w side then nothing would stop Sony from switching either. And Nvidia can do it since they've done it for PS3 and are doing it for Switch.

This idea isn't really relevant in the modern age. The only thing which determines the outcome is cost.
 
Serendipity implies luck. When CUDA's architects decided to include support for pointers, I'm guessing luck didn't have much to do with it.

Raja's hypothesis that software developers would be happy coding to a bare metal ISA seems misguided. This hasn't been the case for decades. The average software engineer nowadays relies on a deep stack of libraries and frameworks to get anything done. That's true whether you're targeting x86 or ARM or working in Python/Java/C.
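As a generic illustration of that stack (the example is mine and not tied to any particular vendor toolchain), one line of NumPy hides everything underneath it: the BLAS library it dispatches to, the compiler that built it, the driver and OS scheduler, and the ISA the developer never sees.

import numpy as np

# One line of "user code" for a large matrix multiply.
a = np.random.rand(1024, 1024).astype(np.float32)
b = np.random.rand(1024, 1024).astype(np.float32)
c = a @ b  # dispatched to whatever optimized BLAS NumPy was built against

# The developer reasons about arrays and operators, not registers or ISAs;
# the libraries and runtime underneath do the low-level work.
print(c.shape, c.dtype)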

The "Swiss cheese" stack of languages, compilers and libraries is a fact of life for everything else. Why would it be different for compute accelerators? Nvidia's advantage is that they invested in the entire stack, and that's what makes their ecosystem sticky.

It’s unlikely Nvidia will become the Wintel of AI in the long run though. There’s too much diversity in edge compute / inferencing hardware for that to happen.
 