NVIDIA discussion [2024]

Also in Portal RTX and Portal RTX Prelude. SER is also in more games: Witcher 3, Sackboy, Portal RTX, Portal RTX Prelude, F1 2023, Cyberpunk, and UE5 in general.

RTX Remix titles could be using SER and OMM automatically.


Blackwell is rumored to have Ray Tracing Level 4 or Level 5.

There is also some speculation that NVIDIA will integrate generative AI for NPCs and characters in certain titles, and will rely on Blackwell to deliver responses with minimal latency.

Ah yes, the good old glory days of Ageia PhysX, Killer NIC, and AIseek's Intia AI processor.
I still remember AIseek's town demo even though it may have been lost from the internet. The Tank demo is on YouTube though.
The 2000s were such a wild time for gaming and the hardware market.
 
NVIDIA is opening up a new division focused on building custom chips for AI and video games.

Nvidia officials have met with representatives from Amazon.com (AMZN.O), Meta, Microsoft, Google and OpenAI to discuss making custom chips for them, two sources familiar with the meetings said. Beyond data center chips, Nvidia has pursued telecom, automotive and video game customers.

Dina McKinney, a former Advanced Micro Devices (AMD.O) and Marvell executive, heads Nvidia's custom unit and her team's goal is to make its technology available for customers in cloud, 5G wireless, video games and automotives, a LinkedIn profile said. Those mentions were scrubbed and her title changed after Reuters sought comment from Nvidia.

 
When I was working at AMD, I was always impressed with how the company managed to keep two teams in complete isolation from each other to protect client confidentiality. One team was designing the next chip for the Microsoft Xbox, while the other was designing a chip for the Sony PlayStation. Each client had their own gaming console intellectual property and requirements which had to be protected from the other team. It was a successful model for AMD, which still owns that market.

But all that secrecy can be difficult and expensive. And it is hard to scale that business. What if the chip vendor let the customer do more of the design work and provided its IP for inclusion in the customer's chips? And of course, the client could leverage the vendor's relationships with TSMC or Samsung for fabrication to lower costs and improve time to market.

So it should surprise exactly nobody that Nvidia has announced it formed a group tasked with forging this new business model, helping clients build their own solution using Nvidia IP or perhaps even chiplets. Nvidia is up yet another 3% on the news.

Perhaps Nvidia didn’t need to buy Arm after all. With this move, it is beginning to build an AI licensing giant.
...
Could these Nvidia custom-chip clients tap into Nvidia's in-house and AWS supercomputers to accelerate and optimize those design efforts? It would be a nice piece of additional revenue as well as an incredible differentiator. If so, this could be why Nvidia is hosting its latest "internal" supercomputer, Project Ceiba, at AWS data centers, where the infrastructure for secure cloud services is already available. Nvidia could make chip design optimization services available on Ceiba.

While this speculation may be a bridge too far, doing so would indicate that Nvidia sees the writing on the wall, and is already gearing up to transform the industry once again.
 
Has Nvidia started realizing revenue from GH200? It’s crazy how much hype and spending there is on “AI” based on just the first generation of A100/H100 products. It will be interesting to watch the upgrade cycles. I’m guessing the big cloud and supercomputer players won’t be swallowing the capex to switch to the latest and greatest every few years.
 
I’m guessing the big cloud and supercomputer players won’t be swallowing the capex to switch to the latest and greatest every few years.

Why would you guess that?

For the cloud and hyperscaler people, surely it must pay to have the cutting-edge tech, which they can rent out at a premium. But they are on such a scale that they don't have to replace old kit; they can just add the new stuff and still sell services based on the old stuff at the same time.

The old stuff can just atrophy and be disposed of whenever.
 
I can’t imagine demand will still be this insane in 2026 when Blackwell is ramping up.
 
I can’t imagine demand will still be this insane in 2026 when Blackwell is ramping up.
Blackwell for HPC will likely be announced this spring.
H100 isn't first-generation AI h/w either - GV100 had tensor cores, so that puts GH100 at 3rd gen, and GH200 is just GH100 with more memory.
 
I can’t imagine demand will still be this insane in 2026 when Blackwell is ramping up.
I'm not so sure the end of stupidly high growth is in sight just yet. Think of the smaller but perhaps more risky complaints about GenAI today: a company with proprietary internally-developed software assets can't just pipe their entire codebase into a community AI platform, because it may (and has already been shown to, in multiple cases) start leaking your software secrets to others as part of the inference and distillation cycles. Any sensible companies who want to leverage AI for NDA-level things will be hosting their own private GenAI / LLMs and will need compute to do it. This isn't about the total GPU cluster capacity though, it's about securely partitioning the data.

Great, so why don't they just buy their own compute cluster? Because they can't buy just one if it's business critical; they need to buy at least two and space them out geographically. Then you get into resiliency sets and replication and services routing and of course all the underlying infrastructure duplication. Depending on the geographic size of the company (e.g. a multinational business) they may need to buy more than two clusters and place them in more than two geographic regions. The more clusters we end up buying for resiliency and redundancy, the less likely they're all going to be heavily used. In fact, it's more likely those extra GPU clusters will spend a lot of time functionally idle. We've now burned a TON of investment in physical equipment, datacenter capacity, and staff to manage the lifecycle of the infrastructure for business continuity and disaster recovery, versus just "renting" the time from a cloud provider who could provide the same.

Even if our hypothetical company ends up spending 75% of their projected total cost of ownership on renting instead of buying, they've still demonstrated a 25% reduction in cost for the same performance. And if the business demands they reduce cost? Then it's a clear line from reducing consumption to getting the expected cost reduction.

I'm not telling anyone cloud is "cheap" (because holy hell, it's not), but for boutique use cases like this one, it might be the better alternative for small to midsize companies -- and there are a LOT of those out there. Self-hosted LLMs are just now starting to take off; I have one in my house now and I can see how this could immediately grow for other companies.
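To put rough numbers on the rent-versus-buy and resiliency argument above, here's a minimal back-of-the-envelope sketch in Python. Every figure in it (cluster price, opex, rental rate) is a made-up placeholder chosen only to illustrate a 75%-of-TCO scenario, not a quote from any vendor.

# Hypothetical rent-vs-buy comparison for a business-critical GPU cluster.
# Buying implies at least two geographically separated clusters for resiliency,
# even though the standby cluster sits mostly idle.

def buy_tco(cluster_capex, clusters_for_resiliency, annual_opex_per_cluster, years):
    # Total cost of owning N clusters over the planning horizon.
    capex = cluster_capex * clusters_for_resiliency
    opex = annual_opex_per_cluster * clusters_for_resiliency * years
    return capex + opex

def rent_tco(annual_rental_cost, years):
    # Total cost of renting equivalent capacity from a cloud provider.
    return annual_rental_cost * years

# Placeholder numbers purely for illustration.
buy = buy_tco(cluster_capex=30e6, clusters_for_resiliency=2,
              annual_opex_per_cluster=4e6, years=4)
rent = rent_tco(annual_rental_cost=17.25e6, years=4)

print(f"buy:  ${buy / 1e6:.0f}M")                                 # $92M for two clusters
print(f"rent: ${rent / 1e6:.0f}M ({rent / buy:.0%} of buying)")   # $69M, i.e. 75% of buy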
 
Blackwell for HPC will likely be announced this spring.
H100 isn't first-generation AI h/w either - GV100 had tensor cores, so that puts GH100 at 3rd gen, and GH200 is just GH100 with more memory.

First generation of the AI hype.

Blackwell may be announced this year but GH200 has barely started shipping. General availability is expected in Q2.

I'm not so sure the end of stupidly high growth is in sight just yet.
...

The premise is that demand for AI compute will continue to drive datacenter expansion. If we’re lucky that demand will be driven by newly discovered useful things to do with transformer models (or some hot new thing will emerge). What’s happening right now is everyone is stocking their armories but the war may be over before it starts.
 
First generation of the AI hype.
I doubt that this affects much in how they approach the roadmap.

Blackwell may be announced this year but GH200 has barely started shipping. General availability is expected in Q2.
GB100 should follow the usual rhythm - announcement in spring, select availability in Q3, general availability in Q4-Q1. It will also probably cost more than H200 so these can co-exist.
 
Workstations equipped with GH200 are available for small to mid-size companies and AI enthusiasts.
So far, the NVIDIA GH200 Grace Hopper Superchip has only been available in servers, data centers, and cloud systems, but certain manufacturers are now buying these accelerators and putting them inside pre-built workstation PCs.

GPTshop is offering four models of its Workstation PCs which include a GH200 576 GB variant, a GH200 624 GB variant, and two Special Edition variants of the previously mentioned models that feature "early bird" pricing. The models start at $41,500 US for the 576 GB and $48,500 for the 624 GB models.
...
Coming to the specifications, the NVIDIA GH200 Grace Hopper Superchip comes with the H100 Tensor Core GPU and a 72-core NVIDIA Grace CPU. The 576 GB model comes with 480 GB of LPDDR5X memory for system RAM and 96 GB of HBM3 memory for the GPU. The 624 GB model comes with 480 GB of LPDDR5X for system RAM and 144 GB of HBM3 memory. Both models are equipped with a 900 GB/s NVLink-C2C interconnect and have a TDP that can be adjusted from 450W to 1000W.
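As a quick sanity check on those numbers, here is a small Python sketch that tabulates the two quoted configurations; the memory and price figures are taken straight from the article above, and the total and rough price-per-GB columns are just derived arithmetic, not vendor data.

# GPTshop GH200 workstation configurations as quoted above.
# Total "superchip" memory = Grace LPDDR5X (system RAM) + Hopper HBM3 (GPU).
configs = [
    # (name, LPDDR5X GB, HBM3 GB, price in USD)
    ("GH200 576 GB", 480, 96, 41_500),
    ("GH200 624 GB", 480, 144, 48_500),
]

print(f"{'model':<14} {'LPDDR5X':>8} {'HBM3':>6} {'total':>6} {'price':>9} {'$/GB':>6}")
for name, lpddr5x_gb, hbm3_gb, price_usd in configs:
    total_gb = lpddr5x_gb + hbm3_gb  # 576 GB and 624 GB respectively
    print(f"{name:<14} {lpddr5x_gb:>8} {hbm3_gb:>6} {total_gb:>6} "
          f"{'$' + format(price_usd, ','):>9} {price_usd / total_gb:>6.1f}")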
 
Anandtech is quoting Jon Peddie saying that NVIDIA is trying to expand heavily into consoles.

"NVIDIA is of course interested in expanding its footprint in consoles – right now they are supplying the biggest selling console supplier, and are calling on Microsoft and Sony every week to try and get back in," Peddie said. "NVIDIA was in the first Xbox, and in PlayStation 3. But AMD has a cost-performance advantage with their APUs, which NVIDIA hopes to match with Grace. And since Windows runs on Arm, NVIDIA has a shot at Microsoft. Sony's custom OS would not be much of a challenge for NVIDIA."

 
I think Sony is in pretty deep with AMD. But it would be VERY cool to see more diversity of hardware between the big 2 consoles.
 
Pretty sure Sony will stick with AMD due to RDNA backward compatibility. MS otoh has more flexibility with its DX12 API.
 
If Nvidia takes care of the s/w side then nothing would stop Sony from switching either. And Nvidia can do it since they've done it for PS3 and are doing it for Switch.

This idea isn't really relevant in the modern age. The only thing which determines the outcome is cost.
 
Serendipity implies luck. When CUDA's architects decided to include support for pointers, I'm guessing luck didn't have much to do with it.

Raja's hypothesis that software developers would be happy coding to a bare metal ISA seems misguided. This hasn't been the case for decades. The average software engineer nowadays relies on a deep stack of libraries and frameworks to get anything done. That's true whether you're targeting x86 or ARM or working in Python/Java/C.
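As a generic illustration of that stack (the example is mine and not tied to any particular vendor toolchain), one line of NumPy hides everything underneath it: the BLAS library it dispatches to, the compiler that built it, the driver and OS scheduler, and the ISA the developer never sees.

import numpy as np

# One line of "user code" for a large matrix multiply.
a = np.random.rand(1024, 1024).astype(np.float32)
b = np.random.rand(1024, 1024).astype(np.float32)
c = a @ b  # dispatched to whatever optimized BLAS NumPy was built against

# The developer reasons about arrays and operators, not registers or ISAs;
# the libraries and runtime underneath do the low-level work.
print(c.shape, c.dtype)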

The "Swiss cheese" stack of languages, compilers and libraries is a fact of life for everything else. Why would it be different for compute accelerators? Nvidia's advantage is that they invested in the entire stack, and that's what makes their ecosystem sticky.

It’s unlikely Nvidia will become the Wintel of AI in the long run though. There’s too much diversity in edge compute / inferencing hardware for that to happen.
 