NVIDIA discussion [2024]

That's the first thought that crossed my mind. Of course there are already millions of users using computers for gaming, but this is on top of that. Plus it's concentrated in one single location? Maybe it should have its own nuclear power station haha.
If the system costs $100B, adding a small nuclear power plant would only increase the cost by like 1%.
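Back-of-envelope on that 1% figure, assuming a small modular reactor lands somewhere around $1B (SMR cost estimates vary widely, so treat this as the implied ballpark rather than a real quote):

```python
# Rough check of the "adds ~1% to the cost" claim.
system_cost = 100e9   # $100B AI system
smr_cost    = 1e9     # assumed ~$1B small nuclear plant (ballpark, not a real quote)
print(f"{smr_cost / system_cost:.1%}")   # -> 1.0%
```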
 
At any rate, this is just a glimpse of what's coming with regard to AI investment. Anyone who thinks AI is a bubble is severely mistaken. If anything it's the contrary: companies and countries are increasing their investment at a crazy pace.
 
For it not to be a bubble there needs to be a return on the investment. Nvidia is certainly seeing a return, but so are the folks who sell shovels during a gold rush. Eventually this has to pay off to all links in the chain down to the consumer, otherwise they have no reason to participate. Right now it feels like all the investment going into AI is not based on what it does now, but riding on the speculation that LLMs will get much better and orders of magnitude less costly in terms of resources (power, transistors) in the future. Writing hundred billion dollar cheques expecting Moore's Law to cash them 5-10 years from now seems like a weird gamble to make. Maybe if MS were seeing Bing's traffic skyrocketing (and their advertising dollars likewise) then at least we could have that to point to for user adoption and a monetization model, but I don't think that's what we're seeing right now.
 
Maybe if MS were seeing Bing's traffic skyrocketing (and their advertising dollars likewise) then at least we could have that to point to for user adoption and a monetization model, but I don't think that's what we're seeing right now.
Something is going on behind the scenes that we just don't fully know about yet. OpenAI and Microsoft are cooking something: they are saying ChatGPT 5 is amazingly powerful and unlike anything before it. Sora is also advancing rapidly, and Hollywood executives are meeting with OpenAI people to discuss big things, which explains why Microsoft and OpenAI are doubling down on AI investment. OpenAI also revealed a voice engine that takes a 15-second voice clip from a single speaker and generates natural-sounding speech from it in any language.

These developments are triggering competition waves across the industry, nobody wants to fall behind.


 
MLPerf Inference 4.0 Results Showcase GenAI; Nvidia Still Dominates (hpcwire.com)

There were no startling surprises in the latest MLPerf Inference benchmark (4.0) results released yesterday. Two new workloads — Llama 2 and Stable Diffusion XL — were added to the benchmark suite as MLPerf continues to keep pace with fast-moving ML technology. Nvidia showcased H100 and H200 results, Qualcomm’s Cloud AI 100 Ultra (preview category) and Intel/Habana’s Gaudi 2 showed gains. Intel had the only CPU-as-accelerator in the mix.
...
Overall, the number of submitters has been fairly stable in recent years. There were 23 this time including ASUSTeK, Azure, Broadcom, Cisco, CTuning, Dell, Fujitsu, Giga Computing, Google, Hewlett Packard Enterprise, Intel, Intel Habana Labs, Juniper Networks, Krai, Lenovo, NVIDIA, Oracle, Qualcomm Technologies, Inc., Quanta Cloud Technology, Red Hat, Supermicro, SiMa, and Wiwynn. MLPerf Inference v4.0 included over 8500 performance results and 900 Power results from 23 submitting organizations.

Missing were the heavy chest-thumping bar charts from Nvidia versus competitors as the rough pecking order of inference accelerators, at least for now, seems to be settled. One of the more interesting notes came from David Salvator, Nvidia director of accelerated computing products, who said inference revenue was now 40% of Nvidia’s datacenter revenue.
...
MLCommons provided a deeper look into its decision-making process in adding the two new benchmarks, which is posted on the MLCommons site. The composition of the team doing that work — Thomas Atta-fosu, Intel (Task Force Chair); Akhil Arunkumar, D-Matrix; Anton Lokhomotov, Krai; Ashwin Nanjappa, NVIDIA; Itay Hubara, Intel Habana Labs; Michal Szutenberg, Intel Habana Labs; Miro Hodak, AMD; Mitchelle Rasquinha, Google; Zhihan Jiang, Nvidia — reinforces the idea of cooperation among rival companies.
...
Practically speaking, digging out value from the results requires some work. With this round, MLPerf results are being presented on a different platform — Tableau — and, at least for me, there's a learning curve to effectively use the powerful platform. That said, the data is there.
...
Asked about the forthcoming Blackwell GPUs, B100 and B200, and their drop-in compatibility with existing H100 and H200 systems, Salvator said, “We have not designed B200 to be drop-in compatible with an H200 CTS system. The drop-in compatible side is focused more on the B100, because we have a pretty significant installed base of H100 base servers and a lot of our partners know how to build those servers. So that ability to easily swap in a B100 base board gets them to market that much faster. B200 will require a different chassis design. It’s not going to be drop-in compatible with H200 systems.”
...

Shah noted Intel had five partners submitting this time around. “The fact that we have five partners that submitted is indicative of the fact that they also are recognizing that this is where Xeon's key strengths are; it’s in the area of when you have mixed general purpose workloads or a general purpose application and you’re infusing AI into it.” The five partners were Cisco, Dell, Quanta, Supermicro, and WiWynn.

Next up is the MLPerf Training expected in the June time frame.
 
The H100 and L40S dominate almost all inference charts; I am surprised by the L40S, to be honest.

Interesting that Google only submitted results for a single category: GPT-J 99 with 4 pods of TPU v5e. They get trounced though: 8x H100 is between 24x and 34x faster, while 8x L40S is 10x to 14x faster. Normalized accelerator vs. accelerator it's still a huge difference (H100 is 12x to 17x faster, while L40S is 5x to 7x faster). I thought TPUs were much more competitive than this?
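For anyone wondering where the normalized numbers come from, here's a minimal sketch of the per-accelerator math. The throughputs below are placeholders rather than the actual MLPerf v4.0 figures, and it assumes the Google entry amounts to 4 TPU v5e chips against 8 GPUs, which is what the factor of two between the total and normalized speedups implies.

```python
# Per-accelerator normalization sketch (placeholder throughputs, not real MLPerf data).

def per_chip_speedup(thr_a, chips_a, thr_b, chips_b):
    """Speedup of system A over system B on a per-accelerator basis."""
    return (thr_a / chips_a) / (thr_b / chips_b)

# Hypothetical GPT-J offline throughputs in queries/s -- illustrative only.
tpu_v5e = {"throughput": 100.0, "chips": 4}    # assumed 4 TPU v5e chips
h100    = {"throughput": 2400.0, "chips": 8}   # 8x H100 system

total = h100["throughput"] / tpu_v5e["throughput"]
per_chip = per_chip_speedup(h100["throughput"], h100["chips"],
                            tpu_v5e["throughput"], tpu_v5e["chips"])
print(f"total: {total:.0f}x, per accelerator: {per_chip:.0f}x")
# -> total: 24x, per accelerator: 12x (the halving comes from 8 GPUs vs 4 TPU chips)
```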
 

A computer this size is probably going to consume more than 300MW though. The largest computer today (Frontier) cost $600M and consumes about 22.7MW. It is among the most power efficient supercomputers.
A $100B system means that it could be more than 100 times the size. It could consume way more than 1000MW, probably even 2000MW. You'll probably need a full sized nuclear power plant (like two AP1000s) to support it :)
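A quick back-of-envelope on that scaling, purely as a sketch (linear cost-to-power scaling is a crude assumption, and newer accelerators deliver more per dollar, so read the output as a rough upper bound):

```python
# Scaling Frontier's cost/power ratio up to a hypothetical $100B system.
frontier_cost_usd = 600e6   # ~$600M
frontier_power_mw = 22.7    # ~22.7 MW

target_cost_usd = 100e9     # $100B

scale = target_cost_usd / frontier_cost_usd        # ~167x the cost
naive_power_mw = scale * frontier_power_mw         # ~3,800 MW at linear scaling

print(f"cost scale: {scale:.0f}x, naive power: {naive_power_mw:,.0f} MW")
# Even if better efficiency per dollar cuts that in half, you're still in the
# 1,000-2,000+ MW range mentioned above.
```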
 
A computer this size is probably going to consume more than 300MW though. The largest computer today (Frontier) cost $600M and consumes about 22.7MW. It is among the most power efficient supercomputers.
A $100B system means that it could be more than 100 times the size. It could consume way more than 1000MW, probably even 2000MW. You'll probably need a full sized nuclear power plant (like two AP1000s) to support it :)
I don't understand how the distribution system would ever be able to support a 2000MW spot load; that's 84kA on a 13.8kV system. Are these planned to be fed via transmission line?
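For anyone checking the arithmetic, the 84 kA is just the three-phase line current for a 2,000 MW load at 13.8 kV, assuming unity power factor:

```python
import math

# Three-phase line current: I = P / (sqrt(3) * V_line), unity power factor assumed.
p_watts = 2_000e6    # 2,000 MW
v_line  = 13_800.0   # 13.8 kV distribution voltage

i_amps = p_watts / (math.sqrt(3) * v_line)
print(f"{i_amps / 1e3:.0f} kA")   # -> ~84 kA, far beyond normal distribution switchgear,
                                  # hence the question about a dedicated transmission feed
```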
 
You'll probably need a full sized nuclear power plant (like two AP1000s) to support it :)

Or lots of solar + storage. Or many smaller nuclear reactors. Whatever it takes, and if it helps further development of the non-fossil energy market it's a positive. The challenge with anything except small nuclear is grid interconnects, isn't it? A project with this sort of budget will be able to pay for them, but regulation/NIMBYism makes things move slower than they should.

It's like we learned nothing from the Terminator movies 😞

There's no fete but what we make? 🙂

 
The premise of the Terminator hinged on an AI sending technology back in time which rapidly seeded the tech advancements that gave rise to the AI. It wasn't a modern retelling of Frankenstein, it was a plot device for a robot assassin action movie.
 
Or lots of solar + storage. Or many smaller nuclear reactors. Whatever it takes, and if it helps further development of the non-fossil energy market it's a positive. The challenge with anything except small nuclear is grid interconnects, isn't it? A project with this sort of budget will be able to pay for them, but regulation/NIMBYism makes things move slower than they should.

If the supercomputer (or perhaps more accurately, the data center) itself requires so much power, it seems to me that naturally you'll want to use the largest reactors around, instead of using many smaller reactors. Larger reactors tend to be more efficient and it's also generally cheaper to maintain a few large reactors compared to many smaller reactors.

Solar only makes sense for this kind of project when you want to save money, because daytime electricity is more expensive. You probably don't need to have a lot of your own storage unless you are off grid, but I don't think it's realistic for a project this size to be completely off grid. Actually, even if you build your own nuclear reactors you probably still want to be on grid, because the reactors will need to be shut down from time to time for routine maintenance.
 
Which is good; we shouldn't use Hollywood productions to inform us about the future of computing.

The Terminator reference was mostly tongue in cheek. I am genuinely concerned at this runaway competition for insane AI performance without any real safety net though. We already have AI networks that easily exceed the computational performance of a human brain and we're looking to exponentially exceed those in the near future. Surely it's inevitable that these things will at some point become sentient, and as far as I'm aware we have no plan whatsoever for if/when that happens and the likely vastly more intelligent AI's values and aspirations don't align with our own.

I know that sounds like science fiction, but so would the whole concept of exaflop level AI networks a few years ago. Let alone the things that generative AI can do these days for the average consumer.
 
If the supercomputer (or perhaps more accurately, the data center) itself requires so much power, it seems to me that naturally you'll want to use the largest reactors around, instead of using many smaller reactors. Larger reactors tend to be more efficient and it's also generally cheaper to maintain a few large reactors compared to many smaller reactors.

We don't really have numbers for small nuclear, since they're not deployed. The one pilot project fell by the wayside, but that was because it was likely to be thrashed by solar prices. In principle, the cost should beat out large nuclear plants' actual operating costs. You can always beat something that's actually operating with something on paper though!

I'd bet the cheapest option is to build the data center next to a massive solar farm + storage.

Wonder what they could do with all the waste heat being generated?
 
Surely it's inevitable that these things will at some point become sentient

Is it though? The current models seem to rely very heavily on hoovering up human-generated knowledge, and also human-generated bullshit. There have been claims of emergent behaviour in the LLMs but TBH I think that's at the level of people who think their dog is clever because it can lick its own balls.

Let alone the things that generative AI can do these days for the average consumer.

This, IMO, is the real problem. It's not the Artificial Intelligence of the machines, it's the Artificial Stupidity of the humans.
 
The Terminator reference was mostly tongue in cheek. I am genuinely concerned at this runaway competition for insane AI performance without any real safety net though. We already have AI networks that easily exceed the computational performance of a human brain and we're looking to exponentially exceed those in the near future. Surely it's inevitable that these things will at some point become sentient, and as far as I'm aware we have no plan whatsoever for if/when that happens and the likely vastly more intelligent AI's values and aspirations don't align with our own.

I know that sounds like science fiction, but so would the whole concept of exaflop level AI networks a few years ago. Let alone the things that generative AI can do these days for the average consumer.
And massively off topic.

No, it is not "surely inevitable that these things will at some point become sentient". Even you then water it down by saying it sounds like science fiction, language which suggests you might be unsure yourself.
Actually the burden of proof would be on those claiming they would become "sentient".

Regardless, we'd first need to worry about the same bad actors being able to do more harm with the help of AI (a new tool that obviously can be used to cause harm too), long before any awakened AI "decides" to wipe us out.
 
The Terminator reference was mostly tongue in cheek. I am genuinely concerned at this runaway competition for insane AI performance without any real safety net though.
Skynet is the greatest threat ~~humanity~~ NVIDIA's stock price has ever seen.

Wonder what they could do with all the waste heat being generated?
Jen-Hsun, joking: "Room temperature comes in, Jacuzzi comes out."

Related: https://theconversation.com/swimmin...from-servers-heres-how-to-make-it-work-221693 (I'm not seriously expecting that to happen at any significant scale, but it'd be interesting to swim in the entropy of intelligence itself...)
Is it though? The current models seem to rely very heavily on hoovering up human-generated knowledge
Who cares about current models? Everyone seems to forget that while we don't quite understand the inner workings of LLMs, we do understand Reinforcement Learning pretty well, and there is a vast amount of high quality literature (and research by OpenAI/Deepmind from 5+ years ago) on the topic. What's missing is arguably the easy part - we "just" need to figure out how to integrate the two (easier said than done).
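To make the "integrate the two" point concrete, here's a toy sketch: a tiny generative policy updated with plain REINFORCE against a scalar reward. It's a stand-in for an LLM plus a learned reward model, not anything OpenAI or DeepMind actually ship; every name and number in it is made up for illustration.

```python
# Toy REINFORCE on a tiny generative "policy" -- purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, SEQ_LEN, LR = 5, 4, 0.1
logits = np.zeros((SEQ_LEN, VOCAB))   # stand-in for an LLM's per-position logits

def reward(tokens):
    # Stand-in for a learned reward model: prefer sequences full of token 3.
    return float(np.sum(tokens == 3))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for step in range(500):
    probs = np.array([softmax(row) for row in logits])
    tokens = np.array([rng.choice(VOCAB, p=p) for p in probs])  # sample a "completion"
    r = reward(tokens)
    # REINFORCE: nudge up the log-prob of sampled tokens in proportion to the reward.
    for t, tok in enumerate(tokens):
        grad = -probs[t]
        grad[tok] += 1.0
        logits[t] += LR * r * grad

print("learned preference per position:", logits.argmax(axis=1))  # -> mostly token 3
```

The hard part the post alludes to is exactly what this toy skips: getting a reward signal that actually reflects what we want, at LLM scale.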

The scary part to me is that we understand too little of the inner workings while now building single clusters that beat the human brain on every metric by such ridiculous amounts that AGI has moved to being strictly a software problem. Even the most extravagant claims about wetware computation with significant computation done implicitly in every synapse still aren't anywhere near enough to match a 20K GB200 cluster (never mind Stargate if that thing is real).

That has never been the case before; if an alien civilisation came to us 30 years ago with the software for AGI, we nearly certainly couldn't compute it at sufficient scale. We absolutely definitely could today.
 