NVIDIA discussion [2024]

Board Channel rumor (KitGuru report linked below) that RTX 50 will launch on a relatively tight schedule - if true, it seems likely aimed at sucking any air left for RDNA 4 out of the market.
Nvidia has been launching their lineups within 3- to at most 6-month timeframes for literally years now. I fail to see how this is something new with Blackwell - and as such why it would be related to RDNA4 in any way.
 
Nvidia has been launching their lineups within 3- to at most 6-month timeframes for literally years now. I fail to see how this is something new with Blackwell - and as such why it would be related to RDNA4 in any way.
They haven’t released everything in a 3 month period since Pascal, I believe - and this is 2 months. The RDNA4 comment is admittedly speculation on my part - it might not be a major consideration, though surely they’ve considered it.
 
Last edited:
They haven’t released everything in a 3 month period since Pascal - and this is 2 months.
Jan-Mar is 3 months.
Turing was Sep-Dec - 4 months.
Ampere was Sep-Feb - 6 months.
Lovelace was Oct-Jun, so technically 9 months, but it was also the least interesting as a perf/price upgrade for anything below the 4070Ti, which launched in Jan - which would also be 3 months.
 
Lovelace was Oct-Jun, so technically 9 months, but it was also the least interesting as a perf/price upgrade for anything below the 4070Ti, which launched in Jan - which would also be 3 months.
I’m not sure I buy that bit of sophistry haha.

In any case even by your own estimate it’s a shorter time frame by a month. Look I never claimed this was a radical departure, just that it’s more accelerated than usual. Also, if you reread my initial post, you’ll see I said that one can also see this as a typical schedule just with the high end cards delayed. Either could be valid based on what we know right now.

(Also, if I’m in January, March is two months away, not three. If you go from early Jan to late March then it’s more like three, but these won’t release until at least the second half of January unless NVIDIA is sending reviewers their cards the day after the announcement.)
 
Last edited:
I’m not sure I buy that bit of sophistry haha.
4060 didn't bring anything new to the market - you could argue that it was even somewhat of a downgrade (due to VRAM size) in comparison to a similarly priced 3060. So in this case its launch was rather irrelevant, as was its timing. Same is true for the 4060Ti and 4070, although the latter can be viewed as a minor upgrade depending on what you compare it to.

Look I never claimed this was a radical departure, just that it’s more accelerated than usual.
This may be more a result of them launching the first SKUs later than originally planned than of accelerating the launch of lower-end cards. My point is that this in itself isn't a sign of anything about the competition.
 
4060 didn't bring anything new to the market - you could argue that it was even somewhat of a downgrade (due to VRAM size) in comparison to a similarly priced 3060. So in this case its launch was rather irrelevant, as was its timing. Same is true for the 4060Ti and 4070, although the latter can be viewed as a minor upgrade depending on what you compare it to.
That doesn’t negate that they’re new cards! I don’t disagree with your characterization of their value but saying they don’t count as releases is a stretch. It’s not like they’re minor refreshes of the old cards (new node, new functionality, etc).
 
Modern GPUs are highly complex system-on-chips with a massive number of functions and resources — starting from compute resources and power management and all the way to display engines and security — that must be managed.

These things are now managed by 10 to 40 custom RISC-V cores developed by Nvidia, depending on chip complexity. Nvidia started to replace its proprietary microcontrollers with RISC-V-based microcontroller cores in 2015, and by now, virtually all of its MCU cores are RISC-V-based, according to an Nvidia slide demonstrated at the RISC-V Summit.

By now, Nvidia has developed at least three RISC-V microcontroller cores: NV-RISCV32 (RV32I-MU, in-order single-issue core), NV-RISCV64 (RV64I-MSU, out-of-order dual-issue core), and NV-RVV (RV32I-MU, NV-RISCV32 + 1024-bit vector extension). These cores (and perhaps others) replaced the proprietary Falcon microcontroller unit based on a different instruction set architecture. In addition, Nvidia has developed 20+ custom RISC-V extensions for extra performance, functionality, and security.

Perhaps the most important RISC-V-based part of Nvidia GPUs is its embedded GPU System Processor (GSP). According to Nvidia's website, the first GPUs to use a RISC-V-based GSP were based on the Turing architecture. This GSP offloads kernel driver functions, reduces the GPU's MMIO exposure to the CPU, and manages how the GPU is used.

Since MCU cores are universal, they can be used across Nvidia's products. As a result, in 2024, Nvidia is expected to ship around a billion RISC-V cores built into its GPUs, CPUs, SoCs, and other products, according to one of the demonstrated slides, which highlights the ubiquity of custom RISC-V cores in Nvidia's hardware.
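
Out of curiosity, here's a minimal sketch of how you could peek at the GSP bit from userspace - it just scans the text output of nvidia-smi -q for the "GSP Firmware Version" field, which recent Linux drivers print (older drivers may omit it or report "N/A"), so treat it as illustrative rather than a documented interface:

Code:
# Hedged sketch: check whether the NVIDIA driver reports a GSP firmware version.
# Assumes a recent driver whose `nvidia-smi -q` text output includes a
# "GSP Firmware Version" line; older drivers may omit it or print "N/A".
import subprocess

def gsp_firmware_version():
    out = subprocess.run(
        ["nvidia-smi", "-q"], capture_output=True, text=True, check=True
    ).stdout
    for line in out.splitlines():
        if "GSP Firmware Version" in line:
            value = line.split(":", 1)[1].strip()
            return None if value.upper() == "N/A" else value
    return None

if __name__ == "__main__":
    version = gsp_firmware_version()
    print(f"GSP offload in use, firmware {version}" if version else "GSP firmware not reported / not in use")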

In 2005, Intel CEO Paul Otellini surprised the company's board. According to a report from the New York Times, he suggested that Intel buy Nvidia for "as much as" $20 billion.

According to the Times's sources ("two people familiar with the boardroom discussion"), even some Intel executives thought that Nvidia's designs could eventually play an important role in data centers. While that idea would come to fruition with the modern AI boom, the board pushed back against it. It would have been Intel's most expensive acquisition, and there were worries about integrating the company. Otellini backed off, and that was that.

Instead, Intel's board backed an internal graphics project, Larrabee, which now-CEO Pat Gelsinger helmed. It used Intel's x86 technologies, and the design was a sort of hybrid of a CPU and a GPU. Intel ultimately pulled the plug on the project, though it would later return to graphics with its Xe and Arc projects.
 
So the proposal happened around the time AMD bought ATi? Interesting.

I am guessing the deal would have fallen through due to Jensen insisting on being the CEO - NVIDIA had been approached multiple times before, and those deals fell through for this exact reason.

In other news, India is building a gigawatt data center and NVIDIA is supplying Blackwell GPUs for it. India is also building another data center which will house tens of thousands of H100s.

 
I find it interesting because about a decade ago, when people were talking about “exascale” (meaning supercomputers delivering at least 1000 PFLOPS, or 1 EFLOPS), the sticking point was that no one thought it practical to have a supercomputer requiring more than something like 100MW of power (otherwise you could just pile on more GPUs - and the largest supercomputer at that time was around 60MW). Now people seem to talk about numbers like “1 GW” as if it’s nothing, all just because of “AI”.
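
For reference, the back-of-envelope math behind that worry (the ~2 GFLOPS/W figure for an early-2010s leader is my rough assumption; the rest are the numbers above):

Code:
# Back-of-envelope efficiency math for exascale (all figures approximate).
EXAFLOPS = 1e18          # 1 EFLOPS = 1000 PFLOPS
POWER_CEILING_W = 100e6  # the ~100MW "practical" ceiling mentioned above

needed_gflops_per_w = EXAFLOPS / POWER_CEILING_W / 1e9
print(f"Efficiency needed for 1 EFLOPS within 100 MW: {needed_gflops_per_w:.0f} GFLOPS/W")  # ~10

# Assumed figure for comparison: an early-2010s leader at roughly 2 GFLOPS/W
# would have needed on the order of 500 MW to reach exascale.
assumed_gflops_per_w = 2.0
power_mw = EXAFLOPS / (assumed_gflops_per_w * 1e9) / 1e6
print(f"Power needed at ~{assumed_gflops_per_w:.0f} GFLOPS/W: {power_mw:.0f} MW")  # ~500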
 
I find it interesting because about a decade ago, when people were talking about “exascale” (meaning supercomputers delivering at least 1000 PFLOPS, or 1 EFLOPS), the sticking point was that no one thought it practical to have a supercomputer requiring more than something like 100MW of power (otherwise you could just pile on more GPUs - and the largest supercomputer at that time was around 60MW). Now people seem to talk about numbers like “1 GW” as if it’s nothing, all just because of “AI”.

Yep the only thing that changed is $$$$$$$.
 
Two things I would like to point out here:

Ten years ago, a singular computer which could consume an entire gigawatt of power would've been monumental in size, to the point where it might as well be several hundred thousand square feet -- equal to the typical wing of a modern colocation datacenter itself. At such physical scale, it likely would've ended up being a clustered ecosystem of supercomputers rather than "one supercomputer." Maybe the clustering comment is just semantics; "supercomputer" work isn't single threaded anyway. Today, with the newest commercial GPUs coming in at ~1KW each, we can do 150KW or more in a single 42U rack without accounting for cooling. Napkin maths says around 5000 racks packed with GPUs should consume roughly 750MW and sit on ~40,000 sqft of raised floor. Then add in aisleways between those rack rows, a campus-wide water loop, the bigass pumps for the chiller system, the thermal plates for "cooling capacitance" in the chiller loop, some quantity of DX cooling for non-loop-chilled workloads (standard CRAC stuff), a room for an operations center, the necessary electrical rooms, UPS gear, and a pair of reactor rooms, and you've probably hit the 1GW mark in a datacenter smaller than 100,000sqft. There's no way we were building this sort of facility a decade ago.

Secondly, a decade ago we had no reasonable way to send an entire gigawatt of power to a single building; there are moderately sized cities that don't run on a gigawatt of power. As a simple example: the Tennessee Valley Authority (TVA) generates and distributes power to all of the state of TN, a substantial portion of Mississippi, the northernmost ~20% of Alabama, the southwestern-most ~20% of Kentucky, and several hundred square miles of Georgia, Virginia, and the two Carolinas. As I'm typing this, the weather in this area of the country is around 78°F and sunny, it's about 5:30pm, and TVA is generating 19.2GW for all of their coverage. (Link: https://www.eia.gov/electricity/gridmonitor/dashboard/electric_overview/balancing_authority/TVA) Imagine sending 5% of that power to just a single building - even today that sounds ludicrous.

Yet today, and for the past several years thanks to NRC authorizations and certifications, it's now commercially possible to drop one (or more) modular fission reactor(s) into a datacenter campus and achieve GW-class baseload at less than $0.20/kWh. This is crazy cheap for self-generation, especially when calculated over the decades-long lifespan of the unit, even after considering fueling and maintenance costs.

Money isn't the only thing which has changed; so has the technology for localized energy production.
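
For what it's worth, the napkin math checks out; here's a quick sketch using the figures above (the ~8 sqft per rack footprint is implied by the 40,000 sqft number, the rest are the post's own assumptions):

Code:
# Reproducing the napkin math above (all inputs are the post's assumptions).
RACK_KW = 150.0       # ~150 kW per loaded 42U rack, cooling not included
RACKS = 5000
SQFT_PER_RACK = 8.0   # implied by 5000 racks on ~40,000 sqft of raised floor

it_load_mw = RACKS * RACK_KW / 1000
raised_floor_sqft = RACKS * SQFT_PER_RACK
print(f"IT load: {it_load_mw:.0f} MW on {raised_floor_sqft:,.0f} sqft of raised floor")  # ~750 MW / 40,000 sqft

# The TVA comparison: 1 GW against ~19.2 GW of generation.
print(f"1 GW as a share of TVA's output: {1 / 19.2:.0%}")  # ~5%

# Annual energy bill for a 1 GW facility self-generating at $0.20/kWh.
kwh_per_year = 1e6 * 8760  # 1 GW = 1e6 kW, 8760 hours in a year
print(f"1 GW-year at $0.20/kWh: ~${kwh_per_year * 0.20 / 1e9:.2f}B")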
 
How long until any of these local nuclear power plants come online, though? Until that happens, we're still seeing a gross, insane explosion of energy usage from the grid for these AI farms.
 
The NRC only certified the very first one 18 months ago, so it's gonna be just a bit longer before they find themselves in use.

However, I'm directly familiar with two datacenters being built on US shores which are designed around local, modular fission reactors. One of them is using the GE/Hitachi unit; I'm unsure of the other.
 
Last edited:
Two things I would like to point out here:

Ten years ago, a singular computer which could consume an entire gigawatt of power would've been monumental in size, to the point where it might as well be several hundred thousand square feet -- equal to the typical wing of a modern colocation datacenter itself. At such physical scale, it likely would've ended up being a clustered ecosystem of supercomputers rather than "one supercomputer." Maybe the clustering comment is just semantics; "supercomputer" work isn't single threaded anyway. Today, with the newest commercial GPUs coming in at ~1KW each, we can do 150KW or more in a single 42U rack without accounting for cooling. Napkin maths says around 5000 racks packed with GPUs should consume roughly 750MW and sit on ~40,000 sqft of raised floor. Then add in aisleways between those rack rows, a campus-wide water loop, the bigass pumps for the chiller system, the thermal plates for "cooling capacitance" in the chiller loop, some quantity of DX cooling for non-loop-chilled workloads (standard CRAC stuff), a room for an operations center, the necessary electrical rooms, UPS gear, and a pair of reactor rooms, and you've probably hit the 1GW mark in a datacenter smaller than 100,000sqft. There's no way we were building this sort of facility a decade ago.

Secondly, a decade ago we had no reasonable way to send an entire gigawatt of power to a single building; there are moderately sized cities that don't run on a gigawatt of power. As a simple example: the Tennessee Valley Authority (TVA) generates and distributes power to all of the state of TN, a substantial portion of Mississippi, the northernmost ~20% of Alabama, the southwestern-most ~20% of Kentucky, and several hundred square miles of Georgia, Virginia, and the two Carolinas. As I'm typing this, the weather in this area of the country is around 78°F and sunny, it's about 5:30pm, and TVA is generating 19.2GW for all of their coverage. (Link: https://www.eia.gov/electricity/gridmonitor/dashboard/electric_overview/balancing_authority/TVA) Imagine sending 5% of that power to just a single building - even today that sounds ludicrous.

Yet today, and for the past several years thanks to NRC authorizations and certifications, it's now commercially possible to drop one (or more) modular fission reactor(s) into a datacenter campus and achieve GW-class baseload at less than $0.20/kWh. This is crazy cheap for self-generation, especially when calculated over the decades-long lifespan of the unit, even after considering fueling and maintenance costs.

Money isn't the only thing which has changed; so has the technology for localized energy production.
I wouldn’t exactly call this feasible - has this actually been done at all? I’ve seen proposals on paper and endless hype about modular reactors, but they all essentially end up in development hell.

Supplying a GW from the grid to a building is incredibly difficult at the distribution level. I don’t want to claim any professional standing here since I’m a relatively new engineer, but I am an EE and do something like this for a living. At best, a distribution feeder has a rating of ~550A (or ~13MW at the voltage we use here, 13.8kV, which is more or less common throughout the country give or take a few kV - call it 15kV class). So serving this from a distribution standpoint is essentially impossible: our substation xfmrs are rated at something like 50 MVA (for laypeople, just read this as MW, which is fairly accurate since power factor is pretty close to 0.9 at the distribution level thanks to cap banks), so even if you built out dedicated feeders you would need dedicated transformers.

Of course, you can serve this at the transmission level, which obviously goes a lot higher than a 15kV class distribution feeder. I imagine a lot of these gigantic data centers are fed like this, although frankly I have no idea as I don’t work in transmission.
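
Rough math behind those feeder and transformer numbers, for anyone curious (three-phase apparent power is √3 × V × I; the 0.9 power factor is the approximation mentioned above):

Code:
# Distribution-level sanity check for the feeder/transformer numbers above.
import math

V_LL = 13_800     # line-to-line voltage in volts (13.8 kV feeder)
I_FEEDER = 550    # approximate feeder ampacity in amps
PF = 0.9          # rough distribution-level power factor

feeder_mva = math.sqrt(3) * V_LL * I_FEEDER / 1e6
feeder_mw = feeder_mva * PF
print(f"One feeder: ~{feeder_mva:.1f} MVA (~{feeder_mw:.1f} MW at pf {PF})")  # ~13.1 MVA / ~11.8 MW

# What a 1 GW load would notionally take at this level:
LOAD_MW = 1000
print(f"Feeders needed: ~{LOAD_MW / feeder_mw:.0f}")                           # ~85
print(f"50 MVA substation transformers needed: ~{LOAD_MW / (50 * PF):.0f}")    # ~22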
 
I wouldn’t exactly call this feasible - has this actually been done at all? I’ve seen proposals on paper and endless hype about modular reactors, but they all essentially end up in development hell.
Oh, as far as I've seen it discussed online, there seems to be nothing especially challenging about SMRs.

Their main issue is bringing the costs down - SMRs still need to comply with all of the (too exaggerated, btw) safety regulations. So in most practical cases SMRs cost more per MW than equivalent big reactors.
I guess the few deployments we'll get will happen when SMRs receive an adequate amount of subsidies and happen to fit the requirements of a given contract at the right time. My country is going to install one such reactor in the not-so-distant future.
 