NVIDIA discussion [2024]

  • Thread starter Deleted member 2197
Status
Not open for further replies.
Board Channel rumor (KitGuru report linked below) that RTX 50 will launch on a relatively tight schedule - if true, it seems likely aimed at sucking any air left for RDNA 4 out of the market.
Nvidia has been launching their lineups in 3 to a max of 6 months timeframes for literally years now. I fail to see how this is something new with Blackwell - and as such why it would be related to RDNA4 in any way.
 
They haven’t released everything in a 3 month period since Pascal I believe - and this is 2 months. The RDNA4 comment is admittedly speculation on my part - might not be a major consideration, though surely they’ve considered it.
 
They haven’t released anything in a 3 month period since Pascal - and this is 2 months.
Jan-Mar is 3 months.
Turing was Sep-Dec - 4 months.
Ampere was Sep-Feb - 6 months.
Lovelace was Oct-Jun, so technically 9 months, but it was also the least interesting as a perf/price upgrade for anything below the 4070 Ti, which launched in Jan, so that would also be 3 months.
 
Lovelace was Oct-Jun, so technically 9 months, but it was also the least interesting as a perf/price upgrade for anything below the 4070 Ti, which launched in Jan, so that would also be 3 months.
I’m not sure I buy that bit of sophistry haha.

In any case even by your own estimate it’s a shorter time frame by a month. Look I never claimed this was a radical departure, just that it’s more accelerated than usual. Also, if you reread my initial post, you’ll see I said that one can also see this as a typical schedule just with the high end cards delayed. Either could be valid based on what we know right now.

(Also, if I’m in January, March is two months away, not three. If you go from early Jan to late March then it’s more like three, but these won’t release until at least the second half of January unless NVIDIA is sending reviewers their cards the day after the announcement.)
 
I’m not sure I buy that bit of sophistry haha.
The 4060 didn't bring anything new to the market; you could argue that it was even somewhat of a downgrade (due to VRAM size) in comparison to a similarly priced 3060. So in this case its launch was rather irrelevant, as was its timing. The same is true for the 4060 Ti and 4070, although the latter can be viewed as a minor upgrade depending on what you compare it to.

Look I never claimed this was a radical departure, just that it’s more accelerated than usual.
This may be more a result of them launching the first SKUs later than originally planned than accelerating the launch of lower end cards. My point is that this in itself isn't a sign of anything about the competition.
 
The 4060 didn't bring anything new to the market; you could argue that it was even somewhat of a downgrade (due to VRAM size) in comparison to a similarly priced 3060. So in this case its launch was rather irrelevant, as was its timing. The same is true for the 4060 Ti and 4070, although the latter can be viewed as a minor upgrade depending on what you compare it to.
That doesn’t negate that they’re new cards! I don’t disagree with your characterization of their value but saying they don’t count as releases is a stretch. It’s not like they’re minor refreshes of the old cards (new node, new functionality, etc).
 
Modern GPUs are highly complex system-on-chips with a massive number of functions and resources — starting from compute resources and power management and all the way to display engines and security — that must be managed.

These things are now managed by 10 to 40 custom RISC-V cores developed by Nvidia, depending on chip complexity. Nvidia started to replace its proprietary microcontrollers with RISC-V-based microcontroller cores in 2015, and by now, virtually all of its MCU cores are RISC-V-based, according to an Nvidia slide demonstrated at the RISC-V Summit.

By now, Nvidia has developed at least three RISC-V microcontroller cores: NV-RISCV32 (RV32I-MU, in-order single-issue core), NV-RISCV64 (RV64I-MSU, out-of-order dual-issue core), and NV-RVV (RV32I-MU, NV-RISCV32 + 1024-bit vector extension). These cores (and perhaps others) replaced the proprietary Falcon microcontroller unit, which was based on a different instruction set architecture. In addition, Nvidia has developed 20+ custom RISC-V extensions for extra performance, functionality, and security.

Perhaps the most important RISC-V-based part of Nvidia GPUs is its embedded GPU System Processor (GSP). According to Nvidia's website, the first GPUs to use a RISC-V-based GSP were based on the Turing architecture. This GSP offloads kernel driver functions, reduces GPU MMIO exposure to the CPU, and manages how the GPU is used.

Since MCU cores are universal, they can be used across Nvidia's products. As a result, in 2024, Nvidia is expected to ship around a billion RISC-V cores built into its GPUs, CPUs, SoCs, and other products, according to one of the demonstrated slides, which highlights the ubiquity of custom RISC-V cores in Nvidia's hardware.

In 2005, Intel CEO Paul Otellini surprised the company's board. According to a report from the New York Times, he suggested that Intel buy Nvidia for "as much as" $20 billion.

According to the Times's sources ("two people familiar with the boardroom discussion"), even some Intel executives thought that Nvidia's designs could eventually play an important role in data centers. While that idea would come to fruition with the modern AI boom, the board pushed back against it. It would have been Intel's most expensive acquisition, and there were worries about integrating the company. Otellini backed off, and that was that.

Instead, Intel's board backed an internal graphics project, Larrabee, which now-CEO Pat Gelsinger helmed. It used Intel's x86 technologies, and the resulting chip was a sort of hybrid of a CPU and GPU. Intel ultimately pulled the plug on the project, though it would later return to graphics with its Xe and Arc projects.
 
So the proposal happened around the time AMD bought ATi? Interesting.

I am guessing the deal would have fallen through due to Jensen insisting on being the CEO. NVIDIA was approached multiple times before, but the deals fell through for this exact reason.

In other news, India is building a gigawatt data center and NVIDIA is supplying Blackwell GPUs for it. India is also building another data center which will house tens of thousands of H100s.

 
I find it interesting because about a decade ago, when people were talking about “exascale” (meaning supercomputers of at least 1000 PFLOPS, or 1 EFLOPS), the concern was that no one thought it was practical to have a supercomputer requiring more than about 100MW of power (otherwise you could just pile on more GPUs, and the largest supercomputer at that time was around 60MW). Now people seem to talk about numbers like “1 GW” as if it’s nothing, all just because of “AI”.
 
Yep the only thing that changed is $$$$$$$.
 
Two things I would like to point out here:

Ten years ago, a single computer which could consume an entire gigawatt of power would've been monumental in size, to the point where it might as well be several hundred thousand square feet -- equal to the typical wing of a modern colocation datacenter itself. At such physical scale, it likely would've ended up being a clustered ecosystem of supercomputers, rather than "one supercomputer." Maybe the clustering comment is just semantics; "supercomputer" work isn't single threaded anyway. Today, with the newest commercial GPUs coming in at ~1 kW each, we can do 150 kW or more in a single 42U rack without accounting for cooling. Napkin maths says around 5000 racks packed with GPUs should consume roughly 750 MW and sit on ~40,000 sqft of raised floor. Then add in aisleways between those rack rows, a campus-wide water loop, the bigass pumps for the chiller system, the thermal plates for "cooling capacitance" in the chiller loop, some quantity of DX cooling for non-loop-chilled workloads (standard CRAC stuff), a room for an operations center, the necessary electrical rooms, UPS gear, and a pair of reactor rooms, and you've probably hit the 1 GW mark in a datacenter smaller than 100,000 sqft. There's no way we were building this sort of facility a decade ago.

Secondly, a decade ago we had no reasonable way to send an entire gigawatt of power to a single building; there are moderately sized cities that don't run on a gigawatt of power. As a simple example: the Tennessee Valley Authority (TVA) generates and distributes power to all of the state of TN, a substantial portion of Mississippi, the northernmost ~20% of Alabama, the southwestern-most ~20% of Kentucky, and several hundred square miles of Georgia, Virginia and the two Carolinas. As I'm typing this, the weather in this area of the country is around 78°F and sunny, it's about 5:30pm, and TVA is generating 19.2 GW for all of their coverage. (Link: https://www.eia.gov/electricity/gridmonitor/dashboard/electric_overview/balancing_authority/TVA) Imagine sending 5% of that power to just a single building. Even today that sounds ludicrous.

Yet today, and for the past several years thanks to NRC authorizations and certifications, it's now commercially possible to drop one (or more) modular fission reactor(s) into a datacenter campus and achieve GW-class baseload at less than $0.20 per kWh. This is crazy cheap for self-generation, especially when calculated over the decades-long lifespan of the unit even after considering fueling and maintenance costs.

Money isn't the only thing which has changed; so too has the technology for localized energy production.
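
For anyone who wants to check the napkin maths in this post, here is a rough Python sketch. The 5000 racks, 150 kW per rack, 1 GW target and 19.2 GW TVA reading are the figures quoted in the post; the function and variable names are made up for illustration, and nothing here models real cooling or electrical overhead.

```python
def it_load_mw(racks: int, kw_per_rack: float) -> float:
    """Total GPU rack load in MW for a rack count and per-rack draw in kW."""
    return racks * kw_per_rack / 1000.0


def share_of_generation(load_gw: float, generation_gw: float) -> float:
    """Fraction of a utility's current generation taken by a single load."""
    return load_gw / generation_gw


# Figures taken from the post, not measured data.
gpu_load_mw = it_load_mw(racks=5000, kw_per_rack=150)
# Whatever is left under a 1 GW (1000 MW) budget for cooling, pumps, UPS, etc.
overhead_mw = 1000 - gpu_load_mw
# 1 GW compared with the 19.2 GW TVA was generating at the time of writing.
tva_share = share_of_generation(1.0, 19.2)

print(f"GPU racks alone: {gpu_load_mw:.0f} MW")
print(f"Headroom before hitting 1 GW: {overhead_mw:.0f} MW")
print(f"1 GW as a share of 19.2 GW: {tva_share:.1%}")
```

With those inputs the racks come to 750 MW, leaving 250 MW for everything else, and 1 GW is a little over 5% of the TVA figure, which matches the numbers in the post.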
 
How long til any of these local nuclear power plants come online, though? We're still seeing a gross, insane explosion of energy usage from the grid for these AI farms until this happens.
 
The NRC only certified the very first one 18 months ago, so it's gonna be just a bit longer before they find themselves in use.

However, I'm directly familiar with two datacenters being built inside US shores which are designed around local, modular fission reactors. One of them is using the GE/Hitachi unit; I'm unsure of the other.
 
I wouldn’t exactly call this feasible, has this been actually done at all? I’ve seen proposals on paper and endless hype about modular reactors but they all end up in development hell essentially.

Supplying a GW from the grid to a building is incredibly difficult at the distribution level. I don’t want to claim any professional standing here since I’m a relatively new engineer, but I am an EE and do something like this for a living, and at best a distribution feeder has a rating of ~550A (or ~13MW at the voltage we use here, 13.8kV, which is more or less common throughout the country give or take a few kV, call it 15kV class). So serving this from a distribution standpoint is essentially impossible. Our substation xfmrs are rated at like 50 MVA (for laypeople, just consider this MW, which is fairly accurate as power factor is pretty close to .9 at the distribution level due to cap banks), so even if you built out dedicated feeders you would need dedicated transformers.

Of course, you can serve this at the transmission level, which obviously goes a lot higher than a 15kV class distribution feeder. I imagine a lot of these gigantic data centers are fed like this, although frankly I have no idea as I don’t work in transmission.
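
The ~13MW feeder figure above falls out of the standard three-phase apparent power formula, S = sqrt(3) x V(line-to-line) x I. Below is a small Python sketch of that arithmetic. The 13.8kV, 550A, 50 MVA and 1 GW numbers are from the post; treating MVA as MW (as the post suggests for laypeople) and ignoring redundancy, losses and loading limits are simplifying assumptions, and the names are purely illustrative.

```python
import math


def three_phase_mva(line_kv: float, amps: float) -> float:
    """Three-phase apparent power in MVA: S = sqrt(3) * V_line-to-line * I."""
    return math.sqrt(3) * line_kv * amps / 1000.0


# Feeder rating from the post: 13.8 kV line-to-line at ~550 A.
feeder_mva = three_phase_mva(line_kv=13.8, amps=550)

# Rough counts needed to carry 1 GW (treated as 1000 MVA for simplicity).
target_mva = 1000
feeders_needed = math.ceil(target_mva / feeder_mva)
transformers_needed = math.ceil(target_mva / 50)  # 50 MVA substation units

print(f"One feeder: {feeder_mva:.1f} MVA")
print(f"Feeders for 1 GW: {feeders_needed}")
print(f"50 MVA transformers for 1 GW: {transformers_needed}")
```

That works out to roughly 13 MVA per feeder, so on the order of 77 fully loaded feeders and 20 transformers for a single site, which is why serving a load like this at distribution voltage is a non-starter.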
 
I wouldn’t exactly call this feasible, has this been actually done at all? I’ve seen proposals on paper and endless hype about modular reactors but they all end up in development hell essentially.
Oh, as far as I've seen discussed online, there seems to be nothing at all especially challenging with SMRs.

Their main issue is bringing the costs down - SMRs still need to comply with all of the (too exaggerated, btw) safety regulations. So in most practical cases SMRs cost more per MW than the equivalent big reactors.
I guess the few deployments we'll get will happen due to the SMRs getting an adequate amount of subsidies and being able to fit the requirements of a given contract at the right time. My country is going to install one such reactor in the not so distant future.
 