Speculation and Rumors: Nvidia Blackwell ...

We can try to extrapolate architectural efficiency by looking at 400W performance, though. If the 5080 really does need 400W to beat the 4090 by 10%, that doesn't seem very impressive. At 400W the 4090 barely loses any performance.
Again, there's no way of knowing how impressive or not that is without knowing how GB203 compares to AD102, and how much performance a 5080 would lose from a similar -50W power change.
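A quick sketch of what the iso-power comparison implies if the rumor is taken at face value; the 4090's performance retention at 400W (~97% of stock) is an assumption for illustration, not a measured figure:

```python
# Rough iso-power comparison, taking the rumor at face value.
# Numbers are illustrative: 1.10 is the rumored figure above; the
# 4090's retention at 400 W is an assumption, not a measurement.

perf_4090_stock = 1.00   # 4090 at its stock 450 W TGP, used as the baseline
perf_4090_400w  = 0.97   # assumed: "at 400W the 4090 barely loses any performance"
perf_5080_400w  = 1.10   # rumored: 5080 beats the 4090 by 10% at 400 W

iso_power_gain = perf_5080_400w / perf_4090_400w
print(f"5080 vs 4090 at the same 400 W: {iso_power_gain:.2f}x perf and perf/W")
# ~1.13x -- real, but modest for a new generation, which is the crux of the
# disagreement: is that GB203 being efficient, or just AD102 pushed off its curve?
```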
 
Benchlife chips in on the 600W rumor ... seems a bit more reasonable.

"Earlier we mentioned that the GeForce RTX 5090 may have a TGP (Total Graphics Power) of 550W, but the latest news from Kopite7kimi mentioned that the GeForce RTX 5090 may reach the upper limit of 600W. In addition to the NVIDIA GeForce RTX 5090, Kopite7kimi has also confirmed that the GeForce RTX 5080 will reach 400W.
...
There was indeed a 400W NVIDIA cooling module case earlier, but it is currently in the cancellation or suspension stage;
as for the 600W cooling module case, from the beginning to Now, it has not stopped, and the module manufacturer has also confirmed that NVIDIA currently has 5 GeForce RTX graphics card cases in progress.

Perhaps, 400W and 600W are the maximum heat dissipation capabilities of the radiator. In fact,
the TGP of GeForce RTX 5090 and GeForce RTX 5080 are 550W and 350W respectively.


If there are no surprises, the Blackwell GPU architecture GeForce RTX 5090/D and GeForce RTX 5080/D are scheduled to be officially launched in September.
If it is as our informant told us, there will soon be more information on the Internet for reference.
"
 
Sounds like a prime opportunity for undervolting, again. Let's hope NVIDIA lets us scrape some of the lower VID bins this time; the 40-series limited the lowest voltage bins to around 843 or 850 mV depending on the SKU.
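For intuition on why those voltage bins matter: dynamic power scales roughly with f·V², so dropping a couple hundred millivolts buys a lot. A toy sketch; 850 mV is the 40-series floor mentioned above, while the 1050 mV stock point and the clocks are assumed examples, not known specs:

```python
# Toy model: dynamic power ~ f * V^2 (capacitance folded into the constant,
# leakage ignored). Voltages are illustrative: 850 mV is the 40-series bin
# floor mentioned above; 1050 mV is an assumed stock point, not a known spec.

def rel_dynamic_power(freq_ghz: float, volts: float) -> float:
    return freq_ghz * volts ** 2

stock       = rel_dynamic_power(2.75, 1.050)
undervolted = rel_dynamic_power(2.60, 0.850)  # small clock sacrifice

print(f"undervolted ~ {undervolted / stock:.0%} of stock dynamic power")
# ~62% -- which is why access to the lower VID bins matters so much.
```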
 
In no world does Nvidia need to make these GPUs handle 400W+ of power as standard. All it does is make the graphics cards bigger and more expensive for everybody. I hate it so much. We used to have reasonable TDPs for normal cards, and then if you really wanted the monster graphics cards that could handle big overclocks and high power draw, you could shell out for those as an option.

It's even worse because it's creating this huge misconception among the average PC gamer that everything is getting less efficient, when that's very much not the case. So few people (outside a forum like this) seem to grasp that TDP is a chosen spec, not an inherent property of the processor. A lot more people might complain about the situation if they actually understood what was happening.
 
Does anyone have a good idea of how the 5090D could become sanctions compliant? Something I don't see mentioned anywhere: if they need a D model of the 5080, how would a GB202 model become compliant unless it was cut down to 5080 levels (at which point, why waste so much silicon)? Maybe the 5090D is based on GB203 like the laptop version, but with more SMs and bandwidth? Rumors don't indicate that.

I understand that the sanctions are based on theoretical throughput limits, not gaming performance. So the only other explanation I can think of is that their SM redesign is so radical that they can limit tensor throughput somehow, which seems unlikely. Or am I missing something really obvious?
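For concreteness, the metric in question is TPP (Total Processing Performance): roughly peak tensor throughput multiplied by the bit length of the operation, with 4800 as the widely reported control threshold. A back-of-envelope sketch; the throughput figures below are illustrative assumptions, not official specs:

```python
# Back-of-envelope TPP (Total Processing Performance), the metric the US
# export rule keys on: roughly peak tensor TOPS times the bit length of the
# operation, with TPP >= 4800 controlled. Throughput figures below are
# illustrative assumptions, not official NVIDIA specs.

def tpp(peak_tops: float, bit_length: int) -> float:
    return peak_tops * bit_length

print(tpp(660, 8))   # ~5280: roughly where the 4090 is reported to land
print(tpp(590, 8))   # ~4720: the sort of cut a "D" model needs to slip under

# The puzzle above: a full GB202 should sail far past 4800 at any plausible
# clocks, so a compliant 5090D implies either a very deep cut or some
# hardware cap on tensor throughput.
```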
 
"If there are no surprises, the Blackwell-architecture GeForce RTX 5090/D and GeForce RTX 5080/D are scheduled to launch officially in September."

First time a September launch date has been mentioned, I think, with earlier rumours all saying late 2024/CES '25. Usually if the launch is this close we get a lot of leaked hardware images, silicon, box art, etc. Unless I've missed it, we don't seem to have any of those yet. Could still be a paper launch in late September with retail availability in October perhaps.
 
Could still be a paper launch in late September with retail availability in October perhaps
I have a feeling something may have been lost in translation - I suspect the designs are being finalized and silicon sent to partners this month for a launch in November.
 
Oracle says that the forthcoming Blackwell supercluster with 131,072 Blackwell GPUs is rated at 2.4 zettaflops. That math checks out at FP4 precision, with the B200 rated at 18 petaflops of aggregate oomph on the tensor cores in Blackwell. If you multiply that out, you get 2,359.3 exaflops of FP4 peak, and that rounds up to 2.4 zettaflops. However, divide that by four to use the FP16 precision that most LLM makers want to use if they can, and that is only 589.8 exaflops.

For FP64 performance, which is important for segments of certain AI workloads and for HPC simulation and modeling, the vector or tensor cores of the Big Larry cluster coming next year will only deliver 5.24 exaflops of FP64 oomph across that fleet of Blackwell GPUs. That is five times the peak performance of the “Frontier” supercomputer at Oak Ridge National Laboratory and probably two times the peak FP64 performance of the impending “El Capitan” supercomputer at Lawrence Livermore National Laboratory.

Mind you, at 5.24 exaflops, that Oracle “machine” would still count as the largest HPC system in the world, if Oracle would let you rent it all at once. The odds certainly favor Oracle selling this machine in chunks to many people, but with that number of Blackwell allocations all in one place, maybe not. Perhaps there will only be a few customers who get access to the OCI supercluster so they can train their models
...
And a few minutes later, when Ellison was talking again, he added a thought about the power these AI datacenters require.

“Let me say something that’s going to sound really bizarre. Well, you would probably say, well, he says bizarre things all the time, so why is he announcing this one? It must be really bizarre. So we are in the middle of designing a datacenter that’s north of a gigawatt – we found the location and the power for the place. We look at it, and they have already got building permits for three nuclear reactors. These are the small modular nuclear reactors to power the datacenter. This is how crazy it is getting.”
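Sanity-checking the arithmetic in that excerpt (the 18 PF FP4 per-GPU figure is from the quote; the ~40 TF FP64 per GPU is just inferred by dividing the quoted total by the GPU count, not an official spec):

```python
# Sanity check on the quoted Oracle/Blackwell math. The 18 PF FP4 per-GPU
# figure comes from the excerpt; the FP64-per-GPU number is inferred by
# dividing the quoted 5.24 EF by the GPU count, not an official spec.

gpus = 131_072
fp4_pf_per_gpu = 18                      # petaflops, from the excerpt

fp4_ef  = gpus * fp4_pf_per_gpu / 1000   # ~2359.3 EF = ~2.4 ZF, as quoted
fp16_ef = fp4_ef / 4                     # ~589.8 EF, as quoted

fp64_tf_per_gpu = 5.24e6 / gpus          # quoted exaflops -> teraflops per GPU
print(f"{fp4_ef:.1f} EF FP4, {fp16_ef:.1f} EF FP16, "
      f"~{fp64_tf_per_gpu:.0f} TF FP64 per GPU implied")
```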
 
Could we perhaps separate Data Center / AI from the actual graphics thread? Or maybe talk about those topics solely in the Industry Section? Maybe it's just me, but I find it very annoying to come see a new post about Blackwell and find nothing in it that's actually graphics-related. Lately this topic is mostly an echo chamber of press releases from companies that have nothing to do with graphics, just because they are NVIDIA customers. There is already a thread about NVIDIA's business; no need to replicate the same here?
 
I think once there are actually Blackwell architectural documents available, there will be a dedicated Blackwell graphics thread. This is the speculation and rumors thread, so I doubt anything mentioned will hold sway, though discussion of rumored Blackwell FP64 performance and exaflops in comparison to existing supercomputers is relevant to the thread.

Creating a new thread to hold factual Blackwell specifications and information might be ideal for avoiding the speculation thread.
 
An oblique Blackwell (gaming) question: is NVIDIA doing a second GTC this year? Don’t see anything online. If not, I’m guessing they could launch the gaming chips on a GeForce streaming event - can’t imagine they’re waiting until GTC 2025 in San Jose.
 
The same rumor, however, says that the 5080 at 400W should be 10% faster than the 4090
I’m skeptical they could push performance that much higher on architectural and clock speed improvements alone, unless the 10% refers to RT. GB203 only has 4 more SMs than AD103. Would be happy to be proven wrong on this.
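Rough numbers behind that skepticism; the ~1.30x 4090-over-4080 gap is my assumption in the ballpark of launch reviews, and the SM counts are rumored:

```python
# How much per-SM throughput (clocks x architecture) a GB203-based 5080
# would need to beat the 4090 by 10%. The ~1.30x 4090-over-4080 gap is an
# assumption from typical launch reviews; SM counts are rumored.

target_over_4080 = 1.10 * 1.30   # 10% over a 4090 that is ~30% over a 4080
sm_ratio = 84 / 80               # rumored GB203 SMs vs AD103 SMs

per_sm_uplift = target_over_4080 / sm_ratio
print(f"needed per-SM uplift: ~{per_sm_uplift:.2f}x")
# ~1.36x from clocks + IPC alone on a similar node -- hence the skepticism.
```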
 
It doesn't immediately seem like it, but what an SM entails changes more often than not from gen to gen. Only Maxwell to Pascal and Ampere to Ada really kept the SM relatively consistent; every other gen had drastic changes in layout at the sub-SM level, making them incomparable.

As such, leaks based on SM count may not really mean much in the grand scheme of things.
 
An oblique Blackwell (gaming) question: is NVIDIA doing a second GTC this year? Don’t see anything online. If not, I’m guessing they could launch the gaming chips on a GeForce streaming event - can’t imagine they’re waiting until GTC 2025 in San Jose.

No, looking at the calendar, there are only AI events left. If there is anything about new GPUs, it will be at CES 2025.

 
The last four or so launches weren't aligned with any event, so why would anyone expect that for Blackwell?
He did say it was an oblique question to Blackwell, for the record.

o·blique
[əˈblēk]

adjective
  1. neither parallel nor at a right angle to a specified or implied line; slanting:
    "we sat on the settee oblique to the fireplace"
 
The last four or so launches weren't aligned with any event, so why would anyone expect that for Blackwell?
Only thing I can think of is that NVIDIA is so interested in keeping the focus on AI that they just stream a release announcement on short notice, like the PS5 Pro. Wondering what the hold-up is - maybe GDDR7 supply. I know xpea threw cold water on the CES rumor back in July, but things could have changed - certainly the lack of AIB leaks suggests they're not arriving this year - I doubt they'd launch anything in December.
 
It doesn't immediately seem like it, but what an SM entails changes more often than not from gen to gen. Only Maxwell to Pascal and Ampere to Ada really kept the SM relatively consistent; every other gen had drastic changes in layout at the sub-SM level, making them incomparable.

As such, leaks based on SM count may not really mean much in the grand scheme of things.
Fair point - I guess I’m going partly off the potential transistor density, because at the current transistor count per SM, GB202 will be approaching the limit unless they make a big cut to cache (unlikely, I think). They aren’t going to get much from the node. But hey, I hope NVIDIA can pull it off - they’ve done it before.
 