Speculation and Rumors: Nvidia Blackwell ...

Now, Wall Street appears to be front-running NVIDIA's relatively cautious guidance by going a giant step further. To wit, Morgan Stanley has now disclosed that it expects NVIDIA to ship 450,000 Blackwell chips in the December-ending quarter, earning ~$10 billion in revenue from this architecture alone:
...
While Morgan Stanley concedes that NVIDIA is still in the process of resolving a few "technical challenges" with its GB200 server racks, the Wall Street titan adds the qualifier that such issues are part of a "normal debugging process for new product launches."

What's more, Morgan Stanley still sees a very healthy demand profile for NVIDIA's H200 chips, courtesy of sovereign AI projects and smaller cloud service providers continuing to expand their capacity.
 
Maybe it's a language thing? Two slots in addition to the slot where the GPU plugs in (e.g. the PCB itself?)

Just trying to read charitably into someone's statements.
 
I don’t know, he’s been saying it for months now. Unless he’s counting 2.9 slots as two-slot, I don’t see how you can cool that without liquid or very high noise. Also, why bother keeping it so slim?
 
These specs look weird, for the 5080 at least. With 10K FP32 ALUs it will have to run at >4GHz to beat the 4090 while consuming 50W less?
I agree, though I don’t think there’s any guarantee it will beat the 4090. Worth noting that his 5080 spec would be a fully enabled chip. Also, I am still scratching my head at 32GB for a consumer card. It seems totally unnecessary for gaming and counterproductive if they’re looking to push more AI buyers towards workstation cards. If they are resigned to this being bought for AI, I worry the price could reflect that.
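For what it's worth, the >4GHz figure can be sanity-checked with a quick back-of-the-envelope script. This is only a rough sketch: it takes the rumored ~10K FP32 ALU count from the post above, the 4090's public spec of 16,384 FP32 ALUs at a 2.52 GHz boost clock, and assumes 2 FLOPs per ALU per clock (one FMA). The function and constant names are made up for illustration, and raw FP32 throughput is not the same thing as game performance.

```python
# Back-of-the-envelope FP32 throughput comparison.
# Assumption: each FP32 ALU retires one FMA (2 FLOPs) per clock.
FLOPS_PER_ALU_PER_CLOCK = 2


def fp32_tflops(alus: int, clock_ghz: float) -> float:
    """Peak FP32 throughput in TFLOPS for a given ALU count and clock."""
    # ALUs * FLOPs/clock * GHz gives GFLOPS; divide by 1000 for TFLOPS.
    return alus * FLOPS_PER_ALU_PER_CLOCK * clock_ghz / 1000.0


def parity_clock_ghz(target_tflops: float, alus: int) -> float:
    """Clock (GHz) a chip with `alus` ALUs needs to reach `target_tflops`."""
    return target_tflops * 1000.0 / (alus * FLOPS_PER_ALU_PER_CLOCK)


# RTX 4090 public spec: 16,384 FP32 ALUs, 2.52 GHz boost clock.
RTX_4090_ALUS = 16_384
RTX_4090_BOOST_GHZ = 2.52

# Rumored figure from the post ("10K FP32 ALUs"); not a confirmed spec.
RUMORED_5080_ALUS = 10_000

target = fp32_tflops(RTX_4090_ALUS, RTX_4090_BOOST_GHZ)
needed = parity_clock_ghz(target, RUMORED_5080_ALUS)
print(f"4090 peak FP32: {target:.1f} TFLOPS")
print(f"Clock needed for parity with {RUMORED_5080_ALUS} ALUs: {needed:.2f} GHz")
```

On paper that lands a little above 4 GHz, so the point stands: on raw FP32 alone the rumored part would need either a very large clock bump or real per-ALU gains to match the 4090.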
 

If it has a 512-bit bus, the only options are 16GB and 32GB. 16GB wouldn't fly seeing as the top spec card has been 24GB for the last two generations, and so 32GB is really the only choice. Apparently GDDR7 will have 'half step' chip sizes (24Gbit in addition to 16Gbit) at some point later on, but not at initial introduction.

That won't help the highest-spec card though, as to get a 512-bit memory bus you need to use 16 chips, and 16 x 24Gbit GDDR7 chips means 48GB of VRAM, making the problem even worse.
You need smaller, lower density 12Gbit chips to make 24GB with a 512-bit memory bus, and I haven't seen those on any GDDR7 roadmaps.

The 24Gbit GDDR7 chips will allow them to do cool things like make a 256-bit bus '5080 Super' refresh with 24GB VRAM though.
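
The capacity math above is easy to reproduce. A minimal sketch, assuming one memory chip per 32-bit slice of the bus (which is what "512-bit needs 16 chips" implies) and no clamshell (two chips per channel) configurations; the function name is just for illustration.

```python
# VRAM capacity as a function of bus width and per-chip density.
# Assumption: one chip per 32-bit slice of the bus, no clamshell mode.
BITS_PER_CHIP = 32


def vram_gb(bus_width_bits: int, chip_density_gbit: int) -> float:
    """Total VRAM in GB for a given bus width and chip density (in Gbit)."""
    chips = bus_width_bits // BITS_PER_CHIP
    return chips * chip_density_gbit / 8  # 8 Gbit per GB


for bus in (256, 512):
    for density in (12, 16, 24):
        print(f"{bus}-bit bus, {density}Gbit chips: {vram_gb(bus, density):g} GB")
```

That gives 32GB and 48GB for a 512-bit bus with 16Gbit and 24Gbit chips, 24GB only with the 12Gbit chips nobody has announced, and 24GB for a 256-bit bus with 24Gbit chips.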

 
One would think that with the switch to G7 it could be fine with a cut down bus like 448 bits. The decision to keep the full bus on GB202 also seems weird in these specs. Kopite has been wrong before.
My thoughts exactly - no reason they couldn’t cut down the bus and have enough bandwidth, unless the die really is that much of a beast that it needs it. Seems unlikely. As you say, he’s been wrong on stuff before. Also curious that they’d do a fully enabled die on the 5080 - are yields on 4N really that good?
 
These specs look weird, for the 5080 at least. With 10K FP32 ALUs it will have to run at >4GHz to beat the 4090 while consuming 50W less?
I think the most sensible explanation in this situation would be that it won't beat a 4090. There was never any rule it had to.

My thoughts exactly - no reason they couldn’t cut down the bus and have enough bandwidth, unless the die really is that much of a beast that it needs it. Seems unlikely. As you say, he’s been wrong on stuff before. Also curious that they’d do a fully enabled die on the 5080 - are yields on 4N really that good?
Yields on N4 should be freaking amazing by now. N5 was already yielding great years ago and there's usually no notable yield regression when moving to evolutions of the same process like here.

And there's no excuse for the 5080 to not be fully enabled. It was bad enough the 4080 wasn't, and 5080 is similarly going to be another upper midrange part sold as a high end part. It's gonna be another sub 400mm² die with a 256-bit bus.
 
My thoughts exactly - no reason they couldn’t cut down the bus and have enough bandwidth, unless the die really is that much of a beast that it needs it. Seems unlikely. As you say, he’s been wrong on stuff before. Also curious that they’d do a fully enabled die on the 5080 - are yields on 4N really that good?
My hypothesis is that if the rumors are true, they've scaled back some of the SRAM allocated to on-die cache, choosing to use more of that area for other things, leaning on the external memory bus to do the heavy lifting, Ampere-style. Maybe GDDR7 is cheap and power efficient enough that it makes sense this generation, or maybe they're going close to the reticle limit and there simply wasn't enough room for the giant on-chip cache (for GB202 anyway), especially with how poorly SRAM scales on cutting edge processes.

A 512-bit bus kind of necessitates a giant chip, if for no other reason than you need the space around the edges of the die to fit the 16 memory controller channels.
AD102 at 609mm^2 on N4 doesn't look like it has space in the floorplan for 4 more memory channels, even if you omit the NVLink bits entirely.
 
I truly hope the power consumption figure is not correct. 600 watts to play games is insane.

There will be other cards that don’t need 600w and play games just fine.

My thoughts exactly - no reason they couldn’t cut down the bus and have enough bandwidth, unless the die really is that much of a beast that it needs it. Seems unlikely. As you say, he’s been wrong on stuff before. Also curious that they’d do a fully enabled die on the 5080 - are yields on 4N really that good?

The 4090 has a relative deficit of bandwidth compared to the 4080. 36% more bandwidth for 60% more flops/SMs. If these rumors are true the 5090 will be better balanced. 100% more bandwidth than the 5080 for 100% more SMs.
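
To put a number on "better balanced", here is a small sketch using only the percentages quoted above (36% / 60% for the 4090 vs the 4080, and the rumored 100% / 100% for the 5090 vs the 5080). The function name is hypothetical, and the rumored figures are just that.

```python
# Bandwidth available per SM on the bigger card, relative to the smaller card.
# Inputs are fractional gains, e.g. 0.36 means "36% more".


def bandwidth_per_sm_ratio(bandwidth_gain: float, sm_gain: float) -> float:
    """Return (bandwidth per SM, big card) / (bandwidth per SM, small card)."""
    return (1 + bandwidth_gain) / (1 + sm_gain)


# Figures quoted in the post: 4090 vs 4080.
ada = bandwidth_per_sm_ratio(0.36, 0.60)
# Rumored figures quoted in the post: 5090 vs 5080.
blackwell = bandwidth_per_sm_ratio(1.00, 1.00)

print(f"4090 vs 4080: {ada:.2f}x bandwidth per SM")
print(f"5090 vs 5080 (rumored): {blackwell:.2f}x bandwidth per SM")
```

In other words, going by those figures each 4090 SM gets about 15% less bandwidth than a 4080 SM, while the rumored 5090 would keep the ratio the same as the 5080.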
 
It has seen plenty of movement, mostly up...
The HBM equation is now pay more to have more, making it utterly infeasible for anything under $10k (i.e. everything consumer/prosumer).
 