Speculation and Rumors: Nvidia Blackwell ...

Broopster · Tuesday at 6:25 PM

Broopster said:
Wondering what the hold up is

I can see them holding up the 5090D/5080D designs to see how the trade restriction updates in October pan out.

pharma · Tuesday at 6:47 PM

Morgan Stanley: NVIDIA To Earn $10 Billion In Revenue From Blackwell Chips Alone In Q4 2024

wccftech.com

Now, Wall Street appears to be front-running NVIDIA's relatively cautious guidance by going a giant step forward. To wit, Morgan Stanley has now disclosed that it expects NVIDIA to ship 450,000 Blackwell chips in the December-ending quarter, earning ~$10 billion in revenue from this architecture alone:
...
While Morgan Stanley concedes that NVIDIA is still in the process of resolving a few "technical challenges" with its GB200 server racks, the Wall Street titan posits a qualifier that such issues are a part of a "normal debugging process for new product launches."

What's more, Morgan Stanley still sees a very healthy demand profile for NVIDIA's H200 chips, courtesy of sovereign AI projects and smaller cloud service providers continuing to expand their capacity.

DavidGraham · Thursday at 6:00 PM

Latest 5090 and 5080 specs:

https://twitter.com/x/status/1839343725727941060

https://twitter.com/x/status/1839345147789934794

Broopster · Thursday at 7:01 PM

DavidGraham said:
Latest 5090 and 5080 specs:

https://twitter.com/x/status/1839343725727941060

https://twitter.com/x/status/1839345147789934794

He’s still claiming a two slot cooler for the 5090 at 600w….

Albuquerque · Thursday at 7:08 PM

Maybe it's a language thing? Two slots in addition to the slot where the GPU plugs in (e.g. the PCB itself)?

Just trying to read charitably into someone's statements.

Broopster · Thursday at 7:17 PM

Albuquerque said:
Maybe it's a language thing? Two slots in addition to the slot where the GPU plugs in (e.g. the PCB itself)?

Just trying to read charitably into someone's statements.

I don’t know, he’s been saying it for months now. Unless he’s counting 2.9 slots as two-slot, I don’t see how you can cool that without liquid or very high noise. Also - why bother keeping it so slim?

DegustatoR · Thursday at 7:29 PM

These specs look weird, for the 5080 at least. With 10K FP32 ALUs it will have to run at >4GHz to beat the 4090 while consuming 50W less?
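A rough sketch of where the >4GHz figure comes from. Assumptions not in the post: the 4090 reference numbers are its public spec (16,384 FP32 ALUs, 2.52GHz boost), "10K" is taken as exactly 10,000 ALUs, and each ALU does the same work per clock on both architectures. Real game clocks and any per-clock gains in Blackwell would shift the result.

```python
def required_clock_ghz(ref_alus: int, ref_clock_ghz: float, new_alus: int) -> float:
    """Clock the new part needs to match the reference part's FP32 throughput.

    FP32 throughput scales as 2 FLOPs (one FMA) x ALUs x clock, so the
    factor of 2 cancels and only the ALU ratio matters.
    """
    return ref_alus * ref_clock_ghz / new_alus


# Assumed reference: RTX 4090 public spec (16,384 ALUs at 2.52 GHz boost).
# Assumed rumored part: 10,000 ALUs, rounded from the "10K" in the post.
clock = required_clock_ghz(ref_alus=16_384, ref_clock_ghz=2.52, new_alus=10_000)
print(f"Clock needed to match the 4090 on paper: {clock:.2f} GHz")
```

On these assumptions the break-even clock lands a little above 4GHz, which is the point being made: on raw FP32 alone the rumored 5080 would need an unusually large clock jump.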

Broopster · Thursday at 8:43 PM

DegustatoR said:
These specs look weird, for the 5080 at least. With 10K FP32 ALUs it will have to run at >4GHz to beat the 4090 while consuming 50W less?

I agree, though I don’t think there’s any guarantee it will beat the 4090. Worth noting that his 5080 spec would be a fully enabled chip. Also, I am still scratching my head at 32GB for a consumer card. It seems totally unnecessary for gaming and counterproductive if they’re looking to push more AI buyers towards workstation cards. If they are resigned to this being bought for AI, I worry the price could reflect that.

trinibwoy · Thursday at 9:52 PM

DegustatoR said:
These specs look weird, for the 5080 at least. With 10K FP32 ALUs it will have to run at >4GHz to beat the 4090 while consuming 50W less?

Maybe it’s only faster at RT or cheap enough that people won’t care that it doesn’t beat the 4090 (lol yeah right).

T2098 · Thursday at 10:02 PM

Broopster said:
I agree, though I don’t think there’s any guarantee it will beat the 4090. Worth noting that his 5080 spec would be a fully enabled chip. Also, I am still scratching my head at 32GB for a consumer card. It seems totally unnecessary for gaming and counterproductive if they’re looking to push more AI buyers towards workstation cards. If they are resigned to this being bought for AI, I worry the price could reflect that.

If it has a 512-bit bus, the only options are 16GB and 32GB. 16GB wouldn't fly seeing as the top spec card has been 24GB for the last two generations, and so 32GB is really the only choice. Apparently GDDR7 will have 'half step' chip sizes (24Gbit in addition to 16Gbit) at some point later on, but not at initial introduction.

That won't help the highest-spec card though, as to get a 512-bit memory bus you need to use 16 chips, and 16 x 24Gbit GDDR7 chips means 48GB of VRAM, making the problem even worse.
You need smaller, lower density 12Gbit chips to make 24GB with a 512-bit memory bus, and I haven't seen those on any GDDR7 roadmaps.

The 24Gbit GDDR7 chips will allow them to do cool things like make a 256-bit bus '5080 Super' refresh with 24GB VRAM though.
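The capacity arithmetic above can be sketched in a few lines. This assumes one memory chip per 32-bit channel (no clamshell mode, which would double the chip count), and the function name is just for illustration.

```python
def vram_gb(bus_width_bits: int, chip_density_gbit: int, chip_io_bits: int = 32) -> float:
    """Total VRAM in GB, assuming one chip per 32-bit channel (no clamshell)."""
    chips = bus_width_bits // chip_io_bits   # e.g. 512 / 32 = 16 chips
    return chips * chip_density_gbit / 8     # 8 Gbit = 1 GB


# Bus width / chip density combinations discussed in the thread.
for bus, density in [(512, 8), (512, 16), (512, 24), (512, 12), (256, 24), (448, 16)]:
    print(f"{bus}-bit bus with {density}Gbit chips: {vram_gb(bus, density):g} GB")
```

With 16Gbit chips a 512-bit bus gives 32GB, 24Gbit chips give 48GB, and only a 12Gbit chip would land on 24GB. A 256-bit bus with 24Gbit chips gives 24GB, and a cut-down 448-bit bus with 16Gbit chips would give 28GB.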

DegustatoR · Thursday at 10:21 PM

T2098 said:
If it has a 512-bit bus, the only options are 16GB and 32GB.

One would think that with the switch to G7 it could be fine with a cut down bus like 448 bits. The decision to keep the full bus on GB202 also seems weird in these specs. Kopite has been wrong before.

Broopster · Thursday at 11:11 PM

DegustatoR said:
One would think that with the switch to G7 it could be fine with a cut down bus like 448 bits. The decision to keep the full bus on GB202 also seems weird in these specs. Kopite has been wrong before.

My thoughts exactly - no reason they couldn’t cut down the bus and have enough bandwidth, unless the die really is that much of a beast that it needs it. Seems unlikely. As you say, he’s been wrong on stuff before. Also curious that they’d do a fully enabled die on the 5080 - are yields on 4N really that good?

Seanspeed · Thursday at 11:54 PM

DegustatoR said:
These specs look weird, for the 5080 at least. With 10K FP32 ALUs it will have to run at >4GHz to beat the 4090 while consuming 50W less?

I think the most sensible explanation in this situation would be that it won't beat a 4090. There was never any rule it had to.

Broopster said:
My thoughts exactly - no reason they couldn’t cut down the bus and have enough bandwidth, unless the die really is that much of a beast that it needs it. Seems unlikely. As you say, he’s been wrong on stuff before. Also curious that they’d do a fully enabled die on the 5080 - are yields on 4N really that good?

Yields on N4 should be freaking amazing by now. N5 was already yielding great years ago and there's usually no notable yield regression when moving to evolutions of the same process like here.

And there's no excuse for the 5080 to not be fully enabled. It was bad enough the 4080 wasn't, and 5080 is similarly going to be another upper midrange part sold as a high end part. It's gonna be another sub 400mm² die with a 256-bit bus.

T2098 · Thursday at 11:55 PM

Broopster said:
My thoughts exactly - no reason they couldn’t cut down the bus and have enough bandwidth, unless the die really is that much of a beast that it needs it. Seems unlikely. As you say, he’s been wrong on stuff before. Also curious that they’d do a fully enabled die on the 5080 - are yields on 4N really that good?

My hypothesis is that if the rumors are true, they've scaled back some of the SRAM allocated to on-die cache, choosing to use more of that area for other things, leaning on the external memory bus to do the heavy lifting, Ampere-style. Maybe GDDR7 is cheap and power efficient enough that it makes sense this generation, or maybe they're going close to the reticle limit and there simply wasn't enough room for the giant on-chip cache (for GB202 anyway), especially with how poorly SRAM scales on cutting edge processes.

A 512-bit bus kind of necessitates a giant chip, if for no other reason than you need the space around the edges of the die to fit the 16 memory controller channels.
AD102 at 609mm² on N4 doesn't look like it has space in the floorplan for 4 more memory channels, even if you omit the NVLink bits entirely.

Boss · Friday at 1:01 AM

DavidGraham said:
Latest 5090 and 5080 specs:

https://twitter.com/x/status/1839343725727941060

https://twitter.com/x/status/1839345147789934794

I truly hope the power consumption figure is not correct. 600 watts to play games is insane.

arandomguy · Friday at 1:59 AM

Well you won't need 600 watts to play games if that isn't appealing to you.

pharma · Friday at 2:27 AM

I think the rumored 600 watts was the figure for "maximum heat dissipation capabilities of the radiator".

trinibwoy · Friday at 2:32 AM

Boss said:
I truly hope the power consumption figure is not correct. 600 watts to play games is insane.

There will be other cards that don’t need 600w and play games just fine.

Broopster said:
My thoughts exactly - no reason they couldn’t cut down the bus and have enough bandwidth, unless the die really is that much of a beast that it needs it. Seems unlikely. As you say, he’s been wrong on stuff before. Also curious that they’d do a fully enabled die on the 5080 - are yields on 4N really that good?

The 4090 has a relative deficit of bandwidth compared to the 4080. 36% more bandwidth for 60% more flops/SMs. If these rumors are true the 5090 will be better balanced. 100% more bandwidth than the 5080 for 100% more SMs.
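To put a number on "better balanced", here is a minimal sketch that divides bandwidth growth by compute growth, using only the percentages quoted above. The `balance` helper is hypothetical, and a value of 1.0 just means bandwidth and SM count scale by the same factor.

```python
def balance(extra_bandwidth_pct: float, extra_compute_pct: float) -> float:
    """Bandwidth growth divided by compute growth between two cards.

    1.0 means both scale equally; below 1.0 means the bigger card has
    less bandwidth per SM than the smaller one.
    """
    return (1 + extra_bandwidth_pct / 100) / (1 + extra_compute_pct / 100)


ada = balance(36, 60)          # 4090 relative to 4080, figures from the post
blackwell = balance(100, 100)  # rumored 5090 relative to rumored 5080
print(f"4090 vs 4080: {ada:.2f}")
print(f"Rumored 5090 vs 5080: {blackwell:.2f}")
```

By that measure the 4090 has about 15% less bandwidth per SM than the 4080, while the rumored 5090 would keep the same ratio as the 5080.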

techuse · Friday at 7:10 AM

Has HBM seen no movement in the last decade?

Triskaine · Friday at 7:55 AM

It has seen plenty of movement, mostly up...
The HBM equation is now pay more to have more, making it utterly infeasible for products under $10k (i.e. everything consumer/prosumer).
