Speculation and Rumors: Nvidia Blackwell ...

Reddit has another slide:
[SemiAnalysis slide: NVIDIA B100/B200/GB200 COGS, pricing, and margins]


1.8 TB/s of off-chip bandwidth. Four years ago the A100 supported 2.02 TB/s with HBM2. Being able to connect 72 B100s with 14 TB of memory is just out of this world.
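For a rough sanity check on that 14 TB figure, assuming the rumored ~192 GB of HBM3e per Blackwell GPU (a rumor, not a confirmed spec):

```python
# Back-of-the-envelope: total HBM capacity across a 72-GPU NVLink domain.
# The 192 GB-per-GPU figure is taken from the rumors, not from confirmed specs.
gpus = 72
hbm_per_gpu_gb = 192  # rumored HBM3e capacity per GPU

total_tb = gpus * hbm_per_gpu_gb / 1000
print(f"{total_tb:.1f} TB of HBM in the domain")  # ~13.8 TB, i.e. the "14 TB" figure
```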
 
I thought everyone knew this, because NVIDIA wrote and spoke about it four years ago...

I remember that the interconnect speed in GA100 was 7.2 TB/s.
 
This might explain why the GB202 is rumored to be 512-bit: it's multiple dies connected together, appearing as one huge gaming die. It would also explain why it's called GB202 and not GB102.
 
The interlink bandwidth sounds incredible, but otherwise "two H100s stapled together (in terms of compute resources) but without double the HBM" isn't exactly what I was expecting out of Nvidia.
 
It was previously expected that NVIDIA would leverage TSMC's 3nm process node for the gaming chips, but that plan has seemingly changed, as Kopite7kimi now states that both the Blackwell AI Tensor Core and Gaming GPUs will be fabricated on a very similar process node. Just a few hours ago, we came to know that NVIDIA will be using TSMC's 4NP node, a variation of the 5nm node that was already used for Ada Lovelace and Hopper GPUs.

It is stated that the new process node will allow a 30% increase in transistor density, which can lead to higher performance, but the actual efficiency advantages are yet to be explained. TSMC doesn't explicitly list a 4NP process node anywhere on its webpage; it only mentions N4P, which is described as an extension of the N5 platform with an 11% performance boost over N5 and a 6% boost over N4.
...
He also mentions that the GB203 GPU, the next in the Blackwell Gaming GPU lineup, will be half of the GB202, similar to the AD102 and AD103 GPUs. This will lead to a huge disparity in performance if NVIDIA equips the next 90-series cards with the GB202 and the 80-series cards with the GB203. The biggest question is whether NVIDIA will utilize MCM (Multi-Chip Module) packaging for its Blackwell Gaming GPUs or keep them monolithic for now. Given the increasing costs and yield issues associated with GPU/chip development, the chiplet route is indeed the way of the future, and AMD's Radeon division has already embraced it.
 
The interlink bandwidth sounds incredible, but otherwise "two H100s stapled together (in terms of compute resources) but without double the HBM" isn't exactly what I was expecting out of Nvidia.
The B100 doubled the HBM3 memory from the H100 and has 2.4x more bandwidth. It is basically an H100 NVL with 8 TB/s instead of 600 GB/s (NVLink).
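That 2.4x roughly checks out if you put the H100 SXM's published 3.35 TB/s of HBM3 bandwidth against the rumored 8 TB/s for B100 (the B100 figure is still a rumor):

```python
# Ratio behind the "2.4x more bandwidth" claim.
h100_hbm3_tbs = 3.35   # published H100 SXM HBM3 bandwidth
b100_hbm3e_tbs = 8.0   # rumored B100 HBM3e bandwidth

print(f"{b100_hbm3e_tbs / h100_hbm3_tbs:.2f}x")  # ~2.39x
```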
 

NVLink Switch generational comparison:
- First generation (NVIDIA Volta architecture): up to 8 GPUs with a direct connection within an NVLink domain, 300 GB/s NVSwitch GPU-to-GPU bandwidth, 2.4 TB/s total aggregate bandwidth
- Second generation (NVIDIA Ampere architecture): up to 8 GPUs, 600 GB/s GPU-to-GPU, 4.8 TB/s aggregate
- Third generation (NVIDIA Hopper architecture): up to 8 GPUs, 900 GB/s GPU-to-GPU, 7.2 TB/s aggregate
- NVLink Switch (NVIDIA Blackwell architecture): up to 576 GPUs, 1,800 GB/s GPU-to-GPU, 1 PB/s aggregate
Preliminary specifications; may be subject to change.


NVLink generational comparison:
- Second generation (NVIDIA Volta architecture): 300 GB/s NVLink bandwidth per GPU, maximum of 6 links per GPU
- Third generation (NVIDIA Ampere architecture): 600 GB/s per GPU, maximum of 12 links
- Fourth generation (NVIDIA Hopper architecture): 900 GB/s per GPU, maximum of 18 links
- Fifth generation (NVIDIA Blackwell architecture): 1,800 GB/s per GPU, maximum of 18 links
Preliminary specifications; may be subject to change.
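If you're wondering how the columns in those two tables relate, the aggregate NVSwitch number is just the per-GPU NVLink bandwidth times the number of GPUs in the domain, and the per-link rate falls out of the link count. A quick sketch using only the (preliminary) figures above:

```python
# Derived relationships from the two NVLink tables above (preliminary figures).
# Per generation: (GPUs in an NVLink domain, per-GPU bandwidth in GB/s, links per GPU)
gens = {
    "Volta":     (8,   300,  6),
    "Ampere":    (8,   600,  12),
    "Hopper":    (8,   900,  18),
    "Blackwell": (576, 1800, 18),
}

for name, (gpus, per_gpu_gbs, links) in gens.items():
    aggregate_tbs = gpus * per_gpu_gbs / 1000   # "total aggregate bandwidth" row
    per_link_gbs = per_gpu_gbs / links          # bidirectional bandwidth per NVLink link
    print(f"{name}: {aggregate_tbs:g} TB/s aggregate, {per_link_gbs:g} GB/s per link")

# Blackwell: 576 * 1,800 GB/s ≈ 1,036.8 TB/s, i.e. the "1 PB/s" in the table.
```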
 
The interlink bandwidth sounds incredible, but otherwise "two H100s stapled together (in terms of compute resources) but without double the HBM" isn't exactly what I was expecting out of Nvidia.
It's honestly super unexciting. Per processor, the improvements really aren't that big at all. They must be quite assured of their current lead, given this is supposed to be their new flagship for the next two years.

This new era of 'more performance by using more silicon' is gonna kinda suck.
 