Speculation and Rumors: Nvidia Blackwell ...

I’d like to throw a question out there regarding GB202 and GB203 L2 cache. Despite our proximity to launch, I haven’t seen any leaks about the cache size. The 128 MB rumor has been out for well over a year and never confirmed by anyone to my knowledge. It could easily have been an estimation based on the assumption that the core/cache ratio would stay the same as Ada. Sounds like that could be tough with only 22% more die space (on GB202) and added MCs, though AMD managed to cut down the L3 cache size on 4nm for Zen 5, so it’s not impossible I guess. GB203 shouldn’t have the same problem so I’d guess that cache would stay the same.

Curious if anyone has seen anything or has any thoughts on this.
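For what it's worth, here's a quick back-of-the-envelope showing where 128 MB could come from if you keep Ada's cache-per-SM ratio. The AD102 numbers are public specs; the 192 SM GB202 configuration is itself just a rumor, not anything confirmed:

```python
# Where 128 MB could come from if Ada's L2-per-SM ratio is kept.
# AD102 numbers are public specs; the 192 SM GB202 config is only a rumor.
ad102_sms, ad102_l2_mb = 144, 96
gb202_sms_rumored = 192

l2_per_sm = ad102_l2_mb / ad102_sms            # ~0.667 MB of L2 per SM on Ada
gb202_l2_est = gb202_sms_rumored * l2_per_sm   # 128.0 MB

print(f"Ada L2 per SM: {l2_per_sm:.3f} MB")
print(f"GB202 L2 at the same ratio: {gb202_l2_est:.0f} MB")
```

It lands exactly on 128 MB, which is why I suspect the rumor could just be extrapolation rather than a real leak.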
 
I think there is also a generational effect. I have teenage brothers from a second marriage and they exclusively game on a laptop or console - they marveled at my desktop case like it was something at the Smithsonian. They’ve grown up being able to do pretty much anything on the go, so splitting work between a laptop and a desktop like I do seems strange to them (there isn’t a desktop in their house). I also think you can’t overstate the popularity of low-fi games with that generation. One of my brothers exclusively plays Roblox and Minecraft on his laptop. Admittedly, you probably don’t need a dGPU for that, but the parents bought him a laptop with one because he asked for a gaming laptop!

VRR monitors are de rigueur for a modern gaming laptop, whereas when I bought a top-of-the-line gaming laptop (admittedly Maxwell, not Turing) I don’t recall even having the option, meaning it was 60 Hz Vsync or bust. Now, thanks to VRR, hitting only 50 fps is no problem, and DLSS 3 frame generation gets you well over 60. I swore I’d never buy one again - it was essentially a portable desktop. But I’m confident that if I did buy one today my experience would be far better (to be clear, I have zero interest in one). Please bear in mind this post is entirely anecdotal and only intended as a partial explanation.
OK that makes sense. But with the popularity of streaming I'd think young people wouldn't find the concept of a desktop so foreign anymore.
I’d like to throw a question out there regarding GB202 and GB203 L2 cache. Despite our proximity to launch, I haven’t seen any leaks about the cache size. The 128 MB rumor has been out for well over a year and never confirmed by anyone to my knowledge. It could easily have been an estimation based on the assumption that the core/cache ratio would stay the same as Ada. Sounds like that could be tough with only 22% more die space (on GB202) and added MCs, though AMD managed to cut down the L3 cache size on 4nm for Zen 5, so it’s not impossible I guess. GB203 shouldn’t have the same problem so I’d guess that cache would stay the same.

Curious if anyone has seen anything or has any thoughts on this.
Been wondering this myself. If yields are better, they could reduce the amount of cache on the die without reducing the amount enabled on the end product. The 4090 left 25% of its L2 dark. :unsure:
 
But with the popularity of streaming I'd think young people wouldn't find the concept of a desktop so foreign anymore.
Again, it’s anecdotal. My father is older than most parents of teenagers - he grew up in the Atari era and always cared more about sports anyway - so he has zero interest in anything PC-related; he barely even uses his laptop. I’m sure there are plenty of youngsters with PC parents who will pick up the hobby. I’m unlikely to ever have kids (my partner doesn’t want them), and I tried to get one of my brothers interested in PC gaming, but when all he wants to play is Roblox and Minecraft it’s hard to get him excited about building a computer.

Unlike 10 years ago, PC actually gets ports of virtually all games now. I really think APUs could push PC/laptop gaming further into the mainstream if they can push performance enough. Since there’s little reason to keep pushing resolution on a 16-inch screen, we should get there eventually. But that’s a topic for another thread!
The 4090 left 25% of its L2 dark.
I strongly suspect that was a segmentation decision. All the memory controllers were enabled, and even if yields played some role I can’t see the need to disable 25% of the cache when you only needed to cut 11.1% of your cores to hit the needed yields. Only Nvidia knows how much it would’ve helped performance, but it certainly wouldn’t have hurt, especially in RT.
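For reference, the arithmetic behind those percentages, using the published AD102 and RTX 4090 configurations:

```python
# How much of AD102 the RTX 4090 actually leaves disabled (public specs:
# full AD102 = 144 SMs / 96 MB L2, RTX 4090 = 128 SMs / 72 MB L2).
full_sms, full_l2_mb = 144, 96
gpu_sms, gpu_l2_mb = 128, 72

sm_cut = 1 - gpu_sms / full_sms       # ~0.111 -> 11.1% of SMs disabled
l2_cut = 1 - gpu_l2_mb / full_l2_mb   # 0.25  -> 25% of L2 disabled

print(f"SMs disabled: {sm_cut:.1%}")
print(f"L2 disabled:  {l2_cut:.1%}")
```

The L2 is cut back more than twice as hard as the shader array, which is why it reads more like segmentation than yield harvesting to me.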
 
I’d like to throw a question out there regarding GB202 and GB203 L2 cache. Despite our proximity to launch, I haven’t seen any leaks about the cache size. The 128 MB rumor has been out for well over a year and never confirmed by anyone to my knowledge. It could easily have been an estimation based on the assumption that the core/cache ratio would stay the same as Ada. Sounds like that could be tough with only 22% more die space (on GB202) and added MCs, though AMD managed to cut down the L3 cache size on 4nm for Zen 5, so it’s not impossible I guess. GB203 shouldn’t have the same problem so I’d guess that cache would stay the same.

Curious if anyone has seen anything or has any thoughts on this.

Rumors point to a significant increase in bandwidth due to a wider bus and higher clocks. If anything, Nvidia may decide to reduce L2 on GB202. No way to know until we see the goods.
 
Rumors point to a significant increase in bandwidth due to a wider bus and higher clocks. If anything, Nvidia may decide to reduce L2 on GB202. No way to know until we see the goods.
Yeah, the 128 MB rumor made more sense when people were talking about a 384-bit bus, but I just don’t see the need now. We’ll know soon enough; surprised nothing has leaked yet.
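For a sense of scale, here's the bandwidth arithmetic. The RTX 4090 numbers are the real specs; the GB202 line below just plugs in the circulating 512-bit GDDR7 rumor, purely for illustration and not anything confirmed:

```python
# GB/s = bus width in bits / 8 * data rate per pin in Gbps.
# RTX 4090 figures are public; the GB202 line uses rumored numbers
# (512-bit, 28 Gbps GDDR7) purely for illustration.
def bandwidth_gb_s(bus_bits, gbps_per_pin):
    return bus_bits / 8 * gbps_per_pin

rtx4090 = bandwidth_gb_s(384, 21)       # GDDR6X -> 1008 GB/s
gb202_rumor = bandwidth_gb_s(512, 28)   # rumored config -> 1792 GB/s

print(f"RTX 4090:        {rtx4090:.0f} GB/s")
print(f"GB202 (rumored): {gb202_rumor:.0f} GB/s, {gb202_rumor / rtx4090 - 1:+.0%} vs 4090")
```

If raw bandwidth really does jump that much, there's a lot less pressure to grow the L2 along with the core count.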
 
I strongly suspect that was a segmentation decision. All the memory controllers were enabled, and even if yields played some role I can’t see the need to disable 25% of the cache when you only needed to cut 11.1% of your cores to hit the needed yields. Only Nvidia knows how much it would’ve helped performance, but it certainly wouldn’t have hurt, especially in RT.
Technically we could find out. Somebody match core and memory clocks on a 4090 and a Quadro RTX6000 Ada and see how they differ. Someone smarter than me can surely figure out how to account for the SM count difference (~10%).
 
Technically we could find out. Somebody match core and memory clocks on a 4090 and a Quadro RTX6000 Ada and see how they differ. Someone smarter than me can surely figure out how to account for the SM count difference (~10%).
The Quadro RTX 6000 Ada is kind of the apex predator graphics GPU right now, with a nearly full AD102 die and 48 GB of VRAM, but it's held back by a limited 300 W TDP (vs 450 W for the 4090) and by using regular GDDR6 (vs GDDR6X on the 4090). In actual game use it's going to be power-limited and has roughly 5% less memory bandwidth than the 4090. In most cases it swings between being about 10% faster and about 20% slower than the 4090, depending on where the bottleneck lies.
 
The Quadro RTX 6000 Ada is kind of the apex predator graphics GPU right now, with a nearly full AD102 die and 48 GB of VRAM, but it's held back by a limited 300 W TDP (vs 450 W for the 4090) and by using regular GDDR6 (vs GDDR6X on the 4090). In actual game use it's going to be power-limited and has roughly 5% less memory bandwidth than the 4090. In most cases it swings between being about 10% faster and about 20% slower than the 4090, depending on where the bottleneck lies.
That's why I said match GPU and memory clocks. Clock them both at the level of the Quadro and run some tests.
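If anyone does try it, here's a sketch of how the clock-matched results could be normalized by SM count so that whatever gap remains is a rough proxy for the L2 difference. The SM counts are the public specs; the fps numbers are made-up placeholders, not measurements:

```python
# Sketch of normalizing a clock-matched 4090 vs RTX 6000 Ada comparison by
# SM count, so whatever gap remains is a rough proxy for the L2 difference.
# SM counts are public specs; the fps values are made-up placeholders.
rtx4090_sms, rtx6000ada_sms = 128, 142

rtx4090_fps = 100.0      # placeholder clock-matched result
rtx6000ada_fps = 114.0   # placeholder clock-matched result

expected = rtx6000ada_sms / rtx4090_sms   # ~1.11x if perf scaled purely with SMs
measured = rtx6000ada_fps / rtx4090_fps   # 1.14x with these placeholders

leftover = measured / expected - 1
print(f"Gap not explained by SM count: {leftover:+.1%}")
```

Crude, since games rarely scale linearly with SM count, but it would at least put a rough bound on what the extra L2 is worth.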

Here I'll even link you the cards.

^That one comes with a bonus 3060.


:mrgreen:
 