Nvidia Blackwell Architecture Speculation

  • Thread starter Deleted member 2197
Supposed "2 slot cooler for 5090 FE"


But a 2-slot cooler doesn't seem right for the 5090: there's no major node jump, Blackwell compute does not demonstrate any major efficiency gains, and there's no way Nvidia would allow performance regressions.

I could see an AI-dedicated version that has a 2-slot cooler: binned for a super low compute clock and a 384-bit bus with high memory speed, a bin specifically for a high-memory, low-clockspeed, slim server SKU. But a two-slot cooler for a 5090? As with "it's logically MCM", it doesn't sound like Kopite knows shit.
They will release a 3-slot 5090 Super that is another $500 a few months later lol
 
It’s hard to tell how much price sensitivity there is among flagship buyers. 4090s already range from ~$1750 to $2000 with MSRP at $1600. The 4090 is still plenty fast so pushing past $2000 will probably cause existing 4090 owners to think twice. People with 2080 Tis or 3090s are due for an upgrade but will probably balk at much higher prices too.

Pricing on the 4080 16GB and unlaunched 4080 12GB was really cynical and gamers pushed back. I won’t be surprised if they tried to push the 5080 above $1000 again though. Either way it seems the interesting fight will be under $800 this time around.
 
I will not replace my 4090 with a 5090 unless it's like 10 times faster at everything lol
I kept my old GTX 980 Ti for (too) many years, so this 4090 has something to live up to
 
Buffers don't take that much, and the only titles I'm aware of with 8k textures are the Crysis Remasters so far, which don't take 24gb. The only title that seems to want above 16gb at all so far is Frontiers of Pandora, on the hidden Unobtanium settings, and it's really unclear how much of the memory it reserves is actually needed (someone would need to go through with a debugging tool running to figure it out).

This rumor doesn't make much practical sense, but I'm not going to claim that what Nvidia does for its uppermost tier makes practical sense. The Titan RTX launched at a then unprecedented 280w and still unprecedented $2499 back in 2018. The entire point is to grab headlines first, with seeing how much profit margin they can squeeze out of the highest paying customers a close second. "Practical sense" is for anything below that.

I can't enable DSR on the S90C to check, but I remember Cyberpunk at 8k native would run out of 24GB while DLSS kept it to around 20GB. Enabling FG ran it out of VRAM again. With the texture mods it'd be even worse.

This youtuber does 8k testing on many games. Below is Cyberpunk on a 4090: it drops to 16GB at 8k DLSSP, but jumps to 22GB with FG enabled.


In Spiderman, with FG enabled, it maxes out VRAM as well. See at 18min.

 

Reservation can just be "I dunno whatever just scale it to resolution". That doesn't mean it actually uses it. Again, nothing in resolution scaling is going to get you another 4gb (and even if it did, you wouldn't need 28gb).

Even at native 8k, getting to 3gb is hard. An entire 64-bit buffer is all of about 256mb, so a few of those and you're done with resource usage in every game. Virtualized texture streaming pools and virtualized shadow map pools still might not hit 4gb total, let alone "in addition". And no modern triple-A game is playable at native 8k anyway. Even a 5090 Ti isn't going to be playing Frontiers of Pandora at native 8k, so there's no point in supporting it yet, even if you have an 8k monitor.
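For anyone who wants to sanity check the ~256mb figure, here's a minimal sketch of the arithmetic. It assumes one full-resolution render target at 64 bits per pixel with no padding, MSAA, or mip chain; the function names and the resolution table are just illustrative, not taken from any engine.

```python
# Rough size of a single full-screen render target.
# Assumption: tightly packed, one sample per pixel, no mips or padding.

RESOLUTIONS = {
    "1080p": (1920, 1080),
    "4k": (3840, 2160),
    "8k": (7680, 4320),
}


def buffer_bytes(width: int, height: int, bits_per_pixel: int = 64) -> int:
    """Bytes needed for one width x height buffer at the given bit depth."""
    return width * height * bits_per_pixel // 8


def buffer_mib(resolution: str, bits_per_pixel: int = 64) -> float:
    """Same as buffer_bytes, but looked up by name and reported in MiB."""
    width, height = RESOLUTIONS[resolution]
    return buffer_bytes(width, height, bits_per_pixel) / 2**20


if __name__ == "__main__":
    for name in RESOLUTIONS:
        print(f"{name}: {buffer_mib(name):.1f} MiB per 64-bit buffer")
```

At 8k this works out to roughly 253 MiB per buffer, so it would take around a dozen such full-resolution targets to reach 3gb.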
 
From what he's showing I believe the VRAM usage is from Frame Generation.

I'm not sure if anyone's looked into this in detail, but how does VRAM usage for FG scale with resolution? In theory, if FG works on the output resolution, its VRAM requirement could scale proportionally to pixel count. That would mean 4k possibly taking 4x the VRAM of 1080p for the FG component, and 8k taking 4x that of 4k. Hypothetically, this could mean something like 0.5GB -> 2GB -> 8GB for the FG component alone at 1080p, 4k, and 8k.

I've seen some data that suggests it does currently scale roughly that way.
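To make the hypothetical concrete, here's a small sketch of that scaling. It assumes FG memory is strictly proportional to output pixel count, and the 0.5GB baseline at 1080p is the made-up number from this post, not a measurement; `fg_vram_gb` is just an illustrative name.

```python
# Hypothetical model: FG VRAM cost scales linearly with output pixel count.
# The 0.5 GB baseline at 1080p is an assumed figure, not measured data.

PIXELS = {
    "1080p": 1920 * 1080,
    "4k": 3840 * 2160,
    "8k": 7680 * 4320,
}


def fg_vram_gb(resolution: str, base_gb: float = 0.5, base: str = "1080p") -> float:
    """Scale an assumed baseline FG cost by the ratio of pixel counts."""
    return base_gb * PIXELS[resolution] / PIXELS[base]


if __name__ == "__main__":
    for name in PIXELS:
        print(f"{name}: ~{fg_vram_gb(name):.1f} GB for the FG component")
```

If the real baseline is different, the 4x-per-step ratio still holds under this model; only the absolute numbers move.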
 
While FG is going to be a big part of it, even without it, a move from 4k to 8k that only increases VRAM by 4GB is going to be the exception.

In this older video comparing the 6900XT and 3090 at 4k and 8k, 8k uses 7-10GB more than 4k.


Even in the almost decade-old GTAV, VRAM usage jumps from 7GB to 14GB.



And even if you have the VRAM, I remember the fps not increasing. Apparently DLSS gets overwhelmed; not sure about FSR3. So perhaps the OFA will need an upgrade as well.

Lossless Scaling has beaten Nvidia/AMD to two generated frames.

 
Are there any rumored killer features like frame generation for Blackwell?

You mean that competitors will put on every generation while Nvidia bilks its own rabid fanbase by arbitrarily cutting off support to only the newest gen hardware?

OK, sarcasm aside, I obviously had low expectations, but not this low. A general clockspeed boost, that's it? Are they going to offer more proprietary RT vendor extensions no one will use as well? Kopite at least seems to have their shit together as a leaker, so this appears to be as little of a core change as the rest of Blackwell. Maybe the PR BS will be off the charts for this and that's what they're expecting to juice sales.
 
Are there any rumored killer features like frame generation for Blackwell?

No, but I think it's logical to expect substantial improvements for tensor and RT cores in Blackwell GPUs over traditional raster performance.

Getting a 2-3x performance improvement out of the tensor cores for DLSS inference will certainly allow for either higher frame rates, much higher quality, or both. I'm sure RT cores will see further improvements as well.

So I think it's possible that a 5070 GPU with DLSS will be able to exceed performance in RT workloads versus a 4090, but not pure raster performance. That's ok. The future is RT with AI based reconstruction.
 
Getting a 2-3x performance improvement out of the tensor cores for DLSS inference will certainly allow for either higher frame rates, much higher quality, or both.
Cutting down on DLSS runtime won't bring many performance wins unless you're running some weird combination of h/w+settings like 8K + ultra perf on a 5060.
Increasing the ML performance on gaming chips also means more interest from non-gaming markets and fewer sales of more profitable AI h/w.
In other words I don't think it will happen.

I'm sure RT cores will see further improvements as well.
That one is a given, but it remains to be seen what these improvements will be. Nvidia may feel less interested in performance this turn and more interested in features and flexibility.
Also, for an 84SM GB203 to be on par with a 144(128)SM AD102, there are bound to be "raster" performance improvements. They are supposedly using the same process, so just clocking the thing some 30-50% higher seems unlikely.
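As a back-of-envelope check on those SM counts, here's a rough sketch. It assumes throughput is simply SM count x clock x per-SM work per clock, which ignores memory bandwidth, cache, and scaling losses; the function name and the example clock uplifts are illustrative only.

```python
# Naive model: performance ~ SM count * clock * per-SM throughput per clock.
# Real scaling is not this linear; treat the output as a rough bound.


def required_per_sm_gain(target_sms: int, new_sms: int, clock_uplift: float = 1.0) -> float:
    """Per-SM, per-clock gain the smaller chip needs to match the bigger one.

    clock_uplift is the new chip's clock relative to the old one (1.3 = +30%).
    """
    return (target_sms / new_sms) / clock_uplift


if __name__ == "__main__":
    for target in (128, 144):
        for uplift in (1.0, 1.3, 1.5):
            gain = required_per_sm_gain(target, 84, uplift)
            print(f"{target} SM target, clocks x{uplift}: per-SM gain x{gain:.2f}")
```

Under this model, matching 128 SMs with 84 needs about 1.52x from clocks and per-SM gains combined, and about 1.71x to match the full 144.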
 
I can definitely see Nvidia trying to sell Blackwell based on boosted RT performance while offering a less impressive speed up in raster. They could get away with it if AMD doesn’t have anything faster in raster either. People will grumble but still buy as usual.

The other obvious move is to just shift all SKUs up a tier. 5080 from chopped GB202, 5070 from GB203 etc.
 