Nvidia Blackwell Architecture Speculation

My initial impression is that the performance gains look very weak. Just looking at the charts, with DLSS frame gen being compared to MFG, the scaling we're seeing is pretty bad. The real gains do not seem impressive, particularly with the 5090 having so much extra silicon. While some of the new features are cool, realistically the lag before these features see widespread adoption puts us at least at the 6000 series. It's also just too many features at once for developers to meaningfully implement. Unless Nvidia pays devs to implement some of these, I don't really see a reason for them to do so: you don't get any extra sales for spending additional time implementing the newer features. I'm sure MFG will find some traction, but there are already scalers on the market that do a similar task. Jensen claiming that the 5070 delivers 4090 performance is also rather dubious and invites a level of scrutiny I haven't felt in a while. This is perhaps the worst Nvidia presentation I've watched in a long time. I can't wait for independent reviewers to put these cards through their paces.

For the first time in a long time, I'm not even remotely optimistic about the future of GPUs. If this is the best Nvidia could do, then AMD/Intel and co are cooked. I won't be surprised if the stock price takes a hit once people realize how underwhelming this release truly is...
 
The 5090 still has the raw power there. It has 33% more CUDA cores than a 4090, and even with a slower base clock (2.01GHz vs 2.23GHz) and boost clock (2.41GHz vs 2.52GHz, an even smaller gap), that still works out to roughly 20-30% more raw shader throughput on paper. Note that a 4090 generally runs much faster than its rated boost clock (mine runs at > 2.7GHz without OC, > 2.9GHz with OC), and we don't know how the 5090 will behave in this regard yet. Combined with nearly twice the memory bandwidth, I don't think raw raster performance will be a problem for the 5090. I think what's more important for a gamer with a 4090 now is probably ray tracing (path tracing) performance. That's about the only thing in games today that can bring a 4090 to its knees, so the improvements there are probably more important than raw rasterization power.

As for the investments in AI, it's really not news, as NVIDIA has been hinting at this for years. The point NVIDIA is making is that raw raster performance probably won't scale anymore. A 4090 is rated at 450W by NVIDIA; compare that to the 5090's 575W and it means shader performance per watt barely increases at all. Even if you took all the AI hardware out and put some shader performance back in, I suspect it would only be around 10% better at best. It wouldn't be the best way to spend that silicon.
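As a back-of-the-envelope check, here's a minimal sketch of that math in Python. The 21760/16384 core counts are the announced figures (the "33% more" above); the clocks and board power are the numbers quoted in this post, and cores x clock is only a crude proxy for real performance.

Code:
# Rough 5090 vs 4090 scaling estimate from paper specs only.
# Raw shader throughput is approximated as CUDA cores x clock;
# real-world clocks (and therefore results) will differ.
specs = {
    "4090": {"cores": 16384, "base_ghz": 2.23, "boost_ghz": 2.52, "tdp_w": 450},
    "5090": {"cores": 21760, "base_ghz": 2.01, "boost_ghz": 2.41, "tdp_w": 575},
}

def throughput(card, clock_key):
    return specs[card]["cores"] * specs[card][clock_key]

for clock_key in ("base_ghz", "boost_ghz"):
    ratio = throughput("5090", clock_key) / throughput("4090", clock_key)
    print(f"5090/4090 shader throughput at {clock_key}: {ratio:.2f}x")

# Perf/W, assuming performance tracks boost-clock throughput.
perf_per_watt = (throughput("5090", "boost_ghz") / specs["5090"]["tdp_w"]) / \
                (throughput("4090", "boost_ghz") / specs["4090"]["tdp_w"])
print(f"Estimated perf/W vs 4090: {perf_per_watt:.2f}x")
# Prints ~1.20x at base clocks, ~1.27x at boost clocks, and ~0.99x perf/W.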
 
Which developers, or how many games/applications/pieces of software, do you think will ship two sets of textures/shaders just to use more AI HW that's exclusive to a single vendor?
Maybe Nvidia can push their own texture packs through the GeForce app, or maybe they can add an option to just convert them with your GPU, like compiling shaders? We were getting AAA games releasing high-res texture packs as optional downloads at one point as well, even though most of the time it was the same old textures and I saw no difference.

No one is showing us any other alternatives; they all just seem to be trying to dive into Scrooge McJensen's pool of money and landing on the tiles. You could look at it from the other direction: if Nvidia weren't pushing graphics forward, even if some don't like the direction they're pushing, where exactly would we be on PC at the moment? Hell, even Sony seem to be following Nvidia now.

If there is a better alternative to throw the sand at, someone needs to step up on the double.
 
Which developers, or how many games/applications/pieces of software, do you think will ship two sets of textures/shaders just to use more AI HW that's exclusive to a single vendor?
Why do you ask? The cards will work just fine without that. More than fine; it looks like 2-3 of these 4 won't have any competition in pure non-AI performance.

Then, with AI h/w coming to Radeons, already being there in Arcs, and access to said h/w being added to the DX API, I'd say it will be used sooner rather than later - and there won't be any need for two sets of anything.

And as for "validating" the investment, they seem to be doing okay on that right now, with billions in sales made on that particular AI h/w.
 
The 5090 still has the raw power there. It has 33% more CUDA cores than a 4090, and even with a slower base clock (2.01GHz vs 2.23GHz) and boost clock (2.41GHz vs 2.52GHz, an even smaller gap), that still works out to roughly 20-30% more raw shader throughput on paper. Note that a 4090 generally runs much faster than its rated boost clock (mine runs at > 2.7GHz without OC, > 2.9GHz with OC), and we don't know how the 5090 will behave in this regard yet. Combined with nearly twice the memory bandwidth, I don't think raw raster performance will be a problem for the 5090. I think what's more important for a gamer with a 4090 now is probably ray tracing (path tracing) performance. That's about the only thing in games today that can bring a 4090 to its knees, so the improvements there are probably more important than raw rasterization power.

As for the investments in AI, it's really not news, as NVIDIA has been hinting at this for years. The point NVIDIA is making is that raw raster performance probably won't scale anymore. A 4090 is rated at 450W by NVIDIA; compare that to the 5090's 575W and it means shader performance per watt barely increases at all. Even if you took all the AI hardware out and put some shader performance back in, I suspect it would only be around 10% better at best. It wouldn't be the best way to spend that silicon.
That's fine, but based on their charts it doesn't look like real RT performance has taken a huge leap either. That being said, Ada and Blackwell are on the same node, so the gains were always going to be limited. I imagine Nvidia would want to show these cards in the best light and would pick games that do so; if these are the best examples they could find, that's cause for some concern. For 575 watts, the performance increase is not good at all. Couple that with poor RAM improvements for everything but the 5090, and it's looking like an easy skip.
 
Also, will the 5080 be faster than the 4090? I'm not so sure. Way fewer CUDA cores and only a 100MHz difference in boost clock? The way Nvidia is cutting these chips is suspect.
 
Also, will the 5080 be faster than the 4090? I'm not so sure. Way fewer CUDA cores and only a 100MHz difference in boost clock? The way Nvidia is cutting these chips is suspect.

Assuming no frame gen, I doubt it. That was the rumour, but the rumours seem to be way off. I haven't looked at each model, but the 5080 looks like maybe a 20% raw raster uplift, unless boosting behaviour has improved a lot: 10% more cores, 5% more base clock, and some other efficiency gains. We'll see how it goes with games that implement some of the new ray tracing efficiency features. It really seems like they focused on DLSS and ray tracing, and pure shader performance took a back seat, with lots of investment in trying to minimize VRAM consumption, and I guess bandwidth by extension. If you're just looking to play games without ray tracing, these GPUs don't look like a good buy if you already have a 40 series, but coming from a 30 or 20 series they look like a good upgrade, and Nvidia did bring the prices down a bit.
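For the 5080 vs 4090 question specifically, a minimal paper-spec sketch along the same lines; the 10752-core count and 2.62GHz boost clock for the 5080 are the announced figures as I understand them, not something measured, so treat them as assumptions.

Code:
# Can a 5080 match a 4090 without frame gen? Paper specs only.
cards = {
    "4090": {"cores": 16384, "boost_ghz": 2.52},
    "5080": {"cores": 10752, "boost_ghz": 2.62},  # ~100MHz higher boost, far fewer cores
}

def shader_throughput(card):
    # cores x clock as a crude proxy; ignores IPC and real boosting behaviour
    return cards[card]["cores"] * cards[card]["boost_ghz"]

ratio = shader_throughput("5080") / shader_throughput("4090")
print(f"5080 raw shader throughput vs 4090: {ratio:.2f}x")
# ~0.68x, so it would need roughly 45-50% better per-core efficiency
# (or frame gen) to actually catch a 4090 in raster.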
 
If that vendor has 90% of market share I guess probably a lot?
Back in the Voodoo Graphics days, many games were Glide exclusives.
I'm pretty sure many games that featured a Glide implementation had other hardware renderers ...
Maybe Nvidia can push their own texture packs through the GeForce app, or maybe they can add an option to just convert them with your GPU, like compiling shaders? We were getting AAA games releasing high-res texture packs as optional downloads at one point as well, even though most of the time it was the same old textures and I saw no difference.
I don't think you can just trivially replace an entire texturing HW pipeline transparently. People are going to be in for a rude awakening if they think the process can be treated like some community texture mods ...
No one is showing us any other alternatives; they all just seem to be trying to dive into Scrooge McJensen's pool of money and landing on the tiles. You could look at it from the other direction: if Nvidia weren't pushing graphics forward, even if some don't like the direction they're pushing, where exactly would we be on PC at the moment? Hell, even Sony seem to be following Nvidia now.

If there is a better alternative to throw the sand at, someone needs to step up on the double.
There already was an 'alternative', and it has existed for many years now. What else do you think the texture units on GPUs and standardized shading/material pipelines were for?

Again, what good reason is there to literally *redo* all of the texture/material work so that it can only run on the new pipeline, when the old pipeline was both perfectly fine and cross-platform too?
 
I'm pretty sure many games that featured a Glide implementation had other hardware renderers ...

In the early days, not really, because there weren't many viable alternatives. At the time almost everyone had their own proprietary API, and many game developers didn't really have the time or resources to support multiple vendors, so games were mostly exclusives (some with software renderers). OpenGL support was also very limited at the time. The situation only changed for the better after, I think, Direct3D 5.
 
like some community texture mods
I was talking about actual developer-released ones.

 
@Lurkmass If you can compress a texture down to 1/5 of its normal size without perceptible loss in quality, I can't see why anyone WOULDN'T want to do that. You can ship smaller games and have more free VRAM. The whole "Neural Shaders" thing seems like it's going to be standardized in DX12, and it'll probably work across RTX GPUs with tensor cores, and probably some RDNA GPUs; that would be my guess. Nvidia has likely just optimized the SM to do it more efficiently. So I'll wait and see what the developer info says when the SDK comes out. We're already seeing Pascal get dropped from support, so if the 20 series becomes the baseline for games, maybe it's not that crazy. I still think it'll take some time. It's a big change.
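To put a rough number on that 1/5 figure, here's a minimal sketch; the 4-map material layout, the 4K resolution, and BC7's 1 byte per texel are my own illustrative assumptions, not anything from the SDK.

Code:
# Rough VRAM/disk savings if textures compress to ~1/5 of their BC size.
# Hypothetical material: 4 maps (albedo, normal, roughness/metalness, AO),
# each 4096x4096, stored as BC7 (8 bits per texel) with a full mip chain.
MAPS_PER_MATERIAL = 4
TEXELS = 4096 * 4096
BYTES_PER_TEXEL_BC7 = 1.0
MIP_CHAIN_OVERHEAD = 4 / 3   # full mip chain adds ~1/3 on top of mip 0
NTC_RATIO = 1 / 5            # the "compress to 1/5" claim, taken at face value

bc_bytes = MAPS_PER_MATERIAL * TEXELS * BYTES_PER_TEXEL_BC7 * MIP_CHAIN_OVERHEAD
ntc_bytes = bc_bytes * NTC_RATIO

print(f"BC7 material set  : {bc_bytes / 2**20:.1f} MiB")
print(f"At 1/5 the size   : {ntc_bytes / 2**20:.1f} MiB")
print(f"Saved per material: {(bc_bytes - ntc_bytes) / 2**20:.1f} MiB")
# ~85 MiB -> ~17 MiB, i.e. ~68 MiB back per material, which adds up fast
# across the hundreds of materials in a modern game.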

 
Is the GeForce Blackwell whitepaper available? What hardware limitations are preventing the 4000 series from supporting DLSS Multi-Frame Generation? It doesn't seem to be down to baseline AI TOPS throughput, as the 5070 (FP4) is lower than the 4090 (FP8) in terms of TOPS.
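To make that comparison apples-to-apples, here's a minimal sketch of how I'd normalize the marketing numbers; the advertised figures (988 and 1321 AI TOPS), the FP4 = 2x FP8 rate, and the 2x structured-sparsity factor are all assumptions on my part about how those numbers are built, so verify against the actual spec sheets.

Code:
# Normalize advertised "AI TOPS" to dense FP8-equivalent TOPS so the
# 5070 and 4090 can be compared at the same precision.
# Assumptions: FP4 runs at 2x the FP8 rate, and 2:4 structured sparsity
# doubles the quoted figure. Advertised TOPS below are approximate.

def dense_fp8_tops(advertised_tops, precision, includes_sparsity=True):
    tops = advertised_tops
    if includes_sparsity:
        tops /= 2   # strip the 2x structured-sparsity factor
    if precision == "fp4":
        tops /= 2   # FP4 assumed to run at twice the FP8 rate
    return tops

rtx_5070 = dense_fp8_tops(988, "fp4")    # ~247 dense FP8-equivalent TOPS
rtx_4090 = dense_fp8_tops(1321, "fp8")   # ~660 dense FP8 TOPS

print(f"5070 ~{rtx_5070:.0f} vs 4090 ~{rtx_4090:.0f} dense FP8-equivalent TOPS")
# The 4090 still comes out well ahead, which is why raw tensor throughput
# alone doesn't look like the reason MFG is gated to the 50 series.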

 
In the early days, not really, because there weren't many viable alternatives. At the time almost everyone had their own proprietary API, and many game developers didn't really have the time or resources to support multiple vendors, so games were mostly exclusives (some with software renderers). OpenGL support was also very limited at the time. The situation only changed for the better after, I think, Direct3D 5.
Based on a quick search, many games using Glide did have other renderer implementations. Sure, you could argue that in a few instances the Glide renderer was superior to the others, but for the most part Glide wasn't the only hardware-accelerated option, let alone the only option at all ...
I was talking about actual developer-released ones.

They were still releasing their updated textures against the SAME texturing PIPELINE. NTC (neural texture compression) is an entirely different texturing PIPELINE from the existing standardized one we're mostly familiar with ...

You're absolutely going to need to author different texture sets due to the fact that BC encoded textures are fundamentally incompatible with the NTC pipeline ...
@Lurkmass If you can compress a texture down to 1/5 of its normal size without perceptible loss in quality, I can't see why anyone WOULDN'T want to do that. You can ship smaller games and have more free VRAM. The whole "Neural Shaders" thing seems like it's going to be standardized in DX12, and it'll probably work across RTX GPUs with tensor cores, and probably some RDNA GPUs; that would be my guess. Nvidia has likely just optimized the SM to do it more efficiently. So I'll wait and see what the developer info says when the SDK comes out. We're already seeing Pascal get dropped from support, so if the 20 series becomes the baseline for games, maybe it's not that crazy. I still think it'll take some time. It's a big change.

We'll need more data in practice to get a better idea of quality vs memory consumption vs performance, but NOT only are you going to be dropping support for specific sets of hardware configurations, you'll now have two different texturing pipelines COMPETING with each other ...

Hardware vendors are going to be faced with a major decision on whether or not to remove the texturing units from GPU HW designs entirely. Practically all relevant software today relies so heavily on the existing texturing pipeline for good performance (this cannot be emphasized enough) that even Larrabee's design had to be modified to include them!
 
My initial impression is that the performance gains look very weak. Just looking at the charts, with DLSS frame gen being compared to MFG, the scaling we're seeing is pretty bad. The real gains do not seem impressive, particularly with the 5090 having so much extra silicon.
It seems to largely depend on what's being compared. Comparing a 5080 to a 4080, ~30% gains without MFG with an allegedly smaller die on the same node is very impressive. A 5090 only seeing a similar gain over a 4090 despite a significantly larger die, not so much.

The 4090 had similar sub-linear scaling over the 4080, though it had memory bandwidth as a credible scapegoat, which the 5090 definitely doesn't. It just looks like Nvidia is struggling to scale performance in games at the top end despite improving their microarchitecture.
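As a rough illustration of that sub-linear scaling, a minimal sketch; the core counts are the announced figures and the ~30% gains are the chart-derived numbers discussed above, not measured results.

Code:
# How much of the extra silicon shows up as extra performance?
# "Scaling efficiency" here = claimed performance gain / CUDA core count gain.
pairs = {
    "5080 vs 4080": {"cores_new": 10752, "cores_old": 9728,  "claimed_gain": 0.30},
    "5090 vs 4090": {"cores_new": 21760, "cores_old": 16384, "claimed_gain": 0.30},
}

for name, p in pairs.items():
    core_gain = p["cores_new"] / p["cores_old"] - 1
    efficiency = p["claimed_gain"] / core_gain
    print(f"{name}: +{core_gain:.0%} cores -> +{p['claimed_gain']:.0%} perf "
          f"(~{efficiency:.1f}x gain per unit of extra cores)")
# The 5080 gets far more gain per extra core (clocks and architecture doing the
# heavy lifting), while the 5090 gains barely in proportion to its extra cores -
# that's the top-end scaling problem in a nutshell.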

The billion dollar question is what the actual bottleneck is, and whether Nvidia can fix it for Rubin.
 
The bottleneck is software. There is nothing to fix in hardware. When an AI-generated frame can be created within a few ms, rasterising is not efficient enough.
 