Nvidia Blackwell Architecture Speculation

Irrelevant without knowing the power limit of the GPU. Mobile GPUs can show wildly different results depending on how much power they can use.
Agreed, and it also depends on whether the GPU was prioritized over the CPU -- many modern "gaming" laptops include software that will disable or limit CPU boost (in Windows power settings, setting Max CPU speed to 99% fully disables CPU boost), which leaves measurably more thermal headroom for the GPU. This in turn allows a more consistent and higher GPU clock.

Source: I do this manually with my Aero v8. The 8750H is entirely sufficient for 90% of my games even without turbo, which lets me run the 1070 Max-Q at a notably higher clock and with a consistently better framerate.
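For anyone who would rather script the change than dig through the Windows power settings UI, here is a minimal sketch using powercfg, run from an elevated prompt. The aliases are standard powercfg ones; the 99% cap is just the same turbo-disabling trick described above, not an official recommendation, and set_max_cpu_state is an illustrative helper name of my own.

```python
# Minimal sketch: cap "Maximum processor state" at 99% on the current Windows
# power plan, which disables CPU turbo boost -- the same trick described above,
# just scripted. Run from an elevated prompt on Windows.
import subprocess

def set_max_cpu_state(percent: int = 99) -> None:
    """Illustrative helper, not an official tool; pass 100 to restore boost."""
    for verb in ("/setacvalueindex", "/setdcvalueindex"):  # plugged in and on battery
        subprocess.run(
            ["powercfg", verb, "SCHEME_CURRENT",
             "SUB_PROCESSOR", "PROCTHROTTLEMAX", str(percent)],
            check=True,
        )
    # Re-apply the active scheme so the new value takes effect immediately.
    subprocess.run(["powercfg", "/setactive", "SCHEME_CURRENT"], check=True)

if __name__ == "__main__":
    set_max_cpu_state(99)
```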
 
FC6 is most definitely CPU limited on the 5090, since it shows a higher gain for the 5080 vs the 4080, which makes zero sense otherwise.
APTR is a more GPU-limited game, so +40% is the more likely average result for the 5090 vs the 4090. Considering that we're looking at a +30% or so FP32 change between the 4090 and 5090, that seems like a solid enough gain, really.
Of any game that's available, they would choose a CPU limited game? Let's wait for independent reviews.
 
Of any game that's available, they would choose a CPU limited game? Let's wait for independent reviews.
The choice is weird, and all I can think of is that FC6 is highly memory bandwidth sensitive, which means that in theory it should show higher than normal gains on Blackwell.
But while this may be true for the lower SKUs, on a 5090 it is rather obviously CPU limited.
 
My guess is that the outer yellow one is the I/O daughterboard mezzanine connector, and the green one is for the PCIe Gen5 daughterboard. It could be vice versa, but this arrangement seems to make it easier to route the ribbon cables.

NVIDIA's Unreleased TITAN/Ti Prototype Cooler & PCB | Thermals, Acoustics, Tear-Down. A similar mezzanine connector for the PCIe daughterboard on the prototype RTX 4090 PCB, courtesy of GamersNexus

 
I don’t know if we can say for sure that general performance scaling has hit a wall. It may just require an entire retooling of the architecture which Nvidia deems not to be worth it.
 
What we can say definitively is that Nvidia is hiding real performance figures. Whether there’s a reason for it or not, we shall find out shortly.
 
What we can say definitively is that Nvidia is hiding real performance figures. Whether there’s a reason for it or not, we shall find out shortly.
I think it will be between 20-40% depending on the game. Far Cry probably represents the lower end and Plague Tale the upper.
 
I think it will be between 20-40% depending on the game. Far Cry probably represents the lower end and Plague Tale the upper.

4080 super to 5080 will be interesting. Just by FLOPS it’s only 7 or 8 percent. In terms of architectural improvements and bandwidth, I’m not sure where that ends up. 20%?

I’m guessing the RT performance gain will be a little bigger, which is how you get to maybe 30 or 40% like Far Cry 6 in the slides.

I’m not sure what big architecture changes for the SMs and front end are on the table for them. DX13 feels a long way off but I’m not sure where things go as long as the programming model doesn’t change.
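As a sanity check on those FLOPS figures (the +30% or so mentioned earlier for the 5090 vs the 4090, and the 7-8% for the 5080 vs the 4080 Super), here is a quick back-of-the-envelope calculation from published shader counts and boost clocks. These are spec-sheet numbers as I recall them, and peak FP32 obviously says nothing about real-game scaling.

```python
# Back-of-the-envelope FP32 check using published shader counts and boost clocks
# (peak FP32 = cores x 2 FLOP/clock x clock). Spec-sheet numbers, not measured.
SPECS = {                      # (CUDA cores, boost clock in GHz)
    "RTX 4090":       (16384, 2.52),
    "RTX 5090":       (21760, 2.41),
    "RTX 4080 Super": (10240, 2.55),
    "RTX 5080":       (10752, 2.62),
}

def fp32_tflops(name: str) -> float:
    cores, ghz = SPECS[name]
    return cores * 2 * ghz / 1000.0

for new, old in [("RTX 5090", "RTX 4090"), ("RTX 5080", "RTX 4080 Super")]:
    gain = fp32_tflops(new) / fp32_tflops(old) - 1
    print(f"{new}: {fp32_tflops(new):.1f} TF vs {old}: {fp32_tflops(old):.1f} TF -> +{gain:.0%}")

# Comes out to roughly +27% for the 5090 vs 4090 and +8% for the 5080 vs
# 4080 Super, in line with the figures quoted in the thread.
```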
 
4080 super to 5080 will be interesting. Just by FLOPS it’s only 7 or 8 percent. In terms of architectural improvements and bandwidth, I’m not sure where that ends up. 20%?

I’m guessing the RT performance gain will be a little bigger, which is how you get to maybe 30 or 40% like Far Cry 6 in the slides.

I’m not sure what big architecture changes for the SMs and front end are on the table for them. DX13 feels a long way off but I’m not sure where things go as long as the programming model doesn’t change.
Should have clarified I meant the 5090 specifically. These cheaper tiers will probably range between 10-20%.
 
Of any game that's available, they would choose a CPU limited game? Let's wait for independent reviews.

I'm not sure how many alternatives there really are that would be more suitable. Far Cry 6 is a game that has RT support and no DLSS support.

Bear in mind it's likely Nvidia wants to push messaging that normalizes DLSS and maximum RT settings. So they will not pick a game with the caveat of turning off DLSS/FG or lowering RT settings, and they will likely want a game with RT (unless it's an esports-specific title, I guess).
 
I suspect Nvidia’s architectures have a major starvation problem. Maybe they can do something to improve utilization at the expense of top line numbers.

I don’t know. I remember seeing articles like Chips and Cheese talking about architectural reasons why Nvidia had lower utilization in Starfield, and then what seemed like weeks later the game was patched and those utilization problems went away. I don’t have a really solid grasp of what big wins Nvidia could make.
 
Do we know if the RTX 50 series still have the hardware optical flow accelerators, and if so, whether they serve any purpose for regular consumers or are just dead silicon?
 
The 5090 has 3 video encoders. I wonder if streaming apps like Sunshine will automatically take advantage of the extra throughput.


The GeForce RTX 5090 GPU is equipped with three encoders and two decoders, the GeForce RTX 5080 GPU includes two encoders and two decoders, the GeForce RTX 5070 Ti GPU has two encoders with a single decoder, and the GeForce RTX 5070 GPU includes a single encoder and decoder. These multi-encoder and decoder setups, paired with faster GPUs, enable the GeForce RTX 5090 to export video 60% faster than the GeForce RTX 4090 and at 4x speed compared with the GeForce RTX 3090.

GeForce RTX 50 Series GPUs also feature the ninth-generation NVIDIA video encoder, NVENC, that offers a 5% improvement in video quality on HEVC and AV1 encoding (BD-BR), as well as a new AV1 Ultra Quality mode that achieves 5% more compression at the same quality. They also include the sixth-generation NVIDIA decoder, with 2x the decode speed for H.264 video.
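On the Sunshine question: as I understand it, apps don't normally pick a physical encoder themselves; the driver spreads concurrent NVENC sessions across the available units, so the straightforward way to use all three is to run more sessions in parallel (e.g. exporting several clips at once). Here is a rough sketch of that idea, assuming ffmpeg built with NVENC support and placeholder file names; whether Sunshine splits a single stream this way is a separate question.

```python
# Rough sketch of the "more sessions in parallel" idea: several simultaneous
# NVENC encodes, with the driver distributing sessions across the physical
# encoders (assumption: Blackwell behaves like earlier multi-NVENC cards here).
# Requires ffmpeg built with NVENC; the clip names are placeholders.
import subprocess

CLIPS = ["segment_0.mp4", "segment_1.mp4", "segment_2.mp4"]  # hypothetical inputs

procs = [
    subprocess.Popen([
        "ffmpeg", "-y", "-i", clip,
        "-c:v", "hevc_nvenc", "-preset", "p5",   # standard ffmpeg NVENC options
        clip.replace(".mp4", "_hevc.mp4"),
    ])
    for clip in CLIPS
]
for p in procs:
    p.wait()  # all three encodes run concurrently, one per NVENC session
```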
 
4:2:2 support for encoding is cool. I have never looked it up, but I wonder if there is any benefit to uploading 4:2:2 to YouTube and having it convert to 4:2:0, or whether that would look the same as (or worse than) capturing 4:2:0 in the first place?
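For context on what is actually at stake, here is a quick sketch of the raw per-frame sample counts for each subsampling scheme. My understanding (not from this thread) is that YouTube delivers 4:2:0 to viewers regardless, so any benefit of a 4:2:2 upload would have to come from giving the transcoder cleaner chroma to start from.

```python
# Raw sample counts per frame for the common chroma subsampling schemes
# (before any codec compression), just to quantify what 4:2:2 adds over 4:2:0.
def samples_per_frame(width: int, height: int, scheme: str) -> int:
    luma = width * height
    chroma_fraction = {"4:4:4": 1.0, "4:2:2": 0.5, "4:2:0": 0.25}[scheme]
    return int(luma * (1 + 2 * chroma_fraction))  # Y plane + Cb + Cr planes

for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    print(scheme, samples_per_frame(3840, 2160, scheme))  # a 4K frame

# 4:2:2 keeps twice as many chroma samples as 4:2:0 (2.0 vs 1.5 total samples
# per pixel), which is what a downstream transcode would have to work with.
```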
 