Nvidia Blackwell Architecture Speculation

Irrelevant without knowing the power limit of the GPU. Mobile GPUs can show wildly different results depending on how much power they can use.
Agreed, and it also depends on whether the GPU was prioritized over the CPU -- many modern "gaming" laptops include software that will disable or limit CPU boost (in Windows power settings, setting Max CPU speed to 99% fully disables CPU boost), which leaves measurably more thermal headroom for the GPU. This in turn allows a more consistent and higher GPU clock.

Source: I do this manually with my Aero v8. The 8750H is entirely sufficient for 90% of my games even without turbo, which lets me run the 1070 Max-Q at a notably higher clock and with a consistently better framerate.
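For anyone who would rather script the change than dig through the Windows power settings UI, here is a minimal sketch using powercfg, run from an elevated prompt. The aliases are standard powercfg ones; the 99% cap is just the same turbo-disabling trick described above, not an official recommendation, and set_max_cpu_state is an illustrative helper name of my own.

```python
# Minimal sketch: cap "Maximum processor state" at 99% on the current Windows
# power plan, which disables CPU turbo boost -- the same trick described above,
# just scripted. Run from an elevated prompt on Windows.
import subprocess

def set_max_cpu_state(percent: int = 99) -> None:
    """Illustrative helper, not an official tool; pass 100 to restore boost."""
    for verb in ("/setacvalueindex", "/setdcvalueindex"):  # plugged in and on battery
        subprocess.run(
            ["powercfg", verb, "SCHEME_CURRENT",
             "SUB_PROCESSOR", "PROCTHROTTLEMAX", str(percent)],
            check=True,
        )
    # Re-apply the active scheme so the new value takes effect immediately.
    subprocess.run(["powercfg", "/setactive", "SCHEME_CURRENT"], check=True)

if __name__ == "__main__":
    set_max_cpu_state(99)
```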
 
FC6 is most definitely CPU limited on the 5090, since it shows a higher gain for the 5080 vs the 4080, which makes zero sense otherwise.
APTR is a more GPU-limited game, so +40% is the more likely average result for the 5090 vs the 4090. Considering that we're looking at a +30% or so FP32 change between the 4090 and 5090, that seems like a solid enough gain, really.
Of any game that's available, they would choose a CPU limited game? Let's wait for independent reviews.
 
Of any game that's available, they would choose a CPU limited game? Let's wait for independent reviews.
The choice is weird, and all I can think of is that FC6 is highly memory bandwidth sensitive, which means that in theory it should show higher than normal gains on Blackwell.
But while this may be true for the lower SKUs, on a 5090 it is rather obviously CPU limited.
 
My guess is that the outer yellow one is the I/O daughterboard mezzanine connector, and the green one is for the PCIe Gen5 daughterboard. It could be vice versa, but this arrangement seems to make it easier to route the ribbon cables.

NVIDIA's Unreleased TITAN/Ti Prototype Cooler & PCB | Thermals, Acoustics, Tear-Down. A similar mezzanine connector for the PCIe daughterboard on the prototype RTX 4090 PCB, courtesy of GamersNexus

 
I don’t know if we can say for sure that general performance scaling has hit a wall. It may just require an entire retooling of the architecture which Nvidia deems not to be worth it.
 
What we can say definitively is that Nvidia is hiding real performance figures. Whether there’s a reason for it or not, we shall find out shortly.
 
What we can say definitively is that Nvidia is hiding real performance figures. Whether there’s a reason for it or not, we shall find out shortly.
I think it will be between 20-40% depending on the game. Far Cry probably represents the lower end and Plague Tale the upper.
 
I think it will be between 20-40% depending on the game. Far Cry probably represents the lower end and Plague Tale the upper.

4080 super to 5080 will be interesting. Just by FLOPS it’s only 7 or 8 percent. In terms of architectural improvements and bandwidth, I’m not sure where that ends up. 20%?

I’m guessing the RT performance gain will be a little bigger, which is how you get to maybe 30 or 40% like Far Cry 6 in the slides.

I’m not sure what big architecture changes for the SMs and front end are on the table for them. DX13 feels a long way off but I’m not sure where things go as long as the programming model doesn’t change.
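As a sanity check on those FLOPS figures (the +30% or so mentioned earlier for the 5090 vs the 4090, and the 7-8% for the 5080 vs the 4080 Super), here is a quick back-of-the-envelope calculation from published shader counts and boost clocks. These are spec-sheet numbers as I recall them, and peak FP32 obviously says nothing about real-game scaling.

```python
# Back-of-the-envelope FP32 check using published shader counts and boost clocks
# (peak FP32 = cores x 2 FLOP/clock x clock). Spec-sheet numbers, not measured.
SPECS = {                      # (CUDA cores, boost clock in GHz)
    "RTX 4090":       (16384, 2.52),
    "RTX 5090":       (21760, 2.41),
    "RTX 4080 Super": (10240, 2.55),
    "RTX 5080":       (10752, 2.62),
}

def fp32_tflops(name: str) -> float:
    cores, ghz = SPECS[name]
    return cores * 2 * ghz / 1000.0

for new, old in [("RTX 5090", "RTX 4090"), ("RTX 5080", "RTX 4080 Super")]:
    gain = fp32_tflops(new) / fp32_tflops(old) - 1
    print(f"{new}: {fp32_tflops(new):.1f} TF vs {old}: {fp32_tflops(old):.1f} TF -> +{gain:.0%}")

# Comes out to roughly +27% for the 5090 vs 4090 and +8% for the 5080 vs
# 4080 Super, in line with the figures quoted in the thread.
```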
 
4080 super to 5080 will be interesting. Just by FLOPS it’s only 7 or 8 percent. In terms of architectural improvements and bandwidth, I’m not sure where that ends up. 20%?

I’m guessing the RT performance gain will be a little bigger, which is how you get to maybe 30 or 40% like Far Cry 6 in the slides.

I’m not sure what big architecture changes for the SMs and front end are on the table for them. DX13 feels a long way off but I’m not sure where things go as long as the programming model doesn’t change.
Should have clarified I meant the 5090 specifically. These cheaper tiers will probably range between 10-20%.
 
Of any game that's available, they would choose a CPU limited game? Let's wait for independent reviews.

I'm not sure how many alternatives there really are that would be more suitable. Far Cry 6 is a game that has RT support and no DLSS support.

Bear in mind it's likely Nvidia wants to push messaging that normalizes DLSS and maximum RT settings. So they will not pick a game with the caveat of turning off DLSS/FG or lowering RT settings, and they will likely want a game with RT (unless it's an esports-specific title, I guess).
 
I suspect Nvidia’s architectures have a major starvation problem. Maybe they can do something to improve utilization at the expense of top line numbers.

I don’t know. I remember seeing articles like Chips and Cheese talking about architectural reasons why Nvidia had lower utilization in Starfield, and then what seemed like weeks later the game was patched and those utilization problems went away. I don’t have a really solid grasp of what big wins Nvidia could make.
 
Do we know if the RTX 50 series still have the hardware optical flow accelerators, and if so, whether they serve any purpose for regular consumers or are just dead silicon?
 
The 5090 has 3 video encoders. I wonder if streaming apps like Sunshine will automatically take advantage of the extra throughput.


The GeForce RTX 5090 GPU is equipped with three encoders and two decoders, the GeForce RTX 5080 GPU includes two encoders and two decoders, the GeForce RTX 5070 Ti GPU has two encoders with a single decoder, and the GeForce RTX 5070 GPU includes a single encoder and decoder. These multi-encoder and decoder setups, paired with faster GPUs, enable the GeForce RTX 5090 to export video 60% faster than the GeForce RTX 4090 and at 4x speed compared with the GeForce RTX 3090.

GeForce RTX 50 Series GPUs also feature the ninth-generation NVIDIA video encoder, NVENC, that offers a 5% improvement in video quality on HEVC and AV1 encoding (BD-BR), as well as a new AV1 Ultra Quality mode that achieves 5% more compression at the same quality. They also include the sixth-generation NVIDIA decoder, with 2x the decode speed for H.264 video.
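On the Sunshine question: as I understand it, apps don't normally pick a physical encoder themselves; the driver spreads concurrent NVENC sessions across the available units, so the straightforward way to use all three is to run more sessions in parallel (e.g. exporting several clips at once). Here is a rough sketch of that idea, assuming ffmpeg built with NVENC support and placeholder file names; whether Sunshine splits a single stream this way is a separate question.

```python
# Rough sketch of the "more sessions in parallel" idea: several simultaneous
# NVENC encodes, with the driver distributing sessions across the physical
# encoders (assumption: Blackwell behaves like earlier multi-NVENC cards here).
# Requires ffmpeg built with NVENC; the clip names are placeholders.
import subprocess

CLIPS = ["segment_0.mp4", "segment_1.mp4", "segment_2.mp4"]  # hypothetical inputs

procs = [
    subprocess.Popen([
        "ffmpeg", "-y", "-i", clip,
        "-c:v", "hevc_nvenc", "-preset", "p5",   # standard ffmpeg NVENC options
        clip.replace(".mp4", "_hevc.mp4"),
    ])
    for clip in CLIPS
]
for p in procs:
    p.wait()  # all three encodes run concurrently, one per NVENC session
```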
 
4:2:2 support for encoding is cool. I have never looked it up, but I wonder if there is any benefit to uploading 4:2:2 to YouTube and having it convert to 4:2:0, or whether that would look the same as (or worse than) capturing 4:2:0 in the first place?
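For context on what is actually at stake, here is a quick sketch of the raw per-frame sample counts for each subsampling scheme. My understanding (not from this thread) is that YouTube delivers 4:2:0 to viewers regardless, so any benefit of a 4:2:2 upload would have to come from giving the transcoder cleaner chroma to start from.

```python
# Raw sample counts per frame for the common chroma subsampling schemes
# (before any codec compression), just to quantify what 4:2:2 adds over 4:2:0.
def samples_per_frame(width: int, height: int, scheme: str) -> int:
    luma = width * height
    chroma_fraction = {"4:4:4": 1.0, "4:2:2": 0.5, "4:2:0": 0.25}[scheme]
    return int(luma * (1 + 2 * chroma_fraction))  # Y plane + Cb + Cr planes

for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    print(scheme, samples_per_frame(3840, 2160, scheme))  # a 4K frame

# 4:2:2 keeps twice as many chroma samples as 4:2:0 (2.0 vs 1.5 total samples
# per pixel), which is what a downstream transcode would have to work with.
```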
 