Nvidia Blackwell Architecture Speculation

  • Thread starter Deleted member 2197
1737626155815.png
1737626180576.png

The link is unfortunately dead. However, another forum member compared the 5090 results with the previous 4090 results on the same site.

csm_stresstest_afd075dd82.jpg
The 5090 FE underwent a FurMark stress test. Apparently, the card doesn’t sound like a vacuum cleaner, but to me, the sound profile is just as important as the dB level. By the way, the GDDR7 hotspot temperatures exceed 90°C.

Also, the on-board power supply profile is similar to the 4090's: PCIe slot power is reserved for auxiliary units such as fans, LEDs, PLL, M.2 NVMe, etc., and maxes out at around 20W, while the rest comes via the 12V-2x6 cable.
1737592058986.png
1737592086454.png
 
That’s pretty disappointing even with lowered expectations. Was expecting heavier games to show more gains but 15% in Wukong and Outlaws is terrible.
 
DLSS4 DLLs are out in CP2077 update 2.21: https://store.steampowered.com/news/app/1091500/view/538843836072329454?l=english

  • Added support for DLSS 4 with Multi Frame Generation for GeForce RTX 50 Series graphics cards, which boosts FPS by using AI to generate up to three additional frames per traditionally rendered frame – enabled with GeForce RTX 50 Series on January 30th. DLSS 4 also introduces faster single Frame Generation with reduced memory usage for RTX 50 and 40 Series. Additionally, you can now choose between the CNN model or the new Transformer model for DLSS Ray Reconstruction, DLSS Super Resolution, and DLAA on all GeForce RTX graphics cards today. The new Transformer model enhances stability, lighting, and detail in motion.
  • Fixed artifacts and smudging on in-game screens when using DLSS Ray Reconstruction.
  • The Frame Generation field in Graphics settings will now properly reset after switching Resolution Scaling to OFF.
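
As a rough illustration of the Multi Frame Generation arithmetic in the patch notes above: the sketch assumes each traditionally rendered frame is followed by up to three AI-generated frames, and it ignores the cost of generating them, so real numbers will land lower. The function name and the 60 fps input are made up for the example.

```python
def displayed_fps(rendered_fps: float, generated_per_rendered: int) -> float:
    """Idealized output frame rate with frame generation.

    Assumption: every rendered frame is followed by `generated_per_rendered`
    AI-generated frames (0 to 3), and generation itself costs nothing.
    """
    if not 0 <= generated_per_rendered <= 3:
        raise ValueError("generated_per_rendered must be between 0 and 3")
    return rendered_fps * (1 + generated_per_rendered)


if __name__ == "__main__":
    # Hypothetical base frame rate, not a measured result.
    for n in range(4):
        print(f"{n} generated frame(s) per rendered frame: {displayed_fps(60, n):.0f} fps")
```

In practice the rendered frame rate itself drops once frame generation is switched on, so this is an upper bound rather than a prediction.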


Doesn't seem to be THAT much more expensive to run on Lovelace.
 
It shouldn't be so much about architecture as raw power. E.g., a 2060 will probably take a harder penalty than a 2080 Ti, etc.
I don't know, DLSS in the past has never been one to stress overall compute power differences much at all. Tensor power per SM seems to generally be the biggest factor, which will be architecture-based.

We'll find out pretty quickly, though. My guess is that the relative performance loss even over different architectures stays in a fairly close ballpark, not differing too radically. I really doubt this model is stressing these tensor cores so hard that Turing/Ampere are gonna struggle with it. That is just a guess, though.
 
I don't know, DLSS in the past has never been one to stress overall compute power differences much at all. Tensor power per SM seems to generally be the biggest factor, which will be architecture-based.
Meant raw tensor power, not shader. The Transformer model is 4x heavier than the CNN model.
 
TPU has a nice comparison table showing the performance of the CNN and Transformer (TNN) models on the 5090 and 4090. As expected, the 4090 takes a bigger performance hit when switching from CNN to TNN on the DLSS Quality/Balanced/Performance presets.
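
One way to compare the hit across cards is to convert frame rates into the extra frame time the heavier model adds, since a percentage drop also depends on how high the starting frame rate was. A minimal sketch, with hypothetical helper names and made-up fps values (not taken from the TPU table):

```python
def frame_time_ms(fps: float) -> float:
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def model_switch_cost(cnn_fps: float, transformer_fps: float) -> tuple[float, float]:
    """Return (added milliseconds per frame, percent fps drop) for CNN -> Transformer."""
    added_ms = frame_time_ms(transformer_fps) - frame_time_ms(cnn_fps)
    pct_drop = (1.0 - transformer_fps / cnn_fps) * 100.0
    return added_ms, pct_drop


if __name__ == "__main__":
    # Made-up numbers purely to show the calculation.
    added_ms, pct_drop = model_switch_cost(cnn_fps=100.0, transformer_fps=90.0)
    print(f"added {added_ms:.2f} ms per frame, {pct_drop:.1f}% fps drop")
```

If the added milliseconds stay roughly constant across presets on one card, that points to a fixed per-frame model cost rather than something that scales with the rest of the rendering load.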

 
Meant raw tensor power, not shader. The Transformer model is 4x heavier than the CNN model.
Right, but simply having more tensor cores on the die hasn't ever seemed to improve performance for DLSS in any consistent way within any given architecture range. They don't seem to all be acting as one large block working on a single workload, which is why, at least so far, tensor core performance per SM has seemed to be more relevant. Meaning it's really architecture that makes the difference.
 