"RTX Mega Geometry (who comes up with these names?) works on all RTX GPUs."

Yes. NTC should also work on any GPU with fast enough matrix math. The rest, though, are less clear.
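On the "fast enough matrix math" point: NTC-style decoding boils down to a few tiny matrix multiplies per texel, which is why it isn't obviously tied to one vendor's tensor units. A made-up sketch (layer size, weights, and activation are all illustrative, not the real NTC network):

```cuda
#include <cuda_fp16.h>

// One layer of a tiny per-texel MLP: y = relu(W * x). The real NTC decoder's
// topology and formats are NOT specified here; this only shows that the core
// work is small matrix math that any capable GPU can run.
__device__ void tiny_mlp_layer(const half* w,  // [out_n][in_n] weights
                               const half* x,  // [in_n] input features
                               float* y,       // [out_n] output features
                               int in_n, int out_n) {
    for (int o = 0; o < out_n; ++o) {
        float acc = 0.0f;
        for (int i = 0; i < in_n; ++i)
            acc += __half2float(w[o * in_n + i]) * __half2float(x[i]);
        y[o] = fmaxf(acc, 0.0f);  // ReLU
    }
}
```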
"When we asked how DLSS 4 multi frame generation works and whether it was still interpolating, Jensen boldly proclaimed that DLSS 4 'predicts the future' rather than 'interpolating the past.' That drastically changes how it works, what it requires in terms of hardware capabilities, and what we can expect in terms of latency."
Jensen says DLSS 4 "predicts the future" to increase framerates without introducing latency
Multi frame generation will be very different from framegen. (www.tomshardware.com)
Extrapolation?
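For a sense of why that distinction matters, here is a toy calculation (all numbers assumed): interpolation has to hold back the newest real frame until its successor exists, while extrapolation can present a predicted frame immediately.

```cuda
// Interpolation vs. extrapolation added latency (host-only, numbers assumed).
#include <cstdio>

int main() {
    const float render_ms  = 16.7f; // assumed time to render one real frame (60 fps)
    const float predict_ms = 1.0f;  // assumed cost to synthesize a predicted frame

    // Interpolation: the tween between frames N and N+1 can only be shown
    // once N+1 has finished rendering, so N is delayed by ~one full frame.
    const float interp_added = render_ms;

    // Extrapolation: the synthesized frame is predicted from frames <= N,
    // so nothing is held back; only the prediction itself costs time.
    const float extrap_added = predict_ms;

    printf("interpolation adds ~%.1f ms, extrapolation adds ~%.1f ms\n",
           interp_added, extrap_added);
    return 0;
}
```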
"That's an interesting article. It also suggests only 50 series can run neural shaders, which would be a death knell for feature adoption."

It's why I'm strongly considering not getting a 5090 and instead waiting for the 60 series. It will be years before these features are broadly adopted in games (if they ever are), and the 4090 and 5090 will remain overkill on VRAM for the rest of this generation, so there's no real necessity.
"That's an interesting article. It also suggests only 50 series can run neural shaders, which would be a death knell for feature adoption."

That makes sense, though. Using neural shaders with forward shading means you need a proper interface to the tensor cores, one suitable for calling from just groups of 4 threads rather than full warps.
"Yeah, well, they could've done shading and tensor operations 'at the same time' since Ampere. It is usually impossible due to register file and memory bandwidth, though, not because of how the SM is designed. So it is unclear what has changed in Blackwell. Maybe they've moved the tensor ALUs inside the main shading SIMDs and they are all controlled by the same logic now? That would be wild, as it seems to be how AMD has implemented AI hardware in RDNA4, and it would actually be a step backwards from how tensor hardware has been built into Nvidia GPUs since Volta."

H100 introduced asynchronous warpgroup MMA. A single thread within a warpgroup (= 4 warps, 128 threads) submits a batch of tensor operations. The tensor cores then pull data out of shared memory (and, optionally, registers) and write the result to registers across the whole warpgroup. This happens asynchronously: threads can go on doing other work until they require the MMA results, while the tensor cores use any "spare" shared memory and register bandwidth to do their work. I would expect Blackwell to adopt (in fact, to expand) this model.
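A minimal sketch of that issue/overlap pattern as it exists on Hopper (compile for sm_90a). The fence/commit/wait instructions are real PTX; wgmma_issue() is a hypothetical stub standing in for the actual wgmma.mma_async instruction, whose shared-memory descriptor encoding is omitted here:

```cuda
// Hopper warpgroup-MMA issue pattern (sketch, compile with -arch=sm_90a).

// HYPOTHETICAL stub: a real implementation would emit something like
//   wgmma.mma_async.sync.aligned.m64n8k16.f32.f16.f16
// with 64-bit descriptors pointing at the A/B tiles in shared memory.
__device__ void wgmma_issue(float acc[4]) { (void)acc; }

__global__ void overlap_demo() {
    float acc[4] = {};  // this thread's slice of a 64x8 f32 result tile

    asm volatile("wgmma.fence.sync.aligned;" ::: "memory");  // order register access
    wgmma_issue(acc);   // issued once, executed by the whole 128-thread warpgroup
    asm volatile("wgmma.commit_group.sync.aligned;" ::: "memory");

    // ...independent ALU/memory work can run here while the tensor cores
    // pull operands from shared memory in the background...

    asm volatile("wgmma.wait_group.sync.aligned 0;" ::: "memory");  // acc now valid
}
```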
"That makes sense, though. Using neural shaders with forward shading means you need a proper interface to the tensor cores, one suitable for calling from just groups of 4 threads rather than full warps."

Well, strictly speaking nothing stops Nvidia from running such workloads the same way RDNA does right now; it would probably be too slow to be usable, though. The convergence between Blackwell and RDNA is happening with RDNA4 specifically, which isn't exactly generic "RDNA", and it means the exact same thing for "neural shaders" compatibility on AMD as it does for Nvidia.
Older Nvidia generations would not be able to use the tensor cores properly from a fragment shader.
Funny though, RDNA should not have any issues with neural shaders at all; if anything, Blackwell should now behave much closer to it.
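For context on the "full warps" point: the tensor-core interface CUDA has exposed since Volta (nvcuda::wmma) is warp-cooperative, meaning all 32 threads must execute each call together, which is exactly the contract a pixel shader running on 4-thread quads cannot satisfy. A standard minimal example:

```cuda
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

// Every call below is warp-collective: all 32 threads of the warp must
// reach it together, and the fragment contents are spread across the
// warp's registers. A 4-thread fragment-shader quad can't provide that.
__global__ void warp_mma_16x16x16(const half* a, const half* b, float* c) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> fa;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> fb;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> fc;

    wmma::fill_fragment(fc, 0.0f);
    wmma::load_matrix_sync(fa, a, 16);  // warp-wide load of a 16x16 tile
    wmma::load_matrix_sync(fb, b, 16);
    wmma::mma_sync(fc, fa, fb, fc);     // one 16x16x16 tensor-core MMA
    wmma::store_matrix_sync(c, fc, 16, wmma::mem_row_major);
}
```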
"Will the 5070 even beat the 4070 Super? I'm not so sure..."

The 5070 is beating the 4070 Ti in Nvidia-provided benchmarks.
"The 5080 also looks to be barely faster than a 4080 Super if we just look at the specs."

Which is why nobody should look at the specs to figure out the performance.
"The 5070 is beating the 4070 Ti in Nvidia-provided benchmarks."

Yeah, let's wait for independent benchmarks. The Nvidia-provided benchmarks are filled with caveats and often compare unlike things; even Intel provides better benchmarks. For those of us not interested in MFG, there's certainly reason for concern, especially as it relates to real performance gains in RT and raster compared to the Super line.
"That 27% and 40-something percent uplift for Far Cry and A Plague Tale tells me that scaling outside of node reductions is dead and buried."

That's for the 5090, and it has significantly more CUDA cores, more bandwidth, more RT cores, etc. For the other GPUs, compared to the Super line, there is barely any improvement in base specs, clock speeds, etc. Very suspect benchmarks released by Nvidia... Very suspect.
The 5090 consumes like 25% more energy and has 70% more memory bandwidth, and that's the percentage improvement?
"For GeForce RTX 50 Series laptops, new Max-Q technologies such as Advanced Power Gating, Low Latency Sleep, and Accelerated Frequency Switching increases battery life by up to 40%, compared to the previous generation."nVidia claims that their notebook versions are 40% more efficient. And they use basically the same configuration outside of GDDR7.
"The 5090 consumes like 25% more energy and has 70% more memory bandwidth, and that's the percentage improvement?"

FC6 is most definitely CPU-limited on the 5090, since it shows a higher gain on the 5080 vs the 4080, which makes zero sense otherwise.
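To make the CPU-limit point concrete, here is a toy model (every number is made up) where frame time is simply the max of the CPU and GPU cost per frame; a 70% faster GPU then shows up as a much smaller measured uplift:

```cuda
// Toy model of a CPU-limited benchmark (host-only code, all numbers assumed).
#include <algorithm>
#include <cstdio>

int main() {
    const float cpu_ms      = 8.0f;              // assumed per-frame CPU cost
    const float gpu_old_ms  = 10.0f;             // assumed 4090-class GPU cost
    const float gpu_new_ms  = gpu_old_ms / 1.7f; // hypothetical +70% GPU throughput

    // Frame time is bounded by whichever side is slower.
    const float old_frame = std::max(cpu_ms, gpu_old_ms); // 10.0 ms -> 100 fps
    const float new_frame = std::max(cpu_ms, gpu_new_ms); //  8.0 ms -> 125 fps

    printf("measured uplift: %.0f%%\n", (old_frame / new_frame - 1.0f) * 100.0f);
    return 0;  // prints 25% despite the 70% faster GPU
}
```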