Yeah, the shader count "only" goes up by about 27%, and the clocks drop by around 5%, so the effective CUDA / shading rate only grows by roughly 20% total (1.27 × 0.95 ≈ 1.21). The really BIG gain is the memory bandwidth, which is a nearly 80% bump over the 4090. This likely matters less for the typical rasterized scene, but I wager it matters a lot more for FP32 and especially FP64 compute work, where the loads/stores are larger (8 bytes per FP64 value versus 4 for FP32) and will burn through that bandwidth. I'd expect the extra bandwidth to measurably help raytracing performance for the same reason, but I guess we'll have to wait and see.
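For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope sketch in Python. It uses the rough percentages from this post (not measured or official figures) and assumes throughput scales linearly with both shader count and clock, which real workloads won't do exactly. The function names are made up for illustration.

```python
def effective_scaling(unit_gain: float, clock_change: float) -> float:
    """Net throughput change when unit count and clock both change.

    Both arguments are fractional changes (0.27 means +27%, -0.05 means -5%).
    Assumes throughput scales linearly with both, which is an idealization.
    """
    return (1.0 + unit_gain) * (1.0 + clock_change) - 1.0


def bandwidth_per_compute(bandwidth_gain: float, compute_gain: float) -> float:
    """Change in memory bandwidth available per unit of shader throughput."""
    return (1.0 + bandwidth_gain) / (1.0 + compute_gain) - 1.0


# Rough figures quoted in the post above, not measured values.
shader_gain = 0.27
clock_change = -0.05
bandwidth_gain = 0.80

compute_gain = effective_scaling(shader_gain, clock_change)
ratio_gain = bandwidth_per_compute(bandwidth_gain, compute_gain)

print(f"effective shading rate: {compute_gain:+.0%}")
print(f"bandwidth per unit of compute: {ratio_gain:+.0%}")
```

With those inputs the shading rate comes out around 21% higher, while the bandwidth available per unit of shader throughput rises by roughly half. That second ratio is why bandwidth-bound work should see more of the benefit than compute-bound work.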