Except your metrics aren't wrong by 5%. They're wrong by a lot more than that. Where you claimed 12.6 TFLOPs on Vega 64 vs. 6.1 TFLOPs on the RX580, it's actually 11.5 TFLOPs on Vega 64 vs. 6.5 TFLOPs on the RX580.
6.2 vs. 6.5 is what again? Ummm, 5%? Wow!
We just went from 106% higher throughput to 77% higher throughput. That's a difference of roughly 30 percentage points from your claims, not 5%.
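For anyone who wants to check the division themselves, here's a quick sketch. The only inputs are the TFLOPs figures already quoted above; nothing else is assumed:

```python
# FP32 throughput ratios, Vega 64 vs. RX 580 (TFLOPs figures as quoted above)
claimed = 12.6 / 6.1 - 1   # ~1.07 -> the "106%" higher throughput claim
actual  = 11.5 / 6.5 - 1   # ~0.77 -> 77% higher throughput in practice
gap_pts = (claimed - actual) * 100   # ~30 percentage points of overstatement
print(f"claimed +{claimed:.0%}, actual +{actual:.0%}, gap ~{gap_pts:.0f} points")
```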
And even with your nitpicked numbers of the best possible RX580 clocks and worst-case Vega clocks, you still have to explain a 21% difference between all performance metrics on Vega (not just FP32) and actual gaming performance. Meanwhile, the 3080 scales against the 2080 about 30% beyond its increase in both pixel and texture fillrate. Totally the same situation. Totally! Not.
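To make that 21% concrete, here's the implied decomposition. This is pure arithmetic on the figures above; the ~46% gaming delta is what falls out of the division, not a measured benchmark:

```python
# What the 21% gap implies, using only the numbers already on the table:
spec_advantage  = 1.77   # Vega 64 vs. RX 580, every metric, worst-case clocks
unexplained_gap = 1.21   # the 21% spec-vs-gaming discrepancy
implied_gaming  = spec_advantage / unexplained_gap - 1  # ~0.46 -> ~46% faster in games
```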
In the case of Vega 56, it's 9.3 TFLOPs vs. 6.5 TFLOPs, which means it has 43% higher FP32 throughput than the RX580.
Confirming that Vega had scaling issues that have nothing to do with FP32, because, for the nth time, Vega 56 not only had less FP32 than Vega 64, it had less memory bandwidth, less texture fillrate, etc., all cut by roughly the same proportion. It is not only FP32 that went unused.
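To put numbers on "the same proportion", here are the Vega 56 vs. Vega 64 ratios per metric, a sketch from the public spec sheets (the 800 vs. 945 MHz HBM2 speeds are the reference values):

```python
# Vega 56 deficits vs. Vega 64, per metric (reference specs)
cu_ratio = 3584 / 4096   # 0.875 -> SPs and TMUs both scale with the 56/64 CUs
bw_ratio = 410 / 484     # ~0.85 -> HBM2 bandwidth, 800 vs. 945 MHz
# A similar ~12-15% deficit across compute, texturing, and bandwidth, yet
# identical gaming performance at ISO clocks -> the slack isn't FP32-specific.
```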
And if you only increased FP32 math in games and nothing else (somehow without increasing bandwidth requirements, which might be impossible in any real-life scenario unless other loads are reduced), then the Vega 64 would also increase game performance.
It wouldn't gain anything, because nothing else was holding it back: it has the exact same proportion of extra texture fillrate, pixel fillrate, and memory bandwidth as it does of extra FP32, so using more of one type is not going to change the existing balance. Vega is not held back by anything in particular, except perhaps triangle setup, as was mentioned (which again qualifies as a scaling issue). Ampere, by contrast, has only an excess of FP32, and lacks matching fillrate and bandwidth. Ampere didn't get its FP32 increase by adding more of everything (i.e., more CUs/SMs); it got it by adding FP32 SIMDs within each SM.
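Rough generational ratios to illustrate the contrast (spec-sheet numbers; I'm assuming roughly equal sustained clocks within each pair, so treat these as approximations):

```python
# RTX 3080 vs. RTX 2080: FP32 grew ~3x, everything else grew far less
ampere = {
    "FP32 TFLOPs": 29.8 / 10.1,   # ~2.9x (FP32 SIMDs doubled per SM)
    "pixel rate":  96 / 64,       # 1.5x  (ROPs)
    "bandwidth":   760 / 448,     # ~1.7x (GB/s)
}
# Vega 64 vs. RX 580: everything grew by roughly the same ~1.9x
vega = {
    "FP32 TFLOPs": 11.5 / 6.2,                   # ~1.9x
    "texel rate":  (256 * 1.40) / (144 * 1.34),  # ~1.9x (TMUs x GHz, clocks assumed)
    "bandwidth":   484 / 256,                    # ~1.9x (GB/s)
}
```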
At ISO clocks, the Vega 56 and Vega 64 have the exact same gaming performance, meaning the Vega 64 always has, at least, 8 CUs / 512 SPs / 1.4 TFLOPs / one Xbox One's worth of compute power just idling.
Which is the definition of having scaling issues. And CUs do a lot more than just FP32: they do INT, they do textures, they do special functions, they do load/store, they do atomics, etc.
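The idle-silicon arithmetic from that quote, spelled out (64 SPs per GCN CU is the spec; the 1.4 GHz figure is an assumed ISO clock):

```python
idle_cus    = 64 - 56                    # 8 CUs contributing nothing at equal performance
idle_sps    = idle_cus * 64              # 512 stream processors
idle_tflops = idle_sps * 2 * 1.4 / 1000  # ~1.43 TFLOPs (2 FLOPs/SP/clock)
# For scale: the entire Xbox One GPU is ~1.31 TFLOPs.
```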
It didn't mean the architecture was broken then; it just meant RTG at the time had very limited resources, and, only being able to launch one 14nm GCN5 chip for desktop, they chose to make one that served several markets at the same time, while sacrificing gaming performance.
They didn't sacrifice anything. It was a pixel and texel fillrate monster compared to the RX580, with an over-2X increase! They didn't only increase FP32 SIMDs; they roughly doubled everything. It just couldn't scale.
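The fillrate math backing that up (ROP/TMU counts and boost clocks are the spec-sheet values):

```python
# Vega 64 vs. RX 580 fillrates at boost clocks
v64_pixel   = 64  * 1.536   # ~98 GP/s  (64 ROPs @ 1536 MHz)
rx580_pixel = 32  * 1.34    # ~43 GP/s  (32 ROPs @ 1340 MHz) -> ~2.3x
v64_texel   = 256 * 1.536   # ~393 GT/s (256 TMUs)
rx580_texel = 144 * 1.34    # ~193 GT/s (144 TMUs)           -> ~2.0x
```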
You either believe that games will push higher proportions of FP32 or you don't. You don't get to decide that only a favorite IHV gets to blame game engines while the other gets accused of being broken and hopeless.
No one has done that. No one has blamed game engines. Pointing out that an architecture that is very heavy in a single metric would see gains when that single metric is used more extensively than the others is not blaming anything; it's pointing out the obvious. And pointing out that an architecture which was much stronger than its predecessor in every metric, yet didn't scale performance accordingly, has scaling issues is not the same as saying it is "broken and hopeless". No one has said that either, so stop putting words in other people's mouths, in the most hilarious and pathetic attempt at a strawman that I've seen in months. And maybe then we can start talking about class.