But TFLOPs are not pointless when talking about GPU performance within the same family. Unless Navi is a major departure from GCN, one that for example has much better efficiency and maybe additional clock headroom compared to Vega 20, we can roughly guesstimate what performance to expect from the next-gen consoles based on the TFLOPs (in conjunction with power draw) of previous and current GCN generations.
View attachment 3060
Looking at these charts from computerbase.de [0] we can see that the clock-for-clock performance increase between GCN generations is decent but not that large.
To compare the generations they picked cards (280X, 380X, RX 470) with the same number of shaders (2048) and equalized the clocks to get the same TFLOPs (4.26) as well as the same memory bandwidth (roughly 211 GB/s, with the exception of the 280X, which has 10 GB/s more due to its 384-bit bus).
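For reference, the TFLOPs figure is just shader count times two FLOPs per clock (one FMA) times clock speed. A minimal sketch, where the ~1040 MHz clock is what the 4.26 TFLOPs figure implies for 2048 shaders rather than a number quoted from the article:

```python
# Single-precision TFLOPs for a GCN card: shaders * 2 FLOPs per clock * clock in MHz.
def tflops(shaders: int, clock_mhz: float) -> float:
    return shaders * 2 * clock_mhz / 1e6

# 2048 shaders at ~1040 MHz lands on the 4.26 TFLOPs the cards were normalised to.
print(tflops(2048, 1040))  # ~4.26
```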
3rd gen GCN (Tonga, Fiji) was the first generation with delta color compression, which is probably a decent contributor to the better performance. 4th gen GCN (Polaris) has even better color compression, which I would guess is again a decent contributor to the increase, considering how bandwidth-starved GCN seems to be.
View attachment 3052
Interestingly, if we compare 5th gen GCN (Vega) to 3rd gen GCN (Fiji) we only see gains similar to what Polaris (4th gen GCN) had, even though AMD changed the compute units quite a bit according to their marketing material. HardOCP [1] and GamersNexus [2] show similar or even worse results when comparing Fury and Vega.
View attachment 3046
Some of the architectural changes in Vega have been said to be inactive due to problems, not working correctly, or simply not delivering the advertised performance increases outside of special applications (e.g. DSBR). So maybe we will see a bigger performance increase than in previous generations if they bring those Vega features over to Navi in a working state and add new Navi features (variable rate shading, possibly?).
Is it crazy to expect double the Xbox One X's performance? No, they would "only" need to double the TFLOPs and deliver the usual generation-to-generation performance increase and we would see more than twice the performance. There are even working current-gen features that will only really be used once all new consoles have them. For example, FP16 is only available on the PS4 Pro and Switch, which together have only ~20% market share (just spitballing), so if Lockhart, Anaconda and the PS5 all support FP16 we will probably see more engines take advantage of it, which should increase performance.
Of course, if AMD knocks it out of the park with Navi, maybe a GPU with 9 TFLOPs will already deliver twice the performance. For example, Nvidia's Turing-based RTX 2060 FE with 6.5 TFLOPs is almost as fast as the Pascal-based GTX 1080 FE with 8.9 TFLOPs. But I doubt it; Nvidia has better generation-to-generation increases than AMD.
Since I didn't dare to use the web to search for better comparisons because Mozilla fucked up and deactivated all the add-ons for all Firefox users - including essential stuff like NoScript and ad blockers - let's use the performance-per-watt data to show the difference (they compare the cards while running WoW at 60 fps). I added some scribbles to the pictures.
View attachment 3061
Since the consoles have a limited power budget, we also have to look at power draw in order to guesstimate what performance we can expect. For example, the Xbox One X draws ~180 watts while gaming, which includes not only the GPU but also the CPU, RAM, HDD, fans etc. The original PS4 and Xbox One are in the same ballpark.
If we look at the Radeon 7 we see that AMD can deliver 13.44 TFLOPs on 7 nm by running 60 CUs at 1750 MHz. But to get there it needs a power draw of 288 watts for the GPU alone, which goes far beyond the console power budget:
View attachment 3057
Unlike the Radeon 7, the consoles will most likely use GDDR6, which draws a bit more power than HBM. Sadly I don't know of a chart that shows how much the power draw of the Radeon 7 changes when the clocks are lowered. Otherwise we could guesstimate better, because it would only need 1565 MHz to reach ~12 TFLOPs, which seems like a more realistic console clock speed to me. On the other hand, the Vega 20 die used in the Radeon 7 has 64 CUs, which could be a problem for consoles due to yields.
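A quick sketch of where those Radeon 7 numbers come from (assuming the usual 64 shaders per CU and 2 FLOPs per shader per clock):

```python
# GCN TFLOPs from CU count and clock: CUs * 64 shaders * 2 FLOPs per clock * clock in MHz.
def gcn_tflops(cus: int, clock_mhz: float) -> float:
    return cus * 64 * 2 * clock_mhz / 1e6

print(gcn_tflops(60, 1750))  # ~13.44 TFLOPs at Radeon 7 clocks
print(gcn_tflops(60, 1565))  # ~12.0 TFLOPs at a more console-like clock
```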
For a recent power draw chart with AMD cards I'm only aware of the following one for the RX 580 vs RX 590. If Vega 20 behaves similarly and the last 200 MHz raise the power draw by 75-100 watts, then Vega 20 would consume 188-213 watts for ~12 TFLOPs (roughly double the Xbox One X's performance, since Polaris, which is used in the One X, and Vega seem to perform similarly in games).
View attachment 3058
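Spelled out, the extrapolation above looks like this (and it really is only an extrapolation, assuming Vega 20 scales with clocks roughly like the RX 580/590 in that chart):

```python
# If dropping the last ~200 MHz saves 75-100 W, as on the RX 580/590,
# the Radeon 7's 288 W GPU power at ~13.4 TFLOPs would shrink to:
radeon7_gpu_power_w = 288
saving_low_w, saving_high_w = 75, 100
low = radeon7_gpu_power_w - saving_high_w   # 188 W
high = radeon7_gpu_power_w - saving_low_w   # 213 W
print(f"{low}-{high} W for ~12 TFLOPs")     # still above a ~180 W whole-console budget
```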
Of course, if the console manufacturers really want to, they have ways to optimise the power draw further, as can be seen with the Xbox One X, which in total consumes less power than the RX 580 while having similar TFLOPs (6.001 vs 6.175), a bigger bus (384-bit vs 256-bit) and more memory (12 GB vs 8 GB) on the same architecture (Polaris) - only wider (40 vs 36 active CUs) and lower clocked (1172 vs 1340 MHz).
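The TFLOPs figures in that comparison come from the same CU-and-clock arithmetic, since both are Polaris parts with 64 shaders per CU (again just a sketch):

```python
def polaris_tflops(cus: int, clock_mhz: float) -> float:
    # 64 shaders per CU, 2 FLOPs per shader per clock
    return cus * 64 * 2 * clock_mhz / 1e6

print(polaris_tflops(40, 1172))  # Xbox One X: ~6.00 TFLOPs (wider, lower clocked)
print(polaris_tflops(36, 1340))  # RX 580:     ~6.17 TFLOPs (narrower, higher clocked)
```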
I would also guess that removing gaming-irrelevant stuff like the 1:4 FP64 rate and the machine learning instructions the Radeon 7 has will make the GPU more efficient.
As the second biggest consumer, the CPU is also interesting for the total power draw, since it eats into the power budget for the GPU. Package measurements of current 65 W TDP Zen and Zen+ based Ryzen CPUs show ~50 W while gaming and ~80 W under full load. But since the consoles can't run their CPU at 20 degrees Celsius, the power draw would be higher. I'm not aware of measurements for AMD's 45 W TDP 8c/16t CPU - the Ryzen 2700E - which has a base clock of 2.8 GHz.
View attachment 3063
The 8c/16t Zen 2 sample with a TDP of 65 W that was showcased at CES can compete with Intel's 9900K in multi-threaded workloads (AMD's forte). Based on that, users have guesstimated that the current CPUs would need an additional ~500 MHz plus a ~10% IPC increase to reach those scores. In other words, Zen 2 running at the same clocks as the current CPUs will be more power efficient, so something between 2.8 GHz and 3.2 GHz, as mentioned in almost all rumors, seems realistic for consoles without impacting the GPU power budget too much.
So, based on current GPUs' TFLOPs I expect anything between 10 and 12 TF for the PS5 and Anaconda. Which means, depending on the architectural enhancements, I think double the performance of the Xbox One X seems not crazy but realistic.
However, I'm curious whether they target native 4K or use checkerboard 4K or something along those lines. The Xbox One X needs 4.5 times the compute power of the Xbox One to render Xbox One games at native 4K, even though 4K has "only" 4 times as many pixels as 1080p - probably because many Xbox One games run below 1080p and quite a few games get enhanced textures, better tessellated models etc. on the Xbox One X. According to Mark Cerny, to run PS4 games at 4K you would need around 8 TF, which seems to be in line with the increase Microsoft gave the One X (PS4 with 1.84 TF * 4.5 would be 8.28 TF).
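To make that scaling explicit (a sketch; the 1.31 TF figure for the base Xbox One is the commonly cited number, not one from the charts above):

```python
# 4K really is exactly 4x the pixels of 1080p...
print((3840 * 2160) / (1920 * 1080))  # 4.0

# ...but the One X got roughly 4.5x the base Xbox One's compute (6.0 / 1.31 TF),
# and applying the same ~4.5x factor to the base PS4 lands near Cerny's figure:
print(6.0 / 1.31)   # ~4.6
print(1.84 * 4.5)   # 8.28 TF, close to the ~8 TF Cerny mentioned
```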
Which means that, if we take the PS4 as the baseline and devs target native 4K with consoles that have ~12 TFLOPs, developers would only have ~4 TFLOPs left to play around with beyond what the PS4 delivers (12 TF minus the ~8 TF needed for native 4K). Of course, depending on the architectural enhancements the real difference might be bigger than what the TFLOPs difference suggests, but I only expect gains similar to those of previous GCN generations.
Luckily, the best thing about the new consoles will be the powerful Zen 2 based 8-core CPUs and the SSDs anyway - at least in my opinion. Everything else is just the icing on the cake. Consoles can have advantages due to being closed systems, and I hope Sony's custom SSD solution can deliver something even better than "just" a PCIe 4.0 SSD.
TL;DR: I forgot what I really wanted to write halfway through, but posted my jumbled thoughts anyway in order to tempt others with more knowledge/actual knowledge into writing their thoughts as well while correcting me, heh.
[0] https://translate.google.com/translate?sl=auto&tl=en&u=https://www.computerbase.de/2016-08/amd-radeon-polaris-architektur-performance/
[1] https://www.hardocp.com/article/2017/09/12/radeon_rx_vega_64_vs_r9_fury_x_clock_for
[2] https://www.gamersnexus.net/guides/2977-vega-fe-vs-fury-x-at-same-clocks-ipc