Just going to list the numbers here since there might be some confusion/misunderstanding.
Number of SMs in each full chip, and how many more SMs the larger chip (left) has over the smaller one (right).
AD102 - 144. AD103 - 84. =71% more.
GA102 - 84. GA104 - 48. =75%.
GA102 - 84. GA103 - 60. =40%.
TU102 - 72. TU104 - 48. =50%.
GP102 - 30. GP104 - 20. =50%.
GM200 - 24. GM204 - 16. =50%.
GK110 - 15. GK104 - 8. =87.5%.
GF100/110 - 512. GF104 - 384. =33%. I used FP32 shaders here, as the FP32 count per SM is not uniform for Fermi.
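The list above is just ratio arithmetic; here's a quick script reproducing it (chip names and unit counts are taken straight from the list, nothing new added):

```python
# SM counts of the full chips (FP32 shaders for Fermi, where per-SM
# FP32 count is not uniform), as listed above.
pairs = [
    ("AD102", 144, "AD103", 84),
    ("GA102", 84, "GA104", 48),
    ("GA102", 84, "GA103", 60),
    ("TU102", 72, "TU104", 48),
    ("GP102", 30, "GP104", 20),
    ("GM200", 24, "GM204", 16),
    ("GK110", 15, "GK104", 8),
    ("GF100/110", 512, "GF104", 384),  # FP32 shaders, not SMs
]

for big_name, big, small_name, small in pairs:
    # How much larger the big chip is, as a percentage of the small one.
    pct = (big / small - 1) * 100
    print(f"{big_name} vs {small_name}: +{pct:.1f}%")
```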
Well, yeah, the point here is likely that AD102 turned out better than expected, which created a bigger gap between it and the chip below it in the stack, forcing that lower chip to be clocked outside its best efficiency range.
Unit-wise, though, the gap between the top chip and the one below it has been fairly consistent across Nvidia's lineups; the only somewhat recent exception was GK104 vs GK110.
I also kinda wonder if N5 pricing makes it impossible for Nvidia to use AD102 for a 4080 product like they did with GA102 on 8N.
There has been different messaging regarding this, from what I understand. It's partially that it was viable but not ideal for Nvidia to cut down GA102 as much as they did for RTX 3080. And partially that 8N yields were such that it made sense to offer a GA102 config cut down that far to preserve overall yields.
With Ada and TSMC it could similarly be both. Cutting down AD102 any further effectively means throwing away too many transistors that are otherwise viable, since actual yields are relatively good. Combined with that, the cost profile means you need to sell it above a certain price point for it to make sense.
A 116 SM AD102 would, mathematically at least, be the same cut ratio as RTX 3080's config of GA102. But unless the yield rate at 116 SMs is actually that much better than at the rumored 128, you aren't gaining much on the cost side by using it for a significantly cheaper config. It could be that 128 of 144 SMs is already in that fairly optimal yield zone.
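For concreteness, the cut ratios involved (my arithmetic, using RTX 3080's 68 enabled SMs out of GA102's 84 as the reference):

```python
# Fraction of the full die's SMs that each config keeps enabled.
ga102_full, rtx3080_sms = 84, 68
ad102_full = 144

print(f"RTX 3080: {rtx3080_sms}/{ga102_full} = {rtx3080_sms / ga102_full:.1%}")   # 81.0%
print(f"116 SM AD102: 116/{ad102_full} = {116 / ad102_full:.1%}")                 # 80.6%
print(f"128 SM AD102: 128/{ad102_full} = {128 / ad102_full:.1%}")                 # 88.9%
```

So 116/144 really is almost exactly the 3080-style cut, while the rumored 128 SM config keeps a notably larger share of the die enabled.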
The other factor here is that they were able to save costs by offering RTX 3080 with only 10GB of VRAM to fit that $700 price point. Now, in 2022, with where every other product sits, even 12GB would look out of place, so they have to at least double the memory this time. Not to mention that if they are going with 18Gbps GDDR6, do 1GB chips even exist at that speed?
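On the capacity math: each GDDR6/GDDR6X device presents a 32-bit interface, so chip count follows from bus width, and capacity from chip density. A rough sketch for a 320-bit card like RTX 3080 (the 2GB-chip 20GB config and the 18Gbps figure are the rumors discussed above, not confirmed specs; treating GDDR6 and GDDR6X the same for this arithmetic):

```python
def gddr6_config(bus_bits: int, chip_gb: int, gbps: float):
    """Chip count, total capacity (GB), and bandwidth (GB/s) for a GDDR6 setup."""
    chips = bus_bits // 32          # each GDDR6 device is 32 bits wide
    capacity = chips * chip_gb      # GB total
    bandwidth = bus_bits * gbps / 8 # GB/s across the whole bus
    return chips, capacity, bandwidth

print(gddr6_config(320, 1, 19))  # RTX 3080-style: 10 chips, 10 GB
print(gddr6_config(320, 2, 18))  # rumored 20 GB config at 18 Gbps
```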
There would also be a market segmentation issue beyond just actual costs. RTX 3080 10GB at $700 and RTX 3090 24GB at $1500 at least had VRAM as a big differentiator between them. But a 128 SM RTX 4080 20GB at $700 next to, say, even a full 144 SM RTX 4090 at $1500 would make the latter look even more frivolous, with almost no practical VRAM difference and an even smaller perf difference. You'd essentially be losing out (on the business side) from both ends (less margin and/or fewer sales for the 4090).