Playing devil's advocate a bit, but I don't agree that downclocks would erode the point of using the TX2, although this completely sidesteps the question of whether the extra price (pretty much unknown to us) would. Switch is very heavily power limited in handheld mode (and fairly so in docked mode). A straight decrease in power consumption in handheld mode would have been a pretty big deal; ~3 hours of battery life is pretty bad. On the flip side, the 128-bit memory interface (downclocked by roughly half in handheld mode to get closer to 64-bit power consumption) would have helped keep performance from being totally hamstrung in docked mode.
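Just to put rough numbers on the bus point (assuming LPDDR4-style 3200 MT/s parts; the handheld/docked clock split is my guesswork, not a confirmed Switch figure):

```python
# Peak bandwidth = bus width in bytes * transfer rate in MT/s.
def bandwidth_gb_s(bus_bits, mt_s):
    return (bus_bits / 8) * mt_s / 1000

print(bandwidth_gb_s(64, 3200))    # 25.6 GB/s: TX1-style 64-bit bus
print(bandwidth_gb_s(128, 3200))   # 51.2 GB/s: TX2-style 128-bit bus, docked
print(bandwidth_gb_s(128, 1600))   # 25.6 GB/s: same bus at half clock in handheld,
                                   # at roughly 64-bit-ish power
```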
I meant erode rather than eliminate: the lower you have to clock to fit the power envelope, the closer you get to TX1 performance (compared to the 7.5W TX2 performance touted earlier) and the more you're paying per unit of performance, however you measure that. After all, lower clocks save power but don't particularly affect the cost of the chip (unless you're chasing the higher-clocking bins or very tight voltage tolerances).
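A toy illustration of that cost-per-performance point, with entirely made-up prices and a made-up TX2-vs-TX1 performance ratio:

```python
# Chip cost is fixed per unit; performance scales ~linearly with clock.
# All figures below are invented purely to show the shape of the argument.
TX1_COST, TX2_COST = 30.0, 50.0   # hypothetical $/chip
TX1_PERF = 1.0                    # TX1 at its shipping clock = baseline

for tx2_clock_scale in (1.0, 0.75, 0.5):
    tx2_perf = 1.5 * tx2_clock_scale   # pretend TX2 is 1.5x TX1 at full clock
    print(f"TX2 at {tx2_clock_scale:.0%} clock: "
          f"${TX2_COST / tx2_perf:.1f} per perf unit "
          f"vs TX1 at ${TX1_COST / TX1_PERF:.1f}")
# The further it's downclocked, the worse the $/performance gets,
# because the silicon costs the same regardless of clock.
```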
As I understand it, there's also a point at the very power-focused end where going wider and slower becomes counterproductive, although I have no idea whether Switch would have hit it with TX2. I just think it's easy to look at the 7.5W TX2 figures and say "that should have been in Switch" when it may not have helped the mobile configuration much, and would have raised the BOM significantly.
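For what it's worth, here's a crude first-order model of why wide-and-slow stops paying off at the bottom end: once voltage hits its floor, extra units stop buying you the usual V² savings and just add leakage. Every constant here is invented to show the shape of the tradeoff, not a real TX1/TX2 number:

```python
# Dynamic power ~ units * f * V^2; static leakage ~ units; V has a floor.
VMIN, VMAX = 0.6, 1.0
LEAK_PER_UNIT = 0.2   # watts of leakage per cluster (made up)

def power(units, freq):
    # Crude assumption: required voltage scales with frequency, clamped at VMIN.
    v = max(VMIN, VMAX * freq)
    return units * freq * v**2 + units * LEAK_PER_UNIT

# Same total throughput (units * freq = 2.0), traded off differently:
for units in (2, 4, 8):
    freq = 2.0 / units
    print(f"{units} units @ {freq:.2f} clock: {power(units, freq):.2f} W")
# -> 2.40 W, 1.52 W, 2.32 W: going wider helps at first,
#    then leakage claws it all back once V is pinned at the floor.
```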
A 16nm TX1, otoh, would have reduced power, increased frequency and given Nintendo more flexibility. Though from what I've read, the cost of a custom 16nm chip can easily run into the hundreds of millions of dollars. That's quite a gamble.
TX1 would have benefited from more BW in docked mode of course, but rather than a double-width bus that might impact power draw in mobile mode, perhaps faster memory that could clock above 3200 when docked would have been a good idea. There are already upper/mid phones using memory above 3200, and Samsung have supposedly had 4266 chips since 2015...
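Same peak-bandwidth arithmetic as above for the faster-memory option (nominal rates; the actual Switch memory clocks are approximate at best):

```python
# Same 64-bit bus, just faster LPDDR4 when docked.
for mt_s in (1600, 3200, 4266):
    print(f"64-bit @ {mt_s} MT/s: {8 * mt_s / 1000:.1f} GB/s")
# -> 12.8, 25.6 and 34.1 GB/s respectively
```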
The interesting question is exactly what Nintendo would have done with the two Denver cores, and whether developers would find such a wildly heterogeneous setup desirable. I guess it really depends on their efficiency. If the two Denvers tend to supply better performance than three A57s at less power, which is at least within the realm of possibility, then they'd probably be a win. The OS functions could then stay on the A57 cluster, using as little clock speed and as few powered cores as it can get away with.
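If they had gone that way, the plumbing on a Linux-based devkit would presumably look something like CPU affinity masks. A minimal sketch, assuming the usual Jetson TX2 core numbering (A57s on CPUs 0 and 3-5, the two Denvers on CPUs 1-2), which I obviously can't verify against Switch itself:

```python
import os

# Assumed Jetson TX2 layout; check /proc/cpuinfo on real hardware.
A57_CORES    = {0, 3, 4, 5}
DENVER_CORES = {1, 2}

def pin_current_process(cores):
    os.sched_setaffinity(0, cores)  # 0 = the calling process

# e.g. keep OS/background work on the A57 cluster...
pin_current_process(A57_CORES)
# ...and let a game's heavy threads run on the Denvers:
# pin_current_process(DENVER_CORES)
```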
In the embedded development kit the Denvers are disabled at 7.5W in favour of the A57s. I'm presuming (though don't know) that perf/W is better for the A57s at the low end, but that the Denvers' absolute performance is better when less power constrained.
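If the 7.5W profile really does park the Denvers, it should be visible from userspace on a standard Linux devkit image, since each core exposes an online flag in sysfs:

```python
from pathlib import Path

# List which cores the current power profile leaves enabled.
for cpu in sorted(Path("/sys/devices/system/cpu").glob("cpu[0-9]*")):
    online = cpu / "online"
    # cpu0 often has no "online" file because it can't be offlined
    state = online.read_text().strip() if online.exists() else "1"
    print(f"{cpu.name}: {'online' if state == '1' else 'offline'}")
```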
Given that Nintendo have the same CPU configuration in both mobile and docked modes, would it even have been possible for Nintendo to have used the Denver cores? And if so, why have Nvidia disabled them in the 7.5W configuration?
I really would like to see where TX2 ends up. It looks like the perfect chip for a 20W console. But... that's not currently a market niche.