I know that it's not their only goal; my point is that such a move would likely take up a pretty significant portion of the transistor budget. Granted, they have a LOT of die real estate to play with, given how incredibly tiny they've made their processors over the past few generations. Still, I'm curious as to how they're going to balance things out. I think it's pretty clear that AVX2 was a big contributor to the TDP increase from Ivy Bridge to Haswell. AVX3 would further raise peak power, at least compared to a theoretical Skylake with 256-bit AVX. Overall power should still be down, unless clock speeds increase significantly.
That's actually another question I have. I know that Intel's 14nm process details haven't been released yet (please, please IEDM 2013), but will we see increased clock speeds at the high end? When Intel moved to FinFETs, they actually treaded water with transistor performance at voltages above 1V, and regressed past a certain point. Theoretically, this penalty only applies once -- after that, "traditional" electrical performance scaling should come back at the high end. In other words, 14nm FinFETs would outperform 32nm planar at the higher end of the voltage spectrum, despite the penalty associated with multigate devices. [sources available upon request]
So, unless we see more changes that reduce power consumption at the expense of performance at the higher end of things, 14nm should allow for higher clock speeds on the desktop. Not much, but maybe a couple of 100 MHz bumps.
Back to AVX3, am I correct in assuming that cache sizes would need to increase in order to provide more bandwidth to the execution engines? If so, we could theoretically see pretty decent performance gains from going from 32KB L1D to 64KB, and from 256KB L2 to 512KB, right? Maybe a small handful of percentage points (2-5%, ish)?
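For a rough sense of the bandwidth side of that question, here's a back-of-envelope sketch. It assumes a Haswell-style L1D that can service two loads and one store per cycle at full vector width, and simply scales the vector width from 256 to 512 bits. The function name and the unchanged port counts are my own assumptions for illustration, not anything Intel has said about Skylake.

```python
def l1_bytes_per_cycle(vector_bits, loads_per_cycle=2, stores_per_cycle=1):
    """Peak L1D bytes per cycle if every port moves one full vector.

    Port counts default to a Haswell-style 2 loads + 1 store per cycle;
    keeping them the same for a 512-bit core is an assumption.
    """
    vector_bytes = vector_bits // 8
    return {
        "load": loads_per_cycle * vector_bytes,
        "store": stores_per_cycle * vector_bytes,
    }

avx2 = l1_bytes_per_cycle(256)    # 256-bit vectors, Haswell-style ports
avx512 = l1_bytes_per_cycle(512)  # hypothetical 512-bit core, same port count

print("256-bit:", avx2)
print("512-bit:", avx512)
```

In this simple model the peak figure depends on port width and count rather than on capacity, so larger caches would presumably show up as better hit rates rather than as more bytes per cycle.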
I've been hypothesizing that Skylake could bring pretty substantial core changes. For one, Intel's Israeli team is likely to be at the helm of Skylake. Two, Intel has very small dies -- Broadwell is less than 115 mm². Intel hasn't had a die that small since Cedar Mill, and if we exclude Cedar Mill, it is likely the smallest "mainstream" die since the Pentium III. Three, Haswell was a relatively conservative update at the CPU level.
Even if Intel integrated the PCH (which would take up less than 30 mm² on 14nm), there would be tons of room for more to be added. Intel could theoretically create 6-core parts at a reasonable cost. Unlikely, but feasible. We could see L4 get integrated, which could have interesting implications. We could see more fixed-function hardware. Perhaps we could see more of the power supply being integrated.
Or, we could just see Intel keeping things conservative, as they have for the past two generations. I don't really know what the best option would be as far as keeping ARM at bay, but I'd imagine that Skylake is crafted to do just that.