Should we expect the AVX base clock to drop further (relative to the normal base clock) with the upcoming introduction of AVX-512?

One item of note is that the base clock a Haswell Xeon runs at under AVX is not what is on the spec page.
http://www.anandtech.com/show/8423/intel-xeon-e5-version-3-up-to-18-haswell-ep-cores-/5
The 2699 has a base of 1.9 GHz if it detects AVX.
This is time-based throttling as well, rather than a check of the buffers. The lowered-clock state can hang around for a millisecond, which is absolutely glacial relative to the cores themselves. It sounds like a physical/electrical consideration, and it isn't as effective as Intel's power management tends to be. This seems to point to greater difficulty in satisfying the disparate demands that high-performance generalist cores are being saddled with, at least for current cores.
"What I found interesting is how much hotter my Haswell-E gets running AVX2 code (I think)."

Yeah, there's really no magic here, guys: it takes a lot more power to be doing all that extra SIMD work, and the vast majority of code barely touches the throughput of Haswell, etc. Thus rather than constrain *all code* to lower frequencies, it makes more sense to only constrain code that is actually using a significant portion of the machine (i.e. AVX2). I agree that it is non-ideal that it takes so "long" to transition between the states, but presumably there's a fair amount of complexity there. I don't believe normal turbo bin transitions happen at a much higher rate than that anyway, correct?
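Not from the thread, but as a rough way to see the effect being described: the minimal C sketch below hammers a serial AVX2 FMA dependency chain and estimates the effective core clock from the known latency of that chain. It assumes a Haswell/Broadwell-class core where a dependent 256-bit FMA takes about 5 cycles, so the result is ballpark only.

```c
/* Rough sketch, not a reference implementation: estimate the effective core
 * clock while running a serial AVX2 FMA dependency chain.  Assumes a
 * Haswell/Broadwell-class core where a dependent 256-bit FMA has ~5 cycles
 * of latency.  Build with: gcc -O2 -mavx2 -mfma avx_clock.c -o avx_clock
 */
#include <immintrin.h>
#include <stdio.h>
#include <time.h>

static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

int main(void) {
    const long iters = 200000000L;            /* one dependent FMA per iteration */
    const double fma_latency = 5.0;           /* assumed FMA latency in cycles */
    __m256d acc = _mm256_set1_pd(1.0);
    const __m256d mul = _mm256_set1_pd(1.0000001);
    const __m256d add = _mm256_set1_pd(1e-9);

    double t0 = now_sec();
    for (long i = 0; i < iters; i++)
        acc = _mm256_fmadd_pd(acc, mul, add); /* each FMA waits on the previous one */
    double t1 = now_sec();

    /* ~fma_latency cycles per iteration, so cycles/second ~= iters * latency / elapsed */
    double ghz = iters * fma_latency / (t1 - t0) / 1e9;

    double sink[4];
    _mm256_storeu_pd(sink, acc);              /* use the result so the loop isn't removed */
    printf("estimated core clock under AVX2 FMA load: ~%.2f GHz (sink=%g)\n", ghz, sink[0]);
    return 0;
}
```

Comparing that estimate against the chip's rated base/turbo clocks (or against the same chain written with scalar math) gives a feel for how far the AVX frequencies sit below the normal ones; it won't resolve the millisecond-scale transitions discussed above.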
I'm not clear on whether client versions of Skylake will have AVX-512; at least for now it is a Skylake Xeon and Xeon Phi extension.
Skylake is a different architecture, so the circumstances that lead to this specific mode could be significantly different in later chips. One possibility is that the different clock modes, and the apparent confinement of AVX-512 to Xeon/HPC chips, point to a more complex trade-off than Intel found worthwhile in the client space.
At least Socket 1150 Haswell boosts voltage whenever it encounters AVX2 code.
extra cost
Particularly for gaming, aren't we essentially limited by the console focus? Intel has leapt so far beyond previous and current gen that it doesn't seem like multicore will be an issue until the end of time.
DX12/Vulkan etc. look to alleviate CPU pressure as well...
"This is one of the reasons why I'd like to see 8 cores become mainstream. Presumably being able to map a single core to a single core would be easier for developers and beneficial from a performance point of view, especially when we have DX12, which should offer similar scaling to consoles."

Nah, there's really no benefit there. A single core at double the frequency is pretty much strictly superior to two cores at half the frequency in terms of raw performance, so even if you were extremely generous to the consoles' IPC and pretended that games can actually use all 8 cores and don't really touch floating-point operations, a 3 GHz quad core is still going to run circles around the console CPUs. In practice, even a dual core has no problem competing given the large gap in IPC, cache, SIMD, etc.
"Obviously it doesn't take a hyperthreaded 4 GHz Skylake core to match a 1.6 GHz Jaguar core, but when VR hits we're going to want to be pushing 30 fps console games at a locked 90 fps, possibly with extra CPU-intensive effects applied."

In this case I don't think there's any direct comparison to be made between VR and consoles, to be honest. For VR you do whatever you can afford while still hitting the relevant performance targets. There are likely to be few if any compelling VR experiences that are "ported" from non-VR, fewer still from consoles, and basically zero that are ported naively enough not to receive major modifications when running in VR anyway. VR content really needs to be designed directly for VR, as the constraints across the board, from game design to rendering trade-offs and choices, are very different.
In the era of multi-patterning, CPUs made on the newest processes need sales in the millions to amortize the initial costs, and the $5M initial cost I used is seriously lowballing it. If Intel could actually make money by selling you a different kind of top-end CPU, they would do it.
Broadwell IPC is roughly 2x Jaguar's, and a 3.5 GHz Broadwell has 2x the clock rate of a 1.75 GHz Jaguar, while Jaguar has twice the number of cores. So one Broadwell core (at 3.5 GHz) is roughly as fast as four Jaguar cores. As you said, a high-clocked dual core Broadwell pretty much matches the 8-core Jaguar when both are running perfectly threaded code, and a quad core Broadwell would be twice as fast as the 8-core Jaguar.
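For what it's worth, the back-of-the-envelope math above can be written out explicitly. This tiny C snippet is just an illustration using the same rough assumptions (2x IPC, 3.5 GHz vs 1.75 GHz), not measured data:

```c
/* Illustration only: multiply out the rough ratios quoted above
 * (Broadwell ~2x Jaguar IPC, 3.5 GHz vs 1.75 GHz clocks). */
#include <stdio.h>

int main(void) {
    const double ipc_ratio   = 2.0;           /* assumed Broadwell vs Jaguar IPC */
    const double clock_ratio = 3.5 / 1.75;    /* 2x */
    const double per_core    = ipc_ratio * clock_ratio;  /* ~4 Jaguar cores per Broadwell core */

    for (int cores = 1; cores <= 4; cores++)
        printf("%d Broadwell core(s) ~ %.0f Jaguar cores of throughput\n",
               cores, cores * per_core);
    return 0;
}
```

With those assumptions a dual core lands right on the 8-core Jaguar and a quad core at double it, which is all the paragraph above is claiming.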
"Personally I'd love to see more games make use of the vast amount of CPU power that is actually available on quad core PCs these days already (let alone on 6-8 core machines), but realistically it's just not going to happen unless games sort out a better way to scale CPU load that is acceptable from a design perspective. That is really what is at the crux of this matter for gamers: if your game needs to run at all on a dual core machine and a console, it's likely not going to be able to scale up to 'max out' a quad, let alone more cores of similar speeds."

Yes, game design is a problem for CPU scaling. You don't want to have more enemies and/or NPCs on more powerful CPUs. However, frame rate scaling from 30 fps (dual core mobile, equal to console fps) to 60 fps (gaming desktop) to 90 fps (VR) already triples the CPU cost, and scaling to ultra settings increases the CPU cost on top of that. It is highly probable that some VR games (at the end of this console generation) will benefit greatly from an 8-core CPU. Of course these games will also require monster GPUs to run properly.
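To put numbers on the frame-rate part of that argument, here is a trivial sketch (same caveat: an illustration, not anyone's benchmark) of the per-frame CPU budget at the rates mentioned above:

```c
/* Illustration only: per-frame CPU budget at the frame rates discussed above.
 * Going from 30 fps to 90 fps leaves a third of the budget per frame, i.e.
 * roughly 3x the CPU cost per second of gameplay, before ultra settings. */
#include <stdio.h>

int main(void) {
    const double targets[] = { 30.0, 60.0, 90.0 };   /* console, desktop, VR */

    for (int i = 0; i < 3; i++)
        printf("%3.0f fps -> %5.2f ms CPU budget per frame (%.1fx the 30 fps cost)\n",
               targets[i], 1000.0 / targets[i], targets[i] / 30.0);
    return 0;
}
```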