> Just think of Core i5 versus i7 ...

Intel can disable hyperthreading across all of its mid-range products because there's no competition. If there were more competition, it would make little sense to purposefully disable working features. Hyperthreading needs only a small amount of dedicated extra hardware, so it is highly unlikely that most Intel chips have defects in the HT transistors. If we had more competition, the i5 would definitely have hyperthreading as well (i3 and Celeron would probably be the defective chips).
There were two Raven Ridges, as I recall. I'd have expected different names, but an interposer and HBM seem a likely difference. Assuming that list is the B-stock binned parts, it may be reasonable. Withholding the best parts for the professional lines is plausible, and they may want the premium unlocked cores for larger APUs. I wouldn't be surprised if they were binning already. The real question is how Zen is packaged; I'd think a part with HBM and an interposer would be a different design just to accommodate the packaging.

More likely the diagram is just as fake as the one above it - especially since it already has Raven Ridges on it when they're nowhere near being released.
(edit: also, at least rumors suggest Raven Ridge would feature up to 16 CUs, while the diagram says max 8)
> There were two Raven Ridge's as I recall. […]

The "HPC APU" with a Zeppelin CPU, a separate GPU, and HBM on one package wasn't supposed to be Raven Ridge, but a completely separate product.
> Just think of Core i5 versus i7 ...

And that's probably only for the desktop parts. Notebook parts are... well... messier.
Disabling AVX/2/FMA3 is just one of the most boneheaded fucking things Intel's ever done.
"Let's introduce a new instruction set to speed up certain types of problems!"
"Okay, then we disable it on the low end stuff to try and force people to buy the higher end stuff!"
*Eight years later*
"So, why isn't this getting more use?"
Yes, I know that's not the only reason new instruction sets are slow to gain adoption, and have been for a long time, but it is a major component.
> Disabling AVX/2/FMA3 is just one of the most boneheaded fucking things Intel's ever done.

This is probably done to shave off a few more watts, since those wide, feature-heavy FP ALUs are not really optimized for low-power operation. Even the big multi-core Xeons have to enter a lower performance state when running AVX/FMA code. The mobile SoCs already come with a plethora of dedicated logic (DSP, ISP, decoders, IO offloading) for specific consumer tasks, and at the same time those blocks are much more manageable and power efficient anyway. Why keep a (mostly) redundant block of programmable logic warmed up just to process a hastily written video decode loop once in a while, when the work can be done by the integrated video decoder at an order of magnitude better efficiency?
> I also wouldn't expect AMD to disable features (AVX2, TSX, etc) in their low end models. They have never done that. Intel is disabling AVX in their low end Skylake Celerons. SSE4.2 only.

AMD have, but you have to go back quite a way: the first K8-based Semprons had AMD64 disabled.
> This is probably done to shave off few more watts […]
I, too, am displeased by this ISA "segregation", but apparently the SoC mentality of dedicated-block integration is prevailing here and the general-purpose logic is being sidestepped. I mean, Intel is already keeping some ultra-wide SIMD extensions exclusive to their HPC and server SKUs, so why not cut some fat on the other side of the spectrum (mobile)?
Intel could've opted for a new design with narrower and power-efficient SIMD ALUs while keeping all the ISA extensions intact, but probably the cost-benefit of supporting yet another architecture branch is not there.
> AMD handled AVX perfectly in Jaguar.

I wonder if the Intel compiler actually enables AVX for it now?
Exactly. Make the SIMD narrower in low end models, or artificially limit the performance, but do not disable the features. AVX(1) is still not used widely in games, and the reason is that there are too many CPUs around that do not support it. TSX is also a great feature, but it is disabled in low end models. People are not going to write two versions of their thread synchronization primitives (one for TSX and one without); once TSX has good enough coverage, people will start using it. Disabling it in low end models currently makes no sense. TSX is only used in some HPC applications, and there's currently no additional value to consumers -> no consumer is going to choose a more expensive model over the lack of TSX. Disabling TSX on some consumer parts, however, greatly hurts the adoption of TSX.
AMD handled AVX perfectly in Jaguar: they supported the full AVX instruction set with their narrow 128-bit SIMD. AVX instructions are split into two internal 128-bit operations and thus run at half rate. The key difference is that Jaguar still runs AVX code. Intel's Atom, on the other hand, is limited to SSE4.2, and the same is true for the low end Skylake Celerons and Pentiums.
> Seriously, though. With this many issues, TSX has to be insanely complex to implement in the hardware.

It is complex. IIRC AMD tried to bring similar extensions to Bulldozer, but no luck either, and there are no rumors that Zen is going to support it. IBM has great tech in this field: more complex memory versioning and a huge eDRAM-based LLC. IIRC Intel's implementation is L1-cache-only, so the transaction size must be very small. IIRC IBM also uses their tech for speculative execution. Way ahead of Intel.
I think TSX might be less impressive than assumed in the scenarios that are typically of interest on this forum. For example, software in this neck of the woods would probably have already migrated to a reasonably granular locking policy... or at least one would hope so. In general, one would be excused for assuming that the incentive for Intel to aggressively push TSX is low, IMHO.