"Is this actually what AVX will have, or just a guess at this point? 1 TFLOP SP from an 8-core x86 would be pretty impressive!"
Yes, this is what Haswell's implementation of AVX2 and FMA will be capable of. It hasn't been officially confirmed yet, but it's easy to deduce as the only logical answer.
We know for a fact Haswell will support FMA, and we also know Sandy Bridge has separate ADD and MUL execution units. They can't go for a single FMA unit with Haswell, since that would dramatically cripple legacy performance. They also can't go for an ADD+FMA or MUL+FMA combination, because then the same port is needed by MUL and FMA or by ADD and FMA respectively, and with typical instruction mix frequencies this actually results in lower performance due to port contention!
So under the safe assumption that they want the extra transistors to pay off, the only sane option is dual FMA units. This also simplifies scheduling. And note that Bulldozer already has dual FMA units (only 128-bit each, but that's on 32 nm).
This also isn't all that incredible compared to what we've come to expect from GPUs. And Intel clearly is putting a lot of Larrabee's technology into AVX2.
"I've no doubt Haswell will be capable of hitting 4 GHz, but I doubt Intel will clock it that high given the lack of competition. I'm fairly sure Intel could have been releasing stock 4 GHz CPUs since Sandy Bridge if they'd felt the need."
AMD is clearly aiming to hit 4 GHz sooner rather than later. Regardless of superior IPC, the market will demand that Intel follow suit (or steal their thunder). Also, for what it's worth, 3.9 GHz would actually suffice for 500 GFLOPS out of a quad-core, and we're at 3.8 GHz Turbo Boost frequencies already.
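For anyone who wants to check the arithmetic behind the 500 GFLOPS and 1 TFLOP figures, here is a rough sketch. It assumes the configuration argued for above (two 256-bit FMA units per core, single precision, each FMA counted as two floating-point operations), which is not officially confirmed. The function name `peak_gflops` and its parameters are made up for illustration.
```python
def peak_gflops(cores, ghz, fma_units=2, simd_bits=256, element_bits=32):
    """Theoretical peak GFLOPS, assuming every FMA unit issues one
    full-width FMA per cycle (an upper bound, not a measured number)."""
    lanes = simd_bits // element_bits          # 8 SP lanes in a 256-bit vector
    flops_per_cycle = fma_units * lanes * 2    # an FMA is one multiply plus one add
    return cores * flops_per_cycle * ghz


# Assumed Haswell-like configuration: dual 256-bit FMA units per core.
print(peak_gflops(cores=4, ghz=3.9))   # quad-core, roughly 500 GFLOPS
print(peak_gflops(cores=8, ghz=3.9))   # 8-core, roughly 1 TFLOP
```
That works out to 32 SP flops per cycle per core, so a quad-core at 3.9 GHz lands at about 499 GFLOPS and an 8-core at about 998 GFLOPS. These are peak numbers only; real code will not keep both FMA units busy every cycle.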