If I'm reading the diagram correctly, it looks like each BD module has the same FMAC throughput as FADD/FMUL. Great for future workloads, but not so hot for today's software, if I'm not mistaken.
It is the same as present-day hw, so it's no better or worse for today's sw. However, it has less FP throughput per core than Sandy Bridge, so it's not so good for future sw.
I can't see much use for AVX in consumer apps even going forward, though. There's going to be dedicated video decode hw on _every CPU_ in that time frame.
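To put rough numbers on the FP throughput comparison above, here is a back-of-envelope sketch. The unit counts are my assumptions from the public block diagrams, not measured figures: two 128-bit FMAC pipes per BD module (shared by two cores), one 128-bit FADD plus one 128-bit FMUL in a present-day core, and one 256-bit FADD plus one 256-bit FMUL in a Sandy Bridge core. The helper name `peak_flops_per_cycle` is made up for illustration.

```python
# Back-of-envelope peak single-precision FLOPs per cycle.
# All unit counts are assumptions read off public block diagrams.

def peak_flops_per_cycle(pipes, width_bits, flops_per_lane, elem_bits=32):
    """Peak FLOPs/cycle = pipes * SIMD lanes * FLOPs each lane retires per op."""
    lanes = width_bits // elem_bits
    return pipes * lanes * flops_per_lane

# BD module running today's (non-FMA) code: each FMAC pipe issues
# either an add or a multiply, so 1 FLOP per lane per cycle.
bd_module_legacy = peak_flops_per_cycle(pipes=2, width_bits=128, flops_per_lane=1)

# BD module running FMA code: each lane does a multiply and an add.
bd_module_fma = peak_flops_per_cycle(pipes=2, width_bits=128, flops_per_lane=2)

# Present-day core: one 128-bit FADD pipe + one 128-bit FMUL pipe.
current_core = peak_flops_per_cycle(pipes=2, width_bits=128, flops_per_lane=1)

# Sandy Bridge core with AVX: one 256-bit FADD pipe + one 256-bit FMUL pipe.
snb_core = peak_flops_per_cycle(pipes=2, width_bits=256, flops_per_lane=1)

# A BD module is shared by two cores, so halve it for a per-core figure.
bd_core_fma = bd_module_fma // 2

for name, value in [
    ("BD module, non-FMA code", bd_module_legacy),
    ("BD module, FMA code", bd_module_fma),
    ("BD per core, FMA code", bd_core_fma),
    ("present-day core", current_core),
    ("Sandy Bridge core, AVX", snb_core),
]:
    print(f"{name}: {value} SP FLOPs/cycle")
```

Under those assumptions a BD module on non-FMA code matches a present-day core, and even with FMA the per-core figure is about half of a Sandy Bridge core running AVX. These are peak numbers only and ignore clocks, scheduling, and memory bandwidth.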