and I can't find anything more salient. I don't know if AVX increases throughput for integer SIMD code, per se.
I rechecked the descriptions of SB's integer SIMD units. The current implementation does not have wider integer operations; rather, there is a promise of more at some later date.
Both BD and SB have two integer SIMD blocks. BD does have an integer FMAC on the first FP port.
One SIMD block is on the store pipe, which may cut into how often it can be used. I am not sure whether the XBAR unit covers integer permutes as well.
The downside to the FMAC and XBAR is that their ops are neither SSE nor AVX.
On the other hand, a few codecs could add code paths that use XOP and FMA.
Otherwise, it seems like rough parity per-clock, though the store port sharing may rear its head.
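For what it's worth, the "code paths" idea above usually amounts to picking a kernel at startup based on what the CPU reports. A minimal sketch, assuming made-up feature names and stand-in kernels (a real codec would query CPUID and dispatch to hand-written assembly, not Python functions):

```python
# Hypothetical feature names and kernels, for illustration only.

def mac_generic(a, b, acc):
    # Baseline path: separate multiply and add per element.
    return [x * y + z for x, y, z in zip(a, b, acc)]

def mac_fused(a, b, acc):
    # Stand-in for a path built on a multiply-accumulate instruction.
    # For integer inputs the result is identical; only the instruction
    # selection would differ in a real implementation.
    return [x * y + z for x, y, z in zip(a, b, acc)]

# Ordered from most to least specialized; the last entry needs nothing.
CODE_PATHS = [
    ({"xop", "fma"}, mac_fused),
    (set(), mac_generic),
]

def select_path(cpu_features):
    """Return the first kernel whose required features are all present."""
    for required, kernel in CODE_PATHS:
        if required <= cpu_features:
            return kernel
    raise RuntimeError("no usable code path")

kernel = select_path({"sse2", "xop", "fma"})
print(kernel.__name__, kernel([1, 2], [3, 4], [5, 6]))
```

The cost is that each extra path has to be written and maintained separately, which is why only a few codecs would likely bother.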
As for consumer apps, I'm wondering which of them make such intensive use of FP, specifically, that they will lead to BD being considered "meh", or to AVX being rapidly implemented.
Consumer apps are poorly threaded and benefit more from single-threaded performance. The BD FPU has higher latency, and its read/write capability is no better than that of a single core.
It becomes "meh" because Zambezi is a server chip that will be tangling with a mid-range desktop SB.
How much game code is SIMD-FP limited on the CPU?
For many games, the question is more whether SIMD shows up at all.
The integer pipelines and single-threaded performance would matter more in the client space for games, which does not look like it favors BD.
On the other hand, I believe graphics drivers do leverage SIMD instructions, though I am unsure of which flavor.
edit:
With regard to x264, it seems the Sandy Bridge preview thread has some chatter about using the special-purpose hardware in SB for the codec.
This is a lateral move around the SSE/AVX debate, apparently.
It does not change the fact that I had erroneously assumed AVX had doubled the integer SIMD side as well, or that I had misclassified the FP needs of video codecs.
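To put numbers on that mix-up, here is a rough sketch of elements handled per vector op. It assumes first-generation AVX widens only floating-point operations to 256 bits while packed-integer operations stay at 128 bits; the table and function name are purely illustrative.

```python
# Illustrative only: vector width in bits per operand class.
# Assumption: first-generation AVX (as in SB) widens FP to 256 bits,
# while packed-integer operations remain 128 bits wide.
WIDTH_BITS = {
    "sse": {"fp": 128, "int": 128},
    "avx": {"fp": 256, "int": 128},
}

def lanes(isa, kind, element_bits):
    """Elements processed by a single vector op of the given class."""
    return WIDTH_BITS[isa][kind] // element_bits

for isa in ("sse", "avx"):
    print(isa,
          "fp32 lanes:", lanes(isa, "fp", 32),
          "int16 lanes:", lanes(isa, "int", 16))
```

Under that assumption the FP lane count doubles from SSE to AVX while the integer lane count stays put, which is the part that matters for mostly-integer codec code.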