Intel's model is explicitly based on the notion of scalar issue. It says so right in the slides!! It's an integral part of the model.
"First step is to “scalarize” the code. Remember that each “scalar” op is doing 16 things at once"
The dichotomy is scalar versus packed, and that refers to the data layout in the operands, whether they sit in memory or in registers.
In the case of a reg/mem architecture like Larrabee, the two can exist side by side.
It doesn't make a difference if v1 has values that all come from the same primitive, and then that register is scaled by some arbitrary value.
It's one primitive, not 16, but the ALUs and issue logic do not care.
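To put that in code, here is a toy Python sketch. The names `vmul_scalar`, `soa_x` and `aos_matrix` are mine, not from the Intel slides, and the list-of-floats "register" is only a stand-in for a 16-wide vector register. The point is that the same single operation runs unchanged whether the 16 lanes come from 16 primitives or from one.

```python
LANES = 16  # width of the hypothetical vector register

def vmul_scalar(reg, s):
    """One vector instruction: multiply all 16 lanes by a broadcast scalar."""
    assert len(reg) == LANES
    return [lane * s for lane in reg]

# SOA ("scalar") layout: lane i holds the x component of primitive i.
soa_x = [float(i) for i in range(LANES)]

# AOS (packed) layout: all 16 lanes belong to ONE primitive,
# here a 4x4 identity matrix stored row by row.
aos_matrix = [1.0 if r == c else 0.0 for r in range(4) for c in range(4)]

# The "ALU" is called the same way in both cases; it never inspects
# where the lane values came from.
scaled_soa = vmul_scalar(soa_x, 2.0)
scaled_aos = vmul_scalar(aos_matrix, 2.0)
```

Nothing in `vmul_scalar` knows or cares which layout it was handed.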
Perhaps, but you're still using AOS to refer to data layout. The LRB presentation doesn't concern itself with that. (We are still talking about the LRB presentation right?)
I find it hard to believe AOS and SOA don't apply to data layout when the presentation goes through all those diagrams showing how the data is laid out in the registers.
I'm not doing that. Intel did. Don't get hung up on pedantic definitions of the terms. Notice how they make an explicit differentiation between the execution and the format?
VLIW is a specific term that applies to a specific realm. I don't think it's pedantic to say that VLIW (which applies to how instructions are organized, decoded, and issued) and SOA or AOS (which apply to how the data is handled) are orthogonal to one another.
There's nothing inherent to a design that uses VLIW instruction issue that keeps it from running a program that handles data in SOA or AOS format.
"SOA or “scalar”: a register holds XXXX" - Execution
"And the data is usually not in an SOA-friendly format" - Storage
Slides 9, 10, and 11 are explicitly demonstrating that even data stored as AOS is processed as SOA.
Can't get much clearer than this:
No, it says that it can be done that way.
Slide 6 outlined two areas where it is appropriate to use AOS packed registers.
Instruction issue in either AOS or SOA programming scenarios is the same for Larrabee.
Barring implementation-related wrinkles for RV770 unrelated to it being VLIW, the same can be said for the GPU SIMD.
"Data is usually not in a friendly format! Larrabee adds scatter/gather support for reformatting"
"Allows 90% of our code to use this mode. Very little use of AOS modes"
Larrabee doesn't have a completely separate instruction issue mode for the 10% of code that can't use SOA.
It issues instructions in exactly the same way: scalar issue, one instruction per clock. The operand registers just happen to be 16-wide SIMD registers, and the data in them just happens to be laid out in SOA format.
edit: (issue is scalar for vector math because there is only one vector unit disclosed)
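A rough sketch of what I mean, as a toy model in Python. `gather`, `run` and the register names are hypothetical, and this is not Larrabee's actual ISA or its real scatter/gather instructions. The issue loop hands out one vector instruction per clock and is identical whether the registers were filled SOA-style (via a strided gather from AOS memory) or loaded straight as packed AOS data.

```python
import operator

LANES = 16

# AOS memory: 16 vertices stored as x, y, z, w back to back.
memory = [float(v * 4 + c) for v in range(LANES) for c in range(4)]

def gather(base_offset, stride):
    """Gather one component of 16 consecutive vertices into one register."""
    return [memory[base_offset + lane * stride] for lane in range(LANES)]

def run(program, regs):
    """Issue one vector instruction per clock; return the clock count."""
    clocks = 0
    for dst, op, a, b in program:
        regs[dst] = [op(x, y) for x, y in zip(regs[a], regs[b])]
        clocks += 1
    return clocks

# SOA: v0 holds 16 x values, v1 holds 16 y values.
soa_regs = {"v0": gather(0, 4), "v1": gather(1, 4)}
# AOS: v0 and v1 each hold xyzw of four vertices, loaded contiguously.
aos_regs = {"v0": memory[0:LANES], "v1": memory[0:LANES]}

program = [("v2", operator.add, "v0", "v1")]
soa_clocks = run(program, soa_regs)
aos_clocks = run(program, aos_regs)
```

Both runs take the same single clock in this model; only the meaning of the lanes differs.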
A SIMD in RV770 has superscalar issue of 5 statically compiled instructions once per clock.
Superscalar issue has nothing to do with the relationship or lack thereof between data elements within the operands.
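The same idea as a toy Python model of a 5-slot bundle. `issue_bundle` and the register names are made up for illustration and do not reflect RV770's real instruction encoding. The issuer only checks that the bundle fits in 5 slots, and it behaves the same whether the slots work on the components of one vector or on unrelated scalars.

```python
import operator

SLOTS = 5  # x, y, z, w, t in this hypothetical model

def issue_bundle(bundle, regs):
    """Issue up to 5 statically scheduled ops in one clock.

    Every op reads the register state from before the bundle,
    then all results are written back together.
    """
    assert len(bundle) <= SLOTS
    results = [(dst, op(regs[a], regs[b])) for dst, op, a, b in bundle]
    for dst, value in results:
        regs[dst] = value

# Bundle 1: four slots scale the components of one vec4 (related data).
related = {"px": 1.0, "py": 2.0, "pz": 3.0, "pw": 4.0, "s": 2.0}
issue_bundle(
    [(c, operator.mul, c, "s") for c in ("px", "py", "pz", "pw")],
    related,
)

# Bundle 2: five slots do unrelated scalar work.
unrelated = {"a": 1.0, "b": 5.0, "c": 3.0, "d": 7.0, "e": 2.0}
issue_bundle(
    [
        ("a", operator.add, "a", "b"),
        ("c", operator.mul, "c", "d"),
        ("e", operator.sub, "e", "b"),
        ("d", max, "d", "b"),
        ("b", operator.mul, "b", "b"),
    ],
    unrelated,
)
```

`issue_bundle` never looks at how the operands relate to one another, which is the whole point.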