inefficient
Veteran
It is interesting that the VMX128 has a dot-product instruction.
I wonder what the fastest way to do a dot product on an SPU would be. The straightforward way would be 5 instructions by my estimation, but there must be a faster way.
mul
rotate quad by 8 bytes (2 words)
add
rotate quad by 4 bytes (1 word)
add
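
To make the five steps above concrete, here is a rough sketch in plain portable C that emulates a 4-float quadword with an array. It is not real SPU code: `vec4`, `vmul`, `vadd`, `vrot_words` and `dot4_splat` are made-up names for illustration, and the rotate is expressed in 32-bit words (2 words = 8 bytes, 1 word = 4 bytes). It only shows why mul, rotate, add, rotate, add leaves the full sum in every lane.

```c
/* Emulated 128-bit register holding four floats (hypothetical type). */
typedef struct { float lane[4]; } vec4;

/* Lane-wise multiply: step 1, "mul". */
static vec4 vmul(vec4 a, vec4 b) {
    vec4 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] * b.lane[i];
    return r;
}

/* Lane-wise add: steps 3 and 5, "add". */
static vec4 vadd(vec4 a, vec4 b) {
    vec4 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Rotate the whole quadword left by n 32-bit words: steps 2 and 4. */
static vec4 vrot_words(vec4 a, int n) {
    vec4 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[(i + n) & 3];
    return r;
}

/* Five-operation dot product; the result is replicated in all four lanes. */
static vec4 dot4_splat(vec4 a, vec4 b) {
    vec4 p = vmul(a, b);               /* x*x', y*y', z*z', w*w'        */
    vec4 s = vadd(p, vrot_words(p, 2)); /* lanes hold pairwise sums      */
    return vadd(s, vrot_words(s, 1));   /* every lane holds the full sum */
}
```

For example, with a = (1, 2, 3, 4) and b = (5, 6, 7, 8), every lane of `dot4_splat(a, b)` comes out as 70.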