Actually, I should clarify: what the two CPUs I mentioned do is in fact have 'larger' registers.
I.e. on PS3/360 you would write matrix*vector multiplication something like this
(this is purely symbolic code, so don't complain about syntax or anything):
Code:
permute vec_xxxx, vec
mul result,vec_xxxx,mat01
permute vec_yyyy, vec
madd result,vec_yyyy,mat02
permute vec_zzzz, vec
madd result,vec_zzzz,mat03
permute vec_wwww, vec
madd result,vec_wwww,mat04
Where each operand is a vector register.
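For anyone who wants to see what that sequence actually computes, here is a rough sketch in plain scalar C. It is not real intrinsics for either platform: the names vec4, splat, mul, madd and mul_vec_mat are all made up for illustration, and it assumes mat01..mat04 are the four 4-wide vectors of the matrix, passed in as mat[0]..mat[3].

```c
/* Hypothetical 4-lane vector type standing in for one vector register. */
typedef struct { float x, y, z, w; } vec4;

/* The 'permute' step: broadcast one scalar lane across all four lanes. */
static vec4 splat(float s) {
    vec4 r = { s, s, s, s };
    return r;
}

/* Lane-wise multiply: a * b. */
static vec4 mul(vec4 a, vec4 b) {
    vec4 r = { a.x * b.x, a.y * b.y, a.z * b.z, a.w * b.w };
    return r;
}

/* Lane-wise multiply-add: acc + a * b. */
static vec4 madd(vec4 acc, vec4 a, vec4 b) {
    vec4 r = { acc.x + a.x * b.x, acc.y + a.y * b.y,
               acc.z + a.z * b.z, acc.w + a.w * b.w };
    return r;
}

/* Same four steps as the symbolic listing: one splat plus one
   mul/madd per component of the input vector. */
static vec4 mul_vec_mat(vec4 v, const vec4 mat[4]) {
    vec4 result = mul(splat(v.x), mat[0]);      /* vec_xxxx * mat01 */
    result = madd(result, splat(v.y), mat[1]);  /* + vec_yyyy * mat02 */
    result = madd(result, splat(v.z), mat[2]);  /* + vec_zzzz * mat03 */
    result = madd(result, splat(v.w), mat[3]);  /* + vec_wwww * mat04 */
    return result;
}
```

So the result is just x*mat01 + y*mat02 + z*mat03 + w*mat04, built up one vector register at a time.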
On DC/PSP it would look like
Code:
mulvecmat result,vec,Mat
Where 'result' and 'vec' are vector registers, and Mat is a matrix register.