Performance Analysis


seanyin
12-Sep-2008, 04:52
Hi, I am trying out the spu_timing tool with the following sample:

px = body.px;
py = body.py;
fx = body.fx;
fy = body.fy;
vx = body.vx;
vy = body.vy;
m_inv = body.m_inv;


px = spu_add(px, vx);
py = spu_add(py, vy);

vx = spu_madd(fx, m_inv, vx);
vy = spu_madd(fy, m_inv, vy);


body.px = px;
body.py = py;
body.vx = vx;
body.vy = vy;

I was expecting that the compiler would generate well-separated load, calculation, and store assembly code. But the result has those instructions mixed as follows:

lqd $2,256($sp)
stqd $2,432($sp)
lqd $4,272($sp)
stqd $4,416($sp)
lqd $4,320($sp)
lqd $5,336($sp)
lqd $6,288($sp)
stqd $6,464($sp)
lqd $7,304($sp)
stqd $7,448($sp)
lqd $3,352($sp)
nop
lqd $6,432($sp)
lqd $7,464($sp)
fa $2,$6,$7
stqd $2,432($sp)
lqd $6,416($sp)
lqd $7,448($sp)
fa $2,$6,$7
stqd $2,416($sp)
lqd $6,464($sp)
fma $2,$4,$3,$6
stqd $2,464($sp)
lqd $7,448($sp)
fma $2,$5,$3,$7
stqd $2,448($sp)
nop
lqd $2,432($sp)
stqd $2,256($sp)
lqd $4,416($sp)
stqd $4,272($sp)
lqd $6,464($sp)
stqd $6,288($sp)
lqd $7,448($sp)
stqd $7,304($sp)
lqd $2,224($sp)
ai $4,$2,1
lqd $2,224($sp)
cwd $3,0($sp)
shufb $2,$4,$2,$3
stqd $2,224($sp)


Could anyone enlighten me on why I got this kind of result? Thanks!
By the way, I'm using -O0.

Vitaly Vidmirov
12-Sep-2008, 10:09
> Could anyone enlighten me on why I got this kind of result? Thanks!
> By the way, I'm using -O0.

You just answered your own question.

seanyin
15-Sep-2008, 02:32
Thanks for your reply. I understand that -O1 would reorder the instructions as expected.
I am just wondering whether the same is possible without any compiler optimization.

Vitaly Vidmirov
15-Sep-2008, 16:43
Instruction reordering is an optimization.
You either use it or not.
-O0 must guarantee correct execution and generate easy-to-debug code.
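
To make the point above concrete, here is a minimal, portable sketch of the same update step. It is an illustration only: vec4, vec_add, vec_madd, body_t and integrate are hypothetical stand-ins (not part of the original post) that emulate the SPU vector type and the spu_add / spu_madd intrinsics with plain C, so the file builds with any C compiler. At -O0 a compiler typically keeps every local variable in its own stack slot and reloads it for each statement, which is consistent with the repeated lqd / fa / stqd groups in the listing; with optimization enabled the locals can stay in registers and the loads, arithmetic and stores can be scheduled freely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the SPU "vector float" type (4 floats). */
typedef struct { float v[4]; } vec4;

/* Hypothetical stand-in for spu_add: element-wise a + b. */
static vec4 vec_add(vec4 a, vec4 b)
{
    vec4 r;
    for (int i = 0; i < 4; ++i)
        r.v[i] = a.v[i] + b.v[i];
    return r;
}

/* Hypothetical stand-in for spu_madd: element-wise a * b + c. */
static vec4 vec_madd(vec4 a, vec4 b, vec4 c)
{
    vec4 r;
    for (int i = 0; i < 4; ++i)
        r.v[i] = a.v[i] * b.v[i] + c.v[i];
    return r;
}

/* Assumed layout of the "body" object from the original sample. */
typedef struct { vec4 px, py, vx, vy, fx, fy, m_inv; } body_t;

/* Same three phases as the original sample: load, compute, store.
 * The source order does not dictate the instruction order; without
 * optimization each statement is usually translated on its own. */
static void integrate(body_t *body)
{
    assert(body != NULL);

    /* Load phase: copy the fields into locals. */
    vec4 px = body->px, py = body->py;
    vec4 vx = body->vx, vy = body->vy;
    vec4 fx = body->fx, fy = body->fy;
    vec4 m_inv = body->m_inv;

    /* Compute phase: position uses the old velocity, then velocity is updated. */
    px = vec_add(px, vx);
    py = vec_add(py, vy);
    vx = vec_madd(fx, m_inv, vx);
    vy = vec_madd(fy, m_inv, vy);

    /* Store phase: write the results back. */
    body->px = px;
    body->py = py;
    body->vx = vx;
    body->vy = vy;
}
```

Comparing the assembly generated for integrate at -O0 and at -O1 (for example with a GCC-style -S switch, assuming the toolchain in use supports it) should show the difference in how the loads, arithmetic and stores are grouped.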