Performance Analysis

Discussion in 'CellPerformance@B3D' started by seanyin, Sep 12, 2008.

  1. seanyin

    Newcomer

    Joined:
    May 8, 2008
    Messages:
    4
    Likes Received:
    0
    Hi, I am trying out the spu_timing tool with the following sample:

    Code:
        /* load */
        px    = body.px;
        py    = body.py;
        fx    = body.fx;
        fy    = body.fy;
        vx    = body.vx;
        vy    = body.vy;
        m_inv = body.m_inv;

        /* compute */
        px = spu_add(px, vx);
        py = spu_add(py, vy);

        vx = spu_madd(fx, m_inv, vx);
        vy = spu_madd(fy, m_inv, vy);

        /* store */
        body.px = px;
        body.py = py;
        body.vx = vx;
        body.vy = vy;
    I expected the compiler to generate cleanly separated load, compute, and store code, but the resulting assembly interleaves those instructions:

    Code:
    lqd	$2,256($sp)
    stqd	$2,432($sp)
    lqd	$4,272($sp)
    stqd	$4,416($sp)
    lqd	$4,320($sp)
    lqd	$5,336($sp)
    lqd	$6,288($sp)
    stqd	$6,464($sp)
    lqd	$7,304($sp)
    stqd	$7,448($sp)
    lqd	$3,352($sp)
    nop
    lqd	$6,432($sp)
    lqd	$7,464($sp)
    fa	$2,$6,$7
    stqd	$2,432($sp)
    lqd	$6,416($sp)
    lqd	$7,448($sp)
    fa	$2,$6,$7
    stqd	$2,416($sp)
    lqd	$6,464($sp)
    fma	$2,$4,$3,$6
    stqd	$2,464($sp)
    lqd	$7,448($sp)
    fma	$2,$5,$3,$7
    stqd	$2,448($sp)
    nop
    lqd	$2,432($sp)
    stqd	$2,256($sp)
    lqd	$4,416($sp)
    stqd	$4,272($sp)
    lqd	$6,464($sp)
    stqd	$6,288($sp)
    lqd	$7,448($sp)
    stqd	$7,304($sp)
    lqd	$2,224($sp)
    ai	$4,$2,1
    lqd	$2,224($sp)
    cwd	$3,0($sp)
    shufb	$2,$4,$2,$3
    stqd	$2,224($sp)
    
    Could anyone enlighten me on why I got this kind of result? Thanks!
    By the way, I'm using -O0.
     
  2. Vitaly Vidmirov

    Newcomer

    Joined:
    Jul 9, 2007
    Messages:
    108
    Likes Received:
    10
    Location:
    Russia
    Could anyone enlighten me on why I got this kind of result? Thanks!
    By the way, I'm using -O0.

    You just answered your own question.
     
  3. seanyin

    Newcomer

    Joined:
    May 8, 2008
    Messages:
    4
    Likes Received:
    0
    Thanks for your reply. I understand that -O1 would reorder the instructions as expected.
    I am just wondering whether it is possible to get that ordering without any compiler optimization.
     
  4. Vitaly Vidmirov

    Newcomer

    Joined:
    Jul 9, 2007
    Messages:
    108
    Likes Received:
    10
    Location:
    Russia
    Instruction reordering is an optimization.
    You either use it or you don't.
    -O0 must guarantee correct execution and generate easy-to-debug code.
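
To make the point above concrete: at -O0, GCC keeps every local variable in its own stack slot and compiles each statement in isolation, so each statement loads its operands, computes, and stores the result straight back. That is why every fa/fma in the listing is wrapped in lqd/stqd pairs. The following is a minimal portable sketch of the same update, not the original poster's code: `vec4`, `Body`, `v_add`, `v_madd`, and `integrate` are hypothetical stand-ins for the SPU vector type and the `spu_add`/`spu_madd` intrinsics, so that it builds with any C99 compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SPU vec_float4: four float lanes. */
typedef struct { float v[4]; } vec4;

/* Hypothetical layout for the thread's `body`; field names follow the post. */
typedef struct { vec4 px, py, vx, vy, fx, fy, m_inv; } Body;

/* Lane-wise a + b (portable analogue of spu_add). */
static vec4 v_add(vec4 a, vec4 b) {
    vec4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
    return r;
}

/* Lane-wise a * b + c (portable analogue of spu_madd). */
static vec4 v_madd(vec4 a, vec4 b, vec4 c) {
    vec4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] * b.v[i] + c.v[i];
    return r;
}

void integrate(Body *body) {
    assert(body != NULL);

    /* Load phase: at -O0 each of these locals gets its own stack slot,
       so every assignment is a load followed by a store. */
    vec4 px = body->px, py = body->py;
    vec4 vx = body->vx, vy = body->vy;
    vec4 fx = body->fx, fy = body->fy;
    vec4 m_inv = body->m_inv;

    /* Compute phase: without optimization, operands are reloaded from the
       stack before each operation and the result is stored right after. */
    px = v_add(px, vx);
    py = v_add(py, vy);
    vx = v_madd(fx, m_inv, vx);
    vy = v_madd(fy, m_inv, vy);

    /* Store phase: copy the locals back into the structure. */
    body->px = px;
    body->py = py;
    body->vx = vx;
    body->vy = vy;
}
```

Compiling a file like this with `-O0 -S` and again with `-O2 -S` and comparing the two assembly outputs should show the difference: the unoptimized build spills and reloads around every operation, while the optimized build can keep values in registers and let the scheduler group the loads, arithmetic, and stores.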
     