PS2.0 and instruction co-issue

Colourless

Monochrome wench
Veteran
Got a few questions about instruction co-issue on the R300 when using PS2.0. Since you can't explicitly co-issue instructions when using PS2.0, I'm guessing the drivers will optimize the 'trivial' cases for co-issue, such as
Code:
mul r1.rgb, r2, r3
add r4.a, r5, r6

Now, I'm wondering when exactly co-issue does occur. Does it occur between any vector and scalar instruction, regardless of the write masking? So does this co-issue?
Code:
mul r1.rba, r2, r3
add r4.g, r5, r6

If it does, then does replication make a difference? For example, would the following co-issue?
Code:
mul r1.rba, r2.g, r3.a
add r4.g, r5.r, r6.a

Finally, if that does co-issue, can the vector pipe be used as a second pseudo-scalar pipe, so that the following would co-issue?
Code:
mul r1.r, r2.g, r3.a
add r4.g, r5.r, r6.a

If no one knows, I'll probably investigate this myself. While it's not important, it could still be something useful to know when hand optimizing pixel shaders.
 
ATI say:

- You do not need to pair instructions. The shader optimizer will do this for you.

- You can use different registers for the color and alpha part.

- If you use the alpha part of a register in a color instruction, the instruction is not pairable.

- If you include alpha in the write mask of a color instruction, the instruction is not pairable.

- If you use anything other than the alpha part of a register in an alpha/scalar instruction, the instruction is not pairable.

- If you use anything other than alpha in the write mask of an alpha operation, the instruction is not pairable.

- RCP, RSQ, EXP and LOG always run on the scalar pipe. If you do not restrict them to the alpha part of a register (write mask included), they will use the color pipe too.

To make a long story short: every time you transfer register parts from one pipe to the other (RGB -> Scalar-Pipe -> RGB or A -> Color-Pipe -> A), the instruction needs both pipes to execute.
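
For anyone who wants to play with these rules, here is a rough Python sketch that applies them literally to the snippets from the first post. It is only a toy model, not ATI's optimizer: the names `pipes_needed` and `can_pair` are made up for illustration, it assumes a source without a swizzle reads the channels named in the write mask, it assumes a short swizzle repeats its last component, and it ignores data dependencies between the two instructions.

```python
ORDER = "rgba"
TO_RGBA = str.maketrans("xyzw", "rgba")       # treat .xyzw like .rgba
SCALAR_ONLY_OPS = {"rcp", "rsq", "exp", "log"}  # always on the scalar pipe


def parse_operand(text):
    """Split 'r1.rgb' into ('r1', 'rgb'); no suffix means all four channels."""
    reg, _, comps = text.strip().partition(".")
    return reg, comps.translate(TO_RGBA) or ORDER


def pipes_needed(line):
    """Return the pipes ('color', 'scalar') one instruction occupies (toy model)."""
    op, rest = line.split(None, 1)
    (_, mask), *sources = [parse_operand(t) for t in rest.split(",")]
    touched = set(mask)                        # channels written
    for _, swz in sources:
        # Assumption: a short swizzle such as .g repeats its last component.
        swz = swz + swz[-1] * (4 - len(swz))
        # Only the source channels feeding written channels are read.
        touched |= {swz[ORDER.index(ch)] for ch in mask}
    pipes = set()
    if touched & set("rgb"):
        pipes.add("color")
    if "a" in touched or op.lower() in SCALAR_ONLY_OPS:
        pipes.add("scalar")
    return pipes


def can_pair(first, second):
    """True if each instruction stays in one pipe and the two pipes differ."""
    a, b = pipes_needed(first), pipes_needed(second)
    return len(a) == 1 and len(b) == 1 and a != b


if __name__ == "__main__":
    print(can_pair("mul r1.rgb, r2, r3", "add r4.a, r5, r6"))
    print(can_pair("mul r1.rba, r2, r3", "add r4.g, r5, r6"))
    print(can_pair("mul r1.r, r2.g, r3.a", "add r4.g, r5.r, r6.a"))
```

Under this literal reading only the first pair from the original post is reported as pairable. What the real driver does with the other cases may well differ, since the optimizer can reorder and remap registers.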
 
RADEON 9500/9700 chips have dual-pipe pixel shader units, which operate as two relatively independent engines performing calculations on the different entities. One engine operates on 3D vectors or RGB-colors and the other on scalar or alpha values. This means that in most cases two instructions, one operating on the color and another operating on alpha can be performed at the same time. Such architecture provides a perfect opportunity for optimizing shaders by splitting the computational workload between pipes and thus resulting in up to twofold speedup. Careful examination of a shader for splitting the workload between the pipes should focus on a couple of things – identifying computations that can be executed only in one pipe (vector or scalar) and balancing number of instructions in each pipe. Sometimes scalar or alpha computations can be executed in the color pipe and the other way around, the color computations can be executed in the alpha pipe.
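
To put a number on "balancing the number of instructions in each pipe", here is a back-of-the-envelope Python sketch. The function name `estimate_slots` is hypothetical, and the model is an idealised best case: it assumes the optimizer can pair any color-only instruction with any scalar-only one, and it ignores dependencies and everything else the hardware does.

```python
def estimate_slots(pipe_sets):
    """Best-case issue slots for a list of per-instruction pipe sets.

    Each entry is {"color"}, {"scalar"} or {"color", "scalar"}.
    Instructions needing both pipes take a slot each; single-pipe
    instructions are assumed to pair up perfectly (an idealisation).
    """
    color_only = sum(1 for p in pipe_sets if p == {"color"})
    scalar_only = sum(1 for p in pipe_sets if p == {"scalar"})
    both = len(pipe_sets) - color_only - scalar_only
    return both + max(color_only, scalar_only)


if __name__ == "__main__":
    balanced = [{"color"}] * 4 + [{"scalar"}] * 4
    lopsided = [{"color"}] * 6 + [{"scalar"}] * 2
    print(estimate_slots(balanced))   # 8 instructions, evenly split
    print(estimate_slots(lopsided))   # 8 instructions, mostly color
```

With eight instructions split evenly, the estimate is four slots, which is where the "up to twofold" figure comes from. With six color and two scalar instructions it is six slots, so the gain shrinks as the split gets more lopsided.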

Explicit instruction co-issue in pixel shaders is available only in the older shader models. However, this does not mean that the benefits of instruction pairing can be enjoyed only in the older pixel shader models. On the contrary, the full benefit of instruction co-issue can be achieved in 2.0 pixel shaders with some clever shader programming. In 2.0 pixel shader model, write masks can be used to implicitly indicate opportunity for instruction pairing. The shader optimizer in RADEON 9500/9700 drivers will look for write masks to determine which pipe should execute instruction and will try reordering and coissuing instructions.

There are some nuances the shader developers have to be aware of when optimizing shaders for instruction co-issue. The color and alpha parts of the instruction pair can reference different registers, however attempting to access alpha values in color instruction or to access color values in alpha instructions might break co-issue. This also applies to .ABGR or .WZYX swizzles available in 2.0 shaders as they force data to cross vector and scalar pipes.

Another important fact is that RCP, RSQ, EXP and LOG instructions are always executed in the scalar pipe. For that reason it is better to always use scalar arguments and destinations (.W or .A) when using these instructions. This will ensure the vector pipe is available for co-issue with these instructions.

http://mirror.ati.com/developer/dx9/ATI-DX9_Optimization.pdf
 
Demirug said:
Every time you transfer register parts from one pipe to the other (RGB -> Scalar-Pipe -> RGB or A -> Color-Pipe -> A), the instruction needs both pipes to execute.

attempting to access alpha values in color instruction or to access color values in alpha instructions might break co-issue.

I read it as it might need both pipes.
 
Ah OK, looks like all my questions have been answered, for better or, as is the case here, for worse.

Thanks for the answers.
 
Hyp-X said:
Demirug said:
Every time you transfer register parts from one pipe to the other (RGB -> Scalar-Pipe -> RGB or A -> Color-Pipe -> A), the instruction needs both pipes to execute.

attempting to access alpha values in color instruction or to access color values in alpha instructions might break co-issue.

I read it as it might need both pipes.

And you are right.
 
In reply to the original questions: actually, I think that apart from the first, the others are just about identical.

As to whether they co-issue: if the destination was alpha, the answer would definitely be yes. With the destination not being alpha, it's a little fuzzier; it depends on the surrounding code and how scalar operands get used. Try to use alpha if you can, and if you can't, just let the optimiser (which is getting smarter every day) worry about it.
 