Shader swizzles and write-mask confusion

Discussion in 'Tools and Software' started by Ethatron, Apr 9, 2011.

  1. Ethatron

    Regular Subscriber

    Joined:
    Jan 24, 2010
    Messages:
    922
    Likes Received:
    361
    I found the shader-assembly reference not quite clear on exactly where the difference between lane masking and assignment applies. I have some code like this:

    add r7.yz, r7.x, c2.xxyw

    // lanes:
    add r7.x, r7.x, c2.x -> masked
    add r7.y, r7.x, c2.x
    add r7.z, r7.x, c2.y
    add r7.w, r7.x, c2.w -> masked

    I assume r7.x is recognized as scalar, and internally put into this form:

    add r7.yz, r7.xxxx, c2.xxyw

    But what about this?:

    add r7.yz, r7.xy, c2.xxyw // illegal?

    Will it be:

    add r7.yz, r7.xyyy, c2.xxyw

    or is it even illegal syntax?

    Thanks for any clarification.
     
  2. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    With swizzles, the last given component is replicated across the remaining channels. Thus "xy" is equivalent to "xyyy", "wz" is equivalent to "wzzz", etc.

    Write masks are just that, masks, so you can't change the order of their components.
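
    The replication rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post, not the assembler's actual code; the helper name expand_swizzle is hypothetical.

```python
def expand_swizzle(swizzle: str) -> str:
    """Pad a 1-4 component swizzle to four components.

    Assumption (from the rule stated in the thread): the last component
    given is repeated across the remaining channels.
    """
    if not 1 <= len(swizzle) <= 4 or any(c not in "xyzw" for c in swizzle):
        raise ValueError(f"invalid swizzle: {swizzle!r}")
    return swizzle + swizzle[-1] * (4 - len(swizzle))


print(expand_swizzle("x"))   # xxxx
print(expand_swizzle("xy"))  # xyyy
print(expand_swizzle("wz"))  # wzzz
```

    Under this rule, r7.x in the first post reads as r7.xxxx and r7.xy reads as r7.xyyy.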
     
  3. Ethatron

    Regular Subscriber

    Joined:
    Jan 24, 2010
    Messages:
    922
    Likes Received:
    361
    The opcodes only have scalar or 4-vector forms; even dot3 is given 4-vector parameters. Do you know that .xxy is extended with .y by deduction, or because you have assembler documentation that states it explicitly?

    The vagueness, or rather incompleteness, in MS' documentation crops up once you try to turn assembly into expressions:

    add r0.xw, r1.xyyy, r2.xywz

    which could have been:

    r0.xw = r1.xy[yz] + r2.xy[wz] // naive swizzle, r1&r2 are truncated float2
    r0.xw = r1.xy + r2.xy // then demultiplexed

    which is wrong; it must be:

    r0.x??w = r1.x[yy]z + r2.x[yw]z // lane masked swizzle
    r0.xw = r1.xz + r2.xz // then demultiplexed

    This distinction between the ways asm and HLSL treat what is basically the same expression (asm isn't that different an expression) is nowhere to be found explained; or maybe it is just hard to come up with a sensible search term for it, if there are explanations in people's blogs.
    Maybe it only crops up when you go from asm to HLSL; maybe humans never really leverage the swizzle in the first place. I can't even tell how I got the idea that the naive form is wrong without ever executing the translated code (this is my first hard contact with shader assembly, HLSL, and DX as a whole, BTW).
     
  4. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    I know this because I've worked on this stuff for 9 years ;) I can't recall where the documentation is, but I assume the D3D9 docs have it somewhere.

    There is no ambiguity here. DX asm code is inherently a vector-based language. Thus, write masks are *always* masks and not swizzles.

    Your corrected version is still wrong. How can r1.xyyy be interpreted as r1.xz?!
    Code:
    add r0.xw, r1.xyyy, r2.xywz
    
    is equivalent to
    Code:
    r0.x__w = r1.xy + r2.xz
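
    A minimal Python sketch of that equivalence, assuming the semantics described in this thread (sources are swizzled per lane, the write mask only selects which destination lanes get written). The function names and register values are hypothetical and chosen for illustration only.

```python
LANES = "xyzw"


def expand_swizzle(swizzle):
    # Assumption from the thread: the last component is replicated.
    return swizzle + swizzle[-1] * (4 - len(swizzle))


def read_source(reg, swizzle):
    """Return the four lane values that a swizzled source operand supplies."""
    return [reg[LANES.index(c)] for c in expand_swizzle(swizzle)]


def add_masked(dest, mask, a, a_swz, b, b_swz):
    """Emulate `add dest.mask, a.a_swz, b.b_swz` and return the new dest.

    Lane i of the result is a_swizzled[i] + b_swizzled[i] when lane i is
    in the write mask; otherwise the old destination value is kept.
    """
    lhs = read_source(a, a_swz)
    rhs = read_source(b, b_swz)
    return [lhs[i] + rhs[i] if LANES[i] in mask else dest[i] for i in range(4)]


r0 = [0.0, 0.0, 0.0, 0.0]
r1 = [1.0, 2.0, 3.0, 4.0]
r2 = [10.0, 20.0, 30.0, 40.0]

# add r0.xw, r1.xyyy, r2.xywz
result = add_masked(r0, "xw", r1, "xyyy", r2, "xywz")
print(result)  # [11.0, 0.0, 0.0, 32.0]
```

    Lane x receives r1.x + r2.x and lane w receives r1.y + r2.z, which matches r0.x__w = r1.xy + r2.xz; lanes y and z keep their old values.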
    
     
  5. Ethatron

    Regular Subscriber

    Joined:
    Jan 24, 2010
    Messages:
    922
    Likes Received:
    361
    Just a leftover from trying to make the swizzle look more interesting.

    The .x__w syntax is more intuitive, BTW.
     