ATi, Cg and swizzling

Discussion in 'Rendering Technology and APIs' started by Randell, Apr 8, 2003.

  1. Randell

    Randell Senior Daddy
    Veteran

    Joined:
    Feb 14, 2002
    Messages:
    1,869
    Likes Received:
    3
    Location:
    London
    Is it true that one of the issues with Cg and the R300 range is that the cards do not fully follow the ARB specs in relation to swizzling, which is one reason why Cg doesn't compile well to the R300 in OGL? If that is the case, why didn't ATI do this, and is there an alternative route?

    Remember I'm a layman :)
     
  2. andypski

    Regular

    Joined:
    May 20, 2002
    Messages:
    584
    Likes Received:
    28
    Location:
    Santa Clara
    As far as I know there is no problem with the spec and swizzling on R300, so I would have thought that any problem lies elsewhere. I can't say I've looked at this specific case though.
     
  3. Randell

    Randell Senior Daddy
    Veteran

    Joined:
    Feb 14, 2002
    Messages:
    1,869
    Likes Received:
    3
    Location:
    London
    Ok, it was something someone posted on another board - here is what they say.

    'Now, going back to what I said about the output produced by Cg not being 100% compatible with the Radeon series of cards (I speak from an OpenGL pov btw, I don't know about D3D stuff) when compiled with the ARB profile..
    This ISN'T because they are not following the Nvidia Cg specs; it is because they [ATI] haven't completely covered the ARB shader specs, which say that a card should be able to do swizzling in one operation (swizzling being an instruction such as this: mov reg.xyzw reg.xywz, so the z and w components are swapped around during the move instruction), which the 9700 and below can't do.

    In conclusion, the Cg output correctly follows the ARB specs, it's the card which doesn't follow them 100%, so this isn't a case of Nvidia dictating things, it's one of the (few) cases where ATI don't quite do things properly.'

    I thought I'd check with you guys, before I responded :)
     
  4. andypski

    Regular

    Joined:
    May 20, 2002
    Messages:
    584
    Likes Received:
    28
    Location:
    Santa Clara
    My understanding of the ARB fragment program spec (based on reading the language syntax) is that swizzling any register requires a special swizzle instruction - you can't arbitrarily specify a swizzle on each source register as you can in vertex programs. I can't claim to be an expert on this, though, so don't quote me. ;)
     
  5. Randell

    Randell Senior Daddy
    Veteran

    Joined:
    Feb 14, 2002
    Messages:
    1,869
    Likes Received:
    3
    Location:
    London
    Ok, now you are losing me. After nVidia claimed the R300 couldn't perform swizzling, I understood that the R300 can 'swizzle' but cannot perform 'arbitrary swizzling'. So are you saying the ARB specs don't allow for arbitrary swizzling anyway?

    Either way, do his comments have the ring of truth?
     
  6. arjan de lumens

    Veteran

    Joined:
    Feb 10, 2002
    Messages:
    1,274
    Likes Received:
    50
    Location:
    gjethus, Norway
    The way I read the ARB extension, complete rgba/xyzw swizzles are available for all source arguments to all instructions that take vector arguments. There is also a special instruction "SWZ" that can include the numeric constants 0.0 and 1.0 into the swizzle as well.

    Dunno what limitations the ATI architecture has, though.
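
To make the two forms described above concrete, here is a minimal Python sketch of how a per-operand swizzle and an extended swizzle with the constants 0 and 1 behave on a 4-component register. It only models the semantics as read from the posts in this thread; the function names `swizzle` and `swz` are made up for illustration and are not part of any real API, and the optional minus sign on a selector is an assumption about how the extended form handles negation.

```python
# Sketch of ARB-style source swizzles on 4-component registers.
# Function names are illustrative only, not a real API.

INDEX = {"x": 0, "y": 1, "z": 2, "w": 3,
         "r": 0, "g": 1, "b": 2, "a": 3}


def swizzle(vec, pattern):
    """Return vec with components reordered or replicated per pattern.

    A one-letter pattern such as "x" is treated as "xxxx".
    """
    if len(pattern) == 1:
        pattern = pattern * 4
    if len(pattern) != 4:
        raise ValueError("pattern must have 1 or 4 components")
    return tuple(vec[INDEX[c]] for c in pattern)


def swz(vec, selectors):
    """Model of an extended swizzle: each selector is one of
    x, y, z, w, 0 or 1, optionally prefixed with '-' (assumed) to negate."""
    out = []
    for sel in selectors:
        sign = -1.0 if sel.startswith("-") else 1.0
        name = sel.lstrip("+-")
        if name == "0":
            value = 0.0
        elif name == "1":
            value = 1.0
        else:
            value = vec[INDEX[name]]
        out.append(sign * value)
    return tuple(out)


reg = (1.0, 2.0, 3.0, 4.0)
print(swizzle(reg, "xywz"))             # z and w swapped, as in the quoted mov
print(swz(reg, ["x", "-y", "0", "1"]))  # constants mixed into the result
```

Keeping the plain swizzle and the extended one as separate functions mirrors the split between ordinary source operands and the dedicated SWZ instruction.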
     
  7. andypski

    Regular

    Joined:
    May 20, 2002
    Messages:
    584
    Likes Received:
    28
    Location:
    Santa Clara
    I'm not sure - I've just read it again and it looks like arbitrary swizzling is in the ARB spec.

    I think I'll have to plead ignorance - I haven't really looked at the ARB fragment shader spec before.
     
  8. Randell

    Randell Senior Daddy
    Veteran

    Joined:
    Feb 14, 2002
    Messages:
    1,869
    Likes Received:
    3
    Location:
    London
    Np andypski, thanks for answering.

    shame to see Man U lose wasn't it :twisted:
     
  9. andypski

    Regular

    Joined:
    May 20, 2002
    Messages:
    584
    Likes Received:
    28
    Location:
    Santa Clara
    I'm kind of torn - I always feel that I should want English teams to do well in European competitions, but it's always hardest when it's ManU. :)

    I had no problem supporting the barcodes at all (while they were still in it).
     
  10. Randell

    Randell Senior Daddy
    Veteran

    Joined:
    Feb 14, 2002
    Messages:
    1,869
    Likes Received:
    3
    Location:
    London
    I'm always torn, until I see them losing, then a sort of joy overtakes me :)
     
  11. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    Yes, full swizzling is available in ARB_fragment_program. However, the R300 does not support all possible swizzles in hardware, so sometimes more than one instruction is needed. I'm not entirely sure which swizzles it can't do, but I think it can propagate or reorder components, that is, combinations such as x/y/z/xyz/yzx/xzy etc, but I don't think it can do xxy/xyy/yzz etc, so these will require additional instructions. This should work fine though, as long as the shader stays within the hardware instruction count.
     
  12. Randell

    Randell Senior Daddy
    Veteran

    Joined:
    Feb 14, 2002
    Messages:
    1,869
    Likes Received:
    3
    Location:
    London
    So Cg not compiling R300 code properly using the ARB path is easily fixed - if ATI wanted to? Does Cg assume it can do a kind of swizzling which the R300 can't do then?
     
  13. LeStoffer

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,253
    Likes Received:
    13
    Location:
    Land of the 25% VAT
    :arrow: Bias meter: Real Madrid[x- - - - | - - - - ]Man United

    Absolutely brilliant game (even if you don't count the result in). 8)
     
  14. Hyp-X

    Hyp-X Irregular
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,170
    Likes Received:
    5
    Well if PS2.0 matches the R300 abilities then the following swizzles are available:

    Code:
    xyzw  yzxw  zxyw  wzyx
    xxxx  yyyy  zzzz  wwww
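
As a rough illustration of how small that set is, the sketch below checks a 4-component swizzle against the eight patterns listed above and counts how many of the 256 possible patterns fall outside it. The set is taken on the post's own assumption that PS2.0 matches the R300; `NATIVE` and `is_native` are hypothetical names, not anything from a driver or SDK.

```python
from itertools import product

# The eight swizzles listed above, assumed to be the natively supported set.
NATIVE = {"xyzw", "yzxw", "zxyw", "wzyx", "xxxx", "yyyy", "zzzz", "wwww"}


def is_native(pattern):
    """True if the 4-component swizzle needs no extra instructions."""
    return pattern in NATIVE


# Every possible 4-component swizzle over x, y, z, w.
ALL_PATTERNS = ["".join(p) for p in product("xyzw", repeat=4)]
needs_expansion = [p for p in ALL_PATTERNS if not is_native(p)]

print(len(ALL_PATTERNS), "patterns in total")
print(len(needs_expansion), "would need extra instructions")
print("xywz native?", is_native("xywz"))
```

Under this assumption the `xywz` swap quoted earlier in the thread is one of the patterns that would have to be expanded.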
     
  15. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    To my knowledge the driver already supports arbitrary swizzling. At least it should, since ARB_fragment_program requires it; it will just take more instructions. For instance,

    MUL a, b, c.xxy;

    would be expanded to something like

    MOV temp, c.x;
    MOV temp.z, c.y;
    MUL a, b.xyz, temp;

    Unless the instruction count overflows the 64 ALU instruction limit this should work fine.
     
  16. Xmas

    Xmas Porous
    Veteran Subscriber

    Joined:
    Feb 6, 2002
    Messages:
    3,298
    Likes Received:
    137
    Location:
    On the path to wisdom
    Yes, it is possible to recreate every possible swizzle by combining those eight patterns. Unfortunately this might be costly in terms of temp register usage and instruction slots.
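
One way to see this is a small Python sketch that splits any 4-component swizzle into write-masked moves, each using only one of the eight patterns listed earlier in the thread, and then replays those moves to confirm the result. This is only an illustration of the idea, not how any real driver or compiler does it; the helper names `decompose` and `run_moves` are invented here, and the brute-force search is chosen for clarity rather than speed.

```python
from itertools import combinations

# The eight patterns from the earlier post, assumed native.
NATIVE = ["xyzw", "yzxw", "zxyw", "wzyx", "xxxx", "yyyy", "zzzz", "wwww"]
LANES = "xyzw"


def decompose(pattern):
    """Split a 4-component swizzle into the fewest (write_mask, native_swizzle)
    moves, found by trying 1, 2, 3 and then 4 native patterns."""
    for count in range(1, 5):
        for chosen in combinations(NATIVE, count):
            remaining = set(range(4))
            moves = []
            for native in chosen:
                # Lanes this native pattern fills with the wanted component.
                lanes = sorted(i for i in remaining if native[i] == pattern[i])
                if lanes:
                    moves.append(("".join(LANES[i] for i in lanes), native))
                    remaining -= set(lanes)
            if not remaining:
                return moves
    raise ValueError("unreachable: replicate swizzles cover every lane")


def run_moves(src, moves):
    """Apply the masked moves to a temp register and return it."""
    temp = [None] * 4
    for mask, native in moves:
        for lane in mask:
            i = LANES.index(lane)
            temp[i] = src[LANES.index(native[i])]
    return tuple(temp)


for wanted in ("yzxw", "xxyy", "xywz"):
    print(wanted, "->", decompose(wanted))
```

Each move costs an instruction slot and the result occupies a temp register, which is where the cost mentioned above comes from.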

    Another weak point is that an "optimized" algorithm with clever swizzling where appropriate may become significantly slower than a non-optimized version on limited swizzle hardware. But that's another reason why you usually should use scalar ops where appropriate and leave the optimization to the compiler.

    btw, that reminds me of the mandelbrot shader. Humus, are you going to update your article/demo with the faster versions? :wink:
     
  17. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    Gah, I had forgotten all about that. Will have to check that out, the deadline is getting close.
     
  18. Hyp-X

    Hyp-X Irregular
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,170
    Likes Received:
    5
    Are you sure about this?
    The ARB_fragment_program spec does not describe 3 component swizzles.
    Assuming the behaviour is the same as in DX9 then it should be expanded as:

    MOV temp.xy, c.x;
    MOV temp.zw, c.y;
    MUL a, b, temp;
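
A quick way to check that this three-instruction sequence matches the single instruction is to simulate both on sample values. The Python sketch below does that, assuming (as the post does) that `c.xxy` is read DX9-style as `c.xxyy`; the helpers `read` and `mov` and the register contents are made up for the example.

```python
LANES = "xyzw"


def read(reg, swz="xyzw"):
    """Read a register through a swizzle; a single letter is replicated."""
    if len(swz) == 1:
        swz = swz * 4
    return [reg[LANES.index(c)] for c in swz]


def mov(dst, mask, value):
    """Write only the lanes named in mask, leaving the others untouched."""
    for lane in mask:
        i = LANES.index(lane)
        dst[i] = value[i]


# Arbitrary sample register contents.
b = [2.0, 3.0, 5.0, 7.0]
c = [10.0, 20.0, 30.0, 40.0]

# Single-instruction form, with c.xxy assumed to mean c.xxyy.
direct = [bi * ci for bi, ci in zip(b, read(c, "xxyy"))]

# Expanded form using only replicate swizzles and write masks.
temp = [0.0] * 4
mov(temp, "xy", read(c, "x"))   # MOV temp.xy, c.x;
mov(temp, "zw", read(c, "y"))   # MOV temp.zw, c.y;
expanded = [bi * ti for bi, ti in zip(b, temp)]  # MUL a, b, temp;

print(direct == expanded)
```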
     
  19. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    Well ... seems like I've worked too much in HLSLs. :) Haven't written a single line of assembly shader code in the last two months.
     
  20. Rockman

    Newcomer

    Joined:
    May 21, 2003
    Messages:
    40
    Likes Received:
    0
    How well does RenderMonkey work and where can I find a manual for it?
     