Static vs Dynamic loops on X1K

Discussion in 'Architecture and Products' started by rwolf, Oct 13, 2005.

  1. rwolf

    rwolf Rock Star
    Regular

    Joined:
    Oct 25, 2002
    Messages:
    968
    Likes Received:
    54
    Location:
    Canada
    Correct me if I am wrong here...

    I understand that with static loops the driver unrolls the shader code. Given the performance of dynamic loops on X1K hardware, would it make more sense to replace the static loop with a dynamic loop, or is the static method still preferred?
     
    #1 rwolf, Oct 13, 2005
    Last edited by a moderator: Oct 13, 2005
  2. 991060

    Regular

    Joined:
    Jul 29, 2003
    Messages:
    640
    Likes Received:
    2
    Location:
    Beijing
    Unrolling loops leads to an explosion in instruction count if the structure within the loop is very complex or the loop count is high, so the driver may not always unroll.

    There's no need to replace a static loop with a dynamic loop, because it gains you nothing. I assume X1K executes a static loop no slower than a dynamic loop.
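
    To put rough numbers on the instruction count point, here is a minimal sketch. It treats a shader as a plain list of assembly-like strings; the helper names `unroll` and `looped` and the two-instruction body are illustrative assumptions, not anything a real driver exposes.

```python
def unroll(body, count):
    """Repeat the loop body `count` times as one flat instruction list."""
    return [f"{op}  ; iter {i}" for i in range(count) for op in body]


def looped(body, count):
    """Keep the body once, wrapped in a fixed-count repeat block."""
    return [f"rep {count}"] + list(body) + ["endrep"]


# Hypothetical two-instruction loop body (illustrative strings only).
body = ["texld r0, t0, s0", "mad r1, r0, c0, r1"]

unrolled_len = len(unroll(body, 16))  # grows with body size * loop count
looped_len = len(looped(body, 16))    # body size + 2, independent of count
```

    With a 2-instruction body and 16 iterations the unrolled form is 32 instructions against 4 for the looped form, and the gap widens linearly with either the body size or the count.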
     
  3. rwolf

    rwolf Rock Star
    Regular

    Joined:
    Oct 25, 2002
    Messages:
    968
    Likes Received:
    54
    Location:
    Canada
    I thought static loops were always unrolled by the driver, but I may be wrong.
     
  4. Simon F

    Simon F Tea maker
    Moderator Veteran

    Joined:
    Feb 8, 2002
    Messages:
    4,560
    Likes Received:
    157
    Location:
    In the Island of Sodor, where the steam trains lie
    It's easy to support static count loops in the hardware and have them run as fast as the unrolled version.
     
  5. neliz

    neliz GIGABYTE Man
    Veteran

    Joined:
    Mar 30, 2005
    Messages:
    4,904
    Likes Received:
    23
    Location:
    In the know
    #5 neliz, Oct 13, 2005
    Last edited by a moderator: Oct 13, 2005
  6. rwolf

    rwolf Rock Star
    Regular

    Joined:
    Oct 25, 2002
    Messages:
    968
    Likes Received:
    54
    Location:
    Canada
    I was thinking that you could remove the overhead of unrolling the static loop in the driver and just branch the code.
     
  7. Xmas

    Xmas Porous
    Veteran Subscriber

    Joined:
    Feb 6, 2002
    Messages:
    3,303
    Likes Received:
    137
    Location:
    On the path to wisdom
    That's what Simon is saying. But static branching is easier to implement in hardware because all pixels in a draw call are guaranteed to go the same route. So you don't have to worry about batching, granularity and the order of quads.
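
    A toy model of why the uniform case is simpler, assuming (as a simplification, not a description of X1K itself) that a batch of pixels runs in lockstep until its slowest pixel leaves the loop. The function names are hypothetical.

```python
def batch_steps_static(trip_count):
    """Static loop: the count is a per-draw-call constant, so every pixel
    in every batch runs exactly `trip_count` iterations."""
    return trip_count


def batch_steps_dynamic(per_pixel_counts):
    """Dynamic loop: in this lockstep model the batch keeps iterating
    until the pixel with the largest count is done."""
    return max(per_pixel_counts)


def wasted_iterations(per_pixel_counts):
    """Iterations spent idling by pixels that finished early."""
    steps = batch_steps_dynamic(per_pixel_counts)
    return sum(steps - c for c in per_pixel_counts)


uniform = [4, 4, 4, 4]    # what a static loop guarantees
divergent = [1, 2, 8, 3]  # possible only with per-pixel (dynamic) counts
```

    In this model `wasted_iterations(uniform)` is zero by construction, while the divergent batch idles for 18 pixel-iterations. That waste is why batch size and granularity matter for dynamic branching and are a non-issue for static branching.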
     
  8. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    943
    Likes Received:
    42
    Location:
    LA, California
    Unrolling the loops should also allow a good compiler to reorder instructions across different loop iterations to achieve higher functional unit utilization (or are register constraints too much of an issue for that?)...
     
  9. rwolf

    rwolf Rock Star
    Regular

    Joined:
    Oct 25, 2002
    Messages:
    968
    Likes Received:
    54
    Location:
    Canada
    Thanks, I misinterpreted his reply.
     
  10. akira888

    Regular

    Joined:
    Jul 15, 2003
    Messages:
    652
    Likes Received:
    11
    Location:
    Houston
    If I'm not mistaken, would this not force recompilation of the shader every time the application changed an integer constant that a "loop" instruction uses as its iteration parameter? I've only begun to write for my new GF Go 6800U, so I'm probably missing something...
     
    #10 akira888, Oct 13, 2005
    Last edited by a moderator: Oct 13, 2005
  11. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    943
    Likes Received:
    42
    Location:
    LA, California
    good point - another reason not to do it I suppose...
     
  12. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    For cases where the constants are known at compile time though it will often be unrolled.
     
  13. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    There's also speculative compilation. The driver can compile several versions and keep them cached: one unrolled for specific values (based on profile statistics), the other not unrolled, using the static branch instruction.
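
    A minimal sketch of that caching idea, under loudly stated assumptions: the class name `ShaderVariantCache`, the `hot_threshold` heuristic and the string-based "programs" are all hypothetical stand-ins, and no real driver is claimed to work this way.

```python
class ShaderVariantCache:
    """Keeps one generic looped program plus unrolled variants for
    loop counts that have been requested often enough."""

    def __init__(self, hot_threshold=3):
        self.hot_threshold = hot_threshold  # assumed profiling heuristic
        self.seen = {}       # loop count -> times requested
        self.unrolled = {}   # loop count -> cached unrolled program
        self.generic = None  # looped program, valid for any count

    def get(self, body, count):
        self.seen[count] = self.seen.get(count, 0) + 1
        if count in self.unrolled:
            return self.unrolled[count]
        if self.seen[count] >= self.hot_threshold:
            # Count is "hot": compile and cache an unrolled variant.
            self.unrolled[count] = list(body) * count
            return self.unrolled[count]
        if self.generic is None:
            # Built once; the count is read from an integer constant at
            # run time, so changing it needs no recompilation.
            self.generic = ["loop aL, i0"] + list(body) + ["endloop"]
        return self.generic


cache = ShaderVariantCache(hot_threshold=2)
loop_body = ["mad r0, r0, c0, c1"]  # illustrative instruction string
first = cache.get(loop_body, 4)     # rarely seen count: generic version
second = cache.get(loop_body, 4)    # now hot: unrolled version
```

    Changing the integer constant only falls back to the generic version until the new value becomes hot, which also sidesteps the recompile-on-every-change concern raised earlier in the thread.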
     

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.