SN Systems Andy Thomason next gen coding

Discussion in 'Rendering Technology and APIs' started by Eleazar, Dec 20, 2005.

  1. Eleazar

    Newcomer

    Joined:
    Nov 21, 2005
    Messages:
    95
    Likes Received:
    5
    Location:
    USA
    Gamasutra is running an article on proper coding practices for the next gen. It covers cache misses, branch avoidance, inlining and a whole lotta other fun stuff. So head on over. I like this article because it pretty much sums up how this gen is going to be different from the past gen. A lot of it we have already said on these forums one way or another, but it is nice to see it all put in the perspective of someone in the field who has done such a good job of organizing that information.

    http://www.gamasutra.com/features/20051220/thomason_01.shtml
     
  2. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    I don't see how

    is true unless the instruction set you are compiling to contains a conditional assignment or predicated assignment instruction; otherwise a branch will still be required.
     
  3. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,243
    Likes Received:
    618
    You can do it with simple boolean logic too.
     
  4. Mate Kovacs

    Newcomer

    Joined:
    Dec 12, 2004
    Messages:
    163
    Likes Received:
    3
    Location:
    Mountain View, CA
    You can do it without them.
    For example, the C expression "a = (!b ? c : d)" is equivalent to the x86 asm
    Code:
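    mov    eax,[b]
    or     eax,eax      ; sets ZF if b == 0
    setz   bl           ; bl = 1 if b == 0, else 0
    dec    bl           ; bl = 0x00 if b == 0, else 0xFF
    movsx  ebx,bl       ; sign-extend: ebx = 0 or 0xFFFFFFFF
    mov    eax,[d]
    and    eax,ebx      ; eax = d if b != 0, else 0
    not    ebx
    and    ebx,[c]      ; ebx = c if b == 0, else 0
    or     eax,ebx      ; merge the two masked values
    mov    [a],eax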
    
    EDIT: MfA was quicker. :)
    EDIT2: Maybe you can still think of setz as a "conditional assignment", but I'm pretty sure it could be done without it, too. :D
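
A minimal C++ sketch of the EDIT2 remark, building the mask from arithmetic and shifts alone instead of setz. The names `nonzero_mask` and `select_if_zero` are made up for illustration, 32-bit unsigned operands are assumed, and whether a given compiler emits branch-free code for it is not guaranteed.

```cpp
#include <cstdint>

// Returns 0xFFFFFFFF when b != 0 and 0 when b == 0.
// For any nonzero b, either b or its two's-complement negation has bit 31 set.
inline std::uint32_t nonzero_mask(std::uint32_t b) {
    std::uint32_t top = (b | (0u - b)) >> 31;  // 1 if b != 0, else 0
    return 0u - top;                           // 0 stays 0, 1 becomes all ones
}

// Same result as a = (!b ? c : d), written with AND/OR/NOT only.
inline std::uint32_t select_if_zero(std::uint32_t b, std::uint32_t c, std::uint32_t d) {
    std::uint32_t m = nonzero_mask(b);
    return (d & m) | (c & ~m);
}
```

Calling `select_if_zero(b, c, d)` picks `c` when `b` is zero and `d` otherwise, matching the asm listing above.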
     
    #4 Mate Kovacs, Dec 21, 2005
    Last edited by a moderator: Dec 21, 2005
  5. Colourless

    Colourless Monochrome wench
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,274
    Likes Received:
    30
    Location:
    Somewhere in outback South Australia
    With x86 it's probably easier to use the cmov instructions introduced with the P6. Microsoft's x86_64 compiler uses them.

    The instructions for 'a = (!b ? c : d)' will end up being this, using only 1 register:

    Code:
    mov eax, [b]
    test eax, eax
    cmovnz eax, [c]
    cmovz eax, [d]
    mov [a], eax
     
    #5 Colourless, Dec 21, 2005
    Last edited by a moderator: Dec 21, 2005
    Geo likes this.
  6. Mate Kovacs

    Newcomer

    Joined:
    Dec 12, 2004
    Messages:
    163
    Likes Received:
    3
    Location:
    Mountain View, CA
    Yep, but cmov definitely is what DemoCoder referred to as "a conditional assignment or predicated assignment instruction", IMO. :)

    EDIT: BTW, you mixed up [c] and [d], so it's equivalent to "a = (b ? c : d)" now. :)
     
    #6 Mate Kovacs, Dec 21, 2005
    Last edited by a moderator: Dec 21, 2005
  7. Zengar

    Regular

    Joined:
    Dec 3, 2003
    Messages:
    288
    Likes Received:
    0
    A good compiler should recognize such cases and use conditional movs automatically. I don't see why I should change my coding practices...
    At least the Pascal compiler, where you have no ?: operator, does it that way.
     
  8. Mate Kovacs

    Newcomer

    Joined:
    Dec 12, 2004
    Messages:
    163
    Likes Received:
    3
    Location:
    Mountain View, CA
    Yep, it all depends on the compiler. Even if you change your practices, a stupid compiler will still give you inefficient code. For example, since the C standard (from C99 on) treats the inline keyword as only a 'hint', it's still up to the compiler. :)
     
  9. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    Well, I suppose one could store the result of a condition in a boolean, say A, and then use the following boolean equation:

    R = (A ? X : Y)

    becomes

    R = A * X + not(A) * Y

    (* = AND, + = OR)

    Of course, this is just poor man's predication, with A as the predicate. :)
     
  10. Mate Kovacs

    Newcomer

    Joined:
    Dec 12, 2004
    Messages:
    163
    Likes Received:
    3
    Location:
    Mountain View, CA
    Yep, you got the basic idea. (BTW, simply storing a condition is not enough, either in the C language or at the x86 asm level, because it's either 0 or 1. You have to convert it so that it's either 0 or all 1 bits.)
    And yes, it's just "poor man's predication", but you don't need "a conditional assignment or predicated assignment instruction", which was our point. :)
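
The point about widening the condition can be sketched in C++ as follows. `select_u32` and `select_naive` are illustrative names, not from the thread, and 32-bit unsigned values are assumed.

```cpp
#include <cstdint>

// R = A*X + not(A)*Y with * = AND and + = OR, where `a` is a 0/1 condition.
inline std::uint32_t select_u32(bool a, std::uint32_t x, std::uint32_t y) {
    // Widen 0/1 to 0/all-ones before masking.
    std::uint32_t mask = 0u - static_cast<std::uint32_t>(a);
    return (x & mask) | (y & ~mask);
}

// For comparison only: masking with the raw 0/1 value keeps just bit 0 of x
// and clears bit 0 of y, so the result is wrong in general.
inline std::uint32_t select_naive(bool a, std::uint32_t x, std::uint32_t y) {
    std::uint32_t bit = static_cast<std::uint32_t>(a);
    return (x & bit) | (y & ~bit);
}
```

With x = 6 and y = 9, `select_u32(true, 6, 9)` yields 6, while `select_naive(true, 6, 9)` yields 8, because only the lowest bit of each operand is switched by the condition.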
     
  11. JHoxley

    Regular

    Joined:
    Oct 18, 2004
    Messages:
    391
    Likes Received:
    35
    Location:
    South Coast, England
    This whole micro-optimization stuff seems a bit silly at times.

    I personally prefer the "Get it right then get it tight" approach - write some good clean code (free of such ugly micro-optimizations) and then profile it to work out where those micro optimizations will really make a difference.

    There's a lot to be said for writing good quality clean code - maintenance, (lack of) bugs, adaptability, portability...

    Jack
     
  12. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    953
    Likes Received:
    51
    Location:
    LA, California
    Mate, do you know if MSVC's __forceinline and GCC's __attribute__((always_inline)) actually force inlining, or if they simply force the compiler to consider inlining even when optimizations are turned off?
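
For reference, a sketch of how the two spellings are commonly wrapped. `FORCE_INLINE` and `clamp_to_byte` are hypothetical names, not standard ones. This does not settle the question: MSVC's documentation lists cases where `__forceinline` is not honoured, and GCC documents `always_inline` as applying regardless of optimization level, with a failure to inline reported as an error, so the exact behaviour should be checked against the compiler version in use.

```cpp
// Hypothetical portability macro wrapping the vendor-specific keywords.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // fallback: ordinary inline hint only
#endif

// Small leaf function of the kind usually marked this way.
FORCE_INLINE int clamp_to_byte(int v) {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}
```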
     
  13. Mate Kovacs

    Newcomer

    Joined:
    Dec 12, 2004
    Messages:
    163
    Likes Received:
    3
    Location:
    Mountain View, CA
    Yep.
    "Premature optimization is the root of all Evil." (? Knuth ?)

    @psurge: I don't know. Honestly. I'll try to poke around. :)
     
  14. ERP

    ERP Moderator
    Moderator Veteran

    Joined:
    Feb 11, 2002
    Messages:
    3,669
    Likes Received:
    49
    Location:
    Redmond, WA
    The issue with this is that these types of micro optimisations are hard to measure; any one might not have a significant impact, but thousands a frame can be significant.

    Virtual function overhead is probably the most obvious one (and it's hard to eliminate after the fact). One virtual function call doesn't kill you (not even on PS2), but tens or even hundreds of thousands a frame can really hurt.

    Anecdote --- A friend of mine was just relaying his experience removing a lot of virtual function calls from the inner workings of a fairly major system on a cross platform product. The net result was almost no performance difference on PC and a doubling of the performance on one particular console. There is no way to estimate the impact of those virtual function calls without actually removing them.
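
One common way to take virtual calls out of an inner loop, sketched in C++ under stated assumptions: the types `Entity` and `Particle` are invented for illustration, and this is not necessarily what was done in the anecdote. The idea is to store objects of one concrete type together so the call target is known at compile time.

```cpp
#include <vector>

struct Entity {                        // hypothetical base class
    virtual ~Entity() = default;
    virtual void update(float dt) = 0;
};

struct Particle final : Entity {       // `final` lets the compiler devirtualize
    float pos = 0.0f;
    float vel = 1.0f;
    void update(float dt) override { pos += vel * dt; }
};

// One indirect (virtual) call per object per frame.
inline void update_virtual(std::vector<Entity*>& all, float dt) {
    for (Entity* e : all) e->update(dt);
}

// Same work on a homogeneous array: the target is statically known,
// so the call can be made directly or inlined.
inline void update_batched(std::vector<Particle>& particles, float dt) {
    for (Particle& p : particles) p.update(dt);
}
```

Both loops produce the same results; whether the batched form is measurably faster depends on the platform, which is the point of the anecdote.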
     
  15. Xmas

    Xmas Porous
    Veteran Subscriber

    Joined:
    Feb 6, 2002
    Messages:
    3,317
    Likes Received:
    149
    Location:
    On the path to wisdom
    And whether there is such a conditional assignment instruction or not, the lines with if are just as good or better even in the second case.
     
  16. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    Yes, many compilers will have an almost identical internal representation, except for the fact that ?: is an expression, and 'if' is a statement. But they are otherwise identical. Much like for/while/do-while.
     
  17. Graham

    Graham Hello :-)
    Moderator Veteran Subscriber

    Joined:
    Sep 10, 2005
    Messages:
    1,479
    Likes Received:
    209
    Location:
    Bend, Oregon
    I can't help feeling like I'm stepping back 5 years reading that article, when in fact it's aimed as a prediction of the next 5 years of development.

    IMO, the choice of algorithms, and overall design structure will have a greater effect on performance than things such as choice of branch style.

    He talks about going to extreme lengths to reduce memory overhead, then effectively says 'inline everything'. ?! I've done that before... and I got an 8 MB executable instead of 700 KB. Fantastic advice. Yes, selective inlining is very important, but this is usually done by a smart compiler, and will be obvious when it's needed with proper profiling. He also suggests templating as much as possible. Same deal: code bloat.



    Takes me back to the 'C is faster than C++' wars of days gone by.


    "Calling malloc or the default new in a game loop is considered irresponsible". Urgh.
     
  18. Xmas

    Xmas Porous
    Veteran Subscriber

    Joined:
    Feb 6, 2002
    Messages:
    3,317
    Likes Received:
    149
    Location:
    On the path to wisdom
    I strongly disagree. Get the texture content right and leave LOD bias alone, please.

    That's certainly supposed to read negative.
     
    Geo likes this.
  19. ERP

    ERP Moderator
    Moderator Veteran

    Joined:
    Feb 11, 2002
    Messages:
    3,669
    Likes Received:
    49
    Location:
    Redmond, WA

    To give you some idea of how far games are from general application development, many companies have a zero runtime memory allocation policy (although it's less prevalent than it used to be). Not so long ago my games had no free; the only way to free memory was to revert the heap (actually just a stack) to a previously saved state.
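
A heap that is "actually just a stack" can be sketched roughly like this in C++. The class `StackHeap` and its interface are hypothetical, not taken from any particular engine, and the sketch assumes the backing buffer is itself suitably aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-size stack allocator: no per-block free, only
// "revert to a previously saved state".
class StackHeap {
public:
    using Marker = std::size_t;

    StackHeap(void* buffer, std::size_t size)
        : base_(static_cast<std::uint8_t*>(buffer)), size_(size), top_(0) {}

    // Bump-allocates `bytes` at an offset aligned to `align` (a power of two).
    // Returns nullptr when the buffer is exhausted.
    void* alloc(std::size_t bytes, std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        std::size_t start = (top_ + align - 1) & ~(align - 1);
        if (start > size_ || bytes > size_ - start) return nullptr;
        top_ = start + bytes;
        return base_ + start;
    }

    // Saves the current top of the stack.
    Marker mark() const { return top_; }

    // Releases everything allocated since `m` was taken.
    void revert(Marker m) {
        assert(m <= top_);
        top_ = m;
    }

private:
    std::uint8_t* base_;
    std::size_t size_;
    std::size_t top_;
};
```

A typical pattern would be to take a marker at level load, allocate freely during the level, and revert to the marker on unload, so nothing in the game loop ever calls malloc or free.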

    Most of what's in the article can make a significant performance difference. Obviously these types of optimisation go hand in hand with good algorithm choices.

    It's harder to do this type of optimisation as teams get bigger and development practices move more towards generally accepted large-scale development. But as I mentioned above, if you can enforce these types of optimisations they can be a significant performance win on today's console processors. IME on PC they make sod all difference.
     
  20. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    I don't buy it, ERP. Readable code is, these days, vastly more important than slightly faster code. Better to enforce programming practices that lead to stable, readable code than much less readable but a tiny bit faster code. As JHoxley said, better to write readable code first, then go back and examine where your code is spending all of its time and optimize there.

    And, more importantly, most of these optimizations are things that should be handled by the compiler in the first place.
     