Origin of Quake3's Fast InvSqrt()

Discussion in 'Beyond3D Articles' started by Geo, Dec 1, 2006.

  1. Simon F

    Simon F Tea maker
    Moderator Veteran

    Joined:
    Feb 8, 2002
    Messages:
    4,560
    Likes Received:
    157
    Location:
    In the Island of Sodor, where the steam trains lie
    Yes that would work. Sorry, I was assuming you were using this "binary chop-like" method to evaluate the entire result, not just to produce the seed value.
     
  2. l_mammel

    Newcomer

    Joined:
    Dec 6, 2006
    Messages:
    5
    Likes Received:
    1
    I was. When I said "easily remedied" I meant there was an easy modification. I think your objection was more esthetic than practical, since the introduced error is trivial in the face of other issues of fixed point calculations. Fixed point numbers have relative error in inverse proportion to their size, so a dither in the lowest sig fig which introduces a fixed error in the lowest digit is not much of a concern.

    But even so, there's no reason to do it that way. The problem itself scales, so adjusting the guess is formally equivalent to scaling the input and output. Note that the first and second order iteration rules:

    y = y * ( 1.5 - 0.5 * w )
    y = y * ( 1.875 - ( 1.25 - 0.375 * w ) * w

    depend only on w = x*y*y = (f*x) * (y/sqrt(f)) * (y/sqrt(f)), so the calculation done "in place" with a scaled initial guess is the same as scaling to a fixed interval. This means that my logarithmic scheme is just an elaboration of your suggestion about making initial guesses of 2^-M .

    BTW, choosing 1 would not work, as the first order iteration ( i.e. Newton- Raphson ) diverges if w > 3. You could pick 0.0000000001 , though! That is, the lowest fixed point value. Undervalued guess will eventually converge, as the iteration just mulitlplies by 1.5- until it gets into range.

    I call my scheme logarithmic because it just gives a non integer value to M in scaling rule, f( 2^2M * x ) = 2^-M * f(x) e.g. if M has a binary value of 1.101, then 2^-M = 1/2 * 1/sqrt(2) * 1 * 1/sqrt(sqrt(sqrt(2))). Of course, the logical conclusion of the scheme is f(2^2M) = 2^-M .

    One more note about evaluating the iterations in fixed point. To get the best possible precision you should evaluate w=x*y*y as (x*y)*y when x>1. This is because y<1, and y*y loses precision. This is just normal left association, so you wouldn't think anything of it.

    However, if x<1, then y>1, and you should take w=(y*y)*x, otherwise x*y loses precision. This requires that you can represent b^N, where b^-N is the lowest digit of your base.
     
    #42 l_mammel, Dec 9, 2006
    Last edited by a moderator: Dec 9, 2006
    Simon F likes this.
  3. Frank

    Frank Certified not a majority
    Veteran

    Joined:
    Sep 21, 2003
    Messages:
    3,187
    Likes Received:
    59
    Location:
    Sittard, the Netherlands
    l_mammel, could you put that in actual code? I can read that a lot better than math (and I understand reduction loops much better than "magic" Newton-Raphson math), especially when there's a lot of sqrts in it, which I assume have to be replaced with something simpler and faster before it's useable.
     
  4. l_mammel

    Newcomer

    Joined:
    Dec 6, 2006
    Messages:
    5
    Likes Received:
    1
    Note the sqrt expressions are all constants, so they are to be evaluated to a precision appropriate to the application. Below is the code for a fixed point calculator with ten decimal places. This is real code that runs on the cygwin version of the old UNIX bc calculator. ( When I installed cygwin a few years ago, they didn't have bc, so I complained on the mailing list, and they added it the next day! )

    Also note, I referred to the "magic" constant as per the original celebrated InvSqrt function analyzed in Chris Lomont's paper. The magic is in the generation of the guess by shifting the FP value and subtracting. The application of the N-R formula is just elementary calculus.

    /* i(v) is a bc function to compute InvSqrt to 10 places */
    scale=10
    define i(v){
    auto x,f,y,w;

    x=v
    f=1

    /* reduce interval to 1 <= x < 4 */
    /* This would be done by shift in a base 2 format */

    while( x >= 4 ){ x = x/4 ; f = f/2 }
    while( x < 1 ){ x = x*4 ; f = f*2 }

    /* reduce interval to 1 <= x < 1.0905077327 and compute scale factor */

    if( x >= 2 ){ f=f*0.7071067812; x=x*0.5 ; }
    if( x >= 1.4142135624 ){ f=f*0.8408964153; x=x*0.7071067812; }
    if( x >= 1.1892071150 ){ f=f*0.9170040432; x=x*0.8408964153; }
    if( x >= 1.0905077327 ){ f=f*0.9576032807; x=x*0.9170040432; }

    /* compute first iteration on reduced interval and scale result */
    /* initial guess apprx 1/sqrt(1.045) */
    /* iteration formula is a second order Taylor series of f(x) = 1/sqrt(x) expanded about 1/y^2 */

    y = 0.978
    w = x*y*y
    y = f*y*( (1.875 - (1.25 - 0.375*w)*w) )

    /* compute second iteration on input interval and return the value */
    w = v*y*y

    /* Optimize precision by order of evaluation ( no "else" in bc! ) */
    if( v < 1 ) w = y*y*v

    return ( y*(1.875 - (1.25 - 0.375*w)*w) )
    }
     
    #44 l_mammel, Dec 10, 2006
    Last edited by a moderator: Dec 10, 2006
  5. mondoterrifico

    Newcomer

    Joined:
    Mar 9, 2003
    Messages:
    32
    Likes Received:
    4
    Did you post this yet and I missed it?
    Inquiring minds want to know. (Is it sad to get excited over this stuff?)
     
  6. Simon F

    Simon F Tea maker
    Moderator Veteran

    Joined:
    Feb 8, 2002
    Messages:
    4,560
    Likes Received:
    157
    Location:
    In the Island of Sodor, where the steam trains lie
    I've programmed a PDP-10 with the 36-bit word format. IIRC, filenames were encoded with either 5 or 6 bits per character (can't remember exactly) so that they would fit in a single word.
     
    K.I.L.E.R likes this.
  7. Rys

    Rys PowerVR
    Moderator Veteran Alpha

    Joined:
    Oct 9, 2003
    Messages:
    4,156
    Likes Received:
    1,433
    Location:
    Beyond3D HQ
    It took a back seat to the G80 IQ piece and some other behind the scenes cool stuff, but I can likely finish off the 2nd look sometime this weekend.
     
  8. JDT

    JDT
    Newcomer

    Joined:
    Aug 30, 2007
    Messages:
    1
    Likes Received:
    0
Loading...

Share This Page

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.
Loading...