Origin of Quake3's Fast InvSqrt()

Discussion in 'Beyond3D Articles' started by Geo, Dec 1, 2006.

  1. Geo

    Geo Mostly Harmless
    Legend

    Joined:
    Apr 22, 2002
    Messages:
    9,116
    Likes Received:
    213
    Location:
    Uffda-land
    <a href="http://www.beyond3d.com/articles/fastinvsqrt/"><img border="1" src="http://www.beyond3d.com/articles/fastinvsqrt/focus.png" align="right" width="75" height="75"></a> Our Ryszard Sommefeldt dons his seersucker hunting jacket and meerschaum pipe to take on his secret identity as graphics code sleuth extraordinaire. In today's thrilling installment, the origins of one of the more famous snippets of graphics code in recent years is under the microscope --Quake3's Fast InvSqrt(), which has been known to cause strong geeks to go wobbly in the knees while contemplating its simple beauty and power.

    This one is not for the faint of heart. <a href="http://www.beyond3d.com/articles/fastinvsqrt/">Enter the puzzle palace if you dare.</a>
     
  2. Rys

    Rys PowerVR
    Moderator Veteran Alpha

    Joined:
    Oct 9, 2003
    Messages:
    4,156
    Likes Received:
    1,433
    Location:
    Beyond3D HQ
    It's just a quick one page republish of something I'd written for my personal website back in the day, and linked here at some point or another, that we thought deserves some more limelight. One for the geeks (all of you then :lol: ) !
     
    Geo likes this.
  3. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
  4. Rys

    Rys PowerVR
    Moderator Veteran Alpha

    Joined:
    Oct 9, 2003
    Messages:
    4,156
    Likes Received:
    1,433
    Location:
    Beyond3D HQ
    The article doesn't exist anymore on lungexpress (never rewrote my article display code), hence the republish and non-link to that thread, which I'm well aware exists :razz:
     
  5. PeterAce

    Regular

    Joined:
    Sep 15, 2003
    Messages:
    489
    Likes Received:
    6
    Location:
    UK, Bedfordshire
    This article created a double flashback!

    1) Buying a Deltron Realvision Flash3D Voodoo card in 1997 (my first 3D card).

    2) Starting the mentioned NV40/Doom3 thread where DemoCoder posted the FastInvSqrt (first time I saw this code).
     
  6. zmaniac

    Newcomer

    Joined:
    Dec 1, 2006
    Messages:
    1
    Likes Received:
    0
    This 2003 paper offers a detailed analysis of the algorithm and its math, but doesn't seem to have much to offer on earlier history:

    http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf

    But as the response from Terje shows, the origin of the code is unlikely to have even been graphics-related. Chances are it wasn't even done by someone still programming. Serious floating point/fixed point assembly language hacking by uber math geeks has been going on since the fifties.
     
  7. Bludd

    Bludd Experiencing A Significant Gravitas Shortfall
    Veteran

    Joined:
    Oct 26, 2003
    Messages:
    3,201
    Likes Received:
    769
    Location:
    Funny, It Worked Last Time...
    That paper is linked to in the article.
     
  8. iollmann

    Newcomer

    Joined:
    Dec 2, 2006
    Messages:
    2
    Likes Received:
    0
    There are better ways

    (double post)
     
    #8 iollmann, Dec 2, 2006
    Last edited by a moderator: Dec 2, 2006
  9. iollmann

    Newcomer

    Joined:
    Dec 2, 2006
    Messages:
    2
    Likes Received:
    0
    It is faster and better to use the AltiVec/SSE instructions for reciprocal sqrt estimate. The reciprocal sqrt estimate instructions alone are more accurate than this method. Max error over the set of normal numbers:

    this method: 28402 ulps
    RSQRTSS: 4980 ulps
    RSQRTSS + NR: 4 ulps (but Inv_sqrt(0) is NaN )

    For altivec, you'd use vec_rsqrte instead. PowerPC also has a scalar reciprocal square root estimate instruction, which gets similar accuracy on some PowerPC machines (others get much less and need a NR iteration).

    Note that none of these are a drop in replacement for division when stricter IEEE-754 conformance is required. Also note that in many cases, you don't need sqrt at all. For example, when looking to see if a point is inside a sphere, you can compare the square of the distance of the point from the center against the square of the radius.

    Ian Ollmann

    #include "xmmintrin.h"
    //rsqrtss + Newton Raphson
    float isqrt_est_nr( float x )
    {
    float xhalf = 0.5f * x;

    //get reciprocal sqrt estimate
    x = _mm_cvtss_f32( _mm_rsqrt_ss( _mm_set_ss( x ) ) );

    x = x * (1.5f - xhalf*x*x);

    return x;
    }

    //RSQRTSS
    float isqrtf_est( float x )
    {
    return _mm_cvtss_f32( _mm_rsqrt_ss( _mm_set_ss( x ) ) );
    }
     
    #9 iollmann, Dec 2, 2006
    Last edited by a moderator: Dec 2, 2006
  10. Rufus

    Newcomer

    Joined:
    Oct 25, 2006
    Messages:
    246
    Likes Received:
    60
    Very true, but as the story says this code is ancient. There was no 3DNOW, MMX, SSE, AltiVec or anything back then. You either did an actual FP sqrt and divide (extremely slow), or you do an aproximation like this. No one would ever use something like this anymore, but it's interesting from a historical perspective.

    Also good job geo slashdotting B3D :roll: . Glad you were only down for an hour or so.
     
  11. Skrying

    Skrying S K R Y I N G
    Veteran

    Joined:
    Jul 8, 2005
    Messages:
    4,815
    Likes Received:
    61
    B3D has not been down for me at all today, its been a little slow on loads but its been stable all the way.
     
  12. Geo

    Geo Mostly Harmless
    Legend

    Joined:
    Apr 22, 2002
    Messages:
    9,116
    Likes Received:
    213
    Location:
    Uffda-land
    You're welcome. :smile: You do understand that we do content for people to read it, and the more the better? We'd submit to slashdot every single piece if we felt they were appropriate for that audience. We definitely felt this one was, and apparently they agreed. It took something less than an hour for them to post it.

    The more serious question is why did the site/server react that badly to it. The last time we were slashdotted it didn't have nearly that impact on us. And it was never down for me, just a bit slow-ish. Needed a refresh now and then to bring a page up is all.
     
  13. RedMenace

    Newcomer

    Joined:
    Dec 2, 2006
    Messages:
    1
    Likes Received:
    0
    Okay, trying 3rd time

    (or are posts moderated before displaying?)

    Did you ask <a href="http://trac.bookofhook.com/bookofhook/trac.cgi/wiki/AboutMe">Brian Hook</a>?

    He seems like a likely candidate, or at least a conduit from 3dfx to id.
     
  14. Rufus

    Newcomer

    Joined:
    Oct 25, 2006
    Messages:
    246
    Likes Received:
    60
    I'm not questioning you submitting the article, it's the type of thing I think /. should have more of. I was just commenting on the server's response. I read the story first then headed over to /. (yes, b3d is before /. on my morning reading list :grin: ) and saw it was posted. One of the tags was "slashdotted" so I tried coming back and the connection just hung.
     
  15. Geo

    Geo Mostly Harmless
    Legend

    Joined:
    Apr 22, 2002
    Messages:
    9,116
    Likes Received:
    213
    Location:
    Uffda-land
    Well, some are. A small minority of total posts, tho 100% of those that are mod'ed are noobs (but not 100% of noob posts are moderated). I won't go into the nitty gritty, as it's an anti-spam thang, and we don't care to over-advertise the details.

    Sorry for the inconvenience for those genuine new members who get caught. :smile:
     
  16. Gumby

    Newcomer

    Joined:
    Dec 2, 2006
    Messages:
    1
    Likes Received:
    0
    Who wrote the code in HAKMEM?

    The HAKMEM reference makes me think of Steve Russell, though I think I heard him say that Spacewar's numerics came from DEC.

    However it's far more likely to be bill gosper; if not he would remember. Of course there was no IEEE format fp back then (there was barely an IEEE!) not to mention the 36-bit architecture of the PDP-6 with 18-bit floats so HAKMEM would really have just been a suggestion.

    (don't forget the PDP-6 was the origin of BLT, the parent of bitblt, so no wisecracks about 36-bit words please. Those machines remain the most convenient and fun machines to program in assembly).
     
  17. Rys

    Rys PowerVR
    Moderator Veteran Alpha

    Joined:
    Oct 9, 2003
    Messages:
    4,156
    Likes Received:
    1,433
    Location:
    Beyond3D HQ
    As it happens, the original author is now known ( :!: :!: :!: ) after someone close to the code made themselves known, so we'll be running with a follow up piece tomorrow that explains all. Stay tuned!
     
  18. Magnum PI

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,054
    Likes Received:
    12
    02/23/2000 usenet post from Jason Dorie, in group comp.graphics.algorithms


    http://groups.google.com/group/comp.graphics.algorithms/browse_frm/thread/a4bdef092236355a/3ec1b02e96dc6dbc?lnk=st&rnum=3#3ec1b02e96dc6dbc

    float InvSqrt(float v)
    {
    float x2 = v * (float)0.5F;
    float y = v;
    long i = *(long *) &y;
    i = 0x5f3759df - (i>>1);
    y = *(float *)&i;

    y = y * (1.5f - (x2 * y * y));
    y = y * (1.5f - (x2 * y * y));
    return y;

    }



    1997/10/13 post from Robert Harley

    http://groups.google.com/group/comp.arch.arithmetic/browse_frm/thread/ddd30bfce90255b8/d2f08ddf951489fe?lnk=st&q=0x5fe80000UL&rnum=1#d2f08ddf951489fe

    64 bit code, seems to use some of the same ideas.
     
    #18 Magnum PI, Dec 3, 2006
    Last edited by a moderator: Dec 3, 2006
  19. Geo

    Geo Mostly Harmless
    Legend

    Joined:
    Apr 22, 2002
    Messages:
    9,116
    Likes Received:
    213
    Location:
    Uffda-land
    Awesome. This would be the other reason to have submitted it to /. in the first place. There was still some mystery there, and that group would be most likely to flush out that last step. :grin:
     
  20. captainfreedom

    Newcomer

    Joined:
    Dec 3, 2006
    Messages:
    5
    Likes Received:
    0
    Does anyone know if there is a version of this routine that doesn't use floats?
     
Loading...

Share This Page

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.
Loading...