Crippled DP performance holds back gamers or not?

Discussion in 'Beginners Zone' started by punchinthejunk, Sep 7, 2014.

  1. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,287
    Location:
    Helsinki, Finland
    As Andrew said before, fixed point math doesn't need any extra hardware support. Regular integer math is enough.

    When converting a fixed point result (or loading a stored fixed point value) to float, you need one extra instruction (a 32 bit float multiply) to scale the fixed point value to the wanted range. Most GPUs also support loading normalized fixed point values from memory directly as floats (normalized to [0,1] or [-1,+1]). This special case saves one ALU instruction.
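
    A minimal C sketch of the conversions described above. The helper names are hypothetical, and the signed normalized mapping (divide by 32767 and clamp -32768 to -1) is an assumption based on the convention common in current graphics APIs, not something stated in this post.

```c
#include <assert.h>
#include <stdint.h>

/* Generic case: one float multiply scales the integer to the wanted range.
   `scale` is chosen by the caller, e.g. 1.0f/256.0f for 8 fractional bits. */
static inline float fixed16_to_float(int16_t v, float scale)
{
    assert(scale > 0.0f);
    return (float)v * scale;
}

/* Unsigned normalized: maps [0, 65535] to [0, 1]. */
static inline float unorm16_to_float(uint16_t v)
{
    return (float)v / 65535.0f;
}

/* Signed normalized (assumed convention): maps [-32767, 32767] to [-1, +1],
   with -32768 clamped to -1. */
static inline float snorm16_to_float(int16_t v)
{
    float f = (float)v / 32767.0f;
    return f < -1.0f ? -1.0f : f;
}
```

    On a GPU the two normalized variants are typically done by the load itself, which is where the saved ALU instruction comes from; on the CPU they are written out explicitly as here.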

    16 bit fixed point values can be loaded directly into the floating point mantissa, and a 2^n scaling factor can be loaded directly into the float exponent bits. This is only possible if the scaling factor is 2^n (... 1/8, 1/4, 1/2, 1, 2, 4, 8...). This way the conversion is completely lossless (no rounding errors from 16 bit fixed -> float). Some GPUs have a single instruction for this (and some APIs such as OpenGL ES 3.1 have shading language support for this).
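
    To illustrate why a 2^n scale is lossless, here is a small C sketch that builds the scale by writing the exponent field of an IEEE 754 32 bit float and then multiplies. The function names and the conservative exponent range are assumptions for illustration; the standard C function ldexpf does the same scaling in one call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build the float 2^e by writing only the biased exponent bits
   (sign and mantissa stay zero). Range limited so products below
   can neither overflow nor underflow. */
static inline float pow2_from_exponent_bits(int e)
{
    assert(e >= -100 && e <= 100);
    uint32_t bits = (uint32_t)(e + 127) << 23;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* value = v * 2^e. Exact: a 16 bit integer fits in the 24 bit
   significand and multiplying by 2^e only changes the exponent. */
static inline float fixed16_to_float_pow2(int16_t v, int e)
{
    return (float)v * pow2_from_exponent_bits(e);
}

/* Inverse. Exact whenever f is a multiple of 2^e within 16 bit range;
   other inputs are truncated toward zero. */
static inline int16_t float_to_fixed16_pow2(float f, int e)
{
    float scaled = f * pow2_from_exponent_bits(-e);
    assert(scaled >= -32768.0f && scaled <= 32767.0f);
    return (int16_t)scaled;
}
```

    For example, fixed16_to_float_pow2(3, -2) gives exactly 0.75f, and converting that back with the same exponent returns 3.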

    For example, in our vertex data preprocessor we round up the mesh bounding box (xyz channels separately) to the next power of two before we convert the floating point value to fixed point (scaled to fit the 2^n bounding box exactly). This way the float -> fixed -> back to float conversion doesn't incur any rounding error for vertices that are snapped to any origin centered grid (with 2^n spacing). This is important if you need to snap multiple objects side by side to a grid and don't want to see tiny holes in between them (holes occur because of rounding differences).
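
    A rough per-axis C sketch of this kind of preprocessing, not the actual preprocessor. It assumes a box symmetric about the origin, signed 16 bit storage, and an extent strictly greater than the largest coordinate; all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 2^e by repeated doubling or halving (exact for the exponents used here). */
static inline float pow2_loop(int e)
{
    float f = 1.0f;
    for (; e > 0; --e) f *= 2.0f;
    for (; e < 0; ++e) f *= 0.5f;
    return f;
}

/* Round the axis half-extent up to the next power of two 2^k (> max_abs)
   and return the exponent of one fixed point step: 2^k / 32768 = 2^(k-15). */
static inline int axis_step_exp2(float max_abs)
{
    assert(max_abs >= 0.0f && max_abs < 1.0e30f);
    int k = -100;
    float extent = pow2_loop(k);
    while (extent <= max_abs) { extent *= 2.0f; ++k; }
    return k - 15;
}

/* float -> fixed: scale by 2^-step_exp2, clamp, round to nearest. */
static inline int16_t quantize_axis(int step_exp2, float x)
{
    float s = x * pow2_loop(-step_exp2);
    if (s > 32767.0f) s = 32767.0f;
    if (s < -32768.0f) s = -32768.0f;
    return (int16_t)(s >= 0.0f ? s + 0.5f : s - 0.5f);
}

/* fixed -> float: exact, because the scale is a power of two. */
static inline float dequantize_axis(int step_exp2, int16_t q)
{
    return (float)q * pow2_loop(step_exp2);
}
```

    With a largest coordinate of 5.3 the extent rounds up to 8, so one step is 2^-12. A vertex at 2.5 or -4.75 sits on a coarser power-of-two grid and therefore survives the round trip bit for bit, which is what keeps neighbouring objects free of cracks.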

    NVIDIA Kepler ("GK") integer performance isn't as good as AMD GCN integer performance. Shifts are 1/3 rate, 32 bit integer multiply is 1/6 rate, etc. It's still much better than the 64 bit float performance on GeForce cards. Teslas/Quadros can actually do a 64 bit FMA twice as fast as many integer operations (including 32 bit integer multiply). Integer shift rate is also crippled on consumer cards (half rate compared to pro cards). Chart here: http://docs.nvidia.com/cuda/cuda-c-programming-guide/#arithmetic-instructions
     
  2. punchinthejunk

    Banned

    Joined:
    Mar 6, 2014
    Messages:
    21
    Likes Received:
    0
    That said, I wish I could be more precise myself instead of always worrying about other people's precision when I ain't even an expert and can't even appreciate the difference half the time.
     
  3. willardjuice

    willardjuice super willyjuice
    Moderator Veteran Alpha Subscriber

    Joined:
    May 14, 2005
    Messages:
    1,366
    Likes Received:
    218
    Location:
    NY
    What an odd thread.
     
  4. eastmen

    Legend Subscriber

    Joined:
    Mar 17, 2008
    Messages:
    9,922
    Likes Received:
    1,447
    Star Citizen is supposed to go to 64 bit precision for larger environments sometime in the next 6 or so months. So we can see how the cards handle this soon.
     
  5. Pixel

    Regular

    Joined:
    Sep 16, 2013
    Messages:
    928
    Likes Received:
    378
    I wonder if No Man's Sky makes any use of double precision.
     
  6. Abwx

    Newcomer

    Joined:
    Sep 30, 2014
    Messages:
    5
    Likes Received:
    0
    According to Wiki, the usual flight simulators use tables of values in which each plane's approximated aerodynamic characteristics are stored, while X-Plane can take an arbitrary plane shape and compute its characteristics.

    Doing so requires solving non-linear differential equations (Navier-Stokes), and those are generally extremely sensitive to initial conditions and to computing precision. The computation is surely done by the CPU and in principle requires double precision.
     
  7. mc6809e

    Newcomer

    Joined:
    Jan 24, 2007
    Messages:
    46
    Likes Received:
    5
    Anyone else wish for a 48-bit word size? It seems like the sweet spot for so much.

    Deep color with 16 bits per channel, audio processing with 24-bit PCM stereo pairs, fixed point 32.16 formats, and of course a float with a nice sized mantissa would all seem to work well with a 48-bit word.
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.