FP16? But it's the current year!

Discussion in 'Architecture and Products' started by Markus, Oct 27, 2016.

  1. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,069
    Likes Received:
    1,028
    God, I was blinded by threadbouncing. Thanks for the writeup, Sebbbi.
     
  2. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    Yes, changing data types "blindly" to fp16 one shader at a time and validating the end result visually between each change is a valid strategy for porting game shaders to fp16. If the error can't be seen, there's no problem. Of course, this assumes that your test images/sequences have good enough coverage of the different corner cases. A daylight scene can behave completely differently from a night scene. Highly specular materials can also be problematic: GGX math at fp16 can produce values larger than 65504, the largest finite fp16 value, which then overflow to infinity and cause problems.
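
    To make the 65504 limit concrete, here is a rough Python sketch of the overflow. It is not taken from any real shader: `to_fp16` and `ggx_ndf` are hypothetical helper names, the GGX term is one common textbook form of the normal distribution function, and the `alpha` values are made up for illustration. Only the final result is rounded to fp16 here; a real fp16 shader rounds every intermediate as well, which can make things worse.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x):
    """Round a Python float to the nearest fp16 value (overflow becomes inf)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the fp16 range; shader
        # hardware would produce +/-infinity instead.
        return math.copysign(math.inf, x)


def ggx_ndf(n_dot_h, alpha):
    """GGX normal distribution term, evaluated in double precision."""
    a2 = alpha * alpha
    denom = n_dot_h * n_dot_h * (a2 - 1.0) + 1.0
    return a2 / (math.pi * denom * denom)


# Peak of the highlight (N.H = 1) for a rough and a very smooth surface.
for alpha in (0.05, 0.002):
    d = ggx_ndf(1.0, alpha)
    print(f"alpha={alpha}: exact={d:.2f} fp16={to_fp16(d)}")
```

    With the smoother (assumed) material the highlight peak no longer fits in fp16 and turns into infinity, which is the kind of term worth keeping at fp32 or clamping.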

    However, understanding the basics of floating point helps when you encounter problems. Most problems are caused either by catastrophic cancellation or by adding/subtracting values of vastly different magnitudes. If you can isolate the operations that are prone to these issues, you end up hitting far fewer problems. Fp16 math is generally good enough for calculating sRGB8 and HDR10 image output, as long as these problematic operations are separated and executed at fp32.
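
    Both failure modes are easy to reproduce outside a shader. The following is a small Python sketch that emulates fp16 add/subtract by rounding after every operation; the helper names (`to_fp16`, `fp16_add`, `fp16_sub`) and the sample values are assumptions chosen for illustration, and Python's double precision stands in for the "executed at fp32" path.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest fp16 value (finite inputs only)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fp16_add(a, b):
    """Emulated fp16 addition: round the inputs, add, round the result."""
    return to_fp16(to_fp16(a) + to_fp16(b))


def fp16_sub(a, b):
    """Emulated fp16 subtraction: round the inputs, subtract, round the result."""
    return to_fp16(to_fp16(a) - to_fp16(b))


# Vastly different magnitudes: fp16 values near 4096 are spaced 4 apart,
# so adding 1 changes nothing.
absorbed = fp16_add(4096.0, 1.0)

# Catastrophic cancellation: 1.001 is rounded to the fp16 grid first, and
# the subtraction then exposes that rounding error.
naive = fp16_sub(fp16_add(1.0, 0.001), 1.0)

# Isolating the sensitive step: do the add/subtract at higher precision
# and convert only the final, small result to fp16.
isolated = to_fp16((1.0 + 0.001) - 1.0)

print("4096 + 1 in fp16:", absorbed)
print("naive fp16 result:", naive, "relative error:", abs(naive - 0.001) / 0.001)
print("isolated result:", isolated, "relative error:", abs(isolated - 0.001) / 0.001)
```

    The naive path is off by a couple of percent, while the isolated path is limited only by the final conversion to fp16.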
     
  3. milk

    Veteran Regular

    Joined:
    Jun 6, 2012
    Messages:
    3,013
    Likes Received:
    2,588
    Maybe that is another good fit for the ND-style use of different variations of an uber shader for every on-screen tile.
     
  4. Jay

    Veteran Regular

    Joined:
    Aug 3, 2013
    Messages:
    1,970
    Likes Received:
    1,109
    @sebbbi Have to admit I thought there would be automated tools to check for the level of variance.
    A benchmark that then checks by XOR'ing the images to give a level of variance or something; then someone can eyeball it afterwards if it's too high.
     
  5. sebbbi

    Automated testing is a good idea. Pick important locations around the levels and take fp16 and fp32 screenshots. If you already have a testing script that loads all levels and performs other tests (such as memory consumption), it is easy to integrate this as part of the same test. Having a key mapped to instantly switch between fp32 and fp16 shaders at runtime is a great feature as well, plus another key to show the difference (error x4, for example). If you (or a tester) spot something ugly, you can just press a key to confirm whether it was caused by fp16 or something else.
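
    The screenshot comparison could look roughly like the Python sketch below. Everything in it is an assumption for illustration: `diff_stats` is a hypothetical helper, the images are flat `bytes` objects of 8-bit channel values that your own capture code would have to supply, and the pass/fail threshold is left to you.

```python
def diff_stats(img_fp32, img_fp16, gain=4):
    """Compare two equally sized 8-bit images given as flat bytes objects.

    Returns (max_error, mean_error, amplified_diff), where amplified_diff
    is a bytes object holding abs(a - b) * gain clamped to 255, i.e. the
    "error x4" view when gain is 4.
    """
    if len(img_fp32) != len(img_fp16) or len(img_fp32) == 0:
        raise ValueError("images must be non-empty and the same size")
    diffs = [abs(a - b) for a, b in zip(img_fp32, img_fp16)]
    amplified = bytes(min(255, d * gain) for d in diffs)
    return max(diffs), sum(diffs) / len(diffs), amplified


# Tiny made-up "screenshots": four channel values each.
reference = bytes([10, 20, 30, 200])
candidate = bytes([10, 22, 29, 100])

max_err, mean_err, vis = diff_stats(reference, candidate)
print("max error:", max_err, "mean error:", mean_err)
print("amplified diff:", list(vis))
```

    An absolute difference is used here rather than a plain XOR, because XOR reports which bits differ and not how far apart two values are: 127 ^ 128 is 255 even though the two pixels differ by a single step.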

    If you use ifdefs for half types, it is trivial to compile two versions of each shader.
    Code:
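    // Define ENABLE_16_BIT_FLOAT when compiling to build the fp16 variant.
    #ifdef ENABLE_16_BIT_FLOAT
    // min16float is a minimum-precision type: hardware may still run it at fp32.
    #define HALF min16float
    #define HALF2 min16float2
    #define HALF3 min16float3
    #define HALF4 min16float4
    #else
    // fp32 reference variant of the same shader source.
    #define HALF float
    #define HALF2 float2
    #define HALF3 float3
    #define HALF4 float4
    #endif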
    
     
  6. Anarchist4000

    Veteran Regular

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    It wouldn't be surprising to see a driver optimization setting/replacement that demoted FP32 to FP16 based on some heuristics either. A method of improving performance on older games.
     
  7. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,135
    Likes Received:
    2,935
    Location:
    Well within 3d
    It's not unprecedented, although the overriding heuristic was sometimes whether it made the vendor's card do better when competitively benchmarked, or something to that FX effect.
     
  8. Anarchist4000

    It does seem likely we'll be needing more IQ reviews in the near future. I'm kind of curious how much FP16 already exists that could simply execute more efficiently? I'm sure some devs have packed FP16 registers. Stands to reason compiling for FP16 execution given FP16 registers would be fairly straightforward.
     
  9. Ethatron

    Regular Subscriber

    Joined:
    Jan 24, 2010
    Messages:
    859
    Likes Received:
    262
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.