Okay, so not liking the filtering in Humus' method 3, instead of simply bagging it I've decided to come up with another method of encoding which could replace Humus' in some situations. This method also allows for better contrast between the primary colour channel and the secondary ones: for values above 1.0 the contrast is 1:65536, for values under 1.0 it's 1:Value*65536. As far as accuracy goes, this method might lose some precision for the primary channel under 1.0 compared to Humus', but not much.
Method 4 encoder (very similar to what I thought Humus' method 3 was)
Code:
// Scale factor used to store the multiplier in the alpha channel
float maxvalue = 65536.0;
// new method: round the largest colour channel up to a whole number
// Edit: clamp to 1 so pure black does not cause a divide by zero
float maxChannel = max(ceil(max(max(rgba.r, rgba.g), rgba.b)), 1.0);
float range = 1.0 / maxChannel;
// Normalise the colour channels into [0, 1]
float r = rgba.r * range;
float g = rgba.g * range;
float b = rgba.b * range;
// Alpha carries the multiplier needed to undo the normalisation
return float4(r, g, b, maxChannel / maxvalue);
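For anyone wanting to check the numbers off the GPU, here is a minimal Python sketch of the encoder above together with a matching decoder. The decoder is my assumption (the post only gives the encoder): it simply multiplies the colour back up by alpha * 65536. The function names and the clamp for pure black are illustrative, and no 16-bit quantisation is modelled.

```python
import math

MAX_VALUE = 65536.0  # same scale factor as maxvalue in the encoder


def encode(r, g, b):
    """Model of the method 4 encoder: returns (r, g, b, a), all in [0, 1]."""
    # Clamp to 1 so pure black does not divide by zero.
    max_channel = max(math.ceil(max(r, g, b)), 1.0)
    return (r / max_channel, g / max_channel, b / max_channel,
            max_channel / MAX_VALUE)


def decode(r, g, b, a):
    """Assumed decoder: scale the colour back up by the multiplier in alpha."""
    scale = a * MAX_VALUE
    return (r * scale, g * scale, b * scale)


if __name__ == "__main__":
    # Round-trip a black, a sub-1.0 and an HDR colour.
    for colour in [(0.0, 0.0, 0.0), (0.25, 0.5, 0.75), (300.5, 2.0, 0.125)]:
        print(colour, "->", decode(*encode(*colour)))
```

Without filtering, the round trip should only differ from the input by floating point rounding; the interesting errors appear once neighbouring texels are blended.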
Edit: Unfortunately, if you implement this code straight into a pixel shader you get problems because you're emulating INT maths, so it's not great for on-the-fly calculations.
Now this method of filtering has two sets of errors: the 1st is when blending X+X/65536 with Y, where X is a whole number (worst case), and the 2nd is when crossing the 1.0 boundary.
Now for X+X/65536 with Y the blending is slightly better than Humus' filtering. Where X=1. The error drops as you increase X.
2x contrast
2+2/65536 with 4 97.22% accuracy // not accurate, this is the closest I can get
4x contrast
1+1/65536 with 4 90.01% accuracy
16x contrast
1+1/65536 with 16 79.14% accuracy
256x contrast
1+1/65536 with 256 75.29% accuracy
65536x contrast
1+1/65536 with 65536 ~50% accuracy
Worst case where X=4
16384x contrast
4+4/65536 blended with 65536 has ~90% accuracy.
Okay, now for the average case: X + 1/2 blended with Y. Where X=1. The error drops as you increase X.
2x contrast
2+1/2 with 4 98.71% accuracy // not accurate, this is the closest I can get
4x contrast
1+1/2 with 4 95.45% accuracy
16x contrast
1+1/2 with 16 90.0% accuracy
256x contrast
1+1/2 with 256 87.66% accuracy
65536x contrast
1+1/2 with 65536 87.5% accuracy
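A rough way to reproduce figures like the ones above is to model the filter as a 50/50 blend of two encoded texels followed by a decode, and compare that against the true average. This is a sketch under that assumption (the post does not say exactly how the figures were worked out), it looks at the primary channel only, and the function names are made up for illustration.

```python
import math


def encode(value):
    """Encode one primary-channel value: (normalised value, multiplier).

    The multiplier is what the encoder stores in alpha times 65536; the
    65536 cancels on decode, so it is left out here.
    """
    multiplier = max(math.ceil(value), 1.0)
    return value / multiplier, multiplier


def blend_accuracy(x, y):
    """Ratio of the filtered-then-decoded value to the true 50/50 average."""
    (rx, mx), (ry, my) = encode(x), encode(y)
    # The hardware filters the colour and the multiplier independently...
    filtered_colour = 0.5 * (rx + ry)
    filtered_multiplier = 0.5 * (mx + my)
    # ...and the decoder multiplies the two filtered results together.
    decoded = filtered_colour * filtered_multiplier
    exact = 0.5 * (x + y)
    return decoded / exact


if __name__ == "__main__":
    cases = [(2 + 2 / 65536, 4), (1 + 1 / 65536, 256),  # just above a whole number
             (1.5, 4), (1.5, 16), (1.5, 65536)]         # average case
    for x, y in cases:
        print(f"{x} with {y}: {100 * blend_accuracy(x, y):.2f}% accuracy")
```

Under this model the average-case entries such as 1+1/2 with 4 and 1+1/2 with 16 come out at about 95.45% and 90.0%, which suggests it is at least close to how the lists were produced.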
For blending one value between 2^-16 and 2^0 with one between 2^0 and 2^16 there is an error, and it should be exactly the same as the error in Humus' method 3 (at least when working with 2^x blended with 2^-x).
2^16 blended with 2^-16 50.0015258789059% accuracy
2^8 blended with 2^-8 50.3906190396265% accuracy
2^4 blended with 2^-4 56.2256809338521% accuracy
2^2 blended with 2^-2 73.5294117647059% accuracy
2^1 blended with 2^-1 90% accuracy
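If the blend is taken to be a 50/50 linear filter of the two encoded texels (my assumption), the figures above seem to follow a simple closed form. Writing r_i for the normalised primary channel and M_i for the multiplier stored in alpha (my own notation, not from the encoder), 2^n encodes as r_1 = 1, M_1 = 2^n and 2^-n encodes as r_2 = 2^-n, M_2 = 1, so:

```latex
\[
\text{accuracy}
= \frac{\tfrac{1}{2}(r_1 + r_2)\,\tfrac{1}{2}(M_1 + M_2)}{\tfrac{1}{2}(r_1 M_1 + r_2 M_2)}
= \frac{(1 + 2^{-n})(2^{n} + 1)}{2\,(2^{n} + 2^{-n})}
= \frac{1}{2} + \frac{1}{2^{n} + 2^{-n}}
\]
```

For n = 1 this gives 1/2 + 1/2.5 = 90%, and for n = 2 it gives 1/2 + 1/4.25, about 73.53%, in line with the list. As n grows the second term vanishes and the accuracy tends towards 50%.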
For images with the vast majority of values over 1.0 I'd highly recommend method 4, as method 4 actually has much more precision than FP16 for the primary channel for values over 1.0, and as long as the blends are between values above 1.0 there will be fewer filtering errors. For the occasional blends across the 1.0 boundary the errors are no worse than Humus' method 3.
Edit: Some extremely incorrect comments here will fix tomorrow.
For textures with values below 1.0, this method's colour storage accuracy is exactly the same as a normal clamped INT16 texture.
And of course this work was based on Humus' work,
and thanks to ATI for creating RenderMonkey.