Alphablend errors on R300 (at least)

True, unless the multiplication is made as a hand-optimised block that the synthesizer doesn't know is distributive over addition.

The multiplier generated by an old Synopsys version we used a long time ago sucked so badly that I had to make my own "multiplier generator". It gave a significant improvement in size and speed, but it would have left the synthesizer with a hard task to see that optimisation.

But I guess current synthesizers are a lot better.
 
I wrote a stupid little app to test this which I will post later and I made the following (potentially wrong) findings. Firstly, Simon's and Zeck's method always agreed; this is pretty obvious though. Second, there were about 12 cases out of the 32896 unique combinations of two bytes in which Simon's and the 64-bit float precision blend disagreed. However Basic's and the double precision always arrived at the same answer.

EDIT: Here is the source (MS VC++ 6.0). Click on the second link "Alpha Blend Correctness Tester". I'd love to post an executable, but my Visual Studio is "academic use only." :rolleyes:
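
For anyone who wants to reproduce this kind of check without the original source, here is a rough sketch of what such a tester might look like. It is not the posted program: `blend_double`, `blend_shift_add` and `count_mismatches` are made-up names, and the approximate routine is only an assumed stand-in (the common shift-and-add form), not necessarily any of the methods discussed above.

```c
#include <assert.h>

/* Reference: blend in double precision and round to nearest.
   x and a are 8-bit values (0..255). The fraction of x*a/255 is never
   exactly .5 because 255 is odd, so adding 0.5 and truncating is safe. */
static unsigned blend_double(unsigned x, unsigned a)
{
    assert(x <= 255 && a <= 255);
    return (unsigned)((double)(x * a) / 255.0 + 0.5);
}

/* Assumed stand-in for an approximate integer blend (hypothetical,
   not taken from the thread): add the product shifted down by 8,
   add half an LSB, then drop the low byte. */
static unsigned blend_shift_add(unsigned x, unsigned a)
{
    unsigned temp;
    assert(x <= 255 && a <= 255);
    temp = x * a;
    return (temp + (temp >> 8) + 0x80) >> 8;
}

typedef unsigned (*blend_fn)(unsigned, unsigned);

/* Count the unordered byte pairs for which f disagrees with the
   double-precision reference. If pairs_tested is non-null it receives
   the number of pairs visited. */
static unsigned count_mismatches(blend_fn f, unsigned *pairs_tested)
{
    unsigned x, a, bad = 0, total = 0;
    for (x = 0; x <= 255; ++x) {
        for (a = x; a <= 255; ++a) {   /* a >= x: each pair once */
            ++total;
            if (f(x, a) != blend_double(x, a))
                ++bad;
        }
    }
    if (pairs_tested)
        *pairs_tested = total;
    return bad;
}
```

Calling `count_mismatches(blend_shift_add, &total)` walks all 32896 unordered pairs. One pair where this stand-in is off by one is 191 and 253: the product is 48323, the reference rounds to 190, and the shift-and-add form returns 189.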
 
akira888 said:
I wrote a stupid little app to test this which I will post later and I made the following (potentially wrong) findings. Firstly, Simon's and Zeck's method always agreed; this is pretty obvious though. Second, there were about 12 cases out of the 32896 unique combinations of two bytes in which Simon's and the 64-bit float precision blend disagreed. However Basic's and the double precision always arrived at the same answer.
Hee hee. If you were really paranoid then you could hard code those 12 cases (there'd be symmetry so I'd imagine it'd be fairly cheap!). Anyway, these are all trivialities. As long as 0xFF=> 1.0 and 0x00=>0.0 and everything in between is monotonic, who cares? ;-)
 
I did some more investigation and discovered that this works as well -

Code:
unsigned long temp;
temp = x * ((a << 8) | a);  /* x * a * 257 */
temp >>= 7;
temp++;                     /* add 1 at bit 7 of the product */
temp >>= 8;
temp++;                     /* add 1 at bit 15 (round to nearest) */
temp >>= 1;
return temp;

The reason is that the 16-bit remainder discarded by the zeck/Simon method was always greater than 65408 when the result was in error, and always less than that when the result was correct. In other words, in the first case all nine of the highest-order bits (bits 7-15) were set, and in the second case they were never all set. So adding a one at bit 7, the lowest of those nine bits, pulls the integer value up to correctness.
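
To see what the shift/increment sequence above actually computes, it can help to fold it into a single expression. The sketch below is only an illustration: `blend_steps`, `blend_folded` and `forms_agree` are invented names, and x and a are assumed to be 8-bit values passed as unsigned ints.

```c
#include <assert.h>

/* The posted sequence wrapped as a function; x and a are 0..255. */
static unsigned blend_steps(unsigned x, unsigned a)
{
    unsigned long temp;
    assert(x <= 255 && a <= 255);
    temp = (unsigned long)x * ((a << 8) | a);  /* x * a * 257 */
    temp >>= 7;
    temp++;
    temp >>= 8;
    temp++;
    temp >>= 1;
    return (unsigned)temp;
}

/* The same value in one expression. With p = x*a:
     (257*p) >> 7           ==  2*p + (p >> 7)
     ((u >> 8) + 1) >> 1    ==  (u + 256) >> 9
   so the first increment contributes 1 and the second contributes 256. */
static unsigned blend_folded(unsigned x, unsigned a)
{
    unsigned long p;
    assert(x <= 255 && a <= 255);
    p = (unsigned long)x * a;
    return (unsigned)((2 * p + (p >> 7) + 257) >> 9);
}

/* Returns 1 if both forms give the same result for every pair of
   bytes, 0 otherwise. */
static int forms_agree(void)
{
    unsigned x, a;
    for (x = 0; x <= 255; ++x)
        for (a = 0; a <= 255; ++a)
            if (blend_steps(x, a) != blend_folded(x, a))
                return 0;
    return 1;
}
```

As a spot check, 191 and 253 (product 48323) come out as 190 in both forms, which matches 48323/255 = 189.502 rounded to nearest.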

Simon F said:
As long as 0xFF=> 1.0 and 0x00=>0.0 and everything in between is monotonic, who cares?

Circa 1998 I had a V2 and my then roommate a TNT1. For identical scenes in the same game (on almost the same monitor) my games always looked brighter. I now wonder whether 3dFX was calculating dither values in gamma space rather than linear space.
 
I did say the errors were few. :)

I doubt that hard coding for the 2*12 cases would be smaller than the addition of the (temp>>15) term here. (It's just adding one bit.)
Code:
temp = x*a; 
result = (temp + (temp>>8) + (temp>>15) + 0x80) >> 8;
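
A quick way to convince yourself of the two-line formula above is to compare it against a plain rounded divide over every pair of bytes. This is only a sketch: the function names are made up, and the divide-based reference is my own choice of yardstick rather than something from the thread.

```c
#include <assert.h>

/* Integer reference: round(x*a / 255). Since 255 is odd the quotient
   never lands exactly on .5, so adding 127 before the truncating
   divide rounds to nearest. */
static unsigned blend_divide(unsigned x, unsigned a)
{
    assert(x <= 255 && a <= 255);
    return (x * a + 127) / 255;
}

/* The formula from the post, wrapped as a function. */
static unsigned blend_corrected(unsigned x, unsigned a)
{
    unsigned temp;
    assert(x <= 255 && a <= 255);
    temp = x * a;
    return (temp + (temp >> 8) + (temp >> 15) + 0x80) >> 8;
}

/* Count the ordered pairs (x, a) where the two routines differ. */
static unsigned count_differences(void)
{
    unsigned x, a, bad = 0;
    for (x = 0; x <= 255; ++x)
        for (a = 0; a <= 255; ++a)
            if (blend_corrected(x, a) != blend_divide(x, a))
                ++bad;
    return bad;
}
```

For example, with 191 and 253 the product is 48323, so (temp>>15) contributes 1 and the sum becomes 48640, which shifts down to 190. Without that one extra bit the sum would be 48639 and the result 189.
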
akira888:
Yes, that should indeed give the right result. And if it's sent through a synthesizer that can find the optimizations that Simon mentioned, and then optimize away a 1-bit adder, it will end up with...
...the stuff I wrote above. :)

And yes, I know I'm weird for thinking this is fun. :D
 