Deano Calver
Newcomer
I noticed Chris Egerter talking about per-pixel specular and I thought I'd mention an article for ShaderX2 that's been done by a friend of mine.
Matt Halpin (formerly of Creature Labs and now of Frontier Studios) has several techniques for per-pixel specular on basic pixel shaders. I'll quickly summarise them here but his article makes a lot more sense.
The first uses dependent lookups to get an (almost) arbitrary exponent on ps_1_1 devices. Basically you pick 4 exponents that span the range you want (e.g. 0, 20, 40, 60) and then use a dp4 to blend between them (4 basis functions provide the blending factors). In other words, you SIMD-process (N dot L)^p for 4 different values of p (doing 4 fixed-exponent lookups) and use the exponent stored per texel to interpolate between the 4 (N dot L)^p values.
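To make the blending step concrete, here is a rough CPU-side sketch in Python of how I read the first technique. It is not Matt's shader code. The names (`EXPONENTS`, `blend_weights`, `fixed_lookups`, `dp4`, `approx_specular`) are made up for illustration, and the linear "hat" basis functions are my assumption; the article may well use a different basis.

```python
# Illustrative sketch only: simulates the 4-exponent blend on the CPU.
# Assumption: the 4 basis functions are linear "hat" functions over EXPONENTS.

EXPONENTS = (0.0, 20.0, 40.0, 60.0)  # the example range from the post


def blend_weights(p):
    """Return 4 blend factors for exponent p (hypothetical hat-function basis)."""
    p = min(max(p, EXPONENTS[0]), EXPONENTS[-1])  # clamp to the covered range
    weights = [0.0, 0.0, 0.0, 0.0]
    for i in range(3):
        lo, hi = EXPONENTS[i], EXPONENTS[i + 1]
        if lo <= p <= hi:
            t = (p - lo) / (hi - lo)
            weights[i] = 1.0 - t
            weights[i + 1] = t
            break
    return weights


def fixed_lookups(d):
    """Stand-in for the 4 fixed-exponent texture lookups: d^p for each p."""
    d = min(max(d, 0.0), 1.0)  # a shader would saturate the dot product
    return [d ** e for e in EXPONENTS]


def dp4(a, b):
    """4-component dot product, mirroring the shader instruction."""
    return sum(x * y for x, y in zip(a, b))


def approx_specular(d, p):
    """Blend the 4 fixed-exponent results using the per-texel exponent p."""
    return dp4(fixed_lookups(d), blend_weights(p))
```

Calling `approx_specular(0.9, 20.0)` reproduces 0.9^20, because 20 is one of the chosen exponents. For exponents in between you only get an approximation, and how good it is depends on the basis functions and on how the 4 exponents are spaced.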
The other takes that technique and merges it with the cubemap, so that instead of receiving a single exponent you receive the 4 blend factors, and a dot product then gives you an arbitrary exponent. Voilà: PS1_1 per-pixel specular with arbitrary exponent and normalised half-angles in a single pass.
The key insight is that 4 samples on the discretised set of exponent curves provide enough information for interpolation to work well.
Trust me, it makes a lot more sense in Matt's article.