Shader optimisations on my R300

K.I.L.E.R · Apr 8, 2006

Fixed and here are the results:

Before (NO FILTER):
http://members.optusnet.com.au/ksaho/work/imageProcessing/before.PNG

After (RGB -> YIQ -> high pass on intensity (Y component) -> RGB):
http://members.optusnet.com.au/ksaho/work/imageProcessing/after.PNG

Not a good idea to run at 1280x1024 mind you.

Runs very fast at 640x480.

Here's my pixel shader code if anyone's interested:
http://members.optusnet.com.au/ksaho/work/imageProcessing/gpuPeteOGL2.slf

The vertex shader is in the same directory, but it's a pass-through shader so it's mostly useless.

Remember, I have no access to Pete's source code, so I can only work with the given parameters: OGL2Size, OGL2Parameters (z being the pixel shader level) and TMU0-TMU7.

I don't think there is much I can do but I do know that Ati + swizzle = bad.
So which is faster on Ati hardware:

n = vec3(some)
or
n = some.xyz?

Also what can I do to further gain performance?

Mintmaster · Apr 8, 2006

A few points:
- Where in the shader do you update or even initialize the y and z components of texColor? Can I assume you meant to put "texColor.yz = rgbTOyiq * original" before the loop?

- texColor.x += vec3(rgbTOyiq * tmp.xyz).x;
Why are you multiplying a 3x3 matrix by a vector and then only using one component? That's simply a dot product with the first row of the matrix.

- Your conversions and filters are linear operators, so take advantage of that property. You can simplify the math by doing some algebra.
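The two points about the dot product and linearity can be checked numerically. The following is a minimal Python sketch, not the shader itself. The matrix values are the commonly quoted NTSC RGB to YIQ coefficients and are an assumption, since the actual rgbTOyiq values are not shown in the thread; the sample colours and helper names are hypothetical.

```python
# Commonly quoted NTSC RGB -> YIQ coefficients (assumed; the shader's own
# rgbTOyiq values are not shown in the thread).
RGB_TO_YIQ = [
    [0.299, 0.587, 0.114],
    [0.596, -0.274, -0.322],
    [0.211, -0.523, 0.312],
]


def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [dot(row, v) for row in m]


def luma_via_matrix(rgb):
    """Full matrix multiply, then keep only the first component."""
    return mat_vec(RGB_TO_YIQ, rgb)[0]


def luma_via_dot(rgb):
    """The same value from a single dot product with the Y row."""
    return dot(RGB_TO_YIQ[0], rgb)


# Hypothetical sample colours.
samples = [[0.2, 0.4, 0.6], [0.9, 0.1, 0.3], [0.5, 0.5, 0.5]]

# Linearity: converting each sample and summing matches summing the
# samples first and converting once.
sum_of_lumas = sum(luma_via_matrix(s) for s in samples)
summed_rgb = [sum(channel) for channel in zip(*samples)]
luma_of_sum = luma_via_dot(summed_rgb)
```

In shader terms this is why the conversion can be moved out of the loop: accumulate the raw texture samples and convert the sum once.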

Looking at your code, this should give you the same result:

Code:

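original = texture2D(OGL2Texture, gl_TexCoord[0].xy + maskPos[4]);  // centre sample

vec4 sum = vec4(0.0);
for (int i = 0; i < SAMPLE_SIZE; i++)
    sum += texture2D(OGL2Texture, gl_TexCoord[0].xy + maskPos[i]);

k = OGL2Param.z * 0.05;
tmp = k * sum - original;
// Only the Y row of rgbTOyiq is needed, so a dot product replaces the matrix multiply.
f = dot(vec3(0.299, 0.587, 0.114), tmp.xyz) - filter[4] * k * original.x;
// Adding f to Y in YIQ space is the same as adding f to each of R, G and B.
gl_FragColor = original + vec4(f, f, f, 0.0);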

I checked this in MATLAB using random sample values and I get the same result as your original code. I'm assuming my first point is true, because otherwise I don't know what you want in the y and z channels of texColor. You're also sampling the original position twice, so that's extra work you can snip out as well.
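The final line of the code above adds the same value f to every colour channel instead of converting back from YIQ. A minimal Python sketch of why that works is below. It assumes the commonly quoted NTSC YIQ to RGB coefficients (the shader's own yiqTOrgb values are not shown in the thread), and the input values are hypothetical. The key property is that the first column of the matrix is all ones.

```python
# Commonly quoted NTSC YIQ -> RGB coefficients (assumed). The first
# column is all ones, so Y contributes equally to R, G and B.
YIQ_TO_RGB = [
    [1.0, 0.956, 0.621],
    [1.0, -0.272, -0.647],
    [1.0, -1.106, 1.703],
]


def yiq_to_rgb(yiq):
    """Multiply the 3x3 conversion matrix by a YIQ triple."""
    return [sum(m * c for m, c in zip(row, yiq)) for row in YIQ_TO_RGB]


def add_to_luma(yiq, f):
    """Return a copy of yiq with f added to the Y component only."""
    return [yiq[0] + f, yiq[1], yiq[2]]


# Hypothetical YIQ pixel and luma offset.
yiq = [0.4, 0.1, -0.05]
f = 0.125

base = yiq_to_rgb(yiq)
shifted = yiq_to_rgb(add_to_luma(yiq, f))

# Per-channel change caused by the luma offset.
delta = [s - b for s, b in zip(shifted, base)]
```

Each entry of delta comes out equal to f (up to floating-point rounding), so the round trip through YIQ can be replaced by a single add.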

This should run a heck of a lot faster than your original code.

K.I.L.E.R · Apr 8, 2006

The reason y and z values aren't used is because the image processing operations are done on the intensity dimension and not on the colour information.

I'm going to run your code at 1280 to see how it runs.
Thanks for the pointers.

Humus · Apr 9, 2006

K.I.L.E.R said:
I don't think there is much I can do but I do know that Ati + swizzle = bad.
So which is faster on Ati hardware:

n = vec3(some)
or
n = some.xyz?

Swizzle on R300/R420 is fine as long as you stick to xxxx, yyyy, zzzz, wwww, xyzw, yzxw, zxyw or wzyx. Of course, vec2 and vec3 versions of the above will work too, since it's just a write mask, so .wz is just .wzyx with yx masked away. Using vec3() or .xyz on a vec4 variable is equivalent and doesn't make any difference.
On R520 fully arbitrary swizzles are supported.

K.I.L.E.R · Apr 9, 2006

Thanks.
Always a benefit to have an Ati employee on these boards.

Mintmaster · Apr 9, 2006

K.I.L.E.R said:
The reason y and z values aren't used is because the image processing operations are done on the intensity dimension and not on the colour information.

I understand that but you don't have anything in those values.

This determines the final pixel colour:
gl_FragColor = vec4(yiqTOrgb * texColor.xyz, 0.0);

The problem is that you haven't initialized texColor.y or texColor.z to anything in your code. It's a fluke you're even getting something remotely similar to what you're aiming for. Judging from your description of what you are attempting to do, texColor.yz = rgbTOyiq * original is what you need there.

By the way, have you done any tests elsewhere to see if your algorithm gets you any benefit over just applying the low-pass filter over everything? You're making a big leap of faith in thinking your output image is correct.

K.I.L.E.R · Apr 9, 2006

I initially do all my work in RenderMonkey.
The final output was much brighter and I made sure the texture had nearest neighbour filtering for both min and mag.

The intensity really came out vs the original image.

Mintmaster · Apr 9, 2006

Yeah, but increasing intensity is a simple scale - a one instruction shader. Do you know what the difference in the image would be between blurring just intensity and blurring everything? It's not at all obvious for an image like the one you posted.

I'm just saying you should try running a simple low pass filter shader without the colour conversion instructions to see if you get the output you desire. I bet you'll get pretty much the same output as you currently do.
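One way to see what that comparison involves, as a rough Python sketch rather than a shader: the code below box-filters a hypothetical 3x3 neighbourhood in two ways, once over all of RGB and once over intensity only. It assumes the standard NTSC luma weights and a YIQ to RGB matrix whose first column is all ones, so a change in Y is applied as an equal offset to R, G and B. All names and sample values are made up for illustration.

```python
# Y row of the standard NTSC RGB -> YIQ matrix (assumed).
LUMA = [0.299, 0.587, 0.114]


def luma(rgb):
    """Intensity (Y) of an RGB triple."""
    return sum(w * c for w, c in zip(LUMA, rgb))


def blur_everything(samples):
    """Box-filter all three RGB channels."""
    n = len(samples)
    return [sum(channel) / n for channel in zip(*samples)]


def blur_intensity_only(samples, centre):
    """Box-filter Y only, keeping the centre pixel's I and Q.

    Shifting Y by d maps back to RGB as adding d to R, G and B.
    """
    mean_y = sum(luma(s) for s in samples) / len(samples)
    d = mean_y - luma(centre)
    return [c + d for c in centre]


# Hypothetical 3x3 neighbourhood; index 4 is the centre pixel.
samples = [
    [0.9, 0.2, 0.1], [0.8, 0.3, 0.1], [0.7, 0.3, 0.2],
    [0.2, 0.6, 0.3], [0.5, 0.5, 0.5], [0.1, 0.7, 0.4],
    [0.1, 0.2, 0.9], [0.2, 0.2, 0.8], [0.3, 0.1, 0.7],
]

both = blur_everything(samples)
y_only = blur_intensity_only(samples, samples[4])

print("luma:", luma(both), luma(y_only))
print("rgb: ", both, y_only)
```

Under these assumptions the two results have the same intensity and differ only in chroma, so on an image without strong colour edges the visible difference is likely to be small.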

Here's another weird thing about your shader:
texColor.x -= original.x * filter[4];

'original' is a RGB value. texColor is a YIQ value. So you're subtracting part of the intensity based on the red content of the centre sample? Why?

K.I.L.E.R · Apr 9, 2006

Whoops. I didn't see that.
Thanks.

I will redo my entire thing. Looks as though I've got my stuff all mixed up.
I knew I'd make a few mistakes but I didn't think I would make too many.

REDID with much better results:

Code:

original = texture2D(OGL2Texture, gl_TexCoord[0].xy);

texColor = vec4(0.0);  // start the accumulator from zero
for (int i = 0; i < SAMPLE_SIZE; i++)
    texColor += texture2D(OGL2Texture, gl_TexCoord[0].xy + maskPos[i]);
texColor /= 9.0;  // average, assuming SAMPLE_SIZE is 9

texColor *= 0.25;
texColor = original - texColor;
texColor *= 1.25;

gl_FragColor = texColor;
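As a quick sanity check of what those lines compute, here is a single-channel Python sketch of the same arithmetic. It assumes a 3x3 neighbourhood that includes the centre pixel; the function name and input values are hypothetical.

```python
def redone_filter(original, neighbourhood):
    """Single-channel version of the shader arithmetic above.

    Assumes `neighbourhood` holds the 9 samples of a 3x3 window,
    including the centre pixel.
    """
    blur = sum(neighbourhood) / 9.0
    return (original - 0.25 * blur) * 1.25


# Flat region: every sample has the same value.
flat = redone_filter(0.8, [0.8] * 9)

# Bright centre pixel surrounded by darker neighbours.
edge = redone_filter(0.8, [0.2] * 4 + [0.8] + [0.2] * 4)

print(flat, edge)
```

Algebraically the result is 1.25 * original - 0.3125 * blur. A flat region at 0.8 comes out at roughly 0.75, slightly darker than the input, while a pixel brighter than its neighbours is pushed up relative to them, which is a mild sharpening.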

The Ati building in RenderMonkey appears much sharper than the original, which also means shimmering.

EDIT2: It doesn't appear to be doing anything.
Just loaded a "noise" texture.
I must be misreading my silly book on the way it's done.
