Pavlos said:
Unless you have finished adding all the functionality in your engine/library/renderer, no matter how much time you have spent designing the whole thing, chances are that something will come up and invalidate part of your design/code. The whole issue with the partial derivatives is, I think, a good example. Maybe in a few months you will decide to use a tile-based deferred renderer using the A-buffer/Z3 algorithm.
I'm not finished adding functionality, but I already have a complete list of things to be added, which I try not to touch any more. So there should only be small design changes (like the one needed for dsx/dsy).
Also, Intel will be introducing new vector instructions on every major processor release. The horizontal vector instructions of Prescott can easily invalidate any decision between SoA and AoS. Not to mention Apple’s AltiVec. Anyway, your project seems nearly finished, so you probably won’t have any problems.
Well, the conditional run-time compilation makes it very easy to optimize specifically for one processor. For the switch to SoA some more work is required, but it all remains nicely abstract, unlike the mess that hard-coded assembly brings.
Do you have any reason to recompile the shaders for each frame? Do you re-optimize the shader for every object or something??? Usually shaders are compiled at content creation time, so compilation speed is not an issue, and I think this can be the case with real-time rendering (correct me if I’m wrong).
Well, the first example that pops into my head is optimizing minification and magnification filtering in the fixed-function pipeline. For magnification a bilinear filter is sufficient, since there are no higher-resolution mipmaps. For minification a trilinear filter is desired. The usual way to implement this is to check per pixel whether there is minification or magnification.
The way hardware handles this is simple: a handful of transistors decide whether to use bilinear or trilinear filtering, at virtually no extra cost. In software it's less advantageous, because the cost of the check (and possible branch mispredictions) could in total be higher than what bilinear filtering saves. So most software renderers just do the full trilinear filter.
My method is to generate versions of the fixed-function pipeline specifically for bilinear and trilinear filtering. If a triangle has magnification at all vertices, it can be rendered with the bilinear version; otherwise I use the trilinear version. For the fixed-function pipeline the list of specializations that can be done this way is endless (I cache the last 16 combinations of render settings).
Another example: with bilinear filtering, if the mipmap LOD is the same at every vertex, then no mipmap coefficient has to be interpolated and recomputed per pixel. Similar things are possible for shaders, because the application can switch certain render states while keeping the same shader. The conclusion is that none of this would have been possible if I couldn't compile several shaders per frame...
It’s good to hear that you can easily port SoftWire to other platforms. I think portability is vital for any project. I don’t want my code tied to a specific architecture or operating system. For example, Apple’s G5 (aka IBM’s PPC970) is an amazing (and too expensive) platform for software rendering. Using a G5 at Siggraph, Pixar was rendering an untouched frame from Finding Nemo in only 4 minutes!
Sweet! The most abstract I can go with SoftWire is to make it like a C-like language with vector types. For example:
float4 a = ...;
float4 b = ...;
a += b;
If float4 is a SoftWire class which generates the corresponding code for the operations performed on it (using operator overloading), you get the ultimate in portability, readability and performance.
Any sponsors for a Mac version?