J. Pineda, A Parallel Algorithm for Polygon Rasterization, Computer Graphics, Vol. 22,
1988, Nr. 4, 17–20.
Well, maybe not so surprisingly, it seems no one at RAD knew about the parallel rasterization algorithms first used on Pixel Planes, which map perfectly onto a vector or SIMD architecture. The same goes for other papers using the same method combined with recursive rasterization:
Incremental and hierarchical Hilbert order edge equation polygon rasterization
Or 'software' triangle setup:
Triangle scan conversion using 2D homogeneous coordinates
Akeley, most likely, put a reference to the Olano and Greer paper (which pointed to the old Pineda paper) and the words 'recursive descent' in the 2001 Real-Time Graphics Architectures course (see Rasterization, slide 34). From that hint, so many years ago, I made the clueless decision that it should be the method used to implement the rasterizer in ATTILA. You may look for the code in emul/RasterizerEmulator.cpp (the code is here, not on my old page; I should change the signature), but good luck understanding it: there are two versions of the algorithm (single triangle and parallel triangle processing) and, mixed in there, a typical 'scan line' (actually scan tiled) rasterizer based on one of the papers about Compaq's Neon graphics processor, which makes that file even less readable.
Of course, once it was implemented I finally hit the wall of the long start-up times required for recursive rasterization. As it was just a simulator (and performance was never a problem, 1 hour per frame is really fast) the solution was to 'cheat': adding more ALUs than I would have found reasonable at that time, processing more than one triangle and tile per cycle in parallel, and using bounding boxes to select the start tile size rather than starting at the framebuffer resolution. That brought the throughput of a single rasterizer using the method to a reasonable number (ATTILA has always worked by generating blocks of 8x8 fragments, which in the end is also a problem for other reasons).
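To make the bounding-box trick concrete, here is a minimal sketch of recursive descent over edge equations. It is not the ATTILA code: every name in it (Point, Edge, Triangle, makeTriangle, descend, startSizeFromBBox) is made up for illustration, it samples at integer pixel coordinates with no fill rule, and it assumes the triangle is on screen with its vertices ordered so the interior is on the non-negative side of every edge. The tests counter is only there so different start tiles can be compared.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// All names are hypothetical, for illustration only (not ATTILA's API).
struct Point { int x, y; };
struct Edge { int64_t a, b, c; };              // E(x, y) = a*x + b*y + c
struct Triangle { Edge e[3]; Point lo, hi; };  // edges + inclusive bounding box

// Edge equation of the line through p -> q.
Edge makeEdge(Point p, Point q) {
    return { int64_t(p.y) - q.y, int64_t(q.x) - p.x,
             int64_t(p.x) * q.y - int64_t(q.x) * p.y };
}

// Assumption: vertex order makes E >= 0 on the interior side of all edges.
Triangle makeTriangle(Point v0, Point v1, Point v2) {
    Triangle t;
    t.e[0] = makeEdge(v0, v1);
    t.e[1] = makeEdge(v1, v2);
    t.e[2] = makeEdge(v2, v0);
    t.lo = { std::min({v0.x, v1.x, v2.x}), std::min({v0.y, v1.y, v2.y}) };
    t.hi = { std::max({v0.x, v1.x, v2.x}), std::max({v0.y, v1.y, v2.y}) };
    return t;
}

// Recursive descent over the size x size pixel tile whose first pixel is (x, y).
void descend(const Triangle& t, int x, int y, int size,
             std::vector<Point>& out, long& tests) {
    assert(size > 0 && (size & (size - 1)) == 0);    // power-of-two tiles only
    ++tests;
    const int x1 = x + size - 1, y1 = y + size - 1;  // last pixel in the tile
    bool allInside = true;
    for (const Edge& e : t.e) {
        // Tile corners where this edge equation is largest and smallest.
        const int64_t hi = e.a * (e.a >= 0 ? x1 : x) + e.b * (e.b >= 0 ? y1 : y) + e.c;
        const int64_t lo = e.a * (e.a >= 0 ? x : x1) + e.b * (e.b >= 0 ? y : y1) + e.c;
        if (hi < 0) return;              // whole tile outside this edge: reject
        if (lo < 0) allInside = false;   // tile straddles this edge
    }
    if (allInside) {                     // trivially accepted: emit every pixel
        for (int py = y; py <= y1; ++py)
            for (int px = x; px <= x1; ++px) out.push_back({px, py});
        return;
    }
    const int h = size / 2;              // otherwise split into four sub-tiles
    descend(t, x,     y,     h, out, tests);
    descend(t, x + h, y,     h, out, tests);
    descend(t, x,     y + h, h, out, tests);
    descend(t, x + h, y + h, h, out, tests);
}

// Smallest power-of-two tile that covers the bounding box when anchored at t.lo.
int startSizeFromBBox(const Triangle& t) {
    const int extent = std::max(t.hi.x - t.lo.x, t.hi.y - t.lo.y) + 1;
    int size = 1;
    while (size < extent) size *= 2;
    return size;
}
```

For a small triangle such as (0,0), (8,0), (0,8), calling descend from a 1024x1024 tile at the origin and calling it from the tile chosen by startSizeFromBBox at t.lo emit the same pixels; the second call just skips the upper levels of the descent, which is where the start-up time goes.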
All of the setup and rasterization process was perfectly mappable to SIMD (you can see 4-way SIMD cover functions in the code; the idea was that they could be mapped to ARB-like 4-way SIMD fragment shader instructions). In fact, we presented triangle setup on the shader processor in a paper some years ago but never got to move the rasterization code to the shader. We started working on other unrelated topics ... and the shader processor never implemented branches.
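As a rough idea of what a 4-way SIMD coverage test can look like, here is a sketch that evaluates each edge equation at the four pixels of a 2x2 quad using only component-wise multiply-adds, the kind of operation a 4-component shader ALU offers. This is my illustration of the idea, not the cover functions from the ATTILA source: Vec4, mad, splat, EdgeEq, evalEdgeQuad and coverQuad are all hypothetical names, and the pixel-to-lane layout is an arbitrary choice.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;  // stand-in for a 4-component shader register

// Component-wise multiply-add: r = a * b + c in every lane.
Vec4 mad(const Vec4& a, const Vec4& b, const Vec4& c) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i) r[i] = a[i] * b[i] + c[i];
    return r;
}

// Broadcast a scalar to all four lanes.
Vec4 splat(float s) { return {s, s, s, s}; }

struct EdgeEq { float a, b, c; };   // E(x, y) = a*x + b*y + c

// Evaluate one edge equation at the four pixels of the 2x2 quad that starts
// at (x, y). Lane order: (x, y), (x+1, y), (x, y+1), (x+1, y+1).
Vec4 evalEdgeQuad(const EdgeEq& e, float x, float y) {
    const Vec4 xs = {x, x + 1, x, x + 1};
    const Vec4 ys = {y, y, y + 1, y + 1};
    return mad(splat(e.a), xs, mad(splat(e.b), ys, splat(e.c)));
}

// 4-bit coverage mask: bit i is set when pixel i is on the non-negative
// side of all three edges.
unsigned coverQuad(const EdgeEq (&edges)[3], float x, float y) {
    unsigned mask = 0xFu;
    for (const EdgeEq& e : edges) {
        const Vec4 v = evalEdgeQuad(e, x, y);
        for (int i = 0; i < 4; ++i)
            if (v[i] < 0.0f) mask &= ~(1u << i);
    }
    return mask;
}
```

The evaluation itself needs no branches, only two multiply-adds per edge; the branchy part is deciding what to do with the mask, which is one reason a shader processor without branches makes the full recursive descent awkward to move over.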
But it isn't that strange that something that was relatively well known in the graphics hardware community (actually, how many people here even know what Pixel Planes was?) wasn't known in the software community. They have been making mostly serial rasterizers for decades with good success (because they targeted serial CPUs). And I'm pretty sure they will make a way more clever implementation than the one that can be found in the ATTILA source code.
Disclaimer: As you can see, this post is merely cheap self-promotion of ATTILA.