Scali said:
Don't you need to calc gradients anyway? The edges would only add two extra gradients, which should be relatively cheap.
Gradients like for color and texture coordinates? Yes, I have to calculate those anyway, but they're not part of the rasterization process in the narrow sense; that calculation is mostly independent of it. Calculating gradients for edge stepping is different. I use an integer DDA because it's exact and eliminates the need for prestepping per scanline, but it still requires conversions to fixed-point and back, six divisions, and the DDA steps themselves. For small triangles, that's a lot of setup.
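To make that setup cost concrete, here's a rough sketch of the per-edge setup (simplified, with illustrative names, not my actual code; an exact integer DDA also splits each slope into a quotient and remainder, which is roughly where the six divisions come from):

    struct EdgeDDA
    {
        int x;     // current x along the edge, 16.16 fixed-point
        int xStep; // x increment per scanline, 16.16 fixed-point
    };

    // One division per edge just for the slope, plus the float to
    // fixed-point conversions, all before a single pixel is drawn.
    EdgeDDA setupEdge(float x0, float y0, float x1, float y1)
    {
        EdgeDDA edge;

        float slope = (x1 - x0) / (y1 - y0); // the division
        edge.xStep = (int)(slope * 65536.0f);
        edge.x = (int)(x0 * 65536.0f);

        return edge;
    }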
Using the half-space functions avoids this. Pixel coverage for a block is calculated in one fast go. No divisions, no jumps, no float to integer conversion, no prestepping.
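To illustrate, here's a rough sketch of the half-space test (illustrative names, one winding order assumed, and the plain '> 0' test stands in for a proper fill rule):

    struct Edge
    {
        int a, b, c; // E(x, y) = a*x + b*y + c, positive on the inside

        void setup(int x0, int y0, int x1, int y1)
        {
            a = y1 - y0;
            b = x0 - x1;
            c = -(a * x0 + b * y0);
        }

        int evaluate(int x, int y) const
        {
            return a * x + b * y + c;
        }
    };

    // Coverage mask for a 2x2 quad at (x, y): bit i is set when pixel i
    // lies inside all three half-spaces. No divisions, no prestepping;
    // a real implementation steps the edge values incrementally instead
    // of evaluating them from scratch per pixel.
    unsigned quadCoverage(const Edge edge[3], int x, int y)
    {
        unsigned mask = 0;

        for (int i = 0; i < 4; i++)
        {
            int px = x + (i & 1);  // bit 0 selects the column
            int py = y + (i >> 1); // bit 1 selects the row

            if (edge[0].evaluate(px, py) > 0 &&
                edge[1].evaluate(px, py) > 0 &&
                edge[2].evaluate(px, py) > 0)
            {
                mask |= 1u << i;
            }
        }

        return mask;
    }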
Scali said:
You need to have code that handles partially covered quads anyway.
Not really. The coverage masks per quad are literally masks. The pixel pipeline always shades four pixels completely in parallel. The MMX maskmovq instruction then very conveniently writes only the pixels that are in the mask.
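Roughly like this, a sketch assuming 32-bit pixels on a little-endian machine (the shading itself is omitted; the point is the branchless masked store):

    #include <xmmintrin.h> // _mm_maskmove_si64, i.e. maskmovq

    // Expands a 2-bit row mask into a byte mask for two 32-bit pixels.
    static const unsigned long long maskExpand[4] =
    {
        0x0000000000000000ull, // neither pixel covered
        0x00000000FFFFFFFFull, // left pixel covered
        0xFFFFFFFF00000000ull, // right pixel covered
        0xFFFFFFFFFFFFFFFFull  // both pixels covered
    };

    // Writes a shaded 2x2 quad. 'coverage' bits 0..3 are the quad mask
    // (bit 0 = top-left, bit 1 = top-right, bit 2 = bottom-left,
    // bit 3 = bottom-right). Uncovered pixels are left untouched.
    void writeQuad(unsigned *dest, int pitch, __m64 top, __m64 bottom,
                   unsigned coverage)
    {
        __m64 topMask    = *(const __m64 *)&maskExpand[coverage & 3];
        __m64 bottomMask = *(const __m64 *)&maskExpand[(coverage >> 2) & 3];

        _mm_maskmove_si64(top, topMask, (char *)dest);
        _mm_maskmove_si64(bottom, bottomMask, (char *)(dest + pitch));

        _mm_empty(); // leave MMX state before touching the FPU again
    }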
Scali said:
But you said there is more per-pixel overhead.
Yes, a little, for partially covered blocks. Non-covered or completely covered blocks are close to free. That's what this new algorithm is all about.
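That classification is just the same edge functions evaluated at the block corners: each edge function is linear, so its extremes over a rectangle occur at the corners, and four evaluations decide the whole block. A rough sketch, reusing the Edge struct from above:

    enum BlockCoverage { Outside, Partial, Inside };

    BlockCoverage classifyBlock(const Edge edge[3], int x, int y, int size)
    {
        bool partial = false;

        for (int i = 0; i < 3; i++)
        {
            bool c00 = edge[i].evaluate(x, y) > 0;
            bool c10 = edge[i].evaluate(x + size - 1, y) > 0;
            bool c01 = edge[i].evaluate(x, y + size - 1) > 0;
            bool c11 = edge[i].evaluate(x + size - 1, y + size - 1) > 0;

            if (!c00 && !c10 && !c01 && !c11)
            {
                return Outside; // whole block outside this edge: rejected
            }

            if (!(c00 && c10 && c01 && c11))
            {
                partial = true; // this edge crosses the block
            }
        }

        return partial ? Partial : Inside; // Inside: the mask is all ones, for free
    }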
Scali said:
Obviously the two scanlines correspond to the 2x2 blocks you were interested in. And obviously you could do larger blocks by either doing more scanlines at a time, or by doing the upper and lower scanline in the larger block first (if both are inside, everything between them has to be inside as well, per trapezoid).
Yes, that's entirely true of course. You'd still be tracing edges though, by "doing more scanlines at a time"; the coverage test using half-space functions doesn't require that. And you'd still have the setup overhead. The only advantage would be skipping the 'inside' quicker, but with the half-space functions that's nearly free, without tracing, without much setup, and without checking for special cases.
Scali said:
Since ehart mentioned something very similar, I didn't care. But then you start ranting about how I didn't contribute anything to the thread, which obviously I did.
Between all the noise, yes, you contributed something. But it was the most straightforward algorithm, with few interesting properties, and it wasn't what ehart suggested. And seriously, it's not a bad thing to try and fail, but please admit it when ideas with more potential are being presented.
Scali said:
You obviously lack the skill and comprehension to implement it in an efficient and elegant way. But that has been my point in this thread the whole time.
Nobody can implement this efficiently and elegantly. It's just not suited for it. There are so many special cases that it's always going to be messy if you want to keep it reasonably efficient, or it's going to be slow if you want to keep it clean.
Just the fact that you say this makes it clear that you lack the skill and comprehension to know when you have to look for other algorithms. I started looking for new algorithms when I started this thread...
Scali said:
Not at all, but the idea I presented was the same as the one ehart presented. The main difference is that the way ehart presented it somehow made you think about implementing it with the half-space algorithm (of which I am still not convinced anyway). The idea is still the same though: snap to a larger grid and work out coverage from there, because the coverage is always a single continuous block.
Scali, I knew the direction I didn't want to go in. What you described, I had already considered. So although ehart's posts weren't very elaborate, they did spark the ideas of determining coverage of whole blocks using the half-space functions, and of computing masks for partially covered blocks in one go. Your method was and still is unrelated to quickly determining block coverage using the half-space functions.
Scali said:
So don't go around saying that I didn't contribute or that my idea wasn't helpful while ehart's was.
I can't change that. That's the way things went and it hardly could have been different. Just accept it.
Then again, you think a lot of things. Like that the average P4 2.66 GHz has ~20 GB/s of bandwidth. Or that you can implement an efficient raytracer without acceleration structures.
Scali said:
I never did think that.
Don't act like you are the expert, because obviously you aren't.
Now where would I "act" like "the expert"? I weigh technical arguments, I run tests, I experiment, I endure, and I succeed. You stay stuck in arguing.
Or were you implying that you are "the expert"? So you've tried all of this before? I didn't think so.
Scali said:
And don't go around patronizing people who are actually trying to help you, just because you didn't understand what they said. That's just pathetic.
With "people" I assume you mean just you? Because ehart seemed to be really interested to hear about my advancements. Well if this is the way -you- try to help me, then I seriously suggest trying less hard. Really, if you feel patronized and think I didn't understand what you said, then prove me wrong. And I do mean prove. But no, the best thing for you to do is to just try less hard.