Triangle setup

Nick

Veteran
Hi all,

I was wondering what exactly the triangle setup stage comprises, and how it's implemented on a modern GPU.

If I understand correctly, its main task is to compute gradients of interpolants. When graphics cards specify a setup rate of one triangle per clock cycle, does this include all interpolants (including up to 8x4 texture coordinates)? Or is the triangle rate dependent on the number of interpolants?

Do they derive the gradients from the 'plane equation approach', or is there a faster method? By 'plane equation approach' I mean that, for example, for the z interpolant (it could be any other interpolant) we compute the plane equation Ax+By+Cz+D=0, so the gradients are dz/dx=-A/C, etc. Since (A, B, C) is the plane normal, it can be computed with a cross product. C can be reused for all interpolants, but it still looks like a lot of work. And since every interpolant requires perspective correction, this adds three divisions and a whole lot of multiplications.
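
To make the question concrete, here is a minimal sketch of the plane equation approach for one interpolant. It assumes the three points are already in screen space (after the perspective divide) and that the third component is the value being interpolated; the names `cross` and `plane_gradients` are made up for illustration and say nothing about how hardware does it.

```python
def cross(u, v):
    """Cross product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def plane_gradients(p0, p1, p2):
    """Return (dz/dx, dz/dy) for the plane through three (x, y, z) points.

    The plane normal (A, B, C) is the cross product of two triangle edges.
    C depends only on x and y, so it could be shared by all interpolants.
    Returns None when C is zero (triangle has no screen-space area).
    """
    e1 = tuple(b - a for a, b in zip(p0, p1))
    e2 = tuple(b - a for a, b in zip(p0, p2))
    A, B, C = cross(e1, e2)
    if C == 0:
        return None
    return (-A / C, -B / C)
```

For example, the points (0, 0, 5), (4, 0, 13), (0, 2, 11) lie on z = 2x + 3y + 5, and the function returns the gradients (2, 3).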

Any thoughts?
 
Thanks! I just managed to get the same result with the plane equation approach:

dz/dx = -A/C = ((y2/w2 - y0/w0)*(z1/w1 - z0/w0)-(y1/w1 - y0/w0)*(z2/w2 - z0/w0)) / ((x1/w1 - x0/w0)*(y2/w2 - y0/w0)-(x2/w2 - x0/w0)*(y1/w1 - y0/w0))

= z0 * (y1*w2-y2*w1) / (x1*w0*y2-x1*y0*w2-x0*w1*y2-x2*w0*y1+x2*y0*w1+x0*w2*y1)
+ z1 * (y2*w0-y0*w2) / (x1*w0*y2-x1*y0*w2-x0*w1*y2-x2*w0*y1+x2*y0*w1+x0*w2*y1)
+ z2 * (y0*w1-y1*w0) / (x1*w0*y2-x1*y0*w2-x0*w1*y2-x2*w0*y1+x2*y0*w1+x0*w2*y1)

Which is exactly the product of (z0, z1, z2) with (the first column of) the inverse of the matrix formed by the (x, y, w) coordinates, as shown in the paper you refer to. :)

So is the setup engine's task limited to computing this matrix inverse as efficiently as possible?
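
As a rough sketch of the inverse-matrix idea (an illustration under my own assumptions, not a claim about real hardware): take the homogeneous (x, y, w) of the three vertices before the divide, compute the cofactors and the determinant once, and then every interpolant costs only two 3-component dot products. The function names are hypothetical. A zero determinant means the inverse does not exist.

```python
def setup_weights(v0, v1, v2):
    """v_i = (x, y, w) before the perspective divide.

    Returns two weight triples (for d/dx and d/dy), i.e. the first two
    columns of the inverse of the matrix whose columns are the vertices.
    Returns None if the determinant is zero (degenerate triangle).
    """
    (x0, y0, w0), (x1, y1, w1), (x2, y2, w2) = v0, v1, v2
    col_x = (y1 * w2 - y2 * w1, y2 * w0 - y0 * w2, y0 * w1 - y1 * w0)
    col_y = (x2 * w1 - x1 * w2, x0 * w2 - x2 * w0, x1 * w0 - x0 * w1)
    det = x0 * col_x[0] + x1 * col_x[1] + x2 * col_x[2]
    if det == 0:
        return None
    inv_det = 1.0 / det  # the only division, shared by all interpolants
    return (tuple(c * inv_det for c in col_x),
            tuple(c * inv_det for c in col_y))


def gradients(weights, values):
    """Screen-space gradients of value/w from the per-vertex values."""
    wx, wy = weights
    ddx = sum(w * v for w, v in zip(wx, values))
    ddy = sum(w * v for w, v in zip(wy, values))
    return ddx, ddy
```

The weights are computed once per triangle and `gradients` is then called once per interpolant, which is where the saving over repeating the plane equation would come from.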
 
Well, that's a clever implementation, but I don't know whether real hardware uses this scheme to set up triangles. It's also nice to note that if the inverse matrix does not exist, you automatically know that your triangle has to be rejected :)
It seems next-generation GPUs might use shader ALUs to set up triangles.
 
the cleverest way i've seen for computing triangle gradients is through the ratio of triangle areas. i saw that eons ago in the glide drivers. have been using it ever since.

so you want to compute the gradient of interpolant I:

consider the area of the triangle in two different coordinate spaces:

(a) the original screen-space (X,Y), and
(b) a space composed of one of the screen-space base vectors (X or Y) and the interpolant of interest, I, i.e. that'd be either (X, I) or (I, Y)

then

dI/dX = tri_area(I, Y) / tri_area(X, Y), and
dI/dY = tri_area(X, I) / tri_area(X, Y)

needless to say the code for this is minimalistic and elegant, but if you want to see a sample check thurp's tri plotter (rend/rendPrim.cpp)
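
a minimal sketch of the area-ratio idea, written from the description above and not taken from the glide drivers or from thurp (the function names are made up). it assumes screen-space X, Y and one value of I per vertex, and uses signed, doubled areas so the factor of two cancels in the ratio:

```python
def tri_area2(a0, b0, a1, b1, a2, b2):
    """Twice the signed area of a triangle in the (a, b) plane."""
    return (a1 - a0) * (b2 - b0) - (a2 - a0) * (b1 - b0)


def area_gradients(xs, ys, vals):
    """Return (dI/dX, dI/dY) as ratios of triangle areas.

    xs, ys, vals hold the per-vertex X, Y and interpolant values.
    Returns None for a zero-area triangle, which should not be drawn.
    """
    denom = tri_area2(xs[0], ys[0], xs[1], ys[1], xs[2], ys[2])
    if denom == 0:
        return None
    # Replace X by I for dI/dX, and Y by I for dI/dY.
    didx = tri_area2(vals[0], ys[0], vals[1], ys[1], vals[2], ys[2]) / denom
    didy = tri_area2(xs[0], vals[0], xs[1], vals[1], xs[2], vals[2]) / denom
    return didx, didy
```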
 
darkblu, that's exactly the plane equation approach. ;) The C component is exactly the area of the triangle (actually, twice the area). The other components also correspond to areas.

The method with the inverse matrix appears to be faster when computing multiple gradients.
 
no, cause if your triangle has a zero area you don't want to rasterize it ;)
Sorry, I'm not with you here. Perspective requires division by w, so when it's zero things go wrong. Triangle area can only be computed after assembly (not in the vertex engine like Jawed said). Or am I confusing a couple things here?
 
Sorry, I misread your post, I thought you were replying to darkblu and not to Jawed ;)
 
No problem. I still wonder what part of the triangle setup or perspective can be done in the vertex pipeline though. Obviously doing things per triangle is triple the work of doing it per vertex. So anything that can be done in the vertex pipeline should be well worth it.
 
I think Jawed is simply referring to the perspective division; ATI drivers patch shaders and append some instructions to perform the projection to screen space.
 
I was referring to the diagram on page 6 of the PDF, which shows that before setup you have backface culling, clipping, perspective divide and viewport transform.

But this could be nothing more than specific to ATI's GPUs.

I'm still trying to understand what the PDF is saying about interpolation :???:

http://www.beyond3d.com/forum/showthread.php?t=5642

Jawed
 
Ah, I see. The slide shows that backface culling and clipping are done after the vertex pipelines, then perspective divide, which makes sense. It doesn't seem like the vertex pipelines are doing any actual perspective work, although it would make sense to compute 1/w there already for later use (ignoring division by zero). For triangles crossing the near clip plane, 1/w would have to be recomputed for the new vertices. After that the perspective divide is safe and efficient.
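
To illustrate what I mean by computing 1/w per vertex for later use, here is a small sketch. It assumes w > 0 (the triangle has already been clipped against the near plane), and the function names and tuple layout are made up for the example:

```python
def prepare_vertex(x, y, w, attr):
    """Per-vertex work: one reciprocal, reused for position and attribute.

    Assumes w > 0. Returns (x/w, y/w, 1/w, attr/w).
    """
    inv_w = 1.0 / w
    return (x * inv_w, y * inv_w, inv_w, attr * inv_w)


def interpolate(bary, verts):
    """Perspective-correct attribute at a pixel.

    bary holds the pixel's screen-space barycentric weights; verts are
    three tuples from prepare_vertex. Both 1/w and attr/w are linear in
    screen space, so they are interpolated linearly and then divided.
    """
    inv_w = sum(b * v[2] for b, v in zip(bary, verts))
    attr_over_w = sum(b * v[3] for b, v in zip(bary, verts))
    return attr_over_w / inv_w
```

Halfway on screen between a vertex with w=1, attr=0 and one with w=3, attr=1, this gives 0.25 instead of 0.5, because the nearer vertex covers more of the screen.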
 
Actually that's a seriously groovy thread, largely over my head. Something to come back to...

Still a bit confused about when/where interpolation happens in ATI and NVidia...

Jawed
 
For triangles crossing the near clip plane 1/w would have to be recomputed for the new vertices. After that the perspective divide is safe and efficient.
AFAIK nvidia hw does not perform any geometric clipping, so this step is not required
 