# Triangle setup

Discussion in 'Architecture and Products' started by Nick, Sep 7, 2006.

1. ### Nick Veteran

Joined:
Jan 7, 2003
Messages:
1,881
Location:
Montreal, Quebec
Hi all,

I was wondering what exactly the triangle setup stage comprises, and how it's implemented on a modern GPU.

If I understand correctly, its main task is to compute the gradients of the interpolants. When graphics cards specify a setup rate of one triangle per clock cycle, does this include all interpolants (including up to 8x4 texture coordinates)? Or does the triangle rate depend on the number of interpolants?

Do they derive the gradients with the 'plane equation approach', or is there a faster method? By 'plane equation approach' I mean that, for the z interpolant for example (it could be any other interpolant), we compute the plane equation Ax+By+Cz+D=0, so the gradients are dz/dx=-A/C and dz/dy=-B/C. Since (A, B, C) is the plane normal, it can be computed with a cross product. C can be reused for all interpolants, but it still looks like a lot of work. And since every interpolant requires perspective correction, this adds three divisions and a whole lot of multiplications.
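
A minimal Python sketch of the plane equation approach described above, for a single interpolant in screen space (perspective is ignored here). The function name `plane_gradients` and the tuple layout are assumptions made for illustration; this is not a description of how any GPU implements setup.

```python
def plane_gradients(p0, p1, p2):
    """Return (dz/dx, dz/dy) for the plane through three (x, y, z) points.

    Returns None for a degenerate (zero-area) triangle.
    """
    x0, y0, z0 = p0
    x1, y1, z1 = p1
    x2, y2, z2 = p2
    # Edge vectors from vertex 0.
    ex1, ey1, ez1 = x1 - x0, y1 - y0, z1 - z0
    ex2, ey2, ez2 = x2 - x0, y2 - y0, z2 - z0
    # Plane normal (A, B, C) = e1 x e2.
    a = ey1 * ez2 - ez1 * ey2
    b = ez1 * ex2 - ex1 * ez2
    c = ex1 * ey2 - ey1 * ex2  # depends only on x and y, so it is shared by all interpolants
    if c == 0:
        return None
    return (-a / c, -b / c)


# Example: the plane z = 2x + 3y + 1 sampled at three points.
print(plane_gradients((0, 0, 1), (1, 0, 3), (0, 1, 4)))
```

Only `a` and `b` need to be recomputed per interpolant; `c` is computed once per triangle.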

Any thoughts?

#1
2. ### nAo Nutella Nutellae Veteran

Joined:
Feb 6, 2002
Messages:
4,324
Location:
San Francisco
#2
3. ### Nick Veteran

Thanks! I just managed to get the same result with the plane equation approach:

dz/dx = -A/C = ((y2/w2 - y0/w0)*(z1/w1 - z0/w0)-(y1/w1 - y0/w0)*(z2/w2 - z0/w0)) / ((x1/w1 - x0/w0)*(y2/w2 - y0/w0)-(x2/w2 - x0/w0)*(y1/w1 - y0/w0))

= z0 * (y1*w2-y2*w1) / (x1*w0*y2-x1*y0*w2-x0*w1*y2-x2*w0*y1+x2*y0*w1+x0*w2*y1)
+ z1 * (y2*w0-y0*w2) / (x1*w0*y2-x1*y0*w2-x0*w1*y2-x2*w0*y1+x2*y0*w1+x0*w2*y1)
+ z2 * (y0*w1-y1*w0) / (x1*w0*y2-x1*y0*w2-x0*w1*y2-x2*w0*y1+x2*y0*w1+x0*w2*y1)

Which is exactly the product of (z0, z1, z2) with (the first column of) the inverse of the matrix formed by the (x, y, w) coordinates, as shown in the paper you refer to.

So is the setup engine's task limited to computing this matrix inverse as efficiently as possible?
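
A hedged Python sketch of the inverse-matrix formulation discussed in this post: invert the 3x3 matrix of (x, y, w) vertex coordinates once per triangle, then obtain each interpolant's coefficients with three multiply-adds per coefficient. The names `setup_matrix` and `interpolant_coeffs` are made up for illustration, and the matrix is inverted through its cofactors; real hardware may do something quite different.

```python
def setup_matrix(v0, v1, v2):
    """Per-triangle setup from three (x, y, w) vertices.

    Returns three weight triples (for the x, y and constant coefficients),
    already divided by the determinant, or None if the matrix is singular.
    """
    x0, y0, w0 = v0
    x1, y1, w1 = v1
    x2, y2, w2 = v2
    # Cofactors of the matrix whose rows are (x_i, y_i, w_i).
    col_a = (y1 * w2 - y2 * w1, y2 * w0 - y0 * w2, y0 * w1 - y1 * w0)
    col_b = (x2 * w1 - x1 * w2, x0 * w2 - x2 * w0, x1 * w0 - x0 * w1)
    col_c = (x1 * y2 - x2 * y1, x2 * y0 - x0 * y2, x0 * y1 - x1 * y0)
    det = x0 * col_a[0] + x1 * col_a[1] + x2 * col_a[2]
    if det == 0:
        return None  # no inverse exists
    inv_det = 1.0 / det
    return tuple(tuple(k * inv_det for k in col) for col in (col_a, col_b, col_c))


def interpolant_coeffs(setup, u0, u1, u2):
    """Return (a, b, c) with u_i = a*x_i + b*y_i + c*w_i at every vertex.

    In screen space this means u/w = a*(x/w) + b*(y/w) + c, so a and b
    are the screen-space gradients of u/w.
    """
    return tuple(k[0] * u0 + k[1] * u1 + k[2] * u2 for k in setup)


setup = setup_matrix((0, 0, 1), (1, 0, 1), (0, 1, 1))
print(interpolant_coeffs(setup, 1, 3, 4))  # interpolant u = 2x + 3y + w
```

The inverse is shared by every interpolant of the triangle, which is where the saving over a per-interpolant cross product comes from.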

#3

Joined:
Oct 2, 2004
Messages:
10,662
Location:
London
#4
5. ### nAo Nutella Nutellae Veteran

well.. that's a clever implementation, but dunno if real hw uses this system to set up triangles. It's also nice to note that if the inverse matrix does not exist.. then you automatically know that your triangle has to be rejected
It seems next generation GPUs might use shader ALUs to set up triangles..

#5
6. ### darkblu Veteran

Joined:
Feb 7, 2002
Messages:
2,642
the cleverest way i've seen for computing triangle gradients is through the ratio of triangle areas. i saw that eons ago in the glide drivers. have been using it ever since.

so you want to compute the gradient of interpolant I:

consider the area of the triangle in two different coordinate spaces:

(a) the original screen-space (X,Y), and
(b) a space composed of one of the screen-space base vectors (X or Y) and the interpolant of interest, I, i.e. that'd be either (X, I) or (I, Y)

then

dI/dX = tri_area(I, Y) / tri_area(X, Y), and
dI/dY = tri_area(X, I) / tri_area(X, Y)

needless to say the code for this is minimalistic and elegant, but if you want to see a sample check thurp's tri plotter (rend/rendPrim.cpp)
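
A small Python sketch of the area-ratio method described in this post. It assumes signed areas (the factor of two cancels in the ratio) and uses made-up helper names `tri_area2` and `gradients_by_area`; it is not taken from the Glide drivers or from the tri plotter mentioned above.

```python
def tri_area2(p0, p1, p2):
    """Twice the signed area of a triangle given as three 2D points."""
    return ((p1[0] - p0[0]) * (p2[1] - p0[1])
            - (p2[0] - p0[0]) * (p1[1] - p0[1]))


def gradients_by_area(verts):
    """verts: three (x, y, i) tuples in screen space.

    Returns (dI/dX, dI/dY), or None for a zero-area triangle.
    """
    area_xy = tri_area2(*[(x, y) for x, y, i in verts])
    if area_xy == 0:
        return None  # nothing to rasterize, and no division by zero
    area_iy = tri_area2(*[(i, y) for x, y, i in verts])  # X replaced by I
    area_xi = tri_area2(*[(x, i) for x, y, i in verts])  # Y replaced by I
    return (area_iy / area_xy, area_xi / area_xy)


# Interpolant I = 2x + 3y + 1 sampled at the three vertices.
print(gradients_by_area([(0, 0, 1), (1, 0, 3), (0, 1, 4)]))
```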

#6
7. ### Nick Veteran

How is this possible exactly? Couldn't it create division by zero? Thanks!

#7
8. ### nAo Nutella Nutellae Veteran

isn't this exactly the same computation?

#8
9. ### nAo Nutella Nutellae Veteran

no, cause if your triangle has a zero area you don't want to rasterize it

#9
10. ### Nick Veteran

darkblu, that's exactly the plane equation approach. The C component is exactly the area of the triangle (actually, twice the area). The other components also correspond to areas.

The method with the inverse matrix appears to be faster when computing multiple gradients.

#10
11. ### Nick Veteran

Sorry, I'm not with you here. Perspective requires division by w, so when it's zero things go wrong. Triangle area can only be computed after assembly (not in the vertex engine like Jawed said). Or am I confusing a couple things here?

#11
12. ### nAo Nutella Nutellae Veteran


#12
13. ### Nick Veteran

No problem. I still wonder what part of the triangle setup or perspective can be done in the vertex pipeline though. Obviously doing things per triangle is triple the work of doing it per vertex. So anything that can be done in the vertex pipeline should be well worth it.

#13
14. ### darkblu Veteran


well, you know, sometimes people are in a 'write-before-read' mode. and it usually happens in the morning of busy days ; )

don't mind me, everybody, carry on.

#14
15. ### nAo Nutella Nutellae Veteran

I think Jawed is simply referring to the perspective division. ATI drivers patch shaders and append some instructions to perform the projection to screen space

#15
16. ### Jawed Legend

I was referring to the diagram on page 6 of the PDF, which shows that before setup you have backface culling, clipping, perspective divide and viewport transform.

But this could be nothing more than specific to ATI's GPUs.

I'm still trying to understand what the PDF is saying about interpolation

Jawed

#16
17. ### Nick Veteran

Ah, I see. The slide shows that backface culling and clipping are done after the vertex pipelines, followed by the perspective divide, which makes sense. It doesn't seem like the vertex pipelines are doing any actual perspective work, although it would make sense to compute 1/w there already for later use (ignoring division by zero). For triangles crossing the near clip plane, 1/w would have to be recomputed for the new vertices. After that the perspective divide is safe and efficient.
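
To illustrate why a per-vertex 1/w is useful later on, here is a minimal Python sketch of standard perspective-correct interpolation from screen-space barycentric weights. It assumes all w values are positive (that is, clipping against the near plane has already happened); the function name `perspective_correct` is hypothetical and says nothing about where ATI or NVIDIA hardware actually performs this step.

```python
def perspective_correct(bary, attrs, ws):
    """Interpolate a vertex attribute with perspective correction.

    bary  : screen-space barycentric weights (b0, b1, b2), summing to 1
    attrs : attribute values at the three vertices
    ws    : clip-space w at the three vertices (assumed > 0)
    """
    inv_w = [1.0 / w for w in ws]  # one reciprocal per vertex, reusable for every attribute
    # attr/w and 1/w are both linear in screen space, so interpolate them
    # linearly and divide once per pixel.
    num = sum(b * a * iw for b, a, iw in zip(bary, attrs, inv_w))
    den = sum(b * iw for b, iw in zip(bary, inv_w))
    return num / den


# Halfway across the screen between a near vertex (w=1) and a far one (w=3).
print(perspective_correct((0.5, 0.5, 0.0), (0.0, 1.0, 0.0), (1.0, 3.0, 1.0)))
```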

#17
18. ### Jawed Legend

Actually that's a seriously groovy thread, largely over my head. Something to come back to...

Still a bit confused about when/where interpolation happens in ATI and NVidia...

Jawed

#18
19. ### nAo Nutella Nutellae Veteran

AFAIK nvidia hw does not perform any geometric clipping, so this step is not required

#19
20. ### Simon F Tea maker ModeratorVeteranSubscriber

Joined:
Feb 8, 2002
Messages:
4,524
Location:
In the Island of Sodor, where the steam trains lie
Are they still using texkill to do user plane clipping?

#20