Question about a quote from the B3D G80 article, and nvidia FUD doc

armchair_architect · May 26, 2007

Arun Demeure said:
It is multiplied by 1/pos.w, and that is calculated by the SFU (via rcp) at the beginning of the program.

I think you've got that backwards. If the VS outputs 6 scalar attributes (say vec4 position and vec2 texcoords) (x,y,z,w,s,t), then the values used for plane equation setup are (x/w, y/w, z/w, 1/w, s/w, t/w). To get the perspective-correct interpolation of 's', for example, you'd need to do:

Code:

INTERP r0, v[3]; # evaluate 1/pos.w at pixel
RCP r0, r0;      # compute interpolated pos.w
INTERP r1, v[4]; # evaluate s/pos.w at pixel
MUL r1, r1, r0;  # compute p-c interpolated s
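
For anyone who would rather check the arithmetic than read pseudo-assembly, here is a rough Python sketch of the same four steps along a single edge between two vertices. It is only an illustration, not how the hardware does it: the names `interp` and `perspective_correct`, the parameter `t` (screen-space fraction along the edge), and the sample values are all made up for the example, and `interp` is a plain linear blend standing in for the plane-equation evaluation.

```python
def interp(a, b, t):
    """Linear blend in screen space; a stand-in for the INTERP instruction."""
    return a + (b - a) * t


def perspective_correct(s0, w0, s1, w1, t):
    """Perspective-correct value of attribute s at screen-space fraction t.

    (s0, w0) and (s1, w1) are the attribute and pos.w at the two vertices.
    """
    inv_w = interp(1.0 / w0, 1.0 / w1, t)    # INTERP r0, v[3]: 1/pos.w at pixel
    w = 1.0 / inv_w                          # RCP r0, r0: interpolated pos.w
    s_over_w = interp(s0 / w0, s1 / w1, t)   # INTERP r1, v[4]: s/pos.w at pixel
    return s_over_w * w                      # MUL r1, r1, r0: p-c interpolated s


if __name__ == "__main__":
    # s goes from 0 (at w=1) to 1 (at w=3); sample halfway across in screen space.
    naive = interp(0.0, 1.0, 0.5)
    correct = perspective_correct(0.0, 1.0, 1.0, 3.0, 0.5)
    print("naive:", naive, "perspective-correct:", correct)
```

With those sample values the naive screen-space blend gives 0.5, while the perspective-correct result comes out near 0.25, since the half of the edge nearer the viewer covers more of the screen.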

Jawed · May 26, 2007

Sigh, confusing G84/G86.

Anyway, within a cluster, because G84 can produce twice as many texture results per core-clock as G80, that should mean that G84 has a higher write capability for the register file.

In fact, don't texture operations read from the register file as they're submitted (reading u,v after interpolation)? If so, the higher texel throughput of G84 means that both read and write rates have to be bumped up.

If that's the case then register file bandwidth would have less impact on your MUL + MUL shaders, which would lead to higher "co-issued" MUL throughput in G84.

Jawed

Arun · May 26, 2007

armchair_architect said:
I think you've got that backwards.

Heh, I guess the way I was thinking about it wasn't too intuitive. I was thinking of "pos" as the position register that is stored in hardware per-pixel. If you think of pos as, errr, actual position... Then yeah, ignore me in that case!
