NV40 Pixel Cluster Dependency

Luminescent · May 24, 2005

It has previously been indicated that NV40 dynamically assigns a quad to one of its four pipeline clusters as each cluster becomes available for processing (source). However, does this imply that NV40's four pixel processing clusters are independent of each other instruction-wise?

More specifically, can each pixel cluster work on its own pixel shader instruction and data irrespective of the other clusters in NV40?

Rys · May 24, 2005

I'm fairly sure that each pixel quad has to run the same fragment program. I guess that means the same instruction per clock.

Anyone know otherwise?

RoOoBo · May 24, 2005

Rys said:
I'm fairly sure that each pixel quad has to run the same fragment program. I guess that means the same instruction per clock.

Anyone know otherwise?

I don't think there is any good reason to limit all the quad fragment shaders to execute the same instruction from the fragment shader program at a given cycle. Unless, of course, they weren't different processing units but a single processing unit. Shader programs aren't (usually) that large and a small instruction cache of some kind per shader unit wouldn't represent many transistors.

Obviously all fragment (or vertex) shader units run the same fragment (or vertex) program. Changing the shader program is a state change and happens between batches, and for complexity reasons GPUs aren't likely to work on different batches at the same time (at least not at the same pipeline stage; you could divide the pipeline into geometry and fragment phases, for example, and run one batch in each phase, so two batches are pipelined).

Jawed · May 24, 2005

Nice digging Luminescent! Hope we get an answer to this.

Jawed

Rolf N · May 24, 2005

Luminescent said:
It has previously been indicated that NV40 dynamically assigns a quad to one of its four pipeline clusters as each cluster becomes available for processing (source). However, does this imply that NV40's four pixel processing clusters are independent of each other instruction-wise?

More specifically, can each pixel cluster work on its own pixel shader instruction and data irrespective of the other clusters in NV40?

I think the performance characteristics of dynamic branches indicate that there's only a single instruction stream for all quads at any given time.
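To make the branching argument concrete, here is a toy cost model, not anything measured on NV40. It assumes a group of quads hits an if/else; the function names and cycle costs are hypothetical and chosen only for illustration. Under a single shared instruction stream the group pays for every side of the branch that any quad takes, whereas independent streams would only pay for the slowest quad.

```python
# Toy model only: the cycle costs and taken/not-taken patterns are made-up
# numbers for illustration, not NV40 measurements.

def lockstep_cycles(quads_take_branch, cost_if, cost_else):
    """One instruction stream for all quads: if any quad needs a side of
    the branch, the whole group steps through that side."""
    cycles = 0
    if any(quads_take_branch):
        cycles += cost_if
    if not all(quads_take_branch):
        cycles += cost_else
    return cycles

def independent_cycles(quads_take_branch, cost_if, cost_else):
    """Each quad follows its own instruction stream: the group is done
    when the slowest quad is done."""
    return max(cost_if if taken else cost_else for taken in quads_take_branch)

# One quad out of four takes the expensive side.
mixed = [True, False, False, False]
print(lockstep_cycles(mixed, cost_if=20, cost_else=4))
print(independent_cycles(mixed, cost_if=20, cost_else=4))
```

In this model the two schemes only differ when the quads disagree on the branch, which is why branch performance under divergent conditions is the kind of measurement that could hint at a shared instruction stream.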

Xmas · May 25, 2005

I believe each quad pipeline works independently, if not for performance, then for scalability reasons. C&P design.

Of course there could also be a single decoder frontend for all pipelines, but maybe that is not enough redundancy, or maybe it's more difficult getting flow control that way. Also, that could mean batches would have to grow with more pipelines, so each pipeline still gets the same amount of quads.

Overall, considering just NV40 and how it works on quad batches, it doesn't matter much. I mean, NV40 is said to work on batches of ~1000 pixels, ~250 quads. They don't have to be the same triangle, just the same shader. Whether all pipelines work on 64 quads, 1/4 of a batch, or each has its own batch to process, it takes the same time. But output synchronisation might be an issue with different batches.
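The "it takes the same time" point can be checked with some rough arithmetic. This is a sketch under stated assumptions: four pipelines, batches of 256 quads (so 64 quads is a quarter of a batch), and a fixed cost per quad. The function names and the one-cycle-per-quad figure are hypothetical.

```python
import math

PIPELINES = 4  # NV40 quad pipelines, as discussed in the thread

def split_one_batch(num_batches, quads_per_batch, cycles_per_quad=1):
    """All pipelines share each batch; each one gets a quarter of its quads."""
    per_pipeline = math.ceil(quads_per_batch / PIPELINES)
    return num_batches * per_pipeline * cycles_per_quad

def batch_per_pipeline(num_batches, quads_per_batch, cycles_per_quad=1):
    """Each pipeline owns a whole batch; batches are handed out four at a time."""
    rounds = math.ceil(num_batches / PIPELINES)
    return rounds * quads_per_batch * cycles_per_quad

# Eight batches of 256 quads each (hypothetical workload).
print(split_one_batch(8, 256))
print(batch_per_pipeline(8, 256))
```

In this toy model the totals match whenever the number of batches is a multiple of four; with a leftover batch, the batch-per-pipeline scheme leaves some pipelines idle in the last round. The model ignores output synchronisation entirely.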

Jawed · May 26, 2005

This document (thanks to Xmas for digging it up in another thread):

http://developer.nvidia.com/object/xdc_2005_presentations.html

Slide 9 says that fragment processing is SIMD.

Jawed
