Please explain traditional vs. Nvidia unified vs. ATI unified

kemist

Newcomer
I know there is only speculation right now about the upcoming architectures, but can someone please explain a few things to me?

First, my understanding of a traditional pipeline is this: you have a vertex unit, a TMU, an ALU for pixel shading, and a ROP (is this correct?). I understand the ROP must come last for output, but does the order of the others matter? If this is incorrect, can someone explain the setup of a traditional pipeline to me?

Second, how does this change in a unified architecture? You have a general-purpose shader, a TMU, and a ROP. How does ordering work with this? Does GS->VS->PS have to be executed in that order, since GS is a bunch of triangles, VS is one triangle, and PS is just a pixel of a triangle? Does a PS have to wait for the results of a GS before executing, etc.?

Also, is this why Nvidia is expected to have only a unified GS/VS, because then it just looks like GS/VS, TMU, PS, ROP, which is much more conventional?

Also, just to clear things up a bit: is a 16-1-3-1 part 16 vertex shaders, 1 TMU, 3 ALUs, and 1 ROP? Wouldn't a better way of writing this be 4*(4-1-3-1) or 16 + 4(1-3-1)? A vertex unit can output to any shader unit, correct?

Finally, with Xenos or R600, is the scheduler able to dynamically assign the shaders to any task on the fly, for example 3 GS, 5 VS, and say 16 PS (random numbers), or must an entire array be set to perform one type of shader?

Sorry for all of the questions. I have read a good amount about architectures, but I'm just an interested enthusiast, so some things are unclear.
 
I'll answer a couple of questions and leave the rest for others. First, the order does matter. All the vertices of a triangle must go through the VS before the GS can start. Then the GS must finish with the primitive before the pixels for that primitive can be rendered through the PS. If blending is not enabled, other triangles/primitives can go through the process in parallel. Despite this ordering restriction, all three stages will be running at the same time.
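To make the per-primitive dependency concrete, here is a toy sketch of one legal ordering of events. It is only an illustration: the `draw` function and the letter "vertices" are made up, and the model is strictly serial, whereas real hardware overlaps the stages.

```python
def draw(triangles):
    """Yield (stage, triangle_id) events in an order that respects the dependencies."""
    for tri_id, verts in enumerate(triangles):
        # Every vertex of this triangle goes through the VS first.
        for _ in verts:
            yield ("VS", tri_id)
        # Only then can the GS see the whole primitive.
        yield ("GS", tri_id)
        # Pixels of the primitive are shaded after the GS is done with it.
        yield ("PS", tri_id)


# Two hypothetical triangles; the vertex values are placeholders.
events = list(draw([("a", "b", "c"), ("d", "e", "f")]))
for stage, tri_id in events:
    print(stage, "triangle", tri_id)
```

In this ordering the GS for triangle 0 runs before any VS work for triangle 1, so one primitive can move down the pipeline without waiting for all the vertex work in the draw.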

16-1-3-1 means 16 color ROPs, 1*16 texture units, 3*16 ALUs, and 1*16 depth/stencil units.
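As a quick check of the arithmetic, the reading above can be written as a small helper. The function name `expand_config` and the dictionary keys are invented for this sketch; only the multiplication rule comes from the post.

```python
def expand_config(config: str) -> dict:
    """Expand an 'R-T-A-Z' string using the reading given above:
    R color ROPs, and T, A, Z texture/ALU/depth-stencil units per ROP."""
    rops, tex_per, alu_per, z_per = (int(x) for x in config.split("-"))
    return {
        "color_rops": rops,
        "texture_units": tex_per * rops,
        "alus": alu_per * rops,
        "depth_stencil_units": z_per * rops,
    }


print(expand_config("16-1-3-1"))
# {'color_rops': 16, 'texture_units': 16, 'alus': 48, 'depth_stencil_units': 16}
```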
 
This is not true for overlapping primitives if Z testing is on (with a useful Z test), regardless of blending. See my previous post on the topic for details and an example.
I was being brief and just trying to say you don't have to do all the VS work for a draw packet before starting the GS, etc. Of course, this is still true even if blending is enabled, so I probably should have left that part out of my response.
 
I was under the impression that there wasn't necessarily a fixed order, other than that the PS runs last. I thought it was said that the VS/GS could loop on themselves before making it to the PS, so VS->GS->VS->PS. Looking at the capabilities of the VS/GS, it should be possible to use just one or both, unless you already have transformed and lit geometry, in which case you would skip both.
 
DX10 defines a VS before the GS, although you could just make it a passthrough/null shader if you don't need it. There is no VS after the GS, so if you want that you have to use stream out or just include the VS part in the GS. Hope that makes sense.
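A rough way to picture that fixed order is the toy model below. This is not the Direct3D API: `run_pipeline`, `passthrough_vs` and `gs_with_transform` are hypothetical names, rasterization is left out, and stream out is not modeled at all.

```python
def passthrough_vs(vertex):
    # A "null" vertex shader: hands the vertex on unchanged.
    return vertex


def run_pipeline(vertices, pixel_shader, vertex_shader=passthrough_vs,
                 geometry_shader=None):
    """Fixed order VS -> (optional) GS -> PS; there is no slot for a VS after the GS."""
    transformed = [vertex_shader(v) for v in vertices]
    # With no GS bound, the input primitive passes straight through.
    primitives = geometry_shader(transformed) if geometry_shader else [transformed]
    return [pixel_shader(p) for p in primitives]


def gs_with_transform(verts):
    # Per-vertex work wanted "after" the GS has to live inside the GS itself.
    return [[v * 2 for v in verts]]


result = run_pipeline([1, 2, 3], pixel_shader=sum,
                      geometry_shader=gs_with_transform)
print(result)  # [12]
```

Here the vertex shader is left as a passthrough and the per-vertex scaling is folded into the geometry shader, which is the second option mentioned above.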
 