Idiot Question about Vertex/Shader processing...

I have been reading a lot about branching, looping, etc. and how that makes some architectures more flexible.

Does this equate to *more powerful* by default?

Purely speculative situation (in other words, let's pretend):

1. The Radeon 9700's quad vertex engine can output 325M tris per second (right?). Say it has absolutely *no* branching, looping, etc.

2. The NV30 puts out the exact same 325M but does support the above.

3. The P10 puts out, say, 200M tris and also supports it.

Now keep in mind I just pulled some of this out of thin air for discussion's sake. What would the difference be in an actual *game*? It seems to me that from a developer's perspective the one that does 325M is the best, right?

- What does all that branching and looping do for you? Would it actually change my conditions above? Say in case 2, would the NV30 actually output more than 325M in a same-clock-speed comparison, perhaps needing fewer pipelines?

I have not seen this discussed here yet, other than in passing. Please feel free to correct my terminology etc. as need be. I think you guys understand the point I'm getting at, and can also point out any specific areas of this topic that I have left out.
 
The issue is not how many tris per second, but how many vertex ops per second.

The number you are quoting is the raw triangle setup speed. As soon as you add any extra lights or transformation effects, that figure gets cut way down.

A vertex shader that was 65,536 ops long would only be capable of delivering about 5,000 vertices per second if you only have one vertex unit (assuming roughly one op per clock at around 325 MHz).
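
The arithmetic behind that figure can be sketched roughly as below. This is a back-of-the-envelope model, not a description of any real chip: the 325 MHz clock, the one-op-per-clock rate, and the function name `vertices_per_second` are all assumptions chosen to match the numbers in this thread.

```python
def vertices_per_second(clock_hz, shader_ops, vertex_units=1, ops_per_clock=1):
    """Rough vertex throughput: total shader ops issued per second,
    divided by the number of ops each vertex costs.

    Assumes every op takes one issue slot and the units never stall,
    which real hardware will not guarantee.
    """
    total_ops_per_second = clock_hz * vertex_units * ops_per_clock
    return total_ops_per_second / shader_ops


# One unit, assumed 325 MHz clock, 65,536-op shader.
print(round(vertices_per_second(325e6, 65536)))  # 4959

# Same shader spread over four units (also an assumption).
print(round(vertices_per_second(325e6, 65536, vertex_units=4)))  # 19836
```

The point of the model is only that throughput scales inversely with shader length, so a peak triangle rate says little once the per-vertex program grows.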


The reason looping and branching lead to "more power" is that, in some cases, emulating a dynamic branch or loop requires going multipass, which may or may not kill your performance depending on the situation.

I feel that vertex shaders are the wrong place to concentrate your attention, since many things that used to be done per-vertex will now be done per-pixel in DX9.
 
Hmmm...

When you say branching... branching to what?

I understand what branching does on a common x86 processor. In a vertex operation, what are you branching to/between?
 
The same thing you'd branch on with a CPU: conditional execution.

One example is a loop whose iteration count is calculated at runtime:

(pseudo basic)
COUNT = F(X)
FOR I = 0 TO COUNT - 1
some code
END FOR

This generates a dynamic branch that looks like this:

I = 0
LOOP:
some code
I = I + 1
IF I < COUNT GOTO LOOP

This loop cannot be unrolled by the compiler because COUNT is not known until runtime, so the only way to do it on an architecture without dynamic branching is to go multipass.

Note: even the NV30 cannot evaluate this loop in pixel shaders.
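
To make the difference concrete, here is a small sketch that simulates both approaches on the CPU. It is only an illustration under stated assumptions: `iteration_count` stands in for F(X), `loop_body` stands in for "some code", and `iters_per_pass` is a made-up fixed program length. None of these come from a real shader API.

```python
def iteration_count(x):
    # Hypothetical stand-in for F(X): the count depends on runtime data.
    return x % 7


def loop_body(acc):
    # Hypothetical stand-in for "some code" inside the loop.
    return acc * 0.5 + 1.0


def run_dynamic(x):
    """Hardware with dynamic branching: one pass, exactly COUNT iterations."""
    count = iteration_count(x)
    acc, i = 0.0, 0
    while i < count:          # the IF I < COUNT GOTO LOOP branch
        acc = loop_body(acc)
        i += 1
    return acc, 1             # (result, number of passes)


def run_multipass(x, iters_per_pass=2):
    """No dynamic branching: each pass is a fixed-length, unrolled program.
    The host keeps resubmitting passes until enough iterations have run."""
    count = iteration_count(x)
    acc, done, passes = 0.0, 0, 0
    while True:
        passes += 1
        for k in range(iters_per_pass):   # fixed trip count, known at compile time
            active = (done + k) < count
            # Models a per-instruction select, not a jump: the body always
            # executes, and the result is kept only while still active.
            acc = loop_body(acc) if active else acc
        done += iters_per_pass
        if done >= count:                 # decided on the host, between passes
            break
    return acc, passes


print(run_dynamic(5))    # (1.9375, 1)
print(run_multipass(5))  # (1.9375, 3)
```

Both versions compute the same value, but the multipass version pays for extra passes and for iterations whose results are thrown away, which is where the performance cost comes from.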
 