Vertex Shader Emulator and SSE

Nick

Veteran
If I understand correctly, DirectX's software vertex shader implementation was written by Intel. Now, for best performance with SSE instructions, they have to put the data in SoA (structure-of-arrays) format, right?

But how can they do this with vs 2.0, which allows dynamic branching? With the SoA format, an SSE register holds the same component of a shader register for four different vertices. But since branching is decided per vertex, they can't keep the data in this format.

So do they really use SoA, or is it plain AoS (array-of-structures), where every SSE register corresponds to one shader register? If there is a big performance difference between fixed-function vertex processing and the corresponding shader, they must be using two implementations...

Any ideas?
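
To make the two layouts in the question concrete, here is a minimal sketch in portable C. Each `float[4]` array stands in for one SSE register; the names `Vec4`, `Vec4x4`, `aos_to_soa` and `dp4_soa` are made up for illustration and say nothing about how the DirectX runtime is actually implemented.

```c
#include <stddef.h>

/* AoS: one register-sized value holds x, y, z, w of a single vertex,
   so one SSE register corresponds to one shader register. */
typedef struct { float x, y, z, w; } Vec4;

/* SoA: each array stands in for one SSE register that holds the same
   component of four different vertices. */
typedef struct { float x[4], y[4], z[4], w[4]; } Vec4x4;

/* Hypothetical helper: transpose four AoS vertices into one SoA packet. */
static Vec4x4 aos_to_soa(const Vec4 v[4]) {
    Vec4x4 p;
    for (size_t i = 0; i < 4; ++i) {
        p.x[i] = v[i].x;
        p.y[i] = v[i].y;
        p.z[i] = v[i].z;
        p.w[i] = v[i].w;
    }
    return p;
}

/* A dp4 against a constant for four vertices at once. In SoA form each
   term is a lane-wise multiply or add; no horizontal add across the
   components of a register is needed. */
static void dp4_soa(const Vec4x4 *p, Vec4 c, float out[4]) {
    for (size_t i = 0; i < 4; ++i) {
        out[i] = p->x[i] * c.x + p->y[i] * c.y
               + p->z[i] * c.z + p->w[i] * c.w;
    }
}
```

In the AoS layout the same dp4 would need the four products inside one register summed horizontally, which is the usual argument for SoA when only SSE is available.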
 
Dio said:
Well, you can convert any branch to a predicate...
You mean with cmov? Yes, I've been thinking about that as well, but then you would be doing twice the work, and with multiple jumps or calls it must be worse. :?
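
As a rough illustration of the predicate idea and of the "twice the work" cost, here is a sketch in portable C. It assumes a shader-style branch `if (a < b) r = a + b; else r = a - b;` chosen purely as an example. With SSE the select would typically be a packed compare followed by and/andnot/or on the mask rather than `cmov`; the helper names `select_lane` and `predicated_branch` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Bitwise select for one lane. The mask is all ones or all zeros,
   which is what a packed compare produces per lane. */
static float select_lane(uint32_t mask, float if_true, float if_false) {
    uint32_t t, f, r;
    float out;
    memcpy(&t, &if_true, sizeof t);
    memcpy(&f, &if_false, sizeof f);
    r = (t & mask) | (f & ~mask);
    memcpy(&out, &r, sizeof out);
    return out;
}

/* if (a < b) r = a + b; else r = a - b;  for four vertices, with the
   branch replaced by a per-lane predicate. */
static void predicated_branch(const float a[4], const float b[4], float r[4]) {
    for (int i = 0; i < 4; ++i) {
        uint32_t mask = (a[i] < b[i]) ? 0xFFFFFFFFu : 0u;
        float then_val = a[i] + b[i]; /* computed for every lane */
        float else_val = a[i] - b[i]; /* computed for every lane */
        r[i] = select_lane(mask, then_val, else_val);
    }
}
```

Both sides of the branch are evaluated for all four vertices, so the cost is the sum of the two paths plus the select, regardless of how many lanes take each side.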
 
Hmm ... you can in general convert forward branches to predicates rather easily, but I don't see how you can do it with backward branches (in particular while-loops).
 
arjan de lumens said:
Hmm ... you can in general convert forward branches to predicates rather easily, but I don't see how you can do it with backward branches (in particular while-loops).
You need a predicate per iteration? :(
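
One common way to handle a backward branch with predication, sketched here in portable C, is to recompute a per-lane "still active" mask on every pass and keep looping until no lane is active. The example loop `while (x < limit) x *= 2.0f;` and the name `masked_while` are assumptions for illustration only; nothing here is known to match the DirectX implementation.

```c
/* Per-vertex loop  while (x < limit) x *= 2.0f;  for four vertices.
   Assumes every x[i] that starts below limit is positive, otherwise
   the loop would not terminate (same as the scalar version). */
static void masked_while(float x[4], float limit, int iterations[4]) {
    int any_active = 1;
    for (int i = 0; i < 4; ++i) {
        iterations[i] = 0;
    }
    while (any_active) {
        any_active = 0;
        for (int i = 0; i < 4; ++i) {
            int active = x[i] < limit;   /* predicate for this pass */
            float next = x[i] * 2.0f;    /* body is evaluated for all lanes */
            x[i] = active ? next : x[i]; /* finished lanes keep their value */
            iterations[i] += active;
            any_active |= active;        /* loop again if any lane continues */
        }
    }
}
```

So the predicate is a single mask that is refreshed each iteration rather than one predicate stored per iteration, and the whole group of four pays for as many passes as its slowest vertex needs.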
 