Is ps/vs 2.0 a subset of ps/vs 3.0 and future versions?

Nick

There are significant differences between ps/vs 1.x versions, and also the step to ps/vs 2.0 meant a whole new way of shader programming. But can anyone confirm that ps/vs 2.0 is a true subset of ps/vs 3.0 or are there small differences? Also, do you expect future shader versions to be based on ps/vs 2.0 as well?
 
I don't think 3 is a superset of 2. IIRC, if you look in the reference rasteriser source, for example, I think you can find some differences... but don't ask me to remember what they are!
 
In terms of features 3.0 may be a superset of 2.0, but in terms of hardware implementation I suspect there's a world of difference. Creating hardware to support dynamic flow control must be a lot different from just running a linear sequence of instructions.
 
In any pipelined microarchitecture adding dynamic branches increases the difficulty of the ALU's design considerably. In a single core superscalar CPU it's difficult enough, but in a GPU which runs four threads (the pixel quad) concurrently it would seem to be a nightmare to implement.

Wasn't it the ARM that takes care of branching in an interesting way? It doesn't change the IP but instead has a "condition code" that simply masks the result of instructions not to be executed due to the branch. This is only for short branches of course.
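To make the masking idea concrete, here is a toy sketch in Python. It is only a model of the concept, not of how ARM or any GPU is actually built: the instruction format, `run_predicated`, and the register names are all made up for illustration. Every instruction takes its slot in the stream, and the condition only decides whether the result is written back.

```python
def run_predicated(program, regs, flag):
    """Run every instruction in order; there is no jump anywhere.

    Each instruction is a (condition, dest, fn) tuple (hypothetical format):
      condition is None  -> always commit the result
      condition is True  -> commit only if the flag is set
      condition is False -> commit only if the flag is clear
    """
    for condition, dest, fn in program:
        result = fn(regs)  # the instruction still occupies its slot
        if condition is None or condition == flag:
            regs[dest] = result  # write-back is masked rather than skipped
    return regs


# Hypothetical equivalent of: if (a < b) x = a + 1; else x = b * 2;
program = [
    (True, "x", lambda r: r["a"] + 1),
    (False, "x", lambda r: r["b"] * 2),
]

regs = {"a": 3, "b": 5, "x": 0}
flag = regs["a"] < regs["b"]
run_predicated(program, regs, flag)
print(regs["x"])  # 4
```

The cost is that both sides of the "branch" consume instruction slots, which is why this only pays off for short branches.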
 
akira888 said:
In any pipelined microarchitecture adding dynamic branches increases the difficulty of the ALU's design considerably. In a single core superscalar CPU it's difficult enough, but in a GPU which runs four threads (the pixel quad) concurrently it would seem to be a nightmare to implement.

Actually, more threads can be better: at least you don't need to worry about branch prediction, because multi-threading hides the latency for you.

akira888 said:
Wasn't it the ARM that takes care of branching in an interesting way? It doesn't change the IP but instead has a "condition code" that simply masks the result of instructions not to be executed due to the branch. This is only for short branches of course.

D3D pixel shaders use a "cheaper" kind of predication: compute both results and choose between them using a condition. Of course, real predication is still better, but it's almost useless if you have no branch misprediction penalty.
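A rough Python sketch of that "compute both, then choose" pattern, applied to a four-pixel quad. The names `select` and `shade_quad` and the arithmetic on each side are invented for the example; this is in the spirit of a compare-and-select instruction, not a model of any real shader hardware.

```python
def select(cond, if_true, if_false):
    """Per-pixel choice between two results that were both already computed."""
    return [t if c else f for c, t, f in zip(cond, if_true, if_false)]


def shade_quad(values):
    # Both sides of the "branch" are evaluated for all four pixels.
    halved = [v * 0.5 for v in values]
    doubled = [v * 2.0 for v in values]
    # The condition only picks which result each pixel keeps.
    cond = [v >= 0.5 for v in values]
    return select(cond, halved, doubled)


quad = [0.25, 0.5, 0.75, 1.0]
print(shade_quad(quad))  # [0.5, 0.25, 0.375, 0.5]
```

No pixel ever takes a different instruction path, so the quad stays in lockstep; the price is that every pixel pays for both sides.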
 
akira888 said:
Wasn't it the ARM that takes care of branching in an interesting way? It doesn't change the IP but instead has a "condition code" that simply masks the result of instructions not to be executed due to the branch. This is only for short branches of course.
For all intents and purposes, the predicate register in the shaders does the same thing.
 