On the other hand, I thought I read that CineFX had 2 address registers in the VS and 64 temporaries in the PS...
Kristof, do you think these VS/PS 3.0 specs are final? I was thinking about it: doesn't dynamic flow control in the pixel shader units pretty much require an instruction cache per pipeline (since with K pipes you potentially need to fetch K instructions from different addresses in the i-cache)? With a max instruction count of 1024 this sounds really expensive, assuming Basic's estimate of a 64-bit+ instruction encoding is what's used.
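To put a rough number on that, here is a back-of-envelope sketch. It assumes exactly 64 bits per instruction (the estimate above is "64-bit+", so this is a lower bound) and a made-up pipe count of 8; neither figure comes from any published spec, and it assumes the worst case where every pipe holds the full program.

```python
# Rough estimate of instruction storage if every pipeline keeps its own copy.
MAX_INSTRUCTIONS = 1024     # max program length quoted above
BITS_PER_INSTRUCTION = 64   # assumption: lower bound of the "64-bit+" estimate
PIPES = 8                   # hypothetical pipeline count K, for illustration only


def icache_bytes(instructions, bits_per_instruction):
    """Bytes needed to hold a full-length program."""
    return instructions * bits_per_instruction // 8


per_pipe = icache_bytes(MAX_INSTRUCTIONS, BITS_PER_INSTRUCTION)
total = per_pipe * PIPES
print(f"{per_pipe} bytes per pipeline, {total} bytes for {PIPES} pipelines")
# prints "8192 bytes per pipeline, 65536 bytes for 8 pipelines"
```

So that's 8 KB per pipe just to hold one maximum-length program, before any wider encoding is taken into account.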
Apart from that, the VS and PS 3.0 units look very similar: both have "texture samplers". Also, since pixel programs that start at the same time will not necessarily end at the same time, does this mean that a set of K=NxM pipelines will stall until the "longest" pixel program finishes before starting on new pixels? Perhaps the PS and VS 3.0 units could be made identical in hardware, allowing for load-balancing between them...
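A toy model of that stall, purely as a sketch: it assumes one instruction per cycle per pipe and that a batch of pipes cannot accept new pixels until its longest program has finished. The function name and the instruction counts are invented for illustration and say nothing about how real hardware schedules work.

```python
def lockstep_utilization(instruction_counts):
    """Fraction of pipe-cycles doing useful work when every pipe in a
    batch waits for the longest program before taking new pixels."""
    longest = max(instruction_counts)
    useful = sum(instruction_counts)
    return useful / (longest * len(instruction_counts))


# Hypothetical batch: three pixels exit early, one runs a long loop.
batch = [20, 20, 20, 100]
print(lockstep_utilization(batch))  # prints 0.4
```

In this made-up case the pipes sit idle 60% of the time, which is the kind of waste that would make sharing or load-balancing units attractive.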
Anyway, very interesting stuff.