Hi all,
For some time now I've been contemplating what the instruction set of a unified CPU/GPU architecture should look like. AVX2 is obviously a huge step in the right direction, bringing gather support and arriving alongside FMA, but to adequately fulfill the role of a low-end GPU it seems a few more things will be needed...
In particular, implementing the fixed-function portions of the GPU in software still takes a relatively large number of legacy CPU instructions. I've observed, though, that many of these instructions just move small bit fields into the right places. Therefore I think that SIMD versions of the bit-level gather/scatter instructions (pext/pdep from BMI2, which currently operate only on scalar registers) could make a significant difference. Note that unlike fixed-function hardware, they could be used to implement many custom algorithms as well. Programmable rasterization, anti-aliasing, and even texture filtering have lately become popular research topics, but they have yet to be implemented efficiently without sacrificing flexibility or spending a lot of die area on single-purpose logic. The pext/pdep instructions could help achieve high efficiency at a low cost, and their uses would go well beyond 3D.
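To make the idea a bit more concrete, here is a minimal sketch in plain C of what per-lane pext/pdep might compute. The scalar helpers follow the BMI2 semantics (pext gathers the bits selected by a mask into the low bits, pdep scatters low bits out to the mask positions). The names pext32, pdep32, vpext32x8 and vpdep32x8 are hypothetical, and the 8-lane loops only emulate what a SIMD form could do, since no such vector instruction exists in AVX2. The RGB565 pixel layout is just an assumed example of the "small bit fields" case.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8  /* assumed width: eight 32-bit lanes, as in a 256-bit register */

/* Scalar reference for pext: gather the bits of src selected by mask
   into the contiguous low bits of the result. */
static uint32_t pext32(uint32_t src, uint32_t mask) {
    uint32_t result = 0;
    uint32_t out_bit = 1;
    while (mask != 0) {
        uint32_t lowest = mask & (~mask + 1u);  /* isolate lowest set mask bit */
        if (src & lowest) result |= out_bit;
        out_bit <<= 1;
        mask &= mask - 1u;                      /* clear lowest set mask bit */
    }
    return result;
}

/* Scalar reference for pdep: scatter the low bits of src to the
   positions of the set bits in mask. */
static uint32_t pdep32(uint32_t src, uint32_t mask) {
    uint32_t result = 0;
    uint32_t in_bit = 1;
    while (mask != 0) {
        uint32_t lowest = mask & (~mask + 1u);
        if (src & in_bit) result |= lowest;
        in_bit <<= 1;
        mask &= mask - 1u;
    }
    return result;
}

/* Hypothetical SIMD forms, emulated lane by lane. Each lane has its own
   mask, analogous to how gather takes per-lane indices. */
static void vpext32x8(uint32_t dst[LANES], const uint32_t src[LANES],
                      const uint32_t mask[LANES]) {
    for (int i = 0; i < LANES; ++i) dst[i] = pext32(src[i], mask[i]);
}

static void vpdep32x8(uint32_t dst[LANES], const uint32_t src[LANES],
                      const uint32_t mask[LANES]) {
    for (int i = 0; i < LANES; ++i) dst[i] = pdep32(src[i], mask[i]);
}

/* Example use: split an RGB565 pixel into channels and rebuild it.
   Returns 1 if the round trip reproduces the pixel. */
static int rgb565_roundtrip(uint32_t pixel) {
    uint32_t r = pext32(pixel, 0xF800u);
    uint32_t g = pext32(pixel, 0x07E0u);
    uint32_t b = pext32(pixel, 0x001Fu);
    uint32_t rebuilt = pdep32(r, 0xF800u) | pdep32(g, 0x07E0u) | pdep32(b, 0x001Fu);
    assert(r < 32u && g < 64u && b < 32u);
    return rebuilt == (pixel & 0xFFFFu);
}
```

Each field extraction or insertion here is a single operation per lane, where the conventional shift-and-mask sequence needs two or three. That is the kind of saving I have in mind for format conversion and address swizzling.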
So I'm looking for more examples of instructions that would both aid in implementing legacy functionality and enable developers to create new experiences.
Thanks for your suggestions,
Nick