Anarchist4000
Veteran
Different aspect of software scheduling, but yes, it's what they have been doing. This would be the compiler generating short instruction sequences that use temporary registers: a register file cache, if you will. That would be problematic on GCN, since each subsequent instruction comes from a different wave, so the temporary registers would fill up before a wave could reuse them. It would require each wave to run a handful of instructions, or at least run until it stalled, before the next wave is scheduled. Matrix multiplication, for example, is a commonly repeated set of instructions with a lot of data sharing.

Correct me if I'm wrong, but didn't NVIDIA go to software scheduling already with Kepler?
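To put rough numbers on the point about temporaries getting evicted when every instruction comes from a different wave, here is a toy sketch. It is not a model of any real GPU: the LRU policy, the one-temporary-per-wave simplification, and the names `cache_hits`, `round_robin` and `bursts` are all assumptions made up for illustration.

```python
from collections import OrderedDict


def cache_hits(schedule, capacity):
    """Count operand reads served by a small LRU cache of temporaries.

    schedule: sequence of wave ids, one per issued instruction.
    Toy assumption: each instruction reads the temporary its own wave
    wrote last, then writes a new one in its place.
    """
    cache = OrderedDict()  # wave id -> placeholder, oldest entry first
    hits = 0
    for wave in schedule:
        if wave in cache:
            hits += 1
            cache.move_to_end(wave)  # mark as most recently used
        cache[wave] = True  # write the new temporary for this wave
        if len(cache) > capacity:
            cache.popitem(last=False)  # evict the least recently used
    return hits


def round_robin(waves, instrs_per_wave):
    """One instruction per wave, then on to the next wave."""
    return [w for _ in range(instrs_per_wave) for w in range(waves)]


def bursts(waves, instrs_per_wave, burst):
    """Each wave issues up to `burst` instructions before switching."""
    order = []
    for start in range(0, instrs_per_wave, burst):
        n = min(burst, instrs_per_wave - start)
        for w in range(waves):
            order.extend([w] * n)
    return order


if __name__ == "__main__":
    # 8 waves, 8 instructions each, cache room for 4 temporaries.
    print("round robin hits:", cache_hits(round_robin(8, 8), 4))
    print("burst-of-4 hits: ", cache_hits(bursts(8, 8, 4), 4))
```

With more waves than cache slots, the round-robin order never finds its own temporary still resident, while letting each wave run a short burst recovers most of the reuse. Real hardware would differ in many ways, so treat this only as an illustration of the argument.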
http://videocardz.com/71280/amd-vega-10-vega-11-vega-12-and-vega-20-confirmed-by-eec