So how do you keep all units (2 ALU blocks, LD/ST, SFU) busy in Fermi? I can't see how this should work given the wording in the Fermi whitepaper.
To tell the truth, you are not. You can issue an instruction to two functional blocks per (scheduler) cycle, or to only one if it is a DP instruction. So if you have an instruction for the LD/ST or the SFU pipe, one of the ALU blocks is not getting one.
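
To make the counting concrete, here is a toy model of that issue rule in Python. It is only a sketch of the argument, not a description of the real hardware: the block labels, the `issue_cycle` helper and the order in which pending instructions are picked are all assumptions made up for illustration, and it ignores how long each block stays busy after it has been issued to.

```python
from collections import Counter

# Two issue slots per scheduler cycle, as described in the post.
ISSUE_SLOTS = 2

# (instruction kind, functional block) pairs; the labels are illustrative only.
# Assumed pick order: LD/ST and SFU first, then the two ALU blocks.
ROUTES = (("ldst", "LDST"), ("sfu", "SFU"), ("alu", "ALU0"), ("alu", "ALU1"))


def issue_cycle(pending):
    """Issue one scheduler cycle from `pending` (a Counter of instruction kinds).

    Returns the list of blocks that received an instruction this cycle
    and decrements `pending` accordingly.
    """
    # A DP instruction is issued alone, so only one block is fed this cycle.
    if pending["dp"] > 0:
        pending["dp"] -= 1
        return ["ALU0"]

    fed = []
    for kind, block in ROUTES:
        if len(fed) < ISSUE_SLOTS and pending[kind] > 0:
            pending[kind] -= 1
            fed.append(block)
    return fed


# Four ALU instructions and one load/store waiting to be issued.
pending = Counter(alu=4, ldst=1)
first = issue_cycle(pending)   # ['LDST', 'ALU0']: ALU1 gets nothing this cycle
second = issue_cycle(pending)  # ['ALU0', 'ALU1']: both ALU blocks are fed
```

Whatever the mix, `issue_cycle` never returns more than two blocks, so with four blocks in total at least two of them receive no new instruction in any given cycle.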