Hi all,
I've been trying to understand the new GPU architectures' implementation details, and while I think I grasp the bulk of it, I'm curious about the SMT capabilities.
Say we have a pixel shader full of MAD instructions and a vertex shader full of special-function instructions: do they execute concurrently and maximize ALU usage? Likewise, if the top half of a pixel shader is full of MAD instructions and the bottom half is full of special-function instructions, with hard dependencies between the two halves, do other batches of pixels (i.e. threads) increase ALU utilization?
Or is this form of 'Hyper-Threading' still a CPU-only feature (to be reintroduced by Nehalem)? If it is, what can be expected for the foreseeable future?
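To make the second scenario concrete, here is a minimal toy model of what I mean by other batches filling idle units. It is only a sketch and not a description of G80 or any real GPU: it assumes one MAD unit and one special-function unit, single-cycle instructions, strictly in-order issue within a batch, and a hypothetical scheduler (the `simulate` function is my own invention) that may issue from different batches to different units in the same cycle.

```python
def simulate(programs):
    """Toy model: return (cycles, utilization) for a two-unit machine.

    Assumptions (illustrative only, not real hardware behaviour):
    - one "MAD" unit and one "SFU" unit, each accepting one op per cycle
    - every op completes in a single cycle
    - each batch issues its ops strictly in order, at most one per cycle,
      which stands in for the hard dependencies inside the shader
    """
    units = ("MAD", "SFU")
    pc = [0] * len(programs)  # next instruction index for each batch
    cycles = 0
    issued = 0
    while any(pc[i] < len(prog) for i, prog in enumerate(programs)):
        busy = set()  # units already claimed in this cycle
        # Fixed priority: lower-numbered batches pick their unit first.
        for i, prog in enumerate(programs):
            if pc[i] < len(prog) and prog[pc[i]] not in busy:
                busy.add(prog[pc[i]])
                pc[i] += 1
                issued += 1
        cycles += 1
    return cycles, issued / (cycles * len(units))


# Pixel shader from the question: MAD-heavy top half, SFU-heavy bottom half.
shader = ["MAD"] * 4 + ["SFU"] * 4

one_batch = simulate([shader])           # a single batch running alone
two_batches = simulate([shader, shader])  # a second batch can use the idle unit
```

Under these assumptions a single batch keeps only one of the two units busy at any time (8 cycles, 50% utilization), while two batches overlap the first batch's special-function half with the second batch's MAD half (12 cycles, roughly 67% utilization). Whether real hardware schedules batches this way is exactly what I'm asking.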
Thanks,
Nicolas
Edit: Since Arun clarified that ADD and MUL use the same ALU (at least on G80), I've changed the example to MAD and special-function.