This is what I would expect to see from a unified architecture when it is given a workload that is heavily limited by the vertex shader: it becomes limited by triangle setup.

There's something wrong with the VS speed tests. Why is 1 point light as fast as 8 point lights?
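To make the triangle-setup argument concrete, here is a rough back-of-the-envelope model. Every number in it is made up for illustration (ALU counts, instruction counts per shader, and setup rate are assumptions, not measurements of any real chip), and the function name is hypothetical. It only shows how a fixed setup rate can hide the difference between a cheap and an expensive vertex shader once enough ALUs are available for vertex work.

```python
def triangles_per_clock(vs_instructions, alu_ops_per_clock, setup_tris_per_clock):
    """Throughput of a two-stage pipeline: the slower stage sets the pace.

    Simplifying assumption: one shaded vertex per triangle (perfect vertex reuse).
    """
    vs_rate = alu_ops_per_clock / vs_instructions  # triangles/clock the shader stage can feed
    return min(vs_rate, setup_tris_per_clock)


# Hypothetical figures, chosen only to illustrate the effect.
SETUP_RATE = 1.0          # triangles per clock through setup
UNIFIED_ALUS = 128        # unified design: all ALUs can run vertex work
DEDICATED_VS_ALUS = 8     # older split design: a few fixed vertex units
ONE_LIGHT = 20            # assumed VS instruction count, 1 point light
EIGHT_LIGHTS = 100        # assumed VS instruction count, 8 point lights

unified_1 = triangles_per_clock(ONE_LIGHT, UNIFIED_ALUS, SETUP_RATE)
unified_8 = triangles_per_clock(EIGHT_LIGHTS, UNIFIED_ALUS, SETUP_RATE)
split_1 = triangles_per_clock(ONE_LIGHT, DEDICATED_VS_ALUS, SETUP_RATE)
split_8 = triangles_per_clock(EIGHT_LIGHTS, DEDICATED_VS_ALUS, SETUP_RATE)

print("unified:  ", unified_1, unified_8)   # both capped by setup
print("dedicated:", split_1, split_8)       # both limited by the vertex shader
```

Under these assumed numbers the unified case reports the same rate for 1 and 8 lights because both are capped by setup, while the dedicated-VS case scales with shader length. Whether that is what is really happening in the tests would need to be checked against the actual hardware figures.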