I'd have thought keeping multiple arrays and reducing the size of each would be preferable. With only one shader array, you're either 100% PS or 100% VS at any given moment. With three smaller arrays you can balance throughput better, since each array can be working on a different shader type. I don't know what the overhead in management logic would be, though. Would one array of 12 ALUs be smaller and cheaper than three arrays of 4 ALUs, and would it be more or less effective?
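
To put rough numbers on the balancing point, here's a quick sketch of which VS/PS splits are possible at a single instant. It assumes each array runs only one shader type at a time and ignores switching cost and the management logic entirely; the function name `instantaneous_splits` is just made up for illustration.

```python
def instantaneous_splits(num_arrays, alus_per_array):
    """Return every (vs_alus, ps_alus) split possible at one instant.

    Assumption: an array is entirely VS or entirely PS at any moment,
    so the only choice is how many whole arrays go to vertex work.
    """
    return [
        (k * alus_per_array, (num_arrays - k) * alus_per_array)
        for k in range(num_arrays + 1)
    ]


# One array of 12 ALUs: all-or-nothing.
print(instantaneous_splits(1, 12))  # [(0, 12), (12, 0)]

# Three arrays of 4 ALUs: two intermediate splits become available.
print(instantaneous_splits(3, 4))   # [(0, 12), (4, 8), (8, 4), (12, 0)]
```

So the same 12 ALUs go from two possible splits to four. Whether that extra granularity is worth the added control logic is exactly the part this sketch doesn't answer.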