DiGuru said: Most transistors that go into an ALU consist of execution logic. Next to the instruction fetch/decode logic, you need a bunch of transistors to execute each opcode. And you need transistors that determine where to store what. And transistors to keep track of what each part is used for. And caches to store the state of all that when switching fragments and shaders.
Strictly speaking, I don't think most transistors in an ALU are execution logic. With heavy pipelining and synchronization demands, I would bet most transistors in an ALU are used for holding results at pipeline stage boundaries. True, those transistors are in the ALU, but they aren't computational transistors.
Still, it is an interesting question.
I would also bet there is very little instruction-decode logic. Do shaders even have control ROMs (CROMs) to translate from the "opcode" to the actual wide-word hardware control codes? Or is everything VLIW, with the control bits carried in the instruction itself (that would be my guess)?
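To make the distinction concrete, here is a rough sketch of the two decode styles I mean. Everything in it is made up for illustration: the opcodes, the control-bit names, and the 4-bit control word are hypothetical and do not describe any real shader hardware.

```python
# Hypothetical control-signal bit positions (assumed, not from any real GPU).
ALU_ADD = 1 << 0    # select the adder
ALU_MUL = 1 << 1    # select the multiplier
SRC_B_IMM = 1 << 2  # take operand B from an immediate field
WRITE_REG = 1 << 3  # write the result back to a register

# Control-ROM style: a compact opcode indexes a table of wide control words.
CONTROL_ROM = {
    0x0: 0,                                  # NOP: no signals asserted
    0x1: ALU_ADD | WRITE_REG,                # ADD
    0x2: ALU_MUL | WRITE_REG,                # MUL
    0x3: ALU_ADD | SRC_B_IMM | WRITE_REG,    # ADDI
}


def decode_with_rom(opcode):
    """Look up the control word for a compact opcode."""
    return CONTROL_ROM[opcode]


def decode_vliw(instruction_word):
    """VLIW style: the control bits already sit in the instruction,
    so 'decode' is just extracting the field (low 4 bits here)."""
    return instruction_word & 0xF


# The same ADD, expressed both ways, drives the same control signals.
rom_word = decode_with_rom(0x1)
vliw_word = decode_vliw(0b1001)
print(rom_word == vliw_word)
```

In the first style the hardware pays for the lookup table; in the second the compiler does the translation ahead of time and the instruction stream gets wider instead.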