JonWoodruff
Newcomer
John Carmack brought up some interesting points about x86 these days being just fine compared to modern RISC processors, and also about Itanium's VLIW architecture. One of the key points was code compression. This didn't make too big of an impression on me until I read (http://www.theinquirer.net/?article=7993) that compiled Itanium code for VMS is 3x larger than x86 code. If code can be 3x larger, then any cache holding instructions is only 1/3 as effective! (No wonder Itaniums have huge caches.) And bandwidth for instructions is also 1/3 as effective.
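To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 3x expansion factor is the figure from the article above; the 16 KiB cache size and the 16 bytes/cycle fetch width are made-up illustrative values, not the specs of any real chip, and the function names are just my own.

```python
def effective_code_capacity(cache_bytes: float, expansion: float) -> float:
    """How much 'x86-sized' code fits in a cache when the same
    program compiles to `expansion` times as many bytes."""
    if expansion <= 0:
        raise ValueError("expansion must be positive")
    return cache_bytes / expansion


def effective_fetch_bandwidth(bytes_per_cycle: float, expansion: float) -> float:
    """Instruction fetch bandwidth measured in 'x86-sized' bytes per cycle."""
    if expansion <= 0:
        raise ValueError("expansion must be positive")
    return bytes_per_cycle / expansion


if __name__ == "__main__":
    EXPANSION = 3.0          # from the article: Itanium code ~3x larger
    CACHE_BYTES = 16 * 1024  # hypothetical instruction cache size
    FETCH_BYTES = 16.0       # hypothetical fetch width per cycle

    cap = effective_code_capacity(CACHE_BYTES, EXPANSION)
    bw = effective_fetch_bandwidth(FETCH_BYTES, EXPANSION)
    print(f"Effective cache capacity: {cap:.0f} x86-equivalent bytes")
    print(f"Effective fetch bandwidth: {bw:.2f} x86-equivalent bytes/cycle")
```

This only models raw bytes. It ignores everything else that differs between the two architectures (how much work each instruction does, register count, and so on), so treat it as an illustration of the 1/3 argument rather than a performance estimate.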
So the question is: Are we better off with a pure RISC architecture or a CISC architecture? If the P4 were running pure native microcode instead of unpacking x86 instructions, would it be faster or slower? Considering that decoding only takes a few (2?) stages of a 20-stage pipeline, could it be that the compression you get from smaller instructions that each imply more work is worth it?
Disclaimers:
This is all disregarding the limited number of registers in x86 vs other architectures.
I don't expect all Itanium code would be 3x the size of its x86 equivalent.