Just a historical curiosity... how big (in bits or bytes) was the Pentium 4 trace cache, the one that held the decoded x86 instructions? Official documents state that it holds 12K micro-ops, with 8-way associativity.
As for the micro-op size, the format, and how many micro-ops were generated per x86 instruction, I don't think I've ever seen a definitive answer. Some sources say each micro-op was 100 bits long. That would mean the equivalent of 150 kilobytes for 12K micro-ops. Looking at this picture of a 180nm P4, however, the trace cache (left side) seems smaller than each half of the 256KB L2 cache.
www.tayloredge.com/museum/processor/2000_Pentium4.jpg
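For reference, the 150-kilobyte figure is just arithmetic. Here is a minimal sketch, assuming the unconfirmed 100-bit micro-op width quoted above; the names are made up for illustration, and tags or any other per-line overhead are ignored.

```python
# Rough size estimate for the trace cache payload.
UOPS = 12 * 1024        # 12K micro-ops, the official figure
BITS_PER_UOP = 100      # assumption: unconfirmed width quoted by some sources
L2_HALF_BYTES = 128 * 1024  # one half of the 256KB L2

def trace_cache_bytes(uops=UOPS, bits_per_uop=BITS_PER_UOP):
    """Payload size in bytes, ignoring tags and other overhead."""
    return uops * bits_per_uop // 8

size = trace_cache_bytes()
print(size, "bytes =", size / 1024, "KB")
print("larger than half of L2:", size > L2_HALF_BYTES)
```

Under that assumption the payload alone would be larger than one half of the L2, which is what makes the die photo puzzling.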
It's also not clear to me why the cache holds a number of micro-ops that is not a "nice" power of 2. I would understand it if the cache had 3- or 6-way associativity. Was part of the cache deactivated for the sake of improving yields?
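To make the "not a power of 2" point concrete, here is a small sketch that pulls the power-of-two part out of the capacity. It is only arithmetic on the official 12K and 8-way figures; the helper name is made up, and it says nothing about how Intel actually organized the array.

```python
def split_power_of_two(n):
    """Return (odd_part, exponent) such that n == odd_part * 2**exponent."""
    exponent = 0
    while n % 2 == 0:
        n //= 2
        exponent += 1
    return n, exponent

total_uops = 12 * 1024       # 12K micro-ops, the official figure
ways = 8                     # official associativity
per_way = total_uops // ways

print(split_power_of_two(total_uops))  # odd factor and exponent for the whole cache
print(split_power_of_two(per_way))     # same for the micro-ops in a single way
```

The factor of 3 survives the division by 8 ways, so it has to come from somewhere other than the associativity.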