They may not be that big by then, either. By that time, smaller process nodes will have matured. There was all that talk way back about 65nm production for both XeCPU and CELL, but then there was also talk even further back about CELL having some ridiculous number of cores. If you can get the gaming industry into a position where the number of threads developers are comfortable with is far greater than what the next-gen consoles can provide, you'll have the impetus for more cores in the following generation of consoles, but I don't see the complexity of each core jumping that quickly. The main reason is that once you've driven the point home about how effective multi-core can be, you're essentially moving down the TPC line. At that point, which is easier? Putting more complex OOOE in each of your cores, or allowing more ways of SMT so that TLP can cover for the lack of ILP? Adding a beefy speculative prefetcher, or adding more cache? Making the branch predictor more extensive, or allowing predication and relying on good compilers?
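
To make the last trade-off concrete, here's a rough sketch in C of the same clamp written two ways: once with ordinary branches that the hardware has to predict, and once in a branch-free "select" style that resembles what a predicating compiler might produce. The function names (`clamp_branchy`, `select_i32`, `clamp_predicated`) are made up for illustration, and this is not a claim about what any particular console compiler actually emits. A real compiler is free to turn either version into either form.

```c
#include <stdint.h>

/* Branchy form: each comparison is a conditional jump, so a simple core
 * depends on its branch predictor to avoid pipeline flushes. */
int32_t clamp_branchy(int32_t x, int32_t lo, int32_t hi)
{
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
}

/* Hypothetical helper: returns a when cond is non-zero, b otherwise,
 * using a bit mask instead of a jump. Assumes two's-complement int32_t,
 * which holds on the mainstream targets I'm aware of. */
int32_t select_i32(int32_t cond, int32_t a, int32_t b)
{
    /* mask is all ones when cond != 0, all zeros when cond == 0 */
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
    return (int32_t)(((uint32_t)a & mask) | ((uint32_t)b & ~mask));
}

/* Predicated-style form: both candidate values are always computed and one
 * is selected, so there is no control flow to mispredict. */
int32_t clamp_predicated(int32_t x, int32_t lo, int32_t hi)
{
    x = select_i32(x < lo, lo, x);
    x = select_i32(x > hi, hi, x);
    return x;
}
```

The branch-free version always does the work for both outcomes, which is the cost you pay for not needing a big predictor. Whether that trade wins depends on how predictable the branch was in the first place, which is exactly why it leans on the compiler (or the programmer) to pick the right spots.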