What are you going to do for a memory bus?
I am seeing 2- and 4-core ARM Cortex-A9s paired with 1x32-bit and 2x32-bit LPDDR3 memory interfaces. You are projecting a 64-core processor (16x-32x more cores), so you need a memory bus capable of feeding all those cores. You are also now in a situation where you need to coordinate traffic, caches, etc. There is a reason we are seeing beefy L2s and even beefier L3s (and even L4s) on high-end CPUs: you need to keep the cores fed somehow. A number of developers here have talked about how strong memory subsystems and hierarchies, which are really hard to do, are more important than straight-up execution units. You also have the issue of cross traffic and communication between so many cores (hence all the work on lattices, grids, and such in Intel's TFLOPS demo initiative).
And then there is Amdahl's Law. Even if I am being really generous (assuming they beef up the core and add full-fledged FPU support on par with AVX), you are looking at per-core IPC of about half that of a single Intel Core i-series core. Compare that with a 6-core (12-thread) Intel part with much better IPC: what Intel's core 0 does at 100% utilization will probably take 3 or 4 of these ramped-up ARM cores, because as you spread the workload you get diminishing returns. That is where the cache issues and the memory bus really come into play: what Intel is doing on 1 core, the ARM chip has 4 cores doing while communicating with each other and contending for memory resources.
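To put rough numbers on the Amdahl's Law point, here is a minimal Python sketch. The 64-core and 6-core counts and the "half the IPC" figure come from the discussion above. The parallel fractions are purely my assumptions for illustration, and the function names are made up. It also ignores memory contention and cross-core traffic entirely, so if anything it flatters the many-core chip.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Classic Amdahl's Law: speedup over one core of the same type."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_throughput(parallel_fraction, cores, per_core_speed):
    """Throughput relative to ONE reference core with per_core_speed == 1.0.

    per_core_speed scales both the serial and the parallel part, so it
    simply multiplies the Amdahl speedup.
    """
    return per_core_speed * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    # Assumed parallel fractions; real game code varies a lot.
    for p in (0.50, 0.90, 0.99):
        many_small = relative_throughput(p, cores=64, per_core_speed=0.5)
        few_big = relative_throughput(p, cores=6, per_core_speed=1.0)
        print(f"parallel={p:.2f}  64 half-speed cores: {many_small:6.2f}x  "
              f"6 full-speed cores: {few_big:6.2f}x")
```

Under these assumptions the 64 slow cores only pull clearly ahead when nearly all of the work is parallel. With a big serial portion the few fast cores win, and that is before you charge the 64-core chip for memory contention.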
There would be certain loads where pretty naive ARM cores would be great; e.g. in a basic webserver where you get a lot of requests without a ton of dependencies (think serving pages or logged-in users in a very low-resource environment) this could be a win. You could basically spin up 64 virtual machines on 64 real processors. But game code? At least the SPEs had a memory model (Local Store, basically 256KB of very fast local memory) and a ring bus (EIB) for memory traffic that addressed some of these issues--and when exploited they were really fast (just hard to maximize). Basically ARM would need a really radical design change not only to compete at the high frequencies you talk about but ALSO a radical redesign of the entire memory system. Just throwing 64 cores on a die and calling it good would be a disaster! And all the fancy upgrades aren't going to be cheap. Color me skeptical that an ARM architecture with a similar mm^2 and power budget to an i5 or i7 would be able to compete without becoming something completely different from what it is.
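For a sense of scale on the memory bus side, here is a back-of-the-envelope sketch of peak bandwidth per core for the 1x32-bit and 2x32-bit LPDDR3 setups mentioned earlier. The 1600 MT/s data rate is my assumption (the post does not give one), the helper names are made up, and these are theoretical peaks, not sustained numbers.

```python
import math

ASSUMED_MEGATRANSFERS = 1600  # assumption: LPDDR3 at 1600 MT/s
CHANNEL_WIDTH_BITS = 32       # from the 1x32-bit / 2x32-bit configs above


def channel_bandwidth_gbs(megatransfers, width_bits):
    """Peak bandwidth of one channel in GB/s (transfers/s * bytes/transfer)."""
    return megatransfers * 1e6 * (width_bits / 8) / 1e9


def per_core_bandwidth_gbs(cores, channels):
    """Peak bandwidth each core gets if it is shared evenly."""
    per_channel = channel_bandwidth_gbs(ASSUMED_MEGATRANSFERS, CHANNEL_WIDTH_BITS)
    return channels * per_channel / cores


def channels_to_match(cores, ref_cores, ref_channels):
    """Channels needed to keep the same cores-per-channel ratio as a reference."""
    return math.ceil(cores * ref_channels / ref_cores)


if __name__ == "__main__":
    for cores, channels in ((2, 1), (4, 2), (64, 2)):
        bw = per_core_bandwidth_gbs(cores, channels)
        print(f"{cores:2d} cores on {channels}x32-bit: {bw:.2f} GB/s per core")
    needed = channels_to_match(64, ref_cores=2, ref_channels=1)
    print(f"64 cores at the same ratio: {needed} channels "
          f"({needed * CHANNEL_WIDTH_BITS}-bit total bus)")
```

The exact figures depend on the assumed data rate, but the ratio does not: keeping 64 cores as well fed as today's 2- and 4-core parts means 16x-32x more memory channels, or a very different memory hierarchy.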