The weird thing is that the fundamental reason why Cell is what it is, and any trouble that may cause, did not go away. Cell's designers did not toss the traditional memory model out by accident. It is a willful trade for more core scalability, which leads to higher throughput per unit of die area. It is a very deliberate shuffling of complexity out of the hardware (which incurs material cost per chip) and into the software (which incurs a one-time cost but is then free, or very cheap, to replicate at volume). Sony is in the business of mass-manufacturing things. They will typically choose the solution that approaches the lower cost when, and this is the important part, you assume very high-volume runs. They have an observed tendency to assume they can single-handedly drive volumes of whatever component they integrate straight to viability.
Cell is not a surprising design for Sony to pick at all.
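To make "shuffling complexity into the software" concrete, here is roughly what the programmer takes over on an SPE: data has to be staged between main memory and the 256 KB local store by explicit DMA, instead of a coherent cache doing it behind your back. A minimal sketch using the MFC intrinsics from the Cell SDK's spu_mfcio.h; the chunk size, buffer layout, and how the effective address arrives are illustrative, not lifted from any real codebase.

    /* Illustration: the programmer, not a coherent cache, moves data
       between main memory and the SPE local store via explicit DMA.
       Intrinsics are the MFC calls from spu_mfcio.h (Cell SDK);
       CHUNK and process_chunk() itself are made up for the example. */
    #include <stdint.h>
    #include <spu_mfcio.h>

    #define CHUNK 4096   /* bytes per transfer; DMA sizes/addresses must be 16-byte aligned */

    static char buf[CHUNK] __attribute__((aligned(128)));

    void process_chunk(uint64_t ea)   /* ea = effective (main memory) address */
    {
        const unsigned int tag = 0;

        mfc_get(buf, ea, CHUNK, tag, 0, 0);   /* main memory -> local store */
        mfc_write_tag_mask(1 << tag);
        mfc_read_tag_status_all();            /* stall until the DMA lands */

        /* ... compute on buf, entirely out of local store ... */

        mfc_put(buf, ea, CHUNK, tag, 0, 0);   /* local store -> main memory */
        mfc_read_tag_status_all();            /* wait for the write-back too */
    }

None of that is hard by itself; the cost is that every data movement, every buffer size, every overlap of transfer and compute is now the software's problem. That is the bill Sony chose to pay once, at development time, instead of on every die.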
Coherent caches may be "nicer", but the cost of coherency grows much faster than linearly with the number of cores. For one or two or three cores, sure, coherent memory is a no-brainer. When you look at the lengths Intel and AMD are going to for their "many-core" x86 CPUs (which, for the record, are two to three full process nodes beyond the initial CBE design and run at roughly the same clocks), you can already see that cost piling up.
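A back-of-envelope way to see where that cost comes from, assuming the simplest case of a broadcast (snooping) protocol: every miss from one core has to be observed by every other core, so total coherency traffic grows roughly with the square of the core count while useful work only grows linearly. A toy model, not a measurement:

    /* Toy model only: with broadcast snooping, each core's miss is
       snooped by every other core, so total snoop work scales ~n^2
       while per-core useful work stays flat. */
    #include <stdio.h>

    int main(void)
    {
        for (int cores = 1; cores <= 32; cores *= 2) {
            long snoops = (long)cores * (cores - 1); /* one miss per core, snooped by all peers */
            printf("%2d cores -> %4ld snoops per round of one miss each\n",
                   cores, snoops);
        }
        return 0;
    }

Directory schemes, snoop filters, and point-to-point fabrics exist precisely to beat that curve back down, and they are a big part of the die area and design effort referred to above.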
So, more cores or a "nicer" memory model? They do compete for the same die space, and it only gets worse the more cores you set as your baseline. What is the baseline for a next-gen CPU? 8 cores at 3.2GHz with 8-way SP FMA ... again? Or less?