Nintendo
Nintendo has always done their operating systems and libraries themselves. They wouldn't gain that much by using x86 or ARM plus an existing operating system (such as Android) as a base for their console OS.
This has been historically true, but is it certain the situation hasn't changed? Part of Nintendo's slide in relevance can be attributed to parts of the market writing it off over reduced device capability and a lack of platform services, and a fully featured OS is what makes those services possible. The Wii U's system software was in a very bad state at roll-out, so if Nintendo wants to take the fight more directly to the PS4 (FreeBSD) and Xbox One (Windows), is there evidence it is more comfortable in this zone than before? And if it is bringing in MIPS and PowerVR, which already have existing kernels and platform support, why not draw from those, given that the rather non-standard memory hierarchy and the old customized PowerPC architecture are being dropped anyway?
Given the troubles Nintendo has had of late, I have doubts that its internal platform offers any significant insight into working with modern architectures, relative to platforms that have been evolving so much more quickly.
Hardware setup:
- 6 core / 12 thread MIPS64 CPU
Is there no way to get to 8 physical cores? There are still reasons to keep at least one core for the system layer, but taking a reserve core out of 6 is going to have an impact.
I'm also not entirely sold on the performance, since they're going with DMIPS as a measure, which is an optimistic one used primarily in the mobile/embedded space.
Jaguar does fine with Dhrystone as well, so there may be a small but noticeable deficit if the core count starts lower and the current consoles decide to unlock one of their reserved cores.
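To give a feel for why a DMIPS figure flatters: Dhrystone's hot loop is tiny, cache-resident integer and string work, nothing like game code with large working sets and pointer chasing. A rough sketch of the flavor of work it measures (purely illustrative, not the actual benchmark source) would be something like:

```c
#include <stdio.h>
#include <string.h>

/* Rough, Dhrystone-flavored kernel (hypothetical, not the real benchmark).
 * The working set is a few dozen bytes, so it never leaves L1 cache, and the
 * work is simple integer arithmetic, predictable branches, and short string
 * copies. Real game workloads with big data sets behave very differently,
 * which is why DMIPS numbers tend to favor small in-order and mobile cores. */

#define ITERATIONS 10000000L

int main(void)
{
    char src[32] = "DHRYSTONE-LIKE STRING CONSTANT";
    char dst[32];
    long acc = 0;

    for (long i = 0; i < ITERATIONS; i++) {
        int a = (int)(i & 0xFF);
        int b = a * 3 + 7;              /* trivial integer math */
        if (b > a)                      /* highly predictable branch */
            acc += b - a;
        memcpy(dst, src, sizeof(src));  /* short, cache-resident copy */
        acc += dst[0];
    }

    printf("acc = %ld\n", acc);
    return 0;
}
```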
All this aside, I would be curious how Imagination fares at the level of system integration, power, and performance in question.
Mobile does encourage a very large amount of integration, but the criteria are different at the console level, and I haven't seen anything from them to compare with the current or previous gen consoles.
Sony
My bet goes for an all-NVIDIA solution. NVIDIA badly wants Tegra to succeed (Kepler/Maxwell GPUs and the Denver CPU), and they are lacking partners and market penetration. Sony would be a perfect fit for them in the high end. I think Sony would prefer the Denver CPU over a traditional out-of-order ARM CPU (as their developers are used to low level hardware access + low level optimization).
Perhaps if Nvidia opens up the internals of their optimizer and provides Sony with a way into the secure memory space for the processor.
Otherwise, I am not sure I would characterize Denver as offering lower-level access than a regular OoO core. The lower level may be implemented in software, but that software exists as CPU-only code sitting below even Sony's hypervisor.
Not knowing when whole subroutines are going to spontaneously become optimized, re-optimized, mis-optimized, or kicked out of the optimization cache is a far bigger unknown than knowing whether an OoO core moved a load past a multiply in a ~100 instruction window. The minimum time budgets are potentially far more wobbly with this, and the optimizer shies away from code that changes privilege levels (something the DRM-crazy console platforms may not like).
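As a rough illustration of what I mean by wobbly time budgets, this is the kind of harness a developer might run to see it; hot_routine here is just a hypothetical stand-in for a per-frame subsystem, not anything from a real SDK. Time the same call many times and look at the best/worst spread: on a conventional OoO core the spread stays narrow, while on a translate-and-cache design the early unoptimized calls and any later re-optimization or eviction of translated code show up as outliers.

```c
#include <stdio.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical workload standing in for a per-frame subsystem update. */
static volatile uint64_t sink;
static void hot_routine(void)
{
    uint64_t x = 0;
    for (int i = 0; i < 50000; i++)
        x += (uint64_t)i * 2654435761u;
    sink = x;
}

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

int main(void)
{
    uint64_t best = UINT64_MAX, worst = 0;

    for (int call = 0; call < 1000; call++) {
        uint64_t t0 = now_ns();
        hot_routine();
        uint64_t dt = now_ns() - t0;

        if (dt < best)  best = dt;
        if (dt > worst) worst = dt;
    }

    /* A wide gap between best and worst is the kind of variability that
     * makes minimum per-frame time budgets hard to guarantee. */
    printf("best %llu ns, worst %llu ns\n",
           (unsigned long long)best, (unsigned long long)worst);
    return 0;
}
```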
While it would always come down to what Sony and Nvidia would get out of the deal, I feel at this point that going with Denver is saddling a console with the cost of Denver's other ambitions outside that space. Denver's not that far ahead of the best custom OoO cores in its general performance range, and the console space is much more willing to use many more cores. I would look forward to seeing how Denver scales to higher core counts, particularly since there may be unusually heavyweight requirements for syncing the instruction path now that there are variable amounts of optimized code and Denver already partially reserves one of two cores for the optimizer.
Frankly, I think Sony would benefit from a more focused and physically developed core that doesn't need an almost fully-expanded uop format in memory, and a pipeline that could probably be tightened up if it didn't leave open the possibility of supporting the quirks of an arbitrary architecture. The software optimizer's job could even be helped if at least some OoO functionality were in the core, since it could leave routines unoptimized and unexpanded, or invoke the optimizer less often when its power cost would exceed that of an OoO core muddling through mostly optimal code. Past dynamic translation/optimization schemes like Dynamo indicated a benefit from software that optimizes itself even when targeting an OoO core.
I was considering asking, both in the Denver thread and here, whether Sony of all the manufacturers would be the most paranoid about an architecture that puts a portion of the hardware execution loop off-die, particularly since an ancestor of Denver was Transmeta's Crusoe, and that chip's secure memory partition's DSA key was compromisable.
(edit: http://www.realworldtech.com/crusoe-exposed/3/ )
Perhaps if the memory were moved onto an interposer it would be harder for a less-resourced hacker to bus-glitch it or attach a DRAM analyzer, but I think it would be preferable for that memory to be stacked or placed on-die, making it hard even for a well-funded criminal organization or reverse-engineering group to extract something that would let anything running on the CPU be monitored. The optimizer's ability to evaluate much more code also makes it much more capable of determining the value of the code it is monitoring than a similar hack of a standard CPU's firmware would be (security through obscurity rules the day there, although it's not mentioned much).
I'm assuming Nvidia has already thought of this without disclosing it, but I'm also pondering whether ARM's weak Icache consistency could, without further changes like hardware Icache snooping, allow an idling core's stale instructions to execute after the optimizer has re-optimized code past a branch target, potentially changing execution results if the stale ARM code branches to an optimization-cache address that now holds a different program segment.
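For context on why that matters: on ARM the instruction side is not kept coherent with data-side writes by hardware, so anything that publishes new code has to explicitly clean/invalidate the caches and barrier before executing it, or a core can fetch stale instructions. A minimal user-space sketch of that publication step, assuming an AArch64 Linux environment (ordinary JIT-style code, nothing to do with Denver's internal path):

```c
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Minimal self-modifying-code example for AArch64 Linux (illustrative only).
 * Without the __builtin___clear_cache() call, another core (or even this one)
 * may execute stale instructions, because ARM does not keep the instruction
 * cache coherent with data-side writes by default. */

typedef int (*func_t)(void);

int main(void)
{
    /* AArch64 encoding for: mov w0, #42 ; ret */
    static const uint32_t code[] = { 0x52800540u, 0xd65f03c0u };

    void *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return 1;

    memcpy(buf, code, sizeof(code));

    /* Publish the new instructions: clean the D-cache to the point of
     * unification, invalidate the I-cache, and issue the barriers.
     * Skipping this step is exactly the stale-instruction hazard above. */
    __builtin___clear_cache((char *)buf, (char *)buf + sizeof(code));

    func_t fn = (func_t)buf;
    return fn();   /* returns 42 if the code was published correctly */
}
```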
Microsoft
AMD already has 64 bit ARM server CPUs available with 8 cores and with their integrated GPUs. Scaling these up to meet the console requirements in the next two or three years would be straightforward.
This isn't exclusive to Microsoft, but I am curious whether, going forward, either x86 APU console maker would find a significant win in going to ARM while still using AMD. AMD is going to offer a mostly equivalent x86 part, assuming AMD is still around.
However, if AMD's x86 is in the cards, I'm curious whether Intel would be interested enough to make another run like the one it was rumored to have made for the current gen. With its gradual expansion into more generalized foundry work and hints at diversifying in the face of maturing markets and increasing low-end competition, Intel might be able to float an offer.
I think there are enough genuine questions about everyone's ability to maintain a workable process-scaling trend, with Intel the most likely to keep to its schedule, and Intel has at least as much integration research behind it as just about anyone else.
They also continue to push the envelope on interfaces and communication methods, while AMD trades away engineers and IP.
AMD has not provided evidence of sustained and significant improvement in its technology, and I think there is a long-term criticism to be made that relying on semi-custom work to fund R&D means getting paid too little to do R&D whose scope is limited to the needs of customers whose ambitions do not go as far as AMD's need to. AMD's leakiness, with project disclosures and cross-project comparisons concerning the current-gen consoles leaking out months ahead of everything, may also indicate that the current console makers have an interest in making sure they don't share this contractor again.