That's what I was asking about really.
I can see the sense in a secondary processor to deal with apps and UI. I can see the sense in using a secondary processor to output the final image, so the likes of Twitch and Remote Play can operate with no impact to the game.
But I wonder if there are other uses for a semi custom ARM SoC? We see upscalers in consoles, and an anti-aliasing HDMI cable was released fairly recently. Is there anything akin to those, which could conceivably sit outside of the high performance hardware, yet still be useful?
Perhaps this depends on the usage model for the UI. In a more traditional console model, the UI is off-screen during gameplay, and bringing it up takes focus away from the game. A HUD or UI for another app that sits outside the game costs little performance, which makes it hard to justify dedicating another SoC to it.
As for a secondary processor to output the final image, GPUs already have compositors and scalers, which are dedicated processors or processor blocks for exactly that purpose. Encoder/decoder blocks are frequently present as well, so it is unclear to me what else would need to be offloaded in this scenario.
A streaming or remote play scenario requires that another application or client contend for the same output, buffers, IO, and network resources. The lowest-overhead approach would be to have one or more cores in the same memory hierarchy and compute domain handle the arbitration and data movement.
Increasing the separation, for example with a lower-performance version of the same architecture, needs to be compared against running the streaming app on one of the main cores. If there are points in the workload where a big core is expected to be well-utilized, shunting that work onto a slower core costs performance in a way that could reduce quality or force the main system allocation to stall longer than if a main core were doing it. This may be some portion of the encoding process, the speed of data movement, or some platform synchronization or arbitration function that takes longer and obligates the faster cores to spin until the slower core is done.
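To put rough numbers on that stall, here is a toy model. Everything in it is an assumption for illustration: the function name `frame_time_ms`, the 14 ms of game work, the 1 ms of blocking platform work, and the idea that the little core runs at a quarter of a big core's speed. It is a sketch of the reasoning, not a measurement of any real console.

```python
def frame_time_ms(game_ms, blocking_ms, relative_speed):
    """Toy model: time before the main core can finish a frame.

    game_ms        -- per-frame work done by the main core (hypothetical)
    blocking_ms    -- platform work the main core must wait on, as it
                      would cost on a big core (hypothetical)
    relative_speed -- speed of the core doing that platform work,
                      relative to a big core (1.0 = same speed)
    """
    # The main core spins until the helper finishes, so the helper's
    # slowdown lands directly on the frame's critical path.
    return game_ms + blocking_ms / relative_speed


BUDGET_MS = 1000.0 / 60.0  # 60 Hz frame budget, about 16.67 ms

on_big_core = frame_time_ms(14.0, 1.0, 1.0)      # helper on a main core
on_little_core = frame_time_ms(14.0, 1.0, 0.25)  # helper on a 4x slower core

print(on_big_core, on_big_core <= BUDGET_MS)        # 15.0 True
print(on_little_core, on_little_core <= BUDGET_MS)  # 18.0 False
```

With these made-up figures, the same 1 ms of arbitration work fits in the frame when a main core does it and blows the 60 Hz budget when the slow core does it, even though the game itself never got more expensive.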
Using a different microarchitecture, even one that is still x86, such as Jaguar cores in a primarily Zen system, may be problematic as well. Apps on the secondary cores would have to be compiled without Zen's ISA extensions, and system functions would need to be written separately due to architectural changes. Sharing memory spaces may not be as easily managed, since the TLB system has evolved with Zen and the cache coherence protocol has expanded.
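The ISA-extension point can be sketched as a set intersection. The feature lists below are partial and written from memory of the publicly listed extensions, so treat them as an assumption rather than a complete reference; the helper names `common_baseline` and `unusable_on_mixed` are made up for this example.

```python
# Partial, illustrative feature lists (assumption, not exhaustive).
JAGUAR = {"sse4.2", "avx", "aes", "bmi1", "f16c", "movbe"}
ZEN = JAGUAR | {"avx2", "fma", "bmi2", "adx", "sha"}


def common_baseline(*core_features):
    """Extensions that every core type in the system supports."""
    return set.intersection(*core_features)


def unusable_on_mixed(app_needs, *core_features):
    """Extensions an app wants but cannot rely on if it may run on any core."""
    return sorted(set(app_needs) - common_baseline(*core_features))


# A hypothetical app built to take advantage of the newer cores:
app = {"avx", "avx2", "fma"}

print(sorted(common_baseline(JAGUAR, ZEN)))
print(unusable_on_mixed(app, JAGUAR, ZEN))  # ['avx2', 'fma']
```

Anything that can be scheduled on either core type is stuck with the intersection, so either the app is built for the older baseline or the scheduler has to pin it to one kind of core.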
A different architecture entirely, like ARM, almost always means encapsulating the systems from each other to the point that they only see each other as IO devices when active concurrently, and frequently perform many operations only when the other is not active.
For security and low-power purposes, there can be a stand-off level of integration with a mostly separate system device. I am not sure which parts that interact with the foreground functionality could straddle that divide.