The GPU chiplet thing does make sense to me, though. The IO chip should have the graphics display engine, so you can power the GPU chiplet down completely while still refreshing the screen; this has been an Achilles heel of AMD's mobile solutions versus Intel since... forever.
Is the IO die in question the IO die used by all client products, or a variant for the G products? The cost is either engineering two similar chips, or making one incrementally larger chip that adds cost for all of them.
One departure from other AMD proposals is that those have always had the GPU drive some kind of attached memory bus, which makes sense since GPU silicon needs tend to be closer to those of the IO domain than of the CPU regions. Architecturally, GPUs are better at utilizing DRAM on a sustained basis, and that capability usually costs more when the memory is not on-die. Perhaps the strangled bandwidth of the socket makes that less important in this instance, but some of the video's rumors indicate the hardware is already overspecced for that bandwidth, which leaves the question of where the inflection point lies between wasted GPU silicon and link capability on one side and dedicated 7nm dies and chip variants on the other.
AMD's modern APUs can already gate most of the silicon besides the memory controller and memory; the power domains for Raven Ridge are set up to allow this.
The relevant comparison is the efficacy of power-gating an on-die GPU versus power-gating a separate chiplet.
There are elements neither approach can turn off completely, so I'm not certain the chiplet adds much in the idle scenario beyond the ungateable link controllers and off-die interconnect it introduces.
I have some questions about whether there is a control complex or a chain of dependencies between the command processor on the chiplet and the ancillary hardware now moved to the IO die, but there could be methods to handle that.
The rumored CU counts seem of dubious value with the DDR bandwidth available, and I don't know what to make of the video using the same name for the Ryzen G products and a discrete product allegedly capable of hitting Vega 56 performance. Chiplets offer flexibility to an extent, but spanning the range from roughly 40 GB/s of socket DDR bandwidth plus link bandwidth up to a product backed by 512 GB/s seems to stretch what one silicon design and link can achieve without significant gaps that a supposedly compact chiplet cannot reconcile.
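To put rough numbers on that gap (the memory speeds here are my assumptions, not figures from the video), the socket-side arithmetic looks something like this:

```python
# Peak theoretical bandwidth of a DDR interface; the speeds below are assumed
# dual-channel DDR4 configurations, not anything confirmed for these parts.
def ddr_bandwidth_gbs(mt_per_s: float, channels: int, bus_width_bits: int = 64) -> float:
    return mt_per_s * 1e6 * channels * bus_width_bits / 8 / 1e9

print(ddr_bandwidth_gbs(2666, 2))  # ~42.7 GB/s, roughly the 40 GB/s figure above
print(ddr_bandwidth_gbs(3200, 2))  # ~51.2 GB/s with faster supported DIMMs
# Either way, a ~512 GB/s HBM2-class product is an order of magnitude away,
# so the same GPU silicon would be fed very differently in the two roles.
```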
But requiring the presence of an I/O chip makes no sense in mobile/desktop SKUs.
It would make manufacturing more expensive (2 dies per CPU instead of 1) and require the existence of multiple I/O chip variants since the one on Epyc is overkill.
The "dumb chiplets + I/O hub" scheme is worth it only on Epyc and Threadripper CPUs with lots of cores.
Having more than one die in a package has happened before, such as with the quad-core products built from two Conroe-generation dies. There is a yield and assembly cost to this, and three chips can add to it. It likely depends on where AMD's volume and yield projections sit for an unknown set of chips. If there were definitely high volumes of high-yield silicon for a given combination of features, the chiplet approach might lose out. However, if AMD is being pessimistic about the volume or manufacturability of a given graphics or processor SKU, this might be a sensible, if still less than ideal, decision.
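To make the yield side of that trade-off concrete, here is a minimal sketch using a simple Poisson defect model; the die areas, defect densities, and relative wafer costs are placeholders I picked for illustration, not numbers attached to these rumors:

```python
import math

def poisson_yield(area_mm2: float, d0_per_cm2: float) -> float:
    # Fraction of defect-free dies under a simple Poisson model.
    return math.exp(-d0_per_cm2 * area_mm2 / 100.0)

def silicon_cost_per_good_die(area_mm2: float, d0_per_cm2: float, cost_per_mm2: float) -> float:
    # Wafer silicon cost per good die, ignoring edge loss, test, and packaging.
    return area_mm2 * cost_per_mm2 / poisson_yield(area_mm2, d0_per_cm2)

# Placeholder comparison: a 250 mm^2 monolithic 7nm APU versus a 75 mm^2 CPU
# chiplet and a 100 mm^2 GPU chiplet on 7nm plus a 125 mm^2 IO die on a
# cheaper, higher-yield 14nm-class node.
monolithic = silicon_cost_per_good_die(250, 0.2, 1.0)
three_chip = (silicon_cost_per_good_die(75, 0.2, 1.0)
              + silicon_cost_per_good_die(100, 0.2, 1.0)
              + silicon_cost_per_good_die(125, 0.1, 0.5))

print(f"monolithic silicon cost (arbitrary units): {monolithic:.0f}")   # ~412
print(f"three-chip silicon cost before assembly:   {three_chip:.0f}")   # ~280
# The smaller dies win on raw silicon yield; assembly, test, and the extra
# link power sit on the other side of the ledger, which is the uncertainty above.
```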
Performance- or power-wise, I question the latency for the CPU and the bandwidth for the GPU. An MCM will have a higher floor in terms of power consumption due to the links, and whether that is countered by having 7nm on some of the chips is unclear. Cost-wise, I am curious how appealing this is for the cheapest, highest-volume SKUs. The supposedly debunked rumor of a 28nm bargain-basement single-chip product might make sense in this light, particularly if it was a contingency plan in case GlobalFoundries was somehow not capable of servicing that niche, or wanted to hold that range hostage in WSA negotiations at the nodes Zen was on.
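On the power floor of the links specifically, a back-of-envelope figure (both numbers below are assumptions about typical on-package links, not anything disclosed for this design):

```python
# Assumed energy per bit for an on-package link and assumed sustained
# GPU <-> IO-die traffic; both are illustrative placeholders.
pj_per_bit = 2.0
link_gb_per_s = 50.0

watts = pj_per_bit * 1e-12 * link_gb_per_s * 1e9 * 8
print(f"~{watts:.1f} W just to move that traffic across the package")  # ~0.8 W
# Small change for a desktop part, but a real tax on mobile idle and light-load
# budgets that any 7nm savings on the chiplets would have to claw back.
```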