I think the main differences will be in board design, possibly memory also.
It wouldn't surprise me if the blade boards have high speed interconnects so that 2 or maybe all 4 boards can be stacked.
Unlike Stadia's setup, I don't think it would be made available for game streaming, though.
I've also been giving chiplet / MCM APUs some thought.
I think for PS5 a monolithic die may be the choice, even though it will also be used for game streaming.
But Scarlett covers two consoles, game streaming, and Azure use. I wonder if they will embrace chiplets etc., allowing slightly more customization and flexibility.
A lot of tech will be ready in time, but consoles are usually based on mature manufacturing.
In the paper the CU-to-CPU-core ratio is 8, which would also fit an 8-core Zen 2 chiplet combined with either a single 64 CU GPU chiplet or two 32 CU GPU chiplets. The latter would be ideal for reuse but seems unlikely, since it would require techniques to make the split invisible and present it as a single GPU to developers. On the other hand, MS is a software company first and foremost, and considering how much experience they have with graphics APIs, MS would probably be the best company to tackle it together with AMD.
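As a rough sanity check on that ratio, here is the arithmetic for the two hypothetical configurations (the chiplet counts are my assumptions, only the ratio comes from the paper):

```python
# Hypothetical chiplet configurations implied by a CU-to-CPU-core ratio of 8.
cpu_cores = 8            # one Zen 2 chiplet
cu_to_core_ratio = 8     # ratio from the paper
total_cus = cpu_cores * cu_to_core_ratio  # 64 CUs total

# Option A: one large GPU chiplet.
single_chiplet = [total_cus]
# Option B: two smaller, reusable GPU chiplets.
dual_chiplet = [total_cus // 2] * 2

print(total_cus, single_chiplet, dual_chiplet)  # 64 [64] [32, 32]
```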
The biggest problem is in the PC space; server and console, not so much.
Servers probably already have to handle mGPU-type setups.
DirectX already has mgpu functionality.
It may not be invisible, but it may not be as big a deal in a static box. Patch Unity, Unreal, and other engines to handle it in a basic way, which may leave some performance on the table but would be a nice fallback.
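As a sketch of what a "basic way" could look like: alternate-frame rendering just round-robins frames across GPUs. This is a minimal scheduling illustration in plain Python, not a real graphics API (an engine would use something like DirectX 12 explicit multi-adapter for the actual dispatch):

```python
# Minimal alternate-frame rendering (AFR) dispatch sketch: frames are
# assigned to GPUs round-robin. Hypothetical two-GPU setup.
def assign_frames(num_frames, num_gpus=2):
    """Return a list of (frame_index, gpu_index) pairs."""
    return [(frame, frame % num_gpus) for frame in range(num_frames)]

# Even frames go to GPU 0, odd frames to GPU 1.
print(assign_frames(6))
# [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0), (5, 1)]
```

AFR leaves performance on the table whenever one frame depends on the previous frame's results, which is part of why invisible mGPU is hard.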
For MS the loss in performance may be worth it for the overall benefits.
Just waiting to be told that I'm 100% wrong.
If you really want to maximize reuse between your consoles and servers, then you probably need a GPU that is not specialized for gaming but also has high double-precision performance.
Are the GPUs actually very different, or is the double precision just disabled?
I'm wondering how much die area you actually save by dropping double precision.
Edit: It seems that AI and ML tend to prioritize lower precision, not higher, which is one of the use cases Phil gave. So not having FP64 may not be a big deal for the intended workloads.