Too Tricky...
Fewer chips on the board = better board.
Start with a mature die process and design a complementary internal bus architecture.
Join separate chips into System on Chip units as production and die processes mature.
OK, but look at this generation: depending on how much silicon is spent on the future system, it will be tough to pack all these little cores together.
Anyway, you're right: fewer chips on the board = better board.
But I was short on time last night (had to go to bed, since I was working early in the morning).
I think that implementing two GPUs is non-trivial.
If we look at how SLI/CrossFire systems work right now, it seems clear that manufacturers won't be in a position to afford it:
two VRAM pools
twice the ROPs
etc.
How do we deal with that and reduce the overhead?
I guess that's where it can hurt in regard to R&D.
I guess that at some point the CPU provider and the GPU provider have to work together.
How could it look?
I don't know what is possible in regard to process/R&D cost, etc.
I see two possibilities:
1) Include most of the GPU fixed-function hardware on the CPU, as well as the memory controller, and you're left with two coprocessors/chips made mostly of shader cores.
2) Something that would look more like what AMD could come up with.
It could be made of only two chips:
1: CPU + shader cores 2: shader cores plus fixed-function hardware.
While the latter might lead to a cheaper mobo design, I feel that the former could yield better results and wouldn't cost that much
(always questionable when we speak of millions and millions of units...).
You could give the CPU/GPU a huge amount of bandwidth and use two fast serial lanes to connect the two coprocessors to the CPU (to the GPU thread scheduler in fact; think of a Y for the datapath).
If the main thread scheduler is on the CPU, it could make load balancing easier than having two schedulers (one on each die) trying to talk to each other.
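
To make the load-balancing point a bit more concrete, here is a rough Python sketch of one central scheduler handing work to two shader coprocessors. It is only a toy model under my own assumptions: the `dispatch` function, the batch costs and the "least loaded first" policy are all made up for illustration, and real hardware scheduling would be far more involved.

```python
import heapq

def dispatch(batches, num_coprocessors=2):
    """Toy central scheduler: send each batch to the least-loaded coprocessor.

    `batches` is a list of costs in arbitrary units (hypothetical numbers).
    """
    # Min-heap of (current_load, coprocessor_id), so the least-loaded
    # coprocessor is always popped first.
    heap = [(0, cid) for cid in range(num_coprocessors)]
    assignment = {cid: [] for cid in range(num_coprocessors)}
    for cost in batches:
        load, cid = heapq.heappop(heap)
        assignment[cid].append(cost)
        heapq.heappush(heap, (load + cost, cid))
    loads = {cid: sum(costs) for cid, costs in assignment.items()}
    return assignment, loads

assignment, loads = dispatch([5, 3, 8, 2, 7, 4])
print(assignment)
print(loads)
```

Because a single scheduler sees both loads, it can keep them close with one local decision per batch. With a scheduler on each die, the two would have to exchange load information over the link before deciding, which is the "trying to talk to each other" cost.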