I wonder... I was just thinking of an MCM package consisting of a CPU/Shader die and the eDRAM/ROP die...
First, a slide is better than my subpar English.
With regard to eDRAM and future GPUs, I think AMD will pass on it, and MS too.
I still have no proper answer to some questions I've had in my head (I'm considering asking for my post to be moved), but one of them could be summed up as: "in the near future, what will prevail, many-core GPUs or multi-core GPUs?" (I still have no answer on this, though.)
When I say multi-core I mean multiple fully fledged GPUs (dedicated CP, thread dispatcher, etc.).
Basically, Intel is following the "many-core" path. I should reread the Fermi presentation (I'm a bit in the dark here), but I think AMD will go "multi-core".
Fusion relies on some memory controller, and Anand goes as far as stating that the L3 should be shared across the APU. My idea, after trying to figure out what GPUs are actually lacking, could be something like this:
* AMD could build "tiny complete GPUs" as building blocks instead of arrays.
If I compare an RV8xx to a Phenom II of the same size (given the healthy amount of memory in the GPU, the comparison is close to fair), I would say a redone RV8xx could be built out of 4 such GPUs. I think the CPU core and the GPU core are likely to be around the same size; modularity seems important to AMD, and doing so should save them some trouble.
So if it were done now, a tiny GPU could be made of a Command Processor, a thread dispatcher, 4 SIMD arrays and 2 RBEs tied to their "local" part of the L2.
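To make the building block a bit more concrete, here is a rough Python sketch of the "tiny GPU" described above and of a chip assembled from several of them. The names (`TinyGPU`, `Chip`, `build_chip`) are mine and purely illustrative; nothing here comes from AMD, and the unit counts are just the ones I guessed at above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TinyGPU:
    """One hypothetical 'tiny complete GPU' building block (my guess, not AMD's)."""
    command_processors: int = 1
    thread_dispatchers: int = 1
    simd_arrays: int = 4
    rbes: int = 2  # each RBE is assumed to be tied to its local slice of L2


@dataclass
class Chip:
    """A chip made of several identical tiny GPUs."""
    blocks: List[TinyGPU] = field(default_factory=list)

    def total_simd_arrays(self) -> int:
        return sum(b.simd_arrays for b in self.blocks)

    def total_rbes(self) -> int:
        return sum(b.rbes for b in self.blocks)

    def total_command_processors(self) -> int:
        return sum(b.command_processors for b in self.blocks)


def build_chip(n_blocks: int) -> Chip:
    """Assemble a chip out of n_blocks tiny GPUs."""
    return Chip([TinyGPU() for _ in range(n_blocks)])


if __name__ == "__main__":
    chip = build_chip(4)  # the "4 GPUs" case discussed above
    print(chip.total_command_processors(), "command processors,",
          chip.total_simd_arrays(), "SIMD arrays,",
          chip.total_rbes(), "RBEs")
```

The point of the sketch is only that each block carries its own front end (CP and dispatcher), so scaling the chip means adding whole blocks rather than widening one array.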
In the future I think they should drop the RBEs, as Intel did, and move to a tile/bin-based deferred renderer. They should also have more generic caches.
By having fewer cores, ATi could set aside for a while some of the "communication headaches" that Intel has to face with Larrabee. Say you have 4 CPUs and 8 GPUs with a coherent memory model: that's still a lot, but it's less of a bother than dealing with 32 cores (or more).
AMD has hinted at unexpected advancements on the multi-GPU front, and I think the new information we have may make things clearer. In the not-too-distant future I could see the Command Processor going through a significant evolution, becoming able to submit new tasks to itself and acting more like a CPU or an autonomous VPU (better glossary/nomenclature wanted...). It will also have to work with both CPUs and GPUs.
I could see the CPUs and the GPUs running in a kind of runtime environment: both could update a shared "task/command buffer" held in the L3 cache (not a huge one) or keep a new job for themselves, set task priority or affinity (CPU/GPU), etc., and even steal jobs from another overbooked processor.
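Here is a minimal sketch of what I mean by that runtime, in Python. Everything in it is an assumption on my part: the `Runtime` and `Processor` classes, the order in which a processor looks for work (own queue, then the shared buffer, then stealing), and the rule of stealing from the tail of the busiest queue. A plain list stands in for the buffer that would live in L3.

```python
from collections import deque


class Processor:
    def __init__(self, name, kind):
        self.name = name
        self.kind = kind      # "CPU" or "GPU"
        self.local = deque()  # jobs this processor decided to keep for itself


class Runtime:
    def __init__(self, processors):
        self.processors = processors
        self.shared = []      # stand-in for the shared task buffer in L3

    def submit(self, task, priority=0, affinity=None, keep_on=None):
        """Queue a task, either on one processor or in the shared buffer."""
        entry = (priority, affinity, task)
        if keep_on is not None:
            keep_on.local.append(entry)
        else:
            self.shared.append(entry)

    @staticmethod
    def _fits(entry, proc):
        # affinity None means "runs anywhere"
        return entry[1] in (None, proc.kind)

    def next_task(self, proc):
        # 1. A processor first drains the jobs it kept for itself.
        if proc.local:
            return proc.local.popleft()[2]
        # 2. Then it takes the highest-priority compatible shared task.
        candidates = [e for e in self.shared if self._fits(e, proc)]
        if candidates:
            best = max(candidates, key=lambda e: e[0])
            self.shared.remove(best)
            return best[2]
        # 3. Finally it steals from the most overbooked other processor.
        others = [p for p in self.processors if p is not proc]
        victim = max(others, key=lambda p: len(p.local), default=None)
        if victim is not None:
            for entry in list(reversed(victim.local)):
                if self._fits(entry, proc):
                    victim.local.remove(entry)
                    return entry[2]
        return None  # nothing to do
```

For example, with one CPU and one GPU, a GPU that runs out of shared work picks up the last job the CPU queued for itself, as long as that job is not pinned to the CPU.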
For the cache structure, I could see each CPU and GPU having read/write access to its local share of L1 and L2 and to the L3, and read-only access to other GPUs'/CPUs' L2.
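Those access rules fit in a few lines of Python. This is only a sketch of the idea: the `access` function and its arguments are made up for illustration, and since I said nothing about another core's L1, the code assumes it is simply not accessible.

```python
def access(requester, level, owner=None):
    """Return "rw", "r" or None for a core touching a given cache level.

    requester / owner are core names such as "cpu0" or "gpu1";
    owner is ignored for the shared L3.
    """
    if level == "L3":
        return "rw"                      # L3 is shared read/write by everyone
    if level in ("L1", "L2") and owner == requester:
        return "rw"                      # local share of L1/L2
    if level == "L2":
        return "r"                       # someone else's L2: read-only
    return None                          # assumption: remote L1 is off limits


if __name__ == "__main__":
    print(access("gpu0", "L2", owner="gpu0"))
    print(access("gpu0", "L2", owner="cpu1"))
    print(access("cpu0", "L1", owner="gpu0"))
```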
EDIT
In the new slides (here, slide 2) AMD plans "heterogeneous computing" for 2012, which could be in line with MS's next system launch.
EDIT 2
Repi posted a new presentation on his blog and it's pretty enlightening. It's here.
There is an interesting point with regard to our current discussion on page 49: "we don't need a separate CPU and GPU".
EDIT 3
This is great: possibly a quad core (4 MB of L2) packed with possibly 480 SPs / 6 SIMD arrays.
This could give low-end laptop/desktop gaming some fresh air.