There is an article about eDRAM at Real World Tech, which mentions IBM talking about these issues.
http://www.realworldtech.com/page.cfm?ArticleID=RWT020705121631&p=2
John Barth - IBM Systems & Technology Group
For embedded purposes, SRAM is the de facto choice for discerning designers. Embedded SRAM provides the fastest cycle times while operating well in a semiconductor logic process. However, one bit of SRAM storage typically requires 6 transistors, whereas a DRAM cell only needs 1 transistor plus one capacitor. Hence, the common argument in favour of eDRAM is that of the 4x density advantage relative to eSRAM.
While not ignoring this point, the presenter saw the problem from another perspective. While it was conceded that eSRAM provides the fastest random access cycle times, eDRAM can come close, and the remaining performance differential between eSRAM and eDRAM can be mitigated through architectural choices if they are considered early enough in the design cycle.
The speaker went on to argue that most high-end designs are oriented more around the memory hierarchy than the logic circuits themselves. Further, he posed an example of where eDRAM may be superior to eSRAM in a conventional logic design. The floorplan for the Itanium2 9M processor was displayed, as can be seen in Figure 2. The furthest L3 subarray was estimated to be 23mm away from the cache controller in Intel's layout. The floorplan for a hypothetical Itanium2 9M which used eDRAM for the L3 cache array was then shown (Figure 3). In this floorplan, the furthest subarray would only be, roughly, 14mm away from the cache controller. Delay approximations were made for the hypothesized array, and the results can be found in Table 1 below.
Thus, while the actual eDRAM cells are slower than the corresponding eSRAM cells, the increased density of eDRAM leads to shorter wires in the L3 cache array. The reduction in worst-case wire length (23 to 14mm) corresponded to a 39% reduction in wire delay. It should be noted that the speaker emphasized that they took certain liberties when deriving these figures.
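As a quick check on the arithmetic above, here is a minimal Python sketch. The function name and the `exponent` parameter are my own, not from the talk. It assumes delay scales as a power of wire length: linearly (exponent 1) for a well-repeated wire, which reproduces the quoted 39%, and quadratically (exponent 2) for an unrepeated distributed-RC wire, shown only for comparison.

```python
def wire_delay_reduction(old_mm, new_mm, exponent=1.0):
    """Fractional delay reduction if delay is proportional to length**exponent.

    exponent=1.0 models a repeated (buffered) wire; exponent=2.0 models an
    unrepeated distributed-RC wire. Both are simplifying assumptions.
    """
    return 1.0 - (new_mm / old_mm) ** exponent


# Worst-case L3 wire lengths quoted in the article: 23mm (eSRAM) vs 14mm (eDRAM).
linear = wire_delay_reduction(23.0, 14.0)
quadratic = wire_delay_reduction(23.0, 14.0, exponent=2.0)

print(f"repeated wire:   {linear:.0%} less delay")     # 39%
print(f"unrepeated wire: {quadratic:.0%} less delay")  # 63%
```

The 39% figure is therefore just the relative reduction in length, which suggests the speaker's estimate treated wire delay as proportional to distance.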
During the question session, it was asked what additional costs were involved with fabricating chips using eDRAM. It was stated that the eDRAM process adds 3 extra mask stages before any of the other logic process steps, and that the typical cost adder is on the order of 20%. Thus, there is a cross-over point between the additional cost of eDRAM processing and the increased density of eDRAM. Presently, this cross-over tends to exist around the 8-16Mb mark.
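To see why a cross-over exists at all, here is a rough Python sketch of the trade-off. It assumes die cost is simply proportional to area, that the 20% adder applies to the whole die, and that eDRAM is 4x denser than eSRAM as stated earlier. The function names and the two area inputs in the usage lines are made-up placeholders, not figures from the talk, and yield and array overhead are ignored.

```python
def sram_cost(logic_area, sram_area_per_mbit, mbits):
    """Relative die cost with an eSRAM array (cost proportional to area)."""
    return logic_area + mbits * sram_area_per_mbit


def edram_cost(logic_area, sram_area_per_mbit, mbits,
               density_gain=4.0, cost_adder=0.20):
    """Relative die cost with an eDRAM array: smaller array, pricier process."""
    array_area = mbits * sram_area_per_mbit / density_gain
    return (1.0 + cost_adder) * (logic_area + array_area)


def crossover_mbits(logic_area, sram_area_per_mbit,
                    density_gain=4.0, cost_adder=0.20):
    """Memory size at which both options cost the same.

    Solves (1 + a) * (A + N*s/d) = A + N*s for N.
    """
    saved_fraction = 1.0 - (1.0 + cost_adder) / density_gain
    if saved_fraction <= 0:
        raise ValueError("eDRAM never breaks even with these parameters")
    return cost_adder * logic_area / (sram_area_per_mbit * saved_fraction)


# Hypothetical inputs: 100 mm^2 of logic, 2 mm^2 per Mb of eSRAM.
n = crossover_mbits(100.0, 2.0)
print(f"break-even at roughly {n:.1f} Mb for these placeholder inputs")
```

Below the break-even size the 20% process adder outweighs the area saved; above it the denser array wins. Where the point actually falls depends entirely on the real logic area and cell sizes, which the article does not give.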