It should be noted that the 256GB/s of bandwidth between those ROPs and the eDRAM isn't compressed. AFAIK, and somebody correct me if I'm wrong, ATI and nVidia have both invested quite a bit in compression between the ROPs and main memory on all PC GPUs. That 256GB/s can essentially be completely used, and as such it isn't "overkill"--it's exactly the amount necessary: 8 bytes per sample (4 bytes color + 4 bytes Z), times 4 samples (4xAA), times 2 (read-modify-write), times 8 ROPs, at 500MHz: 8 * 4 * 2 * 8 * 500,000,000 = 256GB/s. But, again, there isn't any form of compression there.
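To make the arithmetic above explicit, here's a quick sketch of the same calculation (the figures are the ones quoted in this post, not official spec-sheet numbers):

```python
# eDRAM ROP bandwidth math from the post above.
bytes_per_sample = 4 + 4      # 4 bytes color + 4 bytes Z
samples = 4                   # 4xAA
rmw_factor = 2                # read-modify-write: each sample read and written
rops = 8
clock_hz = 500_000_000        # 500MHz

bandwidth = bytes_per_sample * samples * rmw_factor * rops * clock_hz
print(bandwidth / 10**9)      # -> 256.0 (GB/s)
```

Which is exactly the eDRAM figure, with no compression anywhere in the chain to stretch it further.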
And ultimately, the result of eDRAM is that the system isn't completely, horribly bottlenecked by a SINGLE 128-BIT BUS, as it would have been without it. The eDRAM was a design choice, just like using two pools of memory in the PS3 was a design choice... without eDRAM, the system would look quite different from the way it is now. eDRAM was never going to make "brand new effects" possible... just make some of them feasible, assuming they were worked into the engine at the proper time and the tradeoffs were acceptable. Because there's always some kind of tradeoff.
This topic... I think it would be far better to ask whether the tessellation unit, memexport, or the features where Xenos is genuinely performant (i.e., pixel shader branching, relative to pre-G8x/R5xx GPUs, and vertex texture fetch/filtering) have had any practical use in a game out or in development YET, and if so (or if not), whether they can go anywhere in the future, and where.
(EDIT: and to clarify, I mean whether they make some things possible that previously weren't possible or feasible performance-wise, by way of a different method, and so on along those lines)
But that's just me... and I'm as crazy as they come, right?