Why would you copy data from ddr3 to esram before beginning to work on it? That copy operation is limited to 68 GB/s just as working directly from the ddr3 would be. Data should go from ddr3 directly to gpu, with intermediate results possibly written back to esram. Subsequent steps may read from and write to DDR or esram pools as needed depending on where the data resides and whether it would most benefit from high bandwidth or not. This talk of an extra step of copying to esram first, then beginning work, just doesn't make any sense to me.
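To put rough numbers on that argument, here is a minimal back-of-the-envelope sketch. It assumes a purely serial model (the copy fully finishes before any work starts) and decimal units, so MB divided by GB/s gives milliseconds. The 68 GB/s figure is the one quoted above; `ESRAM_GBPS_ASSUMED` and the function names are placeholders picked for illustration, not measured values.

```python
DDR3_GBPS = 68.0            # figure quoted in the post
ESRAM_GBPS_ASSUMED = 102.0  # ASSUMPTION: placeholder esram bandwidth for illustration


def direct_ms(size_mb, passes, ddr3_gbps=DDR3_GBPS):
    """Time for the gpu to read the data `passes` times straight from ddr3."""
    # MB / (GB/s) == milliseconds when using decimal units.
    return passes * size_mb / ddr3_gbps


def copy_then_work_ms(size_mb, passes, esram_gbps, ddr3_gbps=DDR3_GBPS):
    """Time to copy ddr3 -> esram first, then read `passes` times from esram."""
    copy = size_mb / ddr3_gbps  # the copy itself is capped by ddr3 bandwidth
    work = passes * size_mb / esram_gbps
    return copy + work


if __name__ == "__main__":
    for passes in (1, 10):
        a = direct_ms(32.0, passes)
        b = copy_then_work_ms(32.0, passes, ESRAM_GBPS_ASSUMED)
        print(f"passes={passes}: direct={a:.3f} ms, copy-then-work={b:.3f} ms")
```

Under these assumptions a single pass over the data is always slower with the up-front copy, since the copy alone already costs as much as reading from ddr3 directly. The copy only starts to pay off when the same data is read many times, or when the copy can be hidden behind other work.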
A more interesting scenario might be copying data from esram to ddr3. For example, while the gpu is still writing the last pixels of the framebuffer, the DMEs could already be moving the first ones out to memory, which would reduce the resolve time.
Or better yet, imagine they use tiling (like on the 360): the DMEs could trail slightly behind the gpu, moving the tile to ddr3 as it is written, so that when the gpu finishes a tile it can move on to the next one immediately, because the DMEs will already have everything prepared.
With the esram being capable of simultaneous reads and writes, this copy probably won't eat into the ROPs' bandwidth at all.
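A toy timeline can show why the overlap helps. This is only a sketch under stated assumptions: it works at whole-tile granularity (coarser than the "trailing a bit behind" idea above), it assumes the esram has room for the tile being rendered plus any tiles still waiting to be copied, and it assumes the copy does not slow the gpu down. The function names and timings are made up for illustration.

```python
def serial_ms(n_tiles, render_ms, copy_ms):
    """gpu renders a tile, then waits for it to be copied out, then continues."""
    return n_tiles * (render_ms + copy_ms)


def overlapped_ms(n_tiles, render_ms, copy_ms):
    """DMEs copy finished tiles to ddr3 while the gpu renders the next one."""
    gpu_free = 0.0  # time at which the gpu can start its next tile
    dme_free = 0.0  # time at which the DMEs can start their next copy
    for _ in range(n_tiles):
        gpu_done = gpu_free + render_ms
        # A copy can start only once the tile is rendered and the DMEs are idle.
        dme_start = max(gpu_done, dme_free)
        dme_free = dme_start + copy_ms
        # ASSUMPTION: the gpu never stalls waiting for esram space.
        gpu_free = gpu_done
    return dme_free  # the frame is complete when the last copy lands in ddr3


if __name__ == "__main__":
    print(serial_ms(3, 4.0, 1.0))      # 15.0
    print(overlapped_ms(3, 4.0, 1.0))  # 13.0
```

With three tiles, the overlapped version hides all but the last copy behind rendering, so only one copy's worth of time is added on top of the render time.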
But, moving from DDR to esram might be useful if:
- You know in advance you are going to need this data;
- Between the moment you realize you need this data and the moment you actually use it, you have enough time (and free bandwidth) to move it;
- It's not very large, but reading it might require a lot of bandwidth.
If you meet all those criteria, then by the time the gpu needs this data it could read from both pools at once, effectively doubling the read bandwidth.
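The criteria above can be turned into a quick sanity check. This is a rough sketch, not a real scheduler: the function names and parameters are hypothetical, decimal units are assumed (MB divided by GB/s gives milliseconds), and the combined-read figure assumes the gpu can split its reads perfectly across both pools. The 68 GB/s value is the one quoted earlier; the esram bandwidth has to be supplied by the caller.

```python
DDR3_GBPS = 68.0  # figure quoted in the post


def transfer_ms(size_mb, gbps):
    """Time to move `size_mb` at `gbps` (decimal units: MB / (GB/s) == ms)."""
    return size_mb / gbps


def worth_prefetching(size_mb, lead_time_ms, spare_ddr3_gbps):
    """True if the copy can finish in the available lead time using only spare bandwidth."""
    if spare_ddr3_gbps <= 0:
        return False
    return transfer_ms(size_mb, spare_ddr3_gbps) <= lead_time_ms


def combined_read_ms(size_mb, esram_gbps, ddr3_gbps=DDR3_GBPS):
    """Best-case read time when the data lives in both pools and reads are split ideally."""
    return size_mb / (ddr3_gbps + esram_gbps)


if __name__ == "__main__":
    print(worth_prefetching(8.0, lead_time_ms=1.0, spare_ddr3_gbps=10.0))  # True
    print(combined_read_ms(68.0, esram_gbps=68.0))                         # 0.5
```

Note that the read bandwidth exactly doubles only if both pools deliver the same bandwidth; if the esram is faster than the ddr3, the gain over reading from esram alone is smaller than 2x.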
I just don't see much data that would be suitable... Perhaps a small texture/vertex buffer for things that are constantly on screen (like the gun in an FPS, or your character in a third-person game), but the ddr3 would probably be enough for these...