Well, lemme start by kicking off some math, since I haven't seen anyone work it out. You guys are free to tweak the variables as needed.
Let's assume: 4xAA, 32-bpp texels, 24-bit depth, 8-bit stencil. 720p (1280x720), 60 frames/second.
Let's also assume that we do a depth-only first pass with an average depth complexity (overdraw) of 2, then do a color write pass such that early-z / hier-z culls all non-visible pixels in that pass for free.
Let's also ignore the effects of tiling, and assume everything fits in the eDRAM (i.e. the eDRAM is infinitely large).
The initial Z passes will use up: 4 bytes * 1280 * 720 * 60 frames * 2 (avg complexity) * 4 samples = 1.65 GB/sec.
If AA compression is 90% efficient, bandwidth for Z drops to just 0.53 GB/sec.
The color passes would then need: 4 bytes * 1280 * 720 * 60 * 4 samples = 0.82 GB/sec (0.27 GB/sec with 90% effective AA compression).
Now let's assume you blend a lot, so AA compression drops to 80% (more intersections with blending) and the bandwidth requirements otherwise double (each blend reads and writes the color).
You end up needing 0.66 GB/sec for color.
Total eDRAM bandwidth so far: 2.32 GB/sec.
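For anyone who wants to tweak the variables, here is a rough Python sketch of the numbers so far. It rests on two assumptions that are not spelled out above: that "N% efficient" compression means N% of the pixels touch a single sample while the rest touch all four, and that a "GB" is 2**30 bytes. Both are guesses picked because they reproduce the figures above; the function and variable names are made up for illustration.

```python
# Inputs from the post: 720p, 60 fps, 4xAA, 4 bytes per sample.
WIDTH, HEIGHT, FPS, SAMPLES = 1280, 720, 60, 4
BYTES_PER_SAMPLE = 4   # 32-bit color, or 24-bit depth + 8-bit stencil
GIB = 2**30            # assumption: the post's "GB" means 2**30 bytes


def samples_per_pixel(compression, samples=SAMPLES):
    """Assumed model: a `compression` fraction of pixels touch one sample,
    the remaining pixels touch all `samples` samples."""
    return compression * 1 + (1 - compression) * samples


def bandwidth(samples_touched, passes=1):
    """Bandwidth in 2**30 bytes/sec for `passes` full-screen passes."""
    total_bytes = BYTES_PER_SAMPLE * WIDTH * HEIGHT * FPS * samples_touched * passes
    return total_bytes / GIB


z_raw = bandwidth(SAMPLES, passes=2)                        # depth complexity 2
z_compressed = bandwidth(samples_per_pixel(0.9), passes=2)
color_raw = bandwidth(SAMPLES)
color_compressed = bandwidth(samples_per_pixel(0.9))
color_blended = bandwidth(samples_per_pixel(0.8), passes=2)  # read + write

for name, value in [
    ("Z, uncompressed", z_raw),
    ("Z, 90% compression", z_compressed),
    ("color, uncompressed", color_raw),
    ("color, 90% compression", color_compressed),
    ("color, blended, 80% compression", color_blended),
]:
    print(f"{name}: {value:.2f} GB/sec")
```

Each value lands within rounding of the figures above. The 2.32 GB/sec running total appears to be the uncompressed Z figure plus the blended color figure; with compressed Z it would come out closer to 1.2 GB/sec.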
Now let's throw in tiling due to a limited eDRAM size, but ignore the additional vertex shader (VS) work: for each tile we need to read the eDRAM (10MB), then write it out to main memory. At 10MB * 3 tiles * 60 fps, that's 1.8 GB/sec for reads.
Total eDRAM bandwidth so far: 4.12 GB/sec.
The eDRAM has 256 GB/sec.
Even if we factor in inefficiency in the eDRAM implementation, and assume scenes are 10x more complex, the requirement only grows to ~40 GB/sec, which is a small fraction of the 256 GB/sec available.
So the eDRAM is unlikely to ever be the bottleneck.
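As a quick sanity check on the headroom, here's a minimal sketch that plugs in the figures above. It treats 1 GB as 1000 MB for the tile reads, which is an assumption; the variable names are just illustrative.

```python
EDRAM_SIZE_MB = 10       # eDRAM size from the post
TILES_PER_FRAME = 3
FPS = 60
RENDER_GB = 2.32         # running total from the post, before tiling
EDRAM_PEAK_GB = 256.0    # quoted eDRAM bandwidth
COMPLEXITY_FACTOR = 10   # "scenes are 10x more complex"

# Reading the whole eDRAM once per tile, every frame (1 GB = 1000 MB assumed).
tile_reads_gb = EDRAM_SIZE_MB * TILES_PER_FRAME * FPS / 1000
total_gb = RENDER_GB + tile_reads_gb
scaled_gb = total_gb * COMPLEXITY_FACTOR
utilization = scaled_gb / EDRAM_PEAK_GB

print(f"tile reads:        {tile_reads_gb:.2f} GB/sec")
print(f"total:             {total_gb:.2f} GB/sec")
print(f"10x more complex:  {scaled_gb:.1f} GB/sec")
print(f"share of 256 GB/s: {utilization:.0%}")
```

Scaling the tile reads by 10x along with everything else is pessimistic, since they depend on the eDRAM size and frame rate rather than on scene complexity, so the real share would be lower still.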
What about the external memory though?
The bandwidth there is 22.4 GB/sec.
If we read 1.8GB/sec from the eDRAM (to page out 3 tiles @ 60fps), then downsample, then write back colors to main memory, we end up with a needed local DRAM bandwidth of:
1280 * 720 * 4 bytes * 60 = 221 MB/sec (about 0.2 GB/sec).
Thus, there is 22.4 - 0.2 = 22.2 GB/sec of bandwidth left (99%) for texturing or other applications.
Assuming a memory efficiency of 85%, we get: 0.85 * 22.4 - 0.2 = 18.84 GB/sec (still 99% of the 19.04 GB/sec that is effectively available).
That number is independent of the scene complexity (assuming you still run at a constant 60fps).
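The main memory side can be sketched the same way. This assumes 1 GB = 10**9 bytes (with 2**30 the write traffic comes to about 0.21 instead of about 0.22, which doesn't change the conclusion), and the variable names are again only illustrative.

```python
WIDTH, HEIGHT, FPS = 1280, 720, 60
BYTES_PER_PIXEL = 4        # one downsampled 32-bit color per pixel
MAIN_MEMORY_GB = 22.4      # quoted external memory bandwidth
EFFICIENCY = 0.85          # assumed memory efficiency from the post

# Writing one resolved 720p color buffer per frame to main memory.
resolve_writes_gb = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS / 1e9

left_peak_gb = MAIN_MEMORY_GB - resolve_writes_gb
effective_gb = EFFICIENCY * MAIN_MEMORY_GB
left_effective_gb = effective_gb - resolve_writes_gb

print(f"resolve writes:       {resolve_writes_gb:.3f} GB/sec")
print(f"left at peak:         {left_peak_gb:.2f} GB/sec "
      f"({left_peak_gb / MAIN_MEMORY_GB:.0%})")
print(f"left at 85% efficiency: {left_effective_gb:.2f} GB/sec "
      f"({left_effective_gb / effective_gb:.0%})")
```

Note there's no overdraw or sample-count term anywhere in this one: only the resolved frame crosses to main memory, which is why the result doesn't move with scene complexity.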