If I understand what you're asking: how can you saturate a 224 GB/s pathway if your RAM is only 10GB? That's easy. You aren't reading the entirety of your RAM every frame; lots of data is just sitting there, used as needed. The main part of 3D rendering, at least when it comes to bandwidth, isn't about what you've got stored in RAM, it's writing and reading render targets. And that bus isn't 100% efficient.
And the bandwidth usage scales linearly with resolution and framerate. IIRC a 1080p framebuffer at 32-bit color requires the GPU to write about half a gig a second at 60fps, and a 32-bit Z-buffer on top roughly doubles that, before any overdraw. HDR, higher resolutions, and higher framerates are all going to multiply this. And that doesn't account for games that do funky things like rendering Z only before re-rendering the scene, or full-resolution render to texture, or whatever the requirements of raytracing are.
And this is all in contention with reading textures and other game assets, and on console there is also contention with what the CPU is doing. All of this generation's talk about loading assets from SSD just in time is going to play a role here, too.
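To put rough numbers on the framebuffer claim, here is a minimal sketch of the arithmetic. It assumes every pixel is written exactly once per frame (no overdraw, no compression, no reads), which is a floor rather than a realistic figure; the function name and the `passes` parameter are made up for illustration.

```python
def render_target_bandwidth(width, height, bytes_per_pixel, fps, passes=1):
    """Bytes per second needed to write every pixel `passes` times per frame.

    Assumption: no overdraw, compression, or read traffic is modeled.
    """
    return width * height * bytes_per_pixel * fps * passes


# 32-bit color only at 1080p60: 4 bytes per pixel
color_only = render_target_bandwidth(1920, 1080, 4, 60)

# 32-bit color plus 32-bit Z at 1080p60: 8 bytes per pixel
color_and_z = render_target_bandwidth(1920, 1080, 8, 60)

# Same 8 bytes per pixel at 4K60: four times the pixels
uhd = render_target_bandwidth(3840, 2160, 8, 60)

print(f"1080p60 color only: {color_only / 1e9:.2f} GB/s")   # 0.50 GB/s
print(f"1080p60 color + Z:  {color_and_z / 1e9:.2f} GB/s")  # 1.00 GB/s
print(f"4K60 color + Z:     {uhd / 1e9:.2f} GB/s")          # 3.98 GB/s
```

That's only the single-write floor. Multiply by overdraw, extra passes, and the reads that blending and depth testing need, and the numbers climb quickly.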
Random fact: the Xbox 360's eDRAM was only 10MB and it had 256 GB/s of bandwidth, exclusive to the GPU. Roughly 1000 times smaller, and it still had more bandwidth.
I feel like this is falling on deaf ears.
If there were no impact and doing this was fine, we'd see a lot more GTX 550-style situations, and Microsoft would have no reason to spell this out at all.
If this were fine, there would simply be four banks at 56GB/s each and we would just add them up. The fact that they didn't is telling.
Hypothetical situation:
Four chips: A, B, C, D.
A is 4GB, the other three are 2GB each. Say each chip can deliver 1GB per unit of time.
Situation A: I put 1.25GB in each of A, B, C, and D.
That's 5GB of data I'm trying to read across 4 chips.
Time used would be 1.25 units, and my effective bandwidth is 4GB per unit of time.
Situation B: I put 2GB of data in A, and 1GB in each of B, C, and D.
That's ALSO 5GB of data I'm trying to read across 4 chips.
Time used would be 2 units, and my effective bandwidth is 2.5GB per unit of time.
My effective bandwidth here is 62.5% of what I would normally have in situation A.
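The two situations can be checked with a quick sketch. It assumes each chip delivers 1GB per unit of time and that all chips read in parallel, so the read finishes when the most heavily loaded chip finishes; the function name is hypothetical.

```python
def effective_bandwidth(data_per_chip_gb, chip_rate_gb_per_unit=1.0):
    """Return (time_units, effective GB per unit) for a parallel read.

    Assumption: every chip has the same rate and the chips run fully in
    parallel, so the slowest (most loaded) chip sets the total time.
    """
    total_gb = sum(data_per_chip_gb)
    time_units = max(data_per_chip_gb) / chip_rate_gb_per_unit
    return time_units, total_gb / time_units


# Situation A: 5GB spread evenly over chips A, B, C, D
time_a, bw_a = effective_bandwidth([1.25, 1.25, 1.25, 1.25])

# Situation B: the same 5GB, but 2GB sits on chip A
time_b, bw_b = effective_bandwidth([2.0, 1.0, 1.0, 1.0])

print(time_a, bw_a)   # 1.25 4.0
print(time_b, bw_b)   # 2.0 2.5
print(bw_b / bw_a)    # 0.625
```

Same amount of data, same chips, and the lopsided layout loses more than a third of the bandwidth because B, C, and D sit idle while A finishes.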
Of course you can say that I could find work for the other chips to do while I read A, but that's, again, explicit planning you'd have to do.
Of course the best solution is to avoid the extra 2GB at all costs.
Any use of the 2GB at the same time as the 8GB will lead to bandwidth contention, because physically the channels for chips B, C, and D can't access chip A, and the unbalanced workload across the four chips creates extra inefficiency on top of the usual situation.