The 176GB/s is a theoretical figure, not a real-world one; real-world throughput is lower. There's an interesting article here:
http://archive.arstechnica.com/paedia/b/bandwidth-latency/bandwidth-latency-1.html
It explains a lot, though despite the author's best efforts, it's not for the faint-hearted!
MS engineers have stated that, while monitoring real games running, they measured a real-world 150GB/s from ESRAM and 50GB/s from DDR3. The two can be added together, so the X1 has been measured running real code at around 200GB/s. The relevant quotes were posted here a page or so ago.
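To spell out the arithmetic, here's a quick sketch in Python. The names are just mine for illustration, it takes the quoted figures as 150GB/s and 50GB/s, and it assumes both pools can be accessed at the same time with no loss, which is what the engineers' statement implies rather than something I can verify.

```python
# Figures as quoted in this thread (GB/s); not independent measurements.
ESRAM_MEASURED_GBPS = 150
DDR3_MEASURED_GBPS = 50


def combined_bandwidth(*pools_gbps):
    """Sum the bandwidth of memory pools assumed to be accessed concurrently."""
    return sum(pools_gbps)


total = combined_bandwidth(ESRAM_MEASURED_GBPS, DDR3_MEASURED_GBPS)
print(f"Combined measured bandwidth: {total} GB/s")
```

Any contention between the two pools would pull the real total below this simple sum.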
The rest of your points are fair enough; we don't know all the ins and outs, but it would take a proper curveball to throw these figures right out.
I am interested in your comment about thousands of GPU threads vying with the CPU for the bandwidth, because to my mind the more that happens, the more contention you get, which will not be good for overall bandwidth utilisation or for reducing CPU stalls.