Yes, but I think the max speed would be 102.4 GB per second: 256 x 3.2 / 8 = 102.4. However, in the press release for DDR4, it was mentioned that DDR4 will initially be only 2.133 Gbps per pin. That would put the bandwidth of a 256-bit bus around 68 GB/sec, which is actually really close to a 7770.
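For anyone who wants to check the math, here's a quick sketch of the calculation: bus width in bits times the per-pin data rate in Gbps, divided by 8 to get bytes. The function name is just something I made up for illustration, and this is theoretical peak only, so real-world throughput would be lower.

```python
def peak_bandwidth_gb_per_s(bus_width_bits, rate_gbps_per_pin):
    """Theoretical peak bandwidth in GB/s (hypothetical helper, peak only)."""
    # bits per second across the whole bus, divided by 8 bits per byte
    return bus_width_bits * rate_gbps_per_pin / 8


# 256-bit bus at the top DDR4 rate of 3.2 Gbps per pin
top_speed = peak_bandwidth_gb_per_s(256, 3.2)

# 256-bit bus at the initial DDR4 rate of 2.133 Gbps per pin
launch_speed = peak_bandwidth_gb_per_s(256, 2.133)

print(f"{top_speed:.1f} GB/s at 3.2 Gbps")
print(f"{launch_speed:.1f} GB/s at 2.133 Gbps")
```

So the top-speed figure works out to 102.4 GB/s and the launch-speed figure to roughly 68 GB/s.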
To your second point, I think that's how MS will try to reduce cost over the long term. Stacking may not be ready next year, but in 3-5 years, I bet they could use it to reduce the cost of the package. In theory, you could stack those 16 chips into just two 8-module stacks on one 32-bit physical bus. I doubt they'll be able to achieve that, but there's probably something in between that will happen.