Well, I hadn't really been paying attention to the rest of the thread. What I meant was that Nehalem and Sandy Bridge seem to have nailed the cache configuration: 32 KB L1 data and instruction caches, 256 KB L2, 2 MB/core L3. This seems to be the optimal setup. If they do change it though, I have no doubt it will be for the best.

Haswell is strongly assumed to support two 256-bit loads per cycle. They could use eight 16-byte cache banks, or sixteen 8-byte banks, or stick with eight 8-byte banks. Note that the first two options likely require doubling the cache line length, since in both cases the banks together span 128 bytes rather than today's 64. So I wouldn't be surprised if Haswell did have 128-byte cache lines.
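To put rough numbers on the three banking options, here is a small Python sketch. It assumes a simplified model that is mine, not anything Intel has published: the banks are interleaved so that one pass over all of them covers `banks * bank_bytes` consecutive bytes, the current line size is 64 bytes, and each aligned 256-bit load occupies `32 / bank_bytes` banks. The names `bank_span` and `summarize` are just for illustration.

```python
import math

LINE_BYTES = 64           # cache line size on Nehalem / Sandy Bridge
LOAD_BYTES = 256 // 8     # one 256-bit load is 32 bytes
LOADS_PER_CYCLE = 2       # the rumored Haswell load rate (assumption)

# The three options from the post: (number of banks, bytes per bank).
OPTIONS = {
    "eight 16-byte banks": (8, 16),
    "sixteen 8-byte banks": (16, 8),
    "eight 8-byte banks": (8, 8),
}

def bank_span(banks, bank_bytes):
    """Bytes covered by one pass over all banks (simplified interleaving model)."""
    return banks * bank_bytes

def summarize(options):
    """Return (name, span in bytes, span in 64-byte lines, banks busy per cycle, total banks)."""
    rows = []
    for name, (banks, bank_bytes) in options.items():
        span = bank_span(banks, bank_bytes)
        banks_per_load = math.ceil(LOAD_BYTES / bank_bytes)
        busy = LOADS_PER_CYCLE * banks_per_load
        rows.append((name, span, span // LINE_BYTES, busy, banks))
    return rows

for name, span, lines, busy, banks in summarize(OPTIONS):
    print(f"{name}: spans {span} bytes ({lines} x 64-byte line), "
          f"{busy} of {banks} banks busy with two 256-bit loads")
```

Under this model the first two options span 128 bytes and leave half the banks free when two loads are in flight, while eight 8-byte banks span 64 bytes but need every bank for two aligned 256-bit loads, so any overlap in bank selection would be a conflict. Real hardware may well organize the banks differently.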
We'd have to evaluate the advantages/disadvantages of each option to see which one is most likely...