Every single card, from the lowest low end to the highest high end, is limited by bandwidth in all but corner cases.
At ATI I did performance analysis (albeit a while ago). Not simple benchmarking, but probing the data the chip gathers about where stalls occur and so on, just like what the XB360 lets devs see.
I guarantee you that you are very wrong in this statement. In fact, you can very easily test it out by down-clocking memory and/or overclocking the core of any chip. If you were right, you'd get 1:1 scaling with the former and no scaling with the latter.
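To make that test concrete, here's a toy sketch of the reasoning. The max() bottleneck model and all cycle counts below are invented purely for illustration; real chips mix the two limits from draw call to draw call.

```python
# Toy bottleneck model: frame time is whichever of the core-bound time or the
# memory-bound time is larger. The cycle counts are made up for illustration.

def frame_time_ms(core_mhz, mem_mhz, core_cycles=1.0e6, mem_cycles=0.8e6):
    core_ms = core_cycles / core_mhz / 1000.0  # time the shading/raster side needs
    mem_ms  = mem_cycles  / mem_mhz  / 1000.0  # time the memory system needs
    return max(core_ms, mem_ms)

base     = frame_time_ms(500, 500)  # stock clocks
mem_down = frame_time_ms(500, 400)  # memory downclocked 20%
core_up  = frame_time_ms(600, 500)  # core overclocked 20%

print("mem  -20%%: %+.1f%% frame time" % (100 * (mem_down / base - 1)))  # +0.0%
print("core +20%%: %+.1f%% frame time" % (100 * (core_up  / base - 1)))  # -16.7%

# If a card really were bandwidth-limited in all but corner cases, the first
# number would be close to +25% (1:1 with memory clock) and the second close
# to 0%. The clock experiment makes that easy to check on real hardware.
```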
Granted, doing things like halving the per-clock bandwidth of RSX compared to G71 will make a substantial difference in the percentage of the workload that's BW-limited, but it still won't be "limited by bandwidth in all but corner cases."
The number of pads on the die is what enables the wide buses. It does not drive cost down. Boutique memory on many-layer PCBs is what defines cost at the high end. And it is not getting any cheaper!
That's exactly my point. There wasn't enough room on a GF2/3/4 for a 256-bit bus without increasing die size, but there was on later chips. High-speed memory is expensive, but ATI and NVidia aren't putting it on their high-end cards to maximize the performance/cost ratio at launch, and it loses its "boutique" status quite quickly anyway. Moreover, much of the board complexity and chip pin count comes from power delivery. The incremental cost of going to 256-bit is not that big on the PC, especially for the high end.
A perfect example is how the 7600GT is notably faster than the 6800U despite having much lower bandwidth. I'm very sure that a 20-pipe 6800U with a 128-bit bus would be faster than the existing 6800U. I doubt it would be cheaper, though: eight 32MB chips would barely be more expensive than four 64MB chips, and an extra PCB layer is pocket change for a high-end card.
Bandwidth is what defines price points on GPUs today. You build a memory system and slap a chunk of silicon on it that will saturate it. Look how similarly X1800s and X1900s are priced, yet the X1900 has three times the shading power.
First of all, the X1900s have only 20% more silicon than the X1800s (despite 3x the arithmetic shading power), so your logic is severely flawed. Secondly, there are plenty of examples that prove you wrong. The GeForce FX5800 cost about the same as the FX5900. The 256-bit 6800GS cost the same as the 128-bit 7600GT for a while.
On consoles you're right, because a wide bus hampers cost scaling down the road. Available memory, however, is a huge cost as well. I think MS said going from 256MB to 512MB will cost them $1B, so you don't want to waste a huge chunk of that on a tile buffer.
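For a sense of scale on that "huge chunk": the exact layout of the buffer being argued about below isn't spelled out, so the color/Z depths and sample count here are assumptions for illustration only.

```python
# Rough memory footprint of a large multisampled buffer, as a share of a 512MB
# console pool. Color/Z bit depths and sample count are assumed, not quoted.

def buffer_mb(width, height, color_bpp, z_bpp, samples):
    bits = width * height * samples * (color_bpp + z_bpp)
    return bits / 8.0 / (1024 * 1024)

size = buffer_mb(1280, 720, 128, 32, 8)  # 720p, 128-bpp color + 32-bpp Z, 8xMSAA
print("%.0f MB = %.0f%% of 512 MB" % (size, 100 * size / 512))
# -> about 141 MB, i.e. more than a quarter of the machine's entire memory.
```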
It's only whack because of the massive performance penalty associated with it due to bandwidth limitations. It's an IMR mindset.
Cheers
No, it's whack because it's totally unnecessary. You can see the difference between 1M polys and 10M polys a LOT more than you can see the difference between 32-bpp and 128-bpp, or between 4xMSAA and 8xMSAA. Your proposed framebuffer is 5x larger for next to no benefit, simply to make a TBDR look better. It's 20x larger than what most games use today (2xAA, 720p), yet your poly count is the same.
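For reference, here's roughly where a figure like the 5x comes from, under assumed bit depths (128-bpp color + 32-bpp Z at 8xMSAA for the proposal, 32-bpp color + 32-bpp Z otherwise; the posts above don't spell these out).

```python
# Per-pixel framebuffer storage for a few configurations. Bit depths are
# assumptions; resolution cancels out when comparing at the same resolution.

def bits_per_pixel(color_bpp, z_bpp, samples):
    return (color_bpp + z_bpp) * samples

proposed = bits_per_pixel(128, 32, 8)  # 1280 bits/pixel
typical  = bits_per_pixel(32, 32, 4)   #  256 bits/pixel (32-bpp, 4xMSAA)
today    = bits_per_pixel(32, 32, 2)   #  128 bits/pixel (32-bpp, 2xAA)

print(proposed / typical)  # 5.0  -- the "5x larger" figure
print(proposed / today)    # 10.0 -- per pixel; the quoted 20x presumably also
                           #         folds in assumptions not reproduced here
```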