Well, even taking into account that the die needs to be a certain size to physically accommodate the pads needed for a 512-bit bus, consider the following:
An RV670 GPU is only 192mm^2 in terms of die size, yet it handles a 256-bit memory bus just fine, and the original R600 GPU, on the 80nm fab process, was just over the 420mm^2 mark in terms of die size, yet it has a 512-bit memory bus anyhow.
So by that token, we know that you don't need a 500mm^2+ die, like the one the current 65nm GT200 GPU uses, to accommodate a 512-bit memory bus. Even once the 55nm revision is released, it should still yield a GPU die over the 400mm^2 mark, so given the original R600 example, there's still enough room in there to keep the 512-bit memory bus.
The only time a serious change might have to be made is once the 40nm fab process is ready to handle a 1.4 billion transistor chip, which would drop the overall die size to right around the 300mm^2 mark, and that's when an interesting possibility comes to mind...
What about the possibility of using a 384-bit bus on a GPU die roughly 300mm^2 in size, and combining that with GDDR5? We know that a sub-200mm^2 die can handle a 256-bit bus (RV670), while a die just over the 400mm^2 mark can handle a 512-bit bus (original R600), and Nvidia has already released cards with a 384-bit memory bus (the 8800 GTX cards).
Overall memory bandwidth would be 50% higher than with a 256-bit bus, assuming memory modules clocked at the same speed in both cases. I'm sure there's going to be even faster-clocked GDDR5 by next year, and with a 384-bit bus, it's pretty much a given that local memory bandwidth will go well past 200GB/sec quite easily.
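For anyone who wants to check the math, here's a quick sketch of the usual peak-bandwidth formula (bus width in bits, times per-pin data rate, divided by 8). The 4.0 Gbps per-pin figure is just a number I picked for illustration, not the spec of any actual GDDR5 part, and the function names are my own.

```python
def bandwidth_gb_per_s(bus_width_bits: int, data_rate_gbps_per_pin: float) -> float:
    """Peak theoretical bandwidth in GB/s: bus width (bits) x per-pin rate (Gbit/s) / 8."""
    return bus_width_bits * data_rate_gbps_per_pin / 8


def required_data_rate_gbps(bus_width_bits: int, target_gb_per_s: float) -> float:
    """Per-pin data rate (Gbit/s) needed to reach a target bandwidth on a given bus."""
    return target_gb_per_s * 8 / bus_width_bits


# Assumption: 4.0 Gbps per pin is an illustrative value only.
rate = 4.0

bw_256 = bandwidth_gb_per_s(256, rate)
bw_384 = bandwidth_gb_per_s(384, rate)
ratio = bw_384 / bw_256  # same memory speed, so this depends only on bus width

# Per-pin rate a 384-bit bus would need to hit 200 GB/s.
needed = required_data_rate_gbps(384, 200.0)

print(f"256-bit: {bw_256:.1f} GB/s")
print(f"384-bit: {bw_384:.1f} GB/s ({(ratio - 1) * 100:.0f}% higher)")
print(f"200 GB/s on 384-bit needs about {needed:.2f} Gbps per pin")
```

So at the same memory speed the 384-bit bus always comes out 50% ahead of the 256-bit one, and crossing 200GB/sec on a 384-bit bus takes a bit over 4.1 Gbps per pin. These are peak theoretical numbers, real-world throughput will be lower.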