I think a die size increase of some sort for this top-end part was likely anyway, while staying on (for the most part) the same node. My hypothesis is that if the rumors are true, they've scaled back some of the SRAM allocated to on-die cache, using that area for other things and leaning on the external memory bus to do the heavy lifting, Ampere-style. Maybe GDDR7 is cheap and power-efficient enough that this makes sense this generation, or maybe they're pushing close to the reticle limit and there simply wasn't room for a giant on-chip cache (on GB202, anyway), especially given how poorly SRAM scales on cutting-edge processes.
A 512-bit bus kind of necessitates a giant chip, if for no other reason than you need the space around the edges of the die to fit the 16 memory controller channels (32 bits each).
AD102 at 609mm^2 on N4 doesn't look like it has space in the floorplan for 4 more memory channels, even if you omit the NVLink bits entirely.
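The channel math behind that, as a back-of-envelope sketch (assuming 32-bit GDDR channels as on current GeForce parts; the function name is just illustrative):

```python
# How many memory controller PHYs need edge space on the die,
# assuming the bus is built from 32-bit GDDR channels.
def channels(bus_width_bits, channel_width_bits=32):
    return bus_width_bits // channel_width_bits

print(channels(384))  # AD102 (4090): 12 channels
print(channels(512))  # rumored GB202 (5090): 16 channels, i.e. 4 more PHYs to squeeze in
```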
And some have speculated that the 5090 is being primed as an AI GPU for non-datacenter applications. The 4090 had success in that market, and the 5090 looks like it's catering strongly to it. If they want to sell this thing for $2500-3000, they probably can.
They could then perhaps offer a cut-down 5080 Ti with a 448-bit bus and 28GB for something like $1500 six months from now.
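Those capacity figures line up if you assume one 2GB (16Gbit) GDDR7 module per 32-bit channel; a quick sanity check under that assumption (function name is again just for illustration):

```python
# VRAM capacity from bus width, assuming one 2GB (16Gbit) GDDR7
# module per 32-bit channel.
def vram_gb(bus_width_bits, module_gb=2, channel_width_bits=32):
    return (bus_width_bits // channel_width_bits) * module_gb

print(vram_gb(512))  # 32 GB -> rumored 5090 config
print(vram_gb(448))  # 28 GB -> hypothetical cut-down 5080 Ti
```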