Fiji is 596mm2 (taken from the AnandTech review)... But at 28nm at least, both Nvidia and AMD have hit TSMC's reticle limit. I don't know what the limit is on 14-16nm. (But will they really want to push for the biggest possible chip in the first series on a new process? ... I don't think so...)
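For scale, a quick back-of-envelope in Python, assuming the commonly quoted ~26mm x 33mm single-exposure reticle field (the field size and GM200's die size are my assumptions, not from the review):

# How close the big 28 nm GPUs sit to a single exposure, assuming the
# commonly quoted 26 x 33 mm reticle field (an assumption, not from the review).
field_mm2 = 26.0 * 33.0  # ~858 mm^2 optical field per exposure
for name, die_mm2 in [("Fiji", 596.0), ("GM200", 601.0)]:
    print(f"{name}: {die_mm2:.0f} mm^2, {die_mm2 / field_mm2:.0%} of the field")
# Foundries' practical die-size ceilings at a given node are tighter than
# the raw optical field, which is presumably why both vendors stopped near 600 mm^2.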
I'm not sure the interposer size is a limiting factor there.
For the purposes of manufacturing the GPUs, the limit is an optical one related to the equipment. It's not going to change much.
The interposer for Fiji did a few things to allow the whole assembly to exceed the reticle limit. The interposer itself is larger than the reticle field, but the patterned area subject to the optical limit does not cover the whole interposer. The GPU runs right up to the limit in one dimension, and the HBM stacks partially extend past the edge of the patterned region, taking some spacing pressure off of the GPU.
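Putting rough numbers on that (the ~1011mm2 interposer figure is what reviews reported for Fiji, and the exposure field is the same assumed 26mm x 33mm as above; neither number is from this thread):

# Sketch of how Fiji's assembly can exceed the patterned area.
exposure_field_mm2 = 26.0 * 33.0  # ~858 mm^2 patterned per exposure (assumption)
interposer_total_mm2 = 1011.0     # Fiji interposer area as reported in reviews (assumption)
overhang_mm2 = interposer_total_mm2 - exposure_field_mm2
print(f"Interposer area outside one exposure field: ~{overhang_mm2:.0f} mm^2")
# The HBM stacks sitting partly past the edge of the patterned region is
# what lets the assembly use that extra area without another exposure.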
There are ways to expand the area of the interposer, with varying degrees of complication and risk.
Some of TSMC's planned products for 2.5D integration might expand the limit, and others may do the same as they get past the early implementations of the concept.
Reticle size should stay the same for all intents and purposes. But I haven't read anywhere why HBM2 should be significantly bigger; quite the contrary, it's within the same JEDEC spec, not even officially named HBM2.
What was called HBM2 is a separate revision of the spec, and its package is defined to be larger than the early version of the memory used by Fiji.
http://www.anandtech.com/show/9969/jedec-publishes-hbm2-specification
HBM2 memory stacks are not only faster and more capacious than HBM1 KGSDs, but they are also larger. SK Hynix’s HBM1 package has dimensions of 5.48 mm × 7.29 mm (39.94 mm2). The company’s HBM2 chip will have dimensions of 7.75 mm × 11.87 mm (91.99 mm2). Besides, HBM2 stacks will also be higher (0.695 mm/0.72 mm/0.745 mm vs. 0.49 mm) than HBM1 KGSDs, which may require developers of ASICs (e.g., GPUs) to install a heat-spreader on their SiPs to compensate for any differences in height between the memory stacks and GPU die, to protect the DRAM, and to guarantee sufficient cooling for high bandwidth memory.
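Running the numbers from that quote, the footprint growth per stack is substantial. A quick check (the four-stack count is just Fiji's configuration, used here for illustration; HBM2 designs may use fewer, larger stacks):

# Footprint comparison straight from the quoted SK Hynix package figures.
hbm1_mm2 = 5.48 * 7.29   # ~39.95 mm^2 per HBM1 stack
hbm2_mm2 = 7.75 * 11.87  # ~91.99 mm^2 per HBM2 stack
stacks = 4               # Fiji's layout, purely for illustration
print(f"Per stack: {hbm1_mm2:.2f} -> {hbm2_mm2:.2f} mm^2 ({hbm2_mm2 / hbm1_mm2:.1f}x)")
print(f"{stacks} stacks: {stacks * hbm1_mm2:.0f} -> {stacks * hbm2_mm2:.0f} mm^2")

That works out to roughly 2.3x the area per stack, so a Fiji-style four-stack layout would go from about 160mm2 to about 368mm2 of memory footprint on the interposer.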