So these are standard stacks with standard base layers, themselves stacked on a die that sits on the interposer, rather than some kind of custom, larger base die.

As the separate stacks would sit in very close proximity on the same logic die, I wouldn't expect much of a timing difference.
It is physically a different chip, in a standard that does not promise that independent channels are necessarily in sync. The link die could impose some additional synchronization between stacks that operate as if they are alone, for instance in a possible corner case where refresh timings shift between halves of the same channel.

And the signals from the GPU have to run through the PHYs on the base die anyway. Routing the data and address lines to the TSV contacts of the second stack, just a few millimeters (the size of a 2 Gbit die) away, probably costs not much more time than the signals would need to travel inside a larger 4 Gbit die when you address another bank there.
The standard shouldn't care, but the memory controller would stand to benefit from being aware of possible differences in things like bank activation limits.
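
To make the bank activation point a bit more concrete, here is a rough Python sketch of a controller keeping a separate rolling activate window per stack, in the spirit of a tFAW-style limit. Everything in it is an assumption for illustration: the `ActivationWindow` class, the stack names, and the numbers (4 activates per 30 ns) are made up and not taken from any datasheet or from the standard.

```python
from collections import deque


class ActivationWindow:
    """Rolling activate limit for one stack (hypothetical tFAW-style rule)."""

    def __init__(self, max_activates, window_ns):
        self.max_activates = max_activates  # activates allowed per window
        self.window_ns = window_ns          # window length in ns (assumed value)
        self._recent = deque()              # issue times of recent activates

    def _prune(self, t_ns):
        # Forget activates that have aged out of the window ending at t_ns.
        while self._recent and t_ns - self._recent[0] >= self.window_ns:
            self._recent.popleft()

    def activate(self, now_ns):
        """Record an activate requested at now_ns; return when it can issue."""
        self._prune(now_ns)
        issue = now_ns
        if len(self._recent) >= self.max_activates:
            # Window is full: wait until the oldest activate ages out.
            issue = self._recent[0] + self.window_ns
            self._prune(issue)
        self._recent.append(issue)
        return issue


# One window per stack, so a burst on one stack does not stall the other.
windows = {
    "stack_a": ActivationWindow(max_activates=4, window_ns=30.0),
    "stack_b": ActivationWindow(max_activates=4, window_ns=30.0),
}

burst_a = [windows["stack_a"].activate(t) for t in (0.0, 1.0, 2.0, 3.0, 4.0)]
first_b = windows["stack_b"].activate(4.0)
```

With these made-up numbers, the fifth activate on `stack_a` gets pushed out to 30 ns, while `stack_b` can still issue at 4 ns. A controller that treated both stacks as one device with a single shared window would stall the second stack for no reason, and one that ignored the limits entirely would overrun the first.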