The 5700 has the same butterfly design, which proves Navi was made by Sony for BC (/s), or maybe it was just the most efficient way to route the data paths. The Pro was launched quite early; Cerny said it was a simple solution, but that doesn't mean they couldn't do something else with more effort.
Aren't all AMD GPUs in the 36+ CU range "butterfly" like the 5700, with shader engines and CU arrays distributed symmetrically to either side of the command processor and front end logic?
The maximum CU count with two shader engines is 32, and once a design needs additional shader engines they're placed on the other side.
Even if it were somewhat related, nothing logically prevents them from doing a three-panel design (18, 36, 54 CUs) that powers down sections the same way; the whole point on the Pro was to switch between 18 and 36.
With GCN, the GPU is structured to load balance by dividing screen space into tiles that an SE and its RBEs have exclusive responsibility for. The logic for how the SEs and RBEs discard or route geometry seems to be counting on a straightforward set of rules for determining the coverage of a triangle.
The Vega ISA doc actually has an instruction that explicitly recognizes the differing behavior of GPUs with 1, 2, and 4 SEs, based on the bounding box of a given triangle (possibly for the primitive shaders that were never released).
1 SE is trivial, 2 SEs can be handled in software, and the 4 SE case had a lookup table and a few arithmetic limits built into the instruction.
Is the math straightforward and distributed evenly across screen space for higher counts, or for odd numbers?
AMD's method skipped over 3.
(edit: corrected truncated sentence above)
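To make the bounding-box idea concrete, here is a minimal sketch of how a triangle's bounding box could be turned into a mask of the shader engines that need to see it. The 32-pixel tile size, the checkerboard and 2x2 interleave patterns, and the names `se_for_tile` and `se_mask_for_bbox` are all my own assumptions for illustration; this is not AMD's actual mapping or the behavior of the Vega instruction.

```python
def se_for_tile(tx, ty, num_se):
    """Return which SE owns screen tile (tx, ty).

    Hypothetical interleave patterns, chosen only for illustration.
    """
    if num_se == 1:
        return 0                          # trivial: one SE owns everything
    if num_se == 2:
        return (tx + ty) % 2              # assumed checkerboard
    if num_se == 4:
        return (tx % 2) + 2 * (ty % 2)    # assumed repeating 2x2 block
    if num_se == 3:
        # Purely hypothetical: an odd count cannot use simple bit tests,
        # so this falls back to a modulo over diagonal stripes.
        return (tx + ty) % 3
    raise ValueError("unsupported SE count in this sketch")


def se_mask_for_bbox(x0, y0, x1, y1, num_se, tile=32):
    """Bitmask of SEs whose tiles overlap an inclusive pixel bounding box."""
    full = (1 << num_se) - 1
    mask = 0
    for ty in range(y0 // tile, y1 // tile + 1):
        for tx in range(x0 // tile, x1 // tile + 1):
            mask |= 1 << se_for_tile(tx, ty, num_se)
            if mask == full:              # every SE already covered
                return mask
    return mask


if __name__ == "__main__":
    # A small triangle inside one tile only needs one SE.
    print(bin(se_mask_for_bbox(0, 0, 10, 10, num_se=4)))
    # A box spanning a 2x2 group of tiles touches all four SEs.
    print(bin(se_mask_for_bbox(0, 0, 40, 40, num_se=4)))
```

With power-of-two counts the owner of a tile falls out of a couple of bit tests, which fits the idea of a small lookup table. Under the assumed 3 SE pattern the same question needs a modulo, which is one way to picture why odd counts might be less attractive.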
Why didn't they do likewise on 4Pro, but keep more CUs active in "Boost mode" so the software gets a bigger boost even when not patched? That's what Microsoft did on OneX, so all older unpatched games run at 3TF from the start. Games built with a newer SDK then get to use the full 6TF even without explicitly targeting OneX.
In some ways, I think the PS4 Pro technically could have, since if we think of it in terms of what the hardware can do, there are inactive CUs in addition to what is active.
From the point of view of the hardware, it could have happened if it weren't for outside considerations such as yield recovery.
Die size and limited bandwidth might have capped the Pro's CU count, so that going beyond 36 wasn't worth bothering with.
From that standpoint, the CU count was artificially lowered even with the original PS4. The hardware or platform decided what the software could see. I think there were examples of AMD GPUs reflashed from non-XT to XT BIOS versions that enabled CUs--albeit at the risk of bringing in whatever issue disabled the CUs in the first place.
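For reference, the TF figures quoted earlier in the thread follow from active CU count and clock. This is a rough sketch using the usual GCN peak formula (64 lanes per CU, 2 FLOPs per lane per clock for an FMA) and the publicly listed GPU clocks. Treating the 3TF case as exactly half of the OneX's CUs is my assumption, and `peak_tflops` is just an illustrative helper name.

```python
def peak_tflops(active_cus, clock_mhz, lanes_per_cu=64, flops_per_lane=2):
    """Peak FP32 throughput in TFLOPS for a GCN-style GPU."""
    flops = active_cus * lanes_per_cu * flops_per_lane * clock_mhz * 1e6
    return flops / 1e12


if __name__ == "__main__":
    configs = [
        ("PS4 (18 CUs @ 800 MHz)", 18, 800),           # about 1.84
        ("PS4 Pro (36 CUs @ 911 MHz)", 36, 911),       # about 4.20
        ("OneX (40 CUs @ 1172 MHz)", 40, 1172),        # about 6.00
        ("OneX, half the CUs (assumed)", 20, 1172),    # about 3.00
    ]
    for name, cus, mhz in configs:
        print(f"{name}: {peak_tflops(cus, mhz):.2f} TF")
```

The point is only that what the software sees scales linearly with however many CUs the platform chooses to expose at a given clock.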