For me this is the greatest question why? If the silicon or logic was broken they had enough time to test it with Vega 10 and change it in Vega 20. So the hole situation is strange.
The reasons for the shift were to my knowledge not given by AMD.
Implementation bugs may not be the only reasons why the feature was delayed or cancelled. The original direction was that the new path would have been transparent to programmers, being managed by the hardware and driver.
A late abandonment of that direction may have left any API design effort in a position where resources were allocated elsewhere, or even if they were freely available the time it would have taken to pull together a robust API would have taken too long.
Also, not every flaw or weakness needs to be a bug that can be fixed with a correction to some logic.
A design might make decisions that in hindsight hindered the concept. If the next generation were to make fundamental changes to how the design worked, pouring resources into a design that is being replaced might hinder the new direction that learned from the lessons of the initial concept. If the time it took to get an API for Vega's methods would have dragged into Navi's launch period, it may have deprecated to make way.
There have been silicon designs with early versions of functionality that were never exposed to the general market. Intel's Williamette had the first version of hyperthreading, but it would not be exposed until the second-gen Northwood.
Possibly, elements of NGG were disclosed on a more aggressive schedule than could be sustained, and it would take a more fleshed-out version of the technology to be considered acceptable.