Radeon 7970 was quite popular. I still use one frequently for testing. 7970 GE is very close to RX 470 in compute performance. Geometry performance of course is much worse, but our game doesn't render triangles. I wonder whether Vulkan async compute still works on GCN 1.0. AMD itself recommends using just one compute queue (on PC), so GCN 1.0 queue count isn't the limiting factor here. IIRC somebody on the B3D forums said that GCN 1.0 can't run the same microcode for ACEs as there's not enough room. Maybe there are load-balancing issues or some other bugs, and the new code just doesn't fit in GCN 1.0. This is very unfortunate if true, since 7970 is still widely used. Async compute would have extended its lifetime a bit, especially in compute-heavy games like ours.
The capacity constraint for microcode I'm thinking of came up in the context of allowing the microcode engines to support the standard command packet types, HWS, and the AQL packets for HSA at the same time.
At least the standard compute path used by the synthetic benchmark from this thread wouldn't purposely involve those extra sets of microcode, since that's beyond what the program can control, and the benchmark itself hasn't been changed.
If the exact same functionality has now been lost, the cause might be a change to the driver's threshold for serialization, a deliberate choice to back off on AC for 1.0, or a bug.
There are features that AMD has introduced that may not filter back to anything older than Sea Islands.
The front-end hardware for GCN 1.0 doesn't seem to have the foundation shared by the next revisions, both in its ability to be updated and in the hardware features that govern how the microcode engines interact with the GPU back end. The underlying front-end hardware for GCN 1.0 might have had more of a shared basis with Northern Islands and its introduction of compute.
From posts in the following thread, it seems that going forward AMD's overall compute platform is based on 1.1 and higher.
https://www.phoronix.com/forums/for...te-1-3-platform-brings-polaris-other-features
I wouldn't think that would necessitate scrapping 1.0 support for basic AC. Perhaps there are details of how the stack communicates with the hardware that explain why this gets more difficult, or moving on to bigger and better things increases the chance of corner cases coming up and forcing a fallback when no additional hotfix is forthcoming. Applications that do try to use more recent features could prompt a drop back to standard execution, rather than the driver trying to infer how they can be massaged into 1.0 at runtime.
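
On the application side, about all a Vulkan program can do is check whether the driver exposes a compute-only queue family and fall back to the graphics queue otherwise. Whether work on that queue actually overlaps with graphics is up to the driver and hardware. A minimal sketch of that selection in Python, assuming the queue-family flags have already been read from vkGetPhysicalDeviceQueueFamilyProperties; `pick_compute_family` and the example layouts are hypothetical, not what any particular GCN driver reports:

```python
# Queue capability bits as defined by the Vulkan specification (VkQueueFlagBits).
VK_QUEUE_GRAPHICS_BIT = 0x1
VK_QUEUE_COMPUTE_BIT = 0x2
VK_QUEUE_TRANSFER_BIT = 0x4


def pick_compute_family(family_flags):
    """Return (index, is_async) for the queue family to submit compute work to.

    family_flags is a list of queueFlags bitmasks, one per queue family,
    in the order the driver reports them. This helper is hypothetical.
    """
    fallback = None
    for index, flags in enumerate(family_flags):
        if not flags & VK_QUEUE_COMPUTE_BIT:
            continue  # family cannot run compute at all
        if not flags & VK_QUEUE_GRAPHICS_BIT:
            # Compute-only family: a candidate for async compute,
            # if the driver and hardware choose to overlap the work.
            return index, True
        if fallback is None:
            # Graphics+compute family: compute runs in line with graphics.
            fallback = index
    if fallback is None:
        raise ValueError("no compute-capable queue family")
    return fallback, False


# Hypothetical layout: universal family, compute-only family, transfer-only family.
with_async = [
    VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT | VK_QUEUE_TRANSFER_BIT,
    VK_QUEUE_COMPUTE_BIT | VK_QUEUE_TRANSFER_BIT,
    VK_QUEUE_TRANSFER_BIT,
]
# Hypothetical layout where the driver exposes no compute-only family.
without_async = [
    VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT | VK_QUEUE_TRANSFER_BIT,
    VK_QUEUE_TRANSFER_BIT,
]
```

Even when a compute-only family is selected, the driver may still serialize the work, which would look the same from the application's point of view as the fallback path.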