I'm unconvinced. GCN has evolved, it just hasn't had a name change. The versioning stopped at GCN 5, which includes things like scheduling and tessellation changes. The core GCN architecture is the SIMD CUs and wavefronts, so we're counting two core architectures on AMD's side: VLIW and GCN. In that time, hasn't Nvidia effectively had one arch, the CUDA core? Nvidia introduced CUDA with Tesla and has stuck with it, and AMD has used GCN. Nvidia has named its different CUDA-based generations with different family names, whereas AMD has just named theirs GCN x.
Is there really a difference in behaviour? Both have a long-term architectural DNA as the basis for their GPUs, with refinements happening in scheduling and features across the evolution of that core DNA.
From the standpoint of the ISA, Nvidia's had one major transition when going from Fermi to Kepler. The former was described as a more CISC-like architecture with reg/mem ALU operations, and the microarchitecture had an operand collector, register scoreboarding, and hot-clocked ALUs. Kepler's ISA became load/store and exposed ALU register dependences to static scheduling by the compiler.
The hardware lost the operand collector, scoreboarding for ALU ops, and separate clock domains for the SIMD hardware.
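To make the dynamic-vs-static scheduling distinction concrete, here's a toy model (not real hardware, and the instruction format and 4-cycle ALU latency are invented for illustration): a Fermi-style scoreboard stalls issue in hardware until source registers are ready, while a Kepler-style flow has the "compiler" pre-compute stall counts so the hardware can just count them down. Both should land on the same cycle count for a dependent chain.

```python
ALU_LATENCY = 4  # cycles before a result register is readable (assumed)

def run_scoreboarded(program):
    """Hardware tracks when each destination register becomes ready."""
    ready = {}      # reg -> cycle its value is available
    cycle = 0
    for dst, srcs in program:
        # stall issue until all source registers are ready
        cycle = max([cycle] + [ready.get(r, 0) for r in srcs])
        ready[dst] = cycle + ALU_LATENCY
        cycle += 1  # issue one instruction per cycle
    return cycle

def schedule_statically(program):
    """'Compiler' pass: annotate each instruction with a stall count."""
    ready = {}
    cycle = 0
    annotated = []
    for dst, srcs in program:
        need = max([cycle] + [ready.get(r, 0) for r in srcs])
        annotated.append((dst, srcs, need - cycle))  # encoded stall cycles
        cycle = need
        ready[dst] = cycle + ALU_LATENCY
        cycle += 1
    return annotated

def run_static(annotated):
    """Hardware with no scoreboard: it just obeys the encoded stalls."""
    cycle = 0
    for _dst, _srcs, stall in annotated:
        cycle += stall + 1
    return cycle

# A dependent chain: r1 <- r0; r2 <- r1; r3 <- r2, r0
prog = [("r1", ["r0"]), ("r2", ["r1"]), ("r3", ["r2", "r0"])]
assert run_scoreboarded(prog) == run_static(schedule_statically(prog))
```

The point of the sketch is that the schedule itself doesn't change; what changes is whether the dependence tracking lives in issue-stage hardware or in bits the compiler emits ahead of time.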
Maxwell introduced the register operand reuse cache, which was a software-visible control for result forwarding.
Volta made a change to the threading model (independent thread scheduling, with per-thread program counters), and also introduced a broader encoding change where each instruction is 128 bits and tracks its own dependence and wait-cycle information, versus the way prior generations had control words that governed small packets of instructions. The tensor units introduced a different set of memory and scheduling concerns, and the RT cores in Turing are new functionality. There's also a little-discussed addition to the architecture for uniform operations, which seems to indicate dedicated data paths for some of the warp-wide value calculations, possibly aligning with one of the uses of the scalar unit in GCN.
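The encoding shift can be sketched with some back-of-envelope arithmetic. The figures below are the commonly reported ones from third-party microbenchmarking and disassembler work, so treat them as approximate: Kepler packs one 64-bit control word ahead of every 7 64-bit instructions, Maxwell/Pascal one per 3, while Volta widens each instruction to 128 bits and embeds the dependence/wait fields inline.

```python
def bits_per_instruction(inst_bits, group_size, control_bits):
    """Average encoded bits per instruction, amortizing the control word
    over the group of instructions it governs."""
    return (group_size * inst_bits + control_bits) / group_size

kepler  = bits_per_instruction(64, 7, 64)   # ~73 bits/inst
maxwell = bits_per_instruction(64, 3, 64)   # ~85 bits/inst
volta   = bits_per_instruction(128, 1, 0)   # 128 bits/inst, control inline
```

So the per-instruction control budget grew at each step; Volta just stopped amortizing it across a packet and gave every instruction its own fields.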
Nvidia has also gone back and forth in its implementations on the L1 vs shared memory arrangement, the use(s) of the texture cache, how the L1's cache policy works, and other architectural changes outside the SMs.
GCN has added subsets of instructions and moved things around a bit. The overall encoding has stuck to similar themes, and RDNA does a fair amount to align with it--although one point of opcode churn is that it reversed the unexplained remapping Vega applied to the vector ISA, back to something closer to earlier GCN.
I'd say the pipeline execution loop, register file, caches, LDS, threading model, and ISA philosophy have been pretty consistent until RDNA changed some of them.
Perhaps some of the debate with ISA changes is how much an alteration should be considered a "change" in terms of semantics. Is altering a few bits for all instructions while they behave similarly a big change versus a small addition of new ops? Both vendors have had this sort of ISA shift, which Nvidia abstracts more with its CUDA layer.
Nope. It's not as simple as that.
With semi-custom, clients commission chips, which allows AMD far greater latitude than the way the PC market works. You ask for a chip with certain features and performance levels, and AMD then bases pricing on R&D, manufacturing costs, and volume. With MS and Sony, that means AMD has two clients locked in for 7-8 years, with anywhere from 40-120 million in chip sales for each client. Profit margins aren't super fat, but they are there, guaranteed for the most part, and those clients are locked in for the long term.
The predictable volume also avoids AMD's historically poor handling of inventory, channel vs OEM balance, and product mix, although recent times have shown AMD isn't alone. The most recent crypto glut affected Nvidia as well, though I recall Nvidia hinted that part of its difficulty clearing the channel was a significant glut from some unnamed competitor.
Semi-custom's more consistent volume does carry risk for AMD, if its cost improvements cannot keep up with the contractually scheduled reductions in payments over time.
In the PC market, AMD goes through the R&D investment, puts in mass-production orders, and then hopes its products can compete well enough to sell at profitable margins. AMD competes with Nvidia in the discrete market while competing with Intel in markets where IGPs are applicable. AMD is continually working to win contracts for each new year's iterations of products (laptops, desktops, etc.). Furthermore, AMD has to iterate every 2-3 years and revamp parts of its lineup with no guarantee that the ASP will cover all the effort.
If you have the clients, the semi-custom business is like a leisurely jog through the park, while the PC market is more like a sprint through an obstacle course.
Having a development pipeline sized for a stroll also means not having the robustness to take advantage of new opportunities or recover from missteps. Nvidia was able to leverage its leadership into professional, HPC, AI, and automotive markets. While some of this is still speculative or now facing more competition, several of these would be massive revenue streams for AMD if it weren't strolling so far behind.
The software gap in particular is large enough that AMD has no choice but to acknowledge its subordinate position to Nvidia's initiatives in compute, and this anemic support model flows through all its products. Which, I suppose, is in large part the other weakness semi-custom can offload onto the customer.