There is no industry-wide, agreed-upon definition of what constitutes an "architecture", so the whole thing is an exercise in personal opinion and aesthetics.
I'd agree with you that if I took a chip and simply cut-and-pasted in more units, or moved blocks around, without changing the blocks themselves, it would not constitute a new architecture. For one, it doesn't alter efficiency; all it does is alter scale. Imagine I had a factory that could produce 2 widgets per machine per input, and I added 10 new machines. I can now produce 20 more widgets per input. That's scaling up, but it's not a new architecture. If, on the other hand, I could get a super-linear increase by arranging the machines differently, or if I put in machines that could do 3 widgets per machine per input, I'd call it a new architecture.
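To put rough numbers on the factory analogy, here's a small Python sketch. The function names and the machine counts are just my own illustration of the argument, nothing more.

```python
def widgets_per_input(machines: int, widgets_per_machine: int) -> int:
    """Total widgets per input, assuming output scales linearly with machine count."""
    return machines * widgets_per_machine


def per_machine_efficiency(machines: int, widgets_per_machine: int) -> float:
    """Widgets per machine per input: the number that scaling alone never changes."""
    return widgets_per_input(machines, widgets_per_machine) / machines


# Cut-and-paste: 10 machines of the old 2-widget design.
scaled = widgets_per_input(10, 2)

# Redesign: the same 10 machines, but each one now does 3 widgets.
redesigned = widgets_per_input(10, 3)

print(scaled, per_machine_efficiency(10, 2))      # more output, same efficiency as 1 machine
print(redesigned, per_machine_efficiency(10, 3))  # more output AND higher efficiency
```

The point is that the first case only moves the total, while the second moves the per-machine figure, and that per-machine figure is what I'd look at when deciding whether something is "new".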
Where I disagree is when the blocks themselves change. If AMD adds another SSE unit, it's a stretch to call it a new architecture. If they redesign the SSE unit to have significantly more functionality, new instructions, and new behavior, I'd call it a new architecture.
And when virtually every major block is getting tweaked with new features, it's absurd not to call it new. NVidia, from what we can gather, rev'ed EVERYTHING. Unlike the GT200, which was a cut-paste job for the most part, they now have new register file functionality, new cache architecture, new ECC, new FP functionality (DP, denorms, exceptions, etc.), new scheduler functionality, new tessellation units, new setup unit architecture, new TMU clocking arrangement, new ROP CSAA functionality, and on and on.
If you look at the new cache architecture alone, it has potentially major implications for efficiency and performance (in contradiction to your claims, PSU-Failure). Even Jawed recognized that the cache changes alone could be some rocketsauce for Fermi on certain kinds of algorithms.
Let's put it this way: if, clock for clock, and normalized for the differing number of SP units and amount of onboard memory, Fermi beats GT200 because algorithms run more efficiently on its cache architecture, would you admit it's a new architecture?
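By "normalized" I mean something like the sketch below. Everything here is hypothetical: the function names are mine, the chip figures are made-up placeholders (not real GT200 or Fermi numbers), and for simplicity it only normalizes for clock and SP count, not for onboard memory.

```python
def normalized_throughput(work_per_second: float, clock_hz: float, sp_units: int) -> float:
    """Work completed per SP unit per clock cycle."""
    return work_per_second / (clock_hz * sp_units)


def is_more_efficient(new: float, old: float, tolerance: float = 0.05) -> bool:
    """True if the new chip beats the old one by more than a small noise margin."""
    return new > old * (1 + tolerance)


# Placeholder numbers only, chosen to illustrate the comparison.
old_chip = normalized_throughput(work_per_second=100e9, clock_hz=1e9, sp_units=100)
cut_paste = normalized_throughput(work_per_second=200e9, clock_hz=1e9, sp_units=200)
reworked = normalized_throughput(work_per_second=300e9, clock_hz=1e9, sp_units=200)

print(is_more_efficient(cut_paste, old_chip))  # doubled units, same work per unit per clock
print(is_more_efficient(reworked, old_chip))   # more work per unit per clock
```

If the real chips looked like the `cut_paste` case, I'd concede the point. If they look like the `reworked` case, the extra performance is coming from somewhere other than unit count and clock speed.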
Or rather, if Intel switched the L1/L2 caches in the Core architecture to support the kind of software management/partitioning that Fermi supports, would you consider it a new architecture?
For me, whether or not something is new depends on whether it executes a *different algorithm* or different logic, rather than simply being a higher-clocked or parallel cut-and-pasted version of the same logic. Taking the same logic blocks and moving them around using a clone brush == not a new architecture. Changing the memory controller, cache, scheduler, and ALUs to have altered implementations == new architecture.