It's just that I compared the block diagram I linked with the latest ATi one, and the ATi design looks a lot less scalable and less like CPU-style multicore, where adding and removing 'cores' is much easier than adding SP blocks to a single 'core' and redesigning the architecture to accommodate them.
Basically, what I am trying to find out is whether Nvidia could ship this with just a single GPC in a tiny package for notebooks or embedded devices, or increase the GPC count to 6 or 8 in the next gen. I mean, Intel came back from the dead in 2006 in part because Conroe could easily be scaled up to 4 cores for HPC and down to 1 core for notebooks. We are still seeing that kind of easy scalability in the CPU space, while it hasn't been available on the GPU side, at least until now, if Nvidia have done it.
That's not entirely accurate.
The term "core" is a bit fuzzy at the moment, and PR people (from all sides) are doing their utmost to twist it out of shape. If you take the historically accepted definition of a core, then 1 SM in Fermi is like one module of Bulldozer (the upcoming AMD CPUs). IOW, 1 SM in Fermi is almost like 2 tightly coupled "classic" cores, and one module of Bulldozer is sorta like 2 cores. So, with 16 SMs, I'd say GF100 has 16 modules or 32 cores.
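To make the counting concrete, here is a rough sketch of that arithmetic. It assumes GF100's layout of 4 SMs per GPC (4 GPCs in the full chip, which is not stated above), and it treats "2 classic cores per SM" as my loose equivalence rather than any official figure. The 1, 6 and 8 GPC configurations are purely hypothetical, taken from the question, and `core_counts` is just an illustrative name.

```python
# Assumption: GF100 groups 4 SMs into each GPC (4 GPCs = 16 SMs in the full chip).
SMS_PER_GPC = 4
# Assumption: the rough "1 SM ~ 2 classic cores" equivalence argued in this post.
CLASSIC_CORES_PER_SM = 2


def core_counts(gpcs):
    """Return module and 'classic core' counts for a chip with `gpcs` GPCs."""
    sms = gpcs * SMS_PER_GPC
    return {
        "gpcs": gpcs,
        "modules": sms,  # one SM counted as one Bulldozer-style module
        "classic_cores": sms * CLASSIC_CORES_PER_SM,
    }


# 4 GPCs is the full GF100; 1, 6 and 8 are hypothetical scaled variants.
for gpcs in (1, 4, 6, 8):
    print(core_counts(gpcs))
```

By this counting, a single-GPC part would be about 4 modules or 8 cores, which says nothing about whether the rest of the chip (memory controllers, L2, and so on) would scale down as cleanly.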