Jaguar is the name of both the CPU core and the combination of four Jaguar CPU cores and shared L2 cache into a 'Jaguar Compute Unit'.
Right, well the underlying point here is that you'd either be looking at 8 cores sharing the same L2 cache, which would definitely be a customization/deviation, or you'd be looking at two compute units. While the latter might seem obvious, it implies that the compute units are basically SMP capable and can stay coherent with each other. Such capability, if it exists, is probably not going to be exposed on any commercial Jaguar SKUs. It could still be part of the design but unexposed. But if not, it's going to mean customization.
It would not surprise me at all if we are looking at customization here, and more extensive than that. For instance, Xbox 360's CPU cores may have used Cell's PPE as a base, but the SIMD pipelines and architectural resources were rebalanced pretty heavily, and MS didn't just get a new type of encoding with more registers but also plenty of custom instructions solely for their purposes. I'm sure IBM's bill for this didn't run that cheap, and I'm sure AMD would love to receive something similar (and if MS was willing to pay it for Xbox 360, I don't see why they wouldn't be willing to pay the same amount or more for the successor).
Such customizations could include instructions and functional blocks to make backward compatibility (BC) more efficient.
Maybe the Durango CPU is 4 Jaguar CUs, with 4 Jaguar cores each...
4 physical cores with 4 "logical cores" doesn't fit this description at all; it pretty unambiguously refers to SMT. 16 physical cores is also pretty batshit insane. 8 is on the high end of feasible.
There are a bunch of conflicting rumors out there. People have to come to terms with the fact that they can't all be true.
How big is one Jaguar CU in 28nm? Charlie mentioned that the next-gen main chip is huuuge, 500mm^2+.
I'm not aware of any numbers or die shots, but a Bobcat core with 512KB L2 cache is allegedly about 8mm^2 on TSMC 40nm. Jaguar isn't hugely different, although it has 128-bit SIMD units and can have twice as much cache per core, but it's on a new process node so I'd expect it wouldn't be that much bigger. Maybe a similar figure per core w/512KB, so an 8-core with 4MB of cache (8 x ~8mm^2 = ~64mm^2) may be < 70mm^2, assuming good scaling to 28nm (which AMD has demonstrated so far). Not a huge amount by any means.
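To make the back-of-envelope math above explicit, here is a rough sketch in Python. Every input is an assumption lifted from this thread, not a measured figure: the alleged ~8mm^2 for a Bobcat core plus 512KB L2, the guess that a Jaguar core at 28nm lands at a similar per-core size, and the rumored 500mm^2 die. The function and constant names are made up for illustration.

```python
# Back-of-envelope die-area estimate. All inputs are assumptions from the thread.
AREA_PER_CORE_MM2 = 8.0    # alleged Bobcat core + 512KB L2 on TSMC 40nm,
                           # assumed to carry over roughly unchanged to Jaguar at 28nm
CORES = 8                  # rumored core count
L2_PER_CORE_KB = 512       # L2 slice counted with each core
RUMORED_DIE_MM2 = 500.0    # rumored size of the whole chip


def estimate_cpu_area_mm2(cores: int, area_per_core_mm2: float) -> float:
    """Total area of the CPU cores, each including its share of L2."""
    return cores * area_per_core_mm2


def fraction_of_die(block_mm2: float, die_mm2: float) -> float:
    """Share of the die taken up by one block."""
    return block_mm2 / die_mm2


cpu_area = estimate_cpu_area_mm2(CORES, AREA_PER_CORE_MM2)
total_l2_mb = CORES * L2_PER_CORE_KB / 1024
cpu_share = fraction_of_die(cpu_area, RUMORED_DIE_MM2)

print(f"CPU cores + {total_l2_mb:.0f}MB L2: ~{cpu_area:.0f} mm^2")
print(f"Share of a {RUMORED_DIE_MM2:.0f} mm^2 die: {cpu_share:.1%}")
```

Under those assumptions the CPU side is only a small slice of a die that size, so the core count alone doesn't tell you much about whether the 500mm^2 rumor is plausible.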
500mm^2 sounds ridiculous though. I'm not aware of any mainstream consumer chip, even a GPU, reaching that size. And the ones that get up there cost a few hundred dollars.