Dodeca-core: The Megahertz Race is Now Officially the Multi-core Race

AMD engineers reveal details about the company's upcoming 45nm processor roadmap, including plans for 12-core processors

"Shanghai! Shanghai!" the reporters cry during AMD's Financial Analyst Day today. Despite the fact that the company will lay off nearly 5% of its workforce this week, followed by another 5% next month, most employees interviewed by DailyTech continue to convey an optimistic outlook.

The next major milestone for the CPU engineers comes late this year with the debut of 45nm Shanghai. Shanghai is, for all intents and purposes, nearly identical to the B3 stepping of the Socket 1207 Opteron (Barcelona) shipping today. However, whereas Barcelona had its HyperTransport 3.0 clock generator fused off, Shanghai will once again attempt to get HT3.0 right.

Original roadmaps anticipated that HT3.0 would be used not only for socket-to-socket communication but also for communication with the Southbridge controllers. Motherboard manufacturers have confirmed that this is no longer the case: HT3.0 will only be used for inter-CPU communication.

"Don't be disappointed, AMD is making up for it," hints one engineer. Further conversations revealed that inter-CPU communication is going to be a big deal with the 45nm refresh. The first breadcrumb comes with a new "native six-core" Shanghai derivative, currently codenamed Istanbul. This processor is clearly targeted at Intel's recently announced six-core, 45nm Dunnington processor.

But sextuple-core processors have been done before, or at least we'll see the first ones this year. The really neat stuff comes a few months later, when AMD will finally ditch the "native-core" rhetoric. Two separate reports sent to DailyTech from AMD partners indicate that Shanghai and its derivatives will also get the twin-die-per-package treatment.

AMD planned twin-die configurations as far back as the K8 architecture, though it abandoned those efforts. The company never explained why those processors were nixed, but just weeks later "native quad-core" became a major marketing campaign for AMD in anticipation of Barcelona.

A twin-die Istanbul processor could enable 12 cores in a single package. The two dies will communicate with each other via the now-enabled HT3.0 interconnect on the processor.

The rabbit hole gets deeper. Since each die will contain a dual-channel memory controller, a single package can emulate quad-channel memory by letting one die access the other die's dual-channel memory controller in the same socket. This move is likely a preemptive strike against Intel's Nehalem tri-channel memory controller.
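Back-of-the-envelope, the appeal is easy to see. A minimal sketch of the aggregate-bandwidth arithmetic, where the per-channel figure is an illustrative assumption (a DDR2-800 channel), not an AMD specification:

```python
# Toy model of memory channels in a hypothetical twin-die package.
CHANNELS_PER_DIE = 2     # each die has a dual-channel controller
DIES_PER_PACKAGE = 2     # twin-die Istanbul, per the report above
GB_S_PER_CHANNEL = 6.4   # assumed DDR2-800 channel (800 MT/s x 8 bytes)

local = CHANNELS_PER_DIE * GB_S_PER_CHANNEL       # one die's own controller
package = DIES_PER_PACKAGE * local                # both controllers in the socket

print(local, package)  # 12.8 25.6 (GB/s)
```

Of course, the remote half of that bandwidth comes at the cost of a hop across the HT3.0 link between the dies, so the two halves aren't equal, which is exactly the kind of NUMA behavior operating systems already handle for multi-socket Opterons.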

Motherboard manufacturers claim Shanghai and its many-core derivatives will be backwards compatible with existing Socket 1207 motherboards. However, processor-to-processor communication will downgrade to lower HyperTransport frequencies on these older motherboards. The newest 1207+ motherboards will officially support the HyperTransport 3.0 frequencies.

Shanghai is currently taped out and running Windows at AMD.

News Source: DailyTech - Dodeca-core: The Megahertz Race is Now Officially the Multi-core Race
 
I miss the good old days when you could actually count on a new processor running your code measurably and significantly faster than its predecessor. Too bad more cores doesn't equal more performance :(
 
I don't think more cores is the way to go, because software for the most part cannot take advantage of them. I am more interested in clock speeds and IPC improvements. Multithreaded apps are just a bunch of single threads after all, so if you want the overall time to drop, you need IPC and single-thread performance improvements.
 
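That intuition is basically Amdahl's law: the serial portion of a program caps the speedup no matter how many cores you throw at it. A quick sketch of the formula:

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: overall speedup when only `parallel_fraction`
    of a program's work can be spread across `cores`."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

# Even a 90%-parallel program gets under 6x from 12 cores:
print(round(amdahl_speedup(0.9, 12), 2))  # 5.71
```

So for desktop workloads with big serial sections, the commenter has a point; the counterargument is that server workloads tend to have many independent requests, pushing the parallel fraction close to 1.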

This isn't exactly a desktop part; for starters, the total die size would make it uneconomical. Server and workstation code has been optimised for multi-core for years, for the simple reason that multi-socket systems (and multi-node clusters) aren't uncommon.

If you can't pump up the IPC then increasing the number of cores is the alternative, just make sure you have the bandwidth to feed them all.
 
On a related subject, I attended a seminar yesterday by the main Cilk guy. They're commercializing it (as Cilk++), and most of the crappy restrictions in Cilk are gone (you can call Cilk++ functions from C/C++, you don't have to mark all predecessors of a spawnable function as spawnable, there are wacky thread-specific items called hyperobjects that are really cool). Plus, they're apparently working on speculative parallelism, which is terrifying and awesome.

So anyway, it seems as though the tools are moving in the right direction. We'll see if anyone actually uses them.
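For anyone unfamiliar with Cilk's model, its core idea is fork-join parallelism: "spawn" subtasks, then "sync" to wait for them. A rough approximation of that shape using Python's standard concurrent.futures, purely as an illustrative sketch (real Cilk++ uses compiler-level spawn/sync keywords and a work-stealing scheduler, none of which this mimics):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(data, workers=4):
    """Fork-join sum: 'spawn' one task per chunk, then 'sync'
    by collecting every result before returning."""
    data = list(data)
    chunk = max(1, (len(data) + workers - 1) // workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(sum, data[i:i + chunk])    # spawn
                   for i in range(0, len(data), chunk)]
        return sum(f.result() for f in futures)           # sync

print(parallel_sum(range(1000)))  # 499500
```

The point of tools like Cilk++ is that the programmer only states where parallelism is allowed; the runtime decides how much of it to actually exploit, which scales far better than hand-managing a pool like this.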
 
Well, for servers it could prove interesting... But somehow I think for desktops, more than 4-6 cores will probably not be very interesting for most users. Especially if/when GPGPU takes off, where massively parallel tasks will simply run on the GPU, leaving only 'everyday' tasks for the CPU.
We'll have to see. Servers generally just require a lot of threads 'in the air' for quick response time... These threads generally aren't very heavy in a computational sense. In which case, HyperThreading might be good enough, and it doesn't look like AMD will have that technology anytime soon.
So even if we assume that Intel only has 8 cores by the time this 12-core would arrive... Intel will have HyperThreading (perhaps even 4-way?)... So then we'd have 16 (or even 32?) logical cores vs 12 real cores.
 