Next-gen Opteron: The optimal supercomputer CPU

CoolAsAMoose

I'd really like to see 4 HyperTransport links for inter-CPU communication in the next-generation AMD Opteron. I believe this would make it the optimal supercomputer CPU, allowing a virtually unlimited number of CPUs without any extra logic and enabling top-range supercomputers at a whole new price level. Imagine a system with 1000+ Opterons, each with its own 5-6 GB/s (or better) of bandwidth to local memory.
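To put a rough number on that, here is a back-of-the-envelope sketch of the aggregate local-memory bandwidth. It assumes exactly 1000 CPUs and that every CPU streams only from its own RAM, so traffic over the HT links is ignored; the function name is made up for illustration.

```python
def aggregate_bandwidth_gbs(num_cpus, per_cpu_gbs):
    """Total local-memory bandwidth in GB/s, assuming each CPU only
    reads its own RAM (no inter-CPU traffic is modelled)."""
    return num_cpus * per_cpu_gbs

# Figures from the post: 1000 CPUs at 5-6 GB/s each.
low = aggregate_bandwidth_gbs(1000, 5.0)
high = aggregate_bandwidth_gbs(1000, 6.0)
print(f"{low / 1000:.1f} to {high / 1000:.1f} TB/s aggregate")
```

Real applications would see less than this, since any access to another CPU's memory has to cross one or more links.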

And I do believe that the open source community really would like the challenge of developing Linux and OpenBSD (as well as the applications) to make the most out of such an architecture.

Note:
Add a hardware process scheduler and some other multi-CPU-specific HW and you have the Transputer concept, but at a high-volume price point.


Link to some Transputer info: http://www.ee.surrey.ac.uk/Personal/R.Webb/l3a15/sld206.htm
 
But you would need one hell of a big-ass motherboard :) (There is no specification for HT links over cables.)
 
The next-generation supercomputer MPU is the Alpha EV7. It has high-speed CPU <-> CPU links, just like the Hammers, but scales to more than 500 CPUs. Each CPU has 9.6 GB/s of bandwidth to RAM.

Too bad EV8 was cancelled when Compaq offloaded Alpha to Intel.

Cheers
Gubbi
 
I understand that Dirk Meyer and others from the original Alpha team are now at AMD working on K8, K9, and future chips. I wouldn't be surprised if they engineer future AMD chips along the lines of EV8, with massive multiprocessor support. With Hammer's built-in memory controller, it appears they're heading in that direction already.
 
It depends on your usage. The main strength of Opteron is price. I think they'll be quite cheap (especially with regard to memory). For a custom-built supercomputer, though, the price of the CPUs and memory is actually a minor part of the total. For clusters it matters more. In fact, the biggest cluster in the latest TOP500 list is Athlon-based, delivering 825 GFLOPS on Linpack (peak 1.43 TFLOPS).

On the other hand, high bandwidth is still very important. Vector processors are generally designed with much more bandwidth than scalar processors, typically tens of GB/s of dedicated memory bandwidth.
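For what it's worth, the two Linpack figures above give the cluster's efficiency (sustained over peak). A quick sketch, using only the numbers quoted in this post; the function name is just for illustration:

```python
def linpack_efficiency(sustained_gflops, peak_gflops):
    """Fraction of theoretical peak actually achieved on Linpack."""
    return sustained_gflops / peak_gflops

# 825 GFLOPS sustained, 1.43 TFLOPS (1430 GFLOPS) peak.
eff = linpack_efficiency(825.0, 1430.0)
print(f"Linpack efficiency: {eff:.1%}")  # prints "Linpack efficiency: 57.7%"
```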
 
Of course they have chips with 5K+ pins ...

If a connector is defined for HT and it finds its way onto "normal" SMP mobos, it would be a quantum leap in interconnect bandwidth for clusters built from COTS parts. If that happens, cluster performance gets a big boost and they will encroach further on traditional supercomputer territory.
 