AMD launches its Quad-Core Opteron; Phenom expected in December

B3D News

After months of fevered doubt and speculation, Sunnyvale CPU maker AMD finally announced today the release of its latest server-oriented processor, and first quad-core architecture, the Quad-Core Opteron, a processor most know by its codename "Barcelona."

 
There's not much surprise that Barcelona lags Core2 in integer IPC, which is compounded by the low clock speeds.
The FP on a per-core basis also lags somewhat.

The platform's scalability keeps Barcelona competitive in the 2S space, and Intel's FSB bottleneck is more telling at 4S, at least for bandwidth-hungry FP apps.

AMD's hold on integer apps is waning, though, for both single-socket and multisocket. Thanks to the L3, the latency at these low clocks is closer to a Xeon than it is to K8, and the cache per core is very lean.

Phenom's desktop fortunes from these numbers don't look too hot.
As has been pointed out before when there weren't hard numbers to back it up, almost none of Barcelona's scalability advantages over Core2 apply in single-socket.
FB-DIMMs don't hinder Conroe on the desktop, either.

I guess we'll see what happens when AMD's memory controller is clocked higher in later speed grades, since the higher speed should reduce (but never eliminate) the latency penalty of the L3.

It does look like AMD's competitive niche is getting smaller.
I bet Shanghai would have been nice to have out right about now.
 
Why would there be a penalty because of the L3?

Surely data can go straight to the CPU without first having to take a detour through the various levels of cache?
Peace.
 
Apparently cache level latencies are cumulative. The CPU goes to each level of cache looking for data first before it gets to RAM. Go look at Tech Report's results.
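
A rough sketch of what "cumulative" means here, assuming a strictly serial lookup where each cache level is checked in turn before the request goes out to RAM. The cycle counts are made-up placeholders purely for illustration, not measurements of Barcelona, K8, or any Xeon, and the function name is hypothetical.

```python
# Illustrative only: the cycle counts below are invented placeholders,
# not measured latencies for any real CPU.

def miss_to_ram_latency(level_latencies, ram_latency):
    """Total cycles for a load that misses every cache level.

    Assumes a strictly serial lookup: each level is checked in turn,
    and its lookup cost is paid before the next level is tried.
    """
    return sum(level_latencies) + ram_latency


# Same hypothetical RAM latency, with and without a third cache level.
two_level = miss_to_ram_latency([3, 12], ram_latency=150)        # L1, L2
three_level = miss_to_ram_latency([3, 12, 40], ram_latency=150)  # L1, L2, L3

# Extra cycles the added L3 lookup costs on a full miss.
l3_penalty = three_level - two_level
print(two_level, three_level, l3_penalty)
```

Under that assumption the extra level only hurts loads that miss everywhere; a load that hits in L3 still comes out well ahead of a trip to RAM.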

Unless they nearly double the clock speed, I don't see this being exciting on the desktop.
 
Why would there be a penalty because of the L3?

Surely data can go straight to the CPU without first having to take a detour through the various levels of cache?
Peace.
I would have thought so as well, but the tests show otherwise. I guess they don't want to waste bandwidth fetching data already present in L3 cache, or maybe it's too difficult when more cores are involved. Either way, it looks like the latency advantage is gone compared to Core.
 
It will look much better if (as rumoured) the recent new stepping came back very well and will be hitting 3.0 GHz for the Phenom launch in December. Supposedly AMD committed to a launch with what they had, and now have to sell the 2.0 GHz Barcelonas even though what's now coming out of the fab is much better.

Have to wait and see if that's true or more marketing BS, but AMD will have to ramp up speed very quickly indeed, or Intel will crush them with Penryn.
 
Seeing that Core 2 Quad G0 steppings are hitting 3.4 GHz on stock voltage and cooling, I'm not entirely convinced that AMD will win any sort of speed contest. Even if AMD can pull 3 GHz on their chips within any usable thermal envelope, even now Intel wouldn't have any problems releasing a 3 GHz C2D on 65nm.

Now combine all of Intel's new SOI gate technology on their 45nm node with Penryn, and I'd wager Intel could out-megahertz AMD at every turn if they need to. At the same clocks, Penryn doesn't outperform C2D drastically in most situations (10-15% from the last set of benchies I saw, minus the SSE4-optimized stuff).

But the real big stick that Penryn brings is another major drop in power draw and equivalent drop in thermal envelope. So while AMD might be able to compete at some level with Intel in performance (and that's an awfully big might), they'll still be quite far away from Intel's power and heat efficiency.
 
Now combine all of Intel's new SOI gate technology on their 45nm node with Penryn, and I'd wager Intel could out-megahertz AMD at every turn if they need to. At the same clocks, Penryn doesn't outperform C2D drastically in most situations (10-15% from the last set of benchies I saw, minus the SSE4-optimized stuff).
Intel doesn't use SOI.
It has metal gates and a new high-k dielectric, but Intel has repeatedly stated it does not use the partially depleted SOI used by AMD and others, and is currently only exploring the option of fully depleted SOI.
 
Whatever they use, their new CPUs on a smaller process are right around the corner. They perform better, need less power and less cooling, AND they can be clocked higher anytime - it's really a no-brainer. I can't think of any sort of magic that could help AMD now.
 
Barcelona performs about 15% faster than K8 per clock according to the reviews. Conroe already has a greater than 15% per-clock performance advantage, and Penryn should extend that by another 5%. So AMD actually needs to be beating Intel in clock speed by a good 10% just to be competitive, but it looks like they will be lucky to get within 90% of Intel's clocks.
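
For what it's worth, here is a back-of-the-envelope version of that arithmetic. It assumes all per-clock figures are measured against K8, and it plugs in 20% for Conroe, which is an assumption: the post only says "greater than 15%". The variable and function names are hypothetical.

```python
# Per-clock performance, normalised so that K8 = 1.00.
BARCELONA_PER_CLOCK = 1.15  # "about 15% faster than K8" (from the post)
CONROE_PER_CLOCK = 1.20     # ASSUMPTION: the post only says "greater than 15%"
PENRYN_UPLIFT = 1.05        # "another 5%" on top of Conroe (from the post)


def required_clock_ratio(rival_per_clock, own_per_clock):
    """Clock-speed ratio needed to match a rival, assuming
    performance = per-clock performance * clock speed."""
    return rival_per_clock / own_per_clock


penryn_per_clock = CONROE_PER_CLOCK * PENRYN_UPLIFT
ratio = required_clock_ratio(penryn_per_clock, BARCELONA_PER_CLOCK)
print(f"Barcelona needs about {ratio:.2f}x Penryn's clock to break even")
```

With those inputs the ratio comes out just under 1.10, which lines up with the "good 10%" figure. A larger Conroe advantage would push it higher.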

To make matters worse, K10 is more expensive to produce than K8 while Penryn is cheaper than Conroe. Things are looking very bad IMO.
 
It would help if you compared like with like,

say Barcelona with Xeon
and
Phenom with Conroe.

A 2 GHz Barcelona seems about par (maybe a tiny bit under) with a 2.4 GHz Xeon.


Remember, most of these reviews are with ECC 667 RAM, so a comparison to Conroe isn't really fair. Also, the higher Barcelona is clocked, the higher the memory controller is clocked, which will hopefully bring down memory latencies, so hopefully Phenom and Barcelona will scale a little bit better than Core2.

The problem for AMD is most definitely clocking. It's obvious that they are having issues in that regard, thus they are releasing "low power" CPUs first. But hey, K8 wasn't that much different.

Mileage over time between Core2 and K10 will be very interesting to see.
 
The performance picture per core is clearly in Intel's favor.
Per socket, the picture does not change.
Any desktop comparison will also remove the latency and power penalties inflicted by FB-DIMMs from the equation.

Scalability helps Barcelona, but it is not that helpful for Phenom.
There's only so much a higher-clocked memory controller can do when each core is weaker than the competition.

There's the mostly ignored 4x4 platform (or Intel can dust off V8 or whatever), which might help with something, but the wins will be just as uncertain as they are for Barcelona.
 
After Core 2 was introduced, I knew that AMD was boned, and most people probably felt the same way.

Intel was dominating marketshare with their pathetic P4, so when they released an architecture with twice the IPC across the board and lower power consumption, you knew AMD's life was only going to get much harder.

Now that Intel's hardware engineers are back on top of their game, I think the best AMD can hope for is architectural parity, but they still remain at a big disadvantage when it comes to process technology. It'll take a miracle for AMD to return to profitability, especially in the main consumer CPU market.
 
I'm pretty pessimistic when it comes to AMD's chances to compete with Intel in high-end CPUs, seeing that Barcelona is even struggling to keep up with the current Core 2 architecture at the same clock speed (and do we really believe AMD will be able to scale well when they still seem to have problems delivering high-clocked versions of the Athlon X2 on 65 nm?). If Intel's high-k process and the new Nehalem architecture live up to their promises, I can only see AMD ending up even further behind Intel. And only selling low-end and mid-range CPUs is not really an option as that won't be enough to pay the bills....

No, what will save AMD is Fusion. While I'm not that excited when it comes to integrating a full graphics core in the CPU, I'd gladly see a stream processing unit integrated into every single AMD x86 processor sold. Imagine what Phenom could be if it integrated the 320 stream processors from the R600.

Is it possible to answer the question of how many transistors just the stream processors from R600 use? If integrated in a Phenom, do you think it would be possible to run them at full speed (2.5+ GHz)?

Per
 
No, what will save AMD is Fusion. While I'm not that excited when it comes to integrating a full graphics core in the CPU, I'd gladly see a stream processing unit integrated into every single AMD x86 processor sold. Imagine what Phenom could be if it integrated the 320 stream processors from the R600.
It would be way bigger, much more expensive, and even harder to clock high. And just what the benefit of those stream processors would even be is questionable. More low-end integrated graphics? Very specific computing benefits? (Stream processors are limited in what they can compute well.)

We can't even get x64 support going, and at least that is highly friendly to the existing software base. Specialized hardware like stream processors would be so niche and unsupported..... Ugh, decades away, heh.
 