The AMD Execution Thread [2007 - 2017]

Status
Not open for further replies.
Why is GloFo lagging so far behind? Effectively every other contract fab has delivered 28nm in volume except for them and Intel is already hard at work on their 2nd generation 22nm processors.
 
Why is GloFo lagging so far behind? Effectively every other contract fab has delivered 28nm in volume except for them and Intel is already hard at work on their 2nd generation 22nm processors.

GF had to make a jump from SOI to bulk while trying to meet the yield and cost requirements for low-margin devices.
Whether or not it's the only problem, the decision by IBM to go with gate-first looks in hindsight to have been a bad call for AMD-now-GF and also probably Samsung. All those trying for volume out of it got hit with significant delays.

That GF's leading-edge capacity used to be AMD's fab arm was not a ringing endorsement for either high-performance silicon or cost-effective foundry work. Manufacturability problems seemed to be a consistent theme about using it, and AMD paying hundreds of millions of dollars to not use it is the most recent news blurb I can recall about it.


That aside AMD has announced it is going to create an Opteron-branded Seamicro server chip using an ARM-designed core.
http://www.anandtech.com/show/6418/amd-will-build-64bit-arm-based-opteron-cpus-for-servers-production-in-2014

Without an architecture license right now, AMD doesn't have the flexibility to design a core that would reflect any of its experience designing high-speed CPUs, but time to market seems to be the primary concern. If AMD went for a soft macro, it may be able to leverage some of its implementation expertise, or it could have cut that corner, too.

Jaguar has not been brought up. Possibly it's because there is no current effort to build one without the GPU that's far enough along. The tablet design may also lack some features that the AMD server chips have.
It shouldn't be an interface problem since Seamicro just needs a PCIe connection to work. In terms of having a coherent system, Seamicro doesn't offer a solution.


One thing I'm curious about is whether the Seamicro ASIC currently--or the next-gen chip it probably had in the pipeline--already had some ARM microcontroller logic or an ARM core already planned to go into it. A lot of specialized controllers already have one or more ARM cores in them, although there may have not been a provision for a V8 core. The existing infrastructure might make it easier to stick one or more on the same die, though.
I've not seen much info on the ASIC to say what's in it, although it included an FPGA in its presentations. The presence of the FPGA isn't really an argument for or against ARM, as FPGAs can house ARM cores as well.
The irony in this scenario would be AMD marketing putting the cart before the horse by now mentioning the ARM core and the incidental fabric logic.
That's more of an unsubstantiated joke, since a Seamicro chip probably wouldn't have gone for a V8 core if it was content to sit in the background.
 
it seems generally accepted that most ARM SoCs have poor memory performance. i wonder how much SoC performance could be gained just from AMD's IMC, maybe with a jaguar/kentsfield type cache design.


seems a very niche market, but it could have quite good perf/watt with seamicro's interconnect and AMD's IMC as points of differentiation from the other ARM server systems.
 
Unless AMD tweak the ARM64 v8 design for lower power, better clocks, it's not going to be very compelling. If AMD wanted mixed big/little they could have mixed BD family and BC family, but they picked ARM instead...
 
I'm not going to knock the ARM core in terms of power. AMD doesn't set a very high bar, currently.
ARM made a point to keep the big and little cores the same with regards to their ISA support.
AMD high-power and low-power chip lines did not match in the beginning, and Jaguar does not change this.
A thread migrating from a BD-derived core to a Jaguar core stands a good chance of blowing up.

AMD's announcement doesn't give much detail on what it's doing to support what appears to be a me-too initiative. There isn't an Intel that AMD can ride the coattails of in terms of software ecosystem.
It also appears that there is a firm split between the x86 and ARM sides, since there is a separate x86 APU-only section for anything using media. This ARM initiative appears to be disconnected from anything else Opteron. It remains to be seen what it might share with its more established brethren.
 
The ecosystem doesn't yet exist. In contrast, for x86 Intel built much of the playing field AMD tried to compete on with compilers, software development and evangelism, and in the case of servers it built the x86 server market before AMD dipped its toes in.
AMD64 was one instance where things were reversed, and a change that helped expand the x86 server market as a whole.

The ARM server market hasn't totally justified its existence yet, and in this case AMD doesn't look to have an x86-64 sort of opportunity to influence the direction of the platform, or even the opportunity to try anything with the ISA.
There's a decent chance that AMD's ARM server chip will hit a market past its flash-in-the-pan stage, like Bobcat and netbooks.

Another unknown is the nature of this ARM market, should it succeed. The industry is starting out consolidated and becoming more vertically integrated, whereas the era of the IBM PC-compatible was marked by a breaking down of those barriers.
Of course, if AMD finds a big sugar daddy to cater to, it might work out fine. Otherwise there could be a market where whole sections are immune to evangelism from AMD because they already have a pet supplier or in-house design team.
The whole idea behind AMD making x86 chips, that almost every non-big-iron server room could be a buyer even with the reality of an anti-competitive Intel, doesn't look like it will ever apply to ARM. Additionally, the Seamicro interconnect basically rules out everything but the super-dense shared-nothing market, which is a niche that hasn't justified its existence quite yet in a niche that hasn't justified its existence.
 
Frankly I still haven't understood where the value of ARM servers is. Perhaps they are going to be cheaper (but that's a matter of pricing) but when I read that they are going to be more power efficient because of their ISA I can only roll my eyes..
 
Frankly I still haven't understood where the value of ARM servers is. Perhaps they are going to be cheaper (but that's a matter of pricing) but when I read that they are going to be more power efficient because of their ISA I can only roll my eyes..

A lot of servers massively underutilize the CPU, so using less powerful, more power efficient ones makes a lot of sense. When you build datacenters, things like how much copper you have to run to the building to get power to the servers actually matter.

I don't think it's anything to do with the ISA specifically, likely more the design has been predominantly aimed at power sensitive applications for so long it's probably an easier starting point.

I worked at Bing for a year at one point, you'd be surprised how low CPU utilization was even on the compute clusters, you were almost always constrained by how efficiently you could move the data, rather than any processing you might be doing.
 
Additionally, the Seamicro interconnect basically rules out everything but the super-dense shared-nothing market, which is a niche that hasn't justified its existence quite yet in a niche that hasn't justified its existence.

you mean 99.9% of the internet isn't a justifiable niche? Now whether the market thinks this is a valuable product is something we will have to wait and see.
 
AMD is trying to cash in on a movement towards physicalization as opposed to virtualization. This ARM chip in particular is likely to push it even more to an extreme with shared-nothing physical machines that will have serious limitations with demand spikes at any given instance.

The question for the niche AMD is trying to cater to is whether the incumbent methods or some other technological advance won't come along and be sufficient for enough of this market that it doesn't justify its own hardware effort. If virtualized or shared systems can improve power consumption and still provide their inherent advantages in providing more peak for those spikes, then very few takers may materialize.
This is on top of the fact that AMD's interconnect is by design much less flexible in what it can be applied to, and the question of whether there will be enough customers that won't be satisfied with other low-margin players or in-house designs.

Other concerns like the lack of trust in AMD's execution, its being spread across yet another platform, or survivability compound the problem.
A lot more needs to be done to prove that this is more than a desperate ploy by AMD to create buzz in part by compromising whatever credence the Opteron brand has in servers by sharing it between a mature, inferior, but stable half and a powerpoint slide.
 
It's also quite likely that SeaMicro's development on ARM was well under way before they were acquired by AMD.

Yeah, apparently that was on their roadmap. I don't know if they were planning to make actual chips themselves, but they were planning something.
 
AMD is trying to cash in on a movement towards physicalization as opposed to virtualization. This ARM chip in particular is likely to push it even more to an extreme with shared-nothing physical machines that will have serious limitations with demand spikes at any given instance.

this is a problem that is already long solved and having 10,000 hosts in your server pools vs 1,000 isn't a problem.
 
Is it solved by less than six years, or roughly how far back in x86 terms you have to go to match the likely performance of the ARM chip that won't be out until 2014?
 
Is it solved by less than six years, or roughly how far back in x86 terms you have to go to match the likely performance of the ARM chip that won't be out until 2014?

i don't actually get what you're saying, but it's going to come down to density/power/perf/io/TCO. the average internet facing X86 machine hasn't been that great across those metrics. I wouldn't be so quick to judge "performance" in either direction without systems in hand.
 
A lot of servers massively underutilize the CPU, so using less powerful, more power efficient ones makes a lot of sense. When you build datacenters, things like how much copper you have to run to the building to get power to the servers actually matter.

I worked at Bing for a year at one point, you'd be surprised how low CPU utilization was even on the compute clusters, you were almost always constrained by how efficiently you could move the data, rather than any processing you might be doing.
But isn't that a matter of estimating and managing workloads and better virtualization techniques? I mean, when you're having an order of magnitude more physical installations on the server side, how much more complex would the data highways need to be in order to even reach the levels of the classic environments - not to speak of any improvements.
 
i don't actually get what you're saying, but it's going to come down to density/power/perf/io/TCO. the average internet facing X86 machine hasn't been that great across those metrics. I wouldn't be so quick to judge "performance" in either direction without systems in hand.
Providing a data service is frequently latency sensitive. Web-facing servers are frequently idle, but that sliver of time they are active they are expected to not exceed some ceiling in latency, particularly for user-facing services. A platform that consumes 1 Watt is not interesting for a big chunk of the workloads if it takes 5 hours to serve a simple request.
The subset of the market that has relaxed time constraints is a subset of a subset; however, AMD may be promising this product in some potentially latency-constrained environments as well.

What solutions are there for a lack of peak capability? Existing servers have performance in spades, can partition out resources as jobs ramp up and down, and are making continual strides in power management to bring them closer to Seamicro's isolated pools of weak processors that perform like Intel CPUs from 2005-2006. Some of the ARM competition AMD will have might have a similar performance deficit, but at least some are not going the shared-nothing route. Rather, they plan to have capable interconnects that have the luxury of going shared-nothing while not writing off the rest of the market.
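To put the peak-capability point in concrete terms, here is a minimal sketch of how isolated nodes and a pooled system differ when one workload spikes. Every number and both function names are made up for illustration; nothing here comes from AMD, Seamicro or any measured system, and it ignores migration cost, interconnect limits and power.

```python
def served_shared_nothing(demands, node_capacity):
    """Each node serves only its own demand, capped at its own capacity."""
    return sum(min(d, node_capacity) for d in demands)


def served_pooled(demands, node_capacity):
    """Capacity is pooled, so spare headroom anywhere can absorb a spike."""
    total_capacity = node_capacity * len(demands)
    return min(sum(demands), total_capacity)


# Hypothetical load in requests/s: three quiet workloads and one spiking.
demands = [10, 10, 10, 70]
node_capacity = 25  # hypothetical per-node capacity in requests/s

isolated = served_shared_nothing(demands, node_capacity)
pooled = served_pooled(demands, node_capacity)

print(f"offered load:   {sum(demands)} req/s")
print(f"shared-nothing: {isolated} req/s served")
print(f"pooled:         {pooled} req/s served")
```

With these invented figures the total capacity is identical in both cases, but the isolated nodes leave headroom stranded on the quiet machines while the spiking one saturates. Whether real workloads spike unevenly enough for this to matter is exactly what is being argued about.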
 
Providing a data service is frequently latency sensitive. Web-facing servers are frequently idle, but that sliver of time they are active they are expected to not exceed some ceiling in latency, particularly for user-facing services.
i don't see how this is a problem for a BBOSN full of ARM cores. First, your network latency is going to be the vast majority of a web server's latency. Second, why is the ARM core going to be high latency, assuming a suitable memory subsystem? You're going to have far more execution resources per clock on a 16 core ARM vs a comparably priced X86 server.
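A rough way to frame the disagreement over latency: if end-to-end latency is network time plus per-request compute time, the question is how much slower a core can be before a latency ceiling is breached. The sketch below assumes that simple additive model, and every figure in it (network time, service time, ceiling) is hypothetical, not a measurement of any ARM or x86 system.

```python
def total_latency_ms(network_ms, service_ms, slowdown):
    """End-to-end latency when the core is `slowdown` times slower."""
    return network_ms + service_ms * slowdown


def meets_ceiling(network_ms, service_ms, slowdown, ceiling_ms):
    """True if the request still fits under the latency ceiling."""
    return total_latency_ms(network_ms, service_ms, slowdown) <= ceiling_ms


def max_slowdown(network_ms, service_ms, ceiling_ms):
    """Largest slowdown factor that still meets the ceiling."""
    return (ceiling_ms - network_ms) / service_ms


# Hypothetical figures, chosen only to illustrate the shape of the argument.
NETWORK_MS = 80.0   # assumed round-trip network time
SERVICE_MS = 5.0    # assumed compute time per request on a fast core
CEILING_MS = 200.0  # assumed latency ceiling for a user-facing service

for slowdown in (1, 3, 10, 30):
    total = total_latency_ms(NETWORK_MS, SERVICE_MS, slowdown)
    ok = meets_ceiling(NETWORK_MS, SERVICE_MS, slowdown, CEILING_MS)
    print(f"{slowdown:>2}x slower core: {total:6.1f} ms, within ceiling: {ok}")

print("break-even slowdown:", max_slowdown(NETWORK_MS, SERVICE_MS, CEILING_MS))
```

When network time dominates, a core several times slower barely moves the total, which is the point being made here. The opposing point is that requests with heavier compute or tighter ceilings run out of slack much sooner. Which regime applies depends on numbers nobody in the thread has yet.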

A platform that consumes 1 Watt is not interesting for a big chunk of the workloads if it takes 5 hours to serve a simple request.
that's a straw man and nothing more.

The subset of the market that has relaxed time constraints is a subset of a subset; however, AMD may be promising this product in some potentially latency-constrained environments as well.

why do you keep making up that this latency will be high? Have you ever played with anything like a Raspberry Pi? i've run quite complex web apps like cacti on one. it's a memory bound 700MHz ARM11 platform, but even that with a full blown LAMP install runs pretty well, let alone a 16 core A57 @ 2-3GHz with 16/32/64 gigs of ram. You have done nothing to justify the notion that application latency is going to suffer with ARM.


What solutions are there for a lack of peak capability? Existing servers have performance in spades, can partition out resources as jobs ramp up and down, and are making continual strides in power management to bring them closer to Seamicro's isolated pools of weak processors that perform like Intel CPUs from 2005-2006.

web servers and their application servers are isolated workloads. peak load is handled the exact same way it is handled right now, by a few of these guys running a combination of global and local traffic services.

You also have to consider the way web compute providers provide elasticity and scaling of provisioning. it's pretty much per core/thread based, which makes the ARM based solution more granular and, depending on density, far more scalable (ignoring any ATOM/Jaguar based solutions for the moment).


Some of the ARM competition AMD will have might have a similar performance deficit, but at least some are not going the shared-nothing route. Rather, they plan to have capable interconnects that have the luxury of going shared-nothing while not writing off the rest of the market.

i don't think you're understanding the workloads or the business drivers of this market. big scalable coherent compute is far more niche than what a BBOSN is targeting. i have never seen a workload that can scale across 16 threads but not across boxes. It's why 1 and 2 socket boxes massively outstrip 4+ in sales.

i'm not saying that this play by AMD is going to work, but your argument isn't very convincing, and a broken clock is right two times a day.
 