AMD RyZen CPU Architecture for 2017

Kaotik · Aug 19, 2016

xEx said:
Not really. Socket 775 lasted many years, and when Intel wanted to change boards every year, users complained, Intel changed its mind, and now a socket lasts a few years. Nevertheless, the question is: why would I, instead of getting a 4790K, want to spend much more money on a new mobo, RAM and CPU? IDK how many years 1151 will last, but I'm sure Intel won't be changing it anytime soon either. Even if you come from a semi-old Intel platform, you can still reuse your old DDR3 RAM if you move to 1150, and there is a very big aftermarket for CPUs on eBay.

I'm not saying AMD shouldn't use the new tech in Zen. I'm saying that they are fighting against the Intel brand and need to give ppl a very good reason to buy their platform; if they offer the same performance at the same or a higher price than Intel, then I don't think they can have any real success.

Socket 775 was an exception to the rule; ever since, Intel has changed sockets every second CPU generation, if not every generation.

pTmdfx · Aug 19, 2016

Gubbi said:
Raven Ridge is the most interesting from a business POV, IMHO. If they can get 4 well performing CPU cores and XB1 level graphics into a 15W power envelope they'll carve out a chunk of Intel's high margin mobile business.

Cheers

Only if they can somehow force OEMs to use dual-channel memory, even at the slowest speed...

LPDDR4 would be a nice bet too.

xEx · Aug 19, 2016

Alexko said:
Well, an AMD CPU with the same IPC as Broadwell-E when running Blender. I don't want to be a party-pooper, and I want to believe more than Fox Mulder, but my experience with AMD has made me careful, to say the least.

I completely understand you, but this looks better than any of AMD's comparisons from the past few years.

Alexko · Aug 19, 2016

xEx said:
I completely understand you, but this looks better than any of AMD's comparisons from the past few years.

Undoubtedly, and I hope it's representative of Zen's overall performance, I'm just skeptical that it is.

Raqia · Aug 19, 2016

I think performance looks great, but what's uncertain is the quality of the southbridge and its drivers. Hopefully the enthusiast launch won't be seen as a debugging expedition for the enterprise launch.

eastmen · Aug 19, 2016

Alexko said:
Well, an AMD CPU with the same IPC as Broadwell-E when running Blender. I don't want to be a party-pooper, and I want to believe more than Fox Mulder, but my experience with AMD has made me careful, to say the least.

What about before the Core line? Intel was a mess, failing to deliver for years, while AMD was right on track. We could see AMD get on track again.

For me, I don't need an integrated GPU on my gaming PC. So the fact that Intel wastes space on that instead of adding more CPU cores is a reason that AMD may get my money. If it's similar single-thread performance (even a 10% or so difference) but I get twice the physical cores at the same price or less... yes please.

On mobile, Intel's GPUs suck, so again, if I get within 10% on single-thread performance and the same or more cores plus a better GPU, then yes please.

A Surface Pro with a quad-core Zen and a GPU capable of playing the latest games at 720p+... yes please!

Alexko · Aug 19, 2016

eastmen said:
What about before the Core line? Intel was a mess, failing to deliver for years, while AMD was right on track. We could see AMD get on track again.

For me, I don't need an integrated GPU on my gaming PC. So the fact that Intel wastes space on that instead of adding more CPU cores is a reason that AMD may get my money. If it's similar single-thread performance (even a 10% or so difference) but I get twice the physical cores at the same price or less... yes please.

On mobile, Intel's GPUs suck, so again, if I get within 10% on single-thread performance and the same or more cores plus a better GPU, then yes please.

A Surface Pro with a quad-core Zen and a GPU capable of playing the latest games at 720p+... yes please!

I think it's reasonable to assume that for this demo, AMD chose an application that was rather favorable to Zen. In practice I would expect that, on average, Zen would have a measurable IPC deficit against Broadwell, and a slightly larger one against Skylake/Kaby Lake. Additionally, I doubt AMD will be able to match Intel's clock speeds.

As a reminder, the i7 6700K hits 4.0GHz with a turbo clock of 4.2GHz. Let's say Kaby Lake only brings that up to 4.1GHz and 4.3GHz.

Let's also assume (wild-ass, somewhat optimistic guesses, but that's the best we can do for now) that on average, Zen has a bit over 80% of Broadwell's IPC, so maybe 80% of Kaby Lake's, rounding things a bit. Let's also assume that commercial Zen CPUs will be clocked at 3.4GHz with a turbo clock of 3.7GHz.

Single-threaded performance, relative to Kaby Lake's: 80% × 3.7/4.3 = 0.69 = 69%.
Multi-threaded performance, relative to Kaby Lake's: 80% × 3.4/4.1 × 2 = 1.33 = 133%, assuming ideal scaling and 8 cores vs. 4.
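
For anyone who wants to plug in their own guesses, here is a minimal sketch of the same back-of-the-envelope estimate. The function name and constants are made up for illustration; the inputs are the assumptions from this post (80% of Kaby Lake's IPC, 3.4/3.7GHz for Zen, 4.1/4.3GHz for Kaby Lake, ideal scaling across 8 vs. 4 cores), not measurements.

```python
def relative_performance(ipc_ratio, clock_ghz, rival_clock_ghz, core_ratio=1.0):
    """Estimate throughput relative to a rival chip.

    Simple model: IPC ratio x clock ratio x core-count ratio.
    It assumes ideal multi-core scaling and ignores memory, SMT and turbo behavior.
    """
    return ipc_ratio * (clock_ghz / rival_clock_ghz) * core_ratio


# Guessed inputs taken from the post (hypothetical, not benchmark data).
ZEN_IPC_VS_KABY_LAKE = 0.80
ZEN_BASE_GHZ, ZEN_TURBO_GHZ = 3.4, 3.7
KABY_BASE_GHZ, KABY_TURBO_GHZ = 4.1, 4.3

# Single-threaded: compare turbo clocks, one core each.
single = relative_performance(ZEN_IPC_VS_KABY_LAKE, ZEN_TURBO_GHZ, KABY_TURBO_GHZ)

# Multi-threaded: compare base clocks, 8 cores vs. 4 cores.
multi = relative_performance(
    ZEN_IPC_VS_KABY_LAKE, ZEN_BASE_GHZ, KABY_BASE_GHZ, core_ratio=8 / 4
)

print(f"single-threaded: {single:.0%}")
print(f"multi-threaded:  {multi:.0%}")
```

Changing the IPC ratio or the clocks shows how sensitive the single-threaded figure is to the turbo-clock guess.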

If AMD can pull this off, their competitive position will be considerably improved, since they're currently hopelessly outmatched regardless of the number of threads, but I would be (very pleasantly) shocked if Zen managed to come within 10% of Kaby Lake's single-threaded performance. Still, it's a brand new architecture, so we can also reasonably expect that there's more low-hanging fruit in Zen than Kaby Lake, therefore that AMD will manage relatively larger incremental performance improvements for the next few years, potentially narrowing the gap a bit.

eastmen · Aug 19, 2016

I think it's foolish to compare clock speeds. Their current CPUs are on 32nm and clock at over 4GHz.

Comparing Intel and AMD is just as silly as comparing the new AMD stuff to the old.

hoom · Aug 20, 2016

Well, AMD has already released various chips officially clocked over 4GHz, up to the FX-9590 with a 4.7GHz base clock and 5.0GHz turbo... (but with a 220W TDP)
If they can release in those ranges with this IPC & low TDPs, Intel may have to do some serious work.

xEx · Aug 20, 2016

You can't compare Bulldozer with Zen; they are two completely different architectures. BZ was a good concept but a horrible implementation; it was so bad that AMD decided to throw the whole architecture in the trash and start a new one. Just for reference, BZ needed to start at 5GHz to be competitive against Intel; of course AMD couldn't get enough money to make it happen, and we all know how that ended.

I think that having ES running at 3GHz about 6 months before release is a very good sign. I don't know if they will be able to get to 4GHz, but the level of performance is still pretty good. One thing to note is that AMD said they are not matching TDP for TDP, so expect the AMD chip to consume more. How much? We will have to find out.

To me the more interesting thing is to see how well and how quickly Intel reacts to this. They have been making only minor evolutions to their designs in order to save money, so now they will have to respond to AMD or they could be in serious trouble in 1 or 2 years. The CPU race looks interesting once more. Let's hope this time it lasts.

hoom · Aug 20, 2016

Well I guess the point is they have direct experience with getting a CPU to run reliably for commercial release at those kind of clocks, I'd hope some lessons learned would help with getting Zen clocks decently high.

fehu · Aug 20, 2016

Bulldozer was built on the concept that integer instructions are more common and important than float, what changed now?

Alexko · Aug 20, 2016

hoom said:
Well I guess the point is they have direct experience with getting a CPU to run reliably for commercial release at those kind of clocks, I'd hope some lessons learned would help with getting Zen clocks decently high.

Yeah but Bulldozer/Piledriver was designed specifically for it (and even so, fell short). It also ran on a process that was tuned for high clock speeds. At 28nm, Steamroller is clocked lower. At 28nm with high-density libraries (HDL), Excavator is clocked lower still.

Since HDL are as much about power efficiency (at moderate clock speeds) as they are about density, and since Zen is supposed to replace both Excavator and Puma, I fully expect AMD to keep using them at 14nm. Further, this 14nm process is not tuned for high clock speeds, it's a generic process aimed at SoCs for handheld devices, mobile CPUs/APUs, GPUs, etc.

I certainly hope I'm wrong, but I don't see AMD hitting 4GHz with Zen, at least not initially.

pTmdfx · Aug 20, 2016

fehu said:
Bulldozer was built on the concept that integer instructions are more common and important than float, what changed now?

Not exactly. At least for the integer pipeline, it is built on the concept that multithreading is enough to outpace the potential loss caused by sharing "non-critical" resources. It did deliver in this regard (better scaling than SMT), but the scaling base was so low that it was a step backwards from K10. Then, when they wanted to scale single-thread performance, they might have hit a point where standalone cores with SMT make better sense, e.g. when trying to scale the decoder to more than 4 ways, etc.

Or maybe they just want to play safe.

IMO the sharing of the "FP" unit (mind you, it runs packed integer too) should be considered separately. I would say the race for multithreading with "good single thread performance" probably drove this sharing for higher area efficiency and lower cost; it just turned out to hurt too.

sebbbi · Aug 20, 2016

fehu said:
Bulldozer was built on the concept that integer instructions are more common and important than float, what changed now?

Sharing the FP/vector unit (or anything) between two cores has a cost (synchronization, latency and distance). A dedicated FP/vector unit can be powered down during idle periods. It costs a bit more area, but is way more power efficient. A similar trend can be seen on GPUs (esp. Nvidia).

sebbbi · Aug 21, 2016

itsmydamnation said:
Why do people say this?

Does google care about 8 let alone 16-wide AVX?
Does facebook care about 8 let alone 16-wide AVX
does IaaS care about 8 let alone 16 wide AVX
does Enterprise care about 8 let alone 16 wide AVX

Now let's look at the revenue of that market: it's, what, around 15-20 billion a year...

Now, I can't say for certain about Google and Facebook, but I have direct experience in designing datacentres for the last two. No one cares about AVX; lots of "solid IPC cores" with lots of memory in a small form factor is key to the pricing model. It looks like Zen delivers that in spades, has lots of onboard 10GbE and possibly (I hope they do) crypto/compress engines.

Zen doesn't have to be everything to everyone, and the silent majority of customers and workloads don't care about AVX. I stake my 10 years of design work in the IaaS and Enterprise Datacentre space on that.

I'm far more concerned about the previously mentioned perf impact of 2P over 1P; that's something to actually care about...

AVX-512 has native support for both scatter and gather. It also has lane masks. It was very hard to automatically compile loops to SSE/AVX(1&2), because scatter and general lane masking were not supported (AVX2 added gather, but not scatter). Only simple loops with no index/pointer data reads/writes were supported. AVX-512 makes autovectorization much simpler and more efficient. It also allows efficient execution of simpler SPMD-style programming models instead of writing machine intrinsics. Instruction sets must evolve to make the programmer's life easier. AVX-512 is a great example of this. It's not super relevant today, but as compilers evolve, it will become more important for this segment too.
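
To make the point about gather, scatter and lane masks concrete, here is a rough pure-Python model of the semantics. It is only a sketch: the helper names (`masked_gather`, `masked_scatter`, `vector_loop`) are made up for illustration and are not intrinsics or any real API. It takes a loop with an indexed read, a condition and an indexed write, and shows how it maps onto per-lane masked operations.

```python
def masked_gather(memory, indices, mask, fallback):
    """Per lane: load memory[indices[i]] where mask[i] is set, else keep fallback[i]."""
    return [memory[i] if m else f for i, m, f in zip(indices, mask, fallback)]


def masked_scatter(memory, indices, mask, values):
    """Per lane: store values[i] to memory[indices[i]] where mask[i] is set."""
    for i, m, v in zip(indices, mask, values):
        if m:
            memory[i] = v


def scalar_loop(src, dst, idx_in, idx_out):
    """Reference loop: indexed read, condition, indexed write."""
    for k in range(len(idx_in)):
        x = src[idx_in[k]]
        if x > 0:
            dst[idx_out[k]] = x * 2


def vector_loop(src, dst, idx_in, idx_out, lanes=16):
    """Same loop expressed as gather -> compare (builds mask) -> masked scatter.

    Assumes src and dst do not alias. The last chunk may be shorter than
    `lanes`; real hardware would handle that tail with a mask as well.
    """
    for start in range(0, len(idx_in), lanes):
        in_chunk = idx_in[start:start + lanes]
        out_chunk = idx_out[start:start + lanes]
        all_on = [True] * len(in_chunk)
        x = masked_gather(src, in_chunk, all_on, [0] * len(in_chunk))
        mask = [v > 0 for v in x]          # the `if` becomes a lane mask
        doubled = [v * 2 for v in x]       # computed for every lane
        masked_scatter(dst, out_chunk, mask, doubled)  # stored only where mask is set


src = [3, -1, 4, -1, 5, -9, 2, 6, -5, 3]
idx_in = [9, 0, 5, 2, 7, 1, 4, 8, 3, 6]
idx_out = [0, 2, 4, 6, 8, 1, 3, 5, 7, 9]

expected = [0] * 10
scalar_loop(src, expected, idx_in, idx_out)

actual = [0] * 10
vector_loop(src, actual, idx_in, idx_out, lanes=4)

print(actual == expected)
```

The branch inside the loop turns into a mask, so no lane has to leave the vector path; that is the part a compiler needs in order to vectorize this kind of loop automatically.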

TSX is another great extension. TSX makes shared data structures (such as memory allocators) possible with no need of any locking (CPU tags cache lines to transactions and supports roll back). In many cases this provides nice performance boosts. There is still no proof that Zen supports TSX. The forthcoming Skylake Xeons will.
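
As a loose software analogy of the begin/commit/roll-back flow (not how TSX is implemented, and not a real API), here is a single-threaded sketch. `VersionedCell`, `Transaction` and `atomic_update` are hypothetical names; a version counter stands in for the cache-line tagging the hardware does. Actual TSX code also keeps a fallback path (typically a lock), because a hardware transaction is never guaranteed to commit.

```python
class VersionedCell:
    """Shared value plus a version counter that changes on every commit."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Transaction:
    """Optimistic transaction: work on a private copy, validate at commit."""

    def __init__(self, cell):
        self.cell = cell
        self.start_version = cell.version  # snapshot taken at "begin"
        self.pending = cell.value          # private working copy

    def commit(self):
        # Conflict: someone else committed since we began, so roll back
        # by simply discarding the private copy.
        if self.cell.version != self.start_version:
            return False
        self.cell.value = self.pending
        self.cell.version += 1
        return True


def atomic_update(cell, fn, max_retries=10):
    """Retry the transaction until it commits or the retry budget runs out."""
    for _ in range(max_retries):
        tx = Transaction(cell)
        tx.pending = fn(tx.pending)
        if tx.commit():
            return True
    return False


# Two overlapping transactions on the same cell: the first commit wins,
# the second detects the conflict and is rolled back.
cell = VersionedCell(100)
tx_a = Transaction(cell)
tx_b = Transaction(cell)
tx_a.pending += 1
tx_b.pending += 10
a_committed = tx_a.commit()
b_committed = tx_b.commit()

# Retrying the aborted update from a fresh snapshot succeeds.
retried = atomic_update(cell, lambda v: v + 10)
print(a_committed, b_committed, retried, cell.value)
```

When conflicts are rare, the common path never waits on anyone, which is where the performance boosts mentioned above would come from.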

I agree with you that 8 core Zen will sell like hot cakes if it compares favorably in perf/watt to Intel's dense, power-optimized 8 core Xeon D (designed to match Facebook's needs). A limited clock ceiling doesn't matter in this segment, as Xeon D is running at 2.1 GHz.

Google is investing heavily in algorithm design. I have experimented with some of their open source code. They certainly do embrace vector instructions, even the newest ones. One example of AVX2 usage: https://github.com/google/highwayhash.

sebbbi · Aug 22, 2016

itsmydamnation said:
Does google care about 8 let alone 16-wide AVX?
Does facebook care about 8 let alone 16-wide AVX
does IaaS care about 8 let alone 16 wide AVX
does Enterprise care about 8 let alone 16 wide AVX

ARM Scalable Vector Extensions (SVE) announced today. Up to 2048 bit vector width. Target = HPC and data centers. Good to see AVX-512 getting some competition. Now both ARM and x86 have wide vector support.
http://www.anandtech.com/show/10586...tor-extensions-aiming-for-hpc-and-data-center

itsmydamnation · Aug 22, 2016

sebbbi said:
AVX-512 has native support for both scatter and gather. It also has lane masks. It was very hard to automatically compile loops to SSE/AVX(1&2), because scatter and general lane masking were not supported (AVX2 added gather, but not scatter). Only simple loops with no index/pointer data reads/writes were supported. AVX-512 makes autovectorization much simpler and more efficient. It also allows efficient execution of simpler SPMD-style programming models instead of writing machine intrinsics. Instruction sets must evolve to make the programmer's life easier. AVX-512 is a great example of this. It's not super relevant today, but as compilers evolve, it will become more important for this segment too.

Yes, I am aware of the improvements from scatter and gather. What I don't think you high-end engine/gfx people understand is just what enterprise applications look like. For example, what instruction sets do MS Exchange, AD, Objective, TRIM, HP CSA, or the farms of XenApp or XenDesktop use, let alone the in-house apps or the sea of just rubbish enterprise apps? They are horribly coded loops within loops, with conditions, jumps, etc. everywhere, and nowhere near using a SIMD instruction set. Let's not even pretend that the average code monkey working on the average code base writes well-threaded code.

TSX is another great extension. TSX makes shared data structures (such as memory allocators) possible with no need of any locking (CPU tags cache lines to transactions and supports roll back). In many cases this provides nice performance boosts. There is still no proof that Zen supports TSX. The forthcoming Skylake Xeons will.

So AMD might have something; we will probably know in a day or two. As core counts grow you would expect them to have something, with the follow-on to Zen rumored to have 48 cores. But every *aaS, VMware farm, etc. is full of guests with low core counts; they make up the vast majority of server farms and cloud instances (outside big DBs). With things like Docker you don't run bigger VMs, you just spin up more.

I agree with you that 8 core Zen will sell like hot cakes if it compares favorably in perf/watt to Intel's dense power optimized 8 core Xeon D. Limited clock ceiling doesn't matter as Xeon D is running at 2.1 Ghz.

I think across the board, from 8 to 32 cores, it's looking pretty good; we have people with access to hardware in this thread saying so.

itsmydamnation · Aug 22, 2016

https://www.computerbase.de/2016-08/amd-zen-architektur/

slides inside, nothing about tsx, but there isn't a lot of specific detail either.......

fehu · Aug 22, 2016

http://dresdenboy.blogspot.it/2016/08/two-days-to-go-until-amds-hot-chips.html
