AMD RyZen CPU Architecture for 2017

The PassMark Zen entry has terrible memory: 2400 17-17-17. On my IVB @ 4.3 GHz, going from DDR3-2000 10-11-10 to DDR3-1066 7-7-7 (still lower latency than Zen) drops my CPU score by 800 points, mainly from the prime number and physics subtests. Going to 1066 10-11-10 drops it another 800 points, so it's definitely latency related, not throughput related.
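For reference, the latencies being compared can be estimated from the timings alone. This is a minimal sketch, assuming "latency" here means first-word CAS latency (CL cycles at a memory clock of half the data rate); the function name `cas_latency_ns` is made up for illustration.

```python
def cas_latency_ns(data_rate_mts: float, cl: int) -> float:
    """Approximate CAS latency in ns: CL cycles at a clock of data_rate / 2 MHz."""
    clock_mhz = data_rate_mts / 2  # DDR memory transfers twice per clock
    return cl / clock_mhz * 1000   # cycles / MHz gives microseconds, so scale to ns


# (label, data rate in MT/s, CAS latency in cycles), taken from the post above
configs = [
    ("2400 17-17-17 (PassMark Zen entry)", 2400, 17),
    ("2000 10-11-10", 2000, 10),
    ("1066 7-7-7", 1066, 7),
    ("1066 10-11-10", 1066, 10),
]

for label, rate, cl in configs:
    print(f"{label}: {cas_latency_ns(rate, cl):.1f} ns")
```

Under that assumption, 1066 7-7-7 (roughly 13 ns) is still slightly quicker than 2400 17-17-17 (roughly 14 ns), and the two 1066 settings differ in latency only, not in bandwidth.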
 
http://pc.watch.impress.co.jp/docs/column/kaigai/1043349.html

From everyone's favorite Japanese man: new Zen details from ISSCC.
I feel like this post needs some more attention.

Google translate seems to be doing a good job for Japanese & the slides are in English.
Those aren't even normal PR gumf/fake leaked slides, they're official 2017 AMD slides for a presentation to IEEE & with core/module shots :D

[Attached slides: 4.jpg, 5.jpg]


If AMD has really pulled off smaller cores, similar clocks, similar IPC & similar TDP vs Intel despite weaker process tech that's a hell of a comeback! :yes:
 
If AMD has really pulled off smaller cores, similar clocks, similar IPC & similar TDP vs Intel despite weaker process tech that's a hell of a comeback!

I think that's the big picture, yeah. However, judging by the opinions of the reputed armchair die-pic analysts on the web, FPU performance (AVX only?) seems to suffer a bit more on Zen. So there's still no free lunch, unfortunately.
 
Probably the most interesting part of this leak is the existence of two TDP levels for the whole stack of Ryzen CPUs. Whether we take the Ryzen 3 1100 or the Ryzen 7 1700, both will have a TDP of 65W. However, it’s important to note that TDP stands for Thermal Design Power: it describes the heat the cooler has to handle, not the chip's power consumption.

Three processors: Ryzen 7 1800X, Ryzen 7 1700X and Ryzen 5 1600X will not only have 95W TDP by specs, but also a special kind of cooler as a requirement (HS81). According to the leaked datasheet, this cooler will ensure that the temperature will not exceed 60C (in most cases). This, as a result, should increase the overclocking potential thanks to new technology called Extended Frequency Range (XFR), which basically allows clock speeds to scale with cooling solutions, without a requirement of user intervention. What we don’t know is whether this technology will be enabled just for the SKUs with the “X”, or for other models as well.
[Image: AMD Ryzen CPU list]

[Image: AMD Ryzen cooler models]
https://videocardz.com/65892/amd-ryzen-7-1800x-1700x-and-ryzen-5-1600x-will-require-special-coolers
 
The OPN numbers in the leak above are legitimate. I can see them in the listings from one of the biggest computer wholesalers in Europe.

Also, regarding yesterday's results from PassMark: the relatively low Physics and Prime scores are due to the high memory latency of the Ryzen test machine (14ns). A member on another forum played with timings on his Ivy Bridge processor (keeping the CPU clock constant @ 4.3GHz):

Code:
Memory (DDR3)                 1066 10-11-10  1066 7-7-7  1333 8-8-8  1333 7-7-7  2000 10-11-10
Latency                       18.7ns         13ns        12ns        10.5ns      10ns
CPU Mark                      9230           9932        10140       10516       10673
Integer Math                  19408          19862       17076       19445       19504
Floating Point Math           8121           8305        8182        8457        8336
Prime Numbers                 19.8           25.5        29.1        29.9        31.5
Extended Instructions (SSE)   225.8          227.5       220.0       228.5       219.1
Compression                   14193          15206       14959       15119       14806
Encryption                    2024           2078        2040        2074        1944
Physics                       359.7          413.9       476.2       503         597
Sorting                       8723           8589        8557        8761        8577
CPU Single Threaded           2370           2327        2367        2375        2358

Tested by itsmydamnation over at Anand's
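To see which subtests react to latency alone, the two 1066 runs above can be compared directly, since they share a data rate and differ only in timings. A rough sketch using the posted scores; the variable and function names are illustrative only.

```python
# Scores copied from the two 1066 runs above (same data rate, different timings).
loose = {  # 1066 10-11-10 (18.7ns)
    "CPU Mark": 9230, "Integer Math": 19408, "Floating Point Math": 8121,
    "Prime Numbers": 19.8, "Extended Instructions (SSE)": 225.8,
    "Compression": 14193, "Encryption": 2024, "Physics": 359.7,
    "Sorting": 8723, "CPU Single Threaded": 2370,
}
tight = {  # 1066 7-7-7 (13ns)
    "CPU Mark": 9932, "Integer Math": 19862, "Floating Point Math": 8305,
    "Prime Numbers": 25.5, "Extended Instructions (SSE)": 227.5,
    "Compression": 15206, "Encryption": 2078, "Physics": 413.9,
    "Sorting": 8589, "CPU Single Threaded": 2327,
}


def pct_gain(before: float, after: float) -> float:
    """Percentage change going from `before` to `after`."""
    return (after - before) / before * 100


gains = {name: pct_gain(loose[name], tight[name]) for name in loose}
ranked = sorted(gains.items(), key=lambda kv: kv[1], reverse=True)

for name, gain in ranked:
    print(f"{name:28s} {gain:+6.1f}%")
```

Sorted this way, Prime Numbers and Physics come out on top by a wide margin, while the remaining subtests move by single-digit percentages in either direction. It is a single pair of runs, so small differences may just be run-to-run noise.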
 
However, judging by the opinions of the reputed armchair die-pic analysts on the web, FPU performance (AVX only?) seems to suffer a bit more on Zen
I guess the big question then is if that is something likely to be a common load for Consumers/Gamers/Supercomputers & to what extent GPU compute can be used to offset it.
 
The FX chips are not in the same category. In Ryzen's case I think a Ryzen core will be more powerful than an SMT thread of Intel's. What you are getting is an unlocked i5 for the price of an i3.

 
This would be a useful post if your point wasn't so stupid...........
Ayy, I'm not the one equating cores.

$182 CPU from Intel is going to compete with "$199" AMD CPU, "$175" - at best, not with the "$129" one.

And my point was that a lacking (non-existent) low end is a problem. A $64 Intel chip is more than sufficient for most people right now, so why would they pay 2 times as much for something that isn't even 1.5 times better?
 
Ayy, I'm not the one equating cores.

$182 CPU from Intel is going to compete with "$199" AMD CPU, "$175" - at best, not with the "$129" one.

And my point was that a lacking (non-existent) low end is a problem. A $64 Intel chip is more than sufficient for most people right now, so why would they pay 2 times as much for something that isn't even 1.5 times better?
Feel free to post the benchmarks that back your statement.

Hell, Intel's own quads are 3x the price for "not even 1.5 times better" compared to an overclocked Pentium, so I am not sure what your point is.
 
Ayy, I'm not the one equating cores.

$182 CPU from Intel is going to compete with "$199" AMD CPU, "$175" - at best, not with the "$129" one.
Maybe you should educate yourself on what the Zen core looks like.
Here's a rundown.
L1D latency/size/ways = same as Skylake
L1I latency/size/ways = twice the size of Skylake, half the ways
L2 latency/size/ways = twice the size of Skylake, same latency, don't know the ways off the top of my head
L3 latency/size/ways = low latency within a CCX, same size (4 cores to 4 cores), same ways; L3 is exclusive in Zen vs inclusive in Skylake
decode = approx. the same size (4-wide), reads 32 bytes a cycle vs 16(?) for Skylake
op cache = both around the same size, both 6 uops to dispatch
register files/load-store queues/schedulers/retirement = all approx. the same as Skylake, some sizes more like Broadwell; both can retire 8 ops a cycle
both cores have SMT
ALU = both have 4 integer ALUs; Zen doesn't share ports with the FPU like Skylake does
AGU = Skylake can generate one more address
load/store = both can do 2 loads and 1 store a cycle, Zen's max size being 128-bit while Skylake's is 256-bit
branch = both can do two branches a cycle; Zen doesn't share a port with the FPU like Skylake does
FPU = Zen has 2x 128-bit FMA/MUL & 2x 128-bit ADD; Skylake has 2x 256-bit FMA and 1x 256-bit misc ops port
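To put the load/store and FPU width differences from the rundown into numbers, here is a back-of-the-envelope sketch. It uses only the pipe counts and widths listed above, counts FMA pipes only (Zen's separate ADD pipes are ignored), and says nothing about clocks or sustained throughput. The function names are invented for this example.

```python
def peak_fma_flops_per_cycle(fma_pipes: int, vector_bits: int, element_bits: int = 32) -> int:
    """Peak FLOPs per cycle per core from the FMA pipes alone."""
    lanes = vector_bits // element_bits
    return fma_pipes * lanes * 2  # one FMA counts as a multiply plus an add


def peak_load_bytes_per_cycle(loads_per_cycle: int, max_load_bits: int) -> int:
    """Peak L1 load bandwidth per cycle per core, in bytes."""
    return loads_per_cycle * max_load_bits // 8


# Figures as given in the rundown above (poster's numbers, not vendor documentation).
zen_flops = peak_fma_flops_per_cycle(fma_pipes=2, vector_bits=128)
skylake_flops = peak_fma_flops_per_cycle(fma_pipes=2, vector_bits=256)
zen_load = peak_load_bytes_per_cycle(loads_per_cycle=2, max_load_bits=128)
skylake_load = peak_load_bytes_per_cycle(loads_per_cycle=2, max_load_bits=256)

print(f"Single-precision FMA FLOPs/cycle: Zen {zen_flops}, Skylake {skylake_flops}")
print(f"Load bytes/cycle:                 Zen {zen_load}, Skylake {skylake_load}")
```

On these figures the gap only shows up for code that actually uses 256-bit operations; with 128-bit SSE code both cores would be limited to the same per-pipe width.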

Then we are into the harder-to-gauge parts: both have high-quality prefetch, prediction, L2-based prefetchers etc.
Zen has an integer checkpoint/rollback unit that I haven't found any detail on.
Zen has a stack engine and local storage for calculating/holding the stack address.
Zen's mispredict penalty is listed as 3 cycles shorter (I assume against the Con cores), making it comparable to Skylake @ 17-19 cycles.
The 95 W 8-core part consumes around the same power as the 6900K at around the same clocks for around the same performance in BF1, the Blender demo (which you can download and run yourself) and x264.

So why exactly can't we equate cores? Be specific, please!
Before you try to say they would price like Intel if it were performing like Intel, RV770 would like to say hello. This is the exact same position: the recovery chip after the previous disaster (R600 / Con cores).



And my point was that a lacking (non-existent) low end is a problem. A $64 Intel chip is more than sufficient for most people right now, so why would they pay 2 times as much for something that isn't even 1.5 times better?
What's that got to do with anything? That part comes later. This one AMD chip is doing the job of four(!) Intel chips: the mid and high side of mainstream (Sky/Kaby Lake), Xeon D, Xeon E and Xeon EP. I'm sure you're the kind of person who would argue the 1070 & 1080 aren't making stupid amounts of money for NV because most people only need a $50 GPU...
 
I think that's the big picture, yeah. However, judging by the opinions of the reputed armchair die-pic analysts on the web, FPU performance (AVX only?) seems to suffer a bit more on Zen. So there's still no free lunch, unfortunately.
To add to this point, a rough check of the pixel area of what I think is the vector unit of the 6700K takes possibly around 23% of the core+L2 area. Zen's rough labeling of what I think is equivalent is around 16%.
I think getting double throughput could at a minimum increase the area of the Zen block by a third. Depending on how that goes, that can add back ~1.2 mm2 for the overall block.

The integer and load/store block may be another source of area savings. It doesn't sport the additional store AGU, and the data paths are half the width of Skylake's. What is unknown is how much area some of the other Intel features, like transactional memory, add to the pipeline and L1 cache.

The L2 density comparison probably includes the interface/control logic for both processors. This has an outsized effect on the smaller L2, since the denser SRAM arrays can scale more readily while the logic overhead is less flexible. This is on top of the higher bandwidth of the Intel L2 (more burden on the non-SRAM component), and possibly the removal of sources of bank conflicts in the cache per Agner's optimization doc. It is fair to say that AMD's choice appears to give more capacity per unit of area, but at a minimum it comes at the cost of bandwidth, and possibly there are banking conflicts. Bulldozer cores, at least up to Steamroller, had frequent L1 bank conflicts (Excavator was not documented). The L2 was generally terrible, so I am not sure whether banking issues were part of that mass of problems.

The L3 comparison is also interesting because the L3 of the 6700K appears to have a packing problem. The lower side of the quad-core area appears to take up more room than the L3 arrays can fill. The comparison between the more tightly packed Zen module and Skylake may be including some of that dead space. That's perhaps a fair point to make, but the reason would be that Intel's LLC and interconnect are designed to scale readily beyond 4 cores per ring. What that means for performance or the higher CCX counts remains to be seen.

The area win may be a bit more modest than advertised, and it does come as a trade-off in terms of throughput in various sections of the CCX, and possibly a different slope to the multicore scaling curve.
 