AMD RyZen CPU Architecture for 2017

Discussion in 'PC Industry' started by fellix, Oct 20, 2014.

  1. digitalwanderer

    digitalwanderer Dangerously Mirthful
    Legend

    Joined:
    Feb 19, 2002
    Messages:
    18,992
    Likes Received:
    3,532
    Location:
    Winfield, IN USA
    Shhhh! I wanted him to think that Raja was just REALLY passionate about the Ryzen!
     
  2. itsmydamnation

    Veteran

    Joined:
    Apr 29, 2007
    Messages:
    1,349
    Likes Received:
    470
    Location:
    Australia
    The Zen system in the PassMark result has terrible memory: 2400 at 17-17-17. On my IVB @ 4.3 GHz, going from DDR3-2000 10-11-10 to DDR3-1066 7-7-7 (still lower latency than the Zen system) drops my CPU score by 800 points, mainly from the Prime Numbers and Physics subtests. Going to 1066 10-11-10 drops it another 800 points, so it's definitely latency related, not throughput related.
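As a rough cross-check of those timings, first-word CAS latency in nanoseconds can be estimated as the CAS cycle count divided by the memory clock, which is half the DDR transfer rate. This is a minimal sketch, assuming the quoted speeds are transfer rates in MT/s and ignoring tRCD/tRP; the helper name `cas_latency_ns` is made up for illustration.

```python
def cas_latency_ns(transfer_rate_mts: float, cas_cycles: int) -> float:
    """Approximate CAS latency in nanoseconds for DDR memory.

    DDR transfers data twice per clock, so the memory clock in MHz is
    half the transfer rate and one cycle lasts 2000 / rate nanoseconds.
    """
    return cas_cycles * 2000.0 / transfer_rate_mts


# (transfer rate, CAS cycles) pairs quoted in the post; labels are illustrative
configs = {
    "Zen test system, 2400 CL17": (2400, 17),
    "IVB, 2000 CL10": (2000, 10),
    "IVB, 1066 CL7": (1066, 7),
    "IVB, 1066 CL10": (1066, 10),
}

for name, (rate, cl) in configs.items():
    print(f"{name}: {cas_latency_ns(rate, cl):.1f} ns")
```

Under these assumptions 1066 CL7 lands a little above 13 ns and 2400 CL17 a little above 14 ns, which is consistent with the "still lower latency than Zen" remark.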
     
    BRiT likes this.
  3. hoom

    Veteran

    Joined:
    Sep 23, 2003
    Messages:
    3,264
    Likes Received:
    813
    I feel like this post needs some more attention.

    Google translate seems to be doing a good job for Japanese & the slides are in English.
    Those aren't even the normal PR guff or fake leaked slides; they're official 2017 AMD slides for a presentation to IEEE, with core/module shots :grin:

    [​IMG]
    [​IMG]

    If AMD has really pulled off smaller cores, similar clocks, similar IPC & similar TDP vs Intel despite weaker process tech that's a hell of a comeback! :yes:
     
    BRiT likes this.
  4. kalelovil

    Regular

    Joined:
    Sep 8, 2011
    Messages:
    568
    Likes Received:
    104
    The Bullet Physics library it is testing similarly underperforms on Apple's A7 and newer, despite those otherwise being very capable cores.
    https://www.futuremark.com/pressrel...results-from-the-apple-iphone-5s-and-ipad-air
     
  5. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    832
    Likes Received:
    505
    He is even trying to hide Vega.
     
  6. entity279

    Veteran Subscriber

    Joined:
    May 12, 2008
    Messages:
    1,332
    Likes Received:
    500
    Location:
    Romania
    I think that's the big picture, yeah. However, judging by the opinions of the reputed armchair die-shot analysts on the web, FPU performance (AVX only?) seems to suffer a bit more on Zen. So there's still no free lunch, unfortunately.
     
  7. pharma

    Veteran

    Joined:
    Mar 29, 2004
    Messages:
    4,894
    Likes Received:
    4,548
    https://videocardz.com/65892/amd-ryzen-7-1800x-1700x-and-ryzen-5-1600x-will-require-special-coolers
     
    Lightman likes this.
  8. Lightman

    Veteran Subscriber

    Joined:
    Jun 9, 2008
    Messages:
    1,969
    Likes Received:
    963
    Location:
    Torquay, UK
    The OPN numbers in the leak above are legitimate. I can see them in the listings from one of the biggest computer wholesalers in Europe.

    Also, regarding yesterday's results from PassMark: the relatively low Physics and Prime scores are due to the high memory latency of the Ryzen test machine (14ns). A member on another forum played with timings on his Ivy Bridge processor (keeping the CPU clock constant at 4.3GHz):

    Code:
    Memory                       1066 10-11-10  1066 7-7-7  1333 8-8-8  1333 7-7-7  2000 10-11-10
    Latency                      18.7ns         13ns        12ns        10.5ns      10ns
    CPU Mark                     9230           9932        10140       10516       10673
    Integer Math                 19408          19862       17076       19445       19504
    Floating Point Math          8121           8305        8182        8457        8336
    Prime Numbers                19.8           25.5        29.1        29.9        31.5
    Extended Instructions (SSE)  225.8          227.5       220.0       228.5       219.1
    Compression                  14193          15206       14959       15119       14806
    Encryption                   2024           2078        2040        2074        1944
    Physics                      359.7          413.9       476.2       503         597
    Sorting                      8723           8589        8557        8761        8577
    CPU Single Threaded          2370           2327        2367        2375        2358
    Tested by itsmydamnation over at Anand's
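To see which subtests actually track memory latency, the figures listed above can be compared between the slowest and fastest memory settings. This is only a sketch over the posted numbers (single runs, no error bars); the helper name `gain_pct` is illustrative.

```python
# Scores copied from the listing above, ordered from highest to lowest
# memory latency: 18.7 ns, 13 ns, 12 ns, 10.5 ns, 10 ns.
scores = {
    "CPU Mark": [9230, 9932, 10140, 10516, 10673],
    "Integer Math": [19408, 19862, 17076, 19445, 19504],
    "Floating Point Math": [8121, 8305, 8182, 8457, 8336],
    "Prime Numbers": [19.8, 25.5, 29.1, 29.9, 31.5],
    "Extended Instructions (SSE)": [225.8, 227.5, 220.0, 228.5, 219.1],
    "Compression": [14193, 15206, 14959, 15119, 14806],
    "Encryption": [2024, 2078, 2040, 2074, 1944],
    "Physics": [359.7, 413.9, 476.2, 503, 597],
    "Sorting": [8723, 8589, 8557, 8761, 8577],
    "CPU Single Threaded": [2370, 2327, 2367, 2375, 2358],
}


def gain_pct(values):
    """Percent change from the highest-latency run to the lowest-latency run."""
    return (values[-1] - values[0]) / values[0] * 100.0


# Rank subtests by how much they gain when latency drops from 18.7 ns to 10 ns.
ranked = sorted(scores, key=lambda name: gain_pct(scores[name]), reverse=True)
for name in ranked:
    print(f"{name:28s} {gain_pct(scores[name]):+6.1f}%")
```

With these figures Physics and Prime Numbers gain roughly 66% and 59% respectively, while most of the other subtests move by only a few percent in either direction.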
     
    Kej and pharma like this.
  9. xEx

    xEx
    Veteran

    Joined:
    Feb 2, 2012
    Messages:
    1,060
    Likes Received:
    543
  10. hoom

    Veteran

    Joined:
    Sep 23, 2003
    Messages:
    3,264
    Likes Received:
    813
    I guess the big question then is whether that is likely to be a common load for consumers/gamers/supercomputers, and to what extent GPU compute can be used to offset it.
     
  11. Transponster

    Newcomer

    Joined:
    Feb 24, 2016
    Messages:
    74
    Likes Received:
    13
  12. Arzachel

    Newcomer

    Joined:
    Jul 23, 2013
    Messages:
    28
    Likes Received:
    22
    Cheapest Zen - $129, cheapest Intel with 4 cores - $182, might be a tough sell.
     
    Alexko, Malo and xEx like this.
  13. xEx

    xEx
    Veteran

    Joined:
    Feb 2, 2012
    Messages:
    1,060
    Likes Received:
    543
    Plus the overclocking and cheaper platform.

    Sent from my HTC One using Tapatalk
     
  14. Transponster

    Newcomer

    Joined:
    Feb 24, 2016
    Messages:
    74
    Likes Received:
    13
    That would have been funny, if it wasn't so FX.
     
    #874 Transponster, Feb 13, 2017
    Last edited: Feb 13, 2017
  15. xEx

    xEx
    Veteran

    Joined:
    Feb 2, 2012
    Messages:
    1,060
    Likes Received:
    543
    FXs are not in the same category. In Ryzen's case, I think a Ryzen core will be more powerful than one of Intel's SMT threads. What you are getting is an unlocked i5 for the price of an i3.

    Sent from my HTC One using Tapatalk
     
  16. itsmydamnation

    Veteran

    Joined:
    Apr 29, 2007
    Messages:
    1,349
    Likes Received:
    470
    Location:
    Australia
    This would be a useful post if your point wasn't so stupid...........
     
  17. Transponster

    Newcomer

    Joined:
    Feb 24, 2016
    Messages:
    74
    Likes Received:
    13
    Ayy, I'm not the one equating cores.

    The $182 CPU from Intel is going to compete with the "$199" AMD CPU, or the "$175" one at best, not with the "$129" one.

    And my point was that a lacking (non-existent) low end is a problem. A $64 Intel is more than sufficient for most people right now, so why would they pay 2 times as much for something that isn't even 1.5 times better?
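The arithmetic behind that objection can be made explicit. This is a small sketch that assumes the "2 times as much" refers to the $129 part against the $64 one, and treats "1.5 times better" as the post's hypothetical upper bound rather than a measured result; the function name is illustrative.

```python
def perf_per_dollar_ratio(price_a: float, price_b: float, speedup_b_over_a: float) -> float:
    """Performance per dollar of part B relative to part A (1.0 means equal value)."""
    return speedup_b_over_a / (price_b / price_a)


# Prices quoted in the thread; the 1.5x speedup is the post's hypothetical figure.
ratio = perf_per_dollar_ratio(price_a=64, price_b=129, speedup_b_over_a=1.5)
print(f"price ratio: {129 / 64:.2f}x, perf per dollar: {ratio:.2f}x of the cheaper part")
```

On those assumed inputs the pricier part delivers about three quarters of the cheaper part's performance per dollar, which is the usual pattern when moving up a product stack rather than something specific to either vendor.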
     
  18. Arzachel

    Newcomer

    Joined:
    Jul 23, 2013
    Messages:
    28
    Likes Received:
    22
    Feel free to post the benchmarks that back your statement.

    Hell, Intel's own quads are 3x the price for "not even 1.5 better" compared to an overclocked Pentium, so I am not sure what your point is.
     
    digitalwanderer and RootKit like this.
  19. itsmydamnation

    Veteran

    Joined:
    Apr 29, 2007
    Messages:
    1,349
    Likes Received:
    470
    Location:
    Australia
    Maybe you should educate yourself on what the Zen core looks like.
    Here's a rundown:
    L1D latency/size/ways = same as Skylake
    L1I latency/size/ways = twice the size of Skylake, half the ways
    L2 latency/size/ways = twice the size of Skylake, same latency; I don't know the ways off the top of my head
    L3 latency/size/ways = low latency within a CCX, same size (4 cores vs 4 cores), same ways; L3 is exclusive in Zen vs inclusive in Skylake
    decode = approximately the same width (4 wide); reads 32 bytes a cycle vs 16(?) for Skylake
    op cache = both around the same size, both 6 uops to dispatch
    register files / load-store queues / schedulers / retirement = all approximately the same as Skylake, some sizes more like Broadwell; both can retire 8 ops a cycle
    Both cores have SMT
    ALU = both have 4 integer ALUs; Zen doesn't share ports with the FPU like Skylake does
    AGU = Skylake can generate one more address
    load/store = both can do 2 loads and 1 store a cycle, with Zen's maximum size being 128-bit while Skylake's is 256-bit
    branch = both can do two branches a cycle; Zen doesn't share a port with the FPU like Skylake does
    FPU = Zen has 2x 128-bit FMA/MUL and 2x 128-bit ADD; Skylake has 2x 256-bit FMA and 1x 256-bit misc ops port
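To put the FPU line in numbers, peak FMA throughput per core per cycle can be sketched from the pipe counts and widths listed above. This counts only the FMA pipes (one FMA counted as two floating-point operations) and ignores the separate ADD pipes, clock speeds, and load/store limits; the function name is illustrative.

```python
def peak_fma_flops_per_cycle(fma_pipes: int, vector_bits: int, element_bits: int) -> int:
    """Peak floating-point operations per cycle from FMA pipes alone."""
    lanes = vector_bits // element_bits  # elements processed per vector
    return fma_pipes * lanes * 2         # a fused multiply-add counts as 2 ops


# Pipe counts and widths as given in the rundown above.
for name, pipes, width in [("Zen", 2, 128), ("Skylake", 2, 256)]:
    sp = peak_fma_flops_per_cycle(pipes, width, 32)  # single precision
    dp = peak_fma_flops_per_cycle(pipes, width, 64)  # double precision
    print(f"{name}: {sp} SP / {dp} DP FLOPs per cycle")
```

By this count Skylake has twice the peak FMA throughput per core per cycle, which only matters for code that is actually vectorized with 256-bit FMA instructions.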

    Then we get into the harder-to-gauge parts: both have high-quality prefetch, prediction, L2-based prefetchers, etc.
    Zen has an integer checkpoint/rollback unit that I haven't found any detail on.
    Zen has a stack engine and local storage for calculating/holding the stack address.
    Zen's mispredict penalty is listed as 3 cycles shorter (I assume against the Con cores), making it comparable to Skylake at 17-19 cycles.
    The 95 W 8-core part consumes around the same power as the 6900K at around the same clocks for around the same performance in BF1, the Blender demo (which you can download and run yourself) and x264.

    So why exactly can't we equate cores? Be specific, please!
    Before you try to say they would price it like Intel if it performed like Intel, RV770 would like to say hello. This is the exact same position: the recovery chip after the previous disaster (R600 / Con cores).



    What's that got to do with anything? That part comes later. This one AMD chip is doing the job of 4(!) Intel chips: the mid and high side of mainstream (Sky/Kaby Lake), Xeon D, Xeon E and Xeon EP. I'm sure you're the kind of person who would argue the 1070 & 1080 aren't making stupid amounts of money for NV because most people only need a $50 GPU...
     
  20. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    To add to this point, a rough check of the pixel area suggests that what I think is the vector unit of the 6700K takes possibly around 23% of the core+L2 area. My rough labeling of what I think is the equivalent block in Zen comes to around 16%.
    I think getting double throughput could at a minimum increase the area of the Zen block by a third. Depending on how that goes, that can add back ~1.2 mm² for the overall block.

    The integer and load/store block may be another source of area savings. It doesn't sport the additional store AGU, and the data paths are half of Skylake's. Unknown is the contribution of some of the other Intel features like transactional memory on the pipeline and L1 cache.

    The L2 density comparison probably includes the interface/control logic for both processors. This has an outsized effect for the smaller L2, since the denser SRAM arrays can scale more readily while the logic overhead is less flexible. This is on top of the higher bandwidth of the Intel L2 (more burden on the non-SRAM component), and possibly the removal of sources of bank conflicts in the cache per Agner's optimization doc. It is fair to say that AMD's choice appears to give more capacity per unit of area, but at a minimum it comes at the cost of bandwidth, and possibly there are banking conflicts. Bulldozer cores at least up to Steamroller had frequent L1 bank conflicts (Excavator was not documented). The L2 was generally terrible, so I am not sure whether there were banking issues in that mass of problems.

    The L3 comparison is also interesting because the L3 of the 6700K appears to have a packing problem. The lower side of the quad-core area appears to take up more room than the L3 arrays can fill. The comparison between the more tightly packed Zen module and Skylake may be including some of that dead space. That's perhaps a fair point to make, but the reason would be that Intel's LLC and interconnect are designed to scale readily beyond 4 cores per ring. What that means for performance or the higher CCX counts remains to be seen.

    The area win may be a bit more modest than advertised, and it does come as a trade-off in terms of throughput in various sections of the CCX, and possibly a different slope to the multicore scaling curve.
     
    entity279 and Alexko like this.

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.