AMD RyZen CPU Architecture for 2017

That is spot on. I remember AMD arguing, when certain ops had lower throughput on Bulldozer than K8, that "we'll make it up on frequency". That was four years after Prescott. Apparently some lessons are hard to learn.
And latency. Jaguar has so much lower latencies in SIMD/FP instructions than Bulldozer.

IBM and Intel had already abandoned their high clock designs when Bulldozer was released. But everyone must try it once. At least AMD didn't do their own VLIW CPU like Intel and many others did... they did a VLIW GPU instead :)
 
More like Nehalem, since it's SMT and high clocks.

Also fingers crossed for the 4-core, 8-thread APUs with a Vega IGP. Not that I'm interested in buying one, but for the users who are it would be a gigantic step up.
 
Indeed, but at the cost of throughput rate. Jaguar was a low-latency design in general and that reflected on its clock-rate range.
Yes... but Jaguar still had an awfully slow shared L2 cache. Zen cores introduced a fast dedicated L2 cache similar to what Intel has had for a long time. But the shared L3 cache still seems to be much slower than Intel's, and it doesn't cover all cores. Cross-cluster data movement is still a problem. Zen is a huge improvement over AMD's previous designs, but still a bit behind Intel in some uncore features.
 

There is also something else unusual about the L3 cache, beyond the 8MB per CCX and how AMD should have represented it rather than as a 16MB shared L3.
Hardware.fr has shown that it is not behaving as expected when trying to use the L3 cache beyond certain thresholds on the same CCX, and this applies to all CPUs tested apart from the 1600X.
As an example, the 4-core part with 8MB (2 CCX) has a latency jump when L3 usage goes over 2MB rather than the expected 4MB, and the others with 16MB (2 CCX) have a latency jump after 4MB rather than after 8MB.

So something unusual is happening, but on the plus side the 1600X does not follow the same quirk, so maybe this is fixable, or the circumstances were just perfect for the 1600X.
This suggests either:
- further partitioning/segmentation-mesh of the L3 Cache for same CCX.
- An unusual quirk of the 2x1MB per core.
- An unusual quirk-bug with how applications can use the L3, which needs clarification from AMD and hardware.fr is still waiting for a response on the subject of the L3 behaviour.
- The 'mostly reserved' L3 cache means that part of it is reserved permanently or dynamic in some way (which may also apply to point 1).
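To make the numbers above concrete, here is a minimal sketch of the arithmetic, assuming the test thread stays on one CCX and the total L3 is split evenly across the two CCXs. The function name and the `observations` list are only illustrative; the figures are the ones quoted from the hardware.fr results in this post.

```python
def expected_l3_knee_mb(total_l3_mb, ccx_count=2):
    """Working-set size (MB) at which one CCX's share of the L3 runs out,
    assuming an even split of the total L3 across CCXs (assumption)."""
    return total_l3_mb / ccx_count

# (total L3 in MB, observed latency jump in MB) as quoted in the post above
observations = [(8, 2), (16, 4)]

for total, observed in observations:
    expected = expected_l3_knee_mb(total)
    ratio = observed / expected
    print(f"{total}MB total: expected jump at {expected:.0f}MB, "
          f"observed at {observed}MB (ratio {ratio:.2f})")
```

In both cases the observed jump lands at half of the per-CCX capacity, which is what the possible explanations listed above are trying to account for.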

[Graph: getgraphimg.php]



Separately, the large latency jump after 4MB (green) and 8MB (other SKUs) comes down to falling back to system memory. The L3 is a victim cache for the cores, and with thread affinity fixed, the other CCX would not have the data in this situation. It is unknown whether the 'shared' data can be pulled across CCXs with a penalty or must always come from system memory.
The 1600X (red) behaves as one would expect for 2x8MB L3 caches in the context mentioned, but this behaves differently on Intel with its decoupled ring-bus L3 cache.
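For anyone wanting to reproduce the "fixed thread affinity" condition, here is a rough sketch of pinning a process to one logical CPU on Linux using Python's `os.sched_setaffinity`. The helper name `pin_to_cpu` is made up for illustration, and which logical CPUs belong to which CCX is not stated in this thread, so that mapping has to be checked on the actual system.

```python
import os

def pin_to_cpu(cpu=None):
    """Pin the calling process to one logical CPU and return its number.

    Returns None where the affinity calls are unavailable (non-Linux).
    If cpu is None, the lowest-numbered currently allowed CPU is used.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = sorted(os.sched_getaffinity(0))
    target = allowed[0] if cpu is None else cpu
    os.sched_setaffinity(0, {target})  # pid 0 means the calling process
    return target

pinned = pin_to_cpu()
print("pinned to logical CPU:", pinned)
```

With the process pinned like this, the scheduler cannot move it to a core on the other CCX partway through a latency run.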

[Graph: getgraphimg.php]


Here is the simplified 1800X.

[Graph: getgraphimg.php]


But the 1600X suggests a solution may be possible.
Cheers
 
It seems odd that the 1600X's L3 latency behavior would differ from the 1600 in such a measurable fashion. The main difference would be the TDP and XFR, if all else were equal.
 
Yeah.
TBH the situation with the 1600X is rather confusing without more details from AMD regarding the cache structure and operation. As you say, not only does it ignore the trend of all other Ryzen SKUs, it also differs from the 1600, which has a similar core and L3 configuration.
Would be interesting if they investigated another 1600X and also tried an older microcode/BIOS on the 1600X they currently own, but for now the other SKUs are suffering from one of the points I mentioned.
Cheers
 
Are these both PCIe and RAID0-able?
If so, we're looking at ridiculous drive speeds here.

That table says so wrt both being PCIe x4, right? So yes. RAID0 is a software "issue", so I'm guessing it should work so long as the OS is fine with it (possibly as a boot RAID drive).

Personally I'd use this for a dual-drive SSD config: a fast system drive for OSs and a big TLC SSD for games and other storage.
 
The table shows that the second M.2 slots are PCIe x4, but Gen 2, so half the bandwidth, i.e. the same as Gen 3 x2. I don't think RAID 0 would be possible or desirable. Unfortunately this is unavoidable with all current AM4 chipsets.
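The Gen 2 x4 versus Gen 3 x2 comparison can be checked with a quick calculation. This is a sketch using the standard per-lane line rates and encodings (5 GT/s with 8b/10b for Gen 2, 8 GT/s with 128b/130b for Gen 3); it ignores protocol overhead, so real drive throughput will be somewhat lower. The names are illustrative only.

```python
# Per-lane line rate in GT/s and encoding efficiency (payload bits / line bits).
PCIE_GENERATIONS = {
    2: (5.0, 8 / 10),      # PCIe 2.0: 5 GT/s, 8b/10b encoding
    3: (8.0, 128 / 130),   # PCIe 3.0: 8 GT/s, 128b/130b encoding
}

def link_bandwidth_mb_s(gen, lanes):
    """Theoretical one-direction link bandwidth in MB/s, before protocol overhead."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[gen]
    # GT/s * efficiency gives usable Gbit/s per lane; /8 gives GB/s; *1000 gives MB/s.
    return rate_gt_s * efficiency / 8 * 1000 * lanes

for gen, lanes in [(2, 4), (3, 2), (3, 4)]:
    print(f"Gen {gen} x{lanes}: {link_bandwidth_mb_s(gen, lanes):.0f} MB/s")
```

Gen 2 x4 and Gen 3 x2 come out at roughly 2000 and 1969 MB/s respectively, about half of what a Gen 3 x4 slot offers.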
 
Keep us informed please.
I'm interested to know how well it works and if you get into any trouble at any point.
I'm considering an ASUS B350 + Ryzen 7 1700 + 16GiB of RAM, but the latter seems to be difficult to choose wisely, or at all ^^
 
I can boot at 3.75GHz without too much effort + 3200MHz mem, need to tighten the timings or push for more but I don't want to fiddle with BCLK OC just yet:

[Screenshot: 8u1a66.png]
 
Keep us informed please.
I'm interested to know how well it works and if you get into any trouble at any point.
I'm considering an ASUS B350 + Ryzen 7 1700 + 16GiB of RAM, but the latter seems to be difficult to choose wisely, or at all ^^
16GB is easy. Just get 2x8GB G.Skill TridentZ 3200 CL14 and you'll be fine. If you're paying less than $180-200 US then it's probably not the right stuff.
 
Second update, everything is stable so far and temperatures are great with the stock cooler on this thing:

[Screenshot: 3.8ryzentrxby.png]


core voltage: 1.3 at LLC1
vsoc: 1.1 at LLC1
mem voltage: 1.38 + 1.38 boot voltage
 