AMD RyZen CPU Architecture for 2017

AMD is on the board of OpenCAPI. Doesn't mean they can't play politics with it, but OpenCAPI should be agnostic of the HSA stack. Shared memory and low latency transactions are all that's required.
I did not see Intel on that membership list, so why should research undertaken by Intel be available to OpenCAPI? I don't think AMD can play politics with technology it doesn't own.

I'm not suggesting moving commands, but altering the threading model slightly. Not unlike how GPUs currently work, but with the CPU side waiting and more efficient hardware synchronization mechanisms added.
Perhaps we are discussing different situations. Can you clarify which parts of how things currently work that you are referencing, and whether HSA's model covers them?

All I'm saying is that the signaling standards are the same.
Is there a description for this? GMI uses non-standard speeds for inter-socket and on-package links, and the on-package links were described by The Stilt in the Anandtech forums (generally accurate so far, based on a motherboard or BIOS-related review, I think) as having a non-standard width.
The listed on-package link bandwidth doesn't show the same drop in effective bandwidth as the socket-to-socket links, which carry the encoding overhead of a PCIe transmission.
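For reference, the kind of encoding overhead in question can be shown with a quick back-of-the-envelope calculation. The PCIe 3.0 figures below are purely an illustration, since GMI/xGMI run at non-standard rates:

```c
/* Illustration only: PCIe 3.0 x16 numbers, not the actual GMI/xGMI rates. */
#include <stdio.h>

int main(void) {
    double gt_per_s = 8.0;            /* PCIe 3.0: 8 GT/s per lane      */
    double encoding = 128.0 / 130.0;  /* 128b/130b line encoding        */
    int lanes = 16;                   /* x16 link, one direction        */

    double raw_GBps = gt_per_s * lanes / 8.0;  /* 16.0 GB/s raw         */
    double eff_GBps = raw_GBps * encoding;     /* ~15.75 GB/s usable    */

    printf("raw:       %.2f GB/s per direction\n", raw_GBps);
    printf("effective: %.2f GB/s per direction\n", eff_GBps);
    return 0;
}
```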

Infinity being a superset of whatever PCIe standard gets adopted.
AMD is the party stating the Infinity Fabric is a superset of Hypertransport. I do not see where this is being inverted to say that this makes it dependent on PCIe, or that some variant of PCIe is going to conform to AMD's proprietary protocol, or how that limits the fabric to current PCIe.
AMD has been on record indicating they are keeping that to themselves.
 
This leaves some uncertainty about the situation, given that outlets like ExtremeTech and OC3D relayed communications with AMD indicating they were inserts.

I would assume from a mechanical standpoint that there would be no closer match to a Zen die than another silicon die (and a Zen die to be extra specific), if one wanted to eliminate any possible mismatches. It's also potentially cheaper to reach into the bin of discarded dies rather than slicing up a $7k wafer just for that.

I suppose other items to check are whether the dead dies could have been conceivably active at some point, or if the package is set up such that they are cut off.

If those are just non-functional dies, then yes, it's a sensible, pragmatic solution to the problem. If they're functional, well, it's another story.
 
I did not see Intel on that membership list, so why should research undertaken by Intel be available to OpenCAPI? I don't think AMD can play politics with technology it doesn't own.
Different parts of the equation. OpenCAPI provides the hardware/network layer, and a technology parallel to that Intel queuing one I linked provides the software synchronization layer. Not those specific IPs, but similar mechanisms. I couldn't say if it applies to Ryzen, as any HSA implementation might be internal. I'd consider both technologies highly likely in the near future.

Perhaps we are discussing different situations. Can you clarify which parts of how things currently work that you are referencing, and whether HSA's model covers them?
The current models only cover part of what I've described, the current model being to basically use OpenCL to create kernels for execution on a discrete device. The HSA model would do this more transparently, with a compiler unrolling loops into ISA for an applicable processor. All current models involve a significant amount of work. What I'm suggesting would be the next step, where more efficient hardware and software allow the same techniques, but with much finer granularity: using HSA for short inline functions within heavily threaded applications. The result would be a tightly integrated Ryzen+Vega12, as an example. So a single context would seem to be jumping between processors based on the current instructions, but would really be two interwoven threads, as the context wouldn't migrate.
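For what it's worth, a minimal sketch of that current, heavyweight model looks something like the code below. The saxpy kernel, the buffer handling, and the omitted error checks are purely illustrative, nothing specific to Ryzen or Vega; the point is the amount of setup and explicit copying that finer-grained, shared-memory HSA dispatch would cut out:

```c
/* Minimal OpenCL host-side sketch of the "create a kernel, ship it to a
 * discrete device" model. Illustration only: no error handling, and the
 * saxpy kernel is just a placeholder workload. */
#include <CL/cl.h>

static const char *src =
    "__kernel void saxpy(float a, __global const float *x, __global float *y) {"
    "    size_t i = get_global_id(0);"
    "    y[i] = a * x[i] + y[i];"
    "}";

void run_saxpy(float a, const float *x, float *y, size_t n) {
    cl_platform_id plat;  clGetPlatformIDs(1, &plat, NULL);
    cl_device_id   dev;   clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);
    cl_context       ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q   = clCreateCommandQueue(ctx, dev, 0, NULL);

    /* Build the kernel at runtime - part of the setup cost. */
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "saxpy", NULL);

    /* Explicit buffer copies to and from the device - the overhead that
     * shared memory and fine-grained dispatch are meant to remove. */
    cl_mem bx = clCreateBuffer(ctx, CL_MEM_READ_ONLY  | CL_MEM_COPY_HOST_PTR,
                               n * sizeof(float), (void *)x, NULL);
    cl_mem by = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                               n * sizeof(float), y, NULL);

    clSetKernelArg(k, 0, sizeof(float),  &a);
    clSetKernelArg(k, 1, sizeof(cl_mem), &bx);
    clSetKernelArg(k, 2, sizeof(cl_mem), &by);
    clEnqueueNDRangeKernel(q, k, 1, NULL, &n, NULL, 0, NULL, NULL);
    clEnqueueReadBuffer(q, by, CL_TRUE, 0, n * sizeof(float), y, 0, NULL, NULL);
    clFinish(q);
    /* Releases of buffers, kernel, program, queue and context omitted. */
}
```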

Is there a description for this? GMI uses non-standard speeds for inter-socket and on-package links, and the on-package links were described by The Stilt in the Anandtech forums (generally accurate so far, based on a motherboard or BIOS-related review, I think) as having a non-standard width.
You have what I said backwards. GMI would be a superset of the current PCIe version, with PCIe 5 looking at multi-level signaling. Therefore Infinity on that standard would use the same signaling, but driven faster based on tighter timing requirements. Infinity would provide more bandwidth while repurposing the actual lanes: Epyc, for example, where half the PCIe lanes become Infinity links to connect the chips. The hardware driving the signals is the same, just run at different speeds, with PCIe falling within the capabilities of Infinity links. The width is the result of bonding links for the purpose. The control and data fabrics use essentially the same lanes based on configuration.

I do not see where this is being inverted to say that this makes it dependent on PCIe, or that some variant of PCIe is going to conform to AMD's proprietary protocol, or how that limits the fabric to current PCIe.
As I said above, the opposite. It would always encompass the PCIe standard from a hardware aspect. Think infinity switch and PCIe switch on each chip with a configuration setting determining what lanes went where. That may only apply to half the links on any given chip to conserve space.
 
Different parts of the equation. OpenCAPI provides the hardware/network layer, and a technology parallel to that Intel queuing one I linked provides the software synchronization layer.
Is this something mentioned in the documents for OpenCAPI, or is this something you predict will happen? I find it more difficult to discuss statements whose tense and grounding in fact/assertion are ambiguous.

You have what I said backwards.
The statement was that the signaling standards are the same.
I would accept having gotten a true/false backwards if there's a reference for this, perhaps one that explains why the common basis doesn't show through their differing behavior.

The control and data fabrics use essentially the same lanes based on configuration.
I'd have to dredge up the EPYC presentation/video where they indicate the control and data fabrics do not share lanes.
The bandwidth and latency needs are more relaxed, meaning there's a minor connection between EPYC dies that carries control information. I don't quite recall the statement, but its connectivity requirements are such that it isn't a fully-connected setup like the MCM or a mesh. It could be a ring or potentially daisy-chained, since the time period for functions like the on-die DVFS controller is in the millisecond range.
It's minor enough that they don't clutter the big-ticket data fabric diagrams with it.

As I said above, the opposite. It would always encompass the PCIe standard from a hardware aspect. Think infinity switch and PCIe switch on each chip with a configuration setting determining what lanes went where. That may only apply to half the links on any given chip to conserve space.
Is there a reference to Infinity Fabric being a superset of the PCIe standard? There appear to be different physical restrictions and portions of the PCIe standard that AMD doesn't claim to offer for its fabric. The Processor Programming Reference for family 17h diagrams separate elements for PCIe and xGMI at the physical layer, so there's a physical partition just before the shared muxed PHY (third-party Synopsys E12G IP, I think). Prior to that there are PCIe-specific paths that hang from the southbridge, whereas the xGMI physical layer hangs from the data fabric.
While Hypertransport was originally defined to use the PCI (the original) ordering model and maintains its conventions, I haven't seen it described that PCIe has become a subset of either HT or IF despite PCIe being a superset of PCI.
 
Amazing gains from memory optimization! There is a 20% to 30% gain at the same core clock!
I'm lucky I have B-dies in my Ryzen system ;)
Yes, too bad AMD did not have the chance to specify Ryzen for faster memory. As it is, all the responsibility falls to the user. A great opportunity for experienced tinkerers.
 
Next video I'm uploading is about RotTR:

[attached image: 1gvzl8.jpg]


And I think I hit the limits of the GPU with 3466 memory; the difference should be higher if I lower the resolution.
 
Clukos, can you make a comparison between 3200MHz auto and 3466MHz tight? 3200MHz auto is used in most "up to date" reviews by various sites. It would give a better comparison against them.
 
Next video I'm uploading is about RotTR:
Interesting... Your CPU and GPU are both working harder in the rightmost picture (and the GPU is clocking faster too), yet both are running cooler...

That's some bizarro shit right there. :D
 
Interesting... Your CPU and GPU are both working harder in the rightmost picture (and the GPU is clocking faster too), yet both are running cooler...

That's some bizarro shit right there. :D

I forgot to mention, I did run through the same route 3 times with the 2666 config because I was getting some nasty stutter; it turns out the HDD was doing something during the first two runs and it was messing up the results. So essentially it got a prolonged warm-up time :)
Clukos, can you make a comparison between 3200MHz auto and 3466MHz tight? 3200MHz auto is used in most "up to date" reviews by various sites. It would give a better comparison against them.

I'll definitely check for that but I'm not sure I'll upload a video about it unless it's something worth uploading.
 
And everyone will blame AMD for bad performance.
It depends on whether this is a common practice, or whether just some laptop manufacturers will do it. If popular models from other big laptop manufacturers use dual channel memory, then reviewers will notice this and blame the laptop manufacturer, not AMD. I do recall that a single channel memory config was a problem with older laptops with AMD chips, but those chips were competing in the entry level segment, where price is what matters and performance is not that important. Raven Ridge performance should be enough to challenge Intel's high end, so Raven Ridge will be fighting in a completely different market segment. If a consumer is buying a Raven Ridge APU with the fattest GPU (11 CUs), he/she is definitely interested in performance. There are many cheaper chips available that are fine if you don't need that much GPU performance.
 
Looking at @Arnold Beckenbauer 's post:

Ryzen 5 2500U:

- 4 Zen cores, 8 threads
- 2GHz
- probably 15W, if it's going after Intel's "U" chips
- Using Carrizo's Southbridge
...

Memory bandwidth in this Ryzen 2500U test is getting below 15GB/s, whereas those Intel results are getting 20-25GB/s and a Ryzen 3 with DDR4 2400MHz gets close to 30GB/s in multi-core.
This test might have been done using a single SO-DIMM.
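For context, the theoretical peaks work out as in the sketch below (assuming a 64-bit DDR4 channel at 2400 MT/s; Geekbench only reaches a fraction of peak, but the single- vs dual-channel gap is still visible in its numbers):

```c
/* Theoretical peak DDR4 bandwidth: transfer rate (MT/s) x 8 bytes per
 * transfer on a 64-bit channel. Benchmarks reach only a fraction of this,
 * but the single- vs dual-channel ratio still shows up. */
#include <stdio.h>

int main(void) {
    double mts = 2400.0;                      /* DDR4-2400, MT/s          */
    double per_channel = mts * 8.0 / 1000.0;  /* GB/s per 64-bit channel  */

    printf("single channel: %.1f GB/s peak\n", per_channel);        /* 19.2 */
    printf("dual channel:   %.1f GB/s peak\n", per_channel * 2.0);  /* 38.4 */
    return 0;
}
```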

I hope my link works: https://browser.geekbench.com/v4/cpu/compare/3989386?baseline=4056111

It's AMD Ryzen 5 2500U vs. AMD Ryzen 5 1400 @ 3.20 GHz. Looking at single-core performance, the R5 2500U is faster almost everywhere. Here, only sometimes: https://browser.geekbench.com/v4/cpu/compare/3354848?baseline=3989386


Smach Z with Ryzen 5 2500U would be a dream https://www.indiegogo.com/projects/smach-z-the-handheld-gaming-pc-games#/
 
I hope my link works: https://browser.geekbench.com/v4/cpu/compare/3989386?baseline=4056111

It's AMD Ryzen 5 2500U vs. AMD Ryzen 5 1400 @ 3.20 GHz. Looking at single-core performance, the R5 2500U is faster almost everywhere. Here, only sometimes: https://browser.geekbench.com/v4/cpu/compare/3354848?baseline=3989386

Being a single-core benchmark, it could simply mean that the Ryzen U has room to max its turbo clock, which could be above the Ryzen 1400's 3.4GHz.
But it could also mean that the "Zen+" that's supposedly coming in Raven Ridge could be already showing some IPC improvements.
 
When I was looking for an Ultrabook, I read countless reviews that included 4+ Ultrabooks with identical Intel CPUs. Some brands performed consistently better than others, because some manufacturers skimped on cooling (common on Ultrabooks), making the CPU throttle, or installed slower memory (or a cheap, slower SSD), etc. Most of the time reviewers noticed these differences and didn't recommend the laptop with the poor setup. This made some of the laptop manufacturers look bad compared to others. Of course the same isn't true when you are comparing low end laptops (such as those with old AMD chips). I was talking about the Raven Ridge APU with the fattest 11 CU GPU (the fastest iGPU on the market) + a high clocked quad core (8 thread) CPU. These will be competing against the top end i7 models. Intel prices their laptop quad cores with hyperthreading + Iris GPU very high.

This is AMD's second coming to laptops. Ryzen was a highly popular news topic when it launched; every tech site reviewed all the Ryzen models. Raven Ridge will be scrutinized by the websites. If some laptop implementations are bad (for example single channel memory), the reviewers will notice it and will tell customers to avoid those laptop models. This is not a nameless "A- followed by a random number" CPU anymore that nobody cares about.
 

Actually, it should be coming with a Ryzen APU (if it ever releases, that is).
From the August update:

The project has been delayed, next estimated completion date is Q1/Q2 2018.
(...)
To compensate backers for the delays and to offer an updated device, we’re working on introducing improvements in the console hardware to make SMACH Z the most powerful handheld console in the market. We’re working together with AMD to bring the best performance to SMACH Z, so it will be the most powerful handheld console in the market. At this point we cannot give any more information, but we hope that the new performance will help to alleviate the long waiting.

The Merlin Falcon APU they had in the original prototype is now 2 years old, based on Carrizo. AMD hasn't updated their R-Series embedded APUs ever since (no Bristol Ridge update), meaning the next update in their embedded series will be using Raven Ridge solutions.

Also it looks like one of the private updates they had back in July mentioned they were changing the solution to "a new SoC based on Ryzen and Vega" (basically Raven Ridge), after a falling out with the maker of the original motherboard.

What's really strange is how they keep promising a modular solution without Rhomb.io's (former ClickARM) technology. And for 2018..
 