AMD Ryzen CPU Architecture for 2017

The 8-core Threadripper might be an interesting comparison to the 1800X. There are half as many core resources available within a CCX, and I guess it should be treated by the OS as having two NUMA nodes.
There's more bandwidth, power delivery and dissipation, and IO, but the latency cliffs can be steep, with reduced margin. Going by the picture of an EPYC substrate, the two dies aren't going to enjoy full bandwidth between them.
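For what it's worth, here's a minimal sketch (Linux-specific, reading sysfs; the node layout it reports will vary with the board and BIOS memory-mode settings, so treat it as illustrative) that lists how the OS sees the NUMA split:

```c
/* numa_list.c - list NUMA nodes and their CPUs via sysfs.
 * A minimal sketch; assumes Linux with /sys/devices/system/node.
 * Build: cc -o numa_list numa_list.c
 */
#include <stdio.h>
#include <string.h>
#include <dirent.h>

int main(void)
{
    DIR *d = opendir("/sys/devices/system/node");
    if (!d) {
        perror("opendir"); /* no NUMA support exposed, or non-Linux */
        return 1;
    }
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        unsigned node;
        /* directory entries of interest look like "node0", "node1", ... */
        if (sscanf(e->d_name, "node%u", &node) != 1)
            continue;
        char path[256], cpus[256] = "";
        snprintf(path, sizeof path,
                 "/sys/devices/system/node/node%u/cpulist", node);
        FILE *f = fopen(path, "r");
        if (f) {
            if (fgets(cpus, sizeof cpus, f))
                cpus[strcspn(cpus, "\n")] = '\0';
            fclose(f);
        }
        printf("node %u: cpus %s\n", node, cpus);
    }
    closedir(d);
    return 0;
}
```

On a two-die part exposed as two nodes you'd expect two lines here, each with its die's worth of hyperthreads; a single-die Ryzen shows only node 0.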
 
The 8c Tr will be 1 CCX per die or just one die active?
 
I thought I read it was all active CCXs, but I might be wrong. It would need to have both dies active for the memory channels. Other chips have needed to balance the number of active cores across running CCXs; I'm not sure about fully turning one CCX off, however.
 
Was there any update after the Der8auer delidding revealed the 4 dies?
 
I'm not aware of a statement from Der8auer, but other writers indicated they were given answers to the effect that the extra dies were placeholders.

I've seen some additional speculation that seems convincing to me, such as the lack of passive components on the substrate that would have gone to the other two dies if they were ever going to be active.
If they were to undergo testing, the components would have been added. If those dies subsequently failed, there would be little motivation to pop them back off.
 
My understanding is that the two active dies are positioned diagonally, with two metal shims in place of the other two dies for mechanical purposes.
 
Many thought that Threadripper simply used 4 CPU cores from each die, a point that has long since been disproven by AMD, while others thought these were "failed" Ryzen CPU dies that simply remained inactive. This led many to assume that Threadripper was simply "failed EPYC", though in reality nothing could be further from the truth, as AMD has now explained.

It is true that AMD's Threadripper TR4/SP3r2 socket has the same layout as AMD's EPYC SP3 socket, using the same substrate design and having an identical external aesthetic. Beyond that, things become very different, as EPYC uses four active CPU dies whereas Threadripper only has two, placed in a diagonal configuration to let heat dissipate more evenly across the CPU's IHS.

The question that many have been asking is what the purpose of the other two dies is, and the answer is simply structural: the extra dies prevent imbalance and allow for simple cooler mounting without any chance of damaging the CPU. These two extra dies are not active and do not even contain working transistors; they are blanks, and as such are not "wasted Ryzen CPU dies".

https://www.overclock3d.net/news/cpu_mainboard/amd_clarifies_why_threadripper_uses_4_silicon_dies/1
 
Thanks for the final word. I assumed they would place the working dies diagonally specifically for cooling.
 
Interesting, TR parts will have more than twice the cache of R7 CPUs: https://videocardz.com/newz/amd-confirms-tdp-and-cache-of-ryzen-threadripper

[Image: AMD-Ryzen-Threadripper-TDP.png]


How is that possible when using two Zeppelin dies? 2x would put them at 32 MB; 40 MB is a 25% increase over that.

Edit: They must be counting L2 + L3 for that to make sense: 32 MB L3 + 8/6 MB L2.
Yep, it's L2+L3.
The 1950X has 40 MB (16 × 0.5 MB + 32 MB), the 1920X 38 MB (12 × 0.5 MB + 32 MB), and the 1900X 20 MB (8 × 0.5 MB + 16 MB).
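The arithmetic is easy to sanity-check; here's a trivial sketch (core counts and cache sizes taken straight from the figures above, assuming Zen's 512 KB of L2 per core) that reproduces those totals:

```c
/* cache_total.c - reproduce the L2+L3 totals quoted above.
 * Assumes 512 KB of L2 per core; L3 per chip as listed. */
#include <stdio.h>

static void total(const char *name, int cores, int l3_mb)
{
    double l2_mb = cores * 0.5;            /* 512 KB L2 per core */
    printf("%s: %d x 0.5 MB L2 + %d MB L3 = %.0f MB\n",
           name, cores, l3_mb, l2_mb + l3_mb);
}

int main(void)
{
    total("1950X", 16, 32);  /* 8 + 32 = 40 MB */
    total("1920X", 12, 32);  /* 6 + 32 = 38 MB */
    total("1900X",  8, 16);  /* 4 + 16 = 20 MB */
    return 0;
}
```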
 
DragonFly BSD's Matt Dillon discovered a new hardware issue with Ryzen:
Hi, Matt Dillon here. Yes, I did find what I believe to be a hardware issue with Ryzen related to concurrent operations. In a nutshell, for any given hyperthread pair, if one hyperthread is in a cpu-bound loop of any kind (can be in user mode), and the other hyperthread is returning from an interrupt via IRETQ, the hyperthread issuing the IRETQ can stall indefinitely until the other hyperthread with the cpu-bound loop pauses (aka HLT until next interrupt). After this situation occurs, the system appears to destabilize. The situation does not occur if the cpu-bound loop is on a different core than the core doing the IRETQ.

The %rip the IRETQ returns to (e.g. userland %rip address) matters a *LOT*. The problem occurs more often with high %rip addresses such as near the top of the user stack, which is where DragonFly's signal trampoline traditionally resides. So a user program taking a signal on one thread while another thread is cpu-bound can cause this behavior. Changing the location of the signal trampoline makes it more difficult to reproduce the problem. I have not been able to completely mitigate it.

When a cpu-thread stalls in this manner it appears to stall INSIDE the microcode for IRETQ. It doesn't make it to the return pc, and the cpu thread cannot take any IPIs or other hardware interrupts while in this state.
https://svnweb.freebsd.org/base?view=revision&revision=321899
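For anyone curious what the trigger pattern looks like from userland, here's a rough sketch of the scenario Dillon describes (my own illustration, not his test case; it assumes Linux, and that CPUs 0 and 1 are SMT siblings of one core, which you should verify against /sys/devices/system/cpu/cpu0/topology/thread_siblings_list): one thread spins on one hyperthread while its sibling takes a steady stream of signals, so the kernel's signal-return path, which exits via IRETQ, runs constantly next to the cpu-bound loop.

```c
/* iretq_stress.c - sketch of the workload pattern from the report.
 * Thread A: cpu-bound loop pinned to hw thread 0.
 * Main thread: pinned to hw thread 1 (ASSUMED SMT sibling of 0),
 *              taking SIGALRM at ~1 kHz so signal delivery/return
 *              (which exits the kernel via IRETQ) runs constantly.
 * Build: cc -o iretq_stress iretq_stress.c -lpthread
 */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t hits;

static void on_alarm(int sig) { (void)sig; hits++; }

static void pin(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof set, &set);
}

static void *spin(void *arg)
{
    sigset_t m;
    (void)arg;
    sigemptyset(&m);
    sigaddset(&m, SIGALRM);
    pthread_sigmask(SIG_BLOCK, &m, NULL); /* keep signals on the sibling */
    pin(0);                               /* cpu-bound loop on hw thread 0 */
    for (;;)
        ;                                 /* never yields, never halts */
    return NULL;
}

int main(void)
{
    pthread_t t;
    signal(SIGALRM, on_alarm);
    pthread_create(&t, NULL, spin, NULL);

    pin(1);                               /* assumed SMT sibling of cpu 0 */
    struct itimerval it = { {0, 1000}, {0, 1000} };  /* 1 ms interval */
    setitimer(ITIMER_REAL, &it, NULL);

    for (;;) {
        pause();                          /* each wakeup returns to userland
                                             through the IRETQ path */
        if (hits % 1000 == 0)
            printf("signals taken: %d\n", (int)hits);
    }
}
```

Per the report, moving the loop to a different core (e.g. pinning to a CPU that isn't an SMT sibling) should make the stall disappear; obviously don't run something like this on a machine you care about.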
 