AMD RyZen CPU Architecture for 2017

So has anyone seen an interface that looks remotely like the "GMI" interfaces before? I haven't. I have heard that there have been big improvements for PHYs on organic substrate, but I don't remember where I read it.


[Image: Ryzen die shot]
You can see three sets of replicated PHY at the bottom center and next to the top-right PHY, which is probably SerDes. Ideally, three links would let a 4-chip Naples MCM be fully connected. Then one of the two huge SerDes PHYs (which look like two blocks of 16 lanes) could optionally serve as an off-chip SMP link, which would give 4-chip SKUs four such links for glueless 4S, and 2-chip SKUs two to match E5.
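As a quick check on the link counting (a sketch only; the function names are made up for illustration): in a fully connected topology every chip needs a direct link to every other chip, so a 4-chip package needs three links per chip.

```python
def links_per_chip(chips: int) -> int:
    """Direct links each chip needs so that every pair of chips is connected."""
    return max(chips - 1, 0)


def total_links(chips: int) -> int:
    """Number of distinct chip-to-chip links in a full mesh."""
    return chips * (chips - 1) // 2


if __name__ == "__main__":
    for n in (2, 4):
        print(f"{n} chips: {links_per_chip(n)} links per chip, {total_links(n)} links total")
```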

That said, IMO they don't seem big enough to be even an on-package >8 Gbps SerDes link that matches the bandwidth of three nodes of dual-channel DDR4-2400 or -2667 in one direction (that would need 8-10.5 GT/s). I tried to look for die shots with Intel's OPIO but found none. So while AMD can perhaps do some magically-fast-but-miniature on-package I/O (esp. since the off-chip variant seems to be called xGMI instead of just GMI, kudos to VideoCardz), the huge, supposedly PCIe PHY next to them makes that less persuasive.
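For anyone who wants to redo this kind of arithmetic, here is a rough sketch. It assumes 64-bit DDR4 data channels and takes the lane count and encoding efficiency as free parameters; the lane counts in the loop are placeholders, not something read off the die shot, and the function names are invented for this example.

```python
def ddr_peak_gbytes(channels: int, mega_transfers: float, bytes_per_transfer: int = 8) -> float:
    """Peak DRAM bandwidth in GB/s for `channels` channels, each 64 bits wide by default."""
    return channels * bytes_per_transfer * mega_transfers / 1000.0


def required_lane_rate_gtps(payload_gbytes: float, lanes: int, efficiency: float = 1.0) -> float:
    """Per-lane transfer rate (GT/s) needed to move `payload_gbytes` GB/s in one direction.

    `efficiency` is the fraction of raw bits that carry payload (1.0 = no encoding overhead).
    """
    return payload_gbytes * 8.0 / (lanes * efficiency)


if __name__ == "__main__":
    for speed in (2400, 2667):
        node = ddr_peak_gbytes(2, speed)  # one dual-channel node
        for lanes in (16, 32):            # hypothetical link widths
            rate = required_lane_rate_gtps(node, lanes)
            print(f"DDR4-{speed}: {node:.1f} GB/s per node, "
                  f"{lanes} lanes -> {rate:.1f} GT/s per lane")
```

Changing the lane count or the number of nodes whose traffic a single link has to carry moves the result a lot, which is why the die-shot lane estimate matters here.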

The top-left huge PHY seems more like the 144-bit DDR PHY, since you can clearly see a pattern of 5-4 blocks which can be seen in Carrizo too.
 
I ignore all newer postings.
This confirms that the 6800K and 6900K were only equipped with two memory modules -> dual-channel memory mode. Quad channel doubles peak memory bandwidth. The 8-core Ryzen and Intel's 4-core chips both have a dual-channel memory bus (and Intel Xeon D as well). But it seems odd that the Intel quad-channel chips were not equipped with the optimal memory config.
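To put numbers on the dual vs. quad channel point, a minimal sketch assuming 64-bit channels and DDR4-2400 (the module speed actually used in the demo systems is an assumption here, and the function name is made up):

```python
def peak_bandwidth_gbs(channels: int, mega_transfers: float, bus_bits: int = 64) -> float:
    """Theoretical peak memory bandwidth in GB/s."""
    return channels * (bus_bits / 8) * mega_transfers / 1000.0


dual = peak_bandwidth_gbs(2, 2400)  # two modules populated
quad = peak_bandwidth_gbs(4, 2400)  # all four channels populated

if __name__ == "__main__":
    print(f"dual channel: {dual:.1f} GB/s, quad channel: {quad:.1f} GB/s")
```

This is peak bandwidth only; whether a given benchmark can actually use it is a separate question.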

PCWorld also noticed this:
http://www.pcworld.com/article/3172...-preview-ryzen-7-outperforms-intels-best.html

So we can't yet conclude whether these are realistic results. Nobody would equip their 6900K with only two memory modules. As Cinebench R15 doesn't use AVX, both advantages of Intel chips (double memory bandwidth, double vector FLOPS) couldn't be seen. Would be interesting to see how Ryzen stacks up in software taking advantage of maximum memory bandwidth and/or AVX.

What about this:

Suppose AMD had used QC for their Ryzen presentation: it would lower Perf/W for the Intel processors (more memory modules -> more watts). Then we would have another discussion: "AMD uses QC config to decrease Perf/W for Intel CPUs, but QC is useless in most cases."
 
I have gone and looked at lots of die shots. To me, all 14nm DDR interfaces look far more like the bottom left / top right than the top left does. All have that big, fat, recurring kind of pattern.

We have:
A9 with 1x 64-bit LPDDR4: http://images.anandtech.com/doci/9686/A9PNG.png (Samsung fab)
Haswell-E with 4x 72-bit DDR4: https://www.extremetech.com/wp-content/uploads/2014/08/haswell-e-die-shot-high-res.jpg

So I think the top left is GMI,
bottom left and top right are DDR,
right-hand edge is PCIe,
bottom right is misc. I/O, network, etc.


The more I stare at the area I think is GMI, the more I see repetitions of the I/O that's on the right-hand edge of the die.

Edit: also, having those two big blocks of SRAM near the GMI links makes sense if it's a directory or something for the inter-GMI links; it doesn't make much sense for them to be near the DDR interfaces.
 
The original Athlon (K7) was clearly leading Intel Pentium 3. Intel was desperate enough to clock their P3 design too high (1.13 GHz) :)
Tualatin was good in performance, clock and TDP, so good that Intel had to limit the frequencies so as not to overshadow the new 1.4 GHz P4.

I agree with everyone: nice memories!
 
It's so interesting to see how different the 8-core Haswell-E is from the 8-core Zen: two architectures very similar in performance, yet very different in configuration.
 
Bits And Chips said this:
'We have to consider that BD die and Ryzen die cost the same. So, AMD could sell a R7-1800X @ $200. Intel can't afford a such price war '

I don't believe it, but the gurus here know better than I do.
 
I wouldn't count on that. BD (or rather, Piledriver), while larger, is made on an ancient 32nm process, which is dirt cheap compared to 14nm.
 
Manufacturing the die costs somewhere in the 25-40 USD range depending on yields, assuming 7k a wafer (according to AMD, the transistor cost of 14nm is lower than 28nm and quite a bit lower than 32nm; remember SOI is expensive: http://i.imgur.com/eezbRGE.jpg). Then you have testing/binning/packaging/shipping/AMD margin/retail and disty margin. My ballpark guess is that at 30% gross margin the average Zeppelin die has to sell for around 90-110 USD. That would also align with P10 costs/sale price, considering AMD right now has low-30% margins.
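The back-of-envelope math can be sketched roughly like this. Only the 7k wafer cost and the 30% margin come from the post; the 300 mm wafer, the 200 mm² die area and the yield values are placeholder assumptions, the dies-per-wafer formula is the usual textbook approximation, and the function names are invented for the example.

```python
import math


def gross_dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Common approximation: wafer area / die area, minus a correction for edge loss."""
    radius = wafer_diameter_mm / 2
    estimate = (math.pi * radius ** 2 / die_area_mm2
                - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))
    return math.floor(estimate)


def cost_per_good_die(wafer_cost: float, gross_dies: int, yield_fraction: float) -> float:
    """Wafer cost spread over the dies that actually work."""
    return wafer_cost / (gross_dies * yield_fraction)


def price_for_gross_margin(unit_cost: float, gross_margin: float) -> float:
    """Selling price at which (price - cost) / price equals `gross_margin`."""
    return unit_cost / (1.0 - gross_margin)


if __name__ == "__main__":
    dies = gross_dies_per_wafer(300, 200)   # assumed wafer size and die area
    for y in (0.6, 0.9):                    # assumed yields, for illustration only
        die_cost = cost_per_good_die(7000, dies, y)
        print(f"yield {y:.0%}: {die_cost:.2f} USD per good die")
    # Testing, packaging and so on would be added to unit_cost before applying the margin.
    print(f"price at 30% margin for a 70 USD unit cost: {price_for_gross_margin(70, 0.30):.2f} USD")
```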
 
'We have to consider that BD die and Ryzen die cost the same. So, AMD could sell a R7-1800X @ $200. Intel can't afford a such price war '

Yes, and no.

Intel's gross margins are insane, so they can afford it in the sense they still turn a profit when dumping prices to AMD levels.

But dumping prices means collapsing margins/profits, and subsequently the share price. So blood is likely to be spilled in upper management/board of directors.

Cheers
 
Regarding old AMD chips: my father is still running an Athlon XP 2800+ (or maybe 3000+) on one of his computers, mostly dedicated to light(ish) music editing.
 
I have gone and looked at lots of die shots. To me, all 14nm DDR interfaces look far more like the bottom left / top right than the top left does. All have that big, fat, recurring kind of pattern.
You should look at how AMD lays out its DDR PHY and PCIe PHY in Carrizo, then. The DDR PHY in Carrizo is asymmetric, like the top-left one in Summit Ridge. The bottom-left and top-right PHYs are closer to Carrizo's PCIe PHY in terms of observed block counts (eight x4 blocks, eight subblocks within each).

Edit: also, having those two big blocks of SRAM near the GMI links makes sense if it's a directory or something for the inter-GMI links; it doesn't make much sense for them to be near the DDR interfaces.
If it is a directory, it should be closer to the memory controller (which may/may not be close to the DDR PHY) which manages the node's coherent memory, not the outgoing SMP links.
 
You should look at how AMD lays out its DDR PHY and PCIe PHY in Carrizo, then. The DDR PHY in Carrizo is asymmetric, like the top-left one in Summit Ridge. The bottom-left and top-right PHYs are closer to Carrizo's PCIe PHY in terms of observed block counts (eight x4 blocks, eight subblocks within each).
If you're talking about http://i.imgur.com/OEmIyQb.jpg, I can't see the PCIe well enough to tell what's going on, and I couldn't find a better / higher-res image.
If it is a directory, it should be closer to the memory controller (which may/may not be close to the DDR PHY) which manages the node's coherent memory, not the outgoing SMP links.
Why? The physical address in memory is known; what isn't known is what data is in the caches. Why would you put the memory controller in the path for that?
 
You can see three sets of replicated PHY at the bottom center and next to the top-right PHY, which is probably SerDes. Ideally, three links would let a 4-chip Naples MCM be fully connected.
There is actually a fourth one in the very top left corner (split into two rows) to the left of the memory interface. ;)
Then one of the two huge SerDes PHYs (which look like two blocks of 16 lanes) could optionally serve as an off-chip SMP link, which would give 4-chip SKUs four such links for glueless 4S, and 2-chip SKUs two to match E5.
The die is supposed to have 32 PCIe lanes. And the two sets look identical. I would have expected that for a two socket solution one set would be dedicated/optimized for that off-package communication. I don't know.
And I wonder if the additional two PCIe lanes (?) at the bottom center are used for the X300 chipset which allegedly uses a dedicated link and frees up the 4 lanes usually used to connect the chipset.
That said, IMO they don't seem big enough to be even an on-package >8 Gbps SerDes link that matches the bandwidth of three nodes of dual-channel DDR4-2400 or -2667 in one direction (that would need 8-10.5 GT/s). I tried to look for die shots with Intel's OPIO but found none. So while AMD can perhaps do some magically-fast-but-miniature on-package I/O (esp. since the off-chip variant seems to be called xGMI instead of just GMI, kudos to VideoCardz), the huge, supposedly PCIe PHY next to them makes that less persuasive.
Are we maybe looking at an interposer solution for Naples?
The top-left huge PHY seems more like the 144-bit DDR PHY, since you can clearly see a pattern of 5-4 blocks which can be seen in Carrizo too.
I came to the same conclusion.
 
There is a rumor that Ryzen Pro will employ an ARM co-processor (Cortex-A5) to enable SME (Secure Memory Encryption) and SEV (Secure Encrypted Virtualization) on DDR memory.
 
The top-left huge PHY seems more like the 144-bit DDR PHY, since you can clearly see a pattern of 5-4 blocks which can be seen in Carrizo too.
If this is the case, it would be an answer of sorts to my questions from the earlier fuzzy wafer shot of Summit Ridge, where that block was interpreted as being a GMI section and the other sections as DDR. I wasn't sure what AMD would gain in splitting the memory interface like that, and how that would play in an MCM scenario where multiple DRAM interfaces would be clustered into the interior of an MCM.
Just not doing that would be the straightforward answer, and would allow the innermost IO blocks to serve as a link. Something like PCIe, or something using its signalling, would be more tolerant of the gymnastics of fitting into or escaping the center of an MCM than a DRAM bus, at any rate.


You should look at how AMD lays out its DDR PHY and PCIe PHY in Carrizo, then. The DDR PHY in Carrizo is asymmetric, like the top-left one in Summit Ridge. The bottom-left and top-right PHYs are closer to Carrizo's PCIe PHY in terms of observed block counts (eight x4 blocks, eight subblocks within each).
It's something of a visual reversal this time around for the actual PHY, which makes me understand why the initial fuzzy pic was interpreted as it was. I thought the DDR4 interface would have been more consistent between generations, given that it has rolled out in other products without such a shift.
Even now, I think there are examples of more finely packed interconnect PHY, like the HT sections of the Opterons or various Xeon EX/EP die. Knights Landing has a rather compact PCIe interface and blocky DDR4 PHY as well.

If it is a directory, it should be closer to the memory controller (which may/may not be close to the DDR PHY) which manages the node's coherent memory, not the outgoing SMP links.
I was wondering if they were per-CCX, although given their positioning they could also be partitioned between the two halves of that PHY block.
There's an irregularly shaped "blank" area that touches them and their nearest CCX, although I suppose that could be structured that way for other reasons. I wondered if the two small rectangles just above the L3 at the top of the CCXs were possibly where they hooked into the interconnect.
 