AMD: Zen 2 (Ryzen/Threadripper 3000?, Epyc 8000?) Speculation, Rumours and Discussion

Considering it's still a 14/12nm chip, the most important thing about Picasso is whether it supports LPDDR4.
 
https://www.anandtech.com/show/13829/amd-ryzen-3rd-generation-zen-2-pcie-4-eight-core


Same 8-core, 80mm^2 chiplet as Rome, and a new 123mm^2 I/O die.
There's a clear space (almost with a big X marking the spot) that would/will receive a second CPU chiplet in higher-end models. We might see 16 cores on AM4, though I wonder if dual-channel memory bandwidth will ever be enough for that kind of compute power.

The IO is made on GF 14nm again, since it's around a quarter of the size of Rome's huge IO die. It's still a very large die just for IO, so the chances of it having some kind of cache (L3? L4?) have increased IMO.


Anandtech also mentions that sourcing dies from two foundries should make lower-priced chips harder to build, so we shouldn't expect APUs to use these chiplets. Maybe they're going with the 200mm^2 form factor for the APUs.
 
My guess is no cache on the IO die. The IF link can't handle that kind of throughput; it's sized for the RAM/PCI-E interface width and low power. The more efficient approach would be to double the L3 on the chiplet, where you have the bandwidth and low latency, in addition to 7nm SRAM scaling and reduced traffic through the IO die.

Area is another issue. If you look at the Zen1 die shot, the CCXs are ~1/2 of the area, so it's not surprising that the new IO die is also ~1/2 the size of that die. There's no room in ~120mm^2 for the uncore, an additional IF interface for a second chiplet, and a cache large enough to make a difference.
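To put rough numbers on the area argument, here is a back-of-the-envelope sketch. The 213mm^2 Zen1 die size is an assumption (the commonly cited figure, not stated in this thread), the 50% CCX share is the estimate from the post above, and the sketch assumes the uncore does not shrink because the IO die stays on a 14/12nm-class process.

```python
# Rough area budget for the new IO die. Sketch only, under stated assumptions.
ZEN1_DIE_MM2 = 213.0  # ASSUMPTION: commonly cited Zen1 die size, not from this thread
CCX_FRACTION = 0.5    # the "~1/2" estimate read off the Zen1 die shot
IO_DIE_MM2 = 123.0    # new IO die size quoted earlier in the thread

# Everything on the Zen1 die that is not CCX (memory controllers, PCI-E, IF, etc.)
uncore_mm2 = ZEN1_DIE_MM2 * (1.0 - CCX_FRACTION)

# Area left over on the IO die if the uncore carries over at the same size
headroom_mm2 = IO_DIE_MM2 - uncore_mm2

print(f"Zen1 uncore estimate: {uncore_mm2:.1f} mm^2")
print(f"IO die headroom:      {headroom_mm2:.1f} mm^2")
```

Under those assumptions only a small slice of the die is left once the uncore is accounted for, and the extra IF link for a second chiplet still has to come out of it.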
 
I wonder if cooling would be affected in any way with uneven distribution to heat pipes/fins? Probably negligible depending on the cooler design.
 
I hope we get more details on the memory latency of the setup. AMD's choice of Cinebench may not be as representative of the workloads that saw benefits from tighter timings in prior Ryzen products.
I wonder in this case of asymmetric layout if it's solder or TIM under the heatspreader.
 
Any good reason to place the dies like that except for the mentioned 1+2 speculation? (which could also explain the relatively large size of the io die).
If it was for heat distribution or easier edge-edge connections a regular diagonal layout would have been more obvious.
 
Considering the constrained package area, I don't think AMD had much of a choice for different layouts.
 
Interesting.
Not exactly a 1/4-size IO chip :( but it looks like this one can at least be re-used for the 16-core version :)
 
16 core Ryzen 3000 (or at least dual chiplet for >8 cores) confirmed by Lisa Su.



It's the same amount of bandwidth per core as 64c Rome ;)
You're right, 64c Rome is 8 channels, so 1 channel per 8 cores.
And a consumer Ryzen would probably get substantially higher bandwidth per-core because they can be paired with DDR4 >3200MT/s memory, whereas Rome is limited to 2666 MT/s.
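
The per-core comparison can be sketched as a quick calculation. This is theoretical peak bandwidth only, assuming a standard 64-bit (8 bytes per transfer) DDR4 channel. The memory speeds are the ones quoted in the post above, and the helper names are made up for illustration.

```python
# Theoretical peak DRAM bandwidth per core. Sketch only: ignores real-world
# efficiency, latency, and Infinity Fabric limits.
BYTES_PER_TRANSFER = 8  # one 64-bit DDR4 channel moves 8 bytes per transfer


def peak_bandwidth_gbs(channels: int, mts: int) -> float:
    """Peak bandwidth in GB/s for `channels` DDR4 channels at `mts` MT/s."""
    return channels * mts * BYTES_PER_TRANSFER / 1000.0


def per_core_gbs(channels: int, mts: int, cores: int) -> float:
    """Peak bandwidth per core in GB/s."""
    return peak_bandwidth_gbs(channels, mts) / cores


# Figures as quoted in the thread: 64c Rome with 8 channels at 2666 MT/s,
# and a hypothetical 16c AM4 part with 2 channels at 3200 MT/s.
rome = per_core_gbs(channels=8, mts=2666, cores=64)
ryzen = per_core_gbs(channels=2, mts=3200, cores=16)

print(f"64c Rome:  {rome:.2f} GB/s per core")
print(f"16c Ryzen: {ryzen:.2f} GB/s per core ({ryzen / rome:.2f}x)")
```

At equal memory speed both setups come out to one channel per 8 cores, so any per-core advantage for the desktop part comes purely from faster DIMMs.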

My concern would be that the 32c Threadripper 2990WX seemed to be starved for bandwidth in many tests, but it seems that was the fault of a Windows bug all along.
 
For EPYC, there are workloads where the bandwidth is a constraint. AMD made some statements about doing more to increase bandwidth over the first generation, and with current products there are different SKUs that emphasize cores, clocks, or memory over other factors. A customer with a bandwidth-limited workload may opt for one of the lower-count variants in order to reap cost and power savings, or possibly get some better single-threaded performance.
A high occupancy VM host might opt for having more instances with the understanding that their bandwidth requirements will be modest.

The consumer line may be a little constrained in order to avoid complicating the analysis for consumers, although there may be some applications that would have benefited from a product with a similarly targeted mix.
 
https://www.digitimes.com/news/a20190111PD207.html

TSMC with its CoWoS (chip-on-wafer-on-substrate) packaging has grabbed orders for AMD's 7nm datacenter CPU, while SPIL and TFME share the flip-chip packaging orders placed by AMD for its new 7nm CPU and GPU designed for desktops and notebooks, the sources indicated.

So TSMC does the packaging for Rome, whereas SPIL and TFME do the packaging for Ryzen 3000 and Radeon VII.
 