That's interesting, actually... So a 4-core Ryzen will have a huge jump in performance?
If the two CCXs are only tied together by a 22GB/s link, that's just appalling.
Imagine if at the start of this thread someone seriously suggested that the 1st iteration of Zen would be right up there with the latest Intel Expensive Edition on a lot of benchmarks, beating it in more than a few & about even in power while doing it:
I think the biggest thing about Ryzen is that it is AMD's base for the future. Yes, it may not be perfect, but damn, AMD couldn't wish for a better base to have. It will be very interesting to see a Ryzen+ with all the problems solved, better support from games, and some improvements.
Note that AIDA64 apparently hasn't had a chance to update, so its cache numbers are wrong.
This is what Ryzen currently looks like to Windows:
Code:
Logical Processor to Cache Map:
*--------------- Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64
*--------------- Instruction Cache 0, Level 1, 64 KB, Assoc 4, LineSize 64
*--------------- Unified Cache 0, Level 2, 512 KB, Assoc 8, LineSize 64
*--------------- Unified Cache 1, Level 3, 16 MB, Assoc 16, LineSize 64
-*-------------- Data Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64
-*-------------- Instruction Cache 1, Level 1, 64 KB, Assoc 4, LineSize 64
-*-------------- Unified Cache 2, Level 2, 512 KB, Assoc 8, LineSize 64
-*-------------- Unified Cache 3, Level 3, 16 MB, Assoc 16, LineSize 64
--*------------- Data Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64
--*------------- Instruction Cache 2, Level 1, 64 KB, Assoc 4, LineSize 64
--*------------- Unified Cache 4, Level 2, 512 KB, Assoc 8, LineSize 64
--*------------- Unified Cache 5, Level 3, 16 MB, Assoc 16, LineSize 64
---*------------ Data Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64
---*------------ Instruction Cache 3, Level 1, 64 KB, Assoc 4, LineSize 64
---*------------ Unified Cache 6, Level 2, 512 KB, Assoc 8, LineSize 64
---*------------ Unified Cache 7, Level 3, 16 MB, Assoc 16, LineSize 64
----*----------- Data Cache 4, Level 1, 32 KB, Assoc 8, LineSize 64
----*----------- Instruction Cache 4, Level 1, 64 KB, Assoc 4, LineSize 64
----*----------- Unified Cache 8, Level 2, 512 KB, Assoc 8, LineSize 64
----*----------- Unified Cache 9, Level 3, 16 MB, Assoc 16, LineSize 64
-----*---------- Data Cache 5, Level 1, 32 KB, Assoc 8, LineSize 64
-----*---------- Instruction Cache 5, Level 1, 64 KB, Assoc 4, LineSize 64
-----*---------- Unified Cache 10, Level 2, 512 KB, Assoc 8, LineSize 64
-----*---------- Unified Cache 11, Level 3, 16 MB, Assoc 16, LineSize 64
------*--------- Data Cache 6, Level 1, 32 KB, Assoc 8, LineSize 64
------*--------- Instruction Cache 6, Level 1, 64 KB, Assoc 4, LineSize 64
------*--------- Unified Cache 12, Level 2, 512 KB, Assoc 8, LineSize 64
------*--------- Unified Cache 13, Level 3, 16 MB, Assoc 16, LineSize 64
-------*-------- Data Cache 7, Level 1, 32 KB, Assoc 8, LineSize 64
-------*-------- Instruction Cache 7, Level 1, 64 KB, Assoc 4, LineSize 64
-------*-------- Unified Cache 14, Level 2, 512 KB, Assoc 8, LineSize 64
-------*-------- Unified Cache 15, Level 3, 16 MB, Assoc 16, LineSize 64
--------*------- Data Cache 8, Level 1, 32 KB, Assoc 8, LineSize 64
--------*------- Instruction Cache 8, Level 1, 64 KB, Assoc 4, LineSize 64
--------*------- Unified Cache 16, Level 2, 512 KB, Assoc 8, LineSize 64
--------*------- Unified Cache 17, Level 3, 16 MB, Assoc 16, LineSize 64
---------*------ Data Cache 9, Level 1, 32 KB, Assoc 8, LineSize 64
---------*------ Instruction Cache 9, Level 1, 64 KB, Assoc 4, LineSize 64
---------*------ Unified Cache 18, Level 2, 512 KB, Assoc 8, LineSize 64
---------*------ Unified Cache 19, Level 3, 16 MB, Assoc 16, LineSize 64
----------*----- Data Cache 10, Level 1, 32 KB, Assoc 8, LineSize 64
----------*----- Instruction Cache 10, Level 1, 64 KB, Assoc 4, LineSize 64
----------*----- Unified Cache 20, Level 2, 512 KB, Assoc 8, LineSize 64
----------*----- Unified Cache 21, Level 3, 16 MB, Assoc 16, LineSize 64
-----------*---- Data Cache 11, Level 1, 32 KB, Assoc 8, LineSize 64
-----------*---- Instruction Cache 11, Level 1, 64 KB, Assoc 4, LineSize 64
-----------*---- Unified Cache 22, Level 2, 512 KB, Assoc 8, LineSize 64
-----------*---- Unified Cache 23, Level 3, 16 MB, Assoc 16, LineSize 64
------------*--- Data Cache 12, Level 1, 32 KB, Assoc 8, LineSize 64
------------*--- Instruction Cache 12, Level 1, 64 KB, Assoc 4, LineSize 64
------------*--- Unified Cache 24, Level 2, 512 KB, Assoc 8, LineSize 64
------------*--- Unified Cache 25, Level 3, 16 MB, Assoc 16, LineSize 64
-------------*-- Data Cache 13, Level 1, 32 KB, Assoc 8, LineSize 64
-------------*-- Instruction Cache 13, Level 1, 64 KB, Assoc 4, LineSize 64
-------------*-- Unified Cache 26, Level 2, 512 KB, Assoc 8, LineSize 64
-------------*-- Unified Cache 27, Level 3, 16 MB, Assoc 16, LineSize 64
--------------*- Data Cache 14, Level 1, 32 KB, Assoc 8, LineSize 64
--------------*- Instruction Cache 14, Level 1, 64 KB, Assoc 4, LineSize 64
--------------*- Unified Cache 28, Level 2, 512 KB, Assoc 8, LineSize 64
--------------*- Unified Cache 29, Level 3, 16 MB, Assoc 16, LineSize 64
---------------* Data Cache 15, Level 1, 32 KB, Assoc 8, LineSize 64
---------------* Instruction Cache 15, Level 1, 64 KB, Assoc 4, LineSize 64
---------------* Unified Cache 30, Level 2, 512 KB, Assoc 8, LineSize 64
---------------* Unified Cache 31, Level 3, 16 MB, Assoc 16, LineSize 64
But it should look more like this:
Code:
Logical Processor to Cache Map:
**-------------- Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64
**-------------- Instruction Cache 0, Level 1, 64 KB, Assoc 4, LineSize 64
**-------------- Unified Cache 0, Level 2, 512 KB, Assoc 8, LineSize 64
********-------- Unified Cache 1, Level 3, 8 MB, Assoc 16, LineSize 64
--**------------ Data Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64
--**------------ Instruction Cache 1, Level 1, 64 KB, Assoc 4, LineSize 64
--**------------ Unified Cache 2, Level 2, 512 KB, Assoc 8, LineSize 64
----**---------- Data Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64
----**---------- Instruction Cache 2, Level 1, 64 KB, Assoc 4, LineSize 64
----**---------- Unified Cache 3, Level 2, 512 KB, Assoc 8, LineSize 64
------**-------- Data Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64
------**-------- Instruction Cache 3, Level 1, 64 KB, Assoc 4, LineSize 64
------**-------- Unified Cache 4, Level 2, 512 KB, Assoc 8, LineSize 64
--------**------ Data Cache 5, Level 1, 32 KB, Assoc 8, LineSize 64
--------**------ Instruction Cache 5, Level 1, 64 KB, Assoc 4, LineSize 64
--------**------ Unified Cache 5, Level 2, 512 KB, Assoc 8, LineSize 64
--------******** Unified Cache 6, Level 3, 8 MB, Assoc 16, LineSize 64
----------**---- Data Cache 6, Level 1, 32 KB, Assoc 8, LineSize 64
----------**---- Instruction Cache 6, Level 1, 64 KB, Assoc 4, LineSize 64
----------**---- Unified Cache 7, Level 2, 512 KB, Assoc 8, LineSize 64
------------**-- Data Cache 7, Level 1, 32 KB, Assoc 8, LineSize 64
------------**-- Instruction Cache 7, Level 1, 64 KB, Assoc 4, LineSize 64
------------**-- Unified Cache 8, Level 2, 512 KB, Assoc 8, LineSize 64
--------------** Data Cache 8, Level 1, 32 KB, Assoc 8, LineSize 64
--------------** Instruction Cache 8, Level 1, 64 KB, Assoc 4, LineSize 64
--------------** Unified Cache 9, Level 2, 512 KB, Assoc 8, LineSize 64
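For anyone who wants to check a map like this mechanically rather than by eye, here is a minimal Python sketch that reads lines in the format shown above and reports which logical processors share each L3. The names `CacheEntry`, `parse_cache_map` and `l3_groups` are made up for illustration and are not part of any tool; the sketch only assumes the "mask, cache kind, level, size" line layout seen in the dumps.

```python
import re
from typing import NamedTuple


class CacheEntry(NamedTuple):
    cpus: frozenset  # logical processor indices marked with '*'
    kind: str        # "Data", "Instruction" or "Unified"
    level: int       # cache level (1, 2 or 3)
    size: str        # size as printed, e.g. "512 KB"


# Matches e.g. "**-------------- Unified Cache 0, Level 2, 512 KB, ..."
LINE_RE = re.compile(
    r"^([*-]+)\s+(Data|Instruction|Unified) Cache \d+, Level (\d+), (\d+ [KM]B)"
)


def parse_cache_map(text):
    """Turn each recognised map line into a CacheEntry; skip everything else."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        mask, kind, level, size = m.groups()
        cpus = frozenset(i for i, c in enumerate(mask) if c == "*")
        entries.append(CacheEntry(cpus, kind, int(level), size))
    return entries


def l3_groups(entries):
    """Sorted lists of logical processors that share each Level 3 cache."""
    return sorted(sorted(e.cpus) for e in entries if e.level == 3)


# Three lines taken from the "should look like" map.
sample = """\
**-------------- Unified Cache 0, Level 2, 512 KB, Assoc 8, LineSize 64
********-------- Unified Cache 1, Level 3, 8 MB, Assoc 16, LineSize 64
--------******** Unified Cache 6, Level 3, 8 MB, Assoc 16, LineSize 64
"""
entries = parse_cache_map(sample)
groups = l3_groups(entries)
print(groups)
```

Fed the "currently looks like" map instead, the same function would return sixteen single-processor L3 groups, which is exactly the mismatch being pointed out.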
I think the question might be: can AMD do that?
I am surprised that AMD didn't release a 6-core Ryzen 7 with higher clocks and the same TDP. They could have released a 6-core with a 3.8 GHz base clock / 4.2 GHz turbo at 95 W.
For both, the definition has been pretty consistent for some time. I think it might not have been that different since perhaps the 90nm or 65nm generations, basically when either one hit the point where power consumption became crippling to future scaling and reliability.
Keep in mind that these are whole-system power consumption figures (including memory, motherboard, etc.). But it seems that AMD measures TDP differently than Intel. AMD's TDP seems to be closer to Intel's SDP (scenario design power), which roughly means common power usage.
That hasn't been true for some time, and Intel likely broke from that first. AMD followed perhaps a little later, once it was able to get more than rudimentary power/thermal monitoring.
TDP for Intel means absolute peak power consumption.
Not just TDP, since this seems to hit AVX clocks in some products if one core so much as sees one AVX2 instruction, which cannot reasonably overwhelm a cooler. This might go to power delivery/dissipation for specific parts of the chip rather than a global measure of the cooling solution.
Intel CPUs need to run heavy AVX2 (FMA) code on all cores to reach TDP (AVX2 code is known to reduce clocks to maintain TDP).
It seems reasonable to count on GF's process being inferior, but absent that, Zen's designers specifically noted that they've done things like use low-leakage and high-density cells for most of Zen. That means they've tuned the implementation with slower cells in areas that they reason won't need the significant area and leakage penalties of the cells that have half the linear delay, which holds true as long as the CPU is in some specified clock range. Pushing those efficient cells beyond that requires boosting the current enough to wreck power efficiency, or it takes them to the point of risking rapid physical degradation.
AMD's lead engineer already said that they have a list of low-hanging fruit to improve Zen IPC in the future. It is the first iteration of a brand new architecture. Also, many sources say that GlobalFoundries' 14nm process is significantly inferior to Intel's new 14nm+. An equivalent process would increase performance and lower power consumption a lot.
If the two CCXs are only tied together by a 22GB/s link, that's just appalling.
What would the test be to isolate this bandwidth? The auto-translation of the discussion is a little rough for me; was this measured or disclosed by AMD? In order to profile this, the lines in the other CCX would need to be dirty, otherwise they wouldn't respond (a line in the shared state does not respond in MOESI). Perhaps this is subject to some kind of mandatory write-back of dirty data if leaving the visibility of a CCX, putting a ceiling on bandwidth+overhead?
So the software fix is to treat each CCX like a quasi NUMA node?
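To make the quasi-NUMA idea concrete, here is a minimal Python sketch of how a scheduler or application might carve the logical processors into per-CCX sets and keep each worker inside one of them. The helper names `ccx_cpu_sets` and `assign_workers` are hypothetical, and the numbering (SMT siblings adjacent, logical processors 0-7 on the first CCX, 8-15 on the second) is an assumption taken from the Windows cache map earlier in the thread; another OS may enumerate them differently.

```python
def ccx_cpu_sets(num_logical=16, threads_per_core=2, cores_per_ccx=4):
    """Split logical processor indices into one set per CCX.

    Assumes logical processors are numbered contiguously per CCX,
    as in the cache map posted above (an assumption, not a guarantee).
    """
    per_ccx = threads_per_core * cores_per_ccx
    if num_logical % per_ccx:
        raise ValueError("logical processor count is not a multiple of CCX size")
    return [set(range(start, start + per_ccx))
            for start in range(0, num_logical, per_ccx)]


def assign_workers(num_workers, cpu_sets):
    """Round-robin workers over CCXs; each worker gets exactly one CCX's set."""
    return [cpu_sets[i % len(cpu_sets)] for i in range(num_workers)]


sets = ccx_cpu_sets()
plan = assign_workers(3, sets)
for worker, cpus in enumerate(plan):
    print(f"worker {worker} -> logical processors {sorted(cpus)}")
```

On Linux, a set like these could be handed to `os.sched_setaffinity(0, cpus)` to pin the current process; the point is only that threads sharing data stay behind the same L3 instead of bouncing lines across the inter-CCX link.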
That number is also half the measured bandwidth of Ryzen's dual-channel memory controller.
Apparently, they got the word directly from AMD.
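For a rough sense of scale on the "half the memory bandwidth" remark, the theoretical peak of a dual-channel DDR4 setup is just transfers per second times 8 bytes per 64-bit channel times the channel count. The sketch below does that arithmetic; the DDR4-2666 speed is purely an assumption for illustration (the thread does not say what memory was used), and measured bandwidth always lands below the theoretical peak.

```python
def dram_peak_gbs(mega_transfers_per_s, channels=2, bus_bytes=8):
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes).

    bus_bytes=8 corresponds to a 64-bit DDR4 channel.
    """
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9


# Assumed memory speed, not taken from the thread.
peak = dram_peak_gbs(2666)
half = peak / 2
print(f"peak {peak:.1f} GB/s, half of peak {half:.1f} GB/s")
```

Swapping in a different transfer rate shows how the "half" figure moves with memory speed, which is worth keeping in mind when comparing against the quoted 22GB/s.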