The AMD Execution Thread [2020]

Status
Not open for further replies.
The added power benefits of having an IO die on a more advanced process are also compelling, especially for the server IO dies which are pretty large and power hungry.
The problem is that PHYs don't really scale well with smaller processes, and the chip needs to be a certain size to house all the required PHYs. Both of AMD's current IO die sizes are already dictated by PHYs (they fill the edges, and there seems to be quite a bit of "empty silicon"; see the die shots here, for example: https://www.anandtech.com/show/1504...60x-and-3970x-review-24-and-32-cores-on-7nm/3 )
 

Not entirely sure whether those bits are "empty silicon", but given that large parts of the chip are IO and PHYs, which don't scale well, the chip size may not shrink much anyway. And don't forget they'll have a bunch of new tech like next-gen Infinity architecture, DDR5, PCIe 5, USB4, CXL, etc., which would also increase the amount of logic required. Even if the die doesn't shrink, the benefit is power, and power is going to be even more important with all the aforementioned new high-speed interfaces. If they still have some "free" space to use, there's always Infinity Cache they can fill it with!
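To put rough numbers on the "PHYs don't scale" argument, here is a minimal sketch that shrinks logic and PHY area by different factors. The function name and every figure in it (100 mm^2 die, 60% PHY share, the two scale factors) are made-up assumptions for illustration, not measurements of any AMD die.

```python
def shrunk_die_area(total_mm2, phy_fraction, logic_scale, phy_scale):
    """Estimate die area after a shrink, scaling PHY and logic area separately.

    total_mm2    -- die area on the old process
    phy_fraction -- share of that area taken by PHYs / analog IO (0..1)
    logic_scale  -- new/old area ratio for logic (0.5 means it halves)
    phy_scale    -- new/old area ratio for PHYs (near 1.0 means it barely shrinks)
    """
    if not 0.0 <= phy_fraction <= 1.0:
        raise ValueError("phy_fraction must be between 0 and 1")
    phy_area = total_mm2 * phy_fraction
    logic_area = total_mm2 - phy_area
    return phy_area * phy_scale + logic_area * logic_scale


# Hypothetical inputs, chosen only to illustrate the effect.
old_area = 100.0
new_area = shrunk_die_area(old_area, phy_fraction=0.6, logic_scale=0.5, phy_scale=0.9)
print(f"{old_area:.0f} mm^2 -> {new_area:.0f} mm^2")
```

With these assumed inputs, halving the logic area only takes the die from 100 to about 74 mm^2. The sketch also ignores the perimeter constraint raised earlier in the thread: if the PHYs along the edges set the die's dimensions, the real shrink could be smaller still.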
 
I'm not sure I see much empty silicon. There are lines and less distinct features in the darker regions. There's a lot of interconnect between the various clients and IO that goes in the areas between the obvious blocks.
 
Obviously the interconnects have to be there, "empty silicon" or not, to get data from A to B. What I meant is that there's no logic there, and if the PHYs didn't already dictate the die size, those spaces could probably have been trimmed off, making the chip cheaper. (So no, I don't think the space is necessary for routing; it's there only because the PHYs make the chip so big.)
The new standards will need new logic for sure, but how much? Adding new interfaces might also mean more and/or bigger PHYs to fit in. Moving to a smaller process would save power, but it might also make the die too expensive if it requires "wasted silicon" to fill the void. And in the context of desktops and servers, do we even need the power savings? That IO die is 20 watts; how much would shaving, say, 10 watts off it and giving them to the cores actually help? In mobile, where power savings matter more, they're continuing to use monolithic designs for now anyway.
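The "how much would 10 watts help" question is easy to sketch as arithmetic. The 20 W IO die and the 10 W saving come from the post above; the 105 W package budget and the 16-core count are assumptions picked for illustration, and the function name is hypothetical.

```python
def per_core_gain(package_w, io_die_w, io_savings_w, cores):
    """Return (W per core before, W per core after, relative gain).

    Assumes every watt saved on the IO die is handed to the cores and that
    the package power budget stays fixed.
    """
    if io_savings_w > io_die_w:
        raise ValueError("cannot save more than the IO die draws")
    before = (package_w - io_die_w) / cores
    after = (package_w - io_die_w + io_savings_w) / cores
    return before, after, after / before - 1.0


# 20 W IO die and 10 W saving are from the post; 105 W and 16 cores are assumed.
before, after, gain = per_core_gain(105.0, 20.0, 10.0, 16)
print(f"{before:.2f} W -> {after:.2f} W per core ({gain:.1%} more)")
```

Under these assumptions, the cores get roughly 12% more power. Since clock speed generally rises more slowly than power near the top of the voltage/frequency curve, the performance gain would likely be smaller than that.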
 
Would a similar complaint of wasted die space be leveled at the original Zen layout? The IOD area seems close to what the area would be from an equivalent number of uncores.
 

The consumer IO die is rumored to be ~20W. For the server IOD, with at least 4x the memory channels, PCIe lanes, IF links, etc., the power would certainly be a lot higher (50W+ maybe?). A significant reduction there would allow more power (and therefore performance) for the CCDs. Given the higher margins on the EPYC parts, the additional cost of 7nm (which will be lower in 2022 anyway) could easily be absorbed, especially if it increases performance. As with Milan, if they reuse the IO die for the generation after Genoa, we're looking at a part which will be used through 2024 at least.
 
Isn't Milan the last gen to use DDR4? Either they have double-standards (pun intended) in the memory controller, or they would need a new IO-Die after Milan anyway (and hopefully, with moving a lot of chips over to 5 nm, there'd be free 7 nm capacity).
 

Yes, I was referring to the new IO die of Genoa (Zen 4) and the generation after (Zen 5), both of which will use DDR5. If they follow the same strategy as with Rome and Milan, the IO die will remain the same across both, so it would make even less sense for it to be on a 14nm-class process. The one major difference is that Rome came out when DDR4 was already mature, and it supported the maximum JEDEC speed, DDR4-3200. Given that Genoa is the beginning of the DDR5 transition, I expect the IO die to only support DDR5-4800 or 5500 initially. So if they want to bump up the speed for Zen 5, they'd need a new or revised IO die.
 
photo019_l.jpg

I came across this slide presented by AMD at ISSCC 2020. Note the wording "N-1 generation silicon" for the IO, instead of an explicit mention of 14nm or 12nm.
 
True, but then one needs to be quite flexible with definitions to get GloFo 12/14nm to be N-1 from TSMC N7

Yes, it is a bit obfuscated with all the half nodes in between, but only 7nm is a "full node" ahead of 14/16nm. And 5nm is the next so-called "full node", which would make 7nm the N-1 (again slightly obfuscated by variants such as N7+ and N6). It's also possible that the slide refers specifically to Zen 2, so the N-1 strategy is certainly not guaranteed for future generations.
 
Wasn't 10nm a full node too, even if AMD never used it?
 

10nm was a bit more than a half node, with about 30-40% area scaling, and was only meant for SoCs, somewhat like 20nm was. But 14nm to 7nm gives you around 70% scaling, which is around what it traditionally has been for full nodes.
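As a quick check on how those figures fit together, the sketch below works out what the 10nm to 7nm step would have to deliver. It assumes the percentages above are area reductions (so "70%" means the block ends up at 30% of its original area); that reading and the function name are assumptions, and the inputs are just the rough numbers quoted in the post.

```python
def implied_second_step(total_reduction, first_step_reduction):
    """Area reduction of the second shrink, given the total and the first step.

    Reductions are fractions of area removed (0.7 means 30% of the area is left).
    Remaining areas multiply: (1 - total) = (1 - first) * (1 - second).
    """
    for r in (total_reduction, first_step_reduction):
        if not 0.0 <= r < 1.0:
            raise ValueError("reductions must be in [0, 1)")
    return 1.0 - (1.0 - total_reduction) / (1.0 - first_step_reduction)


# Rough figures from the post: 30-40% for 14nm->10nm, about 70% for 14nm->7nm.
for first in (0.30, 0.40):
    second = implied_second_step(0.70, first)
    print(f"14nm->10nm {first:.0%}, 14nm->7nm 70% => 10nm->7nm about {second:.0%}")
```

On that reading, the 10nm to 7nm step alone would be roughly a 50-57% area reduction, which is larger than the 14nm to 10nm step and consistent with 10nm being treated as an in-between node.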
 