Esram in Scorpio as L3 cache in non-BC mode.

Proelite

Thread is motivated by this article about Haswell's decoupled L3 cache.
Since an always-powered console doesn't have to worry about CPU frequency dropping below GPU frequency, I don't think a ring bus will be necessary.

For BC modes, the GPU will have exclusive access to 32 MB of ESRAM. In non-BC modes, the CPU and GPU can share the ESRAM in all possible configurations.

Thoughts?
 
32 MB of ESRAM was around 100 mm^2 at 28nm. At 16nm it should be around 50 mm^2.

I read somewhere that ESRAM doesn't scale down as well as other components. It could have been a random post somewhere so don't take this at face value.
We could try to find the x-ray of a 14-core Broadwell EP which has 35MB L3 cache and measures 246mm^2 using Intel's 14FF.
 
SRAM is the easiest thing to scale down due to highly repetitive and ordered structure. That's why new nodes have test chips solely done with SRAM.

Density between designs can differ based on I/O of course.

32 MB of ESRAM was around 100 mm^2 at 28nm. At 16nm it should be around 50 mm^2.
It's still a large chunk, and only 32MB for a manually managed scratchpad would be awful for larger buffers.
 
I can't find it, but I remember an article saying there was an issue scaling down esram arrays while also scaling up the bandwidth? Like it needed more space if one needed more speed?
 
Well, more I/O does require space, but we're talking about just maintaining it here?
 
I read somewhere that ESRAM doesn't scale down as well as other components. It could have been a random post somewhere so don't take this at face value.
We could try to find the x-ray of a 14-core Broadwell EP which has 35MB L3 cache and measures 246mm^2 using Intel's 14FF.

Intel's 14FF is actually much denser than TSMC's 16nm, which is closer in size to Intel's 22nm.

If ChipWorks do a die shot of the Xbox One S die we should have a definite answer. The Xbox One S SOC is 33% smaller, but obviously not all the parts will shrink uniformly.
 
The Apple A9, with 4 MB of L3 SRAM on 16nm, is 104.5 mm^2.
The 4 MB of SRAM takes about 4% of the die: approximately 4 mm^2.

Scaling that by 8x, 32 MB of ESRAM should be between 32 and 40 mm^2 on Scorpio, assuming the A9 L3 cache is a good approximation.
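The estimate above can be sketched as a quick calculation. This is only a rough sketch: it assumes SRAM area scales linearly with capacity, and the `overhead` factor (extra area for I/O and control logic) is a hypothetical parameter, not a measured figure. The function name is made up for illustration.

```python
def scale_sram_area(ref_area_mm2, ref_capacity_mb, target_capacity_mb, overhead=1.0):
    """Scale a reference SRAM area linearly by capacity.

    `overhead` is a hypothetical multiplier for extra I/O or control logic.
    """
    return ref_area_mm2 * (target_capacity_mb / ref_capacity_mb) * overhead


# Figures quoted in the post: A9 die is 104.5 mm^2, 4 MB SRAM is about 4% of it.
A9_DIE_MM2 = 104.5
A9_SRAM_FRACTION = 0.04
a9_sram_mm2 = A9_DIE_MM2 * A9_SRAM_FRACTION

# Low end: pure linear scaling from 4 MB to 32 MB.
low_estimate = scale_sram_area(a9_sram_mm2, 4, 32)

# High end: same scaling with an assumed 20% overhead (illustrative only).
high_estimate = scale_sram_area(a9_sram_mm2, 4, 32, overhead=1.2)

print(f"A9 SRAM area: {a9_sram_mm2:.2f} mm^2")
print(f"32 MB estimate: {low_estimate:.1f} to {high_estimate:.1f} mm^2")
```

With these inputs the range comes out at roughly 33 to 40 mm^2, in line with the 32 to 40 mm^2 guess, but the overhead figure is a placeholder rather than anything known about the real design.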
 
PS4 Slim die still smaller than Xbox One S

Which is in itself very interesting as the X1 CPU is the same area, the CUs are smaller area, and the memory interface is smaller area. Even taking off 40 mm^2 for the esram, the X1 chip would be roughly the same area as the PS4 Slim. Which (from an outsider perspective) makes no apparent sense. Though obviously there will be an engineering reason for it. Different node? Additional logic and memory blocks (X1 has some, and X1S may have integrated more components)? Removal of redundant CUs on PS4?

Regarding esram on Scorpio, especially with the new cache arrangement for Vega (ROPs now go through level 2) I wonder how necessary the esram is to meeting X1 performance. If you were going to spend extra silicon on esram it might be of more benefit to the chip in general to put it into a larger L2 to absorb more of the traffic that would have gone to and from esram. Then everything could benefit from it.
 
SRAM is the easiest thing to scale down due to highly repetitive and ordered structure. That's why new nodes have test chips solely done with SRAM.

Density between designs can differ based on I/O of course.

But.. I thought eSRAM was DRAM that masquerades as SRAM, so that it can be much easier to interface with what uses it.
What I'm describing there does exist. But wikipedia and JEDEC call that pseudo-static RAM, or PSRAM.

So.. is ESRAM or eSRAM just SRAM? Why didn't they call it just SRAM then :). Did they feel like adding an "E" made for better press? SRAM has been embedded into things for 40 years lol. Or is it some 1T-SRAM variant perhaps.

As to the original question: I understand that there's a difference between a plain SRAM memory and an SRAM cache (cache structures, associativity), which is complicated and plugs into the CPU's memory hierarchy. So, if Xbox Scorpio uses a Zen APU, and a Zen CPU's or APU's L3 cache can work like that, or it can have an L4 cache that works like that, I'd say why not. Otherwise they wouldn't bother, and there's a dumb 32MB ESRAM for backwards compatibility, which can be used as you see fit anyhow.
i.e. I doubt very much that AMD will take special measures to make the CPU different just for one console.
Now, does the new Xbox use HBM2 and can stomach the loss of ESRAM for BC? Why not. Is that too expensive and it uses GDDR5X (say) instead? Then perhaps ESRAM is kept as-is because there's a latency cost - I don't know how relevant that is.
 
So.. is ESRAM or eSRAM just SRAM? Why didn't they call it just SRAM then :). Did they feel like adding an "E" made for better press? SRAM has been embedded into things for 40 years lol. Or is it some 1T-SRAM variant perhaps.

IIRC, the 5 billion transistor claim seemed to only be somewhat realistic with the assumption of 6T SRAM.

As for ESRAM, I thought it was just "enhanced" as opposed to embedded, or rather just another way to distinguish it as not being part of the cache hierarchy.
 
Which is in itself very interesting as the X1 CPU is the same area, the CUs are smaller area, and the memory interface is smaller area. Even taking off 40 mm^2 for the esram, the X1 chip would be roughly the same area as the PS4 Slim. Which (from an outsider perspective) makes no apparent sense. Though obviously there will be an engineering reason for it. Different node? Additional logic and memory blocks (X1 has some, and X1S may have integrated more components)? Removal of redundant CUs on PS4?

Both.
The X1S has a new H265 4K HDR video decoder, plus a new hardware scaler for 4K.
In the PS4 Slim I'm pretty sure they took away the 2 redundancy CUs they had in the original and simply shrunk the chip. It's a really low effort design, as the engineering resources were probably focused on Neo anyway.
 
Well, I actually like the esram, because it is just really, really fast for a small chunk of memory. But with a much more capable chip and main-memory system, you don't need it for "emulation". You simply get "back" the time you might lose through the higher memory access times of the larger GPU. The GPU in Scorpio will be much faster than in the original X1. So e.g. if you lose 2ms each frame from slower memory access, the GPU can still process every frame much faster. You have more than 4 times the computation power. So it should be easy to "simulate" the esram on Scorpio.
And as all titles have a fixed top frame rate (30 or 60), you won't get much of a difference. Like BC now with Xbox 360 games, you might get more stable 30fps on 30fps games or more stable 60fps on 60fps games.

But a "small" esram would really help with the memory contention problem. It seems it isn't worth it, though.
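The argument above can be put into numbers with a back-of-the-envelope sketch. Everything here is an assumption for illustration: the 4x speedup and the 2ms memory penalty are the post's round figures, the original frame time is a made-up worst case that just fits a 30fps budget, and the model naively treats GPU time as scaling inversely with compute.

```python
def bc_frame_time_ms(x1_gpu_ms, speedup, memory_penalty_ms):
    """Naive model: GPU work shrinks by `speedup`, then a fixed
    per-frame penalty is added for slower memory access without esram."""
    return x1_gpu_ms / speedup + memory_penalty_ms


BUDGET_30FPS_MS = 1000 / 30   # about 33.3 ms per frame
BUDGET_60FPS_MS = 1000 / 60   # about 16.7 ms per frame

# Hypothetical X1 title that uses the whole 30fps budget on the GPU.
x1_frame_ms = BUDGET_30FPS_MS

scorpio_frame_ms = bc_frame_time_ms(x1_frame_ms, speedup=4.0, memory_penalty_ms=2.0)

print(f"Estimated Scorpio frame time: {scorpio_frame_ms:.1f} ms")
print(f"Fits 30fps budget: {scorpio_frame_ms <= BUDGET_30FPS_MS}")
```

Under these assumptions the frame lands at a bit over 10ms, well inside the 33.3ms budget, which is the point of the post: a fixed frame-rate cap leaves plenty of headroom to absorb the memory penalty. Real workloads that are bandwidth-bound rather than compute-bound would not scale this cleanly.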
 
As for ESRAM, I thought it was just "enhanced" as opposed to embedded, or rather just another way to distinguish it as not being part of the cache hierarchy.
Embedded DRAM has the "E" because it is DRAM that is embedded on the processor die. Embedding DRAM requires a special fab process. Mostly IBM uses it.

I believe the ESRAM in Xbox One was called ESRAM (with E) simply because the Xbox 360 had an EDRAM scratchpad for the GPU and both are used for roughly the same purposes. Embedding SRAM doesn't require any extra fab trickery, and SRAM is commonly used by all kinds of processors for all kinds of purposes. It is commonly called just SRAM.
 
"Specs Leaked!"... I'm not seeing any new specs, only speculation based on a paper which has probably been freely available to D3D devs on MSDN since July 2016.
 