Predict: Next gen console tech (9th iteration and 10th iteration edition) [2014 - 2017]

Status
Not open for further replies.
If Microsoft ditched ESRAM for the Xbox One.5 or Xbox Two, could they use HBM to simulate ESRAM for backwards compatibility? My understanding is AMD's first implementation of HBM is 512 GB/s. Because they have some API abstraction, could they not fake ESRAM by directing those api calls to a portion of HBM? Just wondering what issues I might be missing. Really, I guess what I'm asking is how Microsoft can ditch ESRAM and maintain backwards compatibility.

Also, as a hypothetical, could Microsoft release an Xbox One.5 with 8GB of HBM, but with lower bandwidth than the Fiji cards? Say, half 256 GB/s, or is the bandwidth a function of the number of stacked chips? I suppose in that case they could go with GDDR5 or GDDR5x. DDR4 speed doesn't seem to compare to ESRAM.
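The idea of "faking" ESRAM by redirecting accesses into a reserved slice of a faster unified pool can be sketched as a simple address translation. All base addresses below are illustrative assumptions, not real Xbox One values; only the 32 MB size comes from the actual console.

```python
# Hypothetical sketch: emulate ESRAM by remapping its address range
# into a reserved carve-out of a large HBM pool.
# ESRAM_BASE and HBM_RESERVED_BASE are made-up addresses for illustration.

ESRAM_BASE = 0x8000_0000           # assumed guest-visible ESRAM base
ESRAM_SIZE = 32 * 1024 * 1024      # 32 MB, as on Xbox One
HBM_RESERVED_BASE = 0x1_0000_0000  # assumed carve-out inside the HBM pool

def translate(addr: int) -> int:
    """Map a guest ESRAM address into the reserved HBM window;
    pass ordinary DRAM addresses through unchanged."""
    if ESRAM_BASE <= addr < ESRAM_BASE + ESRAM_SIZE:
        return HBM_RESERVED_BASE + (addr - ESRAM_BASE)
    return addr

# An access to the first ESRAM byte lands at the start of the carve-out:
assert translate(ESRAM_BASE) == HBM_RESERVED_BASE
```

In practice the remapping would live in the memory controller or page tables rather than software, but the mapping itself is this simple; the open question raised in the thread is whether the backing memory matches ESRAM's bandwidth and latency behavior, not whether the addresses can be redirected.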
 
From a bandwidth perspective this isn't a big deal. But the ESRAM has really low latencies, which could become a problem. The ESRAM bandwidth is also unusual: roughly 208 GB/s peak, but for only 32 MB of memory. To emulate that, consider how fast the HBM would need to be to deliver that kind of bandwidth to such a small region. The ESRAM is divided into 512 KB chunks; using the 208 GB/s figure, each chunk reaches about 3.25 GB/s. The question is whether HBM can sustain that bandwidth for a single 512 KB block, or for 64 such blocks at once, at least for a 1:1 emulation.
Maybe you could use a trick here and there on the software side. But if the latencies come into play... I don't know. GDDR5(X) has much higher frequencies that might compensate for some of that, but HBM DRAM runs at much lower frequencies and therefore shouldn't be able to compensate for the latencies.
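The per-chunk figure above is a simple division; a back-of-envelope check, taking the 208 GB/s peak number at face value:

```python
# Back-of-envelope: peak ESRAM bandwidth split evenly across 512 KB chunks.
peak_bw_gbs = 208                  # claimed ESRAM peak bandwidth, GB/s
esram_kb = 32 * 1024               # 32 MB expressed in KB
chunk_kb = 512                     # size of one ESRAM chunk
num_chunks = esram_kb // chunk_kb  # 32 MB / 512 KB = 64 chunks
bw_per_chunk = peak_bw_gbs / num_chunks

print(num_chunks, bw_per_chunk)    # 64 chunks at 3.25 GB/s each
```

This assumes the peak bandwidth divides evenly across chunks, which is the simplification used in the post; real access patterns would not load every chunk uniformly.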
 
MS could ditch both the Jaguar cores and the ESRAM constraints and do an approximate software emulation, like they did with the X360 on XB1.
 
The ESRAM is not that special honestly; MS probably has a compile option in newer SDKs to bypass it.
They learned after Xbox 360.
 

ESRAM is under developer control. It's not just a compiler option of "use esram". It's "special" in the sense that it's a uniquely huge pool of ESRAM that compensates for fairly modest system bandwidth, and without it the X1 would be crippled.

What MS learned from Xbox 360 was that they wanted embedded ram - the system architects even talked about this.
 
A "compile option" to bypass it, seriously?

You should first learn how central, complex, highly customizable and important a piece of hardware the XB1's ESRAM is, for instance by reading the numerous threads about it here or elsewhere.
 
But approximate software emulation like that only works because the PowerPC architecture was (most of the time) so slow compared to x86. The 10 MB of embedded DRAM was quite easy to emulate on the SRAM, but emulating SRAM on DRAM is really hard.
That said, keeping SRAM just for compatibility shouldn't be such a huge problem. The transistor count is high right now, but SRAM is easy to fabricate, easy to shrink, and fast. In the worst case it could be repurposed as a third-level cache or something similar. Let's just say that even in the future, a small, fast memory pool can always be put to use if it's available.
 
In the case of an Xbox One-and-a-Half, what would be easier and/or cheaper for Microsoft:
to upgrade the whole system, ESRAM included (64/96 MB), at a moderate scale, or to keep the ESRAM as it is now, give the main RAM more bandwidth, and upgrade the rest?
For the sanity of developers, I think the first option is best, but I am just a hobbyist,
and I have no idea of the cost of an SoC with more ESRAM.
 
Bandwidth wise ESRAM could be ditched in favor of newer and cheaper tech, but I don't know what would replace it in terms of latency and concurrent access with the system RAM, if one were to emulate it for B/C.
 
Just how much die space would it take up on a die shrink anyway? I think esram would shrink well in comparison to the rest of the die, so why not just leave it on there?
Next generation once you have HBM2 etc then can get rid of it via emulation.
 
Even HBM2 is still DRAM, so it matches the bandwidth, yes, but not the latencies or the simultaneous read & write. HBM2 also runs (much like HBM) at very low frequencies (~1 GHz), which doesn't make it any easier to approach SRAM latencies.
That might not be a big deal for graphics calculations, but for other GPGPU workloads it could become critical.
 
If it were a big deal, would leaving it on there for B/C be a huge waste of space, even considering the node it would be on?
When not in B/C mode it could be used as a general scratchpad/cache, and once memory tech can replace it, replace it.
Maybe the overall speed improvement of the chip would mask the latency difference anyway.
 
Remember that they're also talking about their customers, who are not consumers but fabless semiconductor companies. From chip tape-out to an actual consumer product can be a window of 12-18 months.
From the recent conference call:
trial 7nm production in 1H2017, full volume in 1H2018 (which used to be end of 2017). People with their ears to the ground in this field predict the earlier part of 2018 for volume production. Apple will be there, of course, but Mark Liu of TSMC said they expected fifteen 7nm tape-outs in 2017. There are a lot of interested customers for this node. Hmm. ;)
 
Has anyone spoken about the impact and importance of the low latency? When it was raised in the past, it was basically dismissed as an advantage (MS themselves said graphics work is latency tolerant). So unless we have evidence that latency is a factor, I'm inclined to think it can just be ignored. Gains in processing rate might offset any latency disadvantage.
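Whether a faster part can offset a latency disadvantage is easy to sanity-check: latency in wall-clock time is cycles divided by clock frequency, so memory that takes more cycles per access can still respond sooner in nanoseconds if it runs at a high enough clock. The cycle counts and clocks below are made-up illustrative numbers, not measured figures for any real part:

```python
# Illustrative only: invented latency/clock numbers showing how a higher
# clock can mask a higher cycle-count latency.

def latency_ns(cycles: float, clock_ghz: float) -> float:
    """Wall-clock access latency in nanoseconds."""
    return cycles / clock_ghz

# Assumed numbers: a slow-clocked on-die pool vs. a faster-clocked DRAM path.
esram_like_ns = latency_ns(cycles=20, clock_ghz=0.853)  # ~23.4 ns
dram_like_ns  = latency_ns(cycles=40, clock_ghz=2.0)    # 20.0 ns

print(round(esram_like_ns, 1), round(dram_like_ns, 1))
```

With these assumed numbers the faster-clocked memory ends up ahead despite needing twice the cycles, which is the "gains in processing rate might offset any latency disadvantage" point in concrete form.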
 
So forgoing the 150-200 GB/s BW it offers and limiting themselves to the ~60 GB/s of DDR3 only?

No, I'm suggesting dropping ESRAM and DDR3 and going with one large pool of 8 GB+ of faster memory, more in line with video cards and the PS4. One of the issues with ESRAM seems to be that it's a little too small at 32 MB. The memory seems to take up a lot of die space, so increasing it in a future iteration doesn't seem like the best idea. I could be wrong.

@Allandor You're right. I wasn't thinking about bandwidth to a small memory block. HBM being capable of 256+ GB/s doesn't mean it's capable of 256+ GB/s to a small block of 32MB.
 
I was replying to Egmon, who seems to think MS are disabling the ESRAM in XB1 in software and leaving devs with only 60 GB/s to work with. A single large pool as you suggest would work as long as latency isn't an issue, and I for one believe it isn't and that what you suggest would work.
 