Predict: Next gen console tech (9th iteration and 10th iteration edition) [2014 - 2017]

That would work both ways:

Based on what technical expertise do you guys think that a large number of games have taken heavy advantage of the ESRAM through custom code?
Edit: nevermind... Sebbbi said it better. :)

Intel's L4 EDRAM cache (in some Broadwell and Skylake models), on the other hand, is a fully automated cache. Caches need a considerable amount of extra die space for cache tags and coherency hardware.
Just for curiosity's sake, it's worth noting that at least the 128MB "Crystalwell" die consumes 1/4th of the CPU's L3 (2MB) for cache tags... The trade-off is massively worth it though; I'm not aware of any real-world situation where 6MB L3 + eDRAM loses to 8MB L3 with no eDRAM. You could probably make such a benchmark, but it sounds fairly contrived. :)
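
For a quick back-of-envelope on those numbers (the 64-byte tag granularity below is my assumption; the real tag arrangement may differ):

```python
# Back-of-envelope on the figures above: 128 MB of eDRAM, with ~2 MB of the
# 8 MB L3 given up for tags. The 64-byte line size is assumed for illustration.
edram_bytes = 128 * 1024 * 1024
line_bytes = 64                                  # assumed tag granularity
lines = edram_bytes // line_bytes                # ~2.1 million lines
tag_bytes = 2 * 1024 * 1024                      # 1/4 of the 8 MB L3, as quoted
print(tag_bytes / lines)                         # ~1 byte of tag/state per line
print(100 * tag_bytes / edram_bytes)             # ~1.6% overhead vs. the eDRAM it maps
```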

Okay, that's enough off-topic for today I think. :)
 
Check out some GDC and SIGGRAPH presentations (from the last three years). ESRAM-specific custom optimizations are mentioned by several teams. I am confident that every single AAA game nowadays uses ESRAM heavily, and has lots of custom code for it. Without ESRAM the total memory bandwidth of Xbox One is 68 GB/s. Even indie games nowadays are getting so graphically intensive that losing roughly 2/3 of your total memory bandwidth is not a realistic choice.

You need to manually manage your ESRAM usage and allocation policies, ...
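
As a hypothetical sketch of the kind of manual placement policy being described: decide which render targets get the 32 MB of ESRAM and which spill to DDR3. Every name, size and bandwidth number here is made up for illustration; this is not any real XDK interface.

```python
# Hypothetical sketch of manual ESRAM placement: give the 32 MB budget to the
# render targets with the most traffic per byte, spill the rest to DDR3.
ESRAM_BUDGET = 32 * 1024 * 1024

def place_targets(targets):
    """targets: list of (name, size_bytes, est_bandwidth_gbps) tuples."""
    # Greedy: rank by estimated bandwidth per byte of footprint.
    ranked = sorted(targets, key=lambda t: t[2] / t[1], reverse=True)
    placement, used = {}, 0
    for name, size, _bw in ranked:
        if used + size <= ESRAM_BUDGET:
            placement[name] = "ESRAM"
            used += size
        else:
            placement[name] = "DDR3"
    return placement

# e.g. a 1080p G-buffer plus a few extras: not everything fits, so some
# surfaces fall back to DDR3.
rts = [("depth",      1920 * 1080 * 4, 80),
       ("gbuffer0",   1920 * 1080 * 4, 60),
       ("gbuffer1",   1920 * 1080 * 4, 60),
       ("gbuffer2",   1920 * 1080 * 4, 40),
       ("hdr_light",  1920 * 1080 * 8, 90),
       ("shadow_map", 2048 * 2048 * 4, 30)]
print(place_targets(rts))
```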

I have read some presentations on ESRAM usage. I did not fully understand them, but I came to the conclusion that some trade-offs are needed due to the ESRAM size.

Could you please tell us your wild guess at the size for an XboxOne.5?

Thanks.
 
From what I understand (and I'm not speaking for him), in practice he cannot. He is bound by NDA; even a guess, if he were correct and it came back onto him, would be a situation not worth stepping into. Overall it's less effort for him to do nothing. The man has programmed for 7 different consoles; he's been seeing the goodies long before the public. These next ones aren't suddenly going to be a different process for him.
 

I am sorry, I probably explained myself very badly.
I am trying to ask what minimum quantity of ESRAM an XboxOne.5 should have to solve the main trade-offs of the techniques now found in games.
 
That's a tough question to answer. We have a couple of other threads that address this, but it came down to cost, or at least we believe it did.

Paired with DDR3 they needed more bandwidth, hence the introduction of ESRAM. If they had paired it with GDDR5, which was harder to acquire in those quantities, they could have used the remaining die space for more GPU. Because they could not procure the GDDR5, and Sony took a risk (went with 4GB originally and managed to get 8GB in the end), the Xbox became the weaker of the two consoles.

For an Xbox 1.5 to be backwards compatible they only need to add in 32MB of ESRAM :). No reason to increase that scratchpad further if you have HBM technologies coming.
 
Assuming MS went for the same CU/ESRAM trade-off with XB1.5, would that be 30 CUs vs 36 CUs on PS4k? Would it make that much difference?

Or they could go for a slightly bigger die area and CU parity?
 
Assuming MS went for the same CU/ESRAM trade-off with XB1.5, would that be 30 CUs vs 36 CUs on PS4k?
One would assume so, as the area ought to be about the same relative amount.
Would it make that much difference?
To what? Peak performance, it'll be 1/6th less. BW, it'd add 180+ GB/s (maybe clocked higher for more) to whatever RAM it's paired with.

Or they could go for a slightly bigger die area and CU parity?
They could. It's always a matter of costs versus gains. Is the expense worth it?
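
To put rough numbers on the "1/6th less" peak figure above (the clock is an assumption, used only to make the ratio concrete):

```python
# Rough numbers behind the "1/6th less" peak figure. GCN: 64 ALUs per CU,
# 2 flops per ALU per clock (FMA). The clock value is assumed for illustration.
def peak_tflops(cus, clock_ghz):
    return cus * 64 * 2 * clock_ghz / 1000.0

clock_ghz = 0.911                         # assumed, same for both parts
print(peak_tflops(36, clock_ghz))         # ~4.2 TFLOPS
print(peak_tflops(30, clock_ghz))         # ~3.5 TFLOPS
print(1 - 30 / 36)                        # ~0.167, i.e. 1/6 less peak throughput
```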
 
Sorry, meant performance difference. 83% of the PS4k doesn't seem like it'll make that much of a difference.

Wouldn't have thought MS would want to be behind on specs for a 2nd time though, even if it's an even smaller difference on screen than on current hardware.

They'd either have to stump up the extra cost or wait for HBM2?
 
It's an unknown quantity since we don't know the hardware restrictions. I think it's safe, for me at least, to assume that Microsoft's strategy is software first; by putting software and digital content purchasing first, full backwards compatibility is the ideal direction to head.

The next Xbox being fully compatible all the way back to the X360 is a pretty big deal, with the major cost being integrating 32 MB of ESRAM or equivalent into the next machine.

While I think HBM could emulate ESRAM, I don't think it actually has similar characteristics, meaning an HBM-only solution would not directly translate to easy BC.
 
What could Microsoft launch for $500 in fall 2017 is my question now. A Zen CPU, Vega GPU and 16GB of HBM2?
They can, at best guess, offer what Sony is offering. It's highly unlikely they could offer more when dealing with the same technologies and selling to the same industry, without blowing some sort of budget through the roof. I honestly can't see them being that far from their competitor. Considering how close together this particular generation is (the only real difference being the memory architecture and the associated costs to see them through), I would predict we're going to see something similar for this next round as well.
 
If the Xbox system is targeting 1080p, I think it can be on the lower side of Neo, in exchange for better BC and software solutions.
That kind of contradicts Phil Spencer's comments about having a significant upgrade as opposed to a half step.
 
Not necessarily. Going from 12 CUs to just shy of 36 CUs is a significant upgrade in performance.
 
Why does Microsoft's solution have to be mutually exclusive (ESRAM or no ESRAM)? How likely is it they include the ESRAM on a separate module for backward compatibility, while the main SOC uses unified memory built around GDDR5? XB2 games wouldn't necessarily need to use or even have access to the ESRAM, but XB1 games & UWP apps needing it would have it. This backward compatibility module could be a shrunk-down version that could be removed later down the road once they are able to completely replace it with a software solution. It could even be used to replace the current XB1 SOC for a lower-cost slim version. I know separate SOCs would be more expensive at first, but it might be worth it to ease the transition from ESRAM to no ESRAM while maintaining full hardware backward compatibility.

Tommy McClain
 
These may be really bad questions, but:

Why have ESRAM specifically? If we assume ESRAM is not magical in its latency and GPUs are latency sensitive anyway, why would any other fast pool of memory not work as a replacement, providing it's over 32MB and has enough bandwidth? The move engines with their built-in functions seem more important than the memory itself?

Why HBM2? HBM is in the wild and 4GB of it is a suitable expansion on ESRAM; throw in a large pool of DDR4 and you have a similar setup to now, with more of everything. Having the OS not add contention to the GPU's fast memory seems useful given the split OS design.
 
@3dilettante might be better able to answer this question, but IIRC HBM, while having huge bandwidth, still cannot replicate the concurrent R/W bandwidth that eSRAM provides. When we break it down, at most the Xbox can see up to 104GB/s of reads and writes. eSRAM is not magical, but that doesn't change the fact that it is both cheap and highly effective. The hardware requirements may not call for HBM; why not run 32MB of eSRAM with 8GB of high-speed GDDR5? That should be a cheaper solution, no?
 
Why bother with eSRAM if you have enough GDDR5? I assumed it is only there because DDR3 doesn't have enough bandwidth, whereas GDDR5 offers enough.

Performance-wise, eSRAM does concurrent R/W of up to 200-odd GB/s combined, but if another solution offered, say, 300GB/s, surely it can read then write 32MB quicker than eSRAM can concurrently R/W 32MB? Unless those pesky details get in the way of my super basic understanding.
Concurrent operation helps make the previously reported bandwidth comparable to the competition, but is there more to concurrent operation?

I admit HBM is expensive if you're also adding more memory to the system, but what if Microsoft wants more than 8GB of RAM? Does GDDR5 scale that far easily/cheaply?
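
To put rough numbers on that read-then-write question (bandwidth figures are the round ones used in these posts, not official specs, and contention from the CPU and display is ignored):

```python
# Move a 32 MB surface (read it once, write it once) through concurrent-R/W
# ESRAM vs. one bigger single pool.
size_gb = 32 / 1024.0                     # 32 MB expressed in GB

esram_each_way_gbps = 100.0               # ~100 GB/s reads and ~100 GB/s writes, overlapped
t_esram = size_gb / esram_each_way_gbps   # read and write proceed in parallel

pool_gbps = 300.0                         # hypothetical single pool
t_pool = (2 * size_gb) / pool_gbps        # 32 MB in + 32 MB out share one bus

print(t_esram * 1000)                     # ~0.31 ms
print(t_pool * 1000)                      # ~0.21 ms: the fatter single pool wins this simple case
```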
 
Why HBM2? HBM is in the wild and 4GB of it is a suitable expansion on ESRAM; throw in a large pool of DDR4 and you have a similar setup to now, with more of everything.
Nobody's going to mix and match two different types of memory, not in a cost-sensitive device like a console. Intel is doing it with the Xeon Phi (combining GDDR5 and Hybrid Memory Cube, IIRC), but that is a server CPU costing thousands of dollars.
 
Why bother with eSRAM if you have enough GDDR5? I assumed it is only there because DDR3 doesn't have enough bandwidth, whereas GDDR5 offers enough.
Part of the designers' reasoning about the use of ESRAM was that they were not comfortable with where GDDR5 would take them from a cost and power standpoint.
They also saw that developers generally adapted to the EDRAM of the 360, and the ESRAM was essentially that, but better.

The aggressive memory choice wasn't a sure bet for Sony, since the PS4 did lower its memory speed somewhat at the same time that it bumped capacity to 8 GB relatively late in the process.

The question of why bother if there is enough GDDR5 is that a design decision was made that enough GDDR5 was going to require sacrifices along other axes.
Power-wise, AMD's marketing for the HBM-based Fury X system put a 512-bit 290X subsystem at 37-50W. Cutting it in half for the PS4's 256 bits (slightly faster) is a not insignificant chunk of the power budget for a console working in the 100-200W (the further from 200, the better) range.
http://www.anandtech.com/show/9390/the-amd-radeon-r9-fury-x-review/5
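
Naively halving that quoted figure for a 256-bit bus gives a sense of the scale (the 150 W console budget below is an assumed mid-point of the 100-200 W range):

```python
# Scaling the cited memory-interface power down to a console-sized bus.
# 37-50 W is the figure quoted above for the 512-bit 290X memory subsystem.
full_bus_w = (37, 50)                                   # 512-bit GDDR5 subsystem
half_bus_w = tuple(w / 2 for w in full_bus_w)           # naive halving for 256-bit
print(half_bus_w)                                       # ~18.5-25 W
print([round(100 * w / 150, 1) for w in half_bus_w])    # ~12-17% of a 150 W budget
```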

Going above 256-bit seemed to be out of consideration. Also, looking at die shots of Durango, there seems to be less of an obstacle to shrinking the chip than for Orbis. The DDR3 PHY blocks are nicely packed into the corners of the chip, so at least from a visual standpoint there's more free perimeter to handle a shrink, while an interface that takes up more perimeter can make shrinking more problematic.
A shrink of Durango now means the ESRAM takes up (naively) half the area. That could mean area savings, or room for more capacity. The ESRAM and its interface would shrink, more so than any external bus. Whether the on-die interface is scalable is an unanswered question, but in theory, if the ESRAM and its interface shrink as silicon is wont to do, then 2x the capacity AND bandwidth might be possible within the same power budget.

I admit HBM is expensive if you're also adding more memory to the system, but what if Microsoft wants more than 8GB of RAM? Does GDDR5 scale that far easily/cheaply?
GDDR5 has generally lagged the capacity of other standards a bit. DDR3 was a more sure capacity bet at the time, and it's more recently that higher-density GDDR5 has made it unnecessary for the PS4 to use clamshell mode.
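
For illustration of the clamshell point (the chip densities below are assumptions based on commonly reported configurations):

```python
# What clamshell mode buys, using the PS4's 256-bit GDDR5 bus as the example.
bus_bits, chip_bits = 256, 32
channels = bus_bits // chip_bits                 # 8 GDDR5 channels

gb_per_4gbit_chip = 0.5
print(channels * 1 * gb_per_4gbit_chip)          # 4.0 GB: one 4Gb chip per channel
print(channels * 2 * gb_per_4gbit_chip)          # 8.0 GB: clamshell, two chips share a channel

gb_per_8gbit_chip = 1.0
print(channels * 1 * gb_per_8gbit_chip)          # 8.0 GB with denser chips, no clamshell needed
```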
 