Xbox One (Durango) Technical hardware investigation

Sony will have a lite version; this is the design for MS's lite version. It's also what will go into OEM-brandable products.

The lite version for both Sony and MS is what comes out this year. The 'Next' versions, the powerful ones, will be based on the lite design BUT with extra SoCs.

Lite in 2013
Next in 2014

Your source? You seem pretty sure that will happen, rather than just making a prediction.
 
I guess the part that I'm finding most surprising is the size of the ESRAM.

The CPU and GPU core and clock numbers aren't surprising, but I thought they would go for more embedded RAM (64+ MB), and that the CPU would have direct access to it as well, rather than going through the north bridge as in the block diagram.

Given what's in this SoC, it's still probably a 100W chip.
 
Xenos' ROPs needed that huge BW because they were unable to compress/decompress data, which allowed a simplified design for eDRAM integration. If the Durango GPU is a more conventional design, the reduced BW shouldn't affect it.

According to the article in IEEE Micro, the high ROPs<->EDRAM bandwidth was a design decision to guarantee sustainable throughput for the ROPs. Since compression schemes are lossless, the amount of bandwidth consumed is unpredictable: it depends on the frame content.

As I stated in the previous post, if the embedded memory is simply a read/write scratchpad, 102.4 GB/s doesn't make sense to me. It would be quite cheap and useful to provide more bandwidth, since it would end up being used for both texturing and ROP operations. Hence, a Xenos-like setup is the most likely in my opinion.
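To make the compression point concrete, here's a toy sketch of why lossless compression makes consumed ROP bandwidth content-dependent. Every number in it (ROP count, clock, bytes per pixel, compression ratios) is an assumed placeholder, not a Xenos or Durango spec:

```python
# Toy sketch (hypothetical numbers): why lossless framebuffer compression
# makes the bandwidth the ROPs actually consume depend on frame content.
ROPS = 8                 # pixels per clock (assumed)
CLOCK_HZ = 500e6         # ROP clock (assumed)
BYTES_PER_PIXEL = 8      # 32bpp colour + 32-bit Z (assumed)

raw_bw = ROPS * CLOCK_HZ * BYTES_PER_PIXEL   # uncompressed write traffic, B/s

# Effective traffic after lossless compression depends on how compressible
# the frame is; the worst case compresses not at all (ratio 1.0).
for label, ratio in [("flat, very compressible frame", 4.0),
                     ("typical frame", 2.0),
                     ("noisy, incompressible frame", 1.0)]:
    print(f"{label}: {raw_bw / ratio / 1e9:.1f} GB/s consumed")

# To *guarantee* sustained throughput you have to budget for the 1:1 case,
# which is the rationale given above for Xenos' wide ROP<->eDRAM link.
```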
 
I guess the part that I'm finding most surprising is the size of the ESRAM.

The CPU and GPU core and clock numbers aren't surprising, but I thought they would go for more embedded RAM (64+ MB), and that the CPU would have direct access to it as well, rather than going through the north bridge as in the block diagram.

Given what's in this SoC, it's still probably a 100W chip.
What would be the physical size of 64MB of such an SRAM?
Is there an easy way to calculate that? It sounds barely less difficult to do than 64MB of L3, which would be pretty big.
 
What would be the physical size of 64MB of such an SRAM?
Is there an easy way to calculate that? It sounds barely less difficult to do than 64MB of L3, which would be pretty big.

As I noted in my previous post, roughly 40 mm^2 for 1T-SRAM, based on Wikipedia figures at 45 nm:
0.14 mm^2 per Mbit at 45 nm -> conservative estimate of 0.08 mm^2 per Mbit at 28 nm
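Spelling out that arithmetic as a quick sketch (the per-Mbit densities are the figures quoted above; the 28 nm value is the conservative scaling guess, and the result covers the cell array only, not peripheral logic):

```python
# Rough 1T-SRAM array area from the per-Mbit figures above.
MM2_PER_MBIT = {"45nm": 0.14, "28nm": 0.08}   # 28 nm value is a scaling guess

def array_area_mm2(megabytes, node="28nm"):
    """Cell-array area only; ignores redundancy, ECC and peripheral logic."""
    megabits = megabytes * 8
    return megabits * MM2_PER_MBIT[node]

for size_mb in (32, 64):
    print(f"{size_mb} MB @ 28 nm: ~{array_area_mm2(size_mb):.0f} mm^2")
# 32 MB -> ~20 mm^2, 64 MB -> ~41 mm^2
```

The same estimate also explains the "32MB could be more like 20 mm^2" remark further down the thread.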
 
The entire setup reeks of some kind of new tile-based/deferred rendering approach, along the lines of what makes the PowerVR GPUs so very efficient.

It would also be very scalable (integrated into a PC GPU card, it would basically push CrossFire scaling to 100% and solve any micro-stuttering issues). It would be the end of huge monolithic GPU chips as we know them.

Right now we're talking about a new approach that throws out traditional PC comparison metrics like pure flops/mm^2, etc. Really interested in this...
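For anyone unfamiliar with what "tile based" means here, a minimal, purely illustrative binning sketch follows; the tile size and geometry are arbitrary assumptions, with no connection to how Durango or PowerVR actually work:

```python
# Minimal screen-space binning sketch: assign triangles (by bounding box)
# to fixed-size tiles so each tile can later be shaded entirely out of a
# small on-chip buffer. Purely illustrative; not any real GPU's pipeline.
TILE = 32  # tile edge in pixels (arbitrary choice)

def bin_triangles(triangles, width, height):
    """triangles: list of ((x0,y0),(x1,y1),(x2,y2)) in pixel coordinates."""
    tiles_x = (width + TILE - 1) // TILE
    tiles_y = (height + TILE - 1) // TILE
    bins = {(tx, ty): [] for ty in range(tiles_y) for tx in range(tiles_x)}
    for idx, tri in enumerate(triangles):
        xs = [p[0] for p in tri]
        ys = [p[1] for p in tri]
        for ty in range(max(0, int(min(ys)) // TILE),
                        min(tiles_y - 1, int(max(ys)) // TILE) + 1):
            for tx in range(max(0, int(min(xs)) // TILE),
                            min(tiles_x - 1, int(max(xs)) // TILE) + 1):
                bins[(tx, ty)].append(idx)
    return bins

bins = bin_triangles([((10, 10), (100, 20), (50, 90))], 1280, 720)
print(sorted(k for k, v in bins.items() if v))  # tiles (0..3, 0..2) touched
```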
 
It's a strange diagram, with a few interesting notes:

Kinect In is an interesting quirk. It was mentioned, with regard to Kinect, that the USB bus of the original 360 was bandwidth-limited... which would be the sane, boring answer. A more interesting option is that they moved the processing off-board.

But the interesting one, personally, is the bit that seems to say HDMI 1.4a *IN*. Looking back at previous designs, is the 720 going to sit 'in-line' on your existing HDMI TV feed(s)? Allowing you to control everything with a Kinect interface, seamlessly switch from TV to game, and have your online status overlaid on TV (and vice versa).

The design also looks like it could be scaled down into a set-top box? (Switch to 1/2 cores, reduce the memory, etc.)

Performance-wise, all that is over my head :).
 
Unless I missed something, we don't really know what GCN2 is supposed to be. *shrug*

No, I don't either, but I'm gonna guess it doesn't allow anything like 100% efficiency from those shaders. So if these specs are remotely true, we must be looking at a more advanced setup.
 
As I noted in my previous post, roughly 40 mm^2 for 1T-SRAM, based on Wikipedia figures at 45 nm:
0.14 mm^2 per Mbit at 45 nm -> conservative estimate of 0.08 mm^2 per Mbit at 28 nm

So 32MB @ 28nm could be more like 20 mm^2. Very little.

I love the ESRAM amount; it's perfect: cheap and effective. I hate throwing away a lot of budget on that stuff.

It makes much more sense this gen than last gen (where the Xenos daughter die started at 80 mm^2 or something). I think it's probably the smart engineering move this gen, whereas last gen I'm not so sure it was.
 
What would be the physical size of 64MB of such an SRAM?
Is there an easy way to calculate that? It sounds barely less difficult to do than 64MB of L3, which would be pretty big.

Not sure; in terms of SRAM, 64MB would probably be over 100 mm^2. Probably too big. And a cache would be a lot bigger, since you have additional logic to deal with all the tags, replacement, fills, etc. With eDRAM I thought that maybe they could go to 64+ MB?

So then the question is: why SRAM over eDRAM? Better latency? Predictable performance? Manufacturing issues/manufacturing capability?

Due to drastically increasing leakage power at smaller and smaller process nodes, I think eDRAM has become more power efficient (that's why IBM is using it for the caches in the PowerA2, for example).
 
So 32MB @ 28nm could be more like 20 mm^2. Very little.

I love the ESRAM amount; it's perfect: cheap and effective. I hate throwing away a lot of budget on that stuff.

It makes much more sense this gen than last gen (where the Xenos daughter die started at 80 mm^2 or something). I think it's probably the smart engineering move this gen, whereas last gen I'm not so sure it was.

You make it sound as if they reinvested that money back into other hardware components.
 
I get the impression that this Durango, leaving out design costs, is gonna be profitable at launch, work as a DVR, and use less power, backing previous claims.
Still want to know more about the "extras".

Edit: the only thing I have an issue with is bandwidth. Although it's available GPU-wide, it seems quite low compared to the Xbox 360's eDRAM. They seem very conservative to me; I was expecting something like DDR4.
 
Yes, only with MSAA, which virtually no one uses anymore.

So in the 360 calculation, they were basically saying 64 GB/s with 8 ROPs (read + write, 32bpp colour + Z) @ 500 MHz. Doesn't 102.4 GB/s seem rather on the low side? Isn't that significantly worse with MRTs?

Is it actually feasible to render to both the ESRAM and DDR3 simultaneously?
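Running that back-of-the-envelope maths: the 360 figure checks out, and scaling it up with purely hypothetical next-gen numbers (the 16 ROPs @ 800 MHz below are placeholders, not from the leak) shows why 102.4 GB/s looks tight:

```python
# Back-of-the-envelope ROP bandwidth: pixels per clock * bytes touched per pixel.
def rop_bandwidth_gbs(rops, clock_hz, bytes_per_pixel, read_and_write=True):
    traffic = rops * clock_hz * bytes_per_pixel
    return traffic * (2 if read_and_write else 1) / 1e9

# The 360 figure quoted above: 8 ROPs @ 500 MHz, 32bpp colour + 32-bit Z,
# read + write.
print(rop_bandwidth_gbs(8, 500e6, 8))    # 64.0 GB/s

# Purely hypothetical next-gen numbers (16 ROPs @ 800 MHz are placeholders,
# NOT from the leak): the same worst-case blending maths wants ~205 GB/s,
# which is why 102.4 GB/s looks tight, and tighter still with MRTs.
print(rop_bandwidth_gbs(16, 800e6, 8))   # 204.8 GB/s
```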
 
You make it sound as if they reinvested that money back into other hardware components.

Even if they didn't, a sweet price helps too, theoretically (yes, I know that doesn't mean we get that either, but at least it's possible).

Plus, given budget X, every dollar saved somewhere helps elsewhere. Yes, maybe they would have gone with 10 CUs or something pathetic.

This box at 299 vs Orbis at 399?
 
I wonder how closely coupled the SRAM is to the GPU. If it really is SRAM, I wonder if it could be used to increase register storage, or the global/local shader store. That's potentially interesting; it might even be worth the trade-off.

I guess if you can store inputs for compute jobs there, then the low latency could be a big win in memory-bound scenarios, and as I've said before, it's far more common to be memory bound than ALU bound for many compute tasks.

The only other major reason I can see for using SRAM would be ease of manufacture: it would all be done on one process.
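A quick way to see why so many compute jobs end up memory bound is a roofline-style comparison of arithmetic intensity against machine balance; the peak-FLOP and bandwidth figures below are illustrative assumptions, not Durango specs:

```python
# Roofline-style check: a kernel is memory bound when its arithmetic
# intensity (flops per byte moved) falls below peak_flops / peak_bandwidth.
# All figures below are illustrative placeholders, not Durango specs.
PEAK_FLOPS = 1.2e12        # assumed GPU peak, FLOP/s
PEAK_BW    = 102.4e9       # assumed memory bandwidth, B/s

machine_balance = PEAK_FLOPS / PEAK_BW   # ~11.7 flops per byte

# Example: a large SAXPY (y = a*x + y): 2 flops per 12 bytes moved
# (read x, read y, write y, 4 bytes each).
saxpy_intensity = 2 / 12.0               # ~0.17 flops per byte

print("machine balance:", round(machine_balance, 1))
print("SAXPY intensity:", round(saxpy_intensity, 2),
      "-> memory bound" if saxpy_intensity < machine_balance else "-> ALU bound")
```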
 
Not sure; in terms of SRAM, 64MB would probably be over 100 mm^2. Probably too big. And a cache would be a lot bigger, since you have additional logic to deal with all the tags, replacement, fills, etc. With eDRAM I thought that maybe they could go to 64+ MB?

So then the question is: why SRAM over eDRAM? Better latency? Predictable performance? Manufacturing issues/manufacturing capability?

Due to drastically increasing leakage power at smaller and smaller process nodes, I think eDRAM has become more power efficient (that's why IBM is using it for the caches in the PowerA2, for example).
I thought eDRAM needed expensive and finicky processes and caused nightmares for later shrinks. If eDRAM is a possibility, it raises the question of why Cell never used it for the LS (it would have had 1MB per LS in the same space), and why MS could never integrate it into their GPU. IBM is in a special situation with their super-expensive chips :D
 
My point was that no one would make up that it had an HDMI in, which lends validity to the 'leak'.

I wouldn't go that far. The home entertainment/living room segment has been strongly pursued by MS for the last few years, and you can already buy things like live pay-per-views of UFC on the Xbox 360 and much more, plus there's been talk about using it as a DVR/receiver in some capacity for a long time. So there's no extra validity to these rumors because of the HDMI in.

Speaking of HDMI in:
To me, it's actually a bit confusing, because what the hell would you need it for? I don't know much about the American market, but over here every newish cable/decoder box is also a DVR, and I don't see any mention of DTV/cable decode capabilities in that diagram; without that, it just doesn't make sense to chain a DVR to another DVR (Durango).
Unless the idea is that you can get picture-in-picture, so you can watch football in a corner while playing a game, or watch football on the TV and grind in a corner. But that just seems mildly pointless and not exactly a killer feature.

Edit: just read Dumbo11's post, and that seems like a much more feasible use for chaining it to a DVR. On-screen prompts when something has finished downloading, so you don't have to switch back and forth, plus notifications of other activity, seem much more worthwhile than what my crappy imagination could come up with. It makes sense with the always-on rumor, and it screams "integration!".
 