Xbox One (Durango) Technical hardware investigation

Is it at all plausible MS had it designed to yield very well with the understanding they could up the clocks (or even use CUs that would have been disabled?) in response to the competition, taking a lower yield and/or beefing up their stock cooling? Doesn't seem so to me.
Well, I don't think anybody would purposefully design anything that yields really badly... :LOL:
How much coarse-grained redundancy, if any, is built into the design (be it Durango or Orbis) is unknown. Still, I don't see how it would do them any good to swallow extra costs only to end up with a "lesser" product in the eyes of a given set of potential customers.

By the way, it is still unknown what Sony and MSFT will get out of the foundry. Setting aside late rumors from unproven sources about Durango production troubles, it would not be out of the ordinary if, instead of an overclock, we got the contrary and some extra CUs were disabled.
I would say that Sony has a lot more room here. IMHO MSFT can't go lower than 12 CUs, and they need 32MB of ESRAM (I could even think of a few extra constraints, as I could believe that Durango is more of a SoC than Orbis, a bit like Kabini with pretty much everything on the chip, whereas Sony has its custom southbridge), and a clock speed bump would only win them so much. Sony starts from 18 CUs, which leaves them quite some room.
To me it is more likely for Sony to lower its specs (while retaining the lead in GPU power) according to what comes back from the foundry than for MSFT to do the contrary. My view on the topic is that MSFT does not have much to win by swallowing extra costs; the one that has its hands free is Sony. Depending on what comes back from the foundry they can do (if needed) any of the following:
Disable a CPU core or a couple more, disable CUs, lower the GPU clock speed, etc.
All would have a positive effect on costs, and they would have to make pretty drastic cuts to lose their "advantage" (/extra GPU grunt).


EDIT

Actually it all gets me wondering about some statements by Sony execs about their SoC and savings with regard to production; in any case I think I won't post that here but in the thread about the BOM of Durango and Orbis.
 
I don't think you would have redundant CUs in the design; it's done in the PC space because you can bin the parts and still sell the ones that fail.
I think instead you'd design with a target you know you would get adequate yields for.
 
No chance of a console company tracking two related system designs all the way into pre-production stage before picking one for final retail units?
 
I don't think you would have redundant CUs in the design; it's done in the PC space because you can bin the parts and still sell the ones that fail.
I think instead you'd design with a target you know you would get adequate yields for.

There was a redundant SPE for Cell.
The rumored console specs are interesting in that we don't know about redundancy and disabled cores, but we know there's a lot of die area that defects could hit.
SRAM can have a lot of redundancy, and will probably have multiple banks for fault tolerance, but there's an octo-core Jaguar and at least 12 SIMDs.

The thing is that there are Jaguar SKUs we already know about, with a bunch of bins for clocks and differing core and GPU configurations, which means there are much smaller AMD chips that might not yield well enough.
 
I'd think those lower Jaguar bins are more about fitting power profiles in smaller devices than about not being able to produce reliably at medium clocks, but I'm just guessing.
 
No chance of a console company tracking two related system designs all the way into pre-production stage before picking one for final retail units?

Depends on the amount of variation between the two designs and the risk a company is willing to take with yields. Microsoft doesn't have a year+ advantage to get things right like they did with the 360 (which didn't have honest price competition until the PS3 dropped to $399 2 years after the 360 launched in '05), so any kind of major delay will mean a good chunk of sales lost to the PS4.

The risk is not worth the reward.
 
I'd think those lower Jaguar bins are more about fitting power profiles in smaller devices than about not being able to produce reliably at medium clocks, but I'm just guessing.

I'd expect an unspecified mix of defects, parametric yields, and segmentation.
For example, the dual-core 1 MB L2 models shouldn't have to lose half the L2, or a chip that can only salvage half its L2 can probably limp along with more than two cores.

The CPU-only chip can salvage chips with leaky CPUs, or capture any chips with defective SIMDs as well.
In terms of area, two Jaguar clusters are close in area to two Bulldozer modules, and we know the latter have defect bins. We already know GPUs with the likely SIMD count for either console have defect bins.
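
To illustrate the "lots of die area for defects to hit" point, here's a minimal sketch using the standard Poisson yield model. The die area, defect density, and the share of the die occupied by CUs are made-up illustrative numbers, not anything known about either console:

```python
import math

def poisson_yield(area_mm2, defects_per_mm2):
    """Fraction of dies with zero random defects (simple Poisson model)."""
    return math.exp(-area_mm2 * defects_per_mm2)

# Illustrative numbers only -- not actual Durango/Orbis figures.
die_area = 350.0        # mm^2, assumed large console SoC
defect_density = 0.002  # defects per mm^2, assumed
cu_fraction = 0.15      # share of the die taken by the CU array, assumed

perfect = poisson_yield(die_area, defect_density)
# Dies with no defect outside the CU array; assuming a single defective CU
# can simply be disabled, (most of) these can still be shipped.
salvage_pool = poisson_yield(die_area * (1 - cu_fraction), defect_density)

print(f"Perfect dies:                        {perfect:.1%}")
print(f"Shippable if one bad CU is allowed:  {salvage_pool:.1%}")
```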
 
I don't think you would have redundant CUs in the design; it's done in the PC space because you can bin the parts and still sell the ones that fail.
I think instead you'd design with a target you know you would get adequate yields for.
Well I would expect that much; actually Bonaire has no coarse-grained redundancy either (though I guess bad parts could be salvaged as HD 7770 or 7750).
Not to mention that both systems should embark a pretty big chip; going further for the sake of having coarse-grained redundancy could be sort of self-defeating (in turn affecting yields).

There was a redundant SPE for Cell.
The rumored console specs are interesting in that we don't know about redundancy and disabled cores, but we know there's a lot of die area that defects could hit.
SRAM can have a lot of redundancy, and will probably have multiple banks for fault tolerance, but there's an octo-core Jaguar and at least 12 SIMDs.

The thing is that there are Jaguar SKUs we already know about, with a bunch of bins for clocks and differing core and GPU configurations, which means there are much smaller AMD chips that might not yield well enough.
I was thinking of posting in that thread because, whereas a lot of people seem to expect MSFT to somehow up its specs, I think that Sony is in a better / more flexible position.

I'm not sure exactly (like any outsider) how the chips are put together, but I could see Sony's choice making sense with regard to production costs (cf. the comment of one of their execs).
I expect MSFT to have an almost "complete" SoC, I mean a SoC that integrates almost everything the system needs to function, like what AMD did with Kabini.
So there are a lot of possibilities for a chip to "go bad"; on the other hand it seems that Sony went for what looks like a pretty standard APU backed by a "super" southbridge.
As I see it, that chip is more "homogeneous": it is mostly CPU and GPU cores, so defects could be more easily hidden by resorting to coarse-grained redundancy.
As Sony starts with what seems a more powerful system (at least on the GPU side), I think that they have more options than MSFT to play with the specs depending on what comes back from the foundry.
For example, if yields were to be really underwhelming, and taking into account the cost of the memory, they could disable 1 CPU core, 2 SIMDs and 8 ROPs and still retain what would be perceived as a technical advantage (rough numbers sketched below); they might even tone down the clock speed (if they ever need to do any of that).
MSFT has fewer options; there are more critical parts in the chip (I mean parts that can't be disabled: sound and other IO / stuff you can't disable for yields) and they start with a more conservative GPU.
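
To put rough numbers on "still retain a technical advantage", here's a quick sketch. The 800 MHz clock and the 18/12 CU counts are the rumored figures of the time; the 16 CU configuration is purely hypothetical:

```python
def gpu_gflops(cus, clock_mhz, alus_per_cu=64, ops_per_alu=2):
    """Peak single-precision GFLOPS for a GCN-style GPU (FMA counts as 2 ops)."""
    return cus * alus_per_cu * ops_per_alu * clock_mhz / 1000.0

clock = 800  # MHz, rumored for both chips at the time

print("Rumored Orbis, 18 CUs:     ", gpu_gflops(18, clock), "GFLOPS")  # ~1843
print("Hypothetical cut to 16 CUs:", gpu_gflops(16, clock), "GFLOPS")  # ~1638
print("Rumored Durango, 12 CUs:   ", gpu_gflops(12, clock), "GFLOPS")  # ~1229
```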

I think that the rumors about production problems are FUD, but if there were to be some truth to them, I would think it could be related to the fact that Durango is a "proper" SoC with a lot of critical parts that have to function and can't be disabled for yields, or that could face design issues.
 
I just received an interesting pastebin link about the Durango dev kits from a friend. He certainly isn't a game designer (but he works in IT), so I don't know where he got it, and he wouldn't tell. Does this make sense or is this fake? I'm not really a tech guy and I figured this would be the best place to ask. I hope it is true... :cry:

http://pastebin.com/CiKCVeiA
I approved this post, but wish I hadn't. It's 99.99% probability bollocks. In summary, to save everyone else clicking the link, 8 GBs DDR3 and 6 GBs GDDR5 and discrete 8 core CPU that's not Jaguar and no eDRAM and discrete GPU yada yada "this is what the xbox fans are wanting to read and I want my 15 minutes of internet fame".

Move along, nothing to see here (unless MS have been secretly spinning black-ops lies for the past year and want to charge big bucks or take a significant loss!).
 
I'd expect an unspecified mix of defects, parametric yields, and segmentation.
For example, the dual-core 1 MB L2 models shouldn't have to lose half the L2, or a chip that can only salvage half its L2 can probably limp along with more than two cores.

The CPU-only chip can salvage chips with leaky CPUs, or capture any chips with defective SIMDs as well.
In terms of area, two Jaguar clusters are close in area to two Bulldozer modules, and we know the latter have defect bins. We already know GPUs with the likely SIMD count for either console have defect bins.

I was never suggesting a no defect scenario and I suppose we'll never really know and the durango apu is uncharted territory.
 
Any chance that the eSRAM is SRAM with embedded ECC logic? I ask because of this AMD patent.

Providing test coverage of integrated ECC logic in embedded memory
http://www.faqs.org/patents/app/20120266033

In one embodiment, the graphics card 120 may contain a graphics processing unit (GPU) 125 used in processing graphics data. The GPU 125, in one embodiment, may include an embedded memory 130. In one embodiment, the embedded memory 130 may be an embedded random access memory (RAM), an embedded static random access memory (SRAM), or an embedded dynamic random access memory (DRAM). In one or more embodiments, the embedded memory 130 may be an embedded RAM (e.g., an SRAM) with embedded ECC logic.

This patent shows a few interesting embodiments, including embedding ECC RAM into the northbridge and southbridge (possible PS4 implication?).

Microsoft Research has a presentation involving ECC memory.

http://research.microsoft.com/apps/video/dl.aspx?id=103620

In modern microprocessors and systems-on-a-chip, the embedded memory system plays a key role in determining the design’s overall performance, power, area, reliability, and yield. As fabrication process technologies scale into the deep nanometer regime, increasing device variability poses particular difficulties for memory design due to the large number of variation-sensitive near minimum-sized devices in the cell arrays which often must achieve working circuits out beyond six sigma of variation to meet design targets at economically acceptable yields. Furthermore, as scaling progresses, soft errors in the memory system will also increase in frequency and scope, and single error events are more likely to cause large-scale multi-bit errors. ... Finally, I will present an efficient multi-bit ECC technique tailored for correcting manufacture-time variation errors along with soft-errors by making use of two-dimensional (2D) erasure coding. The proposed scheme when combined with a small amount of row redundancy can improve the SRAM access latency, power, and stability by over 40% on average, while maintaining near 100% yield and run-time reliability. Conventional schemes require >70% variability-hardened redundancy or must disable >70% of memory to achieve the same level of tolerance.
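
As a side note, the "two-dimensional (2D) erasure coding" in the abstract is, at its core, row-plus-column parity over the array. A toy sketch of that idea (my own illustration, not Microsoft's actual scheme) on a small bit grid:

```python
def make_parity(grid):
    """Per-row and per-column XOR parity for a 2D bit grid."""
    rows = [sum(row) % 2 for row in grid]
    cols = [sum(col) % 2 for col in zip(*grid)]
    return rows, cols

def correct_single_error(grid, row_parity, col_parity):
    """Locate and flip a single flipped bit using the stored parities."""
    new_rows, new_cols = make_parity(grid)
    bad_r = [i for i, (a, b) in enumerate(zip(row_parity, new_rows)) if a != b]
    bad_c = [j for j, (a, b) in enumerate(zip(col_parity, new_cols)) if a != b]
    if len(bad_r) == 1 and len(bad_c) == 1:   # exactly one row and one column disagree
        r, c = bad_r[0], bad_c[0]
        grid[r][c] ^= 1                       # flip the faulty bit back
        return (r, c)
    return None                               # clean, or a multi-bit error beyond this toy code

data = [[1, 0, 1, 1],
        [0, 1, 0, 0],
        [1, 1, 1, 0],
        [0, 0, 1, 1]]
row_p, col_p = make_parity(data)

data[2][1] ^= 1                               # inject a single-bit fault
print("Corrected bit at", correct_single_error(data, row_p, col_p))  # -> (2, 1)
```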
 
Any chance that the eSRAM is SRAM with embedded ECC logic? I ask because of this AMD patent.
If you mean by eSRAM the 32MB on-die storage:
1) It's not disclosed if this has ECC.
2) This patent deals with a way to integrate testing of ECC logic, and you don't need 32MB for that.
 
If you mean by eSRAM the 32MB on-die storage:
1) It's not disclosed if this has ECC.
2) This patent deals with a way to integrate testing of ECC logic, and you don't need 32MB for that.

Never mind, I get what you are saying.

Thanks
 
Not sure where to post this; I would have gone with the now defunct "predict the next generation etc." thread, but reading articles about Haswell and the embedded eDRAM included with the GT3e version of the chip, I can't help but feel like somehow MSFT (but it applies to Sony too) fell on the wrong side of the technology curve.

The RealWorldTech article is, as usual, interesting, and so are sebbbi's (among others') comments on this forum; overall it got me wondering whether what Durango should achieve could have been achieved for less money.
D. Kanter estimates the die size of the eDRAM in Haswell at 70-80mm^2, that is for 128MB @ 22nm.
He estimated the bandwidth to that pool of memory at 64GB/s and later stated that the figure should be within 20-30% of the real bandwidth figure (sources told him that it undershot it).
Such a massive amount of memory should allow massive savings in memory traffic with the main RAM.
I wonder how it would affect CPU performance too if, like in Haswell (still unclear), the whole pool of memory is seen as a last level of cache.
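
As a very crude illustration of the "savings in memory traffic" point, here's how main-memory traffic drops as more of the working set is served from such a 128MB pool. The 60 GB/s total demand and the hit rates are assumptions pulled out of thin air; only the idea comes from the article:

```python
def ddr_traffic(total_demand_gbps, edram_hit_rate):
    """Bandwidth left on main memory once a fraction of accesses hits the
    on-package eDRAM (crude model, ignores write-back traffic)."""
    return total_demand_gbps * (1.0 - edram_hit_rate)

total_demand = 60.0  # GB/s of assumed bandwidth demand for a frame

for hit in (0.5, 0.8, 0.95):
    print(f"eDRAM hit rate {hit:.0%}: ~{ddr_traffic(total_demand, hit):.0f} GB/s still hitting DDR")
```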

Going by D. Kanter's info, that was doable using IBM's 32nm process within a reasonable transistor budget (the graphs in the RealWorldTech article show less than optimal scaling in memory cell size from 32nm to 22nm).

Anyway, the more I look at what Intel did with Haswell, which is set to ship soon, and compare it to what MSFT and Sony did, the more I feel like the latter might not have gone for the most economical road.

EDIT
For the sake of accuracy here is a link to the aforementioned RealWorldTech article and a quote from sebbbi taken from the "I can Haswell" thread:
sebbbi said:
Optimized g-buffer layout at 720 (similar to Cryengine 3) uses: 720p * (8888*2 + D24S8) = ~ 10.5 MB memory. Add the texture pages mapped, and you get ~60 MB required to render a frame (for opaque geometry). As the data set changes slowly from frame to frame (animation must be smooth to look good), there's not many texture pages you have to move in/out of the eDRAM every frame. 16 x 64 KB pages (per frame) would be enough for most of the time (except for camera teleports), and since we should assume the GPU can also sample textures directly from DDR memory, the eDRAM<->DDR memory transfer bandwidth would never be a bottleneck (you could amortize the transfers over several frames).

With an optimized texture layout (3 bytes per pixel) + optimized g-buffer layout (12 bytes per pixel) 1080p would need (worst case)... ~ 60 MB * 2.25 = ~135 MB. That could often be below 128 MB, allowing us to fully texture our frame from eDRAM. My calculations are however not taking account how much data the shadow maps require, or how much data the transparent passes require (particles, windows, etc). However this data could be sampled directly from the DDR memory, assuming the memory management can set priorities correctly (and try to keep a most often reused subset of 64 KB texture pages in the eDRAM).
Pretty much: aiming for 720p, with 128MB you have ways to never touch the main RAM, if I read it properly; you may even be left with some room for the CPU, for the GPU used in a GPGPU fashion, or for any other accelerators a given chip could embark.
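
For reference, a tiny sketch that just redoes sebbbi's arithmetic from the quote above (the ~60 MB per-frame figure is his estimate, not something derived here):

```python
MiB = 1024 * 1024

px_720 = 1280 * 720
px_1080 = 1920 * 1080

# Two 8888 render targets (4 B each) + D24S8 depth/stencil (4 B) = 12 B/pixel
gbuffer_720 = px_720 * 12 / MiB
print(f"720p g-buffer: ~{gbuffer_720:.1f} MB")        # ~10.5 MB

frame_720 = 60.0                                      # MB, g-buffer + mapped texture pages (sebbbi's figure)
frame_1080 = frame_720 * (px_1080 / px_720)           # scale by the 2.25x pixel count
print(f"1080p frame estimate: ~{frame_1080:.0f} MB")  # ~135 MB vs. the 128 MB pool
```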
 
Not sure where to post this; I would have gone with the now defunct "predict the next generation etc." thread, but reading articles about Haswell and the embedded eDRAM included with the GT3e version of the chip, I can't help but feel like somehow MSFT (but it applies to Sony too) fell on the wrong side of the technology curve.

The RealWorldTech article is, as usual, interesting, and so are sebbbi's (among others') comments on this forum; overall it got me wondering whether what Durango should achieve could have been achieved for less money.
D. Kanter estimates the die size of the eDRAM in Haswell at 70-80mm^2, that is for 128MB @ 22nm.
He estimated the bandwidth to that pool of memory at 64GB/s and later stated that the figure should be within 20-30% of the real bandwidth figure (sources told him that it undershot it).
Such a massive amount of memory should allow massive savings in memory traffic with the main RAM.
I wonder how it would affect CPU performance too if, like in Haswell (still unclear), the whole pool of memory is seen as a last level of cache.

Going by D. Kanter's info, that was doable using IBM's 32nm process within a reasonable transistor budget (the graphs in the RealWorldTech article show less than optimal scaling in memory cell size from 32nm to 22nm).

Anyway, the more I look at what Intel did with Haswell, which is set to ship soon, and compare it to what MSFT and Sony did, the more I feel like the latter might not have gone for the most economical road.

I guess you can afford to be more economical in your approach when there isn't much pressure on the graphics performance of your design.
 
Not sure where to post this; I would have gone with the now defunct "predict the next generation etc." thread, but reading articles about Haswell and the embedded eDRAM included with the GT3e version of the chip, I can't help but feel like somehow MSFT (but it applies to Sony too) fell on the wrong side of the technology curve.
They're on the "don't have Intel's engineering or process tech" curve, and they weren't going to spend the billions of dollars Intel invests in its 22nm process and fabs.

Going by D. Kanter's info, that was doable using IBM's 32nm process within a reasonable transistor budget (the graphs in the RealWorldTech article show less than optimal scaling in memory cell size from 32nm to 22nm).
What was doable? An IBM 32nm SRAM sized to 128MB, or 32nm eDRAM that is best known for being used for chips that cost as much as a small car?
 
They're on the "don't have Intel's engineering or process tech" curve, and they weren't going to spend the billions of dollars Intel invests in its 22nm process and fabs.


What was doable? An IBM 32nm SRAM sized to 128MB, or 32nm eDRAM that is best known for being used for chips that cost as much as a small car?

What's Haswell's bandwidth to main memory? I'm guessing it's nowhere near Durango's (never mind Orbis), which would dictate a larger on-chip memory pool.
 