Xbox One (Durango) Technical hardware investigation

Status
Not open for further replies.
9/8 ECC on the 32MB SRAM would add up everything to 47MB, but then they wouldn't have said it's 8MB per bank in one slide, and then count the ECC bits in the other.
 
Do we know what 28nm process Kabini is manufactured on?

I'm just wondering if it's 28nm HPM like the Xbox One SOC. I can't find anything on it. At least publicly, TSMC said that Snapdragon 800 was the first SOC to use the process.
 
9/8 ECC on the 32MB SRAM would add up everything to 47MB, but then they wouldn't have said it's 8MB per bank in one slide, and then count the ECC bits in the other.

That's not necessarily true. Since the catch-all number could include anything it's not something they'd really be held to, but if they pass off ECC in specific figures people will probably cry foul when they learn it's false. It would have been okay if they said 8MB + 1MB ECC though.

What makes it less likely is that you don't tend to recalculate bytes based on ECC; you just consider it to have 9-bit bytes. If they had said something like 376 Mbits instead, it'd be easier to believe they're including ECC.
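For anyone who wants to check the numbers being thrown around, here is a quick sketch of the arithmetic. It assumes the "9/8" scheme means one check bit per 8 data bits and that MB means 2^20 bytes; the helper names are made up for illustration and none of this is from the slides.

```python
MIB = 1024 * 1024  # assumption: "MB" in the posts means 2**20 bytes


def with_ecc(data_bytes, data_bits=8, check_bits=1):
    """Storage needed if every `data_bits` bits carry `check_bits` extra bits."""
    return data_bytes * (data_bits + check_bits) // data_bits


def megabits(num_bytes):
    """Express a byte count in Mbit (2**20 bits)."""
    return num_bytes * 8 // MIB


esram_with_ecc = with_ecc(32 * MIB)   # whole 32MB array at 9/8
bank_with_ecc = with_ecc(8 * MIB)     # one 8MB bank at 9/8
total_as_mbits = megabits(47 * MIB)   # the 47MB total expressed in bits

print(esram_with_ecc // MIB, "MB for 32MB of data plus ECC")
print(bank_with_ecc // MIB, "MB for one 8MB bank plus ECC")
print(total_as_mbits, "Mbits for the 47MB total")
```

So 9/8 ECC turns 32MB into 36MB (8MB + 1MB per bank), and 376 Mbits is just the 47MB total multiplied by 8.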

McHuj said:
Do we know what 28nm process Kabini is manufactured on?

I'm just wondering if it's 28nm HPM like the Xbox One SOC. I can't find anything on it. At least publicly, TSMC said that Snapdragon 800 was the first SOC to use the process.

Would not surprise me in the least if Temash/Kabini were HPM. Some of the design presentations go into detail about what the transistor mix is like and it's pretty varied. Where did TSMC say that Qualcomm was first?
 
The uncore bandwidth to the CPU section does look to be higher than some other Jaguar chips. Tweaking the uncore and L2 interface for higher bandwidth might be the reason.

They need to support coherency checks at full tilt, even if the CPU block can't provide enough bandwidth for all the CUs: cache lines need to be invalidated on GPU stores, and the CPU block needs to serve data for read requests to lines that are in a modified state. The 30GB/s bandwidth figure from the CPU block is in principle enough for four CUs doing GPU compute with every access hitting the CPU block (which would be a data management FAIL for the developer).

I could also imagine support for packing/unpacking textures in various formats in the SIMD units, similar to what the 360 has.

Cheers
 
56GB/s DRAM access from GPU for non-CPU cache coherent peak BW including coherent BW. Coherent BW is 30GB/s peak.

56GB/s, or 30GB/s+26GB/s, or some combination between those. That doesn't seem like a lot to feed the eSRAM. What happened to the old 68GB/s number, and where did the missing 12GB/s go?

Edit: Maybe I just can't see the numbers correctly and the 56 is really a 68?
 
That's not necessarily true. Since the catch-all number could include anything it's not something they'd really be held to, but if they pass off ECC in specific figures people will probably cry foul when they learn it's false. It would have been okay if they said 8MB + 1MB ECC though.

What makes it less likely is that you don't tend to recalculate bytes based on ECC; you just consider it to have 9-bit bytes. If they had said something like 376 Mbits instead, it'd be easier to believe they're including ECC.



Would not surprise me in the least if Temash/Kabini were HPM. Some of the design presentations go into detail about what the transistor mix is like and it's pretty varied. Where did TSMC say that Qualcomm was first?

Slightly ot:
http://www.tsmc.com/tsmcdotcom/PRListingNewsAction.do?action=detail&newsid=7581&language=E
 
They need to support coherency checks at full tilt, even if the CPU block can't provide enough bandwidth for all the CUs: cache lines need to be invalidated on GPU stores, and the CPU block needs to serve data for read requests to lines that are in a modified state. The 30GB/s bandwidth figure from the CPU block is in principle enough for four CUs doing GPU compute with every access hitting the CPU block (which would be a data management FAIL for the developer).

I could also imagine support for packing/unpacking textures in various formats in the SIMD units, similar to what the 360 has.

Cheers

I thought the 30GB/s of coherent bandwidth referred to bandwidth available to the "system" RAM. A cache hit goes over a bus (probably Onion) with only 10-15GB/s of bandwidth.
 
56GB/s DRAM access from GPU for non-CPU cache coherent peak BW including coherent BW. Coherent BW is 30GB/s peak.

56GB/s, or 30GB/s+26GB/s, or some combination between those. That doesn't seem like a lot to feed the eSRAM. What happened to the old 68GB/s number, and where did the missing 12GB/s go?

Edit: Maybe I just can't see the numbers correctly and the 56 is really a 68?

What? I thought it was 30GB/s of coherent bandwidth to system RAM and 38GB/s (I guess) over what I'm guessing is Garlic?
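The two readings of the slide being argued over here come down to simple subtraction. A rough sketch, using only the GB/s figures quoted in the posts above; the variable names are made up, and which total is actually on the slide is exactly what's in dispute.

```python
# Figures in GB/s as quoted in the posts above; not verified against the slide.
COHERENT_PEAK = 30   # coherent bandwidth, peak
SLIDE_TOTAL = 56     # total as one poster read it off the slide
EARLIER_TOTAL = 68   # the older DRAM bandwidth number


def non_coherent_share(total, coherent=COHERENT_PEAK):
    """Bandwidth left for non-coherent access if `coherent` is counted inside `total`."""
    return total - coherent


share_if_56 = non_coherent_share(SLIDE_TOTAL)      # the 30 + 26 reading
share_if_68 = non_coherent_share(EARLIER_TOTAL)    # the 30 + 38 reading
missing = EARLIER_TOTAL - SLIDE_TOTAL              # the gap between the two totals

print(f"56 total: {COHERENT_PEAK} coherent + {share_if_56} non-coherent")
print(f"68 total: {COHERENT_PEAK} coherent + {share_if_68} non-coherent")
print(f"difference between the two totals: {missing}")
```

If the slide really says 68, the "missing" 12GB/s disappears and the non-coherent share is the 38GB/s guessed at above.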
 
363mm^2 isn't bad. Makes the ESRAM decision look better.

Also 204 GB/s is monstrous. This thing will definitely have an edge in some areas.
 
Thanks. That angle really distorted it.

From the same PC World article.

One massive chip
Physically, the system-on-a-chip at the heart of the Xbox One is 363 square millimeters. But the real whopper is the amount of logic integrated within it: 5 billion transistors. Although Wikipedia isn’t necessarily the final arbiter, the Xbox One is possibly the largest chip manufactured to date.
 