Xbox One (Durango) Technical hardware investigation

I must say I like the Xbox architecture design. It is not simple, but it seems elegant; I suppose it scratches my exotic-hardware itch. I would like its performance to be good (and I'd like to believe those who claim its efficiency will reach the 3 TFLOPS of other GPUs, since I think we need heavy players in the graphics world to keep this hobby the way we like it), but I'm afraid that this time Sony could be left a little alone in the raw-graphics department...
 
ERP - would it make any sense to have the ESRAM be a large LLC, perhaps with extensions to make parts of it usable as a scratchpad if desired, and with the CPU/GPU communicating through it? In other words, something along the lines of how Intel interfaces its GPU and CPUs... or as a dev would you prefer manual management?
 
With a 32 MB pool and 64B lines, that would require 512K tag entries.
Depending on the arrangement, it's easy to require over a MB in cache tags alone.
That would be atypical for a non-server chip (a high-end one at that), but this is potentially an unusual situation.
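
To put a rough number on that, here's a back-of-the-envelope sketch in C; the associativity, physical address width, and state bits are assumptions picked only to show the order of magnitude, not anything from the leak:

```c
#include <stdio.h>

int main(void)
{
    /* 32 MB LLC with 64 B lines. Associativity, physical address width
     * and state bits below are assumptions, chosen only to show the
     * order of magnitude of the tag store. */
    const long long cache_bytes = 32LL * 1024 * 1024;
    const int line_bytes = 64;
    const int ways       = 16;   /* assumed */
    const int paddr_bits = 40;   /* assumed */
    const int state_bits = 4;    /* assumed: coherence state etc. */

    long long lines = cache_bytes / line_bytes;   /* 524,288 = 512K entries */
    long long sets  = lines / ways;               /* 32,768 sets            */

    const int offset_bits = 6;                                      /* log2(64)     */
    const int index_bits  = 15;                                     /* log2(32,768) */
    const int tag_bits    = paddr_bits - offset_bits - index_bits;  /* 19 bits      */
    const int entry_bits  = tag_bits + state_bits;                  /* ~23 bits     */

    long long tag_bytes = lines * entry_bits / 8;
    printf("%lldK entries across %lld sets, ~%.2f MB of tags\n",
           lines / 1024, sets, tag_bytes / (1024.0 * 1024.0));
    /* Prints: 512K entries across 32768 sets, ~1.44 MB of tags */
    return 0;
}
```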
 
Do we expect the Kinect port to be a custom thing, or could it be a simple dedicated USB 3.0 port modified to provide enough current? I was thinking maybe they decided they could do a much better job with something custom, so they could enable longer cable lengths and maybe even help lower the latency. USB is a messy protocol :p
 
That's what I would imagine as well. Except in this case it would likely be USB 3.0 combined with increased power delivery using a different physical port so you don't accidentally plug in something else. :)

Regards,
SB
 
With a 32 MB pool and 64B lines, that would require 512K tag entries.
Depending on the arrangement, it's easy to require over a MB in cache tags alone.
That would be atypical for a non-server chip (a high-end one at that), but this is potentially an unusual situation.

MRTs and texture data should have reasonable spatial coherence, so you could probably use a longer cache line without losing too much efficiency.

CPU code would have to be optimized for this cache line size: data structures at, or just below, 512 bytes, aligned on 512-byte boundaries.
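
Something along these lines in C11 terms, with the field layout invented purely for illustration:

```c
#include <assert.h>
#include <stdalign.h>
#include <stdio.h>

/* One hot data structure mapped to exactly one 512-byte line.
 * The fields are made up purely for illustration. */
struct particle_block {
    alignas(512) float pos_x[32];   /* 128 bytes */
    float pos_y[32];                /* 128 bytes */
    float pos_z[32];                /* 128 bytes */
    float mass[32];                 /* 128 bytes -> 512 bytes total */
};

static_assert(sizeof(struct particle_block) == 512,
              "one block should span exactly one 512-byte line");

int main(void)
{
    static struct particle_block block = {0};
    printf("sizeof = %zu, alignment = %zu\n",
           sizeof block, alignof(struct particle_block));
    return 0;
}
```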

32MB LLC with 512 byte lines with 8 sectors would use 512KB for tags.
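
The 512 KB figure falls out directly if you budget one 64-bit entry per line (address tag, a little state, and a couple of bits per sector); the entry width is an assumption on my part:

```c
#include <stdio.h>

int main(void)
{
    /* 32 MB LLC, 512 B lines split into eight 64 B sectors.
     * The 64-bit entry width (address tag + state + valid/dirty
     * bits per sector) is an assumption. */
    const long long cache_bytes = 32LL * 1024 * 1024;
    const int line_bytes = 512;
    const int sectors    = 8;
    const int entry_bits = 64;   /* assumed */

    long long lines     = cache_bytes / line_bytes;   /* 65,536 lines */
    long long tag_bytes = lines * entry_bits / 8;     /* 524,288 B    */

    printf("%lld lines of %d x %d B sectors -> %lld KB of tags\n",
           lines, sectors, line_bytes / sectors, tag_bytes / 1024);
    /* Prints: 65536 lines of 8 x 64 B sectors -> 512 KB of tags */
    return 0;
}
```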

Edit: I could imagine controls for locking down fractions of the LLC for ROP, texture or CPU use.

Cheers
 
That most likely results in higher power and heat.

This is surely taken into account by AMD and Microsoft; the main goal is a balanced machine.
I think the heat is overestimated in comparison to the CPU, the GPU, and the overall dissipation capacity.
 
Well, Arthur Gies comments on NeoGAF that the improvement in efficiency doesn't come from the ESRAM, so if that's true, forget about the low latency. He says it comes from the way the GPU SIMDs are managed and from real-time asset compression/decompression(?). Wasn't the GCN architecture supposed to greatly increase the efficiency of the vector units? How could this be increased even more?

Supposing any of this makes sense:

Would it be possible to make the GPU out-of-order, capable of executing wavefront instructions out of program order? (If so, there would be a block inside the GPU that is not shown in the leaked diagram.)

Could the asset compression/decompression refer, in practice, to increasing the effective bandwidth to the memory pools?
 
This is surely taken into account by AMD and Microsoft; the main goal is a balanced machine.
I think the heat is overestimated in comparison to the CPU, the GPU, and the overall dissipation capacity.

Sure... maybe it's thyristor RAM (T-RAM). That's what GlobalFoundries has been working on.
 
Could the listed GPU specs be the 'exposed' GPU rather than the actual GPU? I was thinking that if they supposedly have 3 GB set aside for various purposes, and all or part of two CPU cores as well, they could also have partitioned off part of the GPU for other processing. Perhaps the GPU really has 14 or 16 CUs, but they are reserving some for Kinect?
 
Well, Arthur Gies comments on NeoGAF that the improvement in efficiency doesn't come from the ESRAM, so if that's true, forget about the low latency. He says it comes from the way the GPU SIMDs are managed and from real-time asset compression/decompression(?). Wasn't the GCN architecture supposed to greatly increase the efficiency of the vector units? How could this be increased even more?
Vector length is still quite long, and that can make it a poor fit for problems with naturally smaller granularity. The longer the SIMD vector, the more likely it is that branch divergence affects performance, and more complex code can increase the number of paths a single wavefront will need to loop over.
That's an area that could stand improvement, although a number of the possible fixes, like varying the vector length or coalescing divergent threads, are significant modifications to the hardware.
At the very least, there's more storage nearby to play with, possibly.
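
To make the divergence cost concrete, a toy model: with execution masking, a diverged wavefront pays for both sides of a branch, so a single stray lane out of 64 drags everyone through the slow path. The path lengths here are made-up numbers, not measurements:

```c
#include <stdbool.h>
#include <stdio.h>

#define WAVE_WIDTH 64   /* GCN wavefront width */

/* Toy cost model: with execution masking, a diverged wavefront executes
 * both sides of a branch back to back, each under a mask, so total time
 * is roughly the sum of the two path lengths. */
static int wavefront_cycles(const bool take_fast[WAVE_WIDTH],
                            int fast_len, int slow_len)
{
    int fast_lanes = 0, slow_lanes = 0;
    for (int i = 0; i < WAVE_WIDTH; ++i) {
        if (take_fast[i])
            ++fast_lanes;
        else
            ++slow_lanes;
    }

    int cycles = 0;
    if (fast_lanes > 0) cycles += fast_len;  /* fast path, masked       */
    if (slow_lanes > 0) cycles += slow_len;  /* slow path, inverse mask */
    return cycles;
}

int main(void)
{
    bool mask[WAVE_WIDTH];
    for (int i = 0; i < WAVE_WIDTH; ++i)
        mask[i] = true;

    /* Made-up path lengths: 10 cycles fast, 100 cycles slow. */
    printf("uniform wavefront:  %d cycles\n", wavefront_cycles(mask, 10, 100));

    mask[63] = false;   /* a single divergent lane... */
    printf("one lane diverges: %d cycles\n", wavefront_cycles(mask, 10, 100));
    return 0;
}
```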

At a higher level, there may be changes in how wavefronts are scheduled and sent to the CUs. The handoff from the front end to the CU arrays seems to be one area that could improve, as well as conflicts over common global resources like the GDS.

Would it be possible to make the GPU out-of-order, capable of executing wavefront instructions out of program order? (If so, there would be a block inside the GPU that is not shown in the leaked diagram.)
At least with GCN, this is going to run into measures already defined to provide some parallelism.
By default, the vector memory instructions and the throughput design of the CU are already very good at generating memory traffic (one of the primary benefits of OoO), and the ISA provides a limited form of software-guided runahead for exports and memory operations. For example, up to 16 memory instructions can be fired off before the wavefront has to stall.
These counters may be less effective or broken if the CU goes out of order, since the compiler's statically determined wait counts are based on sequential issue and completion of the instruction stream.
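
As a rough illustration of that software-guided runahead, a toy model of the counter mechanism; the names and numbers are illustrative, not actual ISA semantics:

```c
#include <stdio.h>

/* Toy model of software-guided runahead: vector memory instructions are
 * issued without stalling, and a compiler-placed wait later blocks the
 * wavefront only until at most `max_outstanding` of them remain in flight.
 * This is a sketch of the idea, not real ISA behaviour. */

static int outstanding;   /* vector memory operations still in flight */

static void issue_load(int id)
{
    ++outstanding;        /* no stall at issue; the wavefront keeps going */
    printf("issue load %d, outstanding = %d\n", id, outstanding);
}

static void wait_cnt(int max_outstanding)
{
    printf("wait_cnt(%d): stall while more than %d loads are in flight\n",
           max_outstanding, max_outstanding);
    while (outstanding > max_outstanding)
        --outstanding;    /* stand-in for memory results returning */
}

int main(void)
{
    for (int i = 0; i < 8; ++i)
        issue_load(i);    /* several loads in flight, still no stall */

    wait_cnt(0);          /* the consumer needs all results: drain to zero */
    printf("results available, dependent ALU work can proceed\n");
    return 0;
}
```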
 
Could the listed GPU specs be the 'exposed' GPU rather than the actual GPU? I was thinking that if they supposedly have 3 GB set aside for various purposes, and all or part of two CPU cores as well, they could also have partitioned off part of the GPU for other processing. Perhaps the GPU really has 14 or 16 CUs, but they are reserving some for Kinect?
Even if this is true, it would be mostly irrelevant for purposes of game performance.
 