Xbox One (Durango) Technical hardware investigation

Does anyone know what this means?

VGleaks said:
All GPU memory accesses on Durango use virtual addresses, and therefore pass through a translation table before being resolved to physical addresses. This layer of indirection solves the problem of resource memory fragmentation in hardware—a single resource can now occupy several noncontiguous pages of physical memory without penalty.

Virtual addresses can target pages in main RAM or ESRAM, or can be unmapped. Shader reads and writes to unmapped pages return well-defined results, including optional error codes, rather than crashing the GPU. This facility is important for support of tiled resources, which are only partially resident in physical memory.
 
It means basically what it says. You can define pages of memory that reside in either main RAM or ESRAM. To the GPU, a resource looks like one contiguous range of virtual addresses, but the physical pages behind it need not be contiguous. A translation table maps the virtual addresses to the physical addresses.
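Here is a toy sketch in C of what that translation step does. The 64 KB page size, flat table layout, and zero return value for unmapped pages are all assumptions for illustration; the actual Durango page-table format is not public.

Code:
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

#define PAGE_SHIFT 16                    /* hypothetical 64 KB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)

enum pool { UNMAPPED, MAIN_RAM, ESRAM }; /* where a page physically lives */

struct pte {
    enum pool pool;      /* which physical pool backs this virtual page */
    uint32_t  phys_page; /* page number within that pool */
};

/* Translate a GPU virtual address into a pool plus physical address.
 * Unmapped pages return a well-defined result instead of faulting,
 * as the leak describes for tiled resources. */
static enum pool translate(const struct pte *table, size_t n_pages,
                           uint64_t va, uint64_t *pa)
{
    uint64_t vpage = va >> PAGE_SHIFT;
    if (vpage >= n_pages || table[vpage].pool == UNMAPPED) {
        *pa = 0;         /* defined dummy value for unmapped accesses */
        return UNMAPPED;
    }
    *pa = ((uint64_t)table[vpage].phys_page << PAGE_SHIFT)
        | (va & (PAGE_SIZE - 1));
    return table[vpage].pool;
}

int main(void)
{
    /* One resource spanning three virtual pages, scattered across pools
     * and partially resident: contiguous to the shader, not in silicon. */
    struct pte table[3] = {
        { MAIN_RAM, 42 }, { ESRAM, 7 }, { UNMAPPED, 0 }
    };
    uint64_t pa;
    enum pool p = translate(table, 3, 1 * PAGE_SIZE + 0x10, &pa);
    printf("pool=%d pa=0x%llx\n", (int)p, (unsigned long long)pa);
    return 0;
}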

This has been around forever on the CPU side. I'm not familiar enough to know how long it's been around with GPUs, but it looks like a standard feature of AMD GCN.

The only relevance I can see in them mentioning it, in whatever doc/presentation this came from, is that it lends more weight to the rumours of a multi-tasking OS.
 
Which sounds like a variant of 1T-SRAM, in which case it will be in the 20-50 mm^2 range?
Why it "sound" like a 1t-sram?

We know that it is a low latency memory:

The difference in throughput between ESRAM and main RAM is moderate: 102.4 GB/sec versus 68 GB/sec. The advantages of ESRAM are lower latency and lack of contention from other memory clients—for instance the CPU, I/O, and display output. Low latency is particularly important for sustaining peak performance of the color blocks (CBs) and depth blocks (DBs).
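For what it's worth, those numbers line up with the rumored interfaces, assuming the leaks are right: 68 GB/s is what a 256-bit DDR3-2133 bus gives you (256/8 bytes x 2.133 GT/s ≈ 68.3 GB/s), and 102.4 GB/s matches a 1024-bit ESRAM path at an 800 MHz GPU clock (1024/8 x 0.8 = 102.4 GB/s).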
 
Because straight-up SRAM is pretty large, and it would be quite odd to call standard SRAM "ESRAM". AMD, through ArtX, has experience with 1T-SRAM going back over a decade.

As for low latency, I repeat:

Numbers or it never happened.
 
Depending on the cost of Kinect 2, packed-in software, any HW BC, and the cost of the controller, it looks like MS will be in a much better position on cost than last go-around.

And then there is the crazy stupid cell-phone pricing agreement plan: expect to see $99 Xbox 3s with 2-3 year XBL contracts at $15/mo.

Which will position MS where they want to be: a small, affordable console that is a full-blown set-top-box/media all-in-one experience.
Well, that is pretty much my line of thinking; to me the system is set to be priced aggressively and to replace the previous SKU.

With MSFT cutting corners they need a price advantage; I hope they won't waste it by making Kinect 2 standard, with the price implications that has.

The idea is to create an incentive for a lot of people to get it even though interest in Kinect 2 is mild, so the system starts with a good Kinect 2 user base. But I don't think it would be a good idea to pass on the core gamers on a budget who could be attracted to the system for its price.

Other than that, with the rumors speaking of massive amounts of RAM and resources locked down for the OS, I start to believe that MSFT could indeed try to sell people a WiiUmote-type controller, though not forcing it on customers à la Nintendo.

If you think about it, it could free the TV, but go further than what Nintendo did: you could use the Xbox services (Netflix, browser, with Kinect assistance or not) on the main TV while somebody plays a AAA game on the WiiUmote-like device. It could also work the other way around, though with people having more and more tablets that is a lesser win. Freeing the TV when needed, while letting parents/people still use the services the system offers, is a big win: you don't have to choose between using cable/TV and the Xbox; you can play and consume content simultaneously through the same hub.

It would be nice if they could support more than one of those devices.
The thing is, it is a lot less risky than Nintendo's move with the Wii U, as it would be a peripheral (though it could get included in some SKUs). It would not be mandatory or need special implementation in games; it would just be a secondary display, pretty much as Shield is set to act while you stream games from your PC.
Another thing is that it would enable a lot of touch games on the system (assuming the system runs Metro apps, which I think it should).

Pricing structure could be:
Core: 1 pad + HDD, $279
Pro: 1 pad, Kinect, bigger HDD, a quarter of a Gold membership, $349
Deluxe: Pro + a Surface controller, $449
 
Speculation: The data move engines could be used for decoding high-compression-ratio textures (JPEG XR) to fixed-compression-ratio formats.

The data move engines would not only transcode textures from permanent storage, but also encode dynamically generated textures. The L1 and L2 texture caches both store compressed textures. To take full advantage of the on-die memory, the data move engines might encode shadow maps, environment maps and imposters on the fly.

In particular, hardware encoding of the BC6 and BC7 formats would constitute 'secret sauce'.
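To make the fixed-ratio idea concrete, here is a minimal software sketch of block compression using BC1, the simplest of the BC family, with a naive min/max endpoint fit. This is purely illustrative: whether the DMEs do anything like this is speculation, and real BC6/BC7 encoders are far more sophisticated.

Code:
#include <stdint.h>
#include <stdio.h>

/* Pack an 8:8:8 color into 5:6:5. */
static uint16_t pack565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Encode one 4x4 RGB block (48 bytes) into an 8-byte BC1 block:
 * a fixed 6:1 ratio, unlike JPEG XR's variable-rate output. */
static void bc1_encode_block(uint8_t rgb[16][3], uint8_t out[8])
{
    uint8_t lo[3] = {255, 255, 255}, hi[3] = {0, 0, 0};
    for (int i = 0; i < 16; i++)
        for (int c = 0; c < 3; c++) {
            if (rgb[i][c] < lo[c]) lo[c] = rgb[i][c];
            if (rgb[i][c] > hi[c]) hi[c] = rgb[i][c];
        }

    /* hi >= lo componentwise, so c0 >= c1 and we stay in 4-color mode
     * (the degenerate c0 == c1 case still picks index 0 below). */
    uint16_t c0 = pack565(hi[0], hi[1], hi[2]);
    uint16_t c1 = pack565(lo[0], lo[1], lo[2]);
    out[0] = (uint8_t)(c0 & 0xff); out[1] = (uint8_t)(c0 >> 8);
    out[2] = (uint8_t)(c1 & 0xff); out[3] = (uint8_t)(c1 >> 8);

    /* 2-bit index per texel: palette is c0, c1, and two blends. */
    for (int row = 0; row < 4; row++) {
        uint8_t bits = 0;
        for (int col = 0; col < 4; col++) {
            uint8_t *p = rgb[row * 4 + col];
            int best = 0, bestd = 1 << 30;
            for (int idx = 0; idx < 4; idx++) {
                /* weight of c0: idx 0 -> 3/3, 1 -> 0/3, 2 -> 2/3, 3 -> 1/3 */
                int w = (idx == 0) ? 3 : (idx == 2) ? 2 : (idx == 3) ? 1 : 0;
                int d = 0;
                for (int c = 0; c < 3; c++) {
                    int pc = (w * hi[c] + (3 - w) * lo[c]) / 3;
                    d += (pc - p[c]) * (pc - p[c]);
                }
                if (d < bestd) { bestd = d; best = idx; }
            }
            bits |= (uint8_t)(best << (col * 2));
        }
        out[4 + row] = bits;
    }
}

int main(void)
{
    uint8_t block[16][3], out[8];
    for (int i = 0; i < 16; i++) {   /* simple gradient test block */
        block[i][0] = (uint8_t)(i * 16);
        block[i][1] = 128;
        block[i][2] = (uint8_t)(255 - i * 16);
    }
    bc1_encode_block(block, out);
    for (int i = 0; i < 8; i++) printf("%02x ", out[i]);
    printf("\n");
    return 0;
}

The fixed output size is the whole point: every 4x4 block lands at a predictable address, so the texture units can fetch and decode blocks at random, which JPEG XR's variable-rate streams can't offer.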

Cheers

I think the data move engines are more for moving render targets from ESRAM to DDR and compressing them on the fly. Though that doesn't preclude texture decompression.
 
Can the Data Move Engines be related to HSA?

From the little that's been released, the following tasks are the only things I could think of:

- Tiling of data (not just the render targets, but also stuff like texture tiling, similar to a megatexture, but instead of one gigantic texture it would tile the regular textures that reside in memory)
- Scattering the tiled data between both pools in advance, so that when the GPU needs to read a texture, for instance, its tiles will be in both pools, effectively increasing the bandwidth, and especially making sure that latency-sensitive data is ready in the ESRAM
- Reading and writing compressed data, with somewhat aggressive methods

Perhaps they could also work during the scan-out phase, reading the tiles from both memory pools directly to the display output, instead of copying them all to a single buffer first...
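As a toy illustration of the scattering idea in the list above, here is a C sketch that alternates tiles of a surface between the two pools and estimates the combined read bandwidth. The tile size and even split are made up; only the two bandwidth figures come from the leak.

Code:
#include <stdint.h>
#include <stdio.h>

#define TILE_BYTES   (64 * 1024)  /* hypothetical 64 KB tile */
#define DDR_BW_GBS   68.0         /* main RAM bandwidth per the leak */
#define ESRAM_BW_GBS 102.4        /* ESRAM bandwidth per the leak */

enum pool { DDR, ESRAM };

/* Alternate tiles between pools so reads can be serviced in parallel. */
static enum pool pool_for_tile(uint32_t tile_index)
{
    return (tile_index & 1) ? ESRAM : DDR;
}

int main(void)
{
    uint32_t n_tiles = 256;       /* a 16 MB surface at 64 KB per tile */
    double ddr_bytes = 0, esram_bytes = 0;

    for (uint32_t t = 0; t < n_tiles; t++) {
        if (pool_for_tile(t) == DDR) ddr_bytes   += TILE_BYTES;
        else                         esram_bytes += TILE_BYTES;
    }

    /* If both pools are read concurrently, streaming time is set by
     * whichever pool finishes last. */
    double t_ddr   = ddr_bytes   / (DDR_BW_GBS   * 1e9);
    double t_esram = esram_bytes / (ESRAM_BW_GBS * 1e9);
    double t_total = (t_ddr > t_esram) ? t_ddr : t_esram;

    printf("effective bandwidth: %.1f GB/s\n",
           (ddr_bytes + esram_bytes) / t_total / 1e9);
    return 0;
}

A 50/50 split comes out around 136 GB/s because the slower DDR side finishes last; splitting tiles roughly 60/40 in favor of the ESRAM would approach the 170.4 GB/s sum of the two pools.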
 
Why it "sound" like a 1t-sram?

We know that it is a low latency memory:

On-chip embedded DRAM would have lower latency than external memory; otherwise IBM wouldn't bother putting gobs of EDRAM as L3 cache on POWER7 chips. But "low" is a relative term. Without knowing an actual point of comparison we can't conclude anything.

From the little that's been released, the following tasks are the only things I could think of:

- Tiling of data (not just the render targets, but also stuff like texture tiling, similar to a megatexture, but instead of one gigantic texture it would tile the regular textures that reside in memory)
- Scattering the tiled data between both pools in advance, so that when the GPU needs to read a texture, for instance, its tiles will be in both pools, effectively increasing the bandwidth, and especially making sure that latency-sensitive data is ready in the ESRAM
- Reading and writing compressed data, with somewhat aggressive methods

Perhaps they could also work during the scan-out phase, reading the tiles from both memory pools directly to the display output, instead of copying them all to a single buffer first...

Or maybe they're just DMAs since they're where the DMAs should be and MS renamed everything else.
 
We do not know exactly what the Data Move Engines are, or how the ESRAM is set up, but are there any potential benefits for the implementation of Partially Resident Textures?
 
I haven't been able to keep up with the discussion, sorry, but I was wondering whether Durango could possibly use DDR4? Has the door closed on this memory architecture?
 
Much too early; it won't be out in time for the launch. By the time DDR4 is ready for full production, Durango will already be well into manufacturing and getting ready to ship in the holiday period. That ship has sailed.

That being said, what bandwidth Durango does have seems well apportioned for what it is used for. And the ESRAM should make a huge difference if the rumors about its flexibility and its ability to make tiled rendering much easier to implement are correct.
 
Lol, almost every single one of your posts is some kind of variation on this one. You should start an HSA fan club!
There is one already. Sony joined it a couple of days ago. ;)

For the record though, while HSA (if these consoles truly are fully HSA) will allow gameplay-affecting GPGPU algorithms to be incorporated into games, it's not going to make much difference to the actual graphics of the game, which is what most people seem to be focussed on when comparing the relative power of the GPUs. So it's not really correct to say the GPU in PS4 will be more powerful than the equivalent GPU in a PC because of HSA. The proper comparison would be between the CPUs; i.e. thanks to the fusion-like design, the Jaguar-based CPUs, sporting ~100 GFLOPS, in some respects have performance far beyond a PC CPU. Assuming that PC CPU doesn't also have access to HSA features.
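(For reference, the ~100 GFLOPS figure follows from the rumored CPU config: 8 Jaguar cores x 1.6 GHz x 8 single-precision FLOPs per cycle ≈ 102.4 GFLOPS, assuming those clocks and core counts hold.)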

You say that as if being able to play that game somehow has something to do with Temash being HSA-ready. It has nothing to do with that. Dirt runs on that APU because it has an incredibly powerful (by tablet standards) GCN-based GPU along with a decent x86 CPU. Dirt likely makes no use of HSA whatsoever.

Everyone is splitting the APUs of the new consoles into their individual parts (100 GFLOPS CPU, 12/18 CU GPU), but I think that's inappropriate. It's one single processor; every part, be it CPU or iGP, works as a gearwheel, not as a complete engine of its own. HSA allows for much faster and much more efficient communication between the individual elements of the APU than a traditionally designed system. This will be an important speedup for next gen, and it is one of the reasons why a single APU with a 5W TDP (!!!) and a thick DirectX API can render Dirt: Showdown fluently at 1920x1080 while an 80W current-gen console can't. I find this most impressive! This is progress!

Temash is a paragon of efficient computing and a step in the right direction in a world where a single high-end desktop graphics card has a TDP of 200W or more. Some people may be disappointed with the specs, but I'm very optimistic that a console with a TDP of 150W can deliver a very enjoyable next-gen experience. And this is without even entering the completely new HSA algorithms, which allow CPU and iGP to work collaboratively, into the equation.
 
It's the next big step in the semiconductor industry:

single cores -> multi cores -> hetero cores -> cloud cores

Ever wonder why Intel is packing an iGP with each desktop CPU?
 
What I mean is, does it really have as big an impact on game performance as you are saying? Design-wise, I can believe it; efficiency-wise, I'm simply wondering.
 
It's very efficient.

Do you remember when the first dual cores emerged? Instead of having one single thread, you divide your code into two threads and it receives a sweet speedup. A Core 2 Duo with two 1.8 GHz threads was able to ditch a 3 GHz Pentium 4 easily. HSA is basically the same, but instead of just having more cores, you integrate different kinds of cores into a single processor. One type of core is great for runtime-intensive tasks; the other type is better at parallelizable tasks.
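Here is what that two-thread split looks like in plain C with POSIX threads, as a minimal sketch of the analogy (the array size and the two-way split are arbitrary):

Code:
#include <pthread.h>
#include <stdio.h>

#define N (1 << 24)

static float data[N];

struct slice { long begin, end; double sum; };

/* Each thread sums its own half of the array independently. */
static void *partial_sum(void *arg)
{
    struct slice *s = arg;
    double acc = 0.0;
    for (long i = s->begin; i < s->end; i++)
        acc += data[i];
    s->sum = acc;
    return NULL;
}

int main(void)
{
    for (long i = 0; i < N; i++)
        data[i] = 1.0f;

    /* Two cores, two slices: ideally close to a 2x speedup. */
    struct slice s0 = { 0, N / 2, 0.0 };
    struct slice s1 = { N / 2, N, 0.0 };
    pthread_t t0, t1;
    pthread_create(&t0, NULL, partial_sum, &s0);
    pthread_create(&t1, NULL, partial_sum, &s1);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);

    printf("sum = %.0f\n", s0.sum + s1.sum);
    return 0;
}

The same logic extends to hetero-cores: instead of two identical threads, the parallelizable half of the work goes to the throughput cores (the GPU) while the serial half stays on the latency cores (the CPU).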

But having different kinds of cores integrated in a single processor doesn't automatically make it a good hetero-core. The shrunk processors of the Xbox 360 are integrated on a single die, which alone would give them a latency and bandwidth speedup that had to be choked by Microsoft, but it's a very, very bad hetero-core. The individual elements still act like traditional homogeneous processors; this is physical integration only. The CELL in the PlayStation 3, on the other hand, is a true hetero-core: it's a physical as well as an architectural integration, which allows the different types of cores to work in concert. The first APUs, like the 40nm Llano, were physical integration only, but the latest 28nm HSA APUs consisting of GCN and Jaguar/Steamroller are a true architectural integration. The individual elements can work together as a fused system.
 
And this is...what we are speculating is in Orbis/Durango because of their Jaguar and GCN SoC architectures?

What is the difference between SoCs and APUs anyway?
 
...and because Sony joined the HSA fanclub.

What is the difference between SoCs and APUs anyway?

Just as I said: physical integration versus architectural integration, Xbox 360 SoC vs. CELL. There is a difference between just placing two types of cores on a single die and designing the architectures so they can work in concert.
 
Ah, I see: so SoC = components bunched together, APU = component hivemind, basically.

Would an MCM like the Wii U uses be somewhat similar, or is that something completely different?
 