Xbox One (Durango) Technical hardware investigation

pjbliverpool · May 12, 2014

mosen said:
On PS4, the Onion+ bus, which shares Onion's 10GB/s and bypasses the GPU caches, is coherent. Onion (20GB/s: 10GB/s read and 10GB/s write), which snoops the CPU L2/L1 caches, isn't coherent. Onion+ and Onion run over the same I/O controller, so they can't access it at the same time. Also, CPU bandwidth is 20GB/s on PS4.

On XB1, the GPU has a coherent read/write path (30GB/s) to the CPU’s L2 caches and to DRAM. The CPU requests do not probe any other non-CPU clients, even if the clients (GPU for example) have caches. Coherent read-bandwidth of the GPU is limited to 30 GB/s when there is a cache miss, and it’s limited to 10–15 GB/s when there is a hit. CPU bandwidth is 30GB/s on XB1, they beefed up CPU bandwidth.

The PS4 solution is exactly like the Kaveri.
http://forum.beyond3d.com/showthread.php?t=64406
http://share.csdn.net/uploads/5232b691522ba/5232b691522ba.pdf
http://pc.watch.impress.co.jp/img/pcw/docs/632/794/html/06.jpg.html
http://pc.watch.impress.co.jp/docs/column/kaigai/20140129_632794.html

You don't need 2 more DMAs for data transfer (AMD PC GPUs use 2 of them), and two of them are useful for texture streaming. Two of the display planes are for games and one is for the system (PS4 has one for the game and one for the system). Kinect uses most of the audio block right now, but in the future they could free up part of the resources (DSPs) for games (SHAPE is accessible by games).

Awesome links thanks, that makes things much clearer.

My only question around this is how that relates to this slide:

I'm probably just misunderstanding something here, but doesn't that slide suggest that the CPU and GPU caches are coherent, whereas they are not if the CPU can't snoop the GPU cache?

mosen · May 12, 2014

Shifty Geezer said:
I think Brad's point is that XB1 has three planes instead of two because of the OS overlays. The DX feature you point to talks of two planes and native 2D UIs over 3D renders, requiring only 2 planes. XB1's customisation added a third plane for the OS docked app feature.

Although I've no idea where this conversation is headed now or what it's trying to achieve. People are making platform comparisons without a clear indication (to me at least) as to what those comparisons are trying to show. Is there a question being answered here?

I think Brad's point is that XB1 has more modifications because of its shortcomings. But I'm saying that every modification in XB1 could be used in any other system to make that system better. Take its coherent bandwidth (which is important for GPGPU or rendering), or its CPU bandwidth: many users thought that 20GB/s would be enough, but an official AMD patent says:

During operation, an external processor, reading and writing the shader core context state (which is required when saving and restoring GPU context state) can be severely limited (1) by the external processor's own read/write capabilities. (2) It can also be limited by the bandwidth between the external processor and the GPU. These two limitations can result in extremely long context switching times. This problem also exists for the other non-shader resource components associated with the GPU context state save and restore process.

So they beefed up the CPU bandwidth and the CPU-GPU coherent bandwidth. They designed SHAPE and 4 DSPs into the system, and these could be beneficial in the long term even for games. They put in 4 DMEs, and two of them could help with texture streaming and "Swizzled Texture CPU Access", which is one of the DX12 API changes.

Multitasking is possible with only two display planes (like PS4). There is no need for a third display plane unless you want to do something different. You think the 2nd DP (for games) is only for 2D UIs over 3D renders?!

The three display planes are independent in the following ways, among others:

They can have different resolutions.

They can have different precisions (bits per channel) and formats (float or fixed).

They can have different color spaces (RGB or YCbCr, linear or sRGB).

Each display plane can consist of up to four image rectangles, covering different parts of the screen. The use of multiple screen rectangles can reduce memory and bandwidth consumption when a layer contains blank or occluded areas.

The display hardware contains three different instances of various image processing components, one per display plane, including:

A hardware scaler.

A color space converter.

A border cropper.

A data type converter.

http://www.vgleaks.com/durango-display-planes/
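To put a rough number on the "up to four image rectangles" point, here is a minimal Python sketch of how much scan-out data a plane needs when only its non-blank regions are described. The `Rect` type, the 4 bytes per pixel, and the HUD layout (two 120-pixel strips on a 1080p screen) are illustrative assumptions, not figures from the leak, and the rectangles are assumed not to overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical image rectangle of a display plane, in pixels."""
    x: int
    y: int
    width: int
    height: int


def scanout_bytes(rects, bytes_per_pixel=4):
    """Bytes read per frame for a plane made of non-overlapping rectangles."""
    if len(rects) > 4:
        raise ValueError("a display plane holds at most four image rectangles")
    return sum(r.width * r.height for r in rects) * bytes_per_pixel


# A plane that covers the whole 1080p screen.
full_screen = [Rect(0, 0, 1920, 1080)]
# An assumed HUD: one strip at the top and one at the bottom, rest left blank.
hud = [Rect(0, 0, 1920, 120), Rect(0, 960, 1920, 120)]

full_bytes = scanout_bytes(full_screen)
hud_bytes = scanout_bytes(hud)
saving = 1 - hud_bytes / full_bytes
print(f"full plane: {full_bytes} B, HUD strips: {hud_bytes} B, saving: {saving:.0%}")
```

With this made-up layout the HUD plane touches well under a third of the memory a full-screen layer would, which is the kind of saving the description is pointing at.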

Now read this article from Microsoft research:

http://research.microsoft.com/pubs/176610/foveated_final15.pdf

This could help games even without eye tracking.

pjbliverpool said:
Awesome links thanks, that makes things much clearer.

My only question around this is how that relates to this slide:

I'm probably just misunderstanding something here, but doesn't that slide suggest that the CPU and GPU caches are coherent, whereas they are not if the CPU can't snoop the GPU cache?

You're welcome.

No, CPU and GPU caches can see the same thing (updated data) on coherent memory. The GPU can snoop CPU caches, but the CPU can't snoop GPU caches or other client caches.
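A toy Python model of that one-way arrangement, for anyone who finds it easier to follow in code. Everything here is an assumption made for illustration (class name, write-back behaviour, the flush step that invalidates CPU lines); it is not a description of the actual XB1 or PS4 hardware, only of the rule "GPU snoops CPU caches, CPU does not snoop GPU caches".

```python
class ToyOneWayCoherence:
    """Toy model: GPU reads snoop the CPU cache; CPU reads never snoop the GPU cache."""

    def __init__(self):
        self.memory = {}
        self.cpu_cache = {}
        self.gpu_cache = {}

    def cpu_write(self, addr, value):
        # Write-back cache: the new value sits in the CPU cache, not yet in memory.
        self.cpu_cache[addr] = value

    def cpu_read(self, addr):
        # The CPU only looks in its own cache and in memory.
        if addr in self.cpu_cache:
            return self.cpu_cache[addr]
        return self.memory.get(addr)

    def gpu_coherent_read(self, addr):
        # The coherent path probes the CPU cache first, so the GPU sees fresh CPU data.
        if addr in self.cpu_cache:
            return self.cpu_cache[addr]
        return self.memory.get(addr)

    def gpu_write(self, addr, value):
        # Cached GPU write: invisible to the CPU until it is flushed.
        self.gpu_cache[addr] = value

    def gpu_flush(self):
        # Assumed behaviour: flushing writes to memory and drops stale CPU copies.
        for addr, value in self.gpu_cache.items():
            self.memory[addr] = value
            self.cpu_cache.pop(addr, None)
        self.gpu_cache.clear()


model = ToyOneWayCoherence()
model.cpu_write(0x10, "cpu data")
seen_by_gpu = model.gpu_coherent_read(0x10)   # GPU sees the CPU's cached write

model.gpu_write(0x20, "gpu data")
seen_before_flush = model.cpu_read(0x20)      # CPU cannot see into the GPU cache
model.gpu_flush()
seen_after_flush = model.cpu_read(0x20)       # visible once it reaches memory
```

In this sketch the data ends up the same for both sides, but only because the GPU either bypasses or flushes its own caches; the CPU never goes looking in them.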

Shifty Geezer · May 12, 2014

mosen said:
You think the 2nd DP (for games) is only for 2D UIs over 3D renders?!

Yes. It enables 1080p UIs over 900p or lower renders. That's exactly the use described in the DX description and everything we've talked about for this unit. Three planes: one for 3D, one for UI, one for system. That's how it's been described by the XB1 architects IIRC. Devs could do something novel with it, although a moving rectangle of different res on the screen is of limited use, but its overall design was to add an OS layer on top of the future-target dual planes.

Edit: It's worth noting that other (mobile) hardware has far more complex compositing hardware AFAIK. For a flexible solution, XB1's solution isn't really flexible at all. If it was intended as a video-FX processor type thing, it'd have more features like rotation. It's just there to overlay three screens AFAICS.

Betanumerical · May 12, 2014

mosen said:
They designed SHAPE and 4 DSPs into the system, and these could be beneficial in the long term even for games. They put in 4 DMEs, and two of them could help with texture streaming and "Swizzled Texture CPU Access", which is one of the DX12 API changes.

Texture swizzling is nothing new; it's been in video cards for a long time and is nothing really of note. Microsoft liked to rename everything this gen, though.

mosen · May 12, 2014

Shifty Geezer said:
Yes. It enables 1080p UIs over 900p or lower renders. That's exactly the use described in the DX description and everything we've talked about for this unit. Three planes: one for 3D, one for UI, one for system. That's how it's been described by the XB1 architects IIRC. Devs could do something novel with it, although a moving rectangle of different res on the screen is of limited use, but its overall design was to add an OS layer on top of the future-target dual planes.

Edit: It's worth noting that other (mobile) hardware has far more complex compositing hardware AFAIK. For a flexible solution, XB1's solution isn't really flexible at all. If it was intended as a video-FX processor type thing, it'd have more features like rotation. It's just there to overlay three screens AFAICS.

That is probably one use of the display planes, and the first that comes to mind. All of the display planes are the same and all of them have the same abilities. Architects didn't say that it's only for UI overlay.

But why rotate? They only rendered 3 (or 2 in this case) static layers with a defined shape. They did this with an HP Z800 PC with an Intel Xeon CPU (E5640 at 2.67GHz) and an NVidia GeForce GTX 580 GPU. It could be useful for VR (or maybe AR), and maybe Kinect could do some eye tracking. I don't know whether eye tracking is possible with Kinect, but there is some work that used Kinect for eye tracking, like this one:

Our proposed gaze estimation model uses only simple capturing devices, a HD webcam and a Kinect, but can achieve real-time and accurate gaze estimation, as demonstrated by the presented experimental results.

http://doras.dcu.ie/19588/1/gazetracking.pdf

Even without eye tracking it should have some benefits besides UI.

http://research.microsoft.com/apps/pubs/default.aspx?id=176610

Betanumerical said:
Texture swizzling is nothing new; it's been in video cards for a long time and is nothing really of note. Microsoft liked to rename everything this gen, though.

I didn't say it's new, but it wastes CPU time and memory for decompression, doesn't it? If your answer is yes, then on XB1 it could be done for free (or nearly for free).

Shifty Geezer · May 12, 2014

mosen said:
Architects didn't say that it's only for UI overlay.

You're right, they didn't say it's only for that, but they explained that was the reason for its inclusion. Might devs find other uses? Perhaps. Are they allowed to? Sure. Was the intention of the display planes to do anything more than keep crisp UIs over 3D? Nope.

Interview said:
Digital Foundry: With the recent disclosure that Ryse is running at "900p" and Killer Instinct at 720p, and that launch titles were profiled to balance the system, what are the limiting factors that prevent these titles running at full 1080p?
Andrew Goossen: We've chosen to let title developers make the trade-off of resolution vs. per-pixel quality in whatever way is most appropriate to their game content. A lower resolution generally means that there can be more quality per pixel. With a high-quality scaler and antialiasing and render resolutions such as 720p or '900p', some games look better with more GPU processing going to each pixel than to the number of pixels; others look better at 1080p with less GPU processing per pixel. We built Xbox One with a higher quality scaler than on Xbox 360, and added an additional display plane, to provide more freedom to developers in this area. This matter of choice was a lesson we learned from Xbox 360 where at launch we had a Technical Certification Requirement mandate that all titles had to be 720p or better with at least 2x anti-aliasing - and we later ended up eliminating that TCR as we found it was ultimately better to allow developers to make the resolution decision themselves. Game developers are naturally incented to make the highest-quality visuals possible and so will choose the most appropriate trade-off between quality of each pixel vs. number of pixels for their games.

http://www.eurogamer.net/articles/digitalfoundry-the-complete-xbox-one-interview
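For scale, the resolution versus per-pixel quality trade-off Goossen describes can be put into numbers with a few lines of Python. This is a back-of-envelope sketch that assumes GPU cost scales linearly with pixel count at a fixed frame time, which is a simplification (geometry, shadow maps and other fixed costs don't shrink with resolution); the function names are made up for the example.

```python
# Common render resolutions mentioned in the interview (width, height).
RESOLUTIONS = {
    "720p": (1280, 720),
    "900p": (1600, 900),
    "1080p": (1920, 1080),
}


def pixel_count(name):
    """Number of pixels rendered per frame at the named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height


def per_pixel_budget_gain(name, baseline="1080p"):
    """How much more GPU time each pixel gets versus the baseline,
    assuming cost is proportional to pixel count (a simplification)."""
    return pixel_count(baseline) / pixel_count(name)


for name in RESOLUTIONS:
    print(f"{name}: {pixel_count(name)} pixels, "
          f"{per_pixel_budget_gain(name):.2f}x per-pixel budget vs 1080p")
```

Under that assumption, 900p leaves about 1.44x the GPU time per pixel of 1080p, and 720p about 2.25x.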

Even without eye tracking it should have some benefits besides UI.

http://research.microsoft.com/apps/pubs/default.aspx?id=176610

You've cited a paper on foveated rendering, which requires eye tracking unless you just fix the player on the centre portion of the screen. You're also talking about experimental techniques which are unlikely to have been a major consideration when MS designed their HW. I doubt the HW guys said, "Let's put in a third display plane just in case there's something cool that can be done with it," especially when there was a clearly defined use for the third display plane in their box requirements (on screen apps). If there was an interest in putting in flexible display hardware for doing more than just UI+OS overlays, it should extend to screen rotation, transforms and blends, to allow stuff like low-overhead UI effects and 2D screen effects. As far as display-level hardware goes, it's simple stuff ideally suited to 3D+UI+OS and not ideally suited for much else. There's not really much of an argument to be made against that, when the designers said that's why they put the feature in and that's what the hardware is suited for!

mosen · May 12, 2014

Shifty Geezer said:
You're right, they didn't say it's only for that, but they explained that was the reason for its inclusion. Might devs find other uses? Perhaps. Are they allowed to? Sure. Was the intention of the display planes to do anything more than keep crisp UIs over 3D? Nope.

My answer to the last question is "YES". Why? Because there would be no reason to put a HW scaler (or much of the other stuff) on a display plane that is only supposed to render the UI at native resolution. This isn't the first time that a company has said something different from its intention (the always-on LED bar on the DS4 is another example).

You've cited a paper on foveated rendering, which requires eye tracking unless you just fix the player on the centre portion of the screen. You're also talking about experimental techniques which are unlikely to have been a major consideration when MS designed their HW. I doubt the HW guys said, "Let's put in a third display plane just in case there's something cool that can be done with it," especially when there was a clearly defined use for the third display plane in their box requirements (on screen apps). If there was an interest in putting in flexible display hardware for doing more than just UI+OS overlays, it should extend to screen rotation, transforms and blends, to allow stuff like low-overhead UI effects and 2D screen effects. As far as display-level hardware goes, it's simple stuff ideally suited to 3D+UI+OS and not ideally suited for much else. There's not really much of an argument to be made against that, when the designers said that's why they put the feature in and that's what the hardware is suited for!

Eye tracking may be possible with Kinect 2 (1080p and ToF at higher resolution). Also, their intention isn't only about UI. See this:

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Furthermore, the claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure.

Read more at: http://www.faqs.org/patents/app/20110304713#ixzz31XDHnZDW

If you read the whole patent, there is a lot of discussion of left-eye and right-eye views, which should be for some kind of VR or AR, or even a 3D feel on a flat screen (another invention from the Microsoft Research team, which is cool). The display planes also support blending (independently for each plane).

http://www.faqs.org/patents/imgfull/20110304713_03
http://www.faqs.org/patents/imgfull/20110304713_04
http://www.faqs.org/patents/imgfull/20110304713_05

I still don't understand the benefits of having rotation and transforms in the DPs.

dobwal · May 12, 2014

I've read some of the patent and the most interesting parts are...

Making dynamic resolution less obtrusive.

The App HUD plane then comprises an overlay to the App Main plane. The addition of such a source could potentially allow for eradication of frame drops caused by the GPU. In other words, the App Main source may change rendering dimensions on a frame-by-frame basis, with support for drawing the App HUD elements at a constant resolution on top of the scaled game rendering. In this way, developers can seamlessly scale down their render target dimensions as they get closer to running over their GPU budget so that they can maintain a consistent frame rate. This may be achieved since GPU render times tend not to change significantly from frame-to-frame. The HUD elements remain at a consistent resolution since it can be visually obvious and distracting to a user when those are dynamically scaled differently. The application may utilize the App HUD overlay for any output resolution, which further utilizes a scaler as shown.
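A minimal Python sketch of the frame-by-frame scaling that passage describes: the game plane's render size follows the previous frame's GPU time, while the HUD plane stays at the output resolution. The function names, the 33.3 ms budget, the 0.9 headroom factor and the 0.5 floor are all assumptions for illustration, as is the idea that GPU cost scales with pixel count; the patent does not specify a control rule.

```python
OUTPUT_SIZE = (1920, 1080)  # the HUD plane is drawn at this size every frame


def next_render_scale(scale, gpu_ms, budget_ms=33.3, headroom=0.9,
                      min_scale=0.5, max_scale=1.0):
    """Choose the next frame's per-axis render scale from the last GPU time.

    Relies on GPU times changing little from frame to frame, and assumes
    cost is proportional to pixel count (scale squared).
    """
    if gpu_ms <= 0:
        return max_scale
    target_ms = budget_ms * headroom
    ideal = scale * (target_ms / gpu_ms) ** 0.5
    return max(min_scale, min(max_scale, ideal))


def render_size(scale, output=OUTPUT_SIZE):
    """Game-plane render target size for a given scale, rounded down to even."""
    width, height = output
    return (int(width * scale) // 2 * 2, int(height * scale) // 2 * 2)


# Feed in some made-up GPU times and watch the game plane shrink and recover.
scale = 1.0
for gpu_ms in (25.0, 38.0, 36.0, 28.0):
    scale = next_render_scale(scale, gpu_ms)
    print("game plane", render_size(scale), "| HUD plane", OUTPUT_SIZE)
```

The hardware scaler on the game plane would then stretch the smaller render target back to the output size, so the only thing the player might notice is slightly softer 3D, never a resized HUD.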

3D display agnostic format.

further shows computing device 20 outputting to various displays 36. Examples of displays 36 include computer monitors, televisions, 3D displays, etc. As such, each display may have different specifications as to how content is to be displayed on the display...Computing device 20 is configured to output video content to each display, wherein the content is configured for that display. As such, a content developer need not be aware of the display specifications, and may develop applications that render to buffers that are similarly configured. For example, computing device 20 may be configured to take display data (e.g., produced by graphics cores and/or other system components) and structure the data in frame buffer memory in a display-agnostic format. The computing device 20 may then dynamically pack the data into a frame buffer based on the display parameters associated with the display to which the computing device is outputting, as described in more detail with reference to FIG. 2...In this way, an application need not know the specific interface format for the display. As such, the content is insulated from changes in output standards, 3D standards, etc.

mosen · May 13, 2014

Lots of cool patents from the same team (Kinect & sound):

DEPTH CAMERA WITH INTEGRATED THREE-DIMENSIONAL AUDIO

A three-dimensional audio system includes a depth camera and one or more acoustic transducers in the same housing. Further, the same housing also houses logic for determining a world space ear position of a human subject observed by the depth camera. The logic also determines one or more audio-output transformations based on the world space ear position. The one or more audio-output transformations are configured to produce a three-dimensional audio output configured to provide a desired audio effect at the world space ear position.

Read more: http://www.faqs.org/patents/app/20130208900#ixzz31Xv8Jiyd

SURROUND SOUND SIMULATION WITH VIRTUAL SKELETON MODELING

A method for providing three-dimensional audio includes determining a world space ear position of a human subject based on a modeled virtual skeleton. The method further includes providing three-dimensional audio output to the human subject via an acoustic transducer array including one or more acoustic transducers. The three-dimensional audio output is configured such that channel-specific sounds appear to originate from corresponding simulated world speaker positions.

Read more: http://www.faqs.org/patents/app/20130208926#ixzz31XvkWOVz

SKELETAL MODELING FOR POSITIONING VIRTUAL OBJECT SOUNDS

Providing three-dimensional audio includes determining a world space ear position of a human subject based on a modeled virtual skeleton. A world space sound source position is determined such that a spatial relationship between the world space sound source position and the world space ear position models a spatial relationship between a virtual space sound source position of a virtual space sound source and a virtual space listening position. Three-dimensional audio is output to the human subject via an acoustic transducer array including one or more acoustic transducers. The three-dimensional audio output is configured such that at the world space ear position a sound provided by a particular virtual space sound source appears to originate from a corresponding world space sound source position.

Read more: http://www.faqs.org/patents/app/20130208899#ixzz31Xw4HUHf

THREE-DIMENSIONAL AUDIO SWEET SPOT FEEDBACK

A method for providing three-dimensional audio is provided. The method includes receiving a depth map imaging a scene from a depth camera and recognizing a human subject present in the scene. The human subject is modeled with a virtual skeleton comprising a plurality of joints defined with a three-dimensional position. A world space ear position of the human subject is determined based on the virtual skeleton. Furthermore, a target world space ear position of the human subject is determined. The target world space ear position is the world space position where a desired audio effect can be produced via an acoustic transducer array. The method further includes outputting a notification representing a spatial relationship between the world space ear position and the target world space ear position.

Read more: http://www.faqs.org/patents/app/20130208898#ixzz31XwKawal

SKELETAL MODELING FOR WORLD SPACE OBJECT SOUNDS

A method for providing three-dimensional audio includes determining a world space object position and a world space ear position of a human subject based on a modeled virtual skeleton. The method further includes providing three-dimensional audio output to the human subject via an acoustic transducer array including one or more acoustic transducers. The three-dimensional audio output is configured such that sounds appear to originate from the object.

Read more: http://www.faqs.org/patents/app/20130208897#ixzz31XweJIN3

Brad Grenz · May 13, 2014

mosen said:
My answer to the last question is "YES". Why? Because there would be no reason to put a HW scaler (or much of the other stuff) on a display plane that is only supposed to render the UI at native resolution. This isn't the first time that a company has said something different from its intention (the always-on LED bar on the DS4 is another example).

What if it was cheaper to use three identical planes instead of further customizing one just to REMOVE functionality? Or maybe they just remembered 1080p isn't the only possible output resolution.

Regarding your earlier point about PC GPUs not needing more than 2 DMAs, they also have a far larger pool of VRAM which requires data to be moved back and forth far less frequently than the 32MBs of ESRAM necessitates.

mosen · May 13, 2014

Brad Grenz said:
What if it was cheaper to use three identical planes instead of further customizing one just to REMOVE functionality? Or maybe they just remembered 1080p isn't the only possible output resolution.

Regarding your earlier point about PC GPUs not needing more than 2 DMAs, they also have a far larger pool of VRAM which requires data to be moved back and forth far less frequently than the 32MBs of ESRAM necessitates.

I don't think so. If you want to render the UI at native resolution you don't need a HW scaler (720p, 1080p or 4K, it should only be native). Also, adding one more display plane should be considered a customization in itself, so removing some parts of it isn't cheaper or simpler. This is exactly what Sony did with PS4's second graphics command processor: they removed some parts of it. Also, I don't know whether display planes on AMD GPUs have the same functionality as XB1 display planes (the patent suggests that XB1 DPs are a Microsoft invention, not AMD's). If you read VGleaks, each plane of the DisplayScanOut Engine (DCE) has fewer features than the XB1 equivalent.

By adding more DMAs you can't reach higher bandwidth, but you can send and receive more, smaller data packets at the same time to and from different parts of the system. They can work concurrently, each at lower bandwidth (6.4GB/s). Using all of them at the same time to fill eSRAM is no different from filling eSRAM with one of them at 25.6GB/s. There is no point in sending 4 smaller data packets between eSRAM and, for example, DDR3. But it's possible to use all of them for different parts of the system at the same time, asynchronously with the CPU and GPU, and I consider this a benefit.
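The arithmetic behind that claim, as a small Python sketch. It assumes the engines split one shared 25.6GB/s evenly (which gives the 6.4GB/s each quoted above) and uses the 32MB eSRAM size mentioned earlier in the thread; the function names are made up, and real arbitration between engines is certainly less tidy than an even split.

```python
TOTAL_BW = 25.6e9                 # bytes/s shared by all move engines (figure from this post)
ESRAM_BYTES = 32 * 1024 * 1024    # 32 MB of eSRAM, as quoted in the thread


def per_engine_bandwidth(active_engines, total_bw=TOTAL_BW):
    """Bandwidth each engine gets if the shared total is split evenly (assumption)."""
    return total_bw / active_engines


def fill_seconds(total_bytes, active_engines, total_bw=TOTAL_BW):
    """Time to move total_bytes when the work is split evenly across the engines."""
    chunk = total_bytes / active_engines
    return chunk / per_engine_bandwidth(active_engines, total_bw)


one = fill_seconds(ESRAM_BYTES, 1)    # one engine at the full 25.6 GB/s
four = fill_seconds(ESRAM_BYTES, 4)   # four engines at 6.4 GB/s each
print(f"1 engine: {one * 1e3:.3f} ms, 4 engines: {four * 1e3:.3f} ms")
```

Both cases come out the same under these assumptions, so the extra engines only pay off when they serve different sources and destinations at once.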

Brad Grenz · May 13, 2014

mosen said:
I don't think so. If you want to render the UI at native resolution you don't need a HW scaler (720p, 1080p or 4K, it should only be native). Also, adding one more display plane should be considered a customization in itself, so removing some parts of it isn't cheaper or simpler. This is exactly what Sony did with PS4's second graphics command processor: they removed some parts of it. Also, I don't know whether display planes on AMD GPUs have the same functionality as XB1 display planes (the patent suggests that XB1 DPs are a Microsoft invention, not AMD's). If you read VGleaks, each plane of the DisplayScanOut Engine (DCE) has fewer features than the XB1 equivalent.

You're assuming a lot of stuff not in evidence. VGLeaks is hardly an exhaustive resource on hardware functionality. And we saw with the entire Xbox One spec saga that Microsoft has a nasty habit of giving their own name to standard GCN parts. Hell, do we even know, based on the way the display planes work, if it is even possible to selectively disable certain functions on a per plane basis?

Obviously you don't NEED display planes to composite native resolution UIs. PS3 games were doing it, as we know. MS just seemed to think it was important. I think their role in early concept could have been somewhat different from what was eventually delivered. The UI aspect is just a convenience feature where maybe those other concepts didn't work out or did not end up being needed. For example, maybe they assumed one plane would always be dedicated to the HDMI feed for seamless switching, but in the shipping software that was determined unneeded or wasteful. And maybe they believed they needed to be identical because they were going to be assigned in a fully dynamic or rolling fashion.

By adding more DMAs you can't reach higher bandwidth, but you can send and receive more, smaller data packets at the same time to and from different parts of the system. They can work concurrently, each at lower bandwidth (6.4GB/s). Using all of them at the same time to fill eSRAM is no different from filling eSRAM with one of them at 25.6GB/s. There is no point in sending 4 smaller data packets between eSRAM and, for example, DDR3. But it's possible to use all of them for different parts of the system at the same time, asynchronously with the CPU and GPU, and I consider this a benefit.

With 3 OSes running all the time and the small size of the ESRAM relative to the DDR3, the granularity has obvious benefits. But you wouldn't put 4 DMAs in a unified memory design because that would be wasteful.

mosen · May 13, 2014

Brad Grenz said:
You're assuming a lot of stuff not in evidence. VGLeaks is hardly an exhaustive resource on hardware functionality. And we saw with the entire Xbox One spec saga that Microsoft has a nasty habit of giving their own name to standard GCN parts. Hell, do we even know, based on the way the display planes work, if it is even possible to selectively disable certain functions on a per plane basis?

Obviously you don't NEED display planes to composite native resolution UIs. PS3 games were doing it, as we know. MS just seemed to think it was important. I think their role in early concept could have been somewhat different from what was eventually delivered. The UI aspect is just a convenience feature where maybe those other concepts didn't work out or did not end up being needed. For example, maybe they assumed one plane would always be dedicated to the HDMI feed for seamless switching, but in the shipping software that was determined unneeded or wasteful. And maybe they believed they needed to be identical because they were going to be assigned in a fully dynamic or rolling fashion.

If you don't accept VGleaks' credibility (all of their leaks are from official documents and none of them has been wrong) and you think that Microsoft filed the patent only to change their mind at a later time, I don't know what to say anymore. You're free to think what you want, with many IFs and MAYBEs.

With 3 OSes running all the time and the small size of the ESRAM relative to the DDR3, the granularity has obvious benefits. But you wouldn't put 4 DMAs in a unified memory design because that would be wasteful.

Add the other components in the system and there's good reason for the DMEs being added to XB1: texture, audio and network data transfer, and supporting both the system and game partitions. Having a more flexible system is a good reason to invest more in DMAs/DMEs (as I said, eSRAM isn't the only reason).

Lalaland · May 13, 2014

mosen said:
If you don't accept VGleaks' credibility (all of their leaks are from official documents and none of them has been wrong) and you think that Microsoft filed the patent only to change their mind at a later time, I don't know what to say anymore. You're free to think what you want, with many IFs and MAYBEs.

Add the other components in the system and there's good reason for the DMEs being added to XB1: texture, audio and network data transfer, and supporting both the system and game partitions. Having a more flexible system is a good reason to invest more in DMAs/DMEs (as I said, eSRAM isn't the only reason).

Reading tea leaves and reading patents aren't as far from one another as you appear to believe; most patent filings are defensive and usually premature, to ensure exclusivity (particularly in s/w). Your theory as to the display planes doing all sorts of funky stuff beyond what MS have repeatedly stated they were for (Game, UI, O/S) is interesting but far from 'proven'. Until a game dev comes out and says "we did X, Y & Z using display planes", the preponderance of evidence suggests they are as dull as ever.

Betanumerical · May 13, 2014

mosen said:
Add the other components in the system and there's good reason for the DMEs being added to XB1: texture, audio and network data transfer, and supporting both the system and game partitions. Having a more flexible system is a good reason to invest more in DMAs/DMEs (as I said, eSRAM isn't the only reason).

We have something for this on desktop computers (and no doubt the consoles too): it's called DMA. The number of DMA controllers in your desktop is probably around ten at least, possibly tens, and the next-gen consoles probably have a similar number for things like audio chips, secondary processors and networking cards.

Shifty Geezer · May 13, 2014

mosen said:
Summary

Read more at: http://www.faqs.org/patents/app/20110304713#ixzz31XDHnZDW

That's a standard, craptastic patent disclaimer, saying, "this patent not only covers what we've thought of, but also other things we haven't thought of that can be applied to other people's invention." It's meaningless in understanding the tech. The patent itself is just talking about blending two (specifically two) planes into a single display (something no doubt covered in dozens of other patents).

I still don't understand the benefits of having rotation and transforms in the DPs.

A really decent bit of hardware would have lots of definable rectangles you can scale, rotate, and translate across the screen for animated effects and full-screen effects. If the intention of the scaling hardware was to provide useful features beyond just UI+3D+OS, it's really weak HW. If the intention of the scaling hardware is to provide just UI+3D+OS, it's perfect hardware.

Clearly you're going to remain unconvinced and expect to see some funky (and highly implausible, even impossible) VR/AR/2D3D applications. The best, likely non-standard application of the display planes would be foveated rendering in a VR headset. Actually scrub that - you'd need four display planes (peripheral and main renderings for both eyes).

Brad Grenz · May 13, 2014

mosen said:
If you don't accept VGleaks' credibility (all of their leaks are from official documents and none of them has been wrong) and you think that Microsoft filed the patent only to change their mind at a later time, I don't know what to say anymore. You're free to think what you want, with many IFs and MAYBEs.

I didn't doubt their credibility, but they have never claimed the information they posted was a complete and exhaustive account of the hardware capabilities in each system. You can't use them to claim something is absent from a system, which is what you were trying to do. Let's not forget everyone used them to "prove" PS4 lacked any audio DSPs until the TrueAudio stuff came out confirming their presence. Oh, and companies file patents they never use ALL THE TIME.

And yeah, I use a lot of "ifs and maybes" because I recognize I'm speculating, whereas you seem to treat your own speculations as fact.

Add the other components in the system and there's good reason for the DMEs being added to XB1: texture, audio and network data transfer, and supporting both the system and game partitions. Having a more flexible system is a good reason to invest more in DMAs/DMEs (as I said, eSRAM isn't the only reason).

It's just the primary reason and I don't disagree they are there for a good reason. What I disagreed with originally was the insinuation that their addition conferred some kind of advantage relative to competitors' designs, when that is simply not the case. They are there to facilitate a function necessitated by larger architectural choices.

That's what I said at the beginning. Customization is not a virtue in isolation. Its value is determined by context. In the Xbox One's case that context was accommodating larger economic and strategic goals for the system for the most part. They thought first about how they would market the device, then what components would best help them accomplish that, and the customizations done were all about making those disparate goals coexist as best as possible.

That is in contrast to the PS4 which drew all its philosophical design goals from an architect who writes game code for a living. Cerny wasn't thinking about how to solve bandwidth deficits, virtualized OSes, TV or Kinect overhead. The customizations in the PS4 are mostly about pushing forward what is possible in next gen game software using Compute in one of the first high performance HSA APUs.

dobwal · May 13, 2014

Brad Grenz said:
With 3 OSes running all the time and the small size of the ESRAM relative to the DDR3, the granularity has obvious benefits. But you wouldn't put 4 DMAs in a unified memory design because that would be wasteful.

The DMAs would probably be present regardless of eSRAM. You don't want to occupy GPU or CPU cycles with moving data to and from a bunch of other processors.

It's the number of special processors and their memory pools that most likely dictates the number of DMAs, not eSRAM alone.

hesido · May 13, 2014

dobwal said:
The DMAs would probably be present regardless of eSRAM. You don't want to occupy GPU or CPU cycles with moving data to and from a bunch of other processors.

Does having eSRAM as a separate pool, where you are constantly pushing data in and out, benefit from the extra DMAs? (As in, you wouldn't need as many in other setups?)

dobwal · May 13, 2014

hesido said:
Does having eSRAM as a separate pool, where you are constantly pushing data in and out, benefit from the extra DMAs? (As in, you wouldn't need as many in other setups?)

Yes, but look at all the other pools of memory that are on Durango. AMD's more general GPUs get by with 2, so it doesn't seem that adding a pool of scratchpad memory to the GPU requires 4 more.

Cell employed 9 DMEs, one for each SPE and the PPE, each with the same throughput as an XB1 DME (25.6 GB/s), for a combined internal bandwidth of 200 GB/s. If you're going to offload work to a bunch of special accelerators, you're going to need a robust memory system. The more producers and consumers you have (in terms of coprocessors and their memory pools), the more DMAs you need to parallelize data movement.
