Xbox One (Durango) Technical hardware investigation

True. Code monkeys trained in the field should get it, though I've met some programmers who were horribly ignorant of the hardware they used. That has worried me more than once, because I got into coding for the love of the hardware and the results.
 
Where are they getting their estimates from? These sorts of breakdowns, although good for educating readers about how BW consumption is spread throughout the system, kind of gloss over the complete flexibility the developers have. If a developer chooses to consume all available CPU BW, they can, leaving less for the GPU; equally, they can have the CPU doing barely anything, freeing up more for the GPU. That's why we list peak BW figures: so devs know what resources they have, ready to choose how to use them.

I assume they're just copying and pasting from the Durango dev doc they have.

The inclusion of HDMI in could just be for the ability to overlay OS, game or app content on another source, like your cable or satellite TV?

That's exactly what the HDMI in is for, so they can do things like pop up XBL notifications and messages while you're watching TV or do PIP etc.
 
All I'm saying is you shouldn't add bandwidth on different buses like that; it bothers me. It's 68.2GB/s on both buses, and 68.2 of the 102 usable on the ESRAM. Adding the aggregate bandwidth of all the buses together and then quoting a percentage is meaningless, and more likely to confuse the issue than to explain it to most people. I'm willing to bet the non-technically inclined will believe that the 136GB/s is the transfer rate of the data, not the amount of bandwidth being used on all the buses added together by the transfer.
But 136GB/s _is_ the bandwidth being used. Bandwidth and data transferred are not the same thing, depending on what you're doing. If the PS4 copies a chunk of memory from one place to another, it can use at most 176GB/s of bandwidth and copy at most 88GB/s of data. If this Durangoid thingy copies memory from DRAM to ESRAM, it uses at most 136GB/s of bandwidth and moves at most 68GB/s of data. If a shader is reading data directly from DRAM as well as directly from ESRAM and then not outputting anything, it has at most 170GB/s of bandwidth and can transfer at most 170GB/s of data, although if you're not outputting it, it seems like a waste.
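
A minimal sketch of the arithmetic above, assuming the rumoured peak figures (the function and numbers are illustrative, not from any dev doc):

```python
def copy_rate(read_bw_gbs, write_bw_gbs):
    """Payload actually copied per second is limited by the slower of the
    read and write paths; total bus traffic counts both directions."""
    data = min(read_bw_gbs, write_bw_gbs)   # GB/s of payload moved
    traffic = data * 2                      # each byte is read once and written once
    return data, traffic

# PS4-style unified pool: reads and writes share one 176 GB/s bus,
# so a copy can at best read at 88 GB/s and write at 88 GB/s.
print(copy_rate(88, 88))   # -> (88, 176)

# Durango DRAM -> ESRAM copy: 68 GB/s read on the DDR3 bus and
# 68 GB/s write on the ESRAM bus, both busy at the same time.
print(copy_rate(68, 68))   # -> (68, 136)
```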
 
dumb question for somebody that may have more experience in the industry.

After R&D for a game console, shouldn't R&D budgets go down?

I mean, everybody is pretty certain that the next-gen Xbox is pretty much set in specs, right? But there have been rumors and statements that there are multiple versions of the Xbox floating around (I don't have proof; I don't remember where I saw this exactly). There were also statements that after the Durango conference last February, Microsoft went back to the drawing board, or at least made changes to the design based on advice developers had given them, changes the developers were happy with.

I just find it odd that, roughly one year out from the Xbox release, the Entertainment and Devices Division R&D budget went up 44%. Certainly there is the possibility that they are working on something else, but what other major releases do they have?


http://www.gamasutra.com/view/news/..._less_as_next_gen_approaches.php#.UUGI73Hn86Y

I don't know, I just find it odd, but it's probably nothing.
 
What is the relationship between the input and output bandwidth usage of a GPU during the most commonly executed operations? More input, more output or is it roughly equal? Or are all three cases relatively common?
 
What is the relationship between the input and output bandwidth usage of a GPU during the most commonly executed operations? More input, more output or is it roughly equal? Or are all three cases relatively common?

Not sure about current games, but when the Xbox 360 was designed, most of the write bandwidth was used for framebuffer operations, performed on the EDRAM die. Most of the remaining traffic (texture, vertex, etc.) was read-only. Microsoft claimed that performing most of the writes on the EDRAM die improved the efficiency of the main memory pool, since fewer cycles were wasted to change the direction of the memory operations.
 
What is the relationship between the input and output bandwidth usage of a GPU during the most commonly executed operations? More input, more output or is it roughly equal? Or are all three cases relatively common?
It varies, from reading multiple samples of a buffer to write one pixel (post effects) to multiple writes drawing cached images repeatedly (smoke particles). I expect an analysis of games would find an average, but on a console you can balance what you do with the system. If you are rendering your scene with plenty of BW to spare, you can up the particle effects/quality, say.
 
I wonder if the low-latency ESRAM would result in lower bus turnaround penalties, and higher effective BW, than a traditional DRAM pool?

If z or colour writes are causing frequent, small bus turnarounds, then that could get in the way of keeping the TMUs fed even if you had loads of peak unidirectional BW. Texturing from ESRAM, or at the very least keeping something like z out of main RAM, could be a win for the other units using the memory pool(s).
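
A toy model of that turnaround effect, purely illustrative (the penalty cycles, burst sizes and bus width below are made-up numbers, not Durango specifics):

```python
def effective_bw(peak_gbs, burst_bytes, turnaround_cycles, bytes_per_cycle):
    """Peak bandwidth scaled by the fraction of cycles spent transferring,
    assuming a fixed turnaround penalty every time the bus flips direction."""
    transfer_cycles = burst_bytes / bytes_per_cycle
    utilisation = transfer_cycles / (transfer_cycles + turnaround_cycles)
    return peak_gbs * utilisation

# Long, streaming bursts barely notice the penalty...
print(effective_bw(68, burst_bytes=4096, turnaround_cycles=10, bytes_per_cycle=32))  # ~63 GB/s
# ...frequent small z/colour writes pay it on nearly every access.
print(effective_bw(68, burst_bytes=64, turnaround_cycles=10, bytes_per_cycle=32))    # ~11 GB/s
```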
 
Not sure about current games, but when the Xbox 360 was designed, most of the write bandwidth was used for framebuffer operations, performed on the EDRAM die. Most of the remaining traffic (texture, vertex, etc.) was read-only. Microsoft claimed that performing most of the writes on the EDRAM die improved the efficiency of the main memory pool, since fewer cycles were wasted to change the direction of the memory operations.
If the leaked specs are correct I wonder if games that rely heavily on deferred engines are going to work better on Durango because of the split pool of memory and the eSRAM. :?:

Especially now that the eSRAM is several times more flexible than the eDRAM on the X360 ever was.
 
Can someone explain the memory speed? Is it somehow better than before?

I believe it's the same as the spec rumours in previous vgleaks articles.

Only question I have is whether 16 ROPs on the GPU can exceed the 68 GB/s of the DDR3 bus.

Edit: Looking at AMD GCN GPUs, 16 ROP units are paired with 72 GB/s memory. Maybe the GPU will not be bandwidth-starved writing to DDR3 and reading from ESRAM. There is the contention issue, where the move engines and the north bridge share the DDR3 bandwidth. Someone more knowledgeable than me could probably make a better guess.
 
I believe it's the same as the spec rumours in previous vgleaks articles.

Only question I have is whether 16 ROPs on the GPU can exceed the 68 GB/s of the DDR3 bus.

Easily! Assuming the 16 ROPs can each do a 64-bit colour read and write and a 32-bit z read and write per clock, they could theoretically exceed it several times over, even without MSAA. It seems that compression systems and "bursty" usage (ROPs probably spend quite a lot of time idle) make this much less of a problem than theoretical peaks indicate, though.
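
A back-of-the-envelope version of that claim, assuming the rumoured ~800 MHz Durango GPU clock (an assumption here, as are the per-clock formats):

```python
rops = 16
clock_ghz = 0.8                          # rumoured Durango GPU clock
bytes_per_rop_per_clock = 8 + 8 + 4 + 4  # 64-bit colour read+write, 32-bit z read+write

peak_rop_bw = rops * bytes_per_rop_per_clock * clock_ghz
print(peak_rop_bw)        # 307.2 GB/s theoretical
print(peak_rop_bw / 68)   # ~4.5x the 68 GB/s DDR3 bus
```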
 
Easily! Assuming the 16 ROPs can each do a 64-bit colour read and write and a 32-bit z read and write per clock, they could theoretically exceed it several times over, even without MSAA. It seems that compression systems and "bursty" usage (ROPs probably spend quite a lot of time idle) make this much less of a problem than theoretical peaks indicate, though.

AMD does pair 16-ROP parts with 72 GB/s, looking at a list of their GPUs. Anything over 16 ROPs (it seems to be 32+, nothing in between) is paired with 150+ GB/s (the ratio of bandwidth to ROPs is way higher in the 79xx series). They must perceive some balance between the two. 32 ROPs would theoretically exceed 150+ GB/s the same way 16 ROPs would theoretically exceed 72 GB/s.

It'll be pretty interesting to see how this system works in practice, in terms of what data gets put into ESRAM vs DDR3, and what the move engines end up copying back and forth.

Edit: I guess what I'm wondering is how bandwidth-starved a 16-ROP AMD GPU is on the PC side. I know it'll change depending on the game, and on whether you're using MSAA (and I'm not expecting MSAA to be common on this generation of consoles).
 
AMD does pair 16-ROP parts with 72 GB/s, looking at a list of their GPUs. Anything over 16 ROPs (it seems to be 32+, nothing in between) is paired with 150+ GB/s (the ratio of bandwidth to ROPs is way higher in the 79xx series). They must perceive some balance between the two. 32 ROPs would theoretically exceed 150+ GB/s the same way 16 ROPs would theoretically exceed 72 GB/s.

Both the 7870 and the 7970 have 32 ROPs. The 7870 has 153GB/s, 58% of the 7970's 264GB/s. The 7870 scores 60% of the 3DMark Vantage pixel fillrate of a 7970. That's blend fillrate, so there will be situations where the 32 ROPs see full use. Overall, though, 32 ROPs seems excessive for the 78xx (and Orbis), while slightly low for the 79xx.
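
A quick sanity check on those ratios using the public spec-sheet numbers (the 5% tolerance below is arbitrary):

```python
bw_7870, bw_7970 = 153.6, 264.0           # GB/s, spec-sheet figures
bw_ratio = bw_7870 / bw_7970
print(bw_ratio)                           # ~0.58

# Both parts have 32 ROPs, so if the Vantage fill-rate ratio (~60%) tracks
# the bandwidth ratio rather than the ROP count, the ROPs are BW-limited.
fill_ratio = 0.60
print(abs(fill_ratio - bw_ratio) < 0.05)  # True: fill rate follows bandwidth
```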

AFAICT, the 7870 (and the derived Orbis) has 32 ROPs because it is two 7770s ganged together; the 7770 has 16 ROPs because 8 is not enough (and 12 may be a weird number if a ROP partition ties into an L2 cache slice, or maybe the designers had an obsession with powers of two).

Cheers
 
Edit: Looking at AMD GCN GPUs, 16 ROP units are paired with 72 GB/s memory.
As Function says, the ROPs can burn through more BW than that, but they aren't working 100% of the frame time.
Maybe the GPU will not be bandwidth-starved writing to DDR3 and reading from ESRAM. There is the contention issue, where the move engines and the north bridge share the DDR3 bandwidth.
If the MEs are using BW, they are contributing to the rendering; without them, the GPU or CPU would be using the BW to move the same data. If the MEs aren't needed, they'll sit idle and leave the BW untouched.

I consider it a mistake to think of the CPU and GPU as discrete parts in competition with each other. They are all contributing to the game, and can all be balanced to the dev's requirements. A part will only be used if it is contributing a positive amount to the game, at which point it's not a loss to other parts of the system. Well, it is technically a loss, but the end result on screen isn't negatively affected.
 
What I find interesting, or maybe I'm thinking about it too much, is how much of the DDR3's bandwidth is dedicated to the GPU in the provided memory example. It shouldn't surprise me that this is the case, but the more I compare the amount dedicated to the GPU with the amount dedicated to the CPU, the more I think the choice of DDR3 may not have been such a bad idea after all.

In and of itself there should be nothing extraordinary about this, but wouldn't this bandwidth breakdown potentially be more of a concern if Durango did not have DDR3? It's pretty much also why Durango doesn't suffer much from the fact that the ESRAM's 102GB/s isn't part of a single unified pool making up the system's total 170GB/s, is it not?

Strictly in the context of Durango's design, couldn't going with DDR3, thanks to its low latencies, end up being a pretty good decision?
 
dumb question for somebody that may have more experience in the industry.

After R&D for a game console, shouldn't R&D budgets go down?

I mean, everybody is pretty certain that the next-gen Xbox is pretty much set in specs, right? But there have been rumors and statements that there are multiple versions of the Xbox floating around (I don't have proof; I don't remember where I saw this exactly). There were also statements that after the Durango conference last February, Microsoft went back to the drawing board, or at least made changes to the design based on advice developers had given them, changes the developers were happy with.

I just find it odd that, roughly one year out from the Xbox release, the Entertainment and Devices Division R&D budget went up 44%. Certainly there is the possibility that they are working on something else, but what other major releases do they have?


http://www.gamasutra.com/view/news/..._less_as_next_gen_approaches.php#.UUGI73Hn86Y

I don't know, I just find it odd, but it's probably nothing.
Even if the next Xbox is completely designed, Microsoft probably isn't going to fire everyone, so they will continue to have roughly the same number of employees. Plus they need to actually manufacture the design, which requires a lot of capital. Since they won't start selling the product for a while, this is a cost without a return in the same quarter.

Also, consoles require a lot of software and validation work through the launch period.
 
Even if the next Xbox is completely designed, Microsoft probably isn't going to fire everyone, so they will continue to have roughly the same number of employees. Plus they need to actually manufacture the design, which requires a lot of capital. Since they won't start selling the product for a while, this is a cost without a return in the same quarter.

Also, consoles require a lot of software and validation work through the launch period.

That, and evolution doesn't end at launch.

Engineers are constantly working on die reductions, increasing cost effectiveness, and tweaking thermals. That's also excluding potential projects such as a game-streaming rollout, Fortaleza, OS revisions, and product improvements (such as controllers, resolution of post-launch issues, and whatnot).

Microsoft's internal R&D will fluctuate here and there, but they have to pay a lump sum to AMD (potentially in multiple payments: when the work starts, and then for any overhead when the work is finished). They have to pay GlobalFoundries for their engineering samples, and Flextronics/Foxconn/whoever to start aligning resources for manufacturing, which should fall under R&D.

It's honestly too hard to properly map out these things without seeing the company's internals.

Though we should see a jump in the next couple of months when they start putting in orders for the parts to build everything.
 
I doubt any of the console chips will be made at GlobalFoundries. Both AMD GPUs and the Jaguar SoCs are being fabbed at TSMC. Chips are likely being manufactured right now, or about to be.

Microsoft will likely not fire anyone who doesn't deserve to be fired. They are a large company and can always move people around, even if the Xbox team downsizes due to less need for engineers. People can always work on low-level software testing and revisions, or be placed on a team for a successor.
 
I doubt any of the console chips will be made at GlobalFoundries. Both AMD GPUs and the Jaguar SoCs are being fabbed at TSMC. Chips are likely being manufactured right now, or about to be.

Microsoft will likely not fire anyone who doesn't deserve to be fired. They are a large company and can always move people around, even if the Xbox team downsizes due to less need for engineers. People can always work on low-level software testing and revisions, or be placed on a team for a successor.

I live close to GF. It's been rumored that the Xbox chip will be made there; I would be highly surprised if it didn't happen, especially since they are expanding and they are already huge.
 