Xbox One (Durango) Technical hardware investigation

Discussion in 'Console Technology' started by Love_In_Rio, Jan 21, 2013.

Thread Status:
Not open for further replies.
  1. Airon

    Banned

    Joined:
    Dec 12, 2012
    Messages:
    172
    Likes Received:
    0
    Shifty Geezer, just concentrate on the last two lines of my post.

    To me, they went with eSRAM in order to capitalise on the experience gained with the X360 (their own experience and developers' experience). And, again, eSRAM was there from the beginning.
    With that assumption, do you see eSRAM/DRAM + GDDR5 as reasonable? It's an honest question; I'm asking you.

    Regarding the latency topic, to me it has more to do with DDR3, and it will be more relevant for the CPU.
     
  2. Solarus

    Newcomer

    Joined:
    Jan 12, 2009
    Messages:
    156
    Likes Received:
    0
    Location:
    With My Brother
    There's one thing I'm confused about: is the 10% GPU reserve just for the OS, or is Kinect part of that too? I know they said they use GPGPU for Kinect, but does Kinect have its own resources that it pulls from? Has MS ever given a figure for how much those resources are? Kinect 2.0 has its own processor now, correct? So is it Kinect's CPU + XB1's GPGPU/compute shaders, or is it using XB1's processor as well as its own, plus XB1's GPU?
     
  3. astrograd

    Regular

    Joined:
    Feb 10, 2013
    Messages:
    418
    Likes Received:
    0
    Solarus,

    The 10% figure is a conservative, estimated reserve on MS's part which includes both the OS functions and some Kinect stuff. Here's some more info on the Kinect aspect specifically:

    http://www.vgleaks.com/durango-next-generation-kinect-sensor/

    There is an MEC chip in the audio block for Kinect's voice recognition, but some of the other stuff is using some GPU cycles. Exactly what that breakdown is nobody (outside MS) knows.
     
  4. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    44,106
    Likes Received:
    16,898
    Location:
    Under my bridge
    This has already been discussed in detail in some thread or other, so let's put an end to it. MS measured data across their bus. Over one second it was 100 GB of data. Over another it was 200 GB. Over another it was 150. And they came up with an average attained BW of 150 GB/s. How can we be sure of this, you ask? How do we know it wasn't 150 GB/s for one millisecond while for the rest of the second it was only managing 100 GB? Because 1) that figure makes sod-all sense when trying to communicate to your developers what resources they have at their disposal, and 2) the peak BW is 200 GB/s, so if MS just wanted the biggest possible measurement, they could have contrived a scenario like that and reported, "we've measured 200 GB/s average use."

    So the BW for data in ESRAM, available to developers, is around about 150 GB/s real-world. One can choose to disbelieve if one wants, but there's no more point arguing it. The cards are on the table and it's down to individual interpretation.


    The measurements are for developers. Real-world measurements are far more useful for targeting an engine and assets than paper metrics.

    So, to be clear, the discussion of ESRAM's bandwidth is now off limits. Either one believes it's 150 GB/s as the engineers tell us, or not, but no one needs to engage in a discussion about how much BW there is.
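The averaging described in the post above is just a mean over per-second transfer totals; a minimal sketch using the hypothetical sample values from the post (the figures are illustrative, not real measurements):

```python
# Per-second bandwidth samples measured across the bus: GB transferred
# in each one-second window (hypothetical values from the post above).
samples_gb = [100, 200, 150]

# The average attained bandwidth is the mean of the per-second totals,
# which is a far more useful figure for developers than the 200 GB/s peak.
avg_bw = sum(samples_gb) / len(samples_gb)
print(avg_bw)  # 150.0 GB/s average attained
```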
     
  5. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    44,106
    Likes Received:
    16,898
    Location:
    Under my bridge
    There's been a whole discussion or three on that. The situations in which they arise are tangential to the question of how much BW is available to developers. Betanumerical is saying the 150 GB/s cannot be trusted (off the back of a comment about system bandwidth). At this point, it can be taken as fact that devs have typically 150 GB/s in their usual workloads. The intricacies of how to extract more BW are well worth discussing, but I'd say not in this thread as that discussion requires a low-level software debate, whereas this thread is establishing XB1's hardware, including upper and average bus BWs.

    http://forum.beyond3d.com/showthread.php?t=64291

    Understanding XB1's internal memory bandwidth *spawn
     
  6. oldschoolnerd

    Newcomer

    Joined:
    Sep 13, 2013
    Messages:
    65
    Likes Received:
    8
    Sweet. Can we also have a similar statement on bandwidths being able to be added together... and on overall average system bandwidth having been measured at 200GB/s? Because there are ramifications to that which we should be discussing... like how is the "limited" x1 gpu capable of chewing through so much data?
     
  7. zupallinere

    Regular Subscriber

    Joined:
    Sep 8, 2006
    Messages:
    768
    Likes Received:
    109
    What do you mean by x1 gpu? Is there another number besides x1 that you have in mind?
     
  8. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    I'm pretty sure he means the XBONE GPU, not 1x the GPU :).
     
  9. zupallinere

    Regular Subscriber

    Joined:
    Sep 8, 2006
    Messages:
    768
    Likes Received:
    109
    Yes my mistake.
     
  10. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    So how do you run a test without real code? Simulate it? :lol:

    Anyway, MS just threw out some arbitrary efficiency numbers in that interview without backing them up at all. Numbers from fillrate tests by a respected hardware site (hardware.fr) have already been posted in this forum (of course using real code running on real graphics cards, basically the same way MS will very likely have measured their bandwidth numbers), which show that alpha blending (also mentioned by MS as a scenario for getting more than 109GB/s out of their eSRAM) is good enough to realize 91+% of the theoretical bandwidth of GDDR5. Pure reads (or writes) attain 93% in the tests using a Pitcairn GPU (which is reasonably close to the bandwidth and ROP configuration of the PS4).

    So could you please stop using the arbitrary multipliers. The most straightforward assumption is that this "efficiency" applies to both architectures, XB1 and PS4, in roughly the same way. For this kind of stuff the GPU is very likely able to use the available bandwidth with about the same efficiency, until proven otherwise. The picture is likely more complicated if you dive into the different characteristics of DRAM and SRAM (which we don't know in detail) and parallel-use scenarios. But your grossly simplified approach leads nowhere other than to fuel some crappy fanboy advantage arguments.
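To put rough numbers on the hardware.fr figures cited above, here is a small sketch applying the measured efficiencies to the 176 GB/s theoretical peak of the PS4's GDDR5 pool (the 176 GB/s figure and the 91%/93% efficiencies are the ones referenced in this thread; applying them this way is the post's own "same efficiency for both" assumption):

```python
# hardware.fr fillrate tests on a Pitcairn GPU: alpha blending attains
# ~91% of theoretical GDDR5 bandwidth, pure reads/writes ~93%.
GDDR5_PEAK = 176.0  # GB/s, PS4's unified GDDR5 pool

for scenario, eff in [("alpha blend", 0.91), ("pure read/write", 0.93)]:
    print(scenario, round(GDDR5_PEAK * eff, 1), "GB/s")
# alpha blend 160.2 GB/s
# pure read/write 163.7 GB/s
```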
     
  11. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    44,106
    Likes Received:
    16,898
    Location:
    Under my bridge
    That's discussed in the other thread. One can add bandwidths together but it's a pretty meaningless value.
    Yes. 150 GB/s for ESRAM and 60-odd for DDR3 means ~200 GB/s. How devs make use of that bandwidth is a complicated issue, as it's not directly comparable to 200 GB/s of a unified RAM pool. For further discussion of how devs use the hardware, as opposed to what the HW is, use the existing RAM discussion thread.
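The arithmetic behind that ~200 GB/s figure, as a sketch (the 60 GB/s real-world DDR3 figure is an assumption standing in for "60 odd", against its 68 GB/s peak):

```python
esram_measured = 150.0  # GB/s, the measured eSRAM figure from this thread
ddr3_real = 60.0        # GB/s, assumed real-world DDR3 ("60 odd" of 68 peak)

combined = esram_measured + ddr3_real
print(combined)  # 210.0 -> the "~200 GB/s" figure; note the caveat that
# only the 32 MB of eSRAM sees the 150 GB/s portion, so this is not
# equivalent to 200 GB/s across a unified pool.
```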
     
  12. onQ

    onQ
    Veteran

    Joined:
    Mar 4, 2010
    Messages:
    1,540
    Likes Received:
    56
    I think the coherent memory is virtual memory that's part of the 8GB DDR3 & the 30GB/s is part of the 68GB/s bandwidth. I have a feeling that this was something mostly done for Kinect.


    The 47MB is the ESRAM + all the Cache.

    • 32MB of ESRAM
    • 4MB of CPU L2
    • 512KB of CPU L1
    • 232 KB of Audio Chip Cache/SRAM
    • 512 KB GPU L2
    • 192 KB GPU L1
    • 768 KB GPU LSM
    • 64 KB GPU GSM
    _____________
    ≈ 38.2 MB total

    Well, I was able to find 38.2MB of the 47MB using the Hot Chips document & the leaked GPU info from VGLeaks, which leaves 8-9 MB hidden somewhere on the SoC.
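The tally above can be checked directly; summing the listed pools (1 MB = 1024 KB) reproduces the 38.2 MB figure:

```python
# onQ's tally of on-chip memory (Hot Chips + VGLeaks figures), in KB.
pools_kb = {
    "eSRAM":            32 * 1024,
    "CPU L2":            4 * 1024,
    "CPU L1":           512,
    "audio cache/SRAM": 232,
    "GPU L2":           512,
    "GPU L1":           192,
    "GPU LSM":          768,
    "GPU GSM":           64,
}

total_mb = sum(pools_kb.values()) / 1024
print(round(total_mb, 1))  # 38.2 -> leaves ~8.8 MB of the 47 MB unaccounted for
```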
     
  13. Ceger

    Newcomer

    Joined:
    Aug 21, 2013
    Messages:
    59
    Likes Received:
    1

    Again, the specific 91%+ scenario applies to specific functions; has there been a percentage estimate of full bandwidth utilization from actual titles running? This is a genuine question.

    As for the stated tests/real apps, I would assume that actual titles they have would be the source of those measures; Forza, Ryse, etc..

    So my point is about what seems to be honest talk of a real average, not specific bandwidth-measurement tests, for which I am sure someone at MS could engineer near-peak situations as well. No fanboy argument; I'm actually pushing to look at this beyond fanboy interpretation.
     
  14. zupallinere

    Regular Subscriber

    Joined:
    Sep 8, 2006
    Messages:
    768
    Likes Received:
    109
    Thanks for the response. Since the MS guy was using "coherent read bandwidth" and framing it as the BET against other competing systems, it stands to reason that it was quite important. The big difference is the ESRAM and the audio blocks since most of the other stuff is shared by each system. As such the ESRAM would be the biggest contributor to the "coherent read bandwidth" difference. Seems to be a lot riding on that particular piece of real estate ;-)

    Having it all laid out there, when they state the coherent read bandwidth "bet" I think they are probably averaging up collective bandwidths and then parsing that out over the 47 MB... say 47 MB / 140 GB/s (just making that last number up) for some ratio of memory to bandwidth, or maybe just averaging the bandwidth. Just a supposition.

    So going forward, the BET the MS engineer suggests seems to be that their coherent bandwidth advantage (depending on how that is defined) will give the XB1 longer legs compared to a certain other not-to-be-named system and its GPGPU bet.

    Ah another thread for that discussion.:wink:
     
  15. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    We had this topic already; the conclusion was that they counted all SRAM, including the redundant stuff. That changes a few of the GPU numbers:
    14 x 16 kB vector L1 in the GPU = 224 kB (instead of 192kB)
    14 x 64 kB LDS = 896 kB (instead of 768 kB)

    And you forgot a few things:
    14 x 256 kB vector registers in the GPU = 3584 kB
    14 x 8 kB scalar registers in the GPU = 112 kB
    4 x 16 kB scalar (constant) L1 = 64 kB
    4 x 32 kB instruction cache = 128 kB
    [strike] GDS = 64 kB[/strike]
    4 x (16kB+4kB) ROP tile caches = 80 kB

    That together adds [strike]4192[/strike] 4128 kB to your number, leaving a bit over 4 MB unaccounted for. And if you consider that MS has said the eSRAM is actually ECC protected, you can do some creative counting and come to the conclusion that the 32 MB are actually 36 MB, closing that gap. If there should be a few hundred kB still missing, there are also a lot of small buffers all over the die made up of small SRAMs.
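The full tally with these corrections can be summed the same way; a sketch combining onQ's list with the adjustments above (14 CUs counted including the redundant ones, GDS counted once since it is onQ's GSM entry, and the eSRAM taken as 36 MB with its ECC bits):

```python
# Corrected on-chip SRAM tally, in KB (1 MB = 1024 KB).
tally_kb = {
    "eSRAM incl. ECC":                 36 * 1024,
    "CPU L2":                           4 * 1024,
    "CPU L1":                             512,
    "audio cache/SRAM":                   232,
    "GPU L2":                             512,
    "vector L1 (14 x 16 kB)":          14 * 16,
    "LDS (14 x 64 kB)":                14 * 64,
    "GDS":                                 64,
    "vector registers (14 x 256 kB)":  14 * 256,
    "scalar registers (14 x 8 kB)":    14 * 8,
    "scalar/constant L1 (4 x 16 kB)":   4 * 16,
    "instruction cache (4 x 32 kB)":    4 * 32,
    "ROP tile caches (4 x 20 kB)":      4 * 20,
}

print(round(sum(tally_kb.values()) / 1024, 2))  # 46.26 -> a few hundred KB
# short of 47 MB, consistent with small buffers scattered over the die
```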
     
  16. onQ

    onQ
    Veteran

    Joined:
    Mar 4, 2010
    Messages:
    1,540
    Likes Received:
    56
    Honestly I have no idea what you're talking about right now lol.


    The coherent bus isn't connected to the ESRAM, & it says that "Any DRAM data can be coherent with the CPU caches". It's 30GB/s, but it's part of the 68GB/s DDR3.

     
  17. zupallinere

    Regular Subscriber

    Joined:
    Sep 8, 2006
    Messages:
    768
    Likes Received:
    109
    Thanks again. So I was wrong in remembering the 47 MB as being coherent, then :oops:
     
  18. onQ

    onQ
    Veteran

    Joined:
    Mar 4, 2010
    Messages:
    1,540
    Likes Received:
    56
    The 47MB is from adding the 32MB of ESRAM with all the Cache & SRAM on the SoC.

     
  19. kots

    Regular

    Joined:
    Oct 30, 2008
    Messages:
    394
    Likes Received:
    0
    Could deferred renderers be a problem for XB1, considering the relatively small amount of eSRAM, or is it a non-issue?
     
  20. jlippo

    Veteran

    Joined:
    Oct 7, 2004
    Messages:
    1,744
    Likes Received:
    1,090
    Location:
    Finland
    You can read and write from/into both ESRAM and DDR3, so no issue.
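A rough back-of-envelope for why the question comes up at all: a typical deferred G-buffer at 1080p only barely fits in 32 MB. The render-target count and per-pixel format below are illustrative assumptions, not any shipped engine's layout:

```python
# Size of a hypothetical 1080p G-buffer vs. the 32 MB of eSRAM.
width, height = 1920, 1080
bytes_per_pixel = 4   # e.g. one 32-bit format (RGBA8 or 10:10:10:2) per target
num_targets = 4       # e.g. albedo, normals, material params, motion/spec

gbuffer_mb = width * height * bytes_per_pixel * num_targets / (1024 ** 2)
print(round(gbuffer_mb, 1))  # 31.6 -> four targets barely fit; add a 32-bit
# depth/stencil buffer (~7.9 MB more) and it spills, hence the ability to
# split targets between eSRAM and DDR3 matters.
```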
     