Xbox One (Durango) Technical hardware investigation

Discussion in 'Console Technology' started by Love_In_Rio, Jan 21, 2013.

Thread Status:
Not open for further replies.
  1. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
    Thanks, this is a clear and logical explanation as to why the esram should be able to hit 150GB/s on average. Assuming of course there are no other caveats of which we are not aware then quite frankly this makes the Microsoft statement kinda moot since all they are saying is that they've measured a throughput (peak, average, whatever) which you can already show must logically be achievable on average from such a design.

    Now why didn't you tell us this a month ago? j/k :wink:
     
  2. Gubbi

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,661
    Likes Received:
    1,114
    Seconded.

    Cheers
     
  3. DrJay24

    Veteran

    Joined:
    May 16, 2008
    Messages:
    3,894
    Likes Received:
    634
    Location:
    Internet
    I guess you are considering MS public PR as evidence? I would say right now we don't have any real evidence of the effective bandwidth and how it is used in real games.

    I guess we have some info about frame rates and frame buffer resolutions, but that would be hard to use as data to figure out memory bandwidth.
     
  4. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
    Fair play. So if another vendor came out tomorrow with a statement "we've measured 170GB/s in a real game" then you would also accept that as being the average utilization for said system. Glad we understand each other.
     
  5. blakjedi

    Veteran

    Joined:
    Nov 20, 2004
    Messages:
    2,985
    Likes Received:
    88
    Location:
    20001
    I think Gubbi answered your questions in the last two pages.
     
  6. DrJay24

    Veteran

    Joined:
    May 16, 2008
    Messages:
    3,894
    Likes Received:
    634
    Location:
    Internet
    We clearly have different definitions of evidence.
     
  7. Bigus Dickus

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    943
    Likes Received:
    16
    Then by your definition, the xb1 has real-world bandwidth to the GPU of some 270 ~ 290 GB/s (depending on whether the combined esram rate is 204 or 218, I still don't think that's clear), so long as some code can be written to achieve that burst over a short time, and there is no indication thus far that this isn't possible.

    Now, I think your definition is silly. There is a reason we use terms like peak or theoretical: we expect average, or even maximum short-term, rates in real game code to be somewhat less.

    We don't know how much less for either system. People can pontificate and speculate, and make up numbers to suit their agenda, but we simply don't know. For either system. And probably won't know for quite some time. MS has given some extra information suggesting perhaps 200 GB/s is more representative, but the information is ambiguous because we don't know representative of WHAT and under which conditions.

    So we have peak numbers to compare, which doesn't help much except to say that both consoles seem to have pretty high bandwidth and games should be interesting.

    As for the 99% of slow RAM talk, it's just as much handwaving as thinking the XB1 will achieve sustained 290 GB/s. So what if the faster pool is only 1% of the total memory size, if the data structures being written to and read from it fit within its size, and the GPU winds up using this pool for more than half the total GPU memory accesses? See, I can make up numbers too.

    We will have to wait for quite a while before making more informed comparisons. Release of tech details probably won't even be enough. I think measured performance in realistic code will ultimately shed some light on how these two approaches compare.
     
  8. dobwal

    Legend

    Joined:
    Oct 26, 2005
    Messages:
    5,955
    Likes Received:
    2,326
    Doesn't 133 GB/s work out to quite a few 1080p frames being alpha blended per second? Like somewhere in the neighborhood of ridiculous?
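
    A rough way to put a number on that question is the sketch below. It assumes a 32-bit (4 bytes per pixel) render target and that each blended pixel costs one read and one write of the destination; the function name and defaults are made up for illustration, and nothing here reflects how MS actually measured its figure.

```python
def blended_layers_per_second(bandwidth_gb_s, width=1920, height=1080,
                              bytes_per_pixel=4):
    """Full-screen alpha-blended layers per second a given bandwidth could feed."""
    # Assumption: blending reads the destination pixel and writes it back,
    # so each blended pixel costs two render-target accesses.
    bytes_per_layer = width * height * bytes_per_pixel * 2
    return bandwidth_gb_s * 1e9 / bytes_per_layer


layers_1080p = blended_layers_per_second(133)
print(f"{layers_1080p:.0f} blended 1080p layers per second")
print(f"{layers_1080p / 60:.1f} blended 1080p layers per frame at 60 fps")
```

    Under those assumptions 133 GB/s is on the order of 8,000 full-screen 1080p blends per second, which is well over a hundred layers of overdraw per frame at 60 fps.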
     
  9. dobwal

    Legend

    Joined:
    Oct 26, 2005
    Messages:
    5,955
    Likes Received:
    2,326
    What's the max mph of your car? Does the fact that a ton of factors keep you from regularly approaching that speed mean that the max mph of your car is something other than what you hit when you mash the pedal to the metal, if only for a few seconds?

    It's a max rate, that's all it is. It doesn't mean anything in and of itself, especially when you are comparing the bandwidth of two different systems.

    You might think I hold those peak numbers out there to mean one system is better than the other. But I don't. I understand that max or peak bandwidth can happen, but in and of itself it does not provide a clear look at how robustly the memory system of either console performs, especially in comparison to each other.

    I don't look at the peak numbers provided for the memory of these consoles the way I would when comparing discrete GPUs. There you don't need to understand the average or sustainable rate, because you know it is usually proportional to the peak bandwidth, especially when comparing cards with the same type of DRAM.

    Like when MS throws out 133 GB/s for alpha blending. I find it more natural to look at that as a PR number calculated from blending a 720p or 1080p frame, dividing by the time it takes to complete said frame, and then extrapolating out to produce a 133 GB/s number. Otherwise what is MS doing, alpha blending an 8K frame at 30 fps?
     
    #6709 dobwal, Sep 25, 2013
    Last edited by a moderator: Sep 25, 2013
  10. MrFox

    MrFox Deludedly Fantastic
    Legend

    Joined:
    Jan 7, 2012
    Messages:
    6,488
    Likes Received:
    5,996
    My brain likes the simplest explanations, and so far what I can understand about the memory system is what the engineers have said. Does anybody disagree with the following?

    A. 102GB/s read to the 32MB pool.
    B. 102GB/s write to the 32MB pool.
    C. 68GB/s read/write to the 8GB pool.

    - All three paths are operating in parallel.
    - They have no contention between each other except ESRAM bank conflict during read/write
    - The latencies of these paths are unknown.

    I think this is what we know. If one of the paths saturates, your code can't go any faster. Trying to add numbers together doesn't simplify things, it removes important data, and as a side effect it makes the internets explode.
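
    The "if one path saturates, you can't go any faster" point can be sketched as a toy model. It takes the three peak rates from the list above at face value and assumes the paths run fully in parallel with no bank conflicts and no latency effects; the workload figures and all names are hypothetical.

```python
# Peak rates in GB/s, as listed in the post above.
PEAK_GB_S = {"esram_read": 102.0, "esram_write": 102.0, "ddr3": 68.0}


def bottleneck(traffic_gb):
    """Return (limiting path, minimum seconds) for the given GB of traffic per path."""
    # Each path needs at least traffic / peak rate; the slowest path sets the floor.
    times = {path: traffic_gb.get(path, 0.0) / rate
             for path, rate in PEAK_GB_S.items()}
    slowest = max(times, key=times.get)
    return slowest, times[slowest]


# Hypothetical workload: GB moved over each path.
workload = {"esram_read": 1.0, "esram_write": 0.5, "ddr3": 1.0}
path, seconds = bottleneck(workload)
effective_gb_s = sum(workload.values()) / seconds
print(f"limited by {path}, effective {effective_gb_s:.0f} GB/s")
```

    In this made-up mix the DDR3 path is the limit, and the effective rate comes out well below the sum of the three peaks, which is the information a single aggregate number throws away.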
     
  11. Rangers

    Legend

    Joined:
    Aug 4, 2006
    Messages:
    12,791
    Likes Received:
    1,596
    .6%...You keep throwing around the incorrect .04% and others also accept it.

    1% of 5000MB=50MB, so 32MB=.6% (a graspable walkthrough rather than the precise math, which is 32/5000)
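
    For anyone who wants the precise figure, a quick check is below. The 5000MB total is the number used in this post and the 8192MB total corresponds to the full 8GB pool mentioned earlier in the thread; the function name is just for illustration.

```python
def share_percent(pool_mb, total_mb):
    """Size of a memory pool as a percentage of a total."""
    return pool_mb / total_mb * 100


print(round(share_percent(32, 5000), 2))  # against the 5000MB used in the post
print(round(share_percent(32, 8192), 2))  # against the full 8GB pool
```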

    Bit of a moot point but yeah.
     
  12. Rangers

    Legend

    Joined:
    Aug 4, 2006
    Messages:
    12,791
    Likes Received:
    1,596
    Trying to imply that DDR3+ESRAM is somehow some large fraction less capable of reaching its peak than other memory setups, which is basically what goes on all the time, is annoying, and typical posturing.

    And yes it does appear X1 is a bandwidth monster.

    X1 has its weaknesses but we should also give due credit to its apparent strengths.
     
  13. warb

    Veteran

    Joined:
    Sep 18, 2006
    Messages:
    1,057
    Likes Received:
    1
    Location:
    UK
    Unless most games do end up with ~200GB/s system bandwidth in this setup. They likely put some thought into it.
     
  14. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    44,106
    Likes Received:
    16,898
    Location:
    Under my bridge
    Aaaaaaa! :runaway: It is not possible to understand the flow of data in a system by a single metric (unless that system has a single memory pool). Your aggregate number is true and yet pointless, and there's zero sense in trying to condense understanding of the BW into this single value.

    Sometimes the code will run at the fastest aggregate speed of the total RAM. Sometimes it could be bottlenecked by the slowest singular pipe. Mostly it'll be hitting shifting limits as data moves around the different pools. All games will have access to ~200 GB/s (actually 272 GB/s as total peak available BW) but the amount of data flowing through the system could be very different. The most important thing is that devs will try to maximise dataflow within budgets and development targets, which is why they want to know bus speeds. Bus speeds aren't really for informing the masses about the potential of the consoles!
     
  15. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    From the Orbis thread, benchmarks using ROP blending operations on discrete GPUs have a 91% utilization rate versus blending numbers leaked for the eSRAM, 133-140 out of 204.
    It may come down to different controller priorities, or the static split might not fit the necessary mix for the dynamic demands of those benches.
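
    To put the two sets of figures on the same footing, the utilization implied by the numbers quoted above can be worked out directly. This only restates the thread's own figures (133-140 GB/s measured against a 204 GB/s peak, versus 91% on discrete GPUs); the helper name is illustrative.

```python
def utilization(measured_gb_s, peak_gb_s):
    """Fraction of peak bandwidth actually achieved."""
    return measured_gb_s / peak_gb_s


for measured in (133, 140):
    print(f"{measured} of 204 GB/s = {utilization(measured, 204):.1%}")
```

    That works out to roughly 65-69% of peak for the eSRAM blending numbers, against the 91% quoted for discrete GPUs.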
     
  16. XpiderMX

    Veteran

    Joined:
    Mar 14, 2012
    Messages:
    1,768
    Likes Received:
    0
    But the numbers are 140-150 GB/s, no?
     
  17. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    19,426
    Likes Received:
    10,320
    For the most part yes. So, every frame you'd have the following...

    1. Read from main memory.
    2. Read-modify-write to eSRAM, quite likely multiple times for each read and write to main memory.
    3. Write to main memory.

    Yes, that is greatly simplified. For a traditional GPU memory pool, all of the read-modify-write would go back to main memory unless it manages to fit within the on-chip GPU caches, thus eating into your main memory bandwidth and triggering read-write-read performance penalties.

    As well, that does not take into account that some data is likely to persist in eSRAM between frames.
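
    The simplified per-frame flow above can be sketched as a crude traffic count. This is only an illustration under loud assumptions: one buffer that fits entirely in the on-chip pool, every read-modify-write pass touching the whole buffer once for read and once for write, and no caches, compression, or data persisting between frames. The names and sizes are hypothetical.

```python
def main_memory_traffic_mb(buffer_mb, rmw_passes, has_scratchpad):
    """Main-memory traffic for one buffer that gets rmw_passes read-modify-write passes."""
    if has_scratchpad:
        # Steps 1 and 3: one read in from main memory, one write back out.
        # The read-modify-write passes (step 2) stay in the on-chip pool.
        return 2 * buffer_mb
    # Single pool: every pass reads and writes the buffer in main memory.
    return 2 * buffer_mb * rmw_passes


for passes in (1, 4, 8):
    with_pool = main_memory_traffic_mb(8, passes, True)
    without_pool = main_memory_traffic_mb(8, passes, False)
    print(passes, with_pool, without_pool)
```

    In this toy model the main-memory cost with the on-chip pool stays flat as the number of passes grows, while the single-pool cost scales with it. It says nothing about which real system ends up faster.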

    And to certain people that are likely reading this... Note, this isn't to say one is better or faster than the other. Only pointing out that the situation is far more complex than X has more bandwidth than Y.

    Regards,
    SB
     
  18. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    I don't recall them saying the new numbers were for the same tests when they gave the 150 bound.
     
  19. DrJay24

    Veteran

    Joined:
    May 16, 2008
    Messages:
    3,894
    Likes Received:
    634
    Location:
    Internet
    You are trying to have it both ways. An extra 2 CUs don't yield the ~16% increase because the XB1 is "balanced", yet it has an abundance of memory bandwidth. Where is the bottleneck then?

    The number of CUs scales with memory bandwidth on dedicated cards for a reason, yet the XB1 can't seem to make it scale. Why?
     
  20. oldschoolnerd

    Newcomer

    Joined:
    Sep 13, 2013
    Messages:
    65
    Likes Received:
    8
    How about this. The on-die esram is so low latency because of physical proximity to the CUs that they are able to feed those cores up to really high utilisation. So you have managed to use all the bandwidth with 12 CUs. Adding extra CUs isn't going to help much. Raising the clock speed even by 6% gives you a linear performance increase across the board for free. Maybe.
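
    For scale, the two options being compared can be put side by side on paper. The sketch below assumes the publicly reported figures of 12 vs 14 active CUs and an 800 MHz vs 853 MHz GPU clock (neither clock is stated in this thread), plus the usual GCN arithmetic of 64 lanes per CU and 2 FLOPs per lane per clock. It covers peak ALU throughput only and says nothing about whether real games scale that way.

```python
LANES_PER_CU = 64    # GCN: 4 SIMDs x 16 lanes per compute unit
FLOPS_PER_LANE = 2   # one fused multiply-add per lane per clock


def peak_gflops(cus, clock_mhz):
    """Theoretical peak single-precision GFLOPS for a GCN-style GPU."""
    return cus * LANES_PER_CU * FLOPS_PER_LANE * clock_mhz / 1000


baseline = peak_gflops(12, 800)     # assumed original configuration
more_cus = peak_gflops(14, 800)     # two extra CUs, same clock
faster_clock = peak_gflops(12, 853)  # same CUs, assumed raised clock

print(f"14 CUs @ 800 MHz: +{more_cus / baseline - 1:.1%}")
print(f"12 CUs @ 853 MHz: +{faster_clock / baseline - 1:.1%}")
```

    On paper the extra CUs add more ALU throughput than the clock bump does; the argument in the post is that the clock bump also speeds up everything else on the GPU, which this ALU-only figure does not capture.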
     