Understanding XB1's internal memory bandwidth *spawn

Discussion in 'Console Technology' started by zupallinere, Sep 11, 2013.

  1. function

    function None functional
    Legend

    Joined:
    Mar 27, 2003
    Messages:
    5,854
    Likes Received:
    4,411
    Location:
    Wrong thread
    They only gave one example, but didn't say there's only one case where it can happen.

    Situations involving compute or depth read/modify/write could also be large consumers of simultaneous read/write bandwidth. There's no doubt that there are situations where the ESRAM / DDR3 config can offer significantly more BW than a 256-bit GDDR5 setup could, but there are probably also situations where effective BW will be significantly lower.

    The devil will be in the detail. A single pool will certainly be easier to optimise for and be closer to the PC platforms that many launch games seem to have been built on.
     
  2. Ceger

    Newcomer

    Joined:
    Aug 21, 2013
    Messages:
    59
    Likes Received:
    1
    Other than the actual X1 architects? Who would that be?
     
  3. function

    function None functional
    Legend

    Joined:
    Mar 27, 2003
    Messages:
    5,854
    Likes Received:
    4,411
    Location:
    Wrong thread
    That was before the up-clock iirc.

    Edit: The whole GPU got the upclock, so ROPs and esram included. Measured BW for ROP or BW limited scenarios should scale linearly with the clock.
     
    #243 function, Oct 6, 2013
    Last edited by a moderator: Oct 6, 2013
  4. mosen

    Regular

    Joined:
    Mar 30, 2013
    Messages:
    452
    Likes Received:
    152
    Someone like Nick Baker (from Xbox One hardware architecture team)? :smile:
     
  5. liolio

    liolio Aquoiboniste
    Legend

    Joined:
    Jun 28, 2005
    Messages:
    5,724
    Likes Received:
    195
    Location:
    Stateless
    I guess that is how much bandwidth the scratchpad can deliver in a bandwidth-bound scenario.
    It is quite a high measurement.
    I'm not sure (or worse, I suspect I know) why so many people are wary of MSFT's claims and can't take what they said at face value, while in the meantime... well...

    Anyway, they are pretty honest, too much so in my opinion: they said that achievable bandwidth from DDR3 is ~55GB/s (out of 67GB/s), which is not exactly PR-friendly when you go out of your way to explain the choice you made in light of Sony's choice of lots of really fast memory.
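
    As a rough illustration of the DDR3 figures quoted above, the ratio of achievable to theoretical bandwidth can be worked out directly. This is only a sketch: the function name `efficiency` is made up for the example, and the two inputs are the ~55GB/s and 67GB/s numbers from the post.

```python
def efficiency(achievable_gb_s: float, peak_gb_s: float) -> float:
    """Return the fraction of theoretical peak bandwidth that is achievable."""
    if peak_gb_s <= 0:
        raise ValueError("peak bandwidth must be positive")
    return achievable_gb_s / peak_gb_s


# Figures quoted in the post: ~55 GB/s achievable out of 67 GB/s peak (DDR3).
ddr3_eff = efficiency(55.0, 67.0)
print(f"DDR3 efficiency: {ddr3_eff:.0%}")
```

    That comes out at a little over 80% of the theoretical figure.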
     
  6. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    Someone telling us how it was achieved and with what ops, because the DF quote I quoted paints a different picture imo. It's a moot point anyway, as no one else seems interested in exploring this.
     
  7. Brad Grenz

    Brad Grenz Philosopher & Poet
    Veteran

    Joined:
    Mar 3, 2005
    Messages:
    2,531
    Likes Received:
    2
    Location:
    Oregon
    They've said they measure it as being that high, but we have no sense of whether that rate is something that can be sustained. Maybe it briefly peaks at that rate during specific operations, or maybe it is an overall average usage across a significant time scale. They've never said one way or another, but the example scenarios that have been given seem to suggest achieving that rate requires optimal access patterns.
     
  8. function

    function None functional
    Legend

    Joined:
    Mar 27, 2003
    Messages:
    5,854
    Likes Received:
    4,411
    Location:
    Wrong thread
    How so?
     
  9. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    Because it mentions they achieved the rate of 133GB/s (150GB/s now) using alpha blending. Others are assuming that the 150GB/s is over the entire timestep and not at a specific point in time, which seems to be contrary to what the quote I quoted is saying.
     
  10. oldschoolnerd

    Newcomer

    Joined:
    Sep 13, 2013
    Messages:
    65
    Likes Received:
    8
    What I think is intriguing in this is that, if you accept MS saying that they managed to record 200GB/s real-world utilisation over an entire second (30/60 frames of gameplay), what is the GPU doing to be able to chew through so much data with only 12 CUs etc.? Compared to a 7870 or something?
     
  11. Airon

    Banned

    Joined:
    Dec 12, 2012
    Messages:
    172
    Likes Received:
    0
    You have missed the upclock.
     
  12. function

    function None functional
    Legend

    Joined:
    Mar 27, 2003
    Messages:
    5,854
    Likes Received:
    4,411
    Location:
    Wrong thread
    But that 133 GB/s is from before the upclock. 133 x 1.066 = 142 GB/s. And that may not even be the peak BW, just the BW they measured during 64 bpp alpha blending.
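
    The scaling arithmetic above can be written out as a short sketch. It assumes, as the post does, that bandwidth in a ROP-limited or BW-limited case scales linearly with the GPU clock; the 1.066 factor and the 133 GB/s figure are taken from the post, and the function name is made up for the example.

```python
UPCLOCK_FACTOR = 1.066  # clock ratio used in the post


def scale_with_clock(bandwidth_gb_s: float, clock_factor: float) -> float:
    """Scale a measured bandwidth linearly by a clock ratio (assumed linear)."""
    return bandwidth_gb_s * clock_factor


pre_upclock = 133.0  # GB/s, measured before the upclock per the post
post_upclock = scale_with_clock(pre_upclock, UPCLOCK_FACTOR)
print(f"Scaled bandwidth: {post_upclock:.1f} GB/s")
```

    The result rounds to the 142 GB/s quoted above.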
     
  13. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    That's exactly my point. The upclock is not the point here; the point is that the timestep of the measured 150GB/s is unknown. If the 150GB/s only happens during alpha blending (for example), then the 150GB/s is not a good 'average' figure, is it?
     
  14. function

    function None functional
    Legend

    Joined:
    Mar 27, 2003
    Messages:
    5,854
    Likes Received:
    4,411
    Location:
    Wrong thread
    Actually, I was wrong, I'd forgotten that there is more than one BW example. Here is another example that uses only 32bpp and does not use blending.

    This doesn't make it sound like a sustained 140 ~ 150 GB/s is some theoretical, pie in the sky, bullshit figure.
     
  15. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    For one, that more than saturates their eSRAM bandwidth at those figures (albeit by a tiny amount). And once again, the argument is not over whether or not these figures are real but over how often they occur; if they occur for 1% of the frame (for example) they are useless. If we applied the 80% rule here I think we would come out at a good number.

    And once again he is talking peak numbers. How often do you actually achieve your peak fillrate?
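
    To make the "how often" question concrete, here is a minimal sketch of a time-weighted average over a frame. All the numbers in it are hypothetical, chosen only to show how the same 150 GB/s peak gives very different frame averages depending on how long it is held; none of them are measurements from MS or DF.

```python
from typing import List, Tuple


def frame_average_bw(phases: List[Tuple[float, float]]) -> float:
    """Time-weighted average bandwidth over one frame.

    Each phase is (fraction_of_frame, bandwidth_gb_s); fractions must sum to 1.
    """
    total_fraction = sum(fraction for fraction, _ in phases)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("phase fractions must sum to 1")
    return sum(fraction * bw for fraction, bw in phases)


# Hypothetical frame where the 150 GB/s peak lasts only 1% of the time.
peaky = [(0.01, 150.0), (0.99, 50.0)]
# Hypothetical frame where the same rate is held for 60% of the time.
sustained = [(0.60, 150.0), (0.40, 50.0)]

print(f"peaky frame average:     {frame_average_bw(peaky):.1f} GB/s")
print(f"sustained frame average: {frame_average_bw(sustained):.1f} GB/s")
```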
     
  16. function

    function None functional
    Legend

    Joined:
    Mar 27, 2003
    Messages:
    5,854
    Likes Received:
    4,411
    Location:
    Wrong thread
    The interview says "we've measured about 140-150GB/s for ESRAM. That's real code running. That's not some diagnostic or some simulation case or something like that. That is real code that is running at that bandwidth."

    The upclock is the point, as it would get you into the 140~150 GB/s range that MS give. You're taking the extreme edge of the range that they give and using a small gap to a figure that sits comfortably within that range to sow FUD. Because that's all you're doing: trying to attach uncertainty and doubt to the claims that they're making.

    Also, why would you want an "average" figure? The BW used will be dependent on the workload, so trying to get an out-of-context "average" figure is meaningless. The 140~150 "real world measured" figure (and they give one for the DDR3 too) is already infinitely closer to reality than the 176 GB/s figure that people are comparing it to, yet there isn't some FUD campaign against that.

    They're giving you actual examples of real world bandwidth from specific use-cases. No-one else is doing that. What is actually going on in this thread?
     
  17. Brad Grenz

    Brad Grenz Philosopher & Poet
    Veteran

    Joined:
    Mar 3, 2005
    Messages:
    2,531
    Likes Received:
    2
    Location:
    Oregon
    That rate can't even be achieved with just ESRAM, because he's talking about 164GB/s of write-only traffic, so it outstrips the 109GB/s max for writing to ESRAM. It is not an example of a usage scenario that results in 150GB/s combined read+write for the ESRAM alone.
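
    The comparison above boils down to checking a demand against a ceiling. A small sketch, using the 164GB/s and 109GB/s figures from the post; the function name is made up, and where the excess traffic actually goes is not something this calculation says anything about.

```python
def excess_over_ceiling(demand_gb_s: float, ceiling_gb_s: float) -> float:
    """Return how much of the demand cannot be served under the ceiling."""
    return max(0.0, demand_gb_s - ceiling_gb_s)


ESRAM_WRITE_MAX = 109.0  # GB/s, write ceiling quoted in the post
write_demand = 164.0     # GB/s, write-only rate from the example under discussion

excess = excess_over_ceiling(write_demand, ESRAM_WRITE_MAX)
print(f"Write traffic ESRAM alone cannot absorb: {excess:.0f} GB/s")
```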
     
  18. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    I'm just trying to work out which operations the number is from and over how long. If that is FUD then I honestly think you have its meaning mixed up with something else. The upclock is not relevant because I'm aware of it and using it as I always have; I'm just finding it strange that the prior DF article said they were getting the equivalent of 150GB/s prior to the upclock in ONE case, which was alpha blending.
     
  19. Ceger

    Newcomer

    Joined:
    Aug 21, 2013
    Messages:
    59
    Likes Received:
    1
    And that would be with what they have and know now. I'd expect average results to go up as they learn better ways to utilize the ESRAM as time goes on.
     
  20. function

    function None functional
    Legend

    Joined:
    Mar 27, 2003
    Messages:
    5,854
    Likes Received:
    4,411
    Location:
    Wrong thread
    So you want to apply the 80% number to their 80% number?

    How is a BW figure that's actually been measured in a real application 'useless', in a world where people like their competitors use peak and utterly, utterly unattainable bus x clock figures in their marketing?

    And what would be the point in giving a BW range for regular workloads if they couldn't sustain that BW and those workloads over a meaningful period of time? Why would they be giving those figures to developers in NDA'd docs? It's not like they couldn't check to see.
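
    For reference, a "bus x clock" peak figure is just bus width times transfer rate. The sketch below reproduces the 176 GB/s number mentioned earlier in the thread from a 256-bit bus; the 5.5 GT/s effective transfer rate is an assumption brought in for the example, not something stated in this thread.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_gt_s: float) -> float:
    """Theoretical peak: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_gt_s


# 256-bit GDDR5 bus; 5.5 GT/s is an assumed effective data rate.
gddr5_peak = peak_bandwidth_gb_s(256, 5.5)
print(f"GDDR5 theoretical peak: {gddr5_peak:.0f} GB/s")
```

    A figure like this says nothing about access patterns, refresh, or read/write turnaround, which is why it sits above anything measured in running code.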
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.