Understanding XB1's internal memory bandwidth *spawn

Discussion in 'Console Technology' started by zupallinere, Sep 11, 2013.

  1. dobwal

    Legend

    Joined:
    Oct 26, 2005
    Messages:
    5,955
    Likes Received:
    2,325
    Hey, taisui, 3dilettante (or basically anybody with greater knowledge of memory) could you please help me out?

    What if the volatile bits that were talked about for the PS4 were applicable to Durango and its eSRAM?

    Would it be okay under that circumstance to say that for alpha blending FP16 X 4, every 16 bits of data had a volatile bit associated with it? With a 256 bit X 4 interface, you wouldn't get 256 bits per pool of eSRAM but 240 bits. 240 X 4 would give 960 bits' worth of data accessed per cycle, or 120 bytes X 2 at DDR, which at 800 MHz would work out to 192 GB/s. At 853 MHz that works out to 204.72 GB/s. Or am I wrong in my math or reasoning?
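
    A minimal sketch of that arithmetic, purely to check the numbers. The one-flag-bit-per-16-bits layout is the hypothetical from this post, not a confirmed hardware detail, and 1 GB is taken as 10**9 bytes.

```python
# Hypothetical layout from the post: 1 of every 16 bits is a flag, not payload.
BITS_PER_POOL = 256
FLAG_BITS_PER_POOL = BITS_PER_POOL // 16   # 16 flag bits per pool (assumption)
POOLS = 4

payload_bits = (BITS_PER_POOL - FLAG_BITS_PER_POOL) * POOLS  # 240 * 4 = 960
payload_bytes = payload_bits // 8                            # 120
bytes_per_clock = payload_bytes * 2                          # 240, two transfers per clock


def gb_per_s(clock_mhz):
    """Peak rate in GB/s for a given base clock in MHz."""
    return bytes_per_clock * clock_mhz * 1e6 / 1e9


print(gb_per_s(800))  # 192.0
print(gb_per_s(853))  # 204.72
```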


    If SIMD takes about 4 cycles to complete an operation, would alpha blending work out like this?

    60 pixels (960 bits) read in a cycle, with 4 cycles needed to complete the operation on the GPU and an additional cycle for the write to eSRAM. Throughput would be 660 pixels per 16 cycles; extrapolated out to 1.6 GHz, would that work out to 132 GB/s? That's missing a GB every second though.
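
    One way to arrive at the 660 figure, sketched below. The pipelining model (one read issued per cycle, a batch counted once its write has finished) is an assumption; the post does not spell out how the 16-cycle window is counted.

```python
PIXELS_PER_READ = 60                      # from the post: 960 bits per read
BITS_PER_PIXEL = 960 // PIXELS_PER_READ   # 16 bits per pixel, implied by 960 / 60
WINDOW_CYCLES = 16
LATENCY_CYCLES = 4 + 1                    # 4 GPU cycles plus 1 write cycle

# Assumption: a new read starts every cycle, so in a 16-cycle window
# only the batches issued in the first 11 cycles have been written back.
batches_done = WINDOW_CYCLES - LATENCY_CYCLES        # 11
pixels_done = batches_done * PIXELS_PER_READ         # 660

bytes_per_cycle = pixels_done * BITS_PER_PIXEL / 8 / WINDOW_CYCLES  # 82.5
gb_per_s = bytes_per_cycle * 1.6e9 / 1e9             # at 1.6 GHz

print(pixels_done, gb_per_s)  # 660 132.0
```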
     
  2. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    The volatile bit is a cache line status flag indicating the value was loaded from coherent memory.
    There's a bunch of strikes against using it in this regard.

    It wouldn't be included in the data bandwidth of the storage pool, it wouldn't be every 16 bits, and the usage described for it so far wouldn't apply for blending because the ROPs don't deal with coherent memory.
    It's also described as a Sony customization, so there's that as well.

    edit:

    There's a benefit for having a large on-die memory pool.
    There's a multitude of reasons why SRAM or eDRAM could be chosen or not, and not all of them are related to the code being run on them.
    At that size, it could be argued that eDRAM would be superior in many ways, but there would have been concerns from the perspective of manufacturing the device and its ability to be updated to future nodes.

    The makeup of the storage cell is not really relevant to the bits stored in it.
     
    #222 3dilettante, Sep 16, 2013
    Last edited by a moderator: Sep 16, 2013
  3. dobwal

    Legend

    Joined:
    Oct 26, 2005
    Messages:
    5,955
    Likes Received:
    2,325
    AMD patent.
    Abstracting scratch pad memories as distributed arrays
    http://www.google.com/patents/US20130212350

    I was thinking while eSRAM may not be cpu coherent, can't it still be I/O coherent? Or basically coherent with portions of gpu caches.

    I was thinking one of the function of the volatile bit was selective flushing.

    Probably wrong in my thinking so thanks.
     
    #223 dobwal, Sep 16, 2013
    Last edited by a moderator: Sep 16, 2013
  4. taisui

    Regular

    Joined:
    Aug 29, 2013
    Messages:
    674
    Likes Received:
    0
    I don't feel it's likely that theoretical bandwidth formulas are more detailed than ports x interface x frequency, since it's just theoretical, and it has been computed this way for the longest time.

    Since Penello had already clarified the 204 vs 218 discrepancy, it would seem that it's just as simple as a typo.
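
    As a rough sketch of the ports x interface x frequency formula: the 4 x 256-bit layout is taken from the earlier post, and doubling for a simultaneous read and write is an assumption, not a confirmed spec.

```python
def peak_gb_per_s(ports, interface_bits, clock_mhz):
    """ports x interface x frequency, in GB/s with 1 GB = 10**9 bytes."""
    bytes_per_clock = ports * interface_bits / 8
    return bytes_per_clock * clock_mhz * 1e6 / 1e9


one_way = peak_gb_per_s(4, 256, 853)   # about 109.2 GB/s
both_ways = 2 * one_way                # about 218.4 GB/s if a read and a write fit in every clock

print(round(one_way, 1), round(both_ways, 1))
```

    Under those assumptions the doubled figure lands near 218 rather than 204, which is all the formula by itself can say.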
     
  5. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    There seems to be some kind of link hinted at in the Hot Chips slide, but it doesn't describe what that is.

    That bit is set for cache lines loaded from coherent memory, which the ROPs do not deal with. The primary purpose is for selective flushing, but the first official disclosure of it was Sony saying it was a feature they asked for.
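
    A toy model of what selective flushing with such a bit could look like. This is only an illustration: `ToyCache`, `CacheLine`, and `flush_volatile` are made-up names, and real hardware would write back or invalidate lines rather than delete dictionary entries.

```python
from dataclasses import dataclass


@dataclass
class CacheLine:
    address: int
    data: bytes
    volatile: bool  # set when the line was loaded from coherent memory


class ToyCache:
    def __init__(self):
        self.lines = {}

    def load(self, address, data, from_coherent_memory):
        # The flag is recorded once, at load time, based on where the data came from.
        self.lines[address] = CacheLine(address, data, from_coherent_memory)

    def flush_volatile(self):
        """Drop only the lines flagged volatile; return how many were dropped."""
        flagged = [addr for addr, line in self.lines.items() if line.volatile]
        for addr in flagged:
            del self.lines[addr]
        return len(flagged)


cache = ToyCache()
cache.load(0x100, b"texture", from_coherent_memory=False)
cache.load(0x200, b"compute", from_coherent_memory=True)
cache.load(0x300, b"compute", from_coherent_memory=True)

dropped = cache.flush_volatile()
remaining = sorted(cache.lines)
print(dropped, remaining)  # 2 [256]
```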
     
  6. (((interference)))

    Veteran

    Joined:
    Sep 10, 2009
    Messages:
    2,499
    Likes Received:
    70
    To be fair, Richard put all the info he had from that post on MS's Developer Central on the ESRAM upgrade so it's not like he could have given more clarity there.
     
  7. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    My criticism was of the probative value of the information he was given, which is where some of the more technical skepticism from myself and a few other posters came in.
    I know internal sources like to give tidbits that say something without saying everything, which leaves room for interpretation on the part of the writer and reader.
    If it's an overly positive interpretation that doesn't bear out after people start paying money, they can say "aw shucks" all the way to the bank.

    More details have slowly leaked out over time, but that little bit of contextless information was problematic because so many scenarios could produce it, ranging from improper testing and cache effects to various design choices and memory optimizations that are common enough that they are expected as a matter of routine.

    My first reaction was that the little bit of information was leaked ambiguously on purpose.
    It's not like the console makers are above trumpeting every standard CPU and GPU feature as something new and mind-blowing.
     
  8. (((interference)))

    Veteran

    Joined:
    Sep 10, 2009
    Messages:
    2,499
    Likes Received:
    70
    Well, the thing is Richard was sent a direct cut & paste of a post MS made on their internal MS Dev Central developer resource so he was just telling us what MS was announcing to devs in private.

    So it wasn't like some internal source was leaking him (and him alone) tidbits that made the XB1 seem better than it is.
     
  9. AzBat

    AzBat Agent of the Bat
    Legend

    Joined:
    Apr 1, 2002
    Messages:
    7,749
    Likes Received:
    4,847
    Location:
    Alma, AR
    Richard? You mean Albert Penello?

    BTW, long time no see. ;)

    EDIT: Nevermind, brain fart. Richard Ledbetter DF, move along, it's too early this morning. LOL

    Tommy McClain
     
  10. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    We have no idea how often the 150GB/s figure for the eSRAM happens; I suspect it is not as often as people think.
     
  11. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    The problem is that the 150GB/s is not achievable all the time. It's a 'sometimes case' just like any other peak bandwidth, but it has further caveats.
     
  12. Ceger

    Newcomer

    Joined:
    Aug 21, 2013
    Messages:
    59
    Likes Received:
    1
    No, the 208GB/s is not achievable all the time. The 150GB/s is the regularly achievable amount. This is just the ESRAM as well.
     
  13. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    But didn't they say they measured the 150GB/s in only one case, not all the time? If that's the case then it's not the same kind of number.
     
  14. Cranky

    Newcomer

    Joined:
    May 22, 2013
    Messages:
    134
    Likes Received:
    0

    That is correct. I was actually being generous there and accounting for less efficiency for the ESRAM as 200/272 actually equals 73%, which I used for the numerator and not 80% which I used for the alternative. Had I used equal coefficients then the advantage would have been 59%.
     
  15. liolio

    liolio Aquoiboniste
    Legend

    Joined:
    Jun 28, 2005
    Messages:
    5,724
    Likes Received:
    195
    Location:
    Stateless
    Not really:
     
  16. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    Didn't DF previously mention that the 150GB/s was measured using alpha blending? You're not getting 150GB/s with only reads or writes; you have to do both to get it.
     
  17. mosen

    Regular

    Joined:
    Mar 30, 2013
    Messages:
    452
    Likes Received:
    152
    Are you sure? I have heard that 204GB/s is not achievable all the time and 150GB/s is a real-world number from real tests with real apps.
     
  18. Ceger

    Newcomer

    Joined:
    Aug 21, 2013
    Messages:
    59
    Likes Received:
    1
    No, they said that it is actual real code, not tests and such. I think you read it backwards.

     
  19. dragonelite

    Veteran

    Joined:
    Dec 20, 2009
    Messages:
    1,556
    Likes Received:
    1
    Location:
    netherlands
    Wasn't that around 135GB/s?
     
  20. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    I would like further clarification from someone who knows more to be honest.

    This is what DF said previously.

     