CELL Patents (J Kahle): APU, PU, DMAC, Cache interactions?

Discussion in 'Console Technology' started by j^aws, Aug 19, 2004.

  1. j^aws

    Veteran

    Joined:
    Jun 1, 2004
    Messages:
    1,909
    Likes Received:
    8
    [​IMG]

    [​IMG]

    IBM Source: Updating remote locked cache

    IBM Source: On-chip data transfer in multi-processor system

    I'm pretty sure these are Cell-related patents from IBM (James Kahle). They describe interactions between a processor with local memory (APU), a DMAC, a processor with cache (PU) and system memory.

    They show that the APU local memory can directly access PU cache. Also PUs can have L1, L2 and L3 caches. L2 is shared with APU local memory. L3 is shared with other PUs L3. 8) ...bye bye latencies :?:
     
  2. Panajev2001a

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,187
    Likes Received:
    8
    Edit: I take it back...

    I already saw that patent, but I need to re-read it.

    This might be a mess of virtual-to-physical address translation, as DMACs usually talk in terms of physical addresses. That said, I do not see virtual memory being used in PlayStation 3 games, and even if they do use it, there are ways around it that the guys writing the CELL OS and the basic libraries for CELL can take (as long as they can guarantee that if I allocate an X MB chunk with malloc/new the chunk is all physically contiguous, there should be no problem... stitching DMA packets might be another challenge to add to the table, but even that problem could be solved).
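    To make the "physically contiguous" and "stitching DMA packets" points concrete, here is a minimal sketch of how a virtually contiguous buffer could be cut into physically contiguous DMA transfers. It is not taken from the patents: the 4 KB page size, the dictionary used as a page table and the name dma_segments are all assumptions for illustration only.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def dma_segments(page_table, vaddr, length):
    """Split [vaddr, vaddr + length) into physically contiguous (paddr, size) runs.

    page_table is a hypothetical dict mapping virtual page number -> physical
    page number; a real DMAC or OS would consult a TLB or page tables instead.
    """
    segments = []
    end = vaddr + length
    while vaddr < end:
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        paddr = page_table[vpn] * PAGE_SIZE + offset
        # Never cross a page boundary in a single step.
        size = min(PAGE_SIZE - offset, end - vaddr)
        if segments and segments[-1][0] + segments[-1][1] == paddr:
            # Physically adjacent to the previous run: extend that run.
            segments[-1] = (segments[-1][0], segments[-1][1] + size)
        else:
            # Physical gap: a new DMA packet has to be stitched in.
            segments.append((paddr, size))
        vaddr += size
    return segments


contiguous = {0: 10, 1: 11, 2: 12}  # allocator handed out adjacent frames
scattered = {0: 10, 1: 42, 2: 11}   # same virtual range, scattered frames

print(dma_segments(contiguous, 100, 8192))
print(dma_segments(scattered, 100, 8192))
```

    With the contiguous mapping the whole buffer goes out as one transfer; with the scattered mapping the same request needs three, which is the stitching cost mentioned above.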
     
  3. Fafalada

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    2,773
    Likes Received:
    49
    OMG, Xenon violates a PS3 patent because its GPU can read from CPU cache!!!!!!!!
    Ok joking aside, at least this shows they are thinking about how memory goes around efficiently. Whether they are thinking about it enough... well... we'll see..

    Tsk tsk, you need to brush up on your cell patents Pana. ;)
    Your question was answered quite some time ago... the virtual addresses are translated BY the DMA controller, which is supposed to house a TLB (or use the host CPU's one, I forget which).
     
  4. Panajev2001a

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,187
    Likes Received:
    8
    You are right, I am sorry, those pages of my memory were in the swap file at the moment of that post.

    Actually, I answered this very question for you either here or on GA, when we had a nice discussion about whether APUs could really DMA or not :p.

    It figures that I am the one that misses that detail now... :(.

    I am more PlayStation 2 oriented now and its DMAC hates Virtual addresses :lol.

    BTW, I finally got to have a single DMA call and more than one object displayed in the scene, each using its own matrix (no hardcoded 2-3 object limit), two layers of CALL tags (basically one master CALL tag per object that calls the first tag of a CALL chain, which in turn calls all the sub-DMA chains that upload the needed data and render the object) and double buffering of inputs and outputs on VU1.

    I have BASE = 0 and OFFSET = 512, my VU packets are less than 4 KB in size.

    In each 8 KB buffer I have the input data and then an area for the data to be output to the GS (UV/ST coordinates, RGBAQ data, transformed vertices, GIFTag, etc...).
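    As a rough illustration of the double-buffering arithmetic described above, the sketch below alternates between two buffers at BASE and BASE + OFFSET. It assumes BASE and OFFSET are expressed in 16-byte quadwords (512 quadwords is 8 KB, which matches the buffer size mentioned) and that the output area starts right after the input data. The function names are hypothetical and this is a simplified model, not actual VU1 or VIF code.

```python
QWORD_BYTES = 16  # one VU quadword is 128 bits
BASE = 0          # value from the post, assumed to be in quadwords
OFFSET = 512      # value from the post, assumed to be in quadwords


def buffer_start(batch):
    """Quadword address of the buffer used for the given batch number.

    Even batches use BASE and odd batches use BASE + OFFSET, so uploading
    batch n + 1 never overwrites the data still being processed for batch n.
    """
    return BASE + (batch % 2) * OFFSET


def buffer_layout(batch, input_qwords):
    """Return (input_start, output_start, buffer_end) in quadwords.

    Placing the output area directly after the input data is an assumption
    made for this sketch.
    """
    if input_qwords > OFFSET:
        raise ValueError("input does not fit in one buffer")
    start = buffer_start(batch)
    return start, start + input_qwords, start + OFFSET


for batch in range(4):
    print(batch, buffer_layout(batch, 200))
```

    A packet under 4 KB is under 256 quadwords, so in this model at least half of each 8 KB buffer is left for the data going out to the GS.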

    It took a while to get things working :(.
     
  5. DeanoC

    DeanoC Trust me, I'm a renderer person!
    Veteran Subscriber

    Joined:
    Feb 6, 2003
    Messages:
    1,469
    Likes Received:
    185
    Location:
    Viking lands
    Re: CELL Patents (J Kahle): APU, PU, DMAC, Cache interaction

    The fact these are IBM patents is fairly important: apart from the DMAC, this describes 'another' system pretty well too.

    BTW Latencies are still stupidly high even with lots of cache.
     
  6. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    Those 2 patents are 'old'..I believe someone already posted here about them.
    This is new:
    Streaming data using locking cache

     
  7. Guden Oden

    Guden Oden Senior Member
    Legend

    Joined:
    Dec 20, 2003
    Messages:
    6,201
    Likes Received:
    91
    What is what in those diagrams?

    In the first one, is 202 supposed to be a PU and 204 an APU? In that case, what's 110? Also, I have no memory of a bus controller sitting in between each PU and the main system bus in previous Cell diagrams. The second diagram adds to the mess; that architecture doesn't seem to correspond with the one depicted in the first diagram. Actually, neither of them looks particularly Cell:y to me.

    Previous Cell descriptions have had the PU core first and then a row of APUs hanging off of it, but here bits of that chain seem randomly omitted. How can a patent apply if it doesn't describe an actual implementation? Again, doesn't look like Cell to me.

    This might actually be Xenon CPU core methinks. :lol:
     
  8. one

    one Unruly Member
    Veteran

    Joined:
    Jul 26, 2004
    Messages:
    4,823
    Likes Received:
    153
    Location:
    Minato-ku, Tokyo
    Kahle working in 2 teams at the same time? :roll:
     
  9. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    Umh..maybe Xenon CPU and PS3 PUs will have a lot of stuff in common..
     
  10. Guden Oden

    Guden Oden Senior Member
    Legend

    Joined:
    Dec 20, 2003
    Messages:
    6,201
    Likes Received:
    91
    Are you sure he's actually working in either team, and doesn't just serve as some kind of manager/oversight dude? He could just be the one signing the patent application docs you know, which doesn't mean he's the one actually working on the design. IBM's a huge corporation with many things going on at the same time, so I don't see it as anything strange if a guy has a finger in multiple projects.

    Miyamoto does the same for Nintendo, so why not here? :D
     
  11. one

    one Unruly Member
    Veteran

    Joined:
    Jul 26, 2004
    Messages:
    4,823
    Likes Received:
    153
    Location:
    Minato-ku, Tokyo
    Wow, while there are plenty of engineers at IBM, you're saying there is a shortage of capable managers at IBM the mega-corporation? :lol:

    When is Nintendo doing an outsourced job? :roll:

    Rather, I'm curious about ATI, which bought the ex-SGI ArtX people and is now making the Xbox GPU and probably the Nintendo one.
     
  12. Guden Oden

    Guden Oden Senior Member
    Legend

    Joined:
    Dec 20, 2003
    Messages:
    6,201
    Likes Received:
    91
    Why would I say that? Sheesh... :roll::lol:

    Well, they outsourced Metroid Prime... :lol: Anyway, what does outsourcing have to do with anything in this thread?
     
  13. one

    one Unruly Member
    Veteran

    Joined:
    Jul 26, 2004
    Messages:
    4,823
    Likes Received:
    153
    Location:
    Minato-ku, Tokyo
    Nah, I mean whether Nintendo is doing contracted development for other publishers or not. Why did you bring software projects into this in the first place?

    BTW skimming through patents I find this new patent application 'Game system with graphics processor' which describes PS2 :lol:
     
  14. Gubbi

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,528
    Likes Received:
    862
    Re: CELL Patents (J Kahle): APU, PU, DMAC, Cache interaction

    Are these two patents related in any way?

    The former patent details a system that uses bus snooping to ensure cache coherency, the latter uses a directory based system.

    The difference is huge. In a snooping system, coherency traffic goes up with n squared, where n is the number of CPUs (local memories or caches really). In a directory based system it scales with the number of CPUs.
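    A back-of-the-envelope model of that scaling difference, as a hedged sketch: it assumes every node issues exactly one request, that a snooped request is broadcast to every other node, and that a directory forwards each request to a small fixed number of sharers (the default of 2 is an arbitrary assumption). Replies, retries and bus arbitration are ignored, and the function names are made up for this example.

```python
def snoop_messages(n):
    """Messages when each of n nodes issues one request that is broadcast
    to the other n - 1 nodes."""
    return n * (n - 1)


def directory_messages(n, sharers=2):
    """Messages when each of n nodes sends one request to a directory,
    which forwards it to a fixed number of sharers (an assumed constant)."""
    return n * (1 + sharers)


for n in (2, 4, 8, 64, 1024):
    print(n, snoop_messages(n), directory_messages(n))
```

    In this toy model the snooped traffic grows as n(n - 1) while the directory traffic grows linearly in n, so the gap only starts to matter once the node count gets well past a handful.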

    Opteron broadcasts memory requests (snooping) and scales poorly beyond 4 CPUs. SGI's Altix (and old Origin 2&3K) series and Alpha EV7s use directory based coherency and scale to 2^10 CPUs (and more).

    If they really apply to CELL, I can see snooping used in small scale CELL systems and directories used in large scale systems.

    I'm puzzled that these patents are granted in the first place since there seems to be very little new in them.

    Cheers
    Gubbi
     
  15. Guden Oden

    Guden Oden Senior Member
    Legend

    Joined:
    Dec 20, 2003
    Messages:
    6,201
    Likes Received:
    91
    You must be mistaken, because I have not.
     
  16. j^aws

    Veteran

    Joined:
    Jun 1, 2004
    Messages:
    1,909
    Likes Received:
    8
    [​IMG]

    Probably flamebait, but a guesstimate... any thoughts? :p
     
  17. Guden Oden

    Guden Oden Senior Member
    Legend

    Joined:
    Dec 20, 2003
    Messages:
    6,201
    Likes Received:
    91
    Yeah: no amount of L2 is confirmed yet (though some is suspected I guess), and where did you get the idea there would be any L3 at all? It's not in any of the illustrations.

    Also, the patent seems to describe some kind of shared bus, or possibly crossbar bus, that gives processors from one PU access to cache and scratchpad belonging to another PU.
     
  18. Fafalada

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    2,773
    Likes Received:
    49
    It could sort of make sense though couldn't it? PE only needs to scale up to 8 APUs, while external communication is expected to scale much higher.

    The odd thing is that the directory patent refers to transfers "on-chip" and alludes to the situation illustrated being on a single chip.

    I think they are not granted yet? Anyway, I agree - looking at that newest patent nAo posted - it doesn't seem to contain anything particularly new either.
    Then again - we also saw there are patents for GCN and PS2 respectively... :?

    Jaws, don't you think you're going a bit overboard with all the cache? If there'll be THAT much eDRAM, don't expect large or lots of caches. I somehow doubt we'll see the massive eDRAM pool though.
     
  19. V3

    V3
    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    3,304
    Likes Received:
    5
    It's probably better for them to go for more caches.

    At this point, if they didn't embed the 32MB of PSP memory on chip, I find it difficult to think that they'll make enough progress within a year to be able to embed 64MB of memory alongside the kind of logic and caches that Cell is supposed to have. Well, my mind will probably change again at the end of the year, when they are supposedly going to do some kind of demonstration.
     
  20. j^aws

    Veteran

    Joined:
    Jun 1, 2004
    Messages:
    1,909
    Likes Received:
    8
    From the 2nd patent,


    I've tweaked the diagram a bit so that the L3 cache is shared amongst the PUs. It now looks more like a unified L3 cache! 8)

    [​IMG]
     