PlayStation III Architecture

Discussion in 'Console Technology' started by alexsok, Jan 7, 2003.

  1. Nite_Hawk

    Veteran

    Joined:
    Feb 11, 2002
    Messages:
    1,202
    Likes Received:
    35
    Location:
    Minneapolis, MN
    marconelly!:

    It also said that it's connected via a switched 1024bit bus. It would be pretty silly to have main memory on such an encredibly fast bus, but make it too small to store everything needed. My guess is that this is more akin to a centralized video memory, or cache area. There is no reason that you couldn't have local cache on a per processor basis, but setup a centralized L2 cache. This would be especially useful if the 64MB cache area is used for information that can be computed in parellel. Say for example, that you have multiple processors computing lighting vectors for doing monte carlo raytracing for global illumination. You can't simply work on a small area of the scene at a given time without knowing what's happeneing in other areas of the scene. In this way, each processor needs to have access to the entire scene's lighting calculations that have been accomplished so far. Thus, each processor could get the relevent information it needs from the central 64MB L2 cache, perform it's computations in the 128KB L1, and send the data back to 64MB L2. You'd need it to be switched to avoid race conditions caused by each processor reading and writing to the memory.

    I personally am pretty confident they'll get the ratio of memory to cache right, to a certain extent. As you've already mentioned, having a 64MB segment for main memory would be pretty silly, and Sony's engineers aren't that dumb. I'm more interested in how the heck they plan to have a switched 1024-bit memory bus between cells. They certainly must have some talented engineers on staff.

    Nite_Hawk
     
  2. marconelly!

    Veteran

    Joined:
    Aug 25, 2002
    Messages:
    2,742
    Likes Received:
    0
    I see what you mean, and it's in agreement with the article. 64MB is probably going to be some sort of cache shared among (4?) cells. There will not be 64MB per cell, as someone mentioned earlier. The actual main memory will be a separate entity.
     
  3. DVFtaxman

    Newcomer

    Joined:
    Oct 11, 2002
    Messages:
    86
    Likes Received:
    0
    1ST PICTURE?
    [IMG]
    :lol:
     
  4. mr

    mr
    Newcomer

    Joined:
    Oct 7, 2002
    Messages:
    143
    Likes Received:
    0
    Nah...that would only be 3 cells. :)
     
  5. bryanb

    Newcomer

    Joined:
    Sep 6, 2002
    Messages:
    68
    Likes Received:
    0
    Rapid speculation:

    So if the cells (which are analogous to CPUs?) share the memory, how will this work?

    One cell could be allocated to doing transforms on graphics in that memory, and it gets the lion's share of the memory?

    While another cell, allocated to AI functionality, doesn't bother claiming much of that memory for its workload?
     
  6. V3

    V3
    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    3,304
    Likes Received:
    5
    The guess is one cell = PowerPC + 8 APUs?

    That's like an overblown PS2 EE.

    4GHz producing 256 GFLOPS per cell is definitely doable, considering each cell has 8 vector processors.

    Now this central memory arrangement is similar to that IBM cellular article. That 1024-bit bus to the central memory, I assume, would be shared by several cells.

    At this moment I don't think each cell will have 1024 bits into the central memory. My guess would be that each cell has a 128- or 256-bit bus to the central memory. If 256-bit, then there should be 4 cells, which would give 1 TFLOPS.

    1 TFLOPS is not 1000 times PS2. Maybe the new GS with all the new features will make up the difference.
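    A quick back-of-the-envelope check of the numbers in this post; every figure here is thread speculation, not an official spec, and the 8 FLOPs/cycle per APU is simply what the 256 GFLOPS figure implies at 4 GHz.

```python
# All figures are speculation from the thread, not official specs.
clock_hz = 4e9           # rumoured 4 GHz clock
apus_per_cell = 8        # 8 vector processors (APUs) per cell
flops_per_cycle = 8      # implied per-APU rate: 256 GFLOPS / 4 GHz / 8 APUs
cells = 4                # one cell per 256-bit slice of the 1024-bit bus

per_cell = clock_hz * apus_per_cell * flops_per_cycle
total = per_cell * cells
print(per_cell / 1e9, "GFLOPS per cell")  # 256.0
print(total / 1e12, "TFLOPS total")       # 1.024, i.e. roughly 1 TFLOPS
```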

    I haven't gone through the patent yet. I'll save it for later.

    The Inquirer's info is from a patent, so it's not too bad :) Yes, that 64MB is for several cells to share; this is no surprise, since that old IBM article stated the advantages of SMP.
     
  7. Vince

    Veteran

    Joined:
    Apr 9, 2002
    Messages:
    2,158
    Likes Received:
    7
    I aim to stay away from this, but isn't this difficulty what GRID-based computing is about? I should hope that any implementation of a hardware design as elegant as this would find better solutions for problems than the weak and obvious idea of dividing tasks up as you stated.

    If the abstraction is there, who knows, or cares, what the underlying architecture is?

    Not to sound like I'm kissing Sony's ass, but do you have any idea how much 'power' there is in a sustainable TFLOP of computing? You sound like the guys in the 3D forum asking why not 4 or 6 TCUs per pipeline, as if bigger nomenclature is always better.

    Much more important than the big numbers is that there is a substantial amount of onboard memory with extremely low latency and high bandwidth. This is what Diefendorff talked about in his paper on the future of dynamic media and computing. Add to that the Yellowstone announcement, which could bring 30GB/sec of system-level bandwidth on top of the internal bandwidth, and you have a machine that could be quite formidable.
     
  8. Tagrineth

    Tagrineth murr
    Veteran

    Joined:
    Feb 14, 2002
    Messages:
    2,537
    Likes Received:
    25
    Location:
    Sunny (boring) Florida
    So PS2 only manages 1 GFLOP? Because that's what you get when you divide 1 TFLOP by 1,000...

    THAT's what he means. It doesn't quite add up mathematically.
     
  9. Vince

    Veteran

    Joined:
    Apr 9, 2002
    Messages:
    2,158
    Likes Received:
    7
    How dumb. The comment was talking about the performance of the console as a whole. I have the comment made by Okamoto, and even had the slides which showed this posted.

    Every part of the console doesn't have to be 1000x the performance, just the aggregate.

    Beyond this talk of he said, she said: sweetie, do you just want to argue? I mean, a TFLOP is roughly 1/30th the computing performance of the world's top-ranked supercomputer. It's about 1/10th that of the most advanced ASCI series the DoD and DoE use, all in a single chip. The electricity costs alone for an ASCI machine run into the millions of dollars. And they're going to put this in a frickin' game console.

    So, to make a comment like "It's not 1000x, so I can bash IBM/SCE/Toshiba and feel good about myself" is a bit retarded.
     
  10. Glonk

    Regular

    Joined:
    May 26, 2002
    Messages:
    334
    Likes Received:
    0
    Location:
    Markham, Ontario
  11. Kolgar

    Regular

    Joined:
    Mar 3, 2002
    Messages:
    698
    Likes Received:
    2
    Location:
    Wisconsin
    Yeah. :( I was with him until that last bit. :oops:

    Kolgar
     
  12. V3

    V3
    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    3,304
    Likes Received:
    5
    But you do sound like that :)

    Anyway I went through the patent.

    That's how the Inquirer got 4 GHz and 256 GFLOPS. But that's not final.

    It also seems to be flexible. Those Attached Processor Units can be varied.

    A glimpse of PS3.
     
  13. Vince

    Veteran

    Joined:
    Apr 9, 2002
    Messages:
    2,158
    Likes Received:
    7
    Um, OK... whatever, V3.

    Closest thing I've ever heard of would have to be the P10 architecture; you can dynamically form 'pipelines' for specific tasks, using the PU as an arbitrator of individual APUs (which are said to be SIMD-like) operating on data in their sandboxes. I'm not so clear on the 'Software Cell' part...
     
  14. randycat99

    Veteran

    Joined:
    Jul 24, 2002
    Messages:
    1,772
    Likes Received:
    12
    Location:
    turn around...
    Man, this thing just gets more and more interesting... :D
     
  15. V3

    V3
    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    3,304
    Likes Received:
    5
    According to that patent, the PU would have the same ISA. The APUs are more flexible.

    Check this part out.

     
  16. Vince

    Veteran

    Joined:
    Apr 9, 2002
    Messages:
    2,158
    Likes Received:
    7
    OK, but what exactly dictates or decides what data/code/etc. resides within the "body" of the software cell?

    I picked up on the global identification system, but I'm still curious how someone would code for this. It's definitely cool, though. I'm also curious as to what extent this distributed computing will go: as far as the initial speculation went? Or how much of this flexibility will be seen by developers; basically, how low-level they will code.
     
  17. megadrive0088

    Regular

    Joined:
    Jul 23, 2002
    Messages:
    700
    Likes Received:
    0
    Let's say the PS3's Cell (or Cells) has 1 TFLOPS of computational power, and let's assume that PS3 is significantly more efficient than PS2. The PS2 has 6.2 GFLOPS but is fairly inefficient; I have heard comments that PS2 sustains as little as 1 GFLOPS. Could it be that, if PS3's 1 TFLOPS is going to be sustained performance, PS3 could be closer to 1000 times PS2's performance than 1 TFLOPS vs. 6.2 GFLOPS might suggest at first thought?
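    The arithmetic behind that question is easy to lay out. Note the 1 GFLOPS sustained figure for PS2 is just the low-end number quoted in the thread, and treating the 1 TFLOPS target as sustained is an assumption.

```python
# Figures from the thread, not official numbers.
ps2_peak = 6.2       # GFLOPS, PS2 theoretical peak
ps2_sustained = 1.0  # GFLOPS, rumoured worst-case sustained rate
ps3_target = 1000.0  # GFLOPS (1 TFLOPS), assumed here to be sustained

peak_ratio = ps3_target / ps2_peak            # peak vs peak
sustained_ratio = ps3_target / ps2_sustained  # sustained vs sustained
print(round(peak_ratio), "x peak vs peak")        # ~161x
print(round(sustained_ratio), "x sustained")      # 1000x
```

    So peak-vs-peak the gap is only ~161x, but sustained-vs-sustained it could plausibly approach the oft-quoted 1000x.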
     
  18. V3

    V3
    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    3,304
    Likes Received:
    5
    I think it depends on the apps. This one looks at streaming MPEG.

    In this example, one APU examines the header of the MPEG data. So the APU doesn't have knowledge of what's in the software cell at first. So the determination of what goes into a software cell is not that important, as long as the header describes what it is.
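    The header-driven dispatch described above can be sketched as a tiny lookup: the receiving APU inspects only the cell's header and picks a handler, needing no advance knowledge of the payload. All names here are invented for illustration, not taken from the patent.

```python
# Hypothetical handlers for two kinds of software-cell payload.
def handle_mpeg(payload):
    return "decoded %d bytes of MPEG" % len(payload)

def handle_geometry(payload):
    return "transformed %d bytes of geometry" % len(payload)

HANDLERS = {"mpeg": handle_mpeg, "geometry": handle_geometry}

def process_cell(cell):
    # The receiving "APU" dispatches on the header alone; the payload
    # stays opaque until the chosen handler opens it.
    handler = HANDLERS[cell["header"]["type"]]
    return handler(cell["payload"])

result = process_cell({"header": {"type": "mpeg"}, "payload": b"\x00" * 16})
print(result)  # decoded 16 bytes of MPEG
```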
     
  19. megadrive0088

    Regular

    Joined:
    Jul 23, 2002
    Messages:
    700
    Likes Received:
    0
    Another possibility (of many) is that the graphics processor, Graphics Synth 3, has its own 1 TFLOPS 4-cell chip for geometry and lighting calculations.

    Also, it's very likely, almost guaranteed, that Graphics Synth 3 will be completely floating-point throughout its pipelines, like the current ATi R300 and Nvidia NV30, thus adding more performance to the equation, even if Graphics Synth 3 does not have its own Cell for geometry and lighting.
     
  20. V3

    V3
    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    3,304
    Likes Received:
    5
    The possibility is that one of the Processor Elements would have its APUs acting as pixel engines. So they might have a one-chip solution, which would be good for cost.
     

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.