How much work must the SPUs do to compensate for the RSX's lack of power?

Discussion in 'Console Technology' started by Commenter, Jun 16, 2010.

Thread Status:
Not open for further replies.
  1. swaaye

    swaaye Entirely Suboptimal
    Legend

    Joined:
    Mar 15, 2003
    Messages:
    8,591
    Likes Received:
    673
    Location:
    WI, USA
    The old G7x architecture has trouble with more complex shader programs. Dynamic branching is not a G7x strength, for example. ATI was ahead of the game with R520 for complex shaders compared to G7x. Xenos' design is somewhat like a hybrid of R520/R600 so it's possibly better than RSX at complex shaders.

    G7x also has performance issues if you go beyond bilinear filtering. They cheated in their PC drivers, with aggressive trilinear and anisotropic tweaks, to be more competitive.

    RSX is also likely limited to max 4X MSAA, but that probably doesn't matter for console-land.
     
    #21 swaaye, Jun 17, 2010
    Last edited by a moderator: Jun 17, 2010
  2. Weaste

    Newcomer

    Joined:
    Nov 13, 2007
    Messages:
    175
    Likes Received:
    0
    Location:
    Castellon de la Plana
    RSX seems to be a strange animal in that it does certain things very well yet others not so well, depending upon the context. No, it can't compete with the Xenos at certain things, as it doesn't have the raw bandwidth with the ROPs etc.

    Sometimes it's nice to see what real developers say about it. For example, the most recent developer comments on RSX that I've seen are at http://www.eurogamer.net/articles/digitalfoundry-lbp2-tech-interview

     
  3. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    ... which is why the SPUs come in handy to keep the RSX focused on the tasks at hand -- like having a dedicated vertex processor and a dedicated pixel processor working at the same time.

    The culling may take up roughly 40% of the SPU power, but there should be plenty to go around. They can fit other jobs in between culling, or even while culling.
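
To make "culling on the SPUs" concrete, here is a minimal sketch of one common form of it: dropping back-facing and zero-area triangles before they are handed to the GPU. It is an illustration only, not how any particular PS3 title does it. The function names, the counter-clockwise winding convention, and the sample data are all assumptions made for the example.

```python
def signed_area_2d(a, b, c):
    """Return twice the signed area of the screen-space triangle (a, b, c)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


def cull_triangles(vertices, indices):
    """Keep only front-facing, non-degenerate triangles.

    Assumption: counter-clockwise winding in screen space means front-facing,
    so a positive signed area is kept and zero or negative area is dropped.
    """
    kept = []
    for i0, i1, i2 in indices:
        area = signed_area_2d(vertices[i0], vertices[i1], vertices[i2])
        if area > 0:
            kept.append((i0, i1, i2))
    return kept


# Hypothetical sample data: already-projected 2D vertex positions.
verts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
tris = [
    (0, 1, 2),  # counter-clockwise: kept
    (0, 2, 1),  # clockwise: culled as back-facing
    (1, 1, 3),  # repeated vertex: culled as zero-area
]
visible = cull_triangles(verts, tris)
```

Only the surviving index triples would be passed on, which is the sense in which this kind of pre-pass saves the GPU from setting up triangles that can never be drawn.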
     
  4. swaaye

    swaaye Entirely Suboptimal
    Legend

    Joined:
    Mar 15, 2003
    Messages:
    8,591
    Likes Received:
    673
    Location:
    WI, USA
    It's interesting how that worked out. They bought an old-school GPU and threw in a funky new-school CPU. I don't think they knew how it was all going to work out. The fake pre-rendered videos and stupid marketing stuff were proof enough of that. I think the CPU and Blu-ray were mostly Sony pushing their own tech for some ROI and to push their corporate world view or whatever. Devs discovered that the CPU could be used to supplement the gimpy GPU. How "fortuitous", aside from the extra pain involved for the people working unreal hours to make the games.
     
  5. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    While the Cell is a general-purpose CPU, the PS3 is clearly designed to allow Cell to help with graphics. Otherwise, there'd be no need to have Cell read/write the video memory directly at high speed. I believe they profiled games for a year or more when they designed Cell. At that time, rendering was probably the only heavyweight job. Game physics and video/audio recognition AI were in their infancy (nothing much to profile).
     
    #25 patsu, Jun 18, 2010
    Last edited by a moderator: Jun 19, 2010
  6. T.B.

    Newcomer

    Joined:
    Mar 11, 2008
    Messages:
    156
    Likes Received:
    0
    40%? I'm curious, what title does that number come from?
     
  7. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    I remember reading Graham's post about Uncharted 2 using 40% of Cell for culling, but I can't find a link right now. What's the right figure, insider? :p
     
  8. jonabbey

    Regular

    Joined:
    Oct 12, 2006
    Messages:
    809
    Likes Received:
    1
    Location:
    Austin, TX
    Given that the Cell reads from RSX's RAM at 16MB/s, I wouldn't lean so hard on the Cell having fast read/write of the video memory. :wink:
     
  9. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    Ah yes, it's the other way round. The Cell can write to video memory fast for RSX to read.

    I believe the RSX also has a larger texture cache (maybe to smooth data access from the video and system memory?), but it's all NDA'ed.
     
  10. Squeak

    Veteran

    Joined:
    Jul 13, 2002
    Messages:
    1,262
    Likes Received:
    32
    Location:
    Denmark
    That was proven looong ago to be FUD. Yes, it's true, but copying the buffer over to CPU RAM has little impact on rendering.

    RSX has ca. 1.25 times the transistor budget of Xenos if you leave out the daughter die's EDRAM cells. Nvidia would have to be incredibly bad engineers for that not to make a difference.
    The EDRAM die is mainly a cost-saving measure. It's nice for a few things, but the features it offers are far from free. When some of the most high-profile exclusive and first-party games on the platform jump through hoops to use or not use it, something is wrong.
    As I already said before, there can be many reasons other than technical superiority why some games look slightly better on 360.
    What matters is that PS3 beat 360 pretty thoroughly when comparing the top games on either platform. It should: it's a year younger, and the hardware's first iteration didn't look like the components were thrown in with a shovel in the mid-eighties.
    I would like to see what numbers the poster who came up with the 70 to 80% figure used, because they can't be the pretty credible Wikipedia numbers I'm looking at, where RSX comes out on top more often than not.
     
    #30 Squeak, Jun 18, 2010
    Last edited by a moderator: Jun 18, 2010
  11. jonabbey

    Regular

    Joined:
    Oct 12, 2006
    Messages:
    809
    Likes Received:
    1
    Location:
    Austin, TX
    And RSX can read/write XDR fast, yeah.

    It's very nice that the SPUs are there, but I wish Sony hadn't handicapped the RSX so much. A 256-bit interface to the GDDR3 would have helped a whole lot.

    Everyone says that makes a system much more expensive and harder to price-reduce, unfortunately.
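
As a rough illustration of why bus width matters here: peak memory bandwidth scales linearly with interface width at a fixed data rate. The sketch below uses the standard formula (bytes per transfer times transfers per second). The 1.4 GT/s effective data rate is an assumed, illustrative value, not a figure taken from this thread, and the function name is made up for the example.

```python
def peak_bandwidth_gb_s(bus_width_bits, transfers_per_second):
    """Theoretical peak bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_second / 1e9


# Assumption: an illustrative effective data rate of 1.4 billion transfers/s.
ASSUMED_TRANSFERS_PER_SECOND = 1.4e9

bw_128 = peak_bandwidth_gb_s(128, ASSUMED_TRANSFERS_PER_SECOND)
bw_256 = peak_bandwidth_gb_s(256, ASSUMED_TRANSFERS_PER_SECOND)

print(f"128-bit: {bw_128:.1f} GB/s, 256-bit: {bw_256:.1f} GB/s")
```

Whatever data rate is plugged in, doubling the width doubles the theoretical peak. The cost side of the trade-off (more memory chips, more board traces) is not modelled here.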
     
  12. jonabbey

    Regular

    Joined:
    Oct 12, 2006
    Messages:
    809
    Likes Received:
    1
    Location:
    Austin, TX
    I wasn't intending FUD, I was just correcting patsu on the very specific claim he made.
     
  13. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    It's not FUD. It's just not an intended use case.
     
  14. scooby_dooby

    Legend

    Joined:
    May 28, 2005
    Messages:
    8,563
    Likes Received:
    145
    Location:
    E-town, Alberta
    Physics is an entirely different ballgame; I'm not talking about physics. Obviously more CPU resources allow for better physics, that goes without saying.

    I can see AI becoming burdensome if you have hundreds of AIs to control at once, but that has nothing to do with the quality of the AI, only the magnitude. So, more CPU power may make it more feasible to have more objects with AI, or allow you to not scale back your AI when supporting many objects, but it doesn't really do anything towards making an individual AI object "smarter". This is a programming/logic issue.

    Pathfinding can get fairly processor intensive for sure, but I'm thinking more of decision making.

    All AIs in games these days are still just as stupid as they were last generation, e.g. Halo had great AI last gen, and it has basically not improved whatsoever with this latest generation.
     
  15. Npl

    Npl
    Veteran

    Joined:
    Dec 19, 2004
    Messages:
    1,905
    Likes Received:
    6
    After 1-2 die shrinks they could've replaced the 256-bit GDDR3 RAM (512MB) with 128-bit GDDR5. Initially it would've been an added cost for sure, but at least you'd have an observable advantage in multiplatform games, instead of parity or a disadvantage for more money compared to XB360.
     
  16. Murakami

    Regular

    Joined:
    Jul 26, 2002
    Messages:
    443
    Likes Received:
    0
    Location:
    Padua, Italy
    Just a curiosity: if SPUs are used for vertex processing, which work is left to the RSX vertex shader units? Are they underutilised in the latest games?
     
  17. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    Again, not for recognition AI though. The software is smarter today, but it's still trying to catch up with live human actions, speech, and other forms of expression. MS just invested millions into this relatively new branch of gaming. Sony has been working on it since EyeToy. I think they both have a long way to go.
     
  18. Arwin

    Arwin Now Officially a Top 10 Poster
    Moderator Legend

    Joined:
    May 17, 2006
    Messages:
    18,063
    Likes Received:
    1,660
    Location:
    Maastricht, The Netherlands
    It's FUD in-so-much as the way these things work is misunderstood, I think? The RSX is the boss of the memory bus between itself and the other components for the most part. From what I understand, the RSX can stream textures from main memory at high speed, and read a framebuffer from there in the same way, as well as vertex data.

    But similarly, the RSX can push data straight at the Cell or even (only?) I think directly at an SPU, which can then process, pass on, and write to, say, a framebuffer or a texture in main memory, from where RSX can pull the data again.

    So the 16MB/s is just the Cell not being able to read fast from the graphics memory. So why is the GPU boss of graphics memory, while the main memory can be accessed so 'freely' by both the Cell and the GPU? This is because the main memory is RAMBUS stuff, made and slotted into the motherboard to be efficiently accessed by several components at once. The graphics memory, on the other hand, is GDDR memory designed to be very quickly accessed by one component only. Hence the RSX being master of it.

    That the Cell can in fact sort of write to it directly at all is because for DVD/Blu-ray playback, it's convenient to be able to use the Cell to process the compressed stream and put it straight into the framebuffer. The 4GB/s allotted to this is just for that purpose (but note that it was also useful in the Linux environment where the RSX was taken out of the picture completely for security purposes, in hindsight a correct decision I think).

    So at any rate, this means that the RSX has the high speed read/write necessary for close cooperation with the Cell processor and its SPUs to make an interesting, integrated rendering pipeline.

    Note that this also answers the question of why the RSX was linked to its graphics memory with a 128bit interface - the other 128bit is hooked up to main memory.
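
To put the two figures quoted in this thread side by side (16MB/s for Cell reads from graphics memory, 4GB/s for Cell writes to it), the sketch below estimates how long a single framebuffer would take to move at each rate. The 1280x720, 4-bytes-per-pixel buffer and the decimal interpretation of MB and GB are assumptions chosen for illustration, and the names are made up for the example.

```python
# Assumption: a 720p framebuffer with 32-bit colour.
WIDTH, HEIGHT, BYTES_PER_PIXEL = 1280, 720, 4
FRAME_BYTES = WIDTH * HEIGHT * BYTES_PER_PIXEL

# Rates as quoted in the thread, read as decimal units (an assumption).
CELL_READ_BYTES_PER_S = 16e6   # Cell reading from graphics memory
CELL_WRITE_BYTES_PER_S = 4e9   # Cell writing to graphics memory


def transfer_ms(num_bytes, bytes_per_second):
    """Time in milliseconds to move num_bytes at a sustained rate."""
    return num_bytes / bytes_per_second * 1000.0


read_ms = transfer_ms(FRAME_BYTES, CELL_READ_BYTES_PER_S)
write_ms = transfer_ms(FRAME_BYTES, CELL_WRITE_BYTES_PER_S)

print(f"read: {read_ms:.1f} ms, write: {write_ms:.2f} ms")
```

Under these assumptions the read path takes a couple of hundred milliseconds per frame while the write path takes under one, which is why the practical route is for RSX to push data to main memory rather than for Cell to pull it.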
     
  19. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    I suspect the (rumored) larger texture cache may have something to do with it.

    Can RSX push data to a Local Store directly? (I don't know either way.) That's rather interesting if true. I only know you can map the Local Stores and main memory into a logical global memory map.

    EDIT: So maybe writing into the right range of memory is equivalent to loading data into an SPU's Local Store?
     
  20. T.B.

    Newcomer

    Joined:
    Mar 11, 2008
    Messages:
    156
    Likes Received:
    0
    A lot less than that. ;)
    Maybe he was talking about all of geometry processing, which could make sense. I don't remember seeing Uncharted's SPU schedules, but 40% for culling seems excessive, unless you do some pretty sophisticated occlusion culling.

    In any case, 80ms are 256M cycles, or 1G cycles, if you account for SIMD. So even if you have 10M triangles in your scene, that would be 128cy per triangle.
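
The back-of-envelope budget above can be reproduced in a few lines. The 80 ms and 10M-triangle figures are taken from the post as given. The 3.2 GHz clock and the 4-wide SIMD factor are assumptions that happen to match the quoted 256M and 1G figures, and the variable names are made up for the example.

```python
# Assumptions: 3.2 GHz SPU clock, 4 lanes of SIMD per cycle.
CLOCK_HZ = 3.2e9
SIMD_WIDTH = 4

budget_seconds = 0.080       # 80 ms of SPU time, as quoted in the post
triangle_count = 10_000_000  # 10M triangles, as quoted in the post

scalar_cycles = CLOCK_HZ * budget_seconds
lane_cycles = scalar_cycles * SIMD_WIDTH
cycles_per_triangle = lane_cycles / triangle_count

print(f"{scalar_cycles / 1e6:.0f}M cycles, "
      f"{lane_cycles / 1e9:.2f}G lane-cycles, "
      f"{cycles_per_triangle:.0f} per triangle")
```

With exactly these inputs the per-triangle budget comes out at roughly 100 lane-cycles, the same order of magnitude as the 128 quoted in the post. Either way the point stands: that is a generous budget for a simple per-triangle test.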
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.