How much % acceleration is Cell for RSX

Discussion in 'Console Technology' started by ihamoitc2005, Jul 5, 2010.

  1. Akumajou

    Regular

    Joined:
    Nov 11, 2004
    Messages:
    725
    Likes Received:
    64
    It's nice to do these comparisons, but a console like the PS3 has custom development tools, from OpenGL/ES down to the very low-level LibGCM, and these are used in combination with Cell to increase performance. In a PC environment the graphics card is really doing all of the work and the CPU is just scaling... I could have missed something though.
     
  2. Mr Deap

    Newcomer

    Joined:
    Oct 5, 2007
    Messages:
    100
    Likes Received:
    0
    It's more like offloading the RSX by moving specific tasks onto the Cell and increasing its load, because doing it this way can shorten the overall frame rendering time.

    There are weak spots, though: a few tasks cannot be combined, because doing so would lower the overall performance or leave unwanted overlay artifacts in the end result.

    Cell load + RSX load <=> balanced against the target frame rendering time

    The reality is that the GPU load weighs so heavily in the overall frame rendering time that, even when the CPU is optimized to help with rendering, the balance of the load is still very much on the GPU side.

    If we compare apples to apples, I can see a 5 to 7 fps increase at most. With a target framerate of around 30 fps, that's a lot for a console system.

    For example, if the current build runs at 37 fps, you can add specific tasks for more eye candy on either the CPU or the GPU and spend that headroom down to the 30 fps target to get more rendering.
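
The headroom in that example is easier to see in milliseconds per frame than in fps. A minimal sketch of the arithmetic, using only the 37 fps and 30 fps figures from the example (the function names are just for illustration):

```python
def frame_time_ms(fps):
    """Convert a frame rate into the time available per frame, in milliseconds."""
    return 1000.0 / fps


def headroom_ms(current_fps, target_fps):
    """Extra milliseconds per frame that can be spent before dropping below the target."""
    return frame_time_ms(target_fps) - frame_time_ms(current_fps)


current = 37.0  # fps of the current build in the example
target = 30.0   # fps target in the example

print(f"{frame_time_ms(current):.2f} ms per frame at {current:.0f} fps")  # 27.03
print(f"{frame_time_ms(target):.2f} ms per frame at {target:.0f} fps")    # 33.33
print(f"{headroom_ms(current, target):.2f} ms of headroom")               # 6.31
```

So going from 37 fps down to the 30 fps target frees roughly 6.3 ms per frame to spend on extra CPU or GPU work.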

    Some tasks come in different quality levels and can increase the load on either target. Remember that a GPU and a CPU are both calculators in the end, and either can be given any task. Some tasks do better on one side or the other, but they still take load no matter what you do.

    The overall architecture also plays a role: latency adds more to the equation, and so does the specific task if a huge file is required.
     
  3. Arwin

    Arwin Now Officially a Top 10 Poster
    Moderator Legend

    Joined:
    May 17, 2006
    Messages:
    18,762
    Likes Received:
    2,639
    Location:
    Maastricht, The Netherlands
    Well I'm sure we can accumulate at least ballpark figures for a few things. For instance, apparently MSAA is very expensive on RSX. Let's say it decreases the level of detail you can handle by up to 60% for 4xMSAA. Then if you move your AA solution to MLAA on the SPUs, you can have a very big improvement in performance if that is where your primary bottleneck is.

    It's all down to the bottlenecks in your gaming engine, which will be different from game to game, and I think it's completely fair to say that this question just cannot be answered properly unless we take very specific examples or know the details of a particular game engine's requirements and bottlenecks. For all you know physics calculations are your bottleneck and it's not graphics at all.
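
A toy model shows why the answer depends on where the bottleneck is. It assumes RSX and the SPUs overlap perfectly, so the frame takes as long as the slower side. Every millisecond figure below is made up for illustration and is not a measurement from any game.

```python
def frame_ms(gpu_ms, spu_ms):
    """Toy model: GPU and SPU work overlap fully, so the slower side sets the frame time."""
    return max(gpu_ms, spu_ms)


def offload(gpu_ms, spu_ms, gpu_saved_ms, spu_cost_ms):
    """Move a task off the GPU: the GPU side gets cheaper, the SPU side gets more expensive."""
    return gpu_ms - gpu_saved_ms, spu_ms + spu_cost_ms


# Hypothetical GPU-bound frame: moving AA to the SPUs shortens the frame.
before = frame_ms(40.0, 20.0)
after = frame_ms(*offload(40.0, 20.0, gpu_saved_ms=10.0, spu_cost_ms=8.0))
print(before, after)  # 40.0 30.0

# Hypothetical SPU-bound frame (say, physics heavy): the same move makes it slower.
before = frame_ms(25.0, 33.0)
after = frame_ms(*offload(25.0, 33.0, gpu_saved_ms=10.0, spu_cost_ms=8.0))
print(before, after)  # 33.0 41.0
```

The same offload is a 25% win in the first case and a loss in the second, which is why a single percentage cannot be quoted without knowing the engine.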
     
  4. Laa-Yosh

    Laa-Yosh I can has custom title?
    Legend Subscriber

    Joined:
    Feb 12, 2002
    Messages:
    9,568
    Likes Received:
    1,455
    Location:
    Budapest, Hungary
    Code doesn't matter much if the internal architectures of the systems are completely different. The memory controllers, buses, bandwidths, latencies and such are not even on the same page, so it's worthless to try to draw any conclusions.

    Then it all depends on the actual content as well, is it a light pre-pass renderer, a standard forward renderer, is it using high poly assets, is there dynamic lighting and if yes, how many light sources, how complex are the vertex shaders (skinning, cloth, deformable objects and terrain) and so on. Every single game presents a different kind of workload, even within the FPS/TPS/racing/whatever genres.

    And on top of all this, bottlenecks are constantly changing even through the rendering of a single frame. Vertex load may start low, then go high, then sink low again, then drop to absolutely nothing; SPUs might have a limited amount of time to do anything meaningful because data will have to start going to RSX in order to complete the rendering before vsync; there may be game or physics or other code to run, etc etc.
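
The shifting-bottleneck point can be illustrated with a toy model in which a frame is a sequence of phases and each phase is limited by whichever unit is busier. The phase names and costs are invented for the sketch, and real hardware overlaps work in far messier ways.

```python
# Each phase: (name, vertex_ms, pixel_ms). All values are hypothetical.
PHASES = [
    ("shadow pass", 6.0, 1.0),
    ("opaque scene", 5.0, 8.0),
    ("transparents", 1.0, 5.0),
    ("post-process", 0.0, 6.0),
]


def frame_ms(phases, pixel_scale=1.0):
    """Sum over phases of the slower unit; pixel_scale shrinks pixel work in every phase."""
    return sum(max(vertex, pixel * pixel_scale) for _, vertex, pixel in phases)


def total_pixel_ms(phases):
    """Total pixel work across the frame, ignoring which unit is the bottleneck."""
    return sum(pixel for _, _, pixel in phases)


full = frame_ms(PHASES)
halved = frame_ms(PHASES, pixel_scale=0.5)
naive_saving = total_pixel_ms(PHASES) * 0.5

print(full, halved)                # 25.0 16.5
print(naive_saving, full - halved)  # 10.0 8.5
```

Halving the pixel work removes 10 ms of work on paper but only 8.5 ms from the frame, because two of the phases become vertex-bound once the pixel load drops.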


    This entire effort is meaningless, even the developers of any single game would probably have serious trouble giving even rough estimates.
     
  5. ultragpu

    Banned

    Joined:
    Apr 21, 2004
    Messages:
    6,242
    Likes Received:
    2,306
    Location:
    Australia
    Is it possible to predict how much more performance can be extracted from software optimization on the SPUs? AFAIK parallel processing and multicore programming are still in their infancy, and I keep hearing that this game reaches 50%, 100% of CELL etc. I've always wondered what the limit of software optimization is. Are we close to coding to the metal yet?
     
  6. Ruskie

    Veteran

    Joined:
    Mar 7, 2010
    Messages:
    1,291
    Likes Received:
    1
    Oh my, this is quite bad... but then again, don't forget this is on PC, so they can't code to the metal like on consoles. Not only that, those are 720p 4xAA, and the console version is a bit lower in res, but 60fps... Then again, RSX is an even worse GPU than the 7900GTX. The definite fact is, the SPUs are doing quite a lot of what RSX should do.
     
  7. ihamoitc2005

    Veteran

    Joined:
    Sep 21, 2005
    Messages:
    1,181
    Likes Received:
    15
    I disagree.

    We can estimate these things. That's why we can run game benchmarks for multiple games and get the approximate average relative performance of PC GPUs. Tom's Hardware and AnandTech do that for a living, no?

    The idea is to compare full RSX rendering (run a game on 7900GT, perhaps) at console resolutions and detail/AA/AF settings then compare with how much potential speed-up you can get by removing things that have large drag on performance like MSAA or post-processing.

    No need for precise numbers. No need for reengineering code. On many PC games you can turn things off to see speed up. Over a number of games you get approximate sense of how much can be gained and where by removing that load from the GPU.

    The key word is approximate.

    So, for example, if a 7900GT can run Crysis at 20fps at high settings (except 2 at medium) at 1024x768, how fast can it run if you turn off AA (move to SPU)? What if you lower shadow quality (simulate preprocess on SPU)? What if you remove depth of field (move to SPU)? Etc.
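
The arithmetic behind this kind of on/off test can be sketched as follows. Frame rates do not add up linearly, so the sketch converts to milliseconds per frame first. The 20 fps baseline is the figure quoted above; the 26 fps "AA off" number is purely hypothetical. The result is an upper bound that assumes the GPU stays the bottleneck and that the moved work costs the frame nothing on the SPU side.

```python
def feature_cost_ms(fps_on, fps_off):
    """GPU time per frame attributed to a feature, from fps with it on and off."""
    return 1000.0 / fps_on - 1000.0 / fps_off


def fps_without(fps_base, costs_ms):
    """Upper-bound fps if the listed per-frame costs are removed from the GPU."""
    remaining = 1000.0 / fps_base - sum(costs_ms)
    if remaining <= 0:
        raise ValueError("removed costs exceed the whole frame time")
    return 1000.0 / remaining


aa_cost = feature_cost_ms(fps_on=20.0, fps_off=26.0)  # 26 fps is a made-up figure
print(f"AA cost: {aa_cost:.1f} ms per frame")                    # 11.5
print(f"fps with AA off the GPU: {fps_without(20.0, [aa_cost]):.1f}")  # 26.0
```

Measuring each feature this way and summing the millisecond costs gives the approximate best case for moving them all off the GPU.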

    Example video Crysis Max settings on over-clocked 7900GT at 1024x768:

    http://www.youtube.com/watch?v=7Wmp2mS7KLc&feature=related

    This will also, perhaps, reveal if there's another bottleneck for RSX utilization.

    For example, people claim they can run Bioshock on PC at max settings at 1280x1024 on a 7900GT without problems. But RSX version is visibly poorer despite similar GPU architecture.

    Example video Bioshock at 1280x960 with following settings:

    1280 x 960 No AA / No AF

    Everything on High Quality

    Windowed mode: Off
    Vertical Sync: Off
    Shadow Maps: Off
    High Detail Shaders: On
    High Detail Post Processing: On
    Real Time Reflection: On
    Distortion: On
    Force Global Lighting: On

    30 - 80 fps

    http://www.youtube.com/watch?v=uKvxJiTAucg

    ________________________


    Note:

    Please do not mention any other consoles, make comparison-type comments, or talk about "porting" in this thread. This is purely to understand (without a dev-kit) what the starting expectation of RSX game performance is (measured in fps) where 100% of rendering is GPU based (most PC games), and to approximately simulate the highest expectation from off-loading certain tasks to the SPUs, such as AA, post-processing, etc., by turning off such features in the PC games.
     
    #27 ihamoitc2005, Jul 6, 2010
    Last edited by a moderator: Jul 6, 2010
  8. Laa-Yosh

    Laa-Yosh I can has custom title?
    Legend Subscriber

    Joined:
    Feb 12, 2002
    Messages:
    9,568
    Likes Received:
    1,455
    Location:
    Budapest, Hungary
    Your logic is flawed and I've already explained why. Please do try to understand it before posting the same wall of text again and again.


    What you're doing is like, let's take this seriously overweight 50+ years old male person, and see how fast he can run a mile in running shoes, in dancing shoes, without shoes.
    Then use this data to approximate how fast a cheetah would run 100 yards in the rain.

    Both are about someone running for some distance - but other than that it's completely unrelated.
     
  9. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
    Nah, it is a minimum of 1280x1024, 4xMSAA (who knows if TSAA was enabled by mistake or not changed back to disabled) and 16xAF, apart from whatever differences the versions have in quality settings at max quality. That alone is a helluva difference, though it seems odd that the 7900GTX, being in the same ballpark as the x19xxx series, can't even touch the min framerate of the x1950xt with its avg framerate. If anything, I would believe it had TSAA enabled, hence the high performance impact, since there are quite a few transparencies in MW2, right?
     
  10. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
    ihamoitc2005, what makes this quite flawed is that you can't measure it with any good precision, since it is not clear how well optimised each version is for its platform relative to the others.
     
  11. function

    function None functional
    Legend

    Joined:
    Mar 27, 2003
    Messages:
    5,854
    Likes Received:
    4,411
    Location:
    Wrong thread
    I think it's more likely that PCGH got the AA setting right, and it's just that the 7900 GTX sucks.
     
  12. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    I think on console, the developers target a very specific set of config. There are no parameters to turn on/off per se. So consistent measurement is hard due to lack of public data.

    There are some measurements which make sense, like: We can compare RSX performance with and without SPU culling, doing the same post-processing effect on RSX and SPUs. Unfortunately, these data are not available on the net. They may not be optimized "to the same level" in-house too since one of the alternatives is not shipped.

    The developers may also use completely different techniques because the SPU cores are more flexible than GPU cores.

    Take MLAA for example (since we have public data on it): Santa Monica studio mentioned 2xMSAA timing vs MLAA timing in GoW3. The visual outcome is very different. What exactly are we measuring if we simply determine the timing ratio between these 2 techniques? What do we mean by freeing up GPU time here?

    And each game budgets the resources differently. So cross game comparison may not make sense.

    In the end, it may be a hodge-podge of high level numbers that cannot be compared technically. To balance the view a little, it is also possible to find tasks whereby the RSX outruns the SPUs.
     
  13. ihamoitc2005

    Veteran

    Joined:
    Sep 21, 2005
    Messages:
    1,181
    Likes Received:
    15
    Who needs precision?

    We can make a good estimate with sample of many games. Also, PC games are all unoptimized.

    7900GT has identical ALU/TMU components as RSX.

    RSX is in a closed box, so it is much easier to get shader limited (the ideal situation).

    See this bioshock video:

    Max settings on 7900GT at 1280x960

    Shadowmap is off, No AA, No AF

    http://www.youtube.com/watch?v=uKvxJiTAucg

    Can those (shadows, AA) be moved to Cell or removed for console? If so, can RSX run Bioshock with those settings and resolution? Seeing this the PS3 Bioshock does not seem to be shader limited, no? Maybe 3D is possible at 720P or little bit less.

    This is fun to try.
     
    #33 ihamoitc2005, Jul 6, 2010
    Last edited by a moderator: Jul 6, 2010
  14. Lucid_Dreamer

    Veteran

    Joined:
    Mar 28, 2008
    Messages:
    1,210
    Likes Received:
    3
    I find it interesting that the same lack of precision is acceptable when comparing RSX to the performance of an Nvidia 7800 or 7900 GPU. If one is accepted, so should the other.

    I believe ihamoitc2005 is really just talking about supposed similar GPU performance on the same games (MW2 vs MW2). I think the mention of the Cell forced some people into a different mental posture. There are two different scenarios to be considered and they can be completely separated. One scenario is just GPU vs GPU on the same game title with certain features on and off. The other is what GPU tasks can the SPUs do.

    No one here should have an aversion to either one of those scenarios separately. It's been talked about many times in many threads here without the need to be absolutely precise.

    See, there have been many GPU comparisons to RSX. Why is one accepted and the other not? Neither is precise. :)
     
  15. ihamoitc2005

    Veteran

    Joined:
    Sep 21, 2005
    Messages:
    1,181
    Likes Received:
    15
    Precision

    You make a very good point, my friend. I also do not know why people want such high precision for this question. It is almost as if people are looking for reasons to not answer/think about it.

    I find it is a very interesting question for people who do not own the dev kits.
     
  16. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
    Well, if it is that interesting: I was asked to help out with the DF article "Can Crysis run on consoles?". The assimilation settings give the same visuals as the provided images compared to the island part of the CE3 tech demo. A 7900GT at 430MHz with 256MB VRAM, at 1280x720 with the assimilation settings, ran the scenery view part at, IIRC, somewhere around 20-25fps. Neither the OS nor the HW was in optimal condition, though (old drivers, faulty RAM stick, etc). The scene had about 1 to 1.2 million polygons per frame. It was also run on CE2, not CE3.
     
  17. ihamoitc2005

    Veteran

    Joined:
    Sep 21, 2005
    Messages:
    1,181
    Likes Received:
    15
    Interesting

    That is interesting my friend. Do you still have this construction?

    What is it you mean by assimilation settings?
     
  18. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
  19. ihamoitc2005

    Veteran

    Joined:
    Sep 21, 2005
    Messages:
    1,181
    Likes Received:
    15
  20. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264