How much acceleration (in %) does Cell provide for RSX?

A precise answer is, of course, impossible. No one has asked for that.

However, shadows, post-processing, etc. have large performance impacts, and it's unlikely that the console and PC versions will have significantly different GPU code.

Also, a benefit of doing the test on PC is that the code is not optimized for specific hardware. This means that whatever gains we see may even underestimate what is possible in the fixed hardware environment of a console.

So, if Crysis can run at high settings at 1024x768 on a 7900GT at X average fps, what happens if we disable AA? What if we lower AF? And so on.

It's nice to do these comparisons, but a console like the PS3 has custom development tools, from OpenGL ES down to the very low-level LibGCM, which are used in combination with Cell to increase performance. In a PC environment, the graphics card is really doing all of the work and the CPU is mostly just feeding it... I could have missed something, though.
 
It's more like offloading the RSX by adding specific tasks to the Cell, increasing its load, because doing it this way can shorten the overall frame rendering time.

There are weak spots, though, as a few tasks cannot be combined: doing so either lowers the overall performance or the end result shows unwanted overlay artifacts.

Cell load + RSX load <=> balanced against the frame rendering time target

The reality is that the GPU load carries a lot of weight in the overall frame rendering time, even in combination with the CPU: even with optimizations that use the CPU to help with rendering, the load balance still sits heavily on the GPU side.

Well, if we compare apples to apples, I can see a 5 to 7 fps increase at most. With a target framerate of around 30 fps, that is a lot for a console system.

For example, if the current build runs at 37 fps, you can add tasks for more eye candy on either the CPU or the GPU, with specific tasks, until you come back down to the 30 fps target and get more rendering out of each frame.
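
To make that concrete in frame-time terms, here is a rough back-of-the-envelope sketch (the 37 fps and 30 fps figures are just the ones from the example above, nothing more):

Code:
# Rough frame-budget arithmetic for the 37 fps -> 30 fps example above.
# A few fps of headroom translates into a few milliseconds per frame
# that can be spent on extra CPU or GPU tasks.

current_fps = 37.0   # current build
target_fps = 30.0    # console framerate target

current_frame_ms = 1000.0 / current_fps  # ~27.0 ms per frame
target_frame_ms = 1000.0 / target_fps    # ~33.3 ms per frame

headroom_ms = target_frame_ms - current_frame_ms
print(f"~{headroom_ms:.1f} ms per frame available for extra tasks")  # ~6.3 ms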

Some tasks have different quality trade-offs and can increase the load on either target. Remember that a GPU and a CPU are both calculators in the end, and either can be given almost any task. Some tasks run better on one side or the other, but they still take load no matter what you do.

The overall architecture also plays a role, such as latency, which adds more to the equation, as does the specific task if a huge amount of data is required.
 
Well I'm sure we can accumulate at least ballpark figures for a few things. For instance, apparently MSAA is very expensive on RSX. Let's say it decreases the level of detail you can handle by up to 60% for 4xMSAA. Then if you move your AA solution to MLAA on the SPUs, you can have a very big improvement in performance if that is where your primary bottleneck is.
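
As a quick sanity check of what that would mean, here is a tiny sketch using only the hypothetical 60% figure from above (not a measured number):

Code:
# Hypothetical illustration of the "up to 60%" 4xMSAA cost mentioned above.
# If 4xMSAA leaves only 40% of the detail budget you would otherwise have,
# moving AA off the GPU (e.g. MLAA on the SPUs) recovers the rest,
# but only if AA really is the primary bottleneck.

budget_with_msaa = 0.40      # fraction of detail budget left with 4xMSAA on
budget_without_msaa = 1.00   # AA handled elsewhere (MLAA on SPUs)

potential_gain = budget_without_msaa / budget_with_msaa
print(f"Best-case headroom: {potential_gain:.1f}x")  # 2.5x, upper bound only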

It's all down to the bottlenecks in your game engine, which will be different from game to game, and I think it's completely fair to say that this question just cannot be answered properly unless we take very specific examples or know the details of a particular game engine's requirements and bottlenecks. For all you know, physics calculations are your bottleneck and it's not graphics at all.
 
However, shadows, post-processing, etc. have large performance impacts, and it's unlikely that the console and PC versions will have significantly different GPU code.

Code doesn't matter much if the internal architectures of the systems are completely different. The memory controllers, buses, bandwidths, latencies and such are not even on the same page, so it's worthless to try to draw any conclusions.

Then it all depends on the actual content as well, is it a light pre-pass renderer, a standard forward renderer, is it using high poly assets, is there dynamic lighting and if yes, how many light sources, how complex are the vertex shaders (skinning, cloth, deformable objects and terrain) and so on. Every single game presents a different kind of workload, even within the FPS/TPS/racing/whatever genres.

And on top of all this, bottlenecks are constantly changing, even through the rendering of a single frame. Vertex load may start low, then go high, then sink low again, then turn to absolutely nothing; SPUs might have a limited amount of time to do anything meaningful because data will have to start going to RSX in order to complete the rendering before vsync; there may be game or physics or other code to run, etc.


This entire effort is meaningless, even the developers of any single game would probably have serious trouble giving even rough estimates.
 
Is it possible to predict how much more software optimization can be extracted from the SPUs? AFAIK parallel processing and multicore programming are still in their infancy, and I keep hearing that this game reaches 50% of Cell, that one 100%, etc. I've always wondered what the limit of software optimization is. Are we close to coding to the metal yet?
 
Here's a benchmark showing how a 7900 GTX (considerably faster than RSX, with *massively* more bandwidth) fares with Modern Warfare 2:

http://www.pcgameshardware.com/aid,...nchmarks-of-the-latest-Call-of-Duty/Practice/

And that's with an overclocked Core i7 as CPU.

Horrible. Anyone complaining about "lazy ports" would do well to remember just how badly RSX's brethren have fared outside the sheltered waters of PS3 land.

Oh my, this is quite bad... but then again, don't forget this is on PC, so they can't code to the metal like on consoles. Not only that, those results are at 720p with 4xAA, and the console version is a bit lower in resolution but runs at 60 fps... Then again, RSX is an even worse GPU than the 7900 GTX. What is definite is that the SPUs are doing quite a lot of what RSX should be doing.
 
The memory controllers, buses, bandwidths, latencies and such are not even on the same page, so it's worthless to try to draw any conclusions.

I disagree.

We can estimate these things. That's why we can run benchmarks for multiple games and get an approximate average of the relative performance of PC GPUs. Tom's Hardware and AnandTech do that for a living, no?

The idea is to start from full RSX-style rendering (run a game on a 7900GT, perhaps) at console resolutions and detail/AA/AF settings, then see how much potential speed-up you can get by removing things that are a large drag on performance, like MSAA or post-processing.

No need for precise numbers, and no need to re-engineer code. In many PC games you can turn things off and see the speed-up. Over a number of games you get an approximate sense of how much can be gained, and where, by removing that load from the GPU.

The key word is approximate.

So, for example, if a 7900GT can run Crysis at 20 fps at high settings (except two at medium) at 1024x768, how fast can it run if you turn off AA (moved to the SPUs)? What if you lower shadow quality (simulating shadow pre-processing on the SPUs)? What if you remove depth of field (moved to the SPUs)? And so on.
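
If it helps, here is a minimal sketch of the bookkeeping I have in mind, assuming you have on/off benchmark pairs for each feature (the game labels and fps numbers below are placeholders, not measurements):

Code:
# Sketch of the estimate: for each PC game, take the average fps at
# console-like settings with a feature enabled and disabled, convert to
# frame time, and treat the difference as the approximate GPU time that
# could be reclaimed by moving that feature to the SPUs. Placeholder data.

benchmarks = {
    # label: (fps with feature on, fps with feature off)
    "Game A, 4xMSAA": (20.0, 27.0),
    "Game B, 4xMSAA": (25.0, 31.0),
    "Game C, depth of field": (30.0, 34.0),
}

for label, (fps_on, fps_off) in benchmarks.items():
    ms_saved = 1000.0 / fps_on - 1000.0 / fps_off
    print(f"{label}: ~{ms_saved:.1f} ms of GPU frame time potentially reclaimable")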

Example video: Crysis at max settings on an overclocked 7900GT at 1024x768:

http://www.youtube.com/watch?v=7Wmp2mS7KLc&feature=related

This might also reveal whether there's another bottleneck limiting RSX utilization.

For example, people claim they can run Bioshock on PC at max settings at 1280x1024 on a 7900GT without problems, yet the RSX version is visibly poorer despite the similar GPU architecture.

Example video: Bioshock at 1280x960 with the following settings:

1280 x 960 No AA / No AF

Everything on High Quality

Windowed mode: Off
Vertical Sync: Off
Shadow Maps: Off
High Detail Shaders: On
High Detail Post Processing: On
Real Time Reflection: On
Distortion: On
Force Global Lighting: On

30 - 80 fps

http://www.youtube.com/watch?v=uKvxJiTAucg

________________________


Note:

Please do not mention any other consoles, make comparison-type comments, or talk about "porting" in this thread. This is purely to understand (without a dev kit) what the starting expectation of RSX game performance (measured in fps) is when 100% of rendering is GPU-based (as in most PC games), and to approximate the highest expectation from off-loading certain tasks, such as AA and post-processing, to the SPUs by turning off such features in the PC games.
 
Your logic is flawed and I've already explained why. Please do try to understand it before posting the same wall of text again and again.


What you're doing is like taking a seriously overweight 50-something man and seeing how fast he can run a mile in running shoes, in dancing shoes, and without shoes.
Then use this data to approximate how fast a cheetah would run 100 yards in the rain.

Both are about someone running for some distance - but other than that it's completely unrelated.
 
Oh my, this is quite bad... but then again, don't forget this is on PC, so they can't code to the metal like on consoles. Not only that, those results are at 720p with 4xAA, and the console version is a bit lower in resolution but runs at 60 fps... Then again, RSX is an even worse GPU than the 7900 GTX. What is definite is that the SPUs are doing quite a lot of what RSX should be doing.

Nah, it is at minimum 1280x1024, 4xMSAA (who knows whether TSAA was enabled by mistake or not switched back to disabled) and 16xAF, on top of whatever differences the versions have in quality settings at max quality. That alone is a hell of a difference, though it seems odd that the 7900 GTX, which is in the same ballpark as the X19xx series, can't even touch the minimum framerate of the X1950 XT with its average framerate. If anything, I would believe TSAA was enabled, hence the high performance impact, since there are quite a lot of transparencies in MW2, right?
 
ihamoitc2005, what makes this quite flawed is that you can't measure it with any good precision, since it is not clear how well optimised each version is for its platform relative to the other.
 
Nah, it is at minimum 1280x1024, 4xMSAA (who knows whether TSAA was enabled by mistake or not switched back to disabled) and 16xAF, on top of whatever differences the versions have in quality settings at max quality. That alone is a hell of a difference, though it seems odd that the 7900 GTX, which is in the same ballpark as the X19xx series, can't even touch the minimum framerate of the X1950 XT with its average framerate. If anything, I would believe TSAA was enabled, hence the high performance impact, since there are quite a lot of transparencies in MW2, right?

I think it's more likely that PCGH got the AA setting right, and it's just that the 7900 GTX sucks.
 
I disagree.

We can estimate these things. That's why we can run benchmarks for multiple games and get an approximate average of the relative performance of PC GPUs. Tom's Hardware and AnandTech do that for a living, no?

The idea is to start from full RSX-style rendering (run a game on a 7900GT, perhaps) at console resolutions and detail/AA/AF settings, then see how much potential speed-up you can get by removing things that are a large drag on performance, like MSAA or post-processing.

I think on console, developers target one very specific configuration. There are no parameters to turn on and off per se, so consistent measurement is hard due to the lack of public data.

There are some measurements that make sense, like comparing RSX performance with and without SPU culling, or doing the same post-processing effect on RSX and on the SPUs. Unfortunately, this data is not available on the net. The alternatives may not even be optimized "to the same level" in-house, since one of them never ships.

The developers may also use completely different techniques because the SPU cores are more flexible than GPU cores.

Take MLAA for example (since we have public data on it): Santa Monica Studio mentioned the 2xMSAA timing vs. the MLAA timing in GoW3. The visual outcome is very different. What exactly are we measuring if we simply take the timing ratio between these two techniques? What do we mean by "freeing up GPU time" here?

And each game budgets its resources differently, so cross-game comparison may not make sense.

In the end, it may be a hodge-podge of high-level numbers that cannot be compared technically. To balance the view a little, it is also possible to find tasks where the RSX outruns the SPUs.
 
Who needs precision?

ihamoitc2005, what makes this quite flawed is that you can't measure it with any good precision, since it is not clear how well optimised each version is for its platform relative to the other.

We can make a good estimate with a sample of many games. Also, PC games are all relatively unoptimized.

The 7900GT has ALU/TMU components identical to RSX's.

RSX is in a closed box, so it is much easier to get shader-limited (the ideal situation).

See this Bioshock video:

Max settings on a 7900GT at 1280x960

Shadow maps off, no AA, no AF

http://www.youtube.com/watch?v=uKvxJiTAucg

Can those (shadows, AA) be moved to Cell, or simply removed, for the console? If so, can RSX run Bioshock at those settings and resolution? Seeing this, the PS3 version of Bioshock does not seem to be shader-limited, no? Maybe 3D is possible at 720p, or a little bit less.

This is fun to try.
 
Code doesn't matter much if the internal architectures of the systems are completely different. The memory controllers, buses, bandwidths, latencies and such are not even on the same page, so it's worthless to try to draw any conclusions.

Then it all depends on the actual content as well, is it a light pre-pass renderer, a standard forward renderer, is it using high poly assets, is there dynamic lighting and if yes, how many light sources, how complex are the vertex shaders (skinning, cloth, deformable objects and terrain) and so on. Every single game presents a different kind of workload, even within the FPS/TPS/racing/whatever genres.

And on top of all this, bottlenecks are constantly changing, even through the rendering of a single frame. Vertex load may start low, then go high, then sink low again, then turn to absolutely nothing; SPUs might have a limited amount of time to do anything meaningful because data will have to start going to RSX in order to complete the rendering before vsync; there may be game or physics or other code to run, etc.


This entire effort is meaningless, even the developers of any single game would probably have serious trouble giving even rough estimates.
I find it interesting that the same lack of precision is acceptable when comparing RSX to the performance of an Nvidia 7800 or 7900 GPU. If one comparison is accepted, so should the other be.

I believe ihamoitc2005 is really just talking about supposed similar GPU performance on the same games (MW2 vs MW2). I think the mention of the Cell forced some people into a different mental posture. There are two different scenarios to be considered and they can be completely separated. One scenario is just GPU vs GPU on the same game title with certain features on and off. The other is what GPU tasks can the SPUs do.

No one here should have an aversion to either one of those scenarios separately. It's been talked about many times in many threads here without the need to be absolutely precise.

Here's a benchmark showing how a 7900 GTX (considerably faster than RSX, with *massively* more bandwidth) fares with Modern Warfare 2:

http://www.pcgameshardware.com/aid,...nchmarks-of-the-latest-Call-of-Duty/Practice/

And that's with an overclocked Core i7 as CPU.

Horrible. Anyone complaining about "lazy ports" would do well to remember just how badly RSX's brethren have fared outside the sheltered waters of PS3 land.
See, there have been many GPU comparisons to RSX. Why is one accepted and the other not? Neither are precise. :)
 
Precision

I find it interesting that the same lack of precision is acceptable when comparing RSX to the performance of an Nvidia 7800 or 7900 GPU. If one comparison is accepted, so should the other be.

I believe ihamoitc2005 is really just talking about supposed similar GPU performance on the same games (MW2 vs MW2). I think the mention of the Cell forced some people into a different mental posture. There are two different scenarios to be considered and they can be completely separated. One scenario is just GPU vs GPU on the same game title with certain features on and off. The other is what GPU tasks can the SPUs do.

No one here should have an aversion to either one of those scenarios separately. It's been talked about many times in many threads here without the need to be absolutely precise.


See, there have been many GPU comparisons to RSX. Why is one accepted and the other not? Neither are precise. :)

You make a very good point, my friend. I also do not know why people want such high precision for this question. It is almost as if people are looking for reasons not to answer or think about it.

I find it a very interesting question for people who do not own dev kits.
 
Well, if it is that interesting: I was asked to help out with the DF article "Can Crysis run on consoles?". The assimilation settings give the same visuals as the provided images, compared against the island part of the CE3 tech demo. A 7900GT at 430MHz with 256MB VRAM, running at 1280x720 with the assimilation settings, managed the scenery-view part at, IIRC, somewhere around 20-25 fps. Though neither the OS nor the hardware was in optimal condition (old drivers, a faulty RAM stick, etc.). The scene had about 1-1.2 million polygons per frame. It was also run on CE2, not CE3.
 
Interesting

Well, if it is that interesting: I was asked to help out with the DF article "Can Crysis run on consoles?". The assimilation settings give the same visuals as the provided images, compared against the island part of the CE3 tech demo. A 7900GT at 430MHz with 256MB VRAM, running at 1280x720 with the assimilation settings, managed the scenery-view part at, IIRC, somewhere around 20-25 fps. Though neither the OS nor the hardware was in optimal condition (old drivers, a faulty RAM stick, etc.). The scene had about 1-1.2 million polygons per frame. It was also run on CE2, not CE3.

That is interesting, my friend. Do you still have this setup?

What do you mean by "assimilation settings"?
 