How much % acceleration does Cell give RSX?

I have looked at other threads and I cannot find this answer.

Many developers now use the SPUs to take load off the RSX. This has the effect of "accelerating" the RSX by freeing up cycles for RSX-only tasks. It can benefit texturing too.

So now, with MLAA, even AA is off-loaded from the RSX to the SPUs.

So I am thinking this.

If a GPU no longer needs to do AA or post-processing, what is the effective performance gain? There is culling to consider as well.

Can this be estimated?

For example:

Suppose I have a shader-heavy game on a PC that is 100% GPU rendering, with standard MSAA and GPU post-processing.

If I disable AA, what is the frame-rate gain?

Then if I disable post-processing, what is the frame-rate gain?

etc.

So what is the net gain?

Could we try this with a moddable game such as Crysis and see what the frame-rate is, as a curious experiment?
 
Well, if you disable shadows in Crysis it can almost double your framerate.
 
Generating the shadow maps isn't something you'd want to do on the SPUs, because the RSX has dedicated hardware for rasterization and z-test. Sampling the shadow map to perform the visibility test is also a bad fit for the SPUs, since you need multiple "random" texture accesses. SPUs work best when the memory access is predictable and coherent. The same is true for GPUs of course, but they at least handle it better since they have a cache structure. On the SPUs you'd basically have to build your own software cache.
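To make the software-cache idea concrete, here is a minimal sketch in plain C (not actual SPU code; memcpy stands in for the DMA an SPU would issue, and all names and sizes are made up): a direct-mapped cache of texture tiles, where a miss pulls in a whole tile so that nearby accesses hit cheaply.

```c
#include <stdint.h>
#include <string.h>

#define TILE       32                /* 32x32-texel tiles                 */
#define CACHE_WAYS 64                /* tiles resident in "local store"   */

typedef struct {
    uint32_t tag;                    /* tile id + 1; 0 means empty slot   */
    uint8_t  texels[TILE * TILE];    /* tile payload                      */
} TileSlot;

static TileSlot cache[CACHE_WAYS];

/* Fetch one texel from an 8-bit texture (tex_width assumed a multiple
 * of TILE). On a miss the whole tile is copied in -- memcpy standing in
 * for an SPU DMA -- so nearby, coherent accesses hit cheaply. "Random"
 * shadow-map lookups defeat exactly this, which is the poor-fit
 * argument above. */
static uint8_t sample(const uint8_t *texture, uint32_t tex_width,
                      uint32_t x, uint32_t y)
{
    uint32_t tx   = x / TILE, ty = y / TILE;
    uint32_t tile = ty * (tex_width / TILE) + tx;
    TileSlot *slot = &cache[tile % CACHE_WAYS];

    if (slot->tag != tile + 1) {     /* miss: "DMA" the tile in */
        for (uint32_t row = 0; row < TILE; ++row)
            memcpy(&slot->texels[row * TILE],
                   &texture[(ty * TILE + row) * tex_width + tx * TILE],
                   TILE);
        slot->tag = tile + 1;
    }
    return slot->texels[(y % TILE) * TILE + (x % TILE)];
}
```

The point is visible in the structure: coherent accesses amortise one "DMA" over many texel reads, while random lookups miss almost every time.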
 
It is doable, but I would bet that the SPUs are usually used to cull invisible polygons while the RSX is rendering the shadow map.
There should be a quite nice speedup in common scenes.
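For illustration, a rough sketch of what such a culling pass could look like, in plain C rather than real SPU intrinsics and DMA lists: walk the index buffer, drop back-facing triangles, and emit a compacted index list for the GPU to consume.

```c
#include <stddef.h>

/* Illustrative sketch only (plain C, not SPU code) of the triangle
 * culling pass described above: drop back-facing triangles and emit a
 * compacted index list for the GPU. */

typedef struct { float x, y, z; } Vec3;

/* Signed area of the projected triangle; negative = back-facing,
 * assuming counter-clockwise front faces in screen space. */
static float signed_area_2d(Vec3 a, Vec3 b, Vec3 c)
{
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

/* Returns the number of surviving indices written to out_idx. */
static size_t cull_backfaces(const Vec3 *screen_verts,
                             const unsigned *idx, size_t num_idx,
                             unsigned *out_idx)
{
    size_t out = 0;
    for (size_t i = 0; i + 2 < num_idx; i += 3) {
        Vec3 a = screen_verts[idx[i]];
        Vec3 b = screen_verts[idx[i + 1]];
        Vec3 c = screen_verts[idx[i + 2]];
        if (signed_area_2d(a, b, c) > 0.0f) {   /* front-facing: keep */
            out_idx[out++] = idx[i];
            out_idx[out++] = idx[i + 1];
            out_idx[out++] = idx[i + 2];
        }
    }
    return out;
}
```

A real implementation would also do frustum and zero-area rejection and stream vertex blocks through local store, but the compaction idea is the same.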
 
This is another of those unanswerable questions, as it all depends on what you're trying to speed up. If you're wanting an answer that equates RSX+Cell to a higher-clocked RSX, "Cell overdrives RSX to performance equivalent to 750MHz," you won't get anywhere. Cell takes on workloads to enable RSX to run as fast as it can on other workloads. It's not about accelerating RSX but optimising it, so it can run closer to 100%, but it can't ever run faster than it can actually run! If a workload is a poor fit for RSX (vertex work), then having Cell do it because it's faster means the overall work rate is higher, but RSX isn't any faster.

So actually I started this post wrong, as the question is answerable: 0%. RSX runs at the same speed as always; smart devs just get it working on problems it's good at instead of expecting it to do everything.
 
Misunderstanding

It is doable, but I would bet that the SPUs are usually used to cull invisible polygons while the RSX is rendering the shadow map.
There should be a quite nice speedup in common scenes.

Of course it is not possible to increase clock speed.

What I ask is this.

On a PC, you can increase the frame-rate of COD4, Crysis, etc. simply by turning off shadows, depth of field, and so on.

Cell taking over such functions from the GPU is similar to turning those functions off (except that, unlike on the PC, you don't actually lose them).

So what % performance (frame-rate) speed-up can one expect when moving those functions to the SPUs (the effect on GPU performance being similar to turning those features off in PC games)?

I hope that is not confusing.

Nebula and Alstrong have got what I am asking, I think.

As for shadow maps on the SPUs, don't games do partial shadow-map processing already?

Also, for depth of field, this doc says the GPU alone takes 15ms but with an SPU pre-process it is only 3ms. That is a big improvement, no?

It also says RSX post-processing is >33% of frame-time. So if post-processing is done on the SPUs only, is that not a >50% improvement in RSX availability?

http://www.technology.scee.net/file...09/DeferredLightingandPostProcessingonPS3.ppt
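A quick sanity check on that arithmetic (taking the slide's figure at face value): if a fraction f of the RSX frame goes to post-processing, moving it to the SPUs frees headroom of f/(1-f).

```latex
\[
\text{gain} = \frac{f}{1 - f}, \qquad
f = \tfrac{1}{3} \;\Rightarrow\; \frac{1/3}{2/3} = 0.5 = 50\%
\]
```

So >33% of frame-time freed does indeed work out to >50% more RSX availability, assuming nothing else becomes the bottleneck.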

Also, now they do AA on the SPUs, not the GPU. Will that also help the frame-rate?

Thank you.
 
Cell taking over such functions from the GPU is similar to turning those functions off (except that, unlike on the PC, you don't actually lose them).

So what % performance (frame-rate) speed-up can one expect when moving those functions to the SPUs (the effect on GPU performance being similar to turning those features off in PC games)?
That depends entirely on what you're doing and how, though! If you're using RSX to calculate 3D volume shadows and tying up 80% of RSX's performance, moving that to Cell would be equivalent to a 5x increase in GPU performance, as that 20% you had left over for rendering the scene sans volumetric shadows is now up to 100%. Cell can be worth anything from 1% to... 100%, hypothetically.
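A tiny sketch of the arithmetic behind that 5x figure, under the same perfect-overlap assumption (the function name is mine, not from any SDK):

```c
#include <stdio.h>

/* Hypothetical effective GPU "speedup" when a fraction f of the GPU's
 * frame time is moved elsewhere, assuming perfect overlap (no sync
 * stalls, no added latency): speedup = old_time / new_time = 1 / (1 - f). */
static double offload_speedup(double f)
{
    return 1.0 / (1.0 - f);
}

int main(void)
{
    /* 80% of RSX time on volume shadows, moved to Cell: 5x */
    printf("f = 0.80 -> %.1fx\n", offload_speedup(0.80));
    /* a third of the frame on post-processing: 1.5x (= +50%) */
    printf("f = 1/3  -> %.2fx\n", offload_speedup(1.0 / 3.0));
    return 0;
}
```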
 

Looking at PC games like COD4 and Crysis, I wonder if maybe >100% is possible.

For example:

If a PC game is normally 30fps but, with shadows, post-processing, MSAA, etc. turned off, it is now 90fps, that's a 200% increase, no?
 
So what % performance (frame-rate) speed-up can one expect when moving those functions to the SPUs (the effect on GPU performance being similar to turning those features off in PC games)?

But you are asking about a best-case scenario here, right?

You imply that moving stuff to the SPUs does not increase the overall frame time! You imply that you do not introduce latency (IIRC the GOWIII dev team mentioned that this was difficult for their MLAA implementation) and that Cell can truly work in parallel with the RSX, without either of the two stalling or anything like that!

If we assume this best-case scenario, where you really get 100% of the gain when you move stuff to Cell, I think your idea with the PC setup (maybe one could even take a similar graphics card) is really cool... as we would get an estimate of the maximum benefit of moving something from the RSX to Cell!
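That latency point can be made concrete. A common way to overlap the two processors is to pipeline whole frames; the sketch below is a made-up simulation in C, not from any real engine, but it shows where the extra frame of latency comes from: the SPUs post-process frame N while the GPU is already rendering frame N+1.

```c
#include <stdio.h>

/* Simulated two-stage frame pipeline: the GPU renders frame N+1 while
 * the SPUs post-process frame N, so the displayed image always lags
 * the GPU by one full frame. Purely illustrative. */

enum { NUM_FRAMES = 5 };

int main(void)
{
    int gpu_frame = 0;    /* frame the GPU is rendering this tick          */
    int spu_frame = -1;   /* frame the SPUs are post-processing (-1: none) */

    for (int tick = 0; gpu_frame < NUM_FRAMES || spu_frame >= 0; ++tick) {
        printf("tick %d: GPU renders frame %d, SPUs post-process frame %d\n",
               tick, gpu_frame < NUM_FRAMES ? gpu_frame : -1, spu_frame);
        /* hand the finished GPU frame to the SPUs for the next tick */
        spu_frame = (gpu_frame < NUM_FRAMES) ? gpu_frame : -1;
        ++gpu_frame;
    }
    return 0;
}
```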
 
Not to mention that a PC's bottlenecks are very different from those of the RSX. A super high end system might produce significant increases just because its other resources (vertex processing) are far less limited and it could push ten times as many polygons. And so on.

The point is that you can't make any assumptions based on how a game's PC version behaves, particularly if it's an unspecified configuration.
 

What if we had a PC with a 7900GT to test the latest versions of COD, UT, Doom, Crysis, Elder Scrolls, Assassin's Creed, etc.?

Maybe start with 30fps settings, then turn things off to see results?

Or maybe start with max settings and turn things off to see if playable frame-rate is possible?

It could be fun to experiment and see.

Sadly, I do not have such a PC or such games.
 
Seems Shifty summed this up best. RSX isn't going to be accelerated. You offload the things that RSX isn't as good at to Cell. RSX hasn't gotten any faster; it's just doing the work it's better at and hopefully being fed enough work that it isn't underutilized. I'm not sure you could ever boil that work down into a real meaningful metric. You could probably speak to total utilization of the console, but those numbers can be remarkably meaningless. I could utilize a CPU near 100% with inefficient code.
 
Acceleration is a metaphor


Not literal, my friend.

If you have played PC games you know you can greatly increase performance (FPS) by turning off AA, shadows, depth-of-field, etc. This is the heart of the question.

Utilization is not the question. Rather it is the performance increase (FPS/resolution) for a given original workload. For example, you have a game like COD4 on a 7900GT. What is your performance on max settings at 1024x768 (maybe similar to console settings?). You turn off AA, what is your performance? You turn off post-processing, what is your performance? etc.

We know all this can be done on Cell now, so we can test to get a rough idea of what the performance gain is.

If the original frame-rate is 30fps, after the changes is it +33% (40fps)? +100% (60fps)? +200% (90fps)?
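For what it's worth, those percentages check out; a trivial snippet to verify them (nothing console-specific here):

```c
#include <stdio.h>

/* Relative frame-rate gain from an old fps to a new fps, in percent. */
static double fps_gain_percent(double old_fps, double new_fps)
{
    return (new_fps / old_fps - 1.0) * 100.0;
}

int main(void)
{
    printf("30 -> 40 fps: +%.0f%%\n", fps_gain_percent(30.0, 40.0)); /* +33%  */
    printf("30 -> 60 fps: +%.0f%%\n", fps_gain_percent(30.0, 60.0)); /* +100% */
    printf("30 -> 90 fps: +%.0f%%\n", fps_gain_percent(30.0, 90.0)); /* +200% */
    return 0;
}
```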
 
What if we had a PC with a 7900GT to test the latest versions of COD, UT, Doom, Crysis, Elder Scrolls, Assassin's Creed, etc.?

Here's a bench showing how a 7900 GTX (considerably faster, and with *massively* more bandwidth, than RSX) fares with Modern Warfare 2:

http://www.pcgameshardware.com/aid,...nchmarks-of-the-latest-Call-of-Duty/Practice/

And that's with an overclocked Core i7 as the CPU.

Horrible. Anyone complaining about "lazy ports" would do well to remember just how badly RSX's brethren have fared outside the sheltered waters of PS3 land.
 

That's great, my friend.

Now we can halve the resolution, disable AA, use much lower AF quality, disable post-processing, etc. and see the new frame-rate.

Also, that graph is for max detail.

What detail level do you feel the console versions are made for?

If there is such a chart for the 7900GT that would be great, but if not we can fudge/adjust for clock speed, no?

Thank you.

___________

P.S. I am surprised by the poor COD MW2 performance of the 7900 GTX. Maybe there is something unique about the PC version of this game on 7-series Nvidia cards. I know people can run Crysis (high settings) on a 7900GT with better performance than that.
 
Utilization is not the question. Rather it is the performance increase (FPS/resolution) for a given original workload.
And it's the nature of the workload that defines the performance gain via your metric. Using the tired but ever-appropriate car analogy, you are in essence asking what speed advantage is gained by using a Corvette alongside a JCB digger. If you want the JCB to dig holes, having a Corvette available will gain you no advantage. If you want to win a drag race, using the Corvette alone will fetch you much better results. If you want to carry some furniture between cities, the two will achieve more working together than the JCB would manage on its own.

I can see your intention is to try and pin down how much benefit Cell is to rendering, how much it can help a GPU, but the real answers don't lie with some nice straightforward metrics. Comparing your PC game, for example, you have no idea of the state of optimisation in the underlying code, so you cannot conclude that if disabling shadows on the PC gains 20% FPS, moving shadows from RSX to Cell would gain 20% FPS. Maybe it would gain more, depending on where the bottlenecks are for RSX, or maybe it'd gain less. Maybe it'd gain performance that'd be repurposed for adding more objects, which doesn't fetch you any FPS increase. The benefits are case-by-case only, and will be ever changing as new ideas grow. e.g. MLAA is an option now that wasn't available a year ago. Developers will always be finding better ways to share workloads between the components of a console, and what provided a 10% framerate gain shifting work from GPU to Cell in 2007 may enable a 30% increase with the same sort of workload in a current game thanks to better SPE code.
 
I think you are overcomplicating a bit.

A precise answer is, of course, impossible. No one has asked for that.

However, shadows, post-processing, etc. have large impacts, and it is unlikely that the console and PC versions will have significantly different GPU code.

Also, a benefit of doing the test on a PC is that the code is not optimized for specific hardware. This means whatever gains we get may even underestimate what is possible in the fixed-hardware environment of a console.

So, if Crysis can run at high settings at 1024x768 on a 7900GT at X avg fps, what happens if we disable AA? What if we lower AF? Etc.
 