Don't know how I missed this thread... but I will take a small stab at this. nVidia and ATI do not use the same definition of "Shader Operations". Rather, nVidia's usage of the term is about shader "instructions"... meaning they are counting the individual instructions that make up a component of a shader operation. A true shader operation is comprised of vector operations and a scalar operation, and it takes instructions to process those operations. Now both nVidia's NV50 and XENOS are capable of Vec4+Scalar... or 4 vector operations and one scalar operation per shader operation, and they do so in one cycle per shader unit. A shader unit consists of ALUs that perform vector operations and another ALU that performs a scalar operation... and each shader unit can perform ONE shader operation per cycle.
Now the NV50 has 24 pixel pipelines and each pixel pipeline has 2 shader units for a total of 48 shader units... and assuming the RSX is an uncrippled NV50 it would be the same on the RSX. XENOS has 48 pipelines that can be used for either pixel or vertex programs... and each one has 1 shader unit. That means that the 24 pixel pipelines together would pump out around 48 shader operations per cycle, and so would XENOS if all of its pipelines were used for pixel shader programs. As each shader operation has 5 components (4 vector and 1 scalar) that would come out to around 240 shader components per cycle for both. Given the estimated clock speed of 550MHz for the RSX and the stated clock speed of 500MHz for XENOS, that would come out to around 132 billion shader components per second or 26.4 billion shader operations per second for the RSX (again assuming an uncrippled NV50 at 550MHz) and around 120 billion shader components per second or 24 billion shader operations per second for XENOS.
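For anyone who wants to check that arithmetic, here is a minimal sketch in Python. It assumes exactly the figures above (one shader operation per unit per cycle, 5 components per operation, the estimated 550MHz and 500MHz clocks); the function and variable names are just mine for illustration.

```python
# Peak shader throughput from the figures in the post.
# Unit counts and clock speeds are the post's estimates, not confirmed specs.

COMPONENTS_PER_OP = 5  # Vec4 + scalar: 4 vector components plus 1 scalar


def peak_throughput(shader_units, clock_hz):
    """Return (ops_per_second, components_per_second), one op per unit per cycle."""
    ops_per_second = shader_units * clock_hz
    return ops_per_second, ops_per_second * COMPONENTS_PER_OP


rsx_units = 24 * 2    # 24 pixel pipelines x 2 shader units each
xenos_units = 48 * 1  # 48 unified pipelines x 1 shader unit each

rsx_ops, rsx_components = peak_throughput(rsx_units, 550_000_000)
xenos_ops, xenos_components = peak_throughput(xenos_units, 500_000_000)

print(f"RSX:   {rsx_ops / 1e9:.1f} billion ops/s, "
      f"{rsx_components / 1e9:.0f} billion components/s")
print(f"XENOS: {xenos_ops / 1e9:.1f} billion ops/s, "
      f"{xenos_components / 1e9:.0f} billion components/s")
```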
So there you have it... 132 billion shader components per second (RSX) versus 120 billion shader components per second (XENOS). Is this the end of the story? Nope... there are two big areas that must be considered... architectural caveats and efficiency. Now I said that XENOS has 48 pipelines that can be used for either pixel or vertex programs... that means just that... and you don't want to use all of your pipelines for pixel data as that would leave no pipelines for geometry... that is unless XENON is capable of processing geometry in lieu of the GPU (and while I think the capacity is there... I am not sure, so for the sake of this argument I won't assume it does). So the actual number of pipelines for processing pixel shader programs will likely be less than 48... which makes 24 billion shader operations per second the upper bound for XENOS... and assuming your average Xbox 360 game takes about 8-10 pipelines (this is not a set value... it REALLY depends on the game) for vertex programs, that would leave around 38-40 pipelines for pixel shader programs or around 19-20 billion shader operations per second on average... again this is not a set number as it is dependent on the game.
As for the RSX... as it is based on the Geforce NV3x/4x/5x marchitecture there is one important thing to remember about how the shader units are arranged in the pixel pipelines. You remember that I said there were 2 shader units per pixel pipeline, right? What I didn't say is that one of those shader units is coupled to a texture unit in each of the pixel pipelines... meaning that if a texture operation needs to be done you LOSE one shader unit from that pixel pipeline... and typically in a game texture operations are being done nearly all of the time. So given an uncrippled NV50 at 550MHz that would actually be around 13.2-19.8 billion shader operations per second (that is, somewhere between 24 and 36 of the 48 shader units being available).
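Those two caveats can be sketched in a few lines of Python. This is only a rough model: the 8-10 vertex pipelines are the guess from above, and the `texturing_fraction` parameter is my own way of expressing the RSX range (1.0 means every pipeline has its texture-coupled shader unit busy, 0.5 means half of them do), not something either vendor has stated.

```python
# Architectural caveats applied to the peak numbers.
# All constants are the post's estimates; texturing_fraction is a
# hypothetical knob introduced here for illustration only.

XENOS_PIPELINES = 48
XENOS_CLOCK_HZ = 500_000_000
RSX_PIPELINES = 24
RSX_CLOCK_HZ = 550_000_000


def xenos_pixel_ops(vertex_pipelines):
    """Ops/s left for pixel work once some unified pipelines do vertex work."""
    pixel_pipelines = XENOS_PIPELINES - vertex_pipelines
    return pixel_pipelines * XENOS_CLOCK_HZ


def rsx_pixel_ops(texturing_fraction):
    """Ops/s when a fraction of pipelines lose one of their 2 shader units."""
    units_lost = RSX_PIPELINES * texturing_fraction
    units_available = RSX_PIPELINES * 2 - units_lost
    return units_available * RSX_CLOCK_HZ


xenos_low, xenos_high = xenos_pixel_ops(10), xenos_pixel_ops(8)
rsx_low, rsx_high = rsx_pixel_ops(1.0), rsx_pixel_ops(0.5)

print(f"XENOS: {xenos_low / 1e9:.1f}-{xenos_high / 1e9:.1f} billion ops/s")
print(f"RSX:   {rsx_low / 1e9:.1f}-{rsx_high / 1e9:.1f} billion ops/s")
```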
So if anything the RSX and XENOS are pretty close in terms of the pixel shader operations they can perform per second... with perhaps a small edge for XENOS on average.
Then we go to the second concern... efficiency. Now this is a factor that is a little harder to judge, but based on experience in dealing with Geforce cards (which I use at home and at work) their average efficiency is around 30-40%. Shader units will NEVER operate at their peak speed... so it is important to improve the efficiency and utilization of those shader units. Shader units stall VERY often during a game for various reasons... and ATI has taken it upon themselves to make changes to their marchitecture to improve utilization by introducing concepts such as "Ultra Threading". Through extensive multithreading of the shader units you remove most of the stalls, so the units are nearly always doing something instead of doing nothing. This was introduced on XENOS as well as on ATI's newest PC GPU, the Radeon x1000 series. ATI's Radeon x800 series GPUs were already more efficient than nVidia's in terms of actual shader performance even though they were at a major disadvantage in shader units (as the Radeon cards only have 1 full shader unit per pixel pipeline). Even though the Geforce 7800GTX has more than twice the shader power of the Radeon x1800XT, the x1800XT still manages to completely destroy the Geforce 7800GTX in actual pixel shader performance (as reflected in the pixel shader heavy games that were benchmarked). It was stated by ATI that the average efficiency for XENOS was 95%... and based on what I have seen the Radeon x1000 series comes close to that and probably slightly exceeds the utilization rate of XENOS.
The previous Radeon x850 had an average efficiency (stated by ATI) of around 50-60%, and the Radeon x1800XT has the same shader arrangement and number of shader units, so the x1000 series nearly doubled its efficiency over the previous generation Radeon. Why am I saying all of this? Well, let's see what happens to those effective numbers when we start to talk about efficiency and how it will play out in real world applications.
I stated that XENOS is capable of a maximum of 24 billion shader operations per second, but an average scenario would see only 19-20 billion shader operations per second as not all pipelines would be used for pixel shader programs. I also said that the RSX (assuming an uncrippled NV50 at 550MHz) is capable of a maximum of 26.4 billion shader operations per second, but an average scenario would see only 13.2-19.8 billion shader operations per second due to the fact that texture operations would remove one shader unit per pixel pipeline. Now let's see what happens when we apply the efficiency claims... the RSX would have a similar average efficiency to the Geforce 7800 GTX (as they are basically the same GPU)... and that would mean roughly 50-60% average efficiency. That would mean that the RSX would exhibit roughly 6.6-11.9 billion shader operations per second in a real world application on average. XENOS is stated to have an efficiency of around 95%, and after seeing the effects of ATI's implementation of threading on the Radeon x1000 series I am inclined to believe this... but for the sake of argument I am going to use an 80-95 percent efficiency range. If what ATI stated is accurate then that would mean that XENOS would exhibit roughly 15.2-19 billion shader operations per second in a real world application on average.
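Here is that efficiency step as a small Python sketch, assuming the ranges just given (in billions of shader operations per second). Pairing the low end of one range with the low end of the other, and high with high, is how the figures above were put together; the helper name is mine.

```python
# Apply an efficiency range to an effective throughput range.
# Inputs are in billions of shader operations per second and come
# straight from the estimates in the post.

def apply_efficiency(ops_range, efficiency_range):
    """Scale (low, high) ops by (low, high) efficiency, pairing low with low."""
    low_ops, high_ops = ops_range
    low_eff, high_eff = efficiency_range
    return low_ops * low_eff, high_ops * high_eff


rsx_real = apply_efficiency((13.2, 19.8), (0.50, 0.60))
xenos_real = apply_efficiency((19.0, 20.0), (0.80, 0.95))

print(f"RSX:   {rsx_real[0]:.1f}-{rsx_real[1]:.1f} billion ops/s")
print(f"XENOS: {xenos_real[0]:.1f}-{xenos_real[1]:.1f} billion ops/s")
```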
Here is a recap on the numbers...
GPU Configuration...
*RSX > 24 pixel pipelines with 2 shader units each (1 coupled to a texture unit) at 550MHz.
*XENOS > 48 unified pipelines with 1 shader unit each at 500MHz.
Maximum theoretical shader operations (a shader operation is comprised of vector operations and a scalar operation)...
*RSX > 48 shader units at 550MHz = 26.4 billion shader operations per second.
*XENOS > 48 shader units at 500MHz = 24 billion shader operations per second.
Accounting for architectural caveats...
*RSX > When texture operations are done you lose a shader unit per pixel pipeline... so you would have only 13.2-19.8 billion shader operations per second of the maximum 26.4 billion shader operations per second available.
*XENOS > Pipelines have to be used for geometry (as far as I know) so considering your average x360 game using roughly 8-10 of those 48 pipelines for geometry you would have roughly 19-20 billion of the maximum 24 billion shader operations per second available.
Accounting for architectural efficiency...
*RSX > Much like the Geforce line of GPUs it would exhibit roughly 50-60% of its effective shader performance in a real world application. So you would have only 6.6-11.9 billion shader operations per second of the 13.2-19.8 billion shader operations per second maximum effective.
*XENOS > It is stated to have an efficiency of around 95% and stands to be extremely efficient as it achieves a very high utilization rate, but I am using an 80-95% range for this argument. So you would have only 15.2-19 billion shader operations per second of the 19-20 billion shader operations per second maximum effective (on average).
So considering everything XENOS should perform anywhere from roughly 1.3 to 2.9 times faster than the RSX at 550MHz in pixel shader operations in a real world environment, depending on which ends of the two ranges you compare. Of course it remains to be seen if this will be the case as unified shaders are a new evolution of GPU pipelines.
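As a rough check on that ratio, assuming the effective ranges from the recap (6.6-11.9 for the RSX and 15.2-19 for XENOS, in billions of shader operations per second), the extremes can be divided against each other. This is only a sketch and the variable names are mine.

```python
# Ratio of XENOS to RSX real-world shader throughput, using the
# estimated ranges from the recap (billions of shader ops per second).

rsx = (6.6, 11.9)
xenos = (15.2, 19.0)

# Least favourable case for XENOS: its low end against the RSX high end.
ratio_low = xenos[0] / rsx[1]
# Most favourable case for XENOS: its high end against the RSX low end.
ratio_high = xenos[1] / rsx[0]
# Midpoint of each range compared directly.
ratio_mid = (sum(xenos) / 2) / (sum(rsx) / 2)

print(f"XENOS / RSX: {ratio_low:.2f}x to {ratio_high:.2f}x "
      f"(midpoints: {ratio_mid:.2f}x)")
```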
Ah... one more thing... about pixel shader operations, they have nothing to do with the performance of GRAPHICS as a whole. Rather... they are important for games that rely heavily on pixel shader programs... but for games that don't they are more or less useless (after all a pixel shader program does just that... it modifies the color of a pixel).
What *IS* important is the number of pipelines in a graphics processor (and with that said XENOS can push nearly twice as many pixels per second as the RSX, with 48 pipelines at 500MHz versus 24 at 550MHz... again assuming you are using all 48 for pixel data) and your texture performance. XENOS can perform 16 texture operations per cycle (and another 16 for vertex textures) in parallel with the shader units for a total of 8 billion texels/second. The RSX can perform UP TO 24 texture operations per cycle (but if both shader units in a pipeline are being used you lose that texture operation), so you have anywhere from 0-13.2 billion texels/second (or a running average of 6.6 billion texels/second).
Finally... geometry performance... while a lot of PC games are not geometry limited... console games typically are, and it is important to keep this in mind for the next generation of games, which will be pushing many many polygons per frame. The RSX with its 8 vertex pipelines can perform a theoretical maximum of 1.1 billion polygons/sec (though it will never come anywhere close to this) and XENOS with its 48 pipelines can perform up to 6 billion polygons/sec, though it is hardware limited to 500 million. Due to the number of pipelines in XENOS the Xbox 360 is *FAR* more likely to actually achieve that 500 million polygons per second claim... so the Xbox 360 will have a significant advantage when it comes to geometry in the next generation of consoles.
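The texture and geometry figures can be reproduced with the short Python sketch below. One caveat: the polygon numbers only come out as quoted if each pipeline finishes one polygon every 4 cycles. That divisor is inferred from the numbers themselves (8 x 550MHz / 4 = 1.1 billion, 48 x 500MHz / 4 = 6 billion) and is an assumption of the sketch, not a published spec.

```python
# Texel and polygon rates from the figures in the post.
# CYCLES_PER_POLYGON is an inferred assumption (see the note above).

RSX_CLOCK_HZ = 550_000_000
XENOS_CLOCK_HZ = 500_000_000
CYCLES_PER_POLYGON = 4
XENOS_POLYGON_CAP = 500_000_000  # stated hardware limit

# Texture rates: texture operations per cycle x clock.
xenos_texels = 16 * XENOS_CLOCK_HZ
rsx_texels_max = 24 * RSX_CLOCK_HZ
rsx_texels_avg = (0 + rsx_texels_max) // 2  # midpoint of the 0-to-max range

# Geometry rates: pipelines x clock / cycles per polygon.
rsx_polygons = 8 * RSX_CLOCK_HZ // CYCLES_PER_POLYGON
xenos_polygons_raw = 48 * XENOS_CLOCK_HZ // CYCLES_PER_POLYGON
xenos_polygons = min(xenos_polygons_raw, XENOS_POLYGON_CAP)

print(f"Texels/s   XENOS: {xenos_texels / 1e9:.1f}B, "
      f"RSX: up to {rsx_texels_max / 1e9:.1f}B (avg {rsx_texels_avg / 1e9:.1f}B)")
print(f"Polygons/s RSX: {rsx_polygons / 1e9:.1f}B theoretical, "
      f"XENOS: {xenos_polygons_raw / 1e9:.1f}B raw, "
      f"capped at {xenos_polygons / 1e6:.0f}M")
```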
It helps if the same math is used across the two GPUs to derive the correct numbers, but as always there are smaller details that can affect things. I am giving the bigger picture here with those numbers... but it still remains to be seen how they actually perform (and we already know how the Geforce 7800 GTX performs).
There you have it... enjoy... have a good night.