In my opinion the true secret of RSX is worse fill-rate bandwidth than Xenos with MSAA, and probably some superiority in pixel shader processing and texel rate (translated by Altavista):
Xenos Shader Performance
Shawn Hargreaves's blog says "The Xbox GPU is a shading monster!", so I tried comparing it with my own PC (XPS M1210).
The shader used is a Julia set shader, whose instruction count easily exceeds 500.
Since it contains no texture instructions, it measures pure arithmetic (ALU) performance.
The results are as follows.
XPS M1210: GeForce Go 7400 (450MHz) -> 2.308fps (for RSX, maybe multiply by 6 for more ALUs and by 1.22 for the clock advantage)
XBOX360: Xenos (500MHz) -> 17.657fps
The result is that Xenos is about 7.65 times faster than the GeForce Go 7400.
GeForce Go 7400: 4 pixel shaders * 2 ALU = 8 ALU (RSX has 6 times as many in its pixel shaders)
Xenos: 3 shader pipes * 16 ALU = 48 ALU
Going by the number of ALUs, does that make it roughly equal to a GeForce 7800 GTX?
http://texhnologix.blogzine.jp/texhnologix/2006/12/xenos_shader_pe.html
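As a quick check of the numbers above, here is a minimal Python sketch (not from the original blog) that recomputes the measured speedup and the ALU ratio. The RSX line only applies the guessed scale factors from my note in parentheses (6x ALUs, 1.22x clock) to the Go 7400 result; it is a rough estimate, not a measurement.

```python
# Frame rates and ALU counts as quoted in the post.
GO7400_FPS = 2.308
XENOS_FPS = 17.657
GO7400_ALUS = 4 * 2   # 4 pixel shaders * 2 ALU
XENOS_ALUS = 3 * 16   # 3 shader pipes * 16 ALU

measured_speedup = XENOS_FPS / GO7400_FPS
alu_ratio = XENOS_ALUS / GO7400_ALUS

# Hypothetical RSX estimate: guessed scale factors (6x ALUs, 1.22x clock)
# applied to the Go 7400 result. This is NOT a measured value.
rsx_estimate_fps = GO7400_FPS * 6 * 1.22

print(f"Xenos vs Go 7400: {measured_speedup:.2f}x measured, {alu_ratio:.0f}x by ALU count")
print(f"Scaled RSX guess: {rsx_estimate_fps:.2f} fps vs Xenos {XENOS_FPS} fps")
```

The measured gap (about 7.65x) is larger than the ALU ratio (6x), so in this test Xenos does a bit better than the ALU count alone would suggest.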
Xenos MADD Performance
Using a shader made up only of multiply-add (MADD) operations, I tried measuring Xenos peak performance.
The shader is a very simple one: 1024 multiply-adds on 4-element vectors.
However, the order of the multiply-add instructions is arranged so that the NV40 architecture reaches its highest efficiency.
The results are as follows.
GeForce Go 7400 (450MHz) -> 2.885fps (for RSX, multiply by 6 for more ALUs and by 1.22 for more clock: 450MHz to 550MHz in the pixel shader pipe)
Xenos (500MHz) -> 19.543fps
(19.543fps/2.885fps) * (450MHz/500MHz) = 6.10 times
That is roughly 6 times, which matches the ratio of ALU counts nicely.
(1280 * 720) * 1024op * 2.885fps = 2.722Gops
(1280 * 720) * 1024op * 19.543fps = 18.443Gops
So these are the peak values of effective performance.
The effective value is about 75% of the theoretical value, which I think is quite excellent.
The one difficulty with the NV40 family is that you have to pay attention to instruction order to keep ALU1 and ALU2 both working efficiently.
http://texhnologix.blogzine.jp/texhnologix/2006/12/xenos_madd_perf.html
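The MADD arithmetic above can be reproduced with a short Python sketch. The theoretical peak here assumes each ALU retires one 4-element multiply-add per clock; that assumption is mine and is not stated in the blog, but it is consistent with the "about 75%" figure.

```python
PIXELS = 1280 * 720      # D4 resolution
OPS_PER_PIXEL = 1024     # multiply-adds per pixel in the test shader

def effective_gops(fps):
    """Measured multiply-add throughput in Gops."""
    return PIXELS * OPS_PER_PIXEL * fps / 1e9

def theoretical_gops(alus, clock_mhz):
    """Peak throughput, ASSUMING one vec4 MADD per ALU per clock."""
    return alus * clock_mhz / 1e3

go7400 = effective_gops(2.885)
xenos = effective_gops(19.543)
go7400_eff = go7400 / theoretical_gops(8, 450)
xenos_eff = xenos / theoretical_gops(48, 500)

# Speedup with the clock difference factored out.
per_clock_ratio = (19.543 / 2.885) * (450 / 500)

print(f"Go 7400: {go7400:.3f} Gops ({go7400_eff:.1%} of assumed peak)")
print(f"Xenos:   {xenos:.3f} Gops ({xenos_eff:.1%} of assumed peak)")
print(f"Per-clock ratio: {per_clock_ratio:.2f}x")
```

Under that assumption both chips land at roughly three quarters of peak, and the per-clock ratio comes out close to the 6x ALU ratio.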
Xenos Fill-Rate Performance
I had been curious about the effect of Xenos's EDRAM for some time, so I measured the frame rate with 64x overdraw at D4 resolution (1280 * 720).
That means 58.982 Mpixels are drawn per frame.
(1280 * 720) pixels * 64 overdraw = 58.982 Mpixels
Without alpha blending, the results are as follows.
-- G72M --
MSAAx1: 58.982 Mpixels * 15.953 fps = 941 Mpixels/sec (11.292 GB/s) (for RSX, multiply by 1.84 for more bandwidth and by 1.11 for more clock)
MSAAx2: 58.982 Mpixels * 8.010 fps = 472 Mpixels/sec (11.328 GB/s)
MSAAx4: 58.982 Mpixels * 3.997 fps = 236 Mpixels/sec (11.328 GB/s)
-- Xenos --
MSAAx1: 58.982 Mpixels * 56.497 fps = 3332 Mpixels/sec (39.984 GB/s)
MSAAx2: 58.982 Mpixels * 55.814 fps = 3292 Mpixels/sec (79.008 GB/s)
MSAAx4: 58.982 Mpixels * 54.895 fps = 3238 Mpixels/sec (155.415 GB/s)
(Z read + Z write + Color write) = 12 bytes/pixel
Because the G72M's Intellisample has its compression feature removed, performance drops in proportion to the number of MSAA samples.
Still, getting 11 GB/s when the actual memory bandwidth is only 7.2 GB/s is a puzzle. Perhaps, when the early Z test against the hierarchical Z buffer determines that a pixel clearly has to be drawn, the per-pixel Z test is skipped and the Z read is omitted?
Next, with alpha blending, the results are as follows.
-- G72M --
MSAAx1: 58.982 Mpixels * 9.150 fps = 540 Mpixels/sec (8.635 GB/s)
MSAAx2: 58.982 Mpixels * 4.615 fps = 272 Mpixels/sec (8.710 GB/s)
MSAAx4: 58.982 Mpixels * 2.301 fps = 136 Mpixels/sec (8.686 GB/s)
-- Xenos --
MSAAx1: 58.982 Mpixels * 56.497 fps = 3332 Mpixels/sec (53.317 GB/s)
MSAAx2: 58.982 Mpixels * 55.814 fps = 3292 Mpixels/sec (105.345 GB/s)
MSAAx4: 58.982 Mpixels * 54.895 fps = 3238 Mpixels/sec (207.220 GB/s)
(Z read + Z write + Color read + Color write) = 16 bytes/pixel
I knew that alpha blending on Xenos was supposed to be free, but actually trying it is impressive. Fill performance on the G72M falls by about 40%, while on Xenos it does not change at all. The effect of the EDRAM is tremendous.
http://texhnologix.blogzine.jp/texhnologix/2007/06/xenos_fillrate_.html
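The fill-rate and bandwidth figures above follow from the frame rate, the MSAA sample count and the bytes touched per sample. A minimal Python sketch of that arithmetic (function names are mine, for illustration only):

```python
PIXELS_PER_FRAME = 1280 * 720 * 64   # D4 resolution with 64x overdraw

def fill_rate_mpixels(fps):
    """Pixels drawn per second, in Mpixels/sec."""
    return PIXELS_PER_FRAME * fps / 1e6

def bandwidth_gbs(fps, msaa_samples, bytes_per_sample):
    """Implied ROP bandwidth: every MSAA sample costs bytes_per_sample."""
    return PIXELS_PER_FRAME * fps * msaa_samples * bytes_per_sample / 1e9

NO_BLEND = 12   # Z read + Z write + Color write
BLEND = 16      # Z read + Z write + Color read + Color write

g72m_1x = bandwidth_gbs(15.953, 1, NO_BLEND)
xenos_4x_blend = bandwidth_gbs(54.895, 4, BLEND)

# Drop in G72M fill performance when alpha blending is enabled (MSAAx1).
g72m_blend_drop = 1 - 9.150 / 15.953

print(f"G72M MSAAx1, no blend: {fill_rate_mpixels(15.953):.0f} Mpixels/sec, {g72m_1x:.3f} GB/s")
print(f"Xenos MSAAx4, blend:   {fill_rate_mpixels(54.895):.0f} Mpixels/sec, {xenos_4x_blend:.3f} GB/s")
print(f"G72M drop with blend:  {g72m_blend_drop:.1%}")
```

The last digits can differ slightly from the listed values because the listed GB/s figures were computed from already rounded Mpixels/sec.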
Xenos Fill-Rate Performance (2)
Next, I measured with a texture applied (1024 * 1024, 32bpp).
-- G72M --
ROP: 58.982 Mpixels * 8.010 fps = 472 Mpixels/sec (5.664 GB/s) (for RSX, multiply by 4.1 to 6 for more texture bandwidth, at least when counting XDR/FlexIO access)
TEX: (1024 * 1024 * 32bpp) * 64 * 8.010 fps = 2.147 GB/s
-- Xenos --
ROP: 58.982 Mpixels * 45.024 fps = 2656 Mpixels/sec (31.872 GB/s)
TEX: (1024 * 1024 * 32bpp) * 64 * 45.024 fps = 12.079 GB/s
As expected, memory bandwidth is the bottleneck for the G72M. On Xenos, performance drops by about 20% with texture fetch. Since the texture fetch costs 12 GB/s, that is only natural.
http://texhnologix.blogzine.jp/texhnologix/2007/06/xenos_fillrate__1.html
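The texture bandwidth lines can be checked the same way. This Python sketch assumes the whole 1024 * 1024 texture (4 bytes per texel) is fetched once per overdraw layer, which is how the TEX formula above reads. With the full frame rates it gives about 2.150 and 12.086 GB/s; the listed 2.147 and 12.079 match what you get if the frame rate is cut down to a whole number (8 and 45 fps).

```python
TEXTURE_BYTES = 1024 * 1024 * 4   # 1024 * 1024 texels at 32bpp
OVERDRAW = 64

def texture_bandwidth_gbs(fps):
    """Texture fetch bandwidth, ASSUMING one full texture read per layer."""
    return TEXTURE_BYTES * OVERDRAW * fps / 1e9

g72m_tex = texture_bandwidth_gbs(8.010)
xenos_tex = texture_bandwidth_gbs(45.024)

# Xenos slowdown relative to the untextured MSAAx1 run (56.497 fps).
xenos_drop = 1 - 45.024 / 56.497

print(f"G72M texture fetch:  {g72m_tex:.3f} GB/s")
print(f"Xenos texture fetch: {xenos_tex:.3f} GB/s")
print(f"Xenos drop with texturing: {xenos_drop:.1%}")
```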
(Xenos can maybe, at certain moments, process 50% to 10 times more fill rate with MSAA than RSX... and RSX with FlexIO/XDR access can surpass C1/R-500/Xenos in texel rate)