Is this accurate?

I found this on this gamespot forum thing, never seen it before but i just wondered if this held its weight.... ( p.s is the forum any good?? i might register there)

On the note of GPUs in the XBox360/PS3... assuming an uncrippled Geforce 7800GTX in the PS3 at 550MHz, that would give the PS3 4.4GFLOPs vertex + 264GFLOPs pixel shader, or 268.4GFLOPs total programmable performance, versus XENOS's 240GFLOPs at 500MHz. Seems pretty even... until you find out that XENOS could allocate any portion of that 240GFLOPs towards geometry while the RSX would be limited to at most 4.4GFLOPs of geometry performance... and more than that, as you need to do texture operations you use a shader unit, so you can effectively cut that 264GFLOPs down nearly in half. Under normal circumstances I will say that half of the pipelines will need to do a texture operation at any point in time, so I will estimate a reduction from 268GFLOPs to roughly 201GFLOPs overall graphics performance versus XENOS's 240GFLOPs. On top of that we have to account for architectural efficiency, as XENOS was designed to achieve near maximum floating point efficiency (as the ALUs responsible for those floating point operations are in use nearly all the time). It is estimated that XENOS will achieve 90-95% efficiency versus 50-60% for conventional shader units. So accounting for that, XENOS's 240GFLOPs of theoretical performance becomes more like 216-228GFLOPs... but the RSX's theoretical performance drops further from 201GFLOPs to roughly 100.5-120.6GFLOPs overall.
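For anyone who wants to check the arithmetic in the post above, here is a minimal Python sketch that simply replays it. Every input is a claim taken from the post, not a measured figure, and the function and variable names are made up for illustration. The one step that is my own reading is how 268 becomes 201: the numbers match removing a quarter of the rounded total, so that is what the sketch assumes.

```python
# Replays the quoted post's GFLOPs arithmetic. Every input is a claim from
# the post, not a measured figure.

RSX_VERTEX_GFLOPS = 4.4
RSX_PIXEL_GFLOPS = 264.0
XENOS_PEAK_GFLOPS = 240.0


def scale_by_efficiency(gflops, low, high):
    """Return the (low, high) effective range for a claimed efficiency band."""
    return gflops * low, gflops * high


# Peak programmable throughput claimed for the RSX (vertex + pixel).
rsx_peak = RSX_VERTEX_GFLOPS + RSX_PIXEL_GFLOPS

# ASSUMPTION: the post's 201 figure matches taking the rounded 268 total and
# removing a quarter of it (half the pipelines each losing one of two units).
rsx_after_texturing = 268.0 * 0.75

# Efficiency bands as claimed in the post: 50-60% conventional, 90-95% XENOS.
rsx_effective = scale_by_efficiency(rsx_after_texturing, 0.50, 0.60)
xenos_effective = scale_by_efficiency(XENOS_PEAK_GFLOPS, 0.90, 0.95)

if __name__ == "__main__":
    print(f"RSX peak:             {rsx_peak:.1f} GFLOPs")
    print(f"RSX after texturing:  {rsx_after_texturing:.1f} GFLOPs")
    print(f"RSX effective:        {rsx_effective[0]:.1f}-{rsx_effective[1]:.1f} GFLOPs")
    print(f"XENOS effective:      {xenos_effective[0]:.1f}-{xenos_effective[1]:.1f} GFLOPs")
```

The sketch only shows that the post's figures follow from its own inputs. It says nothing about whether the efficiency bands themselves are right, which is what the rest of the thread is arguing about.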

But XENOS has the advantage when it comes to hardware features as well as geometry and pixel performance. The only real area that the RSX exceeds XENOS would be in fillrate, which is important for high resolution displays... though that is limited by graphics memory bandwidth... and that is something the RSX lacks severely compared to XENOS. The Cell processor does have higher theoretical performance, but that performance is less accessible than it is on XENON (The XBox360 CPU)... and XENON has higher integer performance and is easier to use. The Cell has... certain architectural limitations that a lot of people are not aware of, and the biggest problem with the Cell is the fact that those SPEs are asynchronous and not synchronous... further hampered by the fact those SPEs have no direct access to system memory (system memory must be paged to and from the local memory)

The Cell processor does have merits... it is a processor that is built to be scalable. This processor will be incredible for render farms as well as supercomputer clusters... but it is not a gaming processor and it is not designed to even be one (according to Ken Kutaragi himself, who has stated the Cell processor was NOT designed for the PS3, but rather other systems).
 
Clever marketing

!eVo!-X Ant UK said:
I found this on this gamespot forum thing, never seen it before but i just wondered if this held its weight.... ( p.s is the forum any good?? i might register there)

There are so many "mistakes" that it reads too much like the well-known earlier marketing article comparing the PS3 and Xbox360, with a similar tone and conclusions.
 
Well those are ATI's numbers, 95% efficiency compared to 60ish with the conventional shaders, I assume they're referring to their own architecture, not sure what nvidia's numbers are like.
 
When he says efficiency, I'm thinking of one dedicated shader vs. one unified shader. The dedicated shaders are more efficient.

The numbers he's comparing depend on the workload, which seems like more of a software problem.
 
I believe it means the amount of time shaders are sitting idle.

Isn't the whole idea of USA to improve efficiency? With a USA you can dedicate 100% of the resources to either vertex or pixel shading or split it roughly among them to ensure there's always something being worked on. Conventional shaders sit idle when there's a bottleneck at either the vertex or pixel pipelines.

The real question is, are these numbers accurate? How much time do the vertex or pixel shaders sit idle on current architectures, and how efficient really are ATI's USAs?

K, now the devs please take it away!
 
A unified shader architecture should be more efficient. Instead of your vertex shader always sitting around doing nothing while you're cranking on shading pixels you have twice as many shaders working on pixels, and some can be diverted away when some vertex shading needs to be done.
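To make the idle-unit argument concrete, here is a toy Python model. It is only a sketch: it assumes every operation costs one unit-cycle, ignores stalls, scheduling overhead and memory latency, and the workload numbers are invented. The 8 vertex / 24 pixel split is borrowed from the RSX figures quoted in this thread. The function names are hypothetical.

```python
import math


def dedicated_utilization(vertex_ops, pixel_ops, vertex_units, pixel_units):
    """Share of unit-cycles doing useful work when units cannot switch roles."""
    total_ops = vertex_ops + pixel_ops
    if total_ops == 0:
        return 0.0
    # The batch finishes only when the slower of the two pools is done,
    # so the faster pool sits idle for the remaining cycles.
    cycles = max(math.ceil(vertex_ops / vertex_units),
                 math.ceil(pixel_ops / pixel_units))
    return total_ops / (cycles * (vertex_units + pixel_units))


def unified_utilization(vertex_ops, pixel_ops, units):
    """Same measure when any unit can take either kind of work."""
    total_ops = vertex_ops + pixel_ops
    if total_ops == 0:
        return 0.0
    cycles = math.ceil(total_ops / units)
    return total_ops / (cycles * units)


# Hypothetical pixel-heavy batch: 100 vertex ops, 2300 pixel ops.
dedicated = dedicated_utilization(100, 2300, vertex_units=8, pixel_units=24)
unified = unified_utilization(100, 2300, units=32)

if __name__ == "__main__":
    print(f"dedicated: {dedicated:.1%}")
    print(f"unified:   {unified:.1%}")
```

In this idealized model the unified pool always comes out at or near full utilization, which is the best case for the idea rather than evidence for ATI's 95% claim. Real hardware adds arbitration and latency costs that the model leaves out.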
 
yes this is from The GameMaster posted at teamxbox forums. there are so many mistakes and numbers based on opinion that i would take Bobbler's advice.

heres the thread.
http://forum.teamxbox.com/showthread.php?t=385922&page=6&pp=15

and heres his full post.

The GameMaster said:
Don't know how I missed this thread... but I will take a small stab at this. nVidia and ATI do not use the same definition of "Shader Operations". Rather, nVidia's usage of the term is about shader "instructions"... meaning they are counting the individual instructions that comprise a component of a shader operation. A true shader operation is comprised of vector operations and a scalar operation, and it takes instructions to process those operations. Now nVidia's NV50 and XENOS are both capable of Vec4+Scalar... or 4 vector operations and one scalar operation per shader operation, and do so in one cycle per shader unit. A shader unit consists of ALUs that perform vector operations and another ALU that performs a scalar operation... and each shader unit can perform ONE shader operation per cycle.

Now the NV50 has 24 pixel pipelines and each pixel pipeline has 2 shader units for a total of 48 shader units... and assuming the RSX is an uncrippled NV50 it would be the same on the RSX. XENOS has 48 pipelines that can be used for either pixel or vertex programs... and each one has 1 shader unit. That means that all 24 pixel pipelines would pump out around 48 shader operations per cycle, and so would XENOS if all pipelines were used for pixel shader programs. As each of the shader operations is 5 component (4 vector and 1 scalar), that would come out to around 240 shader components per cycle for both. Given the estimated clock speed of the RSX of 550MHz and XENOS's given clock speed of 500MHz, that would come out to around 132 billion shader components per second or 26.4 billion shader operations per second for the RSX (again assuming an uncrippled NV50 at 550MHz) and around 120 billion shader components per second or 24 billion shader operations per second for XENOS.
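The peak figures in that paragraph are just units times clock times components, which a few lines of Python can confirm. This is a sketch using the post's own unit counts and clock speeds as inputs (the RSX configuration in particular is the post's assumption), and `peak_rates` is a name invented here.

```python
def peak_rates(shader_units, clock_mhz, components_per_op=5):
    """Return (billion ops/s, billion components/s) at one op per unit per cycle."""
    ops_per_second = shader_units * clock_mhz * 1_000_000
    return ops_per_second / 1e9, ops_per_second * components_per_op / 1e9


# RSX as the post assumes it: 24 pixel pipelines x 2 shader units at 550 MHz.
rsx_ops, rsx_components = peak_rates(24 * 2, 550)

# XENOS as the post describes it: 48 unified pipelines x 1 shader unit at 500 MHz.
xenos_ops, xenos_components = peak_rates(48 * 1, 500)

if __name__ == "__main__":
    print(f"RSX:   {rsx_ops:.1f} B ops/s, {rsx_components:.0f} B components/s")
    print(f"XENOS: {xenos_ops:.1f} B ops/s, {xenos_components:.0f} B components/s")
```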

So there you have it... 132 billion shader components per second (RSX) versus 120 billion shader components per second (XENOS). Is this the end of the story? Nope... there are two big areas that must be considered... architectural caveats and efficiency. Now I said that XENOS has 48 pipelines that could be used for either pixel or vertex data... that means just that... and you don't want to use all of your pipelines for pixel data as that would mean you wouldn't have any pipelines for geometry... that is unless XENON is capable of processing geometry in lieu of the GPU (and while I think the capacity is there... I am not sure, so for now I won't say it does for the sake of this argument). So the actual number of pipelines for processing pixel shader programs will likely be less than 48... so that would be up to 24 billion shader operations per second for XENOS... and assuming your average XBox360 game takes about 8-10 pipelines (this is not a set value... it REALLY depends on the game) for vertex programs, that would leave around 38-40 pipelines for pixel shader programs or around 19-20 billion shader operations per second on average... again this is not a set number as it is dependent on the game. As for the RSX... as it is based on the Geforce NV3x/4x/5x marchitecture there is one important thing to remember about how the shader units are arranged in the pixel pipelines. You remember that I said there were 2 shader units per pixel pipeline, right? What I didn't say is that one of those shader units is coupled to a texture unit in each of the pixel pipelines... meaning that if a texture operation needs to be done you LOSE one shader unit from the pixel pipeline... and typically in a game texture operations are being done nearly all of the time. So given an uncrippled NV50 at 550MHz that would actually be around 13.2-19.8 billion shader operations per second.

So if anything both the RSX and XENOS are pretty close in terms of pixel shader operations that they can perform per second... with perhaps a small edge for XENOS on average. Then we go to the second concern... efficiency. Now this is a factor that is a little harder to judge, but based on experience in dealing with Geforce cards (which I use at home and at work) their average efficiency is around 30-40%. Shader units will NEVER operate at their peak speed... so it is important to improve the efficiency and utilization of those shader units. Shader units stall VERY often during a game for various reasons... and ATI has taken it upon themselves to make changes to their marchitecture to improve their utilization by introducing concepts such as "Ultra Threading". Through extensive multithreading of the shader units you almost completely remove the stalling, so the shader units are nearly always doing something now instead of doing nothing. This was introduced on XENOS as well as ATI's newest PC GPU, the Radeon x1000 series. ATI's Radeon x800 series GPUs were already more efficient compared to nVidia's in terms of their actual shader performance even though they were at a major disadvantage in shader units (as the Radeon cards only have 1 full shader unit per pixel pipeline). Even though the Geforce 7800GTX has more than twice the shader power of the Radeon x1800XT, the x1800XT still manages to completely destroy the Geforce 7800GTX in actual pixel shader performance (as reflected in pixel shader heavy games that were benchmarked). It was stated by ATI that the average efficiency for XENOS was 95%... and based on what I have seen the Radeon x1000 series comes close to that and probably slightly exceeds the utilization rate of XENOS.
The previous Radeon x850 had an average efficiency (stated by ATI) of around 50-60%, and the Radeon x1800XT has the same shader arrangement and number of shader units, so the x1000 series nearly doubled its efficiency over the previous generation Radeon. Why am I saying all of this? Well, let's see what happens to those effective numbers when we start to talk about efficiency and how it will act in real world applications.

I stated that XENOS is capable of a maximum of 24 billion shader operations per second, but an average scenario would see only 19-20 billion shader operations per second as not all pipelines would be used for pixel shader programs. I also said that the RSX (assuming an uncrippled NV50 at 550MHz) is capable of a maximum of 26.4 billion shader operations per second, but an average scenario would see only 13.2-19.8 billion shader operations per second due to the fact that texture operations would remove one shader unit per pixel pipeline. Now let's see what happens when we apply efficiency claims... the RSX would have a similar average efficiency to the Geforce 7800 GTX (as they are basically the same GPU)... and that would mean roughly 50-60% average efficiency. That would mean that the RSX would exhibit roughly 6.6-11.9 billion shader operations per second in a real world application on average. XENOS is stated to have an efficiency of around 95%, and after seeing the effects of their implementations of threading on the Radeon x1000 series I am inclined to believe this... but for the sake of argument I am going to use an 80-95 percent efficiency range. If what was stated by ATI is accurate then that would mean that XENOS would exhibit roughly 15.2-19 billion shader operations per second in a real world application on average.

Here is a recap on the numbers...

GPU Configuration...
*RSX > 24 pixel pipelines with 2 shader units each (1 coupled to a texture unit) at 550MHz
*XENOS > 48 unified pipelines with 1 shader unit each at 500MHz.

Maximum theoretical shader operations (a shader operation is comprised of vector operations and a scalar operation)...
*RSX > 48 shader units at 550MHz = 26.4 billion shader operations per second.
*XENOS > 48 shader units at 500MHz = 24 billion shader operations per second.

Accounting for architectural caveats...
*RSX > When texture operations are done you lose a shader unit per pixel pipeline... so you would have only 13.2-19.8 billion shader operations per second of the maximum 26.4 billion shader operations per second available.
*XENOS > Pipelines have to be used for geometry (as far as I know) so considering your average x360 game using roughly 8-10 of those 48 pipelines for geometry you would have roughly 19-20 billion of the maximum 24 billion shader operations per second available.

Accounting for architectural efficiency...
*RSX > Much like the Geforce line of GPUs it would exhibit roughly 50-60% of its effective shader performance in a real world application. So you would have only 6.6-11.9 billion shader operations per second of the 13.2-19.8 billion shader operations per second maximum effective.
*XENOS > It is stated it has an efficiency of around 95% and stands to be extremely efficient as it achieves a very high utilization rate, but I am using an 80-95% range for this argument. So you would have only 15.2-19 billion shader operations per second of the 19-20 billion shader operations per second maximum effective (on average).

So considering everything XENOS should perform roughly 2-3 times faster than the RSX at 550MHz in pixel shader operations in a real world environment. Of course it remains to be seen if this will be the case as unified shaders are a new evolution of GPU pipelines.
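The whole recap can be replayed as one chain of multiplications. The Python sketch below does that using only the post's own inputs. The 24-36 active RSX shader units are my inference from the 13.2-19.8 billion range (those figures divided by 550MHz), not something the post states directly, and the helper names are hypothetical.

```python
def ops_range(units_low, units_high, clock_mhz):
    """Billion shader ops per second for a range of active shader units."""
    return units_low * clock_mhz / 1000.0, units_high * clock_mhz / 1000.0


def apply_efficiency(ops, eff_low, eff_high):
    """Pair the low end with the low efficiency and the high end with the high."""
    return ops[0] * eff_low, ops[1] * eff_high


# ASSUMPTION (inferred from the post's 13.2-19.8 figure): 24 to 36 of the
# RSX's 48 shader units stay available while texturing, at 550 MHz.
rsx_available = ops_range(24, 36, 550)

# The post's guess: 38 to 40 of XENOS's 48 pipelines do pixel work, at 500 MHz.
xenos_available = ops_range(38, 40, 500)

# Efficiency bands as claimed in the post.
rsx_real = apply_efficiency(rsx_available, 0.50, 0.60)
xenos_real = apply_efficiency(xenos_available, 0.80, 0.95)

# Ratio of XENOS to RSX at the two extremes of the ranges.
ratio_least_favourable = xenos_real[0] / rsx_real[1]
ratio_most_favourable = xenos_real[1] / rsx_real[0]

if __name__ == "__main__":
    print(f"RSX real-world:   {rsx_real[0]:.1f}-{rsx_real[1]:.1f} B ops/s")
    print(f"XENOS real-world: {xenos_real[0]:.1f}-{xenos_real[1]:.1f} B ops/s")
    print(f"XENOS/RSX ratio:  {ratio_least_favourable:.2f}x to {ratio_most_favourable:.2f}x")
```

On these inputs the ratio runs from a bit under 1.3x to a bit under 2.9x, so the "2-3 times" conclusion corresponds to the end of the ranges most favourable to XENOS. All of it still rests on the efficiency percentages, which are claims rather than measurements.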

Ah... one more thing... about pixel shader operations: they have nothing to do with the performance of GRAPHICS as a whole. Rather... they are important for games that rely heavily on pixel shader programs... but for games that don't they are more or less useless (after all a pixel shader program does just that... it modifies the color of a pixel). What *IS* important is the number of pipelines in a graphics processor (and with that said XENOS can push nearly twice as many pixels compared to the RSX per cycle... again assuming you are using all 48 for pixel data), and your texture performance (XENOS can perform 16 texture operations per cycle (and another 16 for vertex textures) in parallel to the shader units for a total of 8 billion texels/second, and the RSX can perform UP TO 24 texture operations per cycle (but if both shader units are being used you lose that texture operation), so you have anywhere from 0-13.2 billion texels/second (or a running average of 6.6 billion texels/second)). Finally... geometry performance... while a lot of PC games are not geometry limited... console games typically are, and it is important to keep this in mind in the next generation of games which will be pushing many many polygons per frame. The RSX with its 8 vertex pipelines can perform a theoretical maximum of 1.1 billion polygons/sec (though it will never come anywhere close to this) and XENOS with its 48 pipelines can perform up to 6 billion polygons/sec, though it is hardware limited to 500 million. Though due to the number of pipelines in XENOS the Xbox360 is *FAR* more likely to achieve that 500 million polygon per second claim... the XBox360 will have a significant advantage when it comes to geometry in the next generation of consoles.
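The texel and polygon figures in that paragraph follow the same units-times-clock pattern, sketched below in Python. The inputs are the post's claims. The divisor of four pipeline-cycles per polygon is my inference, chosen because it reproduces both the 1.1 billion and 6 billion figures; the post does not say where those numbers come from.

```python
def rate_billions(units, clock_mhz, cycles_per_item=1):
    """Billion items per second for `units` working in parallel."""
    return units * clock_mhz / 1000.0 / cycles_per_item


# Texture rates from the post's unit counts and clocks.
xenos_texels = rate_billions(16, 500)
rsx_texels_max = rate_billions(24, 550)
# The post's "running average" is the midpoint between 0 and the maximum.
rsx_texels_avg = (0.0 + rsx_texels_max) / 2

# ASSUMPTION: four pipeline-cycles per polygon, inferred from the post's totals.
CYCLES_PER_POLYGON = 4
rsx_polygons = rate_billions(8, 550, CYCLES_PER_POLYGON)
xenos_polygons_shader = rate_billions(48, 500, CYCLES_PER_POLYGON)

# The post says XENOS is hardware limited to 500 million polygons per second.
XENOS_POLYGON_CAP = 0.5
xenos_polygons = min(xenos_polygons_shader, XENOS_POLYGON_CAP)

if __name__ == "__main__":
    print(f"XENOS texels: {xenos_texels:.1f} B/s")
    print(f"RSX texels:   0-{rsx_texels_max:.1f} B/s (midpoint {rsx_texels_avg:.1f})")
    print(f"RSX polygons (theoretical):  {rsx_polygons:.1f} B/s")
    print(f"XENOS polygons (shader/cap): {xenos_polygons_shader:.1f} / {xenos_polygons:.1f} B/s")
```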

It helps if the same math is used across the two GPUs to derive the correct numbers, but as always there are smaller details that can affect things. I am giving the bigger picture here with those numbers... but it still remains to be seen how they actually perform (and we already know how the Geforce 7800 GTX performs).

There you have it... enjoy... have a good night.
 
Hey, he's a MS ******. What's the problem? We certainly have a few on the Sony side.

He's like Jawed, he makes me feel better about the state of Xbox hardware.

If we could get some damn PS3 games to compare, we'd start learning something!
 
makaveli87 said:
yes this is from The GameMaster posted at teamxbox forums. there are so many mistakes and numbers based on opinion that i would take Bobbler's advice.

Would anyone happen to have a picture of Goose, post-ejection, from Top Gun ?
 
dantruon said:
is this guy the descendant of Major Nelson?

No it sounds more like the world according to ATI.

A unified shader architecture should be more efficient. Instead of your vertex shader always sitting around doing nothing while you're cranking on shading pixels you have twice as many shaders working on pixels, and some can be diverted away when some vertex shading needs to be done.

Now that I've read the whole thing, I think Nvidia's response to all of this was unified=competing=unpredictable. It sounds great on paper, but the problem of generating the correct workload becomes more complicated than with fixed shaders.


I'm not too familiar with shader stalling, but what if I send 500 vertex shading operations to Xenos? Is every shader going to switch over to work on vertices and stall any pixel shader ops?
With the dedicated shaders I'd guess it would stall the vertex ops up to the number of vertex shaders while still processing pixels in the meantime.

Can any devs here enlighten us on this?
 
Bill said:
Hey, he's a MS ******. What's the problem? We certainly have a few on the Sony side.

He's like Jawed, he makes me feel better about the state of Xbox hardware.

If we could get some damn PS3 games to compare, we'd start learning something!

I'd say that's insulting to Jawed. Jawed at least knows what he's talking about (if a bit overzealous about guesstimating performance ;)).

This clown must have gone to school at the Major Nelson academy of FUD.
 
Yeah XENOS is a monster. After all, playing with Kameo tonight, I saw blurry backgrounds (crappy depth of field technique?), aliasing!!! (read: jaggies), and a total lack of anisotropic filtering, and all that from an MS first party developer.

You'd think that if anyone could have gotten the graphics right, it would have been them!

Nice shader power though, but too bad all those other flaws. I really thought the next generation would solve all those problems, but I guess not.
 