A primer on the X360 shader ALUs as I understand it

superguy

I had always thought you could break it down to ALU count: 56 vs 48 in Nvidia's favour, so RSX > Xenos.

Then I further thought each ATI ALU was weaker.

Turns out that's not the case.

The key instruction, I believe, is the MADD.

R580 has twice as many ALUs, but the second ALU in each pipe can mainly do ADD instructions, which I believe are less crucial.

In MADD instructions, you can get 48 per cycle out of Xenos, 56 out of R580, and 56 out of RSX. R580's 48 pipelines, however, can only execute 48 MADDs, you see, just like Xenos.

So, more or less, it appears impossible for RSX to be vastly more powerful than Xenos.

Then you have each architecture's minor strengths to sort out as well: RSX has a free (fp16) normalize, Xenos might have better branching, etc.

I believe this sheds more light on the systems and why the latest 360 games really look quite good.
 
Why are you comparing R580 with Xenos? The R580 can do 56 MADD and 48 ADD per cycle, Xenos can do 48 MADD or 48 ADD per cycle.
 
I don't see much worth in trying to compare them. There are too many things to account for: there's the extra scalar on Xenos ALUs, there's no guarantee that the R580 primary ALU and the Xenos ALU execute instructions the same way, even more so between the ATI and NV ALUs... and then there's the NV structure as a whole, with the possibility/probability of having to deal with register pressure, making it unlikely that dual MADDs will be issued at full precision and requiring partial precision where possible to get around that... yeah.
 
Is a MADD the equivalent of multiplying two numbers and adding a third to the result? In that case, can we say that 1 MADD = 2 instructions?
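For reference, a minimal sketch (plain Python, purely for illustration) of what a MADD computes per component: a multiply and an add fused into one instruction, which is why it is usually counted as two floating-point operations.

```python
# Minimal sketch: a MADD computes a*b + c as one instruction, i.e. one multiply
# plus one add, which is why a scalar MADD is usually counted as 2 FLOPs
# (and a vec4 MADD as 8 FLOPs per cycle).

def madd(a, b, c):
    # one multiply and one add, issued together
    return a * b + c

def vec4_madd(a, b, c):
    # component-wise MADD over a 4-wide vector, the way a vec4 shader ALU issues it
    return [madd(x, y, z) for x, y, z in zip(a, b, c)]

print(vec4_madd([1.0, 2.0, 3.0, 4.0],
                [0.5, 0.5, 0.5, 0.5],
                [1.0, 1.0, 1.0, 1.0]))  # [1.5, 2.0, 2.5, 3.0]
```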
 
Well, really, Sony will also have a 10% higher clock.

So add roughly 6 (5.6) ALUs in effect; it becomes 62 vs 48 effective.
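A rough sketch of that arithmetic. The 550 MHz RSX and 500 MHz Xenos clocks are the commonly quoted figures, not confirmed specs, and the 56/48 ALU counts are simply the tally used above:

```python
# Rough sketch of the clock-scaling argument above. The 550 MHz RSX and 500 MHz
# Xenos clocks are the commonly quoted figures, not confirmed specs; the ALU
# counts are the 56-vs-48 tally used earlier in the thread.
rsx_alus, rsx_clock = 56, 550e6      # 48 pixel + 8 vertex ALUs assumed for RSX
xenos_alus, xenos_clock = 48, 500e6  # 48 unified ALUs

# Express RSX's clock advantage as "extra" ALUs at the Xenos clock
effective_rsx_alus = rsx_alus * (rsx_clock / xenos_clock)
print(f"~{effective_rsx_alus:.1f} effective RSX ALUs vs {xenos_alus} for Xenos")  # ~61.6 vs 48
```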

BUT 8 of those are locked to vertex work, so both have 48 for pixel shading. Xenos has to break off some of its 48 for vertex work, though. I think a lot of the time it will be fewer than 8, since vertex shading is rarely seen as a bottleneck on PC. So Xenos might well use, say, 42-44 ALUs for pixel shading much of the time (I think), with the other 4-6 for vertex.

So Xenos will have to rely on the memory bandwidth of its eDRAM to make up ground.

Also perhaps: better dynamic branching, unified-architecture efficiency, lack of a texturing handicap, one extra processing component on each ALU (5D) which may or may not be useful, etc.

And as Turn says, this is all a bit of an exercise in futility, since there are so many variables we're not educated enough to know about.
 
PS3=MORE POWER+STANDARD HARD DRIVE
360=EASIER/CHEAPER DEVELOPMENT, BETTER TOOLS/SUPPORT

We'll see who wins out in the end

I just wish Sony would get their show on the road; I'm getting tired of waiting for a PS3. Glad I have Ghost Recon, Oblivion, and Battlefield to keep me busy until the next hardware release.
 
c0_re said:
PS3=MORE POWER
I doubt that'll be true for graphics. Bandwidth, dynamic branching, alpha blending, and vertex shader heavy areas will all be very big strengths of Xenos (i.e. factors of 2 or more).

Texturing could be faster on RSX if bandwidth allows, but even scenarios like MADD crazy pixel shaders with FP16 nrm's would only be tens of percent faster at best, IMO.
 
Mintmaster said:
I doubt that'll be true for graphics.

Depends what you consider "graphics" to be a function of..

Mintmaster said:
Bandwidth

Not to nitpick, but framebuffer bandwidth is an advantage for Xenos. PS3 has the advantage with every other kind of bandwidth.

As for vertex shading, for bursty vertex work it has an advantage, but if you were generally expending more power on vertex shading over the entire frametime than RSX is capable of, you'd be leaving yourself with quite a lot less power for pixel shading. I think it's fair to say that overall, games are (much) heavier on pixel shading anyway..

Mintmaster said:
even scenarios like MADD crazy pixel shaders with FP16 nrm's would only be tens of percent faster at best, IMO.

"Tens of percent" could be pretty significant! Well, if you were bound by these things, it'd be all you need to go from playable to non-playable, or 60fps to 30fps. Though I doubt you often would be..
 
Titanio said:
......"Not to nitpick, but framebuffer bandwidth is an advantage for Xenos. PS3 has the advantage with every other kind of bandwidth.

As for vertex shading, for bursty vertex work it has an advantage, but if you were generally expending more power on vertex shading over the entire frametime than RSX is capable of, you'd be leaving yourself with quite a lot less power for pixel shading. I think it's fair to say that overall, games are (much) heavier on pixel shading anyway...."

Curious about the comment you made referring to RSX. Is this just your opinion? Or has there been information released on the actual RSX chip at GDC? If so, could you point me to a link?
 
zRifle1z said:
Curious about the comment you made referring to RSX. Is this just your opinion? Or has there been information released on the actual RSX chip at GDC? If so, could you point me to a link?

Which part in particular? Having more bandwidth for everything else? Or the bit about vertex vs pixel work? The latter assumes an 8:24 pipe configuration in RSX, which isn't too much to assume, I don't think.
 
Titanio said:
Which part in particular? Having more bandwidth for everything else? Or the bit about vertex vs pixel work? The latter assumes an 8:24 pipe configuration in RSX, which isn't too much to assume, I don't think.

Only the comment about RSX. The way your statement was worded, it seemed to be based on factual RSX information. I figured Sony may have finally released information on the GPU for the PS3. If it was just your opinion, as you stated above, then never mind.
 
Mintmaster said:
I doubt that'll be true for graphics. Bandwidth, dynamic branching, alpha blending, and vertex shader heavy areas will all be very big strengths of Xenos (i.e. factors of 2 or more).
Texturing could be faster on RSX if bandwidth allows, but even scenarios like MADD crazy pixel shaders with FP16 nrm's would only be tens of percent faster at best, IMO.

Factor the SPEs in (sharing vertex work) and Xenos may actually be at a disadvantage in this area, though technically only in vertex-burst situations, which are much rarer than the demand for pixel shading. For graphics, I think you need to look at the system collectively (CPU/GPU) for PS3... at least much more so than for X360. In this way PS3 could sustain 48 pixel shader ALUs and 8 vertex shader ALUs plus SPE vertex assistance at all times, while Xenos constantly needs to dynamically load-balance between vertex and pixel work in a pool of 48 ALUs total.

The external interactions between CELL and RSX will be just as important as what's going on inside RSX, IMO. Maybe even more so to the system as a whole.
 
Titanio said:
....I think everyone can now agree that much of 360's early stuff, the stuff we were comparing PS3 to back then, wasn't quite "blow me away" amazing.


Titanio, I apologize if it seems as though I'm picking your threads out in particular, but besides the MGS4 real-time tech demo, what actual PS3 gameplay is there to use in comparison with actual Xbox 360 games? I found it very amusing to see everyone comparing actual in-game Gears of War gameplay video to the CGI and tech demos created by Sony.
 
zRifle1z said:
Titanio, I apologize if it seems as though I'm picking your threads out in particular, but besides the MGS4 real-time tech demo, what actual PS3 gameplay is there to use in comparison with actual Xbox 360 games?

I'm not sure why MGS4 should be excepted, but it's irrelevant - I'm talking about the general quality on show on 360 at the time. Most of the "not truly next-gen" jibes came out of quickie ports of current-gen games, and that was well deserved. Compare the general quality on 360 then to the general quality of emerging titles now, and you can see the difference too. There were exceptions back then, though, and the press did INDEED celebrate them and their visuals, rightly so (e.g. PGR3, Kameo), which makes the claim that the press labelled all 360 stuff as "not truly next-gen" utter BS.

I'd just ask certain people to cut the "victim" act.

Anyway, this is horribly horribly OT, and I apologise for precipitating further OT discussion. I'll leave it at that, if anyone wants to respond further to what I said, feel free to PM me.
 
binky said:
Also perhaps: better dynamic branching, unified-architecture efficiency, lack of a texturing handicap, one extra processing component on each ALU (5D) which may or may not be useful, etc.

In the context of your argument, I would say that the unified shader architecture is a disadvantage, not an advantage. That's because you have already discounted the G70/G71's vertex shaders from the equation, which is the only area where I see the USA improving efficiency.

i.e. I would expect that a large portion of G70's vertex shader array is constantly idle, because it needs to be overkill to allow the far more transistor-intensive pixel shader array to stay constantly active.

Therefore, if you remove vertex shaders from the equation and compare G70's 48 ALUs to Xenos's 44-46 (minus 2-4 for vertex shading), then the G70 ALUs are going to be more efficient, since they are designed for pixel shading, whereas the Xenos ALUs are more generally designed and so likely to be less efficient.

That's assuming, of course, that the pixel shader array isn't stalled by the vertex shader array, but I would find it incredibly poor design if that were the case for any significant percentage of the time.
 
scooby_dooby said:
That's exactly what I'm talking about: Sony blows smoke up your ass, and when/if they don't deliver you just rationalize it and make excuses for them. Simple: if PS3 is not visibly better on screen than 360, then Sony is full of shit and they lied.

Will I still buy one? Yup. Do I care it's not more powerful? Nope. But that doesn't change the fact they are full of crap...

What the heck is that supposed to mean? You think people actually base their decisions on PR nonsense? Most casual gamers don't even hear 99% of this crap.

Every MS/Sony/Nintendo evangelist here who preaches based on PR from each company is lacking in mental capacity. In all honesty, the difference will probably be like the graphical difference between PS2/GC (X360) and Xbox (PS3), which is marginal by casual standards (which is the only standard that matters, anyway). Everyone in the know understands this and has made a value judgement based on this realistic assessment. Besides that, 100% of gamers will decide which system they want based on the intellectual properties available on said system. Dwelling on PR speak from 5 years ago makes you look like an overly cynical, jaded fool who no longer enjoys actually playing the technology he argues over.
 
BUT 8 of those are locked to vertex work, so both have 48 for pixel shading. Xenos has to break off some of its 48 for vertex work, though. I think a lot of the time it will be fewer than 8, since vertex shading is rarely seen as a bottleneck on PC. So Xenos might well use, say, 42-44 ALUs for pixel shading much of the time (I think), with the other 4-6 for vertex.

No, not really: Xenos has to allocate either vertex or pixel shading in groups of 16 ALUs.
So, for example: 16 vertex and 32 pixel, 32 vertex and 16 pixel, or 0 vertex and 48 pixel.
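If the groups-of-16 description above is right (Xenos's 48 ALUs organised as three SIMDs of 16), the possible instantaneous splits can be enumerated directly. This is a sketch of that constraint only, not a model of the real scheduler, which rebalances dynamically:

```python
# Sketch of the allocation constraint described above: 48 ALUs organised as
# three SIMDs of 16, each working on either vertex or pixel threads at any
# given moment. This only enumerates the possible splits; the real hardware
# rebalances the mix dynamically from batch to batch.
SIMD_COUNT, ALUS_PER_SIMD = 3, 16

splits = [(v * ALUS_PER_SIMD, (SIMD_COUNT - v) * ALUS_PER_SIMD)
          for v in range(SIMD_COUNT + 1)]
for vertex_alus, pixel_alus in splits:
    print(f"{vertex_alus} vertex / {pixel_alus} pixel")
# 0/48, 16/32, 32/16, 48/0
```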

I don't see much worth in trying to compare them. There are too many things to account for: there's the extra scalar on Xenos ALUs, there's no guarantee that the R580 primary ALU and the Xenos ALU execute instructions the same way, even more so between the ATI and NV ALUs... and then there's the NV structure as a whole, with the possibility/probability of having to deal with register pressure, making it unlikely that dual MADDs will be issued at full precision and requiring partial precision where possible to get around that... yeah.

Well, I'm quite sure partial precision is always going to be used on the Xenos, as it can't even manage 16-bit floating-point HDR.
I don't think the Xenos is quite as capable as ATI would like us to think it is.
Just think about it: the Xenos needs the eDRAM because doing HDR and FSAA on the GPU itself would slow it to a crawl, which it already does anyway. :LOL:
Even without rendering HDR and FSAA, the Xenos still isn't capable of doing trilinear filtering, much less anisotropic filtering.
Any game on the Xbox 360 that pushes the Xenos a little ends up with a crappy framerate.
I mean, the Xbox 360 is a brand new piece of hardware, and yet almost every game still runs at 30fps.

Okay, sure, in theory the Xenos can do many more shader executions than any current GPU out there, but that's just what it is: theory.
I don't see the Xenos pushing anything an X1800XL/XT couldn't push in a closed-box architecture, not even mentioning the X1900XT/XTX.

Furthermore, the raw computational power of the Xenos is really quite pathetic, as it can only rival an X800XT at best.
500 million vertices/s (1 vertex instruction x 500 MHz), 48 billion shader ops/s (96 shader ops x 500 MHz), 8 GTexels/s (16 textures x 500 MHz), 4 GPixels/s (8 ROPs x 500 MHz) and 240 GFLOPs (48 ALUs x 10 instructions per ALU x 500 MHz) is really quite pathetic.
Compare it to a 7900GTX @ 550 MHz (assuming RSX is a G70/G71 @ 550 MHz), which does 1100 million vertices/s (2 vertex instructions x 550 MHz), 74.8 billion shader ops/s (136 shader ops x 550 MHz), 13.2 GTexels/s (24 textures x 550 MHz), 8.8 GPixels/s (16 ROPs x 550 MHz) and 400.4 GFLOPs (24 ALUs x 27 pixel instructions + 8 ALUs x 10 vertex instructions = 728, x 550 MHz).
Comparing to the X1900XTX gives pretty much the same result: 1300 million vertices/s (2 vertex instructions x 650 MHz), shader ops/s unknown, 10.4 GTexels/s (16 textures x 650 MHz), 10.4 GPixels/s (16 ROPs x 650 MHz) and 426.4 GFLOPs (48 ALUs x 12 pixel instructions + 8 ALUs x 10 vertex instructions = 656, x 650 MHz).
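To make the arithmetic behind those figures explicit, here is a sketch that simply reproduces the calculations above. The per-ALU op counts and the 550 MHz RSX clock are this post's assumptions, not confirmed specs:

```python
# Reproduces the peak-rate arithmetic above: (units) x (ops per unit per clock) x (clock).
# The per-unit op counts and the 550 MHz RSX clock are assumptions, not confirmed specs.
def peak(units, ops_per_clock, clock_hz):
    return units * ops_per_clock * clock_hz

# Xenos @ 500 MHz
print(peak(48, 10, 500e6) / 1e9)  # 240.0  GFLOPs    (48 ALUs x 10 ops each)
print(peak(96, 1, 500e6) / 1e9)   # 48.0   Gops/s    (96 shader ops per clock)
print(peak(16, 1, 500e6) / 1e9)   # 8.0    GTexels/s (16 texture units)
print(peak(8, 1, 500e6) / 1e9)    # 4.0    GPixels/s (8 ROPs)

# 7900GTX-style part @ 550 MHz and X1900XTX @ 650 MHz, same method
print((24 * 27 + 8 * 10) * 550e6 / 1e9)  # 400.4 GFLOPs
print((48 * 12 + 8 * 10) * 650e6 / 1e9)  # 426.4 GFLOPs
```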
 
Guilty Bystander said:
Okay, sure, in theory the Xenos can do many more shader executions than any current GPU out there

This makes no sense. The 7900GTX and X1900XT/XTX both push significantly more shader instructions/GFLOPs than Xenos, and you posted those figures later in your post. The GTX 512 is a little ahead as well, although once you account for texture addressing and the inefficiency of vertex shader use, it's difficult to call.
 
This makes no sense. The 7900GTX and X1900XT/XTX both push significantly more shader instructions/GFLOPs than Xenos, and you posted those figures later in your post. The GTX 512 is a little ahead as well, although once you account for texture addressing and the inefficiency of vertex shader use, it's difficult to call.

Well, the number of shader ops doesn't have anything to do with how complex a shader effect can be.
For instance, an SM2.0 pixel shader can do 96 shader instructions, whereas an SM3.0 pixel shader can do 512.
Doing lots of SM2.0 shaders on an SM3.0 card with lots of shader-op throughput just means you can use lots of SM2.0 shaders.
Doing SM3.0 shaders can lead to much more complex and thus better-looking game scenes, but it takes a much higher toll on the GPU's capacity to process shader ops.
It's just like playing a game at a higher resolution, which requires more pixel and texel fillrate because of the extra pixels and the extra texture fetches.
The Xenos can do 4096 shader instructions, which means shaders can be more complex, but as the figures I posted point out, even though the Xenos might be able to run more complex shaders, that will never be used, because the Xenos can't calculate enough shader ops to make good use of shaders that complex.
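A rough illustration of that last point, assuming 720p and the 48-billion-shader-ops/s peak quoted above: the longer the per-pixel shader, the lower the framerate ceiling a fixed op budget allows.

```python
# Rough illustration: with a fixed shader-op budget, longer per-pixel shaders
# cap the achievable framerate. 48e9 ops/s is the Xenos peak quoted earlier;
# 1280x720 is an assumed resolution; this ignores vertex work, texturing, etc.
ops_per_second = 48e9
pixels_per_frame = 1280 * 720

for shader_length in (96, 512, 4096):  # SM2.0 / SM3.0 / Xenos instruction counts quoted above
    fps_ceiling = ops_per_second / (pixels_per_frame * shader_length)
    print(f"{shader_length:4d}-op shader on every pixel -> at most ~{fps_ceiling:.0f} fps")
# ~543 fps, ~102 fps, ~13 fps
```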
 
Guilty Bystander said:
<snip>Lots of very wrong numbers
Where do you get those numbers from? They are way, way too high (4-10 times in a few cases).
 