Anandtech - Inside the XBox 360 article

Anand's comparing the XGPU in power to a 24 pipe R420 is the most ridiculous thing I have heard in a long time. I would like to understand his basis for this assumption.

First he made that old (now removed) X360 post and now this.

www.Anandtech.com respect -4
 
Master-Mold said:
Anand's comparing the XGPU in power to a 24 pipe R420 is the most ridiculous thing I have heard in a long time. I would like to understand his basis for this assumption.

First he made that old (now removed) X360 post and now this.

www.Anandtech.com respect -4

That's odd, especially since the R420 is only SM2.0.
 
There are 48 shader units in the Xbox 360 GPU, but given that we're dealing with a unified shader architecture, you can't compare that number directly to the 24 shader pipelines of the GeForce 7800 GTX for example. We roughly estimated the shader processing power of the Xbox 360 GPU to be similar to that of a 24-pipeline ATI R420 GPU.


Roughly estimate? Real roughly apparently.
 
I find it odd that two of the RAM chips are under the heatsink while the other two are out in the open. I thought all four chips would get some heatsink love.
 
Sorry about the thread title, I didn't know there were rules about it.

Anyway, yeah, this article is soooo wrong in many ways. I hadn't even seen the Cell commentary.

Really, all around, this article must be some prank article, or a joke. "Roughly estimate" ROFLMAO
 
Alpha_Spartan said:
I find it odd that two of the RAM chips are under the heatsink while the other two are out in the open. I thought all four chips would get some heatsink love.


There are 8 chips; 4 are on the bottom (you can see the bottom four in one of the pictures -- the picture right before the pic showing the top four). I checked after my earlier post. Those 2 chips probably aren't even touching the actual heatsink anyway.
 
These RAM chips probably don't need to be heatsunk anyway. GDDR simply doesn't seem to get all that hot; the memory on my seriously overclocked GF6800 doesn't. The chips feel like air temperature, even under full load.

As for shading power... A traditional ATi (or Nvidia) chip probably has the edge on shorter shaders. Xenos only works on 8 fragments at a time and would get hamstrung compared to a traditional chip that can work on three times as many. It simply won't be able to leverage all 48 shading units in these cases. With longer shaders things will start to equalize, but it will of course have to dedicate some shading power to vertices, so depending on the amount of geometry the balance between the two chips might shift.

Still, it's a pointless discussion. We know in the end the X360 will win in a straight battle against a contemporary PC because it won't be hampered by MS's clumsy DirectX function calls that limit 3D performance at the CPU level, nor will it have to run the typical 25-ish background services for Windows or the myriad of other user threads that spin aimlessly round and round in most people's PCs.
 
Anand calculated the 24 pipe R420 thing in his original Xenos article. It's been in the back of my mind for months.

Basically he totalled up the ALUs, and how "many wide" vectors they could crunch. The Xbox only has a 48 to 38 ALU edge on R420, but it can calculate wider vectors.

But I'm not sure that on shaders, it helps any, according to some comments here. It might only be because they have to vertex shade.

Interestingly, if you add 50% to R420 benches, it's still like 30% less powerful than a 512 7800.

You can still make the case for Xenos though. You just have to assume it jumped a lot in efficiency over R420 due to unified shaders, or something like that.
 
How does a "tiling engine" work for the X360? If the game engine is optimized, will it be able to deliver true 720p with AA?
 
rosman said:
How does a "tiling engine" work for the X360? If the game engine is optimized, will it be able to deliver true 720p with AA?
That's the theory. For Xenos to apply its eDRAM magic effectively it needs a predicated tiling rendering engine. This isn't supposed to be too hard to implement, but as I understand it, it needs to be worked in from the ground up. Hence early titles that were in development long before Xenos was around to write for aren't at all optimized for the hardware.
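To put rough numbers on the tiling question, here is a minimal sketch of how many tiles a 720p frame would need. It assumes 10 MB of eDRAM and 4 bytes of colour plus 4 bytes of depth/stencil per sample; neither figure comes from this thread, and the function names are made up for illustration.

```python
import math

# Assumptions (not stated in the thread): 10 MB of eDRAM, and each AA sample
# stores 4 bytes of colour plus 4 bytes of depth/stencil.
EDRAM_BYTES = 10 * 1024 * 1024
BYTES_PER_SAMPLE = 4 + 4


def framebuffer_bytes(width, height, aa_samples):
    """Size of the colour + depth buffers for one frame, in bytes."""
    return width * height * aa_samples * BYTES_PER_SAMPLE


def tiles_needed(width, height, aa_samples):
    """Number of eDRAM-sized tiles the frame must be split into."""
    return math.ceil(framebuffer_bytes(width, height, aa_samples) / EDRAM_BYTES)


if __name__ == "__main__":
    for samples in (1, 2, 4):
        size = framebuffer_bytes(1280, 720, samples)
        print(f"720p, {samples}x AA: {size} bytes, {tiles_needed(1280, 720, samples)} tile(s)")
```

Under those assumptions 720p fits in one tile without AA, but needs two tiles at 2x AA and three at 4x AA, which is why the engine has to be built around tiling to get AA at 720p.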
 
"There are 3 parallel groups of 16 shader units each. Each of the three groups can either operate on vertex or pixel data. Each shader unit is able to perform one 4 wide vector operation and 1 scalar operation per clock cycle. Current ATI hardware is able to perform two 3 wide vector and two scalar operations per cycle in the pixel pipe alone. The vertex pipeline of R420 is 6 wide and can do one vector 4 and one scalar op per cycle. If we look at straight up processing power, this gives R420 the ability to crunch 158 components (30 of which are 32bit and 128 are limited to 24bit precision). The Xbox GPU is able to crunch 240 32bit components in its shader units per clock cycle. Where this is a 51% increase in the number of ops that can be done per cycle (as well as a general increase in precision), we can't expect these 48 piplines to act like 3 sets of R420 pipelines. All things being equal, this increase (when only looking at ops/cycle) would be only as powerful as a 24 piped R420."

That is the original Anand quote.
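The arithmetic in that quote can be reproduced with a small sketch. The per-pipe figures are taken from the quote; the count of 16 pixel pipes for R420 is an assumption the quote does not state (it is what makes the 128 figure work out), and the "24-pipe R420" case assumes the 6 vertex pipes stay unchanged. Function names are hypothetical.

```python
# Components processed per clock, following the figures in the quote.
# Assumption: R420 has 16 pixel pipes (the quote only gives the 128 total).
R420_PIXEL_PIPES = 16
R420_VERTEX_PIPES = 6
XENOS_SHADER_UNITS = 48


def r420_components(pixel_pipes, vertex_pipes=R420_VERTEX_PIPES):
    """Return (pixel, vertex) components per clock for an R420-style chip."""
    pixel = pixel_pipes * 2 * (3 + 1)   # two vec3 + two scalar ops per pixel pipe
    vertex = vertex_pipes * (4 + 1)     # one vec4 + one scalar op per vertex pipe
    return pixel, vertex


def xenos_components(units=XENOS_SHADER_UNITS):
    """Components per clock for Xenos: one vec4 + one scalar op per unit."""
    return units * (4 + 1)


if __name__ == "__main__":
    pixel, vertex = r420_components(R420_PIXEL_PIPES)
    print("R420:", pixel, "+", vertex, "=", pixel + vertex)    # 128 + 30 = 158
    print("Xenos:", xenos_components())                        # 240
    print("Ratio:", xenos_components() / (pixel + vertex))     # roughly 1.52
    # Hypothetical 24-pipe R420, vertex pipes left at 6 (an assumption):
    print("24-pipe R420:", sum(r420_components(24)))           # 222
```

On this count a hypothetical 24-pipe R420 lands at 222 components per clock against 240 for Xenos, so the "24-pipe" comparison is only a ballpark even on Anand's own terms.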

I've gone round and round on Vec3 vs Vec4 though, and nobody seems to have a straight answer on what it means.

I also tried to figure out if Nvidia ALUs are vec3 or vec4. Different sites say different things.

Jawed commented that Vec4 may not be useful for shaders. It's there for vertices. Then he said it might be useful for shaders.
 
Bill said:
You can still make the case for Xenos though. You just have to assume it jumped a lot in efficiency over R420 due to unified shaders, or something like that.

You should also remember the ATI email that said Xenos would run the Toy Store demo faster at HD resolutions than the X1800 because of its "slightly higher shading power".

That email could've easily been a fake though.
 
X1800 has large efficiency gains over R420, though.

An X1800XL has similar clocks and pipeline count to an X850XT, yet is more powerful. Those are the efficiency gains.

These are probably due to the memory controller and thread dispatcher.

Both take a lot of transistors, apparently.

I'm sceptical of ascribing those improvements to Xenos.

If you compare it to the X1800, the shader power should be close; the X1800XT is 625 MHz. That's what the email said as well.

Another thing to consider is RSX may be bandwidth limited.
 
Bill said:
"There are 3 parallel groups of 16 shader units each. Each of the three groups can either operate on vertex or pixel data. Each shader unit is able to perform one 4 wide vector operation and 1 scalar operation per clock cycle. Current ATI hardware is able to perform two 3 wide vector and two scalar operations per cycle in the pixel pipe alone. The vertex pipeline of R420 is 6 wide and can do one vector 4 and one scalar op per cycle. If we look at straight up processing power, this gives R420 the ability to crunch 158 components (30 of which are 32bit and 128 are limited to 24bit precision). The Xbox GPU is able to crunch 240 32bit components in its shader units per clock cycle. Where this is a 51% increase in the number of ops that can be done per cycle (as well as a general increase in precision), we can't expect these 48 piplines to act like 3 sets of R420 pipelines. All things being equal, this increase (when only looking at ops/cycle) would be only as powerful as a 24 piped R420."

That is the original Anand quote.

I've gone round and round on Vec3 vs Vec4 though, and nobody seems to have a straight answer on what it means.

I also tried to figure out if Nvidia ALUs are vec3 or vec4. Different sites say different things.

Jawed commented that Vec4 may not be useful for shaders. It's there for vertices. Then he said it might be useful for shaders.

From what I understand Nvidia's ALUs are dual (Vec4 + scalar) for the pixel ALUs.

I thought MADDs, a Vec4 op, were very common in pixel shaders?
 
Both Anand and Major Nelson have mentioned this vec4 concept. The idea is that Xenos ALUs process more.

I can't figure it out though. The terms are thrown around so loosely.


It seems vertex pipes are always vec4, while shader pipes are vec3 you see. In order to do both, Xenos pipes must be vec4 also.


Note what Anand said about R420:

"Current ATI hardware is able to perform two 3 wide vector and two scalar operations per cycle"

That is because each pipe has two ALUs, each of which does a vec3 and a scalar.

Major Nelson says:

"It is important to note that if the RSX ALUs are similar to the GeForce 6800 ALUs then they work on vector4s, while the Xbox 360 GPU ALUs work on vector5s."

It just gets more confusing, but there's the concept again.
 
Bill said:
I've gone round and round on Vec3 vs Vec4 though, and nobody seems to have a straight answer on what it means.

I also tried to figure out if Nvidia ALUs are vec3 or vec4. Different sites say different things.

Jawed commented that Vec4 may not be useful for shaders. It's there for vertices. Then he said it might be useful for shaders.

As far as I've ever seen, nV's is Vec4/Vec3+scalar/Vec2+scalar+scalar/Vec2+Vec2/etc. with that mini-alu hanging off the side. And because it can do any one of those (technically?), you can say that it does a 4D operation compared to Xenos 5D (potentially, i.e. if you can find a use for it!) operation. I wouldn't put too much in the wording used by MN or anything, though. That was mostly PR-speak, IMO.

But, if Jawed said anything about usefulness, I'll bet you $50 that it was along the lines that Vec4 + scalar is wasted on pixel shading, but it's still necessary for vertex shading. Vertex shaders are shaders too!
 
He said something about it possibly being useful for 4 color values. RGB and something else, or something.
 
I think I remember reading that. RGB = Vec 3. And he or somebody else suggested (there or elsewhere) that some might try RGBA (adding the alpha channel) and do something with that. But, I'm not a programmer. :oops:
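For what it's worth, the RGB vs RGBA point can be shown with a toy model. This is only a sketch: it treats an ALU as a vector part plus some scalar lanes and counts how many lanes a single op keeps busy. The function name and the model itself are assumptions for illustration, not how any of these chips actually schedule work.

```python
from fractions import Fraction


def lane_utilisation(vector_width, scalar_lanes, vec_components, scalar_ops=0):
    """Fraction of an ALU's lanes kept busy by one issued operation.

    Toy model (an assumption): an ALU is `vector_width` vector lanes plus
    `scalar_lanes` scalar lanes, and an op occupies one lane per component.
    """
    if vec_components > vector_width or scalar_ops > scalar_lanes:
        raise ValueError("operation does not fit in one issue slot")
    return Fraction(vec_components + scalar_ops, vector_width + scalar_lanes)


if __name__ == "__main__":
    # Vec4 + scalar ALU (the Xenos-style layout from the Anand quote)
    print("RGB only:     ", lane_utilisation(4, 1, 3))      # 3/5
    print("RGBA:         ", lane_utilisation(4, 1, 4))      # 4/5
    print("RGBA + scalar:", lane_utilisation(4, 1, 4, 1))   # 1
    # Vec3 + scalar ALU (the R420 pixel ALU layout from the Anand quote)
    print("RGB only:     ", lane_utilisation(3, 1, 3))      # 3/4
```

In this model a plain RGB op leaves two of the five lanes on a vec4 + scalar ALU idle, which is the sense in which the extra width only pays off if the shader can find something like alpha or a co-issued scalar to fill it.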
 