Xbox2 graphics 10 times higher geometry perf. than X800 XT ?

Xmas · Nov 19, 2004

Ostsol said:
Doesn't ATI have 2 ALUs per pixel pipeline? 16x2 + 6 = 38. . .

One full ALU and one "mini-ALU", whatever that means.
Also, the VS pipes are vec4 + scalar currently.

R5x0 will reportedly have 48 "full" 3+1 ALUs, and I *guess* there will be 24 TMUs, or maybe 32.

All in all, I think 2.5 times the shader performance of R420 is an optimistic estimate (average, though VS bound scenes should fly).

Considering unified shader pipelines, I think it most likely won't be worth it until WGF, from a pure peak performance POV. Because currently, you only have to balance two loads, PS and VS, and you're most likely PS bound (on the PC platform at least). In WGF, you'll have to balance pre-tessellation and post-tessellation VS, geometry shader and PS. And you'll be able to write intermediate output streams, i.e. you are going to bypass some stages later.

E.g., if you tried to replace the VS and PS of NV40 with unified pipelines, you'd have to implement the special abilities of the VS as well as those of the PS, and on top of that the control logic to distribute the work loads. Overall, you might end up with 16 "unified" pipelines taking up the same transistor count as the 16+6 separate pipelines. The pipelines would be able to do a bit more per clock, but you most likely would end up with less performance on average.

But there might be other reasons for unified pipelines, like scalability and ease of design. And, of course, different requirements for different platforms.
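The 16+6 versus 16 "unified" trade-off above can be put into a toy throughput model. This is only a sketch: it assumes perfect load balancing on the unified side, treats all work as interchangeable "ops", and the function names and the `per_clock_gain` parameter are invented for illustration, not taken from any real hardware.

```python
def separate_time(ps_work, vs_work, ps_pipes=16, vs_pipes=6):
    """Time for dedicated pools: the slower pool sets the pace."""
    return max(ps_work / ps_pipes, vs_work / vs_pipes)


def unified_time(ps_work, vs_work, pipes=16, per_clock_gain=1.0):
    """Time for unified pipes: any pipe takes any work (ideal balancing assumed).

    per_clock_gain is a hypothetical factor for "a bit more per clock".
    """
    return (ps_work + vs_work) / (pipes * per_clock_gain)


if __name__ == "__main__":
    # Sweep the fraction of total work that is vertex shading.
    for vs_share in (0.05, 0.2, 0.5):
        ps, vs = 1.0 - vs_share, vs_share
        sep = separate_time(ps, vs)
        uni = unified_time(ps, vs)
        winner = "separate" if sep < uni else "unified"
        print(f"VS share {vs_share:.2f}: separate={sep:.4f} unified={uni:.4f} -> {winner}")
```

Under these assumptions the dedicated pools come out ahead while the load is PS bound, and the unified design only pulls ahead once the vertex share grows well past what six VS pipes can absorb.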

psurge · Nov 19, 2004

Well, that Xbox2 diagram was said to be outdated, so perhaps there are more shader units. The description given is pretty unclear in any case:

The shader core of the R500 was reported to have 48 Arithmetic Logic Units (ALUs) that can execute 64 simultaneous threads on groups of 64 vertices or pixels.

Can a group contain both vertices and pixels? How many such groups are in-flight?

Does each shader unit get its own group of vertices/pixels (requires 32KB per unit for registers alone), or do all 48 units operate on a single group at a time? If so, does each vertex/pixel or batch of vertices/pixels get its own instruction stream, or is it doing SIMD-style execution across all 64 elements at a time? How quickly can units switch between groups (every cycle, or...)?
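One way a 32KB figure could come about, as a back-of-the-envelope sketch: the register count, component count and precision below are assumptions chosen for illustration (32 four-component fp32 registers per element), since the post does not say how the number was derived.

```python
def register_bytes(elements=64, registers=32, components=4, bytes_per_component=4):
    """Register storage for one group of elements.

    All defaults except the group size of 64 are assumed values,
    not figures from the quoted R500 description.
    """
    return elements * registers * components * bytes_per_component


if __name__ == "__main__":
    per_unit = register_bytes()
    print(per_unit // 1024, "KB per unit")
    print(48 * per_unit // 1024, "KB if each of 48 units holds its own group")
```

With those assumptions, giving every one of the 48 units its own group would need about 1.5MB of register storage, which is part of why the question of one shared group versus one group per unit matters.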

compres · Nov 19, 2004

psurge said:
Well, that Xbox2 diagram was said to be outdated, so perhaps there are more shader units. The description given is pretty unclear in any case:

The shader core of the R500 was reported to have 48 Arithmetic Logic Units (ALUs) that can execute 64 simultaneous threads on groups of 64 vertices or pixels.

Can a group contain both vertices and pixels? How many such groups are in-flight?

Does each shader unit get its own group of vertices/pixels (requires 32KB per unit for registers alone), or do all 48 units operate on a single group at a time? If so, does each vertex/pixel or batch of vertices/pixels get its own instruction stream, or is it doing SIMD-style execution across all 64 elements at a time? How quickly can units switch between groups (every cycle, or...)?

Are you aware that most of these posts are based on speculation?

Your question seems inappropriate, since no one here knows that kind of detail (or those who do are under an NDA).

psurge · Nov 19, 2004

Yes, I am aware of that. But to me they are still interesting questions. I wasn't expecting any concrete answers, but was hoping for more interesting architectural speculation (such as why one approach is not feasible or desirable, etc.).

compres · Nov 19, 2004

OK, I see what you mean.

Since we are talking about a console here, they can do things differently from a PC add-in board, since they don't have to deal with compatibility issues. I'm guessing they are using a lot of general-purpose pipelines, but each one of them should be less capable at pixel shading than a dedicated pixel pipeline, as well as less capable at vertex shading than a dedicated vertex pipeline.

Mintmaster · Nov 19, 2004

psurge said:
Can a group contain both vertices and pixels? How many such groups are in-flight?

As compres says, it's all speculation, but my guess is that it'll switch back and forth between tasks, sort of like threads on a CPU (except with finer time division). It'll likely take a bunch of vertices, do transformation, clipping, shading, etc. to them and then store the results in a buffer. When the buffer fills up, it switches to pixel rasterization and shading.

The buffer may even be RAM as opposed to on chip to save die space, since vertex bandwidth is far less than texture/framebuffer/z-buffer bandwidth, and you'll be limited by triangle setup rate anyway.

Rapid branching is likely given that this is a forward-looking platform, so I don't think it'll take many cycles to switch back and forth. Splitting the load such that both pixels and vertices are processed simultaneously seems unnecessarily complicated to me.
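The switch-back-and-forth idea can be sketched as a small simulation. This is purely illustrative: the buffer capacity, the pixels-per-vertex ratio and the function name `run` are made up, and nothing here reflects how the actual hardware schedules work.

```python
from collections import deque


def run(vertices, buffer_capacity=4, pixels_per_vertex=3):
    """Alternate one shader array between vertex work and pixel work.

    buffer_capacity and pixels_per_vertex are hypothetical numbers.
    Returns a log of (phase, amount of work) tuples.
    """
    pending = deque(vertices)
    buffer = []
    log = []
    while pending:
        # Vertex phase: process vertices until the buffer fills or input runs out.
        while pending and len(buffer) < buffer_capacity:
            buffer.append(pending.popleft())
        log.append(("VS", len(buffer)))
        # Pixel phase: switch over and shade the pixels produced by the buffer.
        log.append(("PS", len(buffer) * pixels_per_vertex))
        buffer.clear()
    return log


if __name__ == "__main__":
    for phase, amount in run(range(10)):
        print(phase, amount)
```

At any moment the whole array is doing only one kind of work, which is the simplification argued for above; mixing vertices and pixels in flight would need extra logic to track both at once.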

Wunderchu · Nov 20, 2004

Megadrive1988 said:
it's about more geometry and more shader power.

Xbox2 will have to last 4-5 years and at least keep pace with another console that's been in development for ~5 years and has had billions of $ poured into it (PS3).

I agree

In 2008, Xbox 2 must still be able to output relatively good-looking graphics [much as games like Halo 2 and Dead or Alive Ultimate still look good today, even though the Xbox was released in 2001].

Frank · Nov 20, 2004

In my opinion, unified shaders are the step in-between the fixed-function pipeline model with additional computation freedom on sub-stages (the current model) and the total computational model, in which high-level curved polygons are morphed, shaded and skinned and directly broken into sub-pixel triangles, which have additional lighting, weighting and transparency calculations. Like the CELL model.

Wunderchu · Nov 21, 2004

DiGuru said:
In my opinion, unified shaders are the step in-between the fixed-function pipeline model with additional computation freedom on sub-stages (the current model) and the total computational model, in which high-level curved polygons are morphed, shaded and skinned and directly broken into sub-pixel triangles, which have additional lighting, weighting and transparency calculations. Like the CELL model.

Cool, I didn't realize before that that model was the CELL model. Thanks for the info.

dxp969 · Nov 27, 2004

Will the Revolution's GPU have many of the features that have been talked about in this thread (seeing as how it will come a year after the Xbox2 GPU, R500 and R520)? A PPP, for example. If yes, what kind of improvements or other hardware features would be ready for a GPU coming out in the 2006 time range, as opposed to the 2005 releases of R500 and R520?

Nintendo may not be after a lot of raw power, but they like to have highly efficient hardware and are usually good at keeping up to date with current hardware features. So I would think that if a PPP can improve efficiency, bringing out more polys in game, Nintendo would have it under consideration for the Revolution GPU, as they are sticklers for efficiency in hardware.
