1280x720/60fps - How many polygons are "enough"?

Acert93 said:
Originally Posted by expletive:
Ok, and just to clarify, if a pipe is doing a vertex op, it can't do a shader op, or can it do one of each simultaneously?

J


In a "traditional" GPU design you have Vertex Shader Units (VS) and Pixel Shader Units (PS).

PS do pixel shading; VS do vertex shading; and never the twain shall meet. In a Unified Shader Architecture (USA) you have Shader ALUs (for lack of a better word); these ALUs can work on either vertex or pixel shader code. Xenos does this through three shader arrays (each with 16 ALUs), and a dynamic scheduler can allocate each thread to VS or PS work, so overall the three arrays can be doing all VS, a combo of PS/VS, or all PS.

In a normal scene the vertex and pixel shader load changes. Current GPUs have a 4:2, 8:4, 16:10, etc... PS:VS ratio. Some games and scenes in a game (and even stages of a frame's rendering) are more VS dependent, others are more PS dependent. So while PS and VS are both working at the same time, in general the load between them shifts. Sometimes VS are maxed out while some PS pipes sit idle, and vice versa. The goal is to have as much pixel shading and vertex shading work going on at any given time as possible, rather than being locked to a static balance between PS and VS units.
Or the short version...yes. ;) A pipe works only on vertex or pixel shading, never both at the same time.
 
A Xenos pipeline can be running a pixel shader instruction on one clock cycle, and on the next clock cycle a vertex shader instruction.

The scheduler that controls this allows pixels and vertices to execute out of order. It uses buffers and various measurement techniques to decide what proportion of the clocks should be handed over to pixels or vertices. That's the automatic load balancing.

Obviously, triangles have to be generated before the pixels covering that triangle can be generated. So there is an ordering there. But triangles can vary in size, and the pixels on them can vary in complexity (i.e. does the shader take 10 instructions to run or does it take 100?).

The actual scheduling algorithm is unknown, and I'm feeling too lazy right now to explain how it might work (lunch beckons).

The main point is that the scheduler has to keep an eye on several queues and aim to ensure that no queue ever gets completely full or completely empty. Having complete freedom to prioritise individual batches of vertices or pixels makes that task possible!
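That said, a minimal sketch of the watermark idea might look like this (the queues, thresholds and decision rule here are entirely my invention, not ATI's actual algorithm):

```python
from collections import deque

# Hypothetical inter-stage queues; names and depths are made up.
vertex_q = deque()   # batches of vertices waiting for the ALUs
pixel_q  = deque()   # batches of pixels waiting for the ALUs

LOW, HIGH = 16, 48   # watermarks: never let a queue run dry or overflow

def next_batch_type():
    # If the pixel queue is running dry, shade vertices so the
    # rasteriser can generate more pixels downstream.
    if len(pixel_q) < LOW and vertex_q:
        return "vertex"
    # If the vertex queue is about to overflow, drain pixels.
    if len(vertex_q) >= HIGH and pixel_q:
        return "pixel"
    # Otherwise work on whichever queue is deeper.
    return "pixel" if len(pixel_q) >= len(vertex_q) else "vertex"
```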

Jawed
 
Acert93 said:
In a "traditional" GPU design you have Vertex Shader Units (VS) and Pixel Shader Units (PS).

PS do pixel shading; VS do vertex shading; and never the twain shall meet. In a Unified Shader Architecture (USA) you have Shader ALUs (for lack of a better word); these ALUs can work on either vertex or pixel shader code. Xenos does this through three shader arrays (each with 16 ALUs), and a dynamic scheduler can allocate each thread to VS or PS work, so overall the three arrays can be doing all VS, a combo of PS/VS, or all PS.

In a normal scene the vertex and pixel shader load changes. Current GPUs have a 4:2, 8:4, 16:10, etc... PS:VS ratio. Some games and scenes in a game (and even stages of a frame's rendering) are more VS dependent, others are more PS dependent. So while PS and VS are both working at the same time, in general the load between them shifts. Sometimes VS are maxed out while some PS pipes sit idle, and vice versa. The goal is to have as much pixel shading and vertex shading work going on at any given time as possible, rather than being locked to a static balance between PS and VS units.

My understanding of Xenos and the 550MHz G70 is that they are both roughly at 50 billion shader ops a second. If Xenos loses shader ops as the geometric complexity of a scene increases, wouldn't the G70 have a big advantage, since its vertex pipes don't encroach on its shader pipes?

I feel like I must be missing something, because the impression I get is that no one really feels that one card is significantly more powerful than the other (and if they do, they seem to prefer Xenos).

I guess the main question at the crux of this is:

Does one vertex pipe in the G70 = a US pipe doing vertex calculations in Xenos
and
Does one shader pipe in the G70 = a US pipe doing shader operations in Xenos

If they are roughly equivalent, then I understand you're talking about 48 pipes instead of 32.

One more Q: does Xenos have to assign a minimum of one array (16 pipes) to either vertex or shader ops?

Thanks.

J
 
It will really depend on the situation, but in theory the difference is that the maximum vertex and pixel shader performance on RSX is always fixed. On Xenos, the maximums are a range, so to speak. The vertex performance can be improved at the cost of the pixel shader performance and vice versa.
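To put toy numbers on that (a fixed 24 PS + 8 VS split versus 48 unified ALUs; the demand figures are invented just to show the effect):

```python
# Toy per-clock utilisation model; all demand numbers are invented.
workloads = [      # (vertex ALUs wanted, pixel ALUs wanted)
    (4, 40),       # pixel-heavy scene
    (20, 20),      # balanced
    (30, 10),      # vertex-heavy pass (dense geometry)
]

def fixed_split(v, p, vs=8, ps=24):
    return min(v, vs) + min(p, ps)   # each pool only serves its own kind

def unified(v, p, total=48):
    return min(v + p, total)         # one pool serves whatever shows up

for v, p in workloads:
    print(f"VS={v:2} PS={p:2}  fixed: {fixed_split(v, p):2}/32"
          f"  unified: {unified(v, p):2}/48")
```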

Personally, I can't see many instances when we'll be vertex limited, since these monsters can push more polygons than we'll ever be able to see.

In the end, after the dust has settled, after each fanboi has explained why each chip is 109 times more powerful than the other, it all boils down to this: the developers will develop to each architecture strength. MGS4 will look amazing on either platform. UE3 will look amazing on either platform.

And ultimately, both platforms will be CPU, bandwidth and fillrate limited most of the time (and starved for RAM), regardless of vertex or pixel shading performance, regardless of 2TFlops, regardless of bloody Madre Teresa.
 
expletive said:
Does one vertex pipe in the G70 = a US pipe doing vertex calculations in Xenos
and
Does one shader pipe in the G70 = a US pipe doing shader operations in Xenos
All things being equal, yes. A unified shader works like a dedicated shader except that it can switch between vertex and pixel shading. Other than that they do the same job in the same way. Of course, differences in the specifics of the two GPUs will mean one's shaders will have advantages and disadvantages over the other's at certain jobs, but I don't know what these are.
One more Q, does the xenos have to assign a minimum of one array (16 pipes) to either vertex or shader ops?
Yes. As such, if you need fewer than 16 vertex/pixel operations in a clock, it still takes up all 16 shaders to achieve that and some sit idle. This is a consideration of granularity: what's the smallest group size to work with while keeping scheduler overhead and the like minimal. Going with independent ALUs would mean no wasted shading power but much higher costs for organizing their operation. ATI's research found three groups of 16 was optimal.
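To put a rough number on that granularity cost (16-wide arrays as per Dave's article; the batch sizes are just examples):

```python
import math

ARRAY_WIDTH = 16  # ALUs that get allocated together

def alus_occupied(ops_needed):
    # A batch ties up whole groups of 16, even if it can't fill them.
    return math.ceil(ops_needed / ARRAY_WIDTH) * ARRAY_WIDTH

for ops in (5, 16, 20, 48):
    used = alus_occupied(ops)
    print(f"{ops:2} ops -> {used:2} ALUs occupied, {used - ops:2} idle")
```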
 
expletive said:
My understanding of Xenos and the 550MHz G70 is that they are both roughly at 50 billion shader ops a second.
Shader operations/sec is a useless benchmark.

If Xenos loses shader ops as the geometric complexity of a scene increases, wouldn't the G70 have a big advantage, since its vertex pipes don't encroach on its shader pipes?
Shader Ops = Pixel & Vertex Shader Operations. Triangle setup is different from vertex transformation. Basically Xenos can set up 500M triangles--one per clock--without using any shading power. (That is my understanding of it, at least.) EDIT: To answer your question, *if* Xenos and G70 had the same ALU resources for shading overall, Xenos would not be losing anything (theoretically it would gain, based on the efficiency advantage over a fixed ratio). If Xenos and G70 both had 48 ALUs for shading, both would need to dedicate silicon (ALUs) to this task. The difference is one is static in its approach and one is dynamic.

That all assumes they have the same number of ALUs and that the scheduling, load balancing, and efficiency gains work and are not broken.
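The 500M figure is just the clock rate times one triangle per clock; the per-frame number is my own back-of-envelope addition:

```python
clock_hz = 500_000_000               # Xenos core clock, 500MHz
tris_per_sec = clock_hz * 1          # one triangle setup per clock, peak
tris_per_frame = tris_per_sec // 60  # at a 60fps target

print(f"{tris_per_sec:,} tris/sec peak")      # 500,000,000
print(f"{tris_per_frame:,} tris/frame peak")  # 8,333,333
```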

Does one vertex pipe in the G70 = a US pipe doing vertex calculations in Xenos
Pipeline != Fragment Shader != ALU != Shader Array.

Go to Dave's Xenos article and read it again. It will answer these questions to a degree. It will at least give you a framework to ask questions in (you will realize that comparing pipelines to fragment shaders to ALUs to shader arrays is nonsense at this point).

One more Q, does the xenos have to assign a minimum of one array (16 pipes) to either vertex or shader ops?
No.

3:0 PS:VS
2:1 PS:VS
1:2 PS:VS
0:3 PS:VS

Again, this is answered in Dave's article. It REALLY would benefit you to understand how Xenos works--and how a traditional GPU works--to understand the similarities and contrasts. Once you understand how they are the same, and yet different, you can look at how a game renders a scene.

It is a bit of work, but once you read up on that kind of stuff the actual hardware makes more sense. It really is not a matter of "my card has X number of pipes at X MHz so it is faster". Architecture is really important.
 
Hey Acert. We've both given two different answers here! Of course, we've both interpreted the question differently. I wonder what Expletive really wants to know? :mrgreen:
 
london-boy said:
Normal maps are baked from models with millions of polygons, yes. I'm not sure what you mean by "Isn't that what the industry is trying to head to???". You mean the "millions of polys" or the normal maps?
I think the industry is trying to head towards displacement maps, or real geometry, though I could be wrong. Normal maps will stick around for a looong time, and they should, given the performance.

Oops! Sorry for the fragmented post...

I meant real geometry, regarding where the industry is headed.
 
Shifty Geezer said:
Hey Acert. We've both given two different answers here! Of course, we've both interpreted the question differently. I wonder what Expletive really wants to know? :mrgreen:

Hehe, well regardless I think you guys covered all the bases.

Let me try to simplify my thoughts (operative word being try).

Is the # of polys per scene a zer sumo game with vertex and shader operations?

I.e. the more polygons per scene, the fewer vertex and shader ops Xenos has.

J
 
Shifty Geezer said:
Hey Acert. We've both given two different answers here! Of course, we've both interpreted the question differently. I wonder what Expletive really wants to know? :mrgreen:
I think we are trying to figure out his question...

I think before he can draw G70/Xenos comparisons or ask those questions, it is important to know how both work. It is really counterproductive to ask questions that Dave's article already answers technically. From an application standpoint I think more will be known, at least about some of it (like scheduling), come October 5th. Other stuff we have discussed before (like vertex:pixel shader load shifting depending on game-to-game balance, scene-to-scene balance, even parts of a single frame's rendering), and it can only really be discussed once the architecture is understood.

And even then, pitting it as "G70 vs Xenos" really REALLY is asking for a flame war in most cases. Not that it could not be a good discussion, but the forum (*cough* consoles *cough*) is not as conducive to it. That is why I think this is almost better discussed in the 3D Technology Forum. The problem (3D rendering), issues (pluses and minuses of current architectures), and new solutions (hardware, and how it tackles the software differently, and what gains and losses we get) are really best discussed technically on their own merits. There is really a logical flow in dealing with the issue, but the console forum invites a LOT of side conflicts that make it hard to discuss.

Oh well. If you can weed through it, you can find the good stuff :D

Ps- Where we gave different answers was on equivalence. You were tackling it from the "do they do the same work" angle, where I was tackling it from the "they are not a 1:1 comparison" angle (e.g. fragment shaders != pipelines, because a pipeline is more powerful in many circumstances).

Basically it depended on WHAT he wanted out of the question.
 
LunchBox said:
Oops! Sorry for the fragmented post...

I meant real geometry, regarding where the industry is headed.

Well, it really depends. Today's top CGI obviously has more geometry than any game. But a wall will still be a flat surface. The tiny little grooves you might encounter in a wall will not be fully 3D for a long time, if ever.
Shaders will rule our world for a long long long time, and we already have machines capable of calculating more polygons than they can show us.

Until it's cheaper to move a 2M-poly model than it is to move a 100K one with normal maps that looks pretty much the same, we'll stick to faked geometry. The 100K model will become 200K in time, to the point where we can't distinguish the two versions of the model - could be 2M and 1M, I don't know.

Displacement maps will help, like parallax mapping is helping a lot too.
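Some back-of-envelope numbers on why (vertex size and texture format are my own assumptions, purely for illustration):

```python
# Rough memory cost comparison; all sizes are assumed for illustration.
BYTES_PER_VERT = 32                     # position + normal + tangent + UV
MiB = 2**20

full_mesh  = 2_000_000 * BYTES_PER_VERT          # the 2M-poly version
game_mesh  =   100_000 * BYTES_PER_VERT          # the 100K-poly version
normal_map = 1024 * 1024 * 4                     # 1024^2, 4 bytes/texel

print(f"2M mesh:          {full_mesh / MiB:5.1f} MiB")
print(f"100K mesh + map:  {(game_mesh + normal_map) / MiB:5.1f} MiB")
```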
 
Completely crap post removed - I'm obviously not thinking straight today. Hahahaha

Jawed
 
expletive said:
Is the # of polys per scene a zer sumo game with vertex and shader operations?
No offense intended: a complete sentence with recognizable words will help us understand your question better. Right now I can BARELY make out what you are saying, let alone know what you mean.

I.e. the more polygons per scene, the fewer vertex and shader ops Xenos has.
Depends on the game and what you are trying to do. But a lot of polygons means a lot of vertex work in general. Think of the old days when the CPU did the transformation and lighting (T&L) and the 3D chip (not even a GPU!) did the hardwired pixel tasks. It had a hard limit on the number of triangles it could work with, but the CPU was doing all the actual work in regards to vertices.
 
Acert93 said:
No offense intended: a complete sentence with recognizable words will help us understand your question better. Right now I can BARELY make out what you are saying, let alone know what you mean.

Depends on the game and what you are trying to do. But a lot of polygons means a lot of vertex work in general. Think of the old days when the CPU did the transformation and lighting (T&L) and the 3D chip (not even a GPU!) did the hardwired pixel tasks. It had a hard limit on the number of triangles it could work with, but the CPU was doing all the actual work in regards to vertices.


Ya, sorry, I posted that last one on my way out the door and now realize on its own it doesn't make much sense.

I'm going to go off and read everything suggested before going on with this.

I'm just trying to understand the relationship between two specs: the 500 million triangles/sec and the 50 billion shader ops/sec. Understanding these are theoretical maximums, I'm just trying to understand how pushing one towards its maximum impacts the other (or if they even impact each other at all).

That said, I'll read what's available here and hopefully it's all there. Thanks for attempting to decrypt thus far.

J
 
LunchBox said:
I meant real geometry on where the industry is headed.

Well, I'm not sure about ongoing research, but the practical side is that we'd rather not get into multi-million polygon meshes yet.

First there's the hardware limit. There are a few new releases of the traditional 3D apps that will finally support 64-bit hardware and WinXP, which will finally make working with 5+ million polygon scenes actually possible. But loading all that data from disk, and especially moving it around the network when you send it to the renderfarm, would still be a nightmare. Large movie VFX studios like ILM are already close to the limits of storage technology, as far as I've heard.
And even with 64-bit, manipulating an order of magnitude (or more) more data will be slow, even for something as simple as moving objects around. Most high-end studios nowadays actually try to minimize working with the real data and use highly simplified representations. PRMan, for example, supports things like archiving out an interpreted RIB file of your geometry (static or animated) to speed up the scene's parsing. It also allows several mechanisms to generate data at rendering time; even something as complex as an animated character can be built and articulated from a sort of "parts library". The hard truth is that artists tend to push the technology so far that the 3D application barely manages to handle one detailed character at a time, or about a dozen medium-detailed ones. Crowd scenes require techniques like the ones I've mentioned above.

So the tough stuff comes with the workflow. You want to animate those models: bind them to a skeleton and paint the weights for each vertex, apply simulations, morph between blendshapes, etc. This would be close to impossible with a model that has millions of vertices... the current tools were not developed with such complexity in mind. Just think about how much disk space a facial blendshape library would take up - a hundred shapes are pretty common for a single character, even in everyday movie VFX!
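With invented figures, the blendshape math looks like this:

```python
# Invented example: a film-res head with 100 facial blendshapes,
# each storing a float32 xyz offset per vertex.
vertices = 1_000_000
shapes   = 100
bytes_per_shape = vertices * 3 * 4          # 12 bytes per vertex

total_gib = shapes * bytes_per_shape / 2**30
print(f"{total_gib:.1f} GiB for one character's shape library")  # ~1.1
```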

The trend nowadays is to work with a relatively simple mesh, somewhere between 5,000 and 200,000 polys, that gets subdivided at render time, with static detail added through displacement and normal maps. Research goes into applications like Zbrush, which can replace the time-consuming process of modeling something from clay, scanning it, then rebuilding an animation-friendly mesh from the point-cloud data and extracting displacement textures for it. The whole workflow can still use a lot of streamlining, even if it can now be done on a single PC-based workstation with off-the-shelf software.
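And the render-time subdivision point is easy to quantify: each subdivision level roughly quadruples the face count, so a modest cage gets dense fast (the cage sizes below are just examples):

```python
def faces_after(cage_faces, levels):
    # Catmull-Clark-style subdivision: each face splits into four per level.
    return cage_faces * 4 ** levels

for cage in (5_000, 50_000, 200_000):
    print(f"{cage:>7,} face cage -> {faces_after(cage, 3):>12,} at 3 levels")
```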


Of course, using a 20-25 million poly mesh (it seems to be the sweet spot for a highly realistic character) and having the ability to manipulate it would be very cool. But the hardware isn't strong enough for it, nor are there any clues on how it would be possible, AFAIK. We'll probably see a lot more displacement stuff in the upcoming years instead - so PS4/X3 will probably concentrate on HOS and subpixel tessellation/displacement, too. But that's like 5-6 years away :)
 
Good info Laa-Yosh. It sounds like tools need to mature a bit before we get massive meshes.
Laa-Yosh said:
Research goes into applications like Zbrush, which can replace the time-consuming process of modeling something from clay, scanning it, then rebuilding an animation-friendly mesh from the point-cloud data and extracting displacement textures for it. The whole workflow can still use a lot of streamlining, even if it can now be done on a single PC-based workstation with off-the-shelf software.
Zbrush is sooo cool! Just had to say that :) I have seen a couple of demos. It looks fun to use and user-friendly. I wish I were an artist, because Zbrush looks like an awesome creative palette!
 