Pixel or Vertex more important, looking forward?

Vertex or Pixel Shading Prowess of greater importance?

  • Pixel Shading
  • Balance between the two
  • or, like me, have no clue

  Total voters: 232
Oh, I think NVIDIA will eventually "unify" the two. You wouldn't want to talk to Kirk when he knows almost everything he says to media outlets would be published.

As for the subject title: more is always better. However, we're looking at the synthetic performance of every card's pixel and vertex shading power, which means we're measuring true 3D performance. Stressing the subject title's "looking forward" phrase, I'd have to say the concentration should be on pixel shading power (just my opinion, of course). We hardly see complex vertex shaders in games at the moment. The rate of CPU speed progress has an impact, of course, but since we don't really need complex vertex shaders to set up pixel shaders for good "visual impact", what's the point of powerful (and abundant) vertex units when they won't be taken advantage of in the only thing that really matters, games? Polygon throughput isn't a problem anyway.
 
Chalnoth said:
I'm assuming you're talking about dynamic allocation of a "unified" pipeline to process either vertex or pixel data? Well, if this is the case, then there's really not much need to make this allocation choice anything complex, as long as the queueing/switching structure is implemented in hardware.

One could, for instance, merely have one quad of pipelines act upon one stream of data at a time. For example, one triangle goes in, all vertex calculations and then all pixel calculations on that triangle are done, and you output the results.

Or, alternatively, you could simply have an input queue, a loopback device (a way for the output of the pipeline to be re-inserted into the queue), and a way to quickly and easily switch between vertex and pixel processing (some sort of two-state pipeline system). This might be more amenable to cache coherency and, therefore, memory bandwidth usage, as all pipelines could possibly share the same caches more easily.
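For illustration only, here's a toy CPU-side sketch of that second scheme (an input queue, a loopback, and a two-state tag on each piece of work). The Stage and WorkItem names are made up for the example; real hardware scheduling would obviously be far more involved.

```cpp
#include <cstdio>
#include <queue>

// Made-up work tag for a unified pipe: a queue entry is either a batch of
// vertex work or the pixel work that batch produced.
enum class Stage { Vertex, Pixel };

struct WorkItem {
    Stage stage;
    int   triangleId;   // which triangle this work belongs to
};

// One "two-state" unified pipe: drain the input queue, and loop finished
// vertex work back into the same queue as pixel work.
void runUnifiedPipe(std::queue<WorkItem>& q) {
    while (!q.empty()) {
        WorkItem item = q.front();
        q.pop();
        if (item.stage == Stage::Vertex) {
            std::printf("vertex work for triangle %d\n", item.triangleId);
            q.push({Stage::Pixel, item.triangleId});   // loopback
        } else {
            std::printf("pixel work for triangle %d\n", item.triangleId);
        }
    }
}

int main() {
    std::queue<WorkItem> q;
    for (int t = 0; t < 3; ++t)
        q.push({Stage::Vertex, t});   // three triangles enter the pipe
    runUnifiedPipe(q);
}
```

The point is just that the switch between the two states can be as dumb as a tag on each queue entry; whether that's good enough for real workloads is exactly the open question.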
I certainly don't speak for anyone else on the topic, but I've given up designing nVidia's and ATi's hardware and drivers for them. Those things sound reasonable to me, but on the other hand, I don't know the implications of those schemes. If a simple algorithm works quite well for all cases, awesome. If not, that's fine too.

Until then, I'm not going to presume myself smart enough to answer the question on their behalf. So I hold out the possibility of driver intervention until proven otherwise.

But again, that's just me.
 
Well, the only real question that needs to be answered on whether they should be unified is one of efficiency.

That is, unifying the pipelines can be seen as taking the current pixel pipelines and making each one require more transistors. So, in the end, if the efficiency improvement from load balancing between vertex and pixel processing turns out to be a win over the extra transistors required to do it (assuming the same total number of pipelines, of course), such that an architecture with unified pipes is faster than a more traditional programmable architecture with more pipelines, then it should be done. Otherwise not.
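As a back-of-the-envelope illustration of that tradeoff (every pipe count and work figure below is invented, purely to show the shape of the argument):

```cpp
#include <algorithm>
#include <cstdio>

// Frame time (arbitrary units) for a split design: the slower pool is the
// bottleneck while the other pool sits partly idle.
double splitTime(double vertexWork, double pixelWork, int vertexPipes, int pixelPipes) {
    return std::max(vertexWork / vertexPipes, pixelWork / pixelPipes);
}

// Frame time for a unified design: every pipe can take whichever work exists,
// so the total work is divided across a smaller number of fatter pipes.
double unifiedTime(double vertexWork, double pixelWork, int unifiedPipes) {
    return (vertexWork + pixelWork) / unifiedPipes;
}

int main() {
    // Hypothetical transistor budget: 6 vertex + 16 pixel pipes, versus 18
    // unified pipes that each cost a little more.
    const int nv = 6, np = 16, nu = 18;

    // A pixel-heavy colour pass and an almost purely vertex-bound shadow pass.
    std::printf("colour pass: split %.2f  unified %.2f\n",
                splitTime(10, 100, nv, np), unifiedTime(10, 100, nu));
    std::printf("shadow pass: split %.2f  unified %.2f\n",
                splitTime(60, 5, nv, np), unifiedTime(60, 5, nu));
}
```

With numbers like those the unified design barely wins the pixel-heavy pass but wins the vertex-heavy pass by a wide margin; whether real workloads and real transistor costs actually look like that is the whole question.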

That stated, the next question to ask is: will content in the future be better suited to a unified pipeline design? I mean, it seems to me that it might be easier to have more flexible programmability features in a unified design, but there's no reason this needs to be limited to a unified design. So what, in the future, might make a unified design more desirable than it would be now?
 
Reverend, that is until the hardware has to render shadows ... and all of a sudden it is completely vertex limited.
 
MfA said:
Reverend, that is until the hardware has to render shadows ... and all of a sudden it is completely vertex limited.
Actually, I think that's probably the best argument for unification of the pipelines. After all, if shadow maps come into wide usage, there's no getting around the fact that if the card is balanced for "normal" rendering, rendering the shadow map will be vastly more vertex-limited...
 
Reverend said:
So what was your position at NVIDIA and ATI?
Oops. That was a tongue-in-cheek remark. I've never worked at either company.

I was referring to the people who casually make huge, sweeping design changes like, "I should be able to read in the render target's current value at my current pixel," or "Why can't we have direct access to the Z buffer?" Etc. I've seen so many of those questions get shot down, I try not to make them.

Or put more simply, what makes sense to me or would be nice for me as a graphics programmer has no relation to what's feasible with graphics hardware.
 
Scali said:
It's hard to translate this back to Direct3D hardware at the moment, since it works so differently.

I believe that you should first follow your own conclusion... but if you're still trying to twist everything into a way that can support your ideas, than do your homework, too.

But if we look at the general idea, then we see that a REYES renderer is very much geometry-based, and doesn't try to do a lot of per-pixel hacks.
So, I suppose if we want to get closer to the realism of a REYES renderer with current hardware, we should just use more polygons. However, in our case this does not mean we should do all shading per vertex; we should still use per-pixel shading, since that will give better quality with the amount of polygons we can handle in the near future.
Which is more or less my first point in my first post.

Of course Reyes is geometry oriented - their first guideline was to produce realistic detail levels for movie effects.
But adopting the Reyes way requires the ability to work with micropolygons; there really is no other way. PRMan simply chugs on high-poly scenes, because it will generate huge amounts of scene data (static and animated) - it prefers NURBS, subdivs and other parametric primitives, because they're compact and can be detailed by tessellating and displacing them.
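(Just to make the "detailed by tessellating and displacing" point concrete, here's a toy dicer: a flat bilinear patch split into a grid of micropolygon vertices and pushed around by a made-up bump function. It bears no resemblance to PRMan's actual dicing, it's only the general idea.)

```cpp
#include <cmath>
#include <cstdio>

struct Vec3 { float x, y, z; };

// Evaluate a simple bilinear patch from its four corners, a stand-in for the
// NURBS/subdiv evaluation a real Reyes dicer would perform.
Vec3 evalPatch(const Vec3 c[4], float u, float v) {
    auto lerp = [](const Vec3& a, const Vec3& b, float t) {
        return Vec3{a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t, a.z + (b.z - a.z) * t};
    };
    return lerp(lerp(c[0], c[1], u), lerp(c[2], c[3], u), v);
}

int main() {
    const Vec3 corners[4] = {{0, 0, 0}, {1, 0, 0}, {0, 1, 0}, {1, 1, 0}};
    const int grid = 8;   // dice rate; a real renderer picks this from screen-space size

    for (int j = 0; j <= grid; ++j) {
        for (int i = 0; i <= grid; ++i) {
            float u = float(i) / grid;
            float v = float(j) / grid;
            Vec3 p = evalPatch(corners, u, v);
            // Displace along the (flat) patch normal with an invented bump
            // function: this is where detail the control mesh never had appears.
            p.z += 0.05f * std::sin(10.0f * u) * std::cos(10.0f * v);
            std::printf("%.3f %.3f %.3f\n", p.x, p.y, p.z);
        }
    }
}
```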

And have you read nothing that I wrote? Pushing more polygons into the source art introduces a lot of complications through the pipeline, from UV mapping through rigging through skinning/dynamics calculations on the GPU. Games won't get too far beyond the source art detail displayed in current high-end techdemos; they should rather switch to HOS too, as soon as the primitive processor gets introduced in next-gen hardware. But that won't add shading detail, only smooth out the curves, so either displacement mapping or normal mapping has to be used...

D'oh, I'm wasting my time explaining things a second time. Please add 2 and 2 together on your own.
 
What about that elusive programmable primitive processor?

Or, what about abandoning the OGL pipeline and moving on to something like REYES, which might be more robust for real time? When would that happen?
 
I don't think it's all that realistic to ask developers to move away from OpenGL-style programming for 3D graphics. If you want to add any new functionality, you really need to add it on as a modification to current API's...
 
Games won't get too far beyond the source art detail displayed in current high-end techdemos; they should rather switch to HOS too
How very true. I'm really surprised that neither Nvidia nor ATi seem to have shown much interest in implementing robust HOS solutions (NURBS and Sub-Ds) in hardware. It seems such a logical step to take and would allow for much better scalability between low-end and high-end hardware, as well as making things so much easier for the artists. There's a good reason film 3D gets produced with HOSs: when you hit high detail levels (and current cards can push a lot of polys) they are faster, easier and all-round nicer to work with.
 
I don't think it's all that realistic to ask developers to move away from OpenGL-style programming for 3D graphics. If you want to add any new functionality, you really need to add it on as a modification to current API's...

What do you mean by OGL-style programming? There's nothing special about OGL-style programming. I was talking about the hardware side, not the software side.

The way developers program is dictated by the hardware, so if the hardware changes to give better-quality CG images, I'm sure there will be developers that adopt it to give themselves an edge over the others.

The question then becomes: what sort of hardware will it be? Is current or near-future (3 years) silicon technology viable?
 
MfA said:
Well I'd like to see occlusion culling move to the graphics card. It could be done relatively painlessly in OpenGL using linked display lists (just add a way to do optional rendering of a display list based on a bounding box test ... you would translate the scenegraph to a tree of display lists, display lists can call each other, and tell the graphics card to start at the root). Display lists have gone totally out of fashion though, so I don't see it happening.

This might be available sooner than you think. Nvidia has an experimental NV_conditional_render extension for OpenGL (not available in any public driver) that, as far as I can tell, allows the GPU to skip a marked block of rendering commands based on a conditional. Presumably (I haven't used the extension) this allows you to issue an occlusion query for each object and then tell the GPU to only execute the commands for rendering it if the occlusion test returns that more than 0 fragments pass. This is actually probably MORE efficient than what you proposed since you can use much more data on the CPU to avoid unnecessary occlusion tests by exploiting temporal coherence.
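Something along these lines is roughly what I'd expect the usage to look like. Pure speculation, since the extension isn't public; drawBoundingBox and drawObject are placeholder callbacks, and the query object and extension entry points are assumed to be set up already.

```cpp
#define GL_GLEXT_PROTOTYPES
#include <GL/gl.h>
#include <GL/glext.h>

// Cull one object on the GPU: count the fragments its bounding box would
// produce, then let NV_conditional_render skip the real draw if none passed.
void drawWithGpuCulling(GLuint query, void (*drawBoundingBox)(), void (*drawObject)())
{
    // 1. Render the bounding box with colour and depth writes disabled,
    //    counting how many samples pass the depth test.
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
    glDepthMask(GL_FALSE);
    glBeginQueryARB(GL_SAMPLES_PASSED_ARB, query);
    drawBoundingBox();
    glEndQueryARB(GL_SAMPLES_PASSED_ARB);
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glDepthMask(GL_TRUE);

    // 2. Issue the real draw conditionally: the GPU discards the commands if
    //    the query returned zero samples, with no CPU read-back or stall.
    glBeginConditionalRenderNV(query, GL_QUERY_WAIT_NV);
    drawObject();
    glEndConditionalRenderNV();
}
```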

As for the original question, in the near future I think pixel shading power will be more important (at least as far as games are concerned). From a pure content creation perspective, it's hard to generate enough triangles for geometry to become a bottleneck on modern hardware, assuming you go the extra mile and add good hierarchical frustum and occlusion culling and LOD to your rendering engine.

And it is possible to do subdivision surfaces in hardware on current GPUs by using a fragment program and rendering to a vertex buffer. You can do this in OpenGL on nvidia hw right now if you feel so inclined. ATI doesn't support the required extension but is waiting to implement the ARB version of sorts (it's really a new system for render to texture, but anyway).
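For the curious, the render-to-vertex-buffer trick roughly amounts to copying a float framebuffer into a buffer object and then reinterpreting that buffer as vertex data. The sketch below uses the ARB pixel/vertex buffer object path rather than whatever NVIDIA-specific extension is actually required, and it leaves out the float render target and the subdivision fragment program themselves, so treat it as a rough outline rather than working code for current drivers.

```cpp
#define GL_GLEXT_PROTOTYPES
#include <GL/gl.h>
#include <GL/glext.h>

// Assumes the current framebuffer holds one RGB float position per refined
// vertex, written there by the subdivision fragment program.
void drawRefinedVertices(GLuint buffer, int width, int height)
{
    // Copy the framebuffer contents into the buffer object; with a pixel pack
    // buffer bound, glReadPixels writes into the buffer instead of client memory.
    glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, buffer);
    glBufferDataARB(GL_PIXEL_PACK_BUFFER_ARB,
                    width * height * 3 * sizeof(GLfloat), 0, GL_STREAM_COPY_ARB);
    glReadPixels(0, 0, width, height, GL_RGB, GL_FLOAT, 0);
    glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, 0);

    // Rebind the same buffer as a vertex array and draw from it; the data
    // never has to travel back through the CPU.
    glBindBufferARB(GL_ARRAY_BUFFER_ARB, buffer);
    glVertexPointer(3, GL_FLOAT, 0, 0);
    glEnableClientState(GL_VERTEX_ARRAY);
    glDrawArrays(GL_POINTS, 0, width * height);
    glDisableClientState(GL_VERTEX_ARRAY);
    glBindBufferARB(GL_ARRAY_BUFFER_ARB, 0);
}
```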
 
Laa-Yosh said:
I believe that you should first follow your own conclusion... but if you're still trying to twist everything into a way that can support your ideas, than do your homework, too.

Excuse me?!
At any rate, you brought up PRMan yourself, so you were trying to say something about how PRMan relates to the hardware mentioned in this thread.
Oh, and it's 'then', not 'than'. If you want to make posts in English, then do your homework, too.

And have you read nothing that I wrote? Pushing more polygons into the source art introduces a lot of complications through the pipeline, from UV mapping through rigging through skinning/dynamics calculations on the GPU. Games won't get too far beyond the source art detail displayed in current high-end techdemos; they should rather switch to HOS too, as soon as the primitive processor gets introduced in next-gen hardware. But that won't add shading detail, only smooth out the curves, so either displacement mapping or normal mapping has to be used...

Why do you assume I didn't read it? I just didn't comment on it because, well, we don't HAVE subdivision hardware, so our only choice is to increase polycount. Obviously we will have to move to some kind of subdivision system eventually, perhaps even a complete REYES system in hardware... But that's not going to happen anytime soon.
As I said before, for now I think more vertex shader power is the best way to increase geometry detail and animation (of course eventually you'll run into memory bandwidth problems, so then that would need to be improved as well... And of course subdivision surfaces are a better solution there, but I don't see that happening anytime soon).
3DMark05 also seems to make a huge leap in polycount from the last version. Some of their screenshots could actually be mistaken for Pixar renders.

D'oh, I'm wasting my time explaining things a second time. Please add 2 and 2 together on your own.

Excuse me?!

What is your problem? Why are you being so rude and arrogant, and making all kinds of weird assumptions?
 
Meh, I prefer front to back rendering over temporal coherence with screen based occlusion culling ... much more generally applicable. Worst case it culls just as much, 99% of the time it does better.
 
MfA said:
Meh, I prefer front to back rendering over temporal coherence with screen based occlusion culling ... much more generally applicable. Worst case it culls just as much, 99% of the time it does better.
Well, sure, but if you're losing on batch size in order to cull more, it may not be a win.
 
V3 said:
What about that elusive programmable primitive processor ?

I think what he was trying to say is that a PPP today would be useful only in OpenGL; for D3D it needs API support, and that's not going to come prior to WGF-"whatever". IHVs are most likely not going to invest hardware resources in a unit (or number of units) if the necessary API support isn't there.

The question then becomes: what sort of hardware will it be? Is current or near-future (3 years) silicon technology viable?

WGF mentions a geometry shader. I'm just not sure if that includes a tessellation unit, or if it's a separate or even optional unit.

However, it seems that we'll get advanced HOS support with WGF, and IHVs should be long past chalkboard designs when considering those types of future architectures. I wouldn't expect anything before NV6x/R6xx though, to be honest.

GameCat,

Presumably (I haven't used the extension) this allows you to issue an occlusion query for each object and then tell the GPU to only execute the commands for rendering it if the occlusion test returns that more than 0 fragments pass.

I've heard of experiments with early object culling before, but I've no idea if there's any relevance to said extension.
 
MfA said:
Reverend, that is until the hardware has to render shadows ... and all of a sudden it is completely vertex limited.
What shadow algorithm are you talking about? What about fillrate?

With stencils, fillrate scales with resolution because the shadow polygons become bigger, so you can be, er, vertex-limited. With buffers (or projectors), shadow resolution is independent of screen resolution.

You honestly cannot separate pixels and vertices from fillrate requirements at any given resolution.
 
Reverend said:
MfA said:
Reverend, that is until the hardware has to render shadows ... and all of a sudden it is completely vertex limited.
What shadow algorithm are you talking about? What about fillrate?

With stencils, fillrate scales with resolution because the shadow polygons become bigger, so you can be, er, vertex-limited. With buffers (or projectors), shadow resolution is independent of screen resolution.

You honestly cannot separate pixels and vertices from fillrate requirements at any given resolution.

He's talking about depth maps; typically you effectively have no pixel shader at all when drawing the depth map.

Personally I think this is a non-argument. IME you end up about 50/50 fill/vertex limited with the basic algorithm at reasonable resolutions, and I believe going forward you will want to store additional information with the raw depth value, and that requires pixel processing.
 