Future of fragment shaders

Enbar

Newcomer
I'm going to talk in terms of fragment shaders here, but I think (as do others here who've stated as much) that fragment shaders and vertex shaders will be unified in the not too distant future, so most of this applies to vertex shaders as well.

It seems from my browsing of some threads here that quite a few people believe fragment shaders are headed more and more towards being a general cpu type architecture. From my experience I don't fully agree with that theory. Don't get me wrong, I think gpus will continue to become more programmable and handle longer programs faster, but that doesn't necessarily mean becoming like a cpu. For instance, modern general-purpose cpus are very efficient at branching and sparse accesses. I don't believe gpus will/should focus on those areas. Handling these cases fast would require a tremendous effort by gpu manufacturers. While these examples might be useful for some fringe cases in graphics, for most graphics algorithms you don't want branching since it often leads to aliasing of your surfaces.

What I would like to see are larger programs being allowed. I've at times felt limited by the 9700's 64 instruction limit unless I'm willing to complicate things by breaking my shader into multiple passes. I'd also like to see faster fragment shaders, because currently if I want to use 100 instruction programs on more than a few small select objects in my scene, things become unbearably slow. So now that I've stated what I do want, what I don't care about right now is more functionality. Looking at d3d and its ps2.0 vs ps3.0, is there anything I'd really like there? Sure, I guess the derivatives would be nice, but those first 2 things I mentioned are much more important to me. I know that most people have only seen the very tip of the iceberg of what current fragment shaders can do. And I hope that current hardware makers make what we have faster so it's more usable before they focus on new features.

Of course this is just one man's opinion, and I'm curious to hear what others think. I'd be really interested if someone who knows fragment shaders disagrees and thinks that we need more features before focusing on speed, and what those features should be.
 
I think there's still a little bit further to go in functionality. Specifically, I think we need some robust higher-order surfaces support (perhaps the rumored primitive processor in the NV40 will do it?), and expanded vertex shader support.

Anyway, speaking of instruction limits, whatever happened to the f-buffer?
 
Dynamical branching is in, hardware manufacturers will need to learn to live with it regardless of the objections :) PS-3.0 is the future.

I don't think it is such a big deal though. You don't need branch prediction and speculative execution; things like split-branches/delay-slots are helpful with the relatively short pipelines (need a smart compiler of course), and when that doesn't work you can also use SMT to keep the hardware busy while the conditional resolves and the branch target is fetched.
 
In terms of functionality we need (IMO) dynamic branching, texture access in the vertex shader, dynamic tessellation (displacement mapping etc.) preferably programmable, derivatives in the fragment shader.
 
MfA said:
Dynamical branching is in, hardware manufacturers will need to learn to live with it regardless of the objections :)
.... and programmers will have to learn that performance characteristics with dynamic branching will be dramatically different from CPUs....
 
Seems like people want their dynamic branching. I doubt dynamic branching will be very dynamic with the length of graphics pipelines, but let's forget about that. One place this would be nice is with some procedural textures, but I don't see that being too popular with games. Most branching that I do that doesn't introduce aliasing is adjusting a bias value of a shader, and this doesn't really require dynamic branching. I'm curious what sort of things people want dynamic branching for. Remember, if it's an algorithm dealing with sparse access it's going to be dog slow with or without dynamic branching.

Edit: I just realized I didn't explain some of my thought process that went into the above post. I view current conditionals as a poor man's branch, and that might be why the above didn't make much sense on first read.
 
I prefer to see branches be the poor man's conditional... but I've been doing too much vector processing I guess.
 
Treating each pixel separately makes procedural texturing too much of a pain; something fast which uses inter-pixel redundancy like Bresenham noise is out of the question. Now if the position register was writeable you could really start to abuse the hardware :)

One thing I can think of which could use dynamic branching well at the moment, apart from applications which try to have the hardware do things it was totally not meant to like solving FEM equations and raytracing, is adaptive texturing (quadtree/octree textures).
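To make the quadtree texture case concrete, here is a minimal sketch in Python rather than shader code. The `Node` class and `lookup` function are made up for illustration and are not from any API. The point is only that the descent loop runs a different number of times depending on where the sample lands, which is what makes this a dynamic branching problem.

```python
class Node:
    """Hypothetical quadtree texture node: a leaf stores a color,
    an interior node stores four children (row-major, 2x2)."""
    def __init__(self, color=None, children=None):
        self.color = color
        self.children = children  # list of 4 Nodes, or None for a leaf


def lookup(node, u, v):
    """Descend to the leaf covering (u, v), both in [0, 1).

    Returns (color, depth). The loop trip count depends on the data,
    so it cannot be fixed when the shader is written.
    """
    depth = 0
    while node.children is not None:
        # Scale into the 2x2 child grid and pick the quadrant.
        u *= 2.0
        v *= 2.0
        ix = 1 if u >= 1.0 else 0
        iy = 1 if v >= 1.0 else 0
        # Re-normalize the coordinates to the chosen child.
        u -= ix
        v -= iy
        node = node.children[iy * 2 + ix]
        depth += 1
    return node.color, depth


# A tree that is refined only in its bottom-right quadrant.
fine = Node(children=[Node("d0"), Node("d1"), Node("d2"), Node("d3")])
root = Node(children=[Node("a"), Node("b"), Node("c"), fine])
```

With this tree, a sample in the coarse area stops after one step while a sample in the refined quadrant takes two, so neighbouring pixels can do different amounts of work.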
 
If a branch lets you skip a large chunk of code, it certainly can improve performance over evaluating everything and using SGE/SLT/MUL to form the final result.
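A rough sketch of the two approaches, written in Python as a stand-in for shader code. The `sge` and `slt` helpers mimic the set-on-compare instructions (they return 1.0 or 0.0); `expensive`, `cheap`, and the call counter are hypothetical names used only to show how much work each version does.

```python
def sge(a, b):
    """Set on greater-or-equal: 1.0 if a >= b, else 0.0."""
    return 1.0 if a >= b else 0.0


def slt(a, b):
    """Set on less-than: 1.0 if a < b, else 0.0."""
    return 1.0 if a < b else 0.0


calls = {"expensive": 0}


def expensive(x):
    # Stand-in for a large chunk of shader code.
    calls["expensive"] += 1
    return x * x + 1.0


def cheap(x):
    return 0.5 * x


def shade_select(x, threshold):
    # No branch: both sides are always evaluated, then masked with MULs.
    return sge(x, threshold) * expensive(x) + slt(x, threshold) * cheap(x)


def shade_branch(x, threshold):
    # Dynamic branch: the expensive side is skipped when not needed.
    if x >= threshold:
        return expensive(x)
    return cheap(x)


def run(shader, pixels, threshold):
    """Shade every pixel and report how often `expensive` ran."""
    calls["expensive"] = 0
    results = [shader(x, threshold) for x in pixels]
    return results, calls["expensive"]
```

Both versions give the same values, but over pixels `[0.1, 0.2, 0.9, 0.3]` with a threshold of 0.5 the select version runs the expensive path four times and the branching version runs it once. This ignores whatever the branch itself costs on real hardware.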
 
Humus said:
If a branch lets you skip a large chunk of code, it certainly can improve performance over evaluating everything and using SGE/SLT/MUL to form the final result.
One easy way to see where branching might help is for a "while" loop, where the number of times a block of code is executed changes dramatically from pixel to pixel. Anisotropic filtering would be a good example of something that fits into this category.
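A back-of-the-envelope cost model for that "while" loop case, sketched in Python. The per-pixel tap counts are invented numbers, and the lockstep batch model is an assumption about how hardware might group pixels, not a description of any particular chip.

```python
def cost_fixed(taps, max_taps):
    """No dynamic loop: every pixel pays for the worst-case trip count."""
    return len(taps) * max_taps


def cost_dynamic(taps):
    """Ideal per-pixel loop: each pixel pays only for its own taps."""
    return sum(taps)


def cost_lockstep(taps, batch_size):
    """Assumed model: pixels run in batches that stay in lockstep,
    so each batch pays for its slowest pixel."""
    total = 0
    for start in range(0, len(taps), batch_size):
        batch = taps[start:start + batch_size]
        total += len(batch) * max(batch)
    return total


# Hypothetical anisotropic tap counts for eight neighbouring pixels.
taps = [1, 1, 2, 16, 1, 1, 1, 2]
```

For these numbers the fixed version costs 128 loop iterations, the ideal dynamic loop 25, and lockstep batches of four pixels 72. The last figure is a reminder that the gain depends on how coherent neighbouring pixels are.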
 
MfA said:
Now if the position register was writeable you could really start to abuse the hardware :)

To steal from Apple.... think different :)

While you cannot modify the position register in the pixel shader, you can in the vertex shader. Of course it's not so useful now, but with VS3.0 and Vertex Textures, things will be a lot more interesting in Vertex Shader land.
 
Don't see how you would implement Bresenham noise in vertex shaders either ... I mentioned making the position register writeable because that could also imply the ability to write to more than 1 pixel.
 
Humus said:
In terms of functionality we need (IMO) dynamic branching, texture access in the vertex shader, dynamic tessellation (displacement mapping etc.) preferably programmable, derivatives in the fragment shader.

That's, word for word, how I'd describe the NV40 feature set, LOL :)
That is, AFAIK, of course. Texture access in the VS is a given of course, was even planned for the NV30 but got canned.

OT: BTW, I've recently received a very, very juicy confirmation on something non-GPU related at nVidia. Very reliable, too. Will leak it in a few days unless my source decides to suddenly have objections about it.

Uttar
 
Uttar said:
OT: BTW, I've recently received a very, very juicy confirmation on something non-GPU related at nVidia. Very reliable, too. Will leak it in a few days unless my source decides to suddenly have objections about it.
A sound card? Please be a sound card.
 
OT: BTW, I've recently received a very, very juicy confirmation on something non-GPU related at nVidia. Very reliable, too. Will leak it in a few days unless my source decides to suddenly have objections about it.

A reliable set of drivers? ;)

Sound card would be interesting, but so would a decent Opteron chipset (or have they already been announced by N?) along with dependable IDE drivers.
 