OGL 3.0: Programmable filtering, blending, stencil, etc. - Sooner than we think?

Arun

http://www.gamedev.net/columns/events/gdc2006/article.asp?id=233
Dated March 23rd, but I had missed it, and the importance of that thing is fricking huge imo. Bolding is mine.
ATI and NVIDIA have been working together for more than six months on a proposal to change the direction of OpenGL. Much of that time has been spent talking to developers to find out what they want from OpenGL, and this was the first time they presented what they want to propose publicly.

Disclaimer: All of this information is pre-workgroup; none of it has been officially proposed to the ARB yet.

OpenGL has matured over 13+ years. It's not always representative of hardware anymore, but it is fully backwards compatible (unlike other graphics APIs), which has obvious pros and cons. Significant new hardware functionality is on the way, and programmability is just the start. OpenGL needs to be market and timeline driven to provide only what is needed in a timely manner.

There has been talk for a few years of creating a version of OpenGL that doesn't maintain backward compatibility, and it has been suggested that this might happen with OpenGL 3.0. But developers and IHVs have indicated that there is too much out there depending on existing functionality to make a fully clean break.

The proposal being worked on would divide OpenGL into two profiles. The first, dubbed OpenGL Lean and Mean (or OpenGL LM), is the core API that is the right hardware abstraction, providing optimal performance. There would also be a full OpenGL profile that supports all existing functionality, as well as the LM profile. Once this division is made, most future efforts would be put into the LM profile.
They'd like to add:
  • Geometry instancing
  • Geometry-only display lists (for small static data, for instance; no GL_COMPILE_AND_EXECUTE, since it's very problematic to implement efficiently)
  • Arrays activated based on vertex shader usage
GLSL enhancements:
  • Integer support
    • Full 32-bit integers
    • Full bit-wise integer ops
    • Integer texture lookups
    • Integer vertex attributes

  • Offline compile (already in OpenGL ES)
    • Enables layering, faster startup, (limited) IP protection
    • Binary shader interface
    • Intermediate language interface
  • Vertex position invariance across shaders (replaces ftransform())
  • Enhanced interpolant control
    • Flat/smooth
    • Center/centroid
    • Perspective correct, or not

  • Vertex ID (what vertex item am I?)

  • Binding texture object IDs to samplers (skip texture image unit)
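(As an aside on that last point, this is the texture image unit hop that binding texture objects directly to samplers would remove; the texture and program names below are just placeholders:)

#include <GL/glew.h>  // assumes GL 2.0 entry points are exposed, e.g. via GLEW

// Today a texture object has to be routed through a texture image unit to
// reach a sampler uniform; "skip texture image unit" would remove that hop.
void bindDiffuseTheCurrentWay(GLuint prog, GLuint diffuseTexId)
{
    glUseProgram(prog);
    glActiveTexture(GL_TEXTURE0 + 3);                          // pick a unit
    glBindTexture(GL_TEXTURE_2D, diffuseTexId);                // texture object -> unit 3
    glUniform1i(glGetUniformLocation(prog, "diffuseMap"), 3);  // sampler -> unit 3
}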
What they'd like to add:
  • seamless cube map neighborhoods
  • texture arrays (aka stacks)
  • texture filter shaders
What features to consider layering:
  • Depth test
  • Stencil test
  • Blend

Instead, have sample shaders in OpenGL LM
If either R600 or G80 has full support for all that stuff, please put me in line to buy a few million of them (I can hope!) on launch day, mkay? Much of that stuff is optional though, if I'm reading this right (hmm?), so it's more likely both will support a subset of it.


Uttar
 
Yes, NVIDIA is also very supportive of Linux, isn't it? And Linux would benefit a lot from this OpenGL work. And then there's the PS3 and such adopting Linux, potentially bringing it to a more mainstream level ...
 
>>> Offline compile (already in OpenGL ES)
excellent, though i notice with nvidia their glsl compiles are >10x quicker than they were a year ago, but precompiled stuff is much better (still compiled on the end user's machine)
>>> texture arrays (aka stacks)
nice, i proposed this ages ago, it'll make everything much cleaner
>>> Vertex position invariance across shaders (replaces ftransform())
cool, i just posted about this recently + the illogic behind the current method

i'd like to see true global uniforms added as well (using the inbuilt ones is a PITA)

btw must be getting about time for the quarterly opengl info newsletter
 
The lack of pre-compiled shaders was a joke.

Hash( shader text + driver shader engine ) and check that against a stored hash of the precompiled shader; if it's not the same, recompile, otherwise it's ready to go.
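Something like this rough sketch, say (names made up, and FNV-1a just picked as an example hash):

#include <cstdint>
#include <string>

// Simple FNV-1a hash, just as an example; any decent hash would do.
static uint64_t fnv1a(const std::string& data)
{
    uint64_t h = 14695981039346656037ull;
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ull;
    }
    return h;
}

// Hash the shader text together with a string identifying the driver's
// shader engine (whatever the driver reports for its compiler version),
// and compare against the hash stored with the precompiled blob.
bool precompiledShaderIsUsable(const std::string& shaderText,
                               const std::string& driverShaderEngine,
                               uint64_t storedHash)
{
    return fnv1a(shaderText + driverShaderEngine) == storedHash;
}

If it doesn't match you just recompile and overwrite the cached blob, so the worst case is no worse than what we have today.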
 
How much effort are the hardware vendors putting into OpenGL support these days? I don't mean to suggest that it isn't important, but it seems to be becoming a niche market.
 
Offline compiles would be good, but the compiler should still be part of the driver runtime provided by IHVs, because the compiler can do a better job optimizing with real knowledge of underlying hardware than optimizing against an abstraction. That is, devs should be able to invoke the compiler statically for each driver and save the resultant compilation for distribution, but the compiler is per-GPU, not 1 general purpose compiler for everyone. I'd envision a command line tool which links in a DLL from each IHV, and outputs the cached compiled code (e.g. glslc -driver g80c.dll -out shaders.cache *.glsl)
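Roughly, the host side of such a tool could look like the sketch below; the "CompileGLSL" export and its error convention are made up purely to illustrate, only LoadLibrary/GetProcAddress are real:

#include <windows.h>
#include <cstdio>
#include <cstring>
#include <vector>

// Hypothetical entry point an IHV compiler DLL might export: GLSL source in,
// driver-specific binary blob out. The name "CompileGLSL" is invented here.
typedef int (*CompileGLSLFn)(const char* source, int sourceLen,
                             void* outBlob, int* outBlobLen);

// Load the vendor compiler named on the command line (e.g. "g80c.dll"),
// run one source string through it, and append the result to the cache file.
bool compileWithVendorDll(const char* dllName, const char* glslSource,
                          FILE* cacheFile)
{
    HMODULE dll = LoadLibraryA(dllName);
    if (!dll)
        return false;

    CompileGLSLFn compile =
        (CompileGLSLFn)GetProcAddress(dll, "CompileGLSL"); // made-up export
    bool ok = false;
    if (compile) {
        std::vector<char> blob(1 << 20);        // room for one compiled shader
        int blobLen = (int)blob.size();
        ok = compile(glslSource, (int)std::strlen(glslSource),
                     blob.data(), &blobLen) == 0;
        if (ok)
            fwrite(blob.data(), 1, blobLen, cacheFile);
    }
    FreeLibrary(dll);
    return ok;
}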

"Intermediate language interface" also seems to suggest that they are going to make the compiler spit out an IR dump or expose IR, and then let driver authors plug into the optimizer, like writing a new GCC backend target.

But making the compiler generate shader ASM akin to vertex/pixel shaders in DX would be a step backwards in architecture. There are ways to solve the compile time problems without sacrificing platform dependent optimizations. Otherwise, you just force them to hide the compiler inside the driver anyway, except that they must do more work to optimize assembly shaders, and information from the intermediate stage is lost. The idea that static ASM generated in a platform independent fashion (not GPU specific) allows one to avoid compilation is a fallacy. The compilation still occurs, except now it works on a relatively crappy ASM IR representation.

The approach I mention above, however, could result in true "static" compilation, in the sense that the driver could load code already pre-compiled directly for the underlying (hidden) GPU instruction set, i.e. it's executable object code. It may even be encrypted/signed. No need to load asm shaders and re-optimize them and translate them to native GPU inst set.
 
It's a little safer to say that OpenGL ES is getting to be more important... OpenGL on the PC is largely owned by the workstation and CAD/CAE market, where fancy shader trickery isn't really used so much -- robustness of the existing features and legacy is what matters most. Not too many games outside of the mobile space (PS3 notwithstanding) are going to rely on OpenGL. The thing is though that if you try and rely on the GPU to accelerate a lot of things in a wider market (e.g. video editing, offline rendering a la Gelato), then supporting OpenGL can be meaningful.

Personally, I have to remain doubtful we'll see even a significant fraction of all this given the past history of OpenGL and its updates. Even though the ARB has relinquished control, I'll wait until I see it to say that things will be different now. I mean, they're talking about the 3.0 spec when we haven't even seen the 2.0 spec come to life yet (I mean the real GL 2.0 spec -- not even so far as the so-called "Pure" OpenGL 2.0).

That said, one thing in that list that really interests me is the texture filter shaders. I'm currently doing some work on offsetting texture sampling position based on edge detection, but I'm doing it the hard way, which means it's a challenge to make it cheap as well as effective.
 
I'd envision a command line tool which links in a DLL from each IHV, and outputs the cached compiled code (e.g. glslc -driver g80c.dll -out shaders.cache *.glsl)

I'd like to see it as an OpenGL function so you can call it at run time; that wouldn't preclude any external tools doing it either. When you load the compiled shader, you'd verify that it was compiled with the correct compiler (I'd expect a compiler version tag in there), i.e. the same compiler as what is reported by the driver, and then you would load it into the driver. If it was different, I'd expect you to recompile it and then store the static compiled file. I'd disregard it being used as an IP protector.
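In rough code, something like the following at load time; the precompiled-load call is a stub standing in for a hypothetical driver entry point, only glGetString/glShaderSource/glCompileShader are real GL:

#include <GL/glew.h>  // assumes GL 2.0 entry points are exposed, e.g. via GLEW
#include <string>

// Stub standing in for a hypothetical "feed a precompiled binary to the
// driver" call; nothing like it exists in GL today, so it always fails and
// we simply fall back to compiling from source.
static bool loadPrecompiledBlob(GLuint /*shader*/, const void* /*blob*/, int /*len*/)
{
    return false;
}

// Crude stand-in for a proper compiler version tag: identify the driver by
// its renderer and version strings.
static std::string currentCompilerTag()
{
    std::string tag = (const char*)glGetString(GL_RENDERER);
    tag += " / ";
    tag += (const char*)glGetString(GL_VERSION);
    return tag;
}

// Try the cached binary first; if its tag doesn't match the current driver,
// recompile from source (and re-save the cache elsewhere).
void loadShader(GLuint shader, const std::string& cachedTag,
                const void* cachedBlob, int cachedLen, const char* source)
{
    if (cachedBlob && cachedTag == currentCompilerTag() &&
        loadPrecompiledBlob(shader, cachedBlob, cachedLen))
        return;                              // cache hit, nothing to compile

    glShaderSource(shader, 1, &source, nullptr);
    glCompileShader(shader);                 // slow path: full recompile
    // ...then write the new binary plus currentCompilerTag() back to disk
}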
 
Does shader compiling really take a significant amount of time? I can't quite see why the availability of offline compile is such a big deal. So, does anyone have some numbers on how much this increases load time?
 
Does shader compiling really take a significant amount of time? I can't quite see why the availability of offline compile is such a big deal. So, does anyone have some numbers on how much this increases load time?
If you have to compile hundreds if not even thousands of shaders the answer is yes..it takes time..too much time (and memory..) ;)
 
Depends nAo, this is partly why some games take a long time to load up; half the time is compiling of the shaders, and that's with some precompiling also. They take up a bit of mem, I wouldn't say more than 30 MB though, even in next-gen games.
 
If you're looking for the best performance possible you want to avoid static branching, and you also want to allocate all the interpolants your shaders need in the most optimal way, on a per shader pair (a VS and a PS) basis.
This means you'd typically need to compile thousands of shaders..(or even millions..cough..cough.. ;) )
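To put a number on it, here's a toy sketch (feature names invented for the example) of how every on/off feature that becomes a compile-time define multiplies the shader count:

#include <cstdio>
#include <string>
#include <vector>

int main()
{
    // Invented example features; each one is a #define instead of a static
    // branch, so every combination is a separate compiled shader.
    const std::vector<std::string> features = {
        "SKINNING", "NORMAL_MAP", "SHADOW_MAP", "FOG",
        "SPECULAR", "EMISSIVE", "DETAIL_MAP", "VERTEX_COLOR",
        "ALPHA_TEST", "POINT_LIGHTS"
    };

    const size_t variants = size_t(1) << features.size();   // 2^10 = 1024
    printf("%zu features -> %zu variants per base shader\n",
           features.size(), variants);

    // Building the #define prefix for one variant picked by a bitmask:
    const size_t mask = 0x025;            // arbitrary combination for the demo
    std::string defines;
    for (size_t i = 0; i < features.size(); ++i)
        if (mask & (size_t(1) << i))
            defines += "#define " + features[i] + " 1\n";
    printf("variant 0x%03zx prefix:\n%s", mask, defines.c_str());
    return 0;
}

Multiply that by the number of base shaders and the VS/PS pairings and you get to the thousands (or millions) quickly.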
 
Does shader compiling really take a significant amount of time? I can't quite see why the availability of offline compile is such a big deal. So, does anyone have some numbers on how much this increases load time?
That's pretty much the size of it. To give a rough idea in my case, which is pretty in line with what nAo was talking about with having thousands of shaders for everything, we have about 5 VS/PS pairs per material (some being Z-only passes, some being multipass lighting versions, some being single-pass lighting versions)... And given the amount of material repetition among objects and shader repetition among materials, that amounts to an average compilation time of around 3 seconds per object. The single-pass lighting pixel shaders in particular take a long time to compile (as long as 20-40 seconds apiece).
 
Just to give some rough numbers, we needed 8 hours to compile about 60k VS + 60k PS on a modern CPU.
 
Does shader compiling really take a significant amount of time? I can't quite see why the availability of offline compile is such a big deal. So, does anyone have some numbers on how much this increases load time?
from memory ~1-2 years ago with nvidia drivers (donno 50.xx-60.xx)
an average vs + fs program (~100 instructions) was taking 600-700 msec
now with 90.xx that's down to ~50 msec on the same machine

so it depends on the number of programs. fwiw i use ~80 programs, doom3 uses a lot fewer (though asm), haven't checked out other games, so it can take a few secs out of initial startup (not so bad for the end user but if you're developing it's a PITA)

Just to give some rough numbers, we needed 8 hours to compile about 60k VS + 60k PS on a modern CPU.
interesting, thus i assume u will create a separate program rather than change a variable, thanks
 