OGL 3.0: Programmable filtering, blending, stencil, etc. - Sooner than we think?

From memory, ~1-2 years ago with NVIDIA drivers (dunno, 50.xx-60.xx)
an average VS + FS program of ~100 instructions was taking 600-700 msec;
now with 90.xx that's down to ~50 msec on the same machine.
With MS' HLSL compiler it's gone in the opposite direction, FWIW. A year ago a VS + PS of that size took around 400-600 msec; now it's around 650-950 msec. Flow control really throws it into hell -- loops take it from milliseconds to seconds. Then again, in any case where you feel the need for flow control you probably also have something pretty large (and I do have to deal with generated shaders that amount to a few thousand instructions). All the same, the end results are better than they were back then: the 2.0 targets were pretty darn good a year ago, but the 3.0 targets produced some really laughable excuses for optimization. Nowadays they're about even.
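
In case anyone wants to reproduce numbers like these, here's a minimal sketch of how GL-side compile time could be measured. It assumes an existing GL 2.0 context and that the GL 2.0 entry points are already available (on Windows they have to be fetched via wglGetProcAddress or an extension loader; that plumbing and the shader sources themselves are omitted and aren't from the thread):

#define GL_GLEXT_PROTOTYPES
#include <GL/gl.h>
#include <GL/glext.h>
#include <cstdio>
#include <ctime>

static GLuint compileStage(GLenum type, const char* src)
{
    GLuint s = glCreateShader(type);
    glShaderSource(s, 1, &src, 0);   // null length pointer: source is null-terminated
    glCompileShader(s);
    return s;
}

// Returns the combined compile + link time in milliseconds.
static double timeCompile(const char* vsSrc, const char* fsSrc)
{
    clock_t t0 = clock();

    GLuint vs = compileStage(GL_VERTEX_SHADER, vsSrc);
    GLuint fs = compileStage(GL_FRAGMENT_SHADER, fsSrc);

    GLuint prog = glCreateProgram();
    glAttachShader(prog, vs);
    glAttachShader(prog, fs);
    glLinkProgram(prog);             // many drivers do most of the work here

    GLint linked = 0;
    glGetProgramiv(prog, GL_LINK_STATUS, &linked);   // also ensures the driver has finished

    double ms = 1000.0 * (clock() - t0) / CLOCKS_PER_SEC;
    std::printf("compile + link: %.1f ms (%s)\n", ms, linked ? "ok" : "link failed");

    glDeleteShader(vs);
    glDeleteShader(fs);
    glDeleteProgram(prog);
    return ms;
}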
 
zed said:
From memory, ~1-2 years ago with NVIDIA drivers (dunno, 50.xx-60.xx)
an average VS + FS program of ~100 instructions was taking 600-700 msec;
now with 90.xx that's down to ~50 msec on the same machine.
Almost a second for 100 instructions? That seems terribly slow - if C compilers were that slow I'd still be compiling Linux kernel 1.0... 50 msec sounds more reasonable, though it doesn't look terribly fast to me either.

nAo said:
Just to give some rough numbers, we needed 8 hours to compile about 60k VS + 60k PS on a modern CPU.
That's a hell of a lot of shaders (well, compared to Doom 3 at least :) )! OK, I see now why offline compiles would be useful... However, if you want to recompile whenever there's a new compiler, you'd still have a problem.
 
Almost a second for 100 instructions? That seems terribly slow - if C compilers were that slow I'd still be compiling Linux kernel 1.0... 50 msec sounds more reasonable, though it doesn't look terribly fast to me either.

Often the output of the compiler bears little resemblance to the input.
The shader assemblers IMO do some extremely impressive transforms on the specified data during optimisation. My guess would be that there are several brute-force algorithms involved.
 
Knowing the ARB, if they were to release it they'd probably call it OpenGL 2.1, just to emphasise the anticlimactic buzz they've built up around their API over the year.

Let's just hope that this division happens fast and that they get to work on the new extensions and the GLSL improvements.
 
That's a hell of a lot of shaders (well, compared to Doom 3 at least :) )! OK, I see now why offline compiles would be useful... However, if you want to recompile whenever there's a new compiler, you'd still have a problem.
New compilers arrive all the time... but now we are not as dumb as we used to be: we support many more shaders (I haven't computed the exact number, but it's in the order of millions), but they are generated per level using a system which queries our database and generates only the shaders we need for that level. This makes things much quicker: what was taking hours now takes minutes.
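
A rough sketch of what that kind of per-level generation might look like (all the names here, Material, permutationKey, the feature strings, are invented for illustration and are not nAo's actual system): query the level data for the materials it actually uses, de-duplicate the shader permutations they need, and hand only that set to the offline compiler.

#include <cstdio>
#include <set>
#include <string>
#include <vector>

struct Material {
    std::string baseShader;              // e.g. "rock", "skin"
    std::vector<std::string> features;   // e.g. "FOG", "SHADOWS", "NORMALMAP"
};

// Build a stable key for one permutation, e.g. "rock+FOG+SHADOWS".
static std::string permutationKey(const Material& m)
{
    std::string key = m.baseShader;
    for (size_t i = 0; i < m.features.size(); ++i)
        key += "+" + m.features[i];
    return key;
}

int main()
{
    // In a real tool this list would come from the level database query;
    // here it's hard-coded example data.
    Material rock;
    rock.baseShader = "rock";
    rock.features.push_back("FOG");
    rock.features.push_back("SHADOWS");

    Material rock2 = rock;    // a second material using the same permutation
    Material skin;
    skin.baseShader = "skin";
    skin.features.push_back("FOG");

    std::vector<Material> levelMaterials;
    levelMaterials.push_back(rock);
    levelMaterials.push_back(rock2);
    levelMaterials.push_back(skin);

    // De-duplicate: many materials in a level share the same permutation.
    std::set<std::string> needed;
    for (size_t i = 0; i < levelMaterials.size(); ++i)
        needed.insert(permutationKey(levelMaterials[i]));

    // Hand the (much smaller) set to the offline shader compiler.
    for (std::set<std::string>::const_iterator it = needed.begin();
         it != needed.end(); ++it)
        std::printf("compile: %s\n", it->c_str());

    std::printf("%u unique permutations for this level\n",
                (unsigned)needed.size());
    return 0;
}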
 