A draft of the OGL2 shading language specification is available for comments:
http://www.opengl.org/developers/documentation/gl2_workgroup/
7) Is alpha blending programmable?
Fragment shaders can read the contents of the frame buffer at the current location using the built-in
variables gl_FBColor, gl_FBDepth, gl_FBStencil, and gl_FBDatan. Using these facilities,
applications can implement custom algorithms for blending, stencil testing, and the like. However,
these frame buffer read operations may result in a significant reduction in performance, so
applications are strongly encouraged to use the fixed functionality of OpenGL for these operations if
at all possible. The hardware to implement fragment shaders (and vertex shaders) is made a lot
simpler and faster if each fragment can be processed independently, both in space and in time. By
allowing read-modify-write operations such as those needed for alpha blending to be done as part of
fragment processing, we have introduced both spatial and temporal relationships. These complicate
the design because of the extremely deep pipelining, caching, and memory arbitration necessary for
performance. Methods such as render to texture, copy frame buffer to texture, aux data buffers, and
accumulation buffers can do most, if not all, of what programmable alpha blending can do. Also, the
need for multiple passes has been reduced (or at least abstracted) by the high-level shading language
and the automatic resource management.
RESOLVED on October 12, 2001: Yes, applications can do alpha blending, albeit with possible
performance penalties over using the fixed functionality blending operations.
REOPENED on July 9, 2002: This issue is related to Issue (23) which remains open, so this issue
should also remain open.
Another possibility would be to create an extension that allows more flexibility than the current alpha
blending allows, but would still be considered fixed functionality.
RESOLUTION: Issue (23) is resolved as allowing frame buffer reads, so this is once again resolved
allowing alpha blending, with the caveats listed above.
REOPENED on December 10, 2002. Issue 23 is re-resolved to disallow frame buffer reads.
RESOLUTION: No, applications cannot do alpha blending, because they cannot read alpha.
CLOSED on December 10, 2002.
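As an aside, the fixed-functionality path the DISCUSSION recommends amounts to ordinary OpenGL 1.x blend and stencil state. A minimal sketch in C (the function name is my own invention; the GL calls themselves are standard):

#include <GL/gl.h>

/* A minimal sketch of the fixed-functionality alternative the DISCUSSION above
   recommends instead of frame buffer reads in a fragment shader:
   standard "over" alpha blending plus a simple stencil test. */
void setup_fixed_function_blend_and_stencil(void)
{
    /* Blend incoming fragments over the frame buffer contents:
       dst = src.a * src + (1 - src.a) * dst */
    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

    /* Only draw where the stencil buffer already holds 1; leave it unchanged. */
    glEnable(GL_STENCIL_TEST);
    glStencilFunc(GL_EQUAL, 1, 0xFF);
    glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);
}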
Should precision hints be supported (e.g., using 16-bit floats or 32-bit floats)?
DISCUSSION: Standardizing on a single data type for computations greatly simplifies the
specification of the language. Even if an implementation is allowed to silently promote a reduced
precision value, a shader may exhibit different behavior if the writer had inadvertently relied on the
clamping or wrapping semantics of the reduced operator. By defining a set of reduced precision
types all we would end up doing is forcing the hardware to implement them to stay compatible.
When writing general programs, programmers have long given up worrying if it is more efficient to
do a calculation in bytes, shorts or longs and we do not want shader writers to believe they have to
concern themselves similarly. The only short term benefit of supporting reduced precision data types
is that it may allow existing hardware to run a subset of shaders more effectively.
This issue is related to Issue (30) and Issue (68).
RESOLUTION: Performance/space/precision hints and types will not be provided as a standard part
of the language, but reserved words for doing so will be.
CLOSED: November 26, 2002.
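To make the "silent promotion changes behavior" argument concrete, here is a small stand-alone C illustration of my own (using float versus double as a stand-in for a reduced-precision shader type versus its promoted form); it is not from the spec:

#include <stdio.h>

int main(void)
{
    /* The same accumulation carried out at two precisions.  A shader written
       and tuned against the narrower type can give visibly different results
       if an implementation silently promotes it to the wider one. */
    float  narrow = 0.0f;
    double wide   = 0.0;
    for (int i = 0; i < 10000; ++i) {
        narrow += 0.1f;
        wide   += 0.1;
    }
    printf("float  sum: %f\n", narrow);  /* drifts noticeably away from 1000 */
    printf("double sum: %f\n", wide);    /* stays very close to 1000 */
    return 0;
}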
Joe DeFuria said: I presume this means that nvidia's FP16 pipeline cannot be supported for GL? (Other than through possibly proprietary extensions?)

I'm not quite sure what these requirements say:
glslang paper said: As an input value to one of the processing units, a floating-point variable is expected to match the IEEE
single precision floating-point definition for precision and dynamic range. It is not required that the
precision of internal processing match the IEEE floating-point specification for floating-point operations,
but the guidelines for precision established by the OpenGL 1.4 specification must be met.
A s10e5 fp16 number should certainly suffice to represent 2^10, but I don't know what 'about 1 part in 10^5' should exactly mean.

glspec14 said: The GL must perform a number of floating-point operations during the course of
its operation. We do not specify how floating-point numbers are to be represented
or how operations on them are to be performed. We require simply that numbers’
floating-point parts contain enough bits and that their exponent fields are large
enough so that individual results of floating-point operations are accurate to about
1 part in 10^5. The maximum representable magnitude of a floating-point number
used to represent positional or normal coordinates must be at least 2^32; the maximum
representable magnitude for colors or texture coordinates must be at least 2^10.
The maximum representable magnitude for all other floating-point values must be
at least 2^32. x · 0 = 0 · x = 0 for any non-infinite and non-NaN x. 1 · x = x · 1 = x.
x + 0 = 0 + x = x. 0^0 = 1. (Occasionally further requirements will be specified.)
Most single-precision floating-point formats meet these requirements.
gokickrocks said: 1 part in 10^5 = .00001
the more 0s between the decimal and 1, the more accurate the calculation will be

Yes, that's obvious, but how does this apply to float numbers? Does this mean the next greater number to any given number must be less than 1.00001 times that number? That would mean you need a 17-bit mantissa, right?
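For what it's worth, here is the arithmetic behind my 17-bit estimate as a small C sketch. It reflects my own reading of "accurate to about 1 part in 10^5" as a bound on the relative gap between adjacent representable values, which is not something the spec states: 2^-m must be at most 10^-5, which needs m >= 17 stored mantissa bits, while s10e5 fp16 stores only 10.

#include <math.h>
#include <stdio.h>

/* Worst-case relative gap between adjacent normalized floating-point values
   with m explicitly stored mantissa bits: 2^-m (it occurs just above a power
   of two, where the next representable value is (1 + 2^-m) * x). */
static double worst_relative_gap(int mantissa_bits)
{
    return ldexp(1.0, -mantissa_bits);
}

int main(void)
{
    const int widths[] = { 10, 17, 23 };  /* fp16 stores 10 bits, fp32 stores 23 */
    for (int i = 0; i < 3; ++i)
        printf("%2d stored mantissa bits: worst relative gap %.1e (target 1e-05)\n",
               widths[i], worst_relative_gap(widths[i]));
    return 0;
}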
Chalnoth said: For example, if a variable is defined at FP16, but a Radeon 9700 is the render target, then FP32 will be used for that storage.
DISCUSSION: Standardizing on a single data type for computations greatly simplifies the
specification of the language. Even if an implementation is allowed to silently promote a reduced
precision value, a shader may exhibit different behavior if the writer had inadvertently relied on the
clamping or wrapping semantics of the reduced operator. By defining a set of reduced precision
types all we would end up doing is forcing the hardware to implement them to stay compatible.
When writing general programs, programmers have long given up worrying if it is more efficient to
do a calculation in bytes, shorts or longs and we do not want shader writers to believe they have to
concern themselves similarly. The only short term benefit of supporting reduced precision data types
is that it may allow existing hardware to run a subset of shaders more effectively.
I must say that I totally disagree with their position, especially on the point of not supporting 'half floats'. There are simply no 'clamping or wrapping semantics' a programmer could rely on when using floats (because the GL1.4 spec only contains minimum requirements), so why should that be different for half floats?

Joe DeFuria said: So, they decided that yes, the obvious benefit is to allow existing hardware to possibly run a set of shaders more "effectively", but the drawbacks of having inadvertent behavior occur because of actually running at different precisions is not worth it.
Chalnoth said: Oh, one other thing on rendering errors for FP16:
Two things:
First, it looks like the GeForce FX, even if FP16 is used in the calculations, will usually use a normal 32-bit framebuffer for the output. I'm not aware of any ability of the FX to output a floating-point buffer.
Secondly, even if the output was 10-bit, the errors are only certain (assuming the last bit is always in error...which is an erroneous assumption since the internal calculation is certainly at higher precision) for the brighter half of the spectrum. Dimmer color values will not show as much error, which is where it counts (our eyes can see banding much more easily at lower brightness levels).
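A rough sketch of the arithmetic behind that last point (my own numbers, not Chalnoth's): within each binade the ten stored fp16 mantissa bits give a fixed relative step, so the absolute step, and with it the worst-case quantization error, shrinks as the colour value gets dimmer.

#include <math.h>
#include <stdio.h>

/* Absolute gap between adjacent normalized s10e5 (fp16) values near v:
   the 10 stored mantissa bits are spread across each binade [2^e, 2^(e+1)),
   so the step size is 2^(e-10). */
static double fp16_step(double v)
{
    return ldexp(1.0, (int)floor(log2(v)) - 10);
}

int main(void)
{
    const double shades[] = { 1.0, 0.5, 0.25, 0.0625 };  /* bright -> dim */
    for (int i = 0; i < 4; ++i)
        printf("near %.4f the fp16 step is %.2e\n", shades[i], fp16_step(shades[i]));
    /* The step shrinks for dimmer values, so the absolute quantization error
       is largest in the brighter part of the range. */
    return 0;
}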