NV30 Integer / FP shader operations

darkblu said:
hmm, it just caught my attention: the format either does not have a designated integer bit or it does not span the [-2, 2] range. if it had a designated integer bit the format's range would be (-2, 2), or precisely, given the 10 fraction bits, [-1.9990234375, 1.9990234375].

Actually, the range is [-2, 2), from OpenGL Extension Specification for CineFX
Additionally, many arithmetic operations can also be
carried out at 12-bit fixed point precision (fx12), where values in
the range [-2,+2) are represented as signed values with 10 fraction
bits.
and
In the 12-bit fixed-point (fx12) format, numbers are represented as signed
12-bit two's complement integers with 10 fraction bits. The range of
representable values is [-2048/1024, +2047/1024].
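The fx12 encoding quoted above can be sketched in a few lines (a minimal illustration, assuming plain two's-complement integer arithmetic with round-to-nearest conversion; the function names are mine, not NVIDIA's):

```python
# Sketch of the fx12 format described in the spec quote: a signed 12-bit
# two's-complement integer with 10 fraction bits, range [-2048/1024, 2047/1024].
# Assumption: round-to-nearest on encode, clamping out-of-range inputs.

def float_to_fx12(x: float) -> int:
    """Encode a float as a signed 12-bit fixed-point value (10 fraction bits)."""
    i = round(x * 1024)
    # clamp to the representable integer range [-2048, 2047]
    return max(-2048, min(2047, i))

def fx12_to_float(i: int) -> float:
    """Decode an fx12 integer back to a float."""
    return i / 1024.0

print(fx12_to_float(float_to_fx12(2.0)))  # 1.9990234375 -- 2.0 clamps to 2047/1024
print(fx12_to_float(-2048))               # -2.0, the lower bound is exactly representable
```

This makes the asymmetry visible: -2.0 is representable but +2.0 is not, hence the half-open range [-2, +2).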

Anyway, you raise an interesting point: until now in graphics, 255 has acted as the multiplicative identity (255*a = a). With the new shaders you actually get "holes" in your numeric range when you mix 0-255 ranges with your fixed-point arithmetic (what I call "off by one" problems):
Imagine you have a one-byte-per-component texture and you read a texel value of 255 in your fixed-point shader. The usual thing is to convert a texel with a byte value of 255 into 1.0 (fixed point), so you will never be able to get a texel value of 0.FF in your fixed-point pipeline.
The problem is that if you read two texel values, one of 254 (0.FE in fixed point) and the other of 1 (0.01 in fixed point), and add them, you get 0.FF and not 1.0. This means that a 255 read from a texture is not the same as a 254 + 1 read from textures. This may look like a minor problem, but if the app is alpha-testing against that value (index shadowmaps, for example), or if you accumulate enough times, you are going to see very weird artifacts.
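The 254 + 1 example can be reproduced with a small sketch (assumption: a byte texel b is converted to fixed point by treating it as an 8-bit binary fraction b/256, with 255 special-cased to 1.0, as in the 0.FE / 1.0 notation above; a real driver may use a different conversion):

```python
# Sketch of the "off by one" hole described above. A byte texel is mapped
# into a 10-fraction-bit fixed-point value; 255 is special-cased to 1.0,
# every other byte b is treated as the 8-bit fraction b/256.

FRAC_BITS = 10
ONE = 1 << FRAC_BITS  # 1.0 == 1024 in this representation

def texel_to_fx(b: int) -> int:
    """Hypothetical byte-texel to fixed-point conversion (255 maps to 1.0)."""
    return ONE if b == 255 else b << (FRAC_BITS - 8)  # b/256, scaled to 10 bits

a = texel_to_fx(254)  # 0.FE -> 1016
b = texel_to_fx(1)    # 0.01 -> 4
print(a + b == texel_to_fx(255))  # False: 1016 + 4 = 1020, but 255 maps to 1024
```

So under this conversion, 254 + 1 from textures lands on 0.FF (1020/1024) while a direct 255 lands on 1.0 (1024/1024) -- exactly the hole that breaks alpha tests against the accumulated value.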
 
Tonyo said:
Actually, the range is [-2, 2), from OpenGL Extension Specification for CineFX
Additionally, many arithmetic operations can also be
carried out at 12-bit fixed point precision (fx12), where values in
the range [-2,+2) are represented as signed values with 10 fraction
bits.
and
In the 12-bit fixed-point (fx12) format, numbers are represented as signed
12-bit two's complement integers with 10 fraction bits. The range of
representable values is [-2048/1024, +2047/1024].

ok, so it's a two's complement - that clears up the matter.

Anyway, you raise an interesting point: until now in graphics, 255 has acted as the multiplicative identity (255*a = a). With the new shaders you actually get "holes" in your numeric range when you mix 0-255 ranges with your fixed-point arithmetic (what I call "off by one" problems):
Imagine you have a one-byte-per-component texture and you read a texel value of 255 in your fixed-point shader. The usual thing is to convert a texel with a byte value of 255 into 1.0 (fixed point), so you will never be able to get a texel value of 0.FF in your fixed-point pipeline.
The problem is that if you read two texel values, one of 254 (0.FE in fixed point) and the other of 1 (0.01 in fixed point), and add them, you get 0.FF and not 1.0. This means that a 255 read from a texture is not the same as a 254 + 1 read from textures. This may look like a minor problem, but if the app is alpha-testing against that value (index shadowmaps, for example), or if you accumulate enough times, you are going to see very weird artifacts.

what you say is a valid concern, and IMO this matter is of general significance.
the problem comes from the fact that a [0, 1] range of N discrete values is mapped onto the same [0, 1] range but of N+1 discrete values. this breaks the 1-to-1 correspondence between the discrete values of the two ranges, and the 'off-by-one' condition gives the worst problems, which get emphasised even further for "small" N. as depicted in your example this leads to serious mapping errors. so 254 is actually not 0.FE, but more like 0.FEFE.. (254/255 = 0.996078.., whereas 0.FE = 254/256 = 0.9921875).

now, the actual situation may not be that bad, as it appears that all (TTBOMK) available fixed-point solutions provide at least 10 bits of fraction, i.e. you don't get the 'off-by-one' problem in practice (i.e. you never get a mapping like [0, 1] of 8 bits onto [0, 1) of 8 bits), so the mapping error would never reach the magnitude of 'off-by-one' for small N. IOW, with the increase of fractional bits in the fixed-point representation, the mapping error will diminish.
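the point that more fraction bits shrink the mapping error can be checked numerically (a sketch assuming a round-to-nearest conversion fx = round(b * 2^f / 255); the function name is hypothetical):

```python
# Worst-case error of mapping a byte b in [0, 255] (interpreted as b/255)
# into a fixed-point fraction with f bits, assuming round-to-nearest
# conversion. The error is bounded by 0.5 / 2**f, so it shrinks as f grows.

def max_mapping_error(f: int) -> float:
    """Largest |quantized - exact| over all 256 byte values, for f fraction bits."""
    scale = 1 << f
    return max(abs(round(b * scale / 255) / scale - b / 255)
               for b in range(256))

for f in (8, 10, 12, 16):
    print(f, max_mapping_error(f))  # error roughly halves per extra fraction bit
```

With 10 or more fraction bits the worst-case error is already well below one 8-bit step (1/256), which is why the full-magnitude 'off-by-one' never shows up in practice.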
 
Luminescent said:
Yes, but if the 2 combiners are there, what would stop Nvidia from accessing both of them?
Because it's unlikely that any software would be able to use the proper extensions to make use of both of them.

As a side note, however, I think you brought up an excellent point. It would seem much better to generate a single flexible register combiner than two different ones for different codepaths. In fact, I tend to believe that this is actually what is done, and that the graph is just an indication that the register combiners must handle the FP path differently from the INT path.
 
It could be that showing 2 combiners indicates the 2 different functions of the combiner, and not necessarily that there are 2 of them. In the diagram there are other units that are also drawn repeated but are present in the FX at only one per pipeline.
 