Supporting floating-point is meaningless, isn't it?

JC
"When the output goes to a normal 32 bit framebuffer, as all current tests do, it is possible for Nvidia to analyze data flow from textures, constants, and attributes, and change many 32 bit operations to 16 or even 12 bit operations with absolutely no loss of quality or functionality. This is completely acceptable, and will benefit all applications, but will almost certainly induce hard to find bugs in the shader compiler. You can really go overboard with this -- if you wanted every last possible precision savings, you would need to examine texture dimensions and track vertex buffer data ranges for each shader binding. That would be a really poor architectural decision, but benchmark pressure pushes vendors to such lengths if they avoid outright cheating. If really aggressive compiler optimizations are implemented, I hope they include a hint or pragma for "debug mode" that skips all the optimizations."
"Change many 32 bit operations to 16 or even 12 bit operations with absolutely no loss of quality or functionality."

Even FX12, with no loss of quality or functionality???
So that means FP is useless, doesn't it?

And it means the IQ of the NV30 path is identical to the IQ of ARB2???
I am right, ain't I?
 
jerry_enCater said:
JC
"When the output goes to a normal 32 bit framebuffer, as all current tests do, it is possible for Nvidia to analyze data flow from textures, constants, and attributes, and change many 32 bit operations to 16 or even 12 bit operations with absolutely no loss of quality or functionality. This is completely acceptable, and will benefit all applications, but will almost certainly induce hard to find bugs in the shader compiler. You can really go overboard with this -- if you wanted every last possible precision savings, you would need to examine texture dimensions and track vertex buffer data ranges for each shader binding. That would be a really poor architectural decision, but benchmark pressure pushes vendors to such lengths if they avoid outright cheating. If really aggressive compiler optimizations are implemented, I hope they include a hint or pragma for "debug mode" that skips all the optimizations."
"Change many 32 bit operations to 16 or even 12 bit operations with absolutely no loss of quality or functionality."

Even FX12, with no loss of quality or functionality???
So that means FP is useless, doesn't it?

And it means the IQ of the NV30 path is identical to the IQ of ARB2???
I am right, ain't I?

'Fraid not.

Basically -

MANY does not equal ALL.

MANY does not even necessarily equal MOST.

In some cases MANY is approximately NONE.

It's all very dependent on what the shader is doing.

- Andy.
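
To put a rough number on the 32 bit framebuffer point, here is a quick numpy sketch (numpy standing in for the shader ALU and the 8-bit-per-channel write, not anything Nvidia's compiler actually does): an ordinary albedo * N.L colour op done at fp32 and again at fp16, both written out to an 8-bit target.

Code:
import numpy as np

rng = np.random.default_rng(1)
albedo = rng.random((100000, 3), dtype=np.float32)   # 0..1 surface colours
ndotl = rng.random((100000, 1), dtype=np.float32)    # 0..1 lighting term

def to_8bit(x):
    # What writing to a 32 bit (8:8:8:8) framebuffer does to the shader result.
    return np.clip(np.rint(x.astype(np.float64) * 255.0), 0, 255).astype(np.uint8)

full = to_8bit(albedo * ndotl)                                        # fp32 maths
half = to_8bit(albedo.astype(np.float16) * ndotl.astype(np.float16))  # fp16 maths

diff = np.abs(full.astype(int) - half.astype(int))
print("worst difference after the 8-bit write:", diff.max())
# Prints 0 or 1 -- at most one step out of 255, which is the kind of op where
# dropping the precision costs nothing visible.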
 
No, it depends.
See the Mandelbrot demo for a good example of a place where using low precision is catastrophic.


Uttar
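
For anyone who wants to see Uttar's example in numbers, here is a rough stand-in (plain numpy, nothing to do with the actual demo's code; the strip of coordinates is just one picked near the set boundary):

Code:
import numpy as np

def escape_iters(cx, cy, dtype, max_iter=256):
    # Classic escape-time loop, z = z*z + c, carried out entirely in `dtype`.
    cx, cy = dtype(cx), dtype(cy)
    zx, zy = dtype(0.0), dtype(0.0)
    for i in range(max_iter):
        zx, zy = zx * zx - zy * zy + cx, dtype(2.0) * zx * zy + cy
        if zx * zx + zy * zy > dtype(4.0):
            return i
    return max_iter

# 64 "pixels" across a narrow strip near the boundary, 1e-4 apart.
cy = 0.13182
xs = [-0.74364 + i * 1e-4 for i in range(64)]
diff = sum(escape_iters(x, cy, np.float32) != escape_iters(x, cy, np.float16) for x in xs)
print(f"{diff}/64 pixels get a different escape count at fp16")
# fp16 steps near x = -0.74 are about 5e-4 wide, so several neighbouring pixels
# collapse onto the same coordinate and whole bands come out the wrong colour.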
 
In most cases, to make use of high precision calculations you need high precision inputs. These can be:

1. High precision constants
2. High precision texture coordinates
3. Data from high precision textures

You do not, on the other hand, have to have high precision render targets to see the difference.

For example, the Mandelbrot shader starts with (2), and so it can (and does) make use of high precision calculation.

Doom3 has high precision texture coordinates, but since they go into a dot product with a normal vector coming from a low precision texture, it loses that advantage.
That's why it won't look _much_ better on R300 than on NV30.
To look much better it would need high precision normal maps.
(The 9700 car paint demo clearly shows the difference it can make!)
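
A loose numpy sketch of that inputs point (made-up data and my own toy emulation of an 8-bit normal map, not Doom3 code): once the normal has been through an 8-bit map, running the N.L dot product at fp16 instead of fp32 barely changes anything, because the map already threw more precision away.

Code:
import numpy as np

rng = np.random.default_rng(2)
n = rng.normal(size=(50000, 3))
n /= np.linalg.norm(n, axis=1, keepdims=True)           # "true" unit normals
l = np.array([0.3, 0.5, 0.8]); l /= np.linalg.norm(l)   # light direction

n8 = np.round((n * 0.5 + 0.5) * 255.0) / 255.0 * 2.0 - 1.0   # 8-bit normal map

def ndotl(normals, light, dtype):
    return (normals.astype(dtype) * light.astype(dtype)).sum(axis=1).astype(np.float64)

d32 = ndotl(n8, l, np.float32)   # low precision normal, fp32 maths
d16 = ndotl(n8, l, np.float16)   # low precision normal, fp16 maths
ref = ndotl(n, l, np.float64)    # what a high precision normal map would give

print("fp16 vs fp32 maths:       ", np.abs(d16 - d32).max())
print("8-bit map vs true normals:", np.abs(d32 - ref).max())
# The quantised normal map, not the fp16 maths, dominates the error here.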
 
Hehehe. . . I bet you were one of those guys who pimped NVidia's FP32 as making the NV30 superior to the R300, back during the hype. :p

Anyway, 32 bit output makes a lack of precision have less of an effect on the image than it would if output were in 64 bit or 128 bit, but there is still a visible difference in many circumstances. It may not necessarily be in procedural textures, either. I've seen several examples of a distinct quality difference in shading and specular reflections between PS1.x and PS2.0. It's also very likely that any dependent texture lookups would suffer from inaccuracy caused by low precision.
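
On the dependent lookup point, a quick toy example (the texture size is made up, and this ignores whatever the real addressing hardware does internally): feed a fp16 coordinate near 1.0 into a 1024-texel texture and look at the sub-texel weight a bilinear filter would get.

Code:
import numpy as np

tex_size = 1024
coords = np.linspace(0.9, 1.0, 4001)      # finely spaced coordinates near 1.0

def filter_fraction(c, dtype):
    # Sub-texel position a bilinear filter would use for this coordinate.
    x = (c.astype(dtype) * dtype(tex_size)).astype(np.float64)
    return x - np.floor(x)

f32 = filter_fraction(coords, np.float32)
f16 = filter_fraction(coords, np.float16)

print("distinct filter weights at fp32:", len(np.unique(np.round(f32, 3))))
print("distinct filter weights at fp16:", len(np.unique(np.round(f16, 3))))
# At fp16 the weights collapse to just a couple of values, so a dependent read
# through this coordinate snaps in visible half-texel steps.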
 
Hyp-X said:
Doom3 has high precision texture coordinates, but since they go into a dot product with a normal vector coming from a low precision texture, it loses that advantage.
That's why it won't look _much_ better on R300 than on NV30.
To look much better it would need high precision normal maps.
(The 9700 car paint demo clearly shows the difference it can make!)

Hey, a very good point I forgot all about: just imagine when the Doom III engine gets used in other games with high precision normal maps (FP16 should do nicely). Drool... 8)
 
Well, for HDR in the rthdribl demo you can see some banding when using partial precision (yes, the demo always asks for partial precision, but since the Radeon always uses full precision you can compare partial vs. full precision using an NVIDIA card and an ATI card).

Perhaps with some tweaks you can have beautiful HDR even at partial precision; personally I don't know.
 
Well, for HDR in the rthdribl demo you can see some banding when using partial precision (yes, the demo always asks for partial precision, but since the Radeon always uses full precision you can compare partial vs. full precision using an NVIDIA card and an ATI card).
I pointed this out in another thread but NV3x cards currently don't support FP render targets, which the demo uses with R3xx ones. Until they do, one can't state specifically that such banding is purely down to the differences between full and partial precision rendering. The average number of passes for that demo is around 40 (although it does depend on what effects are being used); for a good number of those, the NV3x will be losing out to the R3xx as it is having to use integer render targets instead.
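
Here is a hedged little sketch of the render-target side of it (the numbers are invented and have nothing to do with rthdribl's actual passes): accumulating forty small contributions through an 8-bit integer target versus a fp16 one.

Code:
import numpy as np

passes = 40
contribution = 0.0015            # tiny amount of light added per pass (made up)

def accumulate(store):
    total = 0.0
    for _ in range(passes):
        total = store(total + contribution)   # written back to the RT every pass
    return total

int8_rt = accumulate(lambda v: round(v * 255.0) / 255.0)   # 8-bit integer render target
fp16_rt = accumulate(lambda v: float(np.float16(v)))       # fp16 render target
exact = passes * contribution

print(f"exact {exact:.4f}   fp16 RT {fp16_rt:.4f}   int8 RT {int8_rt:.4f}")
# The 8-bit target never gets off zero -- each pass is below half a step and is
# rounded away on every write -- while the fp16 target keeps essentially all of
# it. Larger per-pass values survive, but show up as the banding mentioned above.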
 
You don't necessarily need to have high precision inputs to require greater range in the output. Arithmetic can increase the range needed quite a bit.

An example in Doom3 (or similar engine) is with multiple overlapping lights.
 
AndrewM said:
You don't necessarily need to have high precision inputs to require greater range in the output. Arithmetic can increase the range needed quite a bit.

An example in Doom3 (or similar engine) is with multiple overlapping lights.

Of course, Doom3 renders each light in a separate pass, so that doesn't make much difference.

The range can increase if the light has a value higher than 1.0, but Doom3 won't use overly bright lights. :)
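
A toy illustration of AndrewM's range point (the FX12 format here is the commonly quoted s1.10 fixed point, roughly -2 to +2 in 1/1024 steps -- treat the exact format as an assumption on my part): perfectly ordinary inputs, but sum a few overlapping lights inside one shader and you are already out of range.

Code:
import numpy as np

def fx12(v):
    # Clamp and quantise to signed 1.10 fixed point (assumed range and precision).
    return float(np.clip(np.round(v * 1024.0) / 1024.0, -2048.0 / 1024.0, 2047.0 / 1024.0))

lights = [0.9, 0.9, 0.9]                        # three lights hitting the same pixel
print("true sum:       ", sum(lights))          # 2.7
print("summed in FX12: ", fx12(sum(lights)))    # clamps just under 2.0
# Doom3 dodges this by doing one light per pass, which is the point above:
# each individual light stays comfortably inside the range.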
 
Reverend said:
jerry_enCater said:
Thank you all.
One more question.
Does that mean the IQ of the NV30 path is identical to the IQ of ARB2, according to what JC said???

http://www.beyond3d.com/interviews/jcnv30r300/index.php?p=2

I think you're forgetting the words "specific instances" in your one-dimensional, generalized thinking about IQ.
I am confused.
Can you tell me how to interpret "Change many 32 bit operations to 16 or even 12 bit operations with absolutely no loss of quality or functionality"?

Absolutely no loss of quality or functionality????
 
jerry_enCater said:
I am confused.
Can you tell me how to interpret "Change many 32 bit operations to 16 or even 12 bit operations with absolutely no loss of quality or functionality"?

Absolutely no loss of quality or functionality????
It depends on the situation, but it can be done (and should only be done by the developer, of course).

Typically, texture operations (such as dependent texture reads) will require 32-bit precision to work properly.

Most color operations, for relatively short shaders, will require no more than 16-bit FP.

Weighted averages of no more than about 4 values should work just fine with FX12 (as long as the shader doesn't later multiply the colors by a number greater than one).

As long as the output is just 8-bit, different operations will require different amounts of precision for optimal visual quality.
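
To make the FX12 weighted-average case concrete, a hedged numpy sketch (my own quantise-after-every-op toy emulation of a 12-bit fixed point unit with 1/1024 steps, not NV30's actual combiner behaviour): blend four colours with weights summing to one, write the result to an 8-bit channel, and compare against the same maths at full precision.

Code:
import numpy as np

def fx12(x):
    # Toy signed 1.10 fixed point: clamp to roughly [-2, 2), steps of 1/1024 (assumed format).
    return np.clip(np.round(np.asarray(x, dtype=np.float64) * 1024.0) / 1024.0,
                   -2048.0 / 1024.0, 2047.0 / 1024.0)

rng = np.random.default_rng(3)
cols = rng.random((20000, 4))            # four input colours per pixel
w = np.array([0.4, 0.3, 0.2, 0.1])       # weights summing to 1
wq = fx12(w)

# Quantise-after-every-op FX12 emulation vs. the same maths at full precision.
acc = np.zeros(len(cols))
for i in range(4):
    acc = fx12(acc + fx12(fx12(cols[:, i]) * wq[i]))
ref = (cols * w).sum(axis=1)

to8 = lambda x: np.clip(np.rint(x * 255.0), 0, 255).astype(int)  # 8-bit channel write
diff = np.abs(to8(acc) - to8(ref))
print("worst 8-bit difference:", diff.max())
# A step or so out of 255 at worst -- invisible, as long as nothing later
# multiplies the result by something bigger than one (the caveat above).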
 
jerry_enCater said:
Does that mean the IQ of the NV30 path is identical to the IQ of ARB2, according to what JC said???

No.

The NV30 path (like the NV20 and R200 paths) uses an approximation of the power function which is the heart of the specular highlight calculation.
It also approximates normal vector normalization with a lookup from a cube-map.

The ARB2 path uses a native power function, and arithmetic vector normalization.

The place where you'll see a difference is specular highlights.

Edit: NV20 and R200 paths.
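
For a feel of why the approximation shows up where it does, here is a generic sketch (a 256-entry lookup table standing in for "power function approximated via a texture"; the table, exponent and method are my own, not the ones the engine actually uses):

Code:
import numpy as np

exponent = 32.0                                   # arbitrary specular exponent
table = np.linspace(0.0, 1.0, 256) ** exponent    # precomputed pow() "texture"

def approx_pow(ndoth):
    # Nearest-entry lookup, the way a small unfiltered 1D texture would behave.
    idx = np.clip(np.rint(ndoth * 255.0).astype(int), 0, 255)
    return table[idx]

ndoth = np.linspace(0.0, 1.0, 10001)
exact = ndoth ** exponent
err = np.abs(approx_pow(ndoth) - exact)

print("max error:", err.max(), "at N.H =", ndoth[err.argmax()])
# The error is concentrated where N.H is close to 1 -- i.e. right in the
# specular highlight -- which is why that's where the two paths look different.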
 
In my own experimentation I've found that arithmetically normalizing the interpolated vector in the fragment program and using a cubemap produce quite similar results, which aren't too accurate once the light gets close to the surface. The brightest point on the surface tends to move in a wave as it moves across from one edge to the other. Of course, the advantage of the former is that you don't have to store a bigass cubemap.

Back on my Radeon 8500 I was able to develop a way to calculate the light vector in the fragment program, but it was limited by the -8 to 8 range that GL_ATI_fragment_shader allowed (though trying this in PS1.1-1.3 would be even more troublesome, if not impossible). Also, not being able to normalize the vector led to some inaccuracies. With the high dynamic range of floating point precision, however, this should be no problem at all. The result should be lighting calculated for each individual pixel, with none of the problems that come from interpolating the light vector.
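
A loose sketch of that difference (made-up positions; the -8 to 8 clamp and the missing normalize are just the two limitations mentioned above, not an emulation of GL_ATI_fragment_shader itself):

Code:
import numpy as np

light_pos = np.array([12.0, 5.0, 3.0])        # light more than 8 units away
frag_pos = np.array([0.5, 0.25, 0.0])
n = np.array([0.0, 0.0, 1.0])                 # surface normal

l = light_pos - frag_pos

# Old fixed-range path: components clamp at +/-8 and the vector is never unit length.
l_clamped = np.clip(l, -8.0, 8.0)
old_ndotl = np.dot(n, l_clamped)              # wrong scale and wrong direction

# Floating point path: full range plus an arithmetic normalize.
l_norm = l / np.linalg.norm(l)
fp_ndotl = np.dot(n, l_norm)

print("clamped, unnormalised N.L:", old_ndotl)
print("normalised fp N.L:        ", fp_ndotl)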
 
Hyp-X said:
The NV30 path (like the NV20 and R300 paths) uses an approximation of the power function which is the heart of the specular highlight calculation.
I believe you meant R200 above, because the "R300 path" is the ARB2 path.
 
OpenGL guy said:
Hyp-X said:
The NV30 path (like the NV20 and R300 paths) uses an approximation of the power function which is the heart of the specular highlight calculation.
I believe you meant R200 above, because the "R300 path" is the ARB2 path.

Yes I meant that, thanks.
 
Hyp-X said:
The NV30 path (like the NV20 and R200 paths) uses an approximation of the power function which is the heart of the specular highlight calculation.
It also approximates normal vector normalization with a lookup from a cube-map.

The ARB2 path uses a native power function, and arithmetic vector normalization.

The place where you'll see a difference is specular highlights.

I know this was the state of things as of Carmack's last .plan on the subject, but I wonder why he wouldn't make these changes to the NV30 path as well. After all, these were part of a list of "little tweaks" he'd added to the ARB2 path because "working with ARB_fragment_program is really a lot of fun". At the time my impression was that the features that made it into the final engine would also be ported to NV30, which you are asserting is not the case here. I wonder why not...

...well, presumably because the NV30 path uses FX12 and doesn't have the dynamic range to pull it off properly? (Am I correct to assume it's an issue of range, not precision?) Back when the .plan was written we thought the 100% performance advantage of the NV30 path over the ARB2 path (on NV30 hardware) was due to so-called "driver twitchiness" or FP16 vs. FP32, but looking back with the knowledge we've gained about NV30's architecture, it seems more likely to be FX12 vs. FP16 that's at issue.

(Actually, Carmack's discussion in the .plan takes on a slightly sinister cast today: "Nvidia assures me that there is a lot of room for improving the fragment program performance with improved driver compiler technology." :devilish:)

In a certain way, this lends credibility to the notion that NV35 will use the ARB2 path and not the NV30 path. I can't see it defaulting to a mode which compromises IQ (even slightly) if it gains little to no performance benefit from doing so. Given that NV30 never had a significant retail presence, the GPUs that will run the NV30 path are really the NV31 and NV34, where the slight IQ drop may be considered more acceptable because they don't claim to be top-of-the-line.

Not that that will prevent huge controversies over the proper mode to use when benchmarking GFfx 5600s vs. Radeon 9600s. Unless, of course, no one cares about those cards anymore by the time Doom3 is released.
 