#1 |
|
Junior Member
Join Date: Nov 2006
Posts: 27
|
I remember reading somewhere that bilinear texture filtering is effectively "free" on modern GPUs. Is that basically correct? For example, I'm working with a scalar texture (L16) in HLSL and want to perform some specific weighting of the values along the x-axis but simple linear filtering in the y-axis.
Originally, I went with point sampling to gather 4 texels in row 1 and 4 texels in row 2, weighted them, then linearly interpolated between the two rows. This meant 8 tex2D() calls and a bunch of math. When I tried to expand the filtering to 8 texels in each row, the 16 tex2D() calls plus one tex1D() color lookup call made fxc choke with a "texture dependency chain is too complex" error when compiling for SM2.0.

I wasn't happy with all the tex2D() calls and the fact that the 8X sampling would only work on SM3+, so I decided to turn on bilinear filtering on the texture, force the x-axis coord onto the texel center, but leave the y-axis coord unmodified so that the bilinear filtering in the GPU would do the y interpolation for me. This cut out quite a few instructions in the pixel shader along with half the tex2D() calls.

Obviously, I would have to write a short test app to determine which approach offers the best performance. However, I have to think that a highly optimized GPU function (bilinear filtering) has to be better than all those extra arithmetic and point-sampled tex2D() instructions. Any thoughts?

Thanks, Mike
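The trick described in this post can be checked with a small CPU-side model. The sketch below is Python, not HLSL, and is only an illustration under stated assumptions: coordinates are in texel units with texel centers at i + 0.5, addressing is clamp-to-edge, and the names `bilinear`, `snap_to_texel_center`, `filtered_column_tap` and `weighted_filter` are hypothetical. It shows that once x sits exactly on a texel center, the horizontal blend weight is zero, so one bilinear fetch reduces to a linear interpolation between two vertically adjacent texels.

```python
import math


def bilinear(tex, x, y):
    """Reference bilinear filter over a 2D list tex[row][col].

    Assumes texel-unit coordinates (centers at i + 0.5) and clamped edges.
    """
    h, w = len(tex), len(tex[0])
    xb, yb = x - 0.5, y - 0.5
    i, j = math.floor(xb), math.floor(yb)
    a, b = xb - i, yb - j  # horizontal and vertical blend weights

    def fetch(col, row):
        # Clamp-to-edge addressing.
        return tex[min(max(row, 0), h - 1)][min(max(col, 0), w - 1)]

    top = (1 - a) * fetch(i, j) + a * fetch(i + 1, j)
    bottom = (1 - a) * fetch(i, j + 1) + a * fetch(i + 1, j + 1)
    return (1 - b) * top + b * bottom


def snap_to_texel_center(x):
    """Move x to the center of the texel that contains it."""
    return math.floor(x) + 0.5


def filtered_column_tap(tex, x, y):
    """One fetch: x snapped to a texel center, y left for the filter."""
    return bilinear(tex, snap_to_texel_center(x), y)


def weighted_filter(tex, x, y, taps):
    """Custom weights along x, bilinear interpolation along y.

    taps is a list of (x_offset_in_texels, weight) pairs; the offsets and
    weights are up to the caller and are not taken from the original post.
    """
    return sum(w * filtered_column_tap(tex, x + dx, y) for dx, w in taps)
```

With this arrangement, a filter that is N texels wide needs N filtered fetches instead of 2N point-sampled ones, which matches the halving of tex2D() calls reported above.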
|
|
|
|
|
#2 |
|
Member
Join Date: May 2002
Location: Slovenia
Posts: 420
|
Yes, bilinear filtering is effectively free (there is virtually no additional cost for filtering) as long as you don't thrash the texture cache.
|
|
|
|
|
|
#3 |
|
Junior Member
Join Date: Apr 2008
Posts: 32
|
Hardware filtering may have limited precision, depending on the hardware.
For example, Tex3D returned only about 8-10 bits of precision when sampling an L8 texture on G71 (GeForce 7900), even when the texcoords were fp32 (not sure about Tex2D). It works much better now on G92 in DX10, but probably still with less precision than lerp(). And yes, it's free in 2D!
|
|
|
|
|
#4 |
|
Junior Member
Join Date: Nov 2006
Posts: 27
|
I've noticed that the hardware bilinear filtering in ATI GPUs (up to my 3870) is far more grainy than Nvidia GPUs (7900) when zooming in tightly.
I've been testing the new technique and it appears to make a noticeable difference in performance on my Go7800, so I'll stick with it and toss the completely manual filtering I tried first. Note that this is a scientific app, not a game, so data display accuracy and quality are most important. That might appear to conflict with the type of filtering I'm talking about in this thread (to say the least), but I'm working with data samples that have a continuously varying aspect ratio (1:8 up through 8:1 or more). Simple bilinear filtering can't handle the extreme aspect ratios, so I needed an adaptive technique. Thanks for the info.
|
|
|
|
|
#5 | |
|
Member
Join Date: Nov 2007
Posts: 995
|
If you only need your program to work on ATI hardware, you can use the Fetch4 (proprietary ATI DX9) or Gather (DX10.1) functionality. Gather fetches the four samples (red component only) that would be used for bilinear interpolation when sampling a texture (the 2x2 single-channel texels are packed into the ARGB channels). It's very useful if you need to sample adjacent texels in a single-channel format (such as the L16 you are using).
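As a rough illustration of what such a gather gives you, here is a CPU-side Python sketch (not HLSL, and not the real Fetch4 or Gather API). It assumes texel-unit coordinates with centers at i + 0.5 and clamp-to-edge addressing; `gather4` and `bilinear_from_gather` are hypothetical names, and the order of the returned texels is this sketch's own choice rather than the channel order of any actual hardware result.

```python
import math


def gather4(tex, x, y):
    """Return the 2x2 texel footprint that bilinear filtering would read.

    tex is a single-channel texture stored as tex[row][col]. The tuple order
    (top-left, top-right, bottom-left, bottom-right) is an assumption of
    this sketch only.
    """
    h, w = len(tex), len(tex[0])
    i, j = math.floor(x - 0.5), math.floor(y - 0.5)

    def fetch(col, row):
        # Clamp-to-edge addressing.
        return tex[min(max(row, 0), h - 1)][min(max(col, 0), w - 1)]

    return (fetch(i, j), fetch(i + 1, j), fetch(i, j + 1), fetch(i + 1, j + 1))


def bilinear_from_gather(tex, x, y):
    """Full-precision bilinear filter built on top of a single gather."""
    t00, t10, t01, t11 = gather4(tex, x, y)
    a = (x - 0.5) - math.floor(x - 0.5)  # horizontal weight
    b = (y - 0.5) - math.floor(y - 0.5)  # vertical weight
    top = (1 - a) * t00 + a * t10
    bottom = (1 - a) * t01 + a * t11
    return (1 - b) * top + b * bottom
```

The point of the design is that one fetch returns all four neighbours, so the weights can be computed in the shader at full arithmetic precision instead of relying on the fixed-function filter.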
Also, you can still end up limited by bilinear filtering throughput on newer chips, even on ATI's new 4000 series. Here is a quote from Rage3D's review: Quote:
|
|
|
|
|
|
|
#6 |
|
Junior Member
Join Date: Apr 2008
Posts: 32
|
I found some official info about precision. This is from CUDA, but it should give some idea about current or future GPUs:
http://developer.download.nvidia.com...e_2.0beta2.pdf Code:
tex(x, y) = (1 − α)(1 − β) T[i, j] + α(1 − β) T[i + 1, j] + (1 − α)β T[i, j + 1] + αβ T[i + 1, j + 1]
    for a two-dimensional texture,
...
    for a three-dimensional texture,
...
i = floor(xB), α = frac(xB), xB = x − 0.5,
j = floor(yB), β = frac(yB), yB = y − 0.5,
k = floor(zB), γ = frac(zB), zB = z − 0.5.

α, β, and γ are stored in 9-bit fixed point format with 8 bits of fractional value.
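The effect of storing the weights with 8 fractional bits can be sketched on the CPU. The Python below implements the two-dimensional formula quoted above, with an optional quantization step for α and β. It is a model under assumptions, not a description of any particular GPU: the rounding mode (round half up), the clamp-to-edge addressing, and the names `bilinear` and `distinct_levels` are all choices made for this example.

```python
import math


def bilinear(tex, x, y, frac_bits=None):
    """Bilinear filter following the quoted 2D formula.

    tex is stored as tex[row][col]; x and y are in texel units. If frac_bits
    is given, the weights are rounded to that many fractional bits first
    (rounding mode is an assumption of this sketch).
    """
    h, w = len(tex), len(tex[0])
    xb, yb = x - 0.5, y - 0.5
    i, j = math.floor(xb), math.floor(yb)
    a, b = xb - i, yb - j  # alpha and beta in the quoted formula

    if frac_bits is not None:
        scale = 1 << frac_bits
        a = math.floor(a * scale + 0.5) / scale
        b = math.floor(b * scale + 0.5) / scale

    def T(col, row):
        # Clamp-to-edge addressing.
        return tex[min(max(row, 0), h - 1)][min(max(col, 0), w - 1)]

    return ((1 - a) * (1 - b) * T(i, j) + a * (1 - b) * T(i + 1, j)
            + (1 - a) * b * T(i, j + 1) + a * b * T(i + 1, j + 1))


def distinct_levels(tex, steps, frac_bits=None):
    """Count distinct results while sweeping x across one texel interval."""
    return len({bilinear(tex, 0.5 + s / steps, 0.5, frac_bits)
                for s in range(steps)})
```

Sweeping between a texel of 0 and a texel of 65535 (the L16 range) in 4096 steps, the unquantized filter yields a different value at every step, while the 8-bit version can only produce the values reachable with weights k/256. That is the kind of stair-stepping that becomes visible when a single texel covers a large part of the screen.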
|
|
|
|
|
#7 |
|
Member
Join Date: Feb 2002
Posts: 409
|
|
|
|
|
|
|
#8 |
|
Junior Member
Join Date: Nov 2006
Posts: 27
|
Here are a couple of images showing the differences between ATI hardware bilinear sampler and Nvidia's when I zoom in tightly on the data. The first is from my Nvidia Go7800:
http://www.grlevelx.com/downloads/bi...nvidia7800.png

Not bad at all. A little choppy on the upper-right area. Now here's the same data and zoom level on my ATI 3870:
http://www.grlevelx.com/downloads/bilinear_ati3870.png

Ugly! You can clearly see the banding in the sampling in both x and y.

From Enforcer's post, it looks like Nvidia basically subdivides a texel into 256x256 point samples. I wonder what ATI does. I'm not sure where the "graininess" comes into the ATI bilinear sampler. They might use less precision in the position interpolator or less precision in the texture value equations. I know that performing the bilinear filtering manually in a pixel shader eliminates all the graininess in the hardware bilinear filtering.

Mike
|
|
|
|
|
#9 |
|
Senior Member
|
Might be worth a try: Disable "Catalyst A.I." in the driver's options - maybe there's some... unexpected behaviour wrt your application.
According to AMD, AI is not supposed to reduce image quality. If it does, it's a bug.
__________________
English is not my native tongue. Before flaming, please consider the possibility that I did not mean to say what you might have read from my posts. Work | Recreation
Warning! This posting may contain unhealthy doses of gross humor, sarcastic remarks and exaggeration!
|
|
|
|
|
#10 | |
|
Senior Member
Join Date: Mar 2002
Posts: 3,786
|
Quote:
ATI has about 1.5 fewer bits of precision when determining the filter weights. It's rarely a problem for image quality because it's not often that textures are magnified that much with that much contrast between adjacent texels. Generally you only use filtering for games, as even NVidia's precision is insufficient for anything beyond that.

mikegi, I assume you used a CMP function on one channel of a texture to choose between yellow and green, right? Try using the alpha channel to see if there's any difference. If not, then there's a real possibility of alpha-tested foliage or fences looking more jagged on ATI's hardware than on NVidia's when they are close to the camera.
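To put rough numbers on why this only shows up at extreme magnification: if the blend weight inside one texel can take only N distinct values, a texel stretched across P screen pixels changes shade in bands about P/N pixels wide. The Python below is back-of-the-envelope arithmetic, not a hardware model. It assumes the 8 fractional weight bits quoted from the CUDA guide earlier in the thread, takes the "about 1.5 fewer bits" figure above at face value, and uses a made-up zoom of 512 screen pixels per texel; `weight_levels` and `band_width_pixels` are hypothetical helper names.

```python
def weight_levels(frac_bits):
    """Approximate number of distinct blend weights across one texel."""
    return round(2 ** frac_bits)


def band_width_pixels(pixels_per_texel, frac_bits):
    """Approximate on-screen width of one constant-weight band."""
    return pixels_per_texel / weight_levels(frac_bits)


ZOOM = 512  # hypothetical magnification: screen pixels per texel

for label, bits in (("8.0 bits (CUDA guide figure)", 8.0),
                    ("6.5 bits (8 minus the claimed 1.5)", 6.5)):
    print(f"{label}: {weight_levels(bits)} levels, "
          f"{band_width_pixels(ZOOM, bits):.1f} px per band")
```

Under these assumptions the bands are about 2 pixels wide in the first case and between 5 and 6 pixels wide in the second. At ordinary magnifications, where a texel covers only a few pixels, both would be well below one pixel.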
|
|
|
|
|
|
#11 |
|
Senior Member
|
Some levels of A-Interference allegedly force some kind of texture compression - possibly not a lossless one. Thought that running a quick test with just a switch turned the other way might be worth a try. *shrugs*
|
|
|
|
|
#12 | |
|
Junior Member
Join Date: Nov 2006
Posts: 27
|
Quote:
I'm not sure if the grainy ATI bilinear filtering is due to the hardware tex coord interpolation or the hardware data filtering (or both). As I noted earlier, if I do the bilinear filtering manually in the pixel shader then the result is perfectly smooth. I'd say this is just an academic exercise but I actually saw the ATI grainy output on TV when a user zoomed in very close to a storm cell (my program is a weather radar viewer). Mike |
|
|
|
|
|
|
#13 | |
|
Senior Member
Join Date: Mar 2002
Posts: 3,786
|
Quote:
It's very rare that filtering weight precision is a problem. Even EVSM (exponential variance shadow mapping) and SAT-VSM, which are AFAIK the graphics techniques that need more texture precision than any other, don't really care about the precision of the weights.

For your application, is NVidia's precision good enough? If L16 is your data source for your lookup, manual filtering makes sense to me. I doubt you'll see an inordinate performance drop. ATI has Fetch4 as well to simplify everything.
|
|
|
|
|
|
#14 |
|
Junior Member
Join Date: Nov 2006
Posts: 27
|
Nvidia's filtering is good enough and ATI's is fine 95+% of the time. Note that in the images I posted earlier you're looking at just a couple of texels. That's the level of zoom where the chunkiness becomes visible on ATI. I don't imagine that many 3D programs would do such a thing, so you never see the precision problem in them.
|
|
|
|