You know what? Several things you posted are self-contradictory, and I am not sure what you are trying to argue anymore. So let's start with a very simple picture, go through the upscaling process one step at a time, and then you can tell me where you think it is wrong. This image will be the basis for the discussion.
Image 1:
n=5
intensity values = {1, 1, 5, 2, 2}
Normalized DFT values = {0.831522, 0.289018, 0.265996, 0.265996, 0.289018}
This picture will serve as our baseline - the picture we compare everything to. Our upscaling process is going to be simple first-order (linear) interpolation, meaning we draw a line between every pair of pixels and calculate the inserted values along that line. I've written a computer program to do it; however, I will keep the first couple of upscalings small so you can calculate all of the inserted pixels yourself and make sure they are correct. These will be categorized by the number of pixels we insert in between each pixel.
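For reference, here is a minimal sketch of the whole procedure in Python with numpy - not the exact program I used, and the helper names (upscale_linear, normalized_dft_amplitudes) are just illustrative, but it reproduces the same numbers:

    import numpy as np

    def upscale_linear(pixels, k):
        # Insert k linearly interpolated pixels between each original pair.
        n_new = (len(pixels) - 1) * (k + 1) + 1
        old_x = np.arange(len(pixels))
        new_x = np.linspace(0, len(pixels) - 1, n_new)
        return np.interp(new_x, old_x, pixels)

    def normalized_dft_amplitudes(pixels):
        # Amplitude of each DFT bin, scaled so the amplitude vector has length 1.
        amps = np.abs(np.fft.fft(pixels))
        return amps / np.linalg.norm(amps)

    original = np.array([1.0, 1.0, 5.0, 2.0, 2.0])
    print(normalized_dft_amplitudes(original))             # {0.8315, 0.2890, 0.2660, ...}
    print(upscale_linear(original, 1))                     # {1, 1, 1, 3, 5, 3.5, 2, 2, 2}
    print(normalized_dft_amplitudes(upscale_linear(original, 1)))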
1 pixel insert:
n=9
intensity values = {1, 1, 1, 3, 5, 7/2, 2, 2, 2}
Normalized DFT values = {0.873131, 0.284212, 0.177299, 0.0767834, 0.0267113, 0.0267113, 0.0767834, 0.177299, 0.284212}
2 pixel insert:
n=13
intensity values = {1, 1, 1, 1, 7/3, 11/3, 5, 4, 3, 2, 2, 2, 2}
Normalized DFT values = {0.882523, 0.283273, 0.159256, 0.0555189, 0.0197686, 0.0199317, 0.0331723, 0.0331723, 0.0199317, 0.0197686, 0.0555189, 0.159256, 0.283273}
10 pixel insert:
n=45
intensity values = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 15/11, 19/11, 23/11, 27/11, 31/11, 35/11, 39/11, 43/11, 47/11, 51/11, 5, 52/11, 49/11, 46/11, 43/11, 40/11, 37/11, 34/11, 31/11, 28/11, 25/11, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2}
Normalized DFT values = {0.890959, 0.28326, 0.141029, 0.0376519, 0.0155979, 0.0168026, 0.0189081, 0.0110707, 0.00812032, 0.00812838, 0.00895059, 0.00684143, 0.00579968, 0.00573862, 0.00614919, 0.00534881, 0.00480776, 0.00476649, 0.0050594, 0.00476095, 0.0044059, 0.00439216, 0.0046779, 0.0046779, 0.00439216, 0.0044059, 0.00476095, 0.0050594, 0.00476649, 0.00480776, 0.00534881, 0.00614919, 0.00573862, 0.00579968, 0.00684143, 0.00895059, 0.00812838, 0.00812032, 0.0110707, 0.0189081, 0.0168026, 0.0155979, 0.0376519, 0.141029, 0.28326}
50 pixel insert:
*Note: This one has enough pixels that posting the corresponding numbers would take pages. As such, it is just for visual reference.
Now we can start going through individual points raised in this thread.
1) Does upscaling blur an image when using interpolation to fill in the missing pixels?
Visual check:
Starting with the original image, we have 2 areas separated by a hard black line. Inserting 1 pixel, we already get 2 areas separated by 2 lighter gray lines and a black line. As we go through to an upscaling factor of 50, we get a very blurry center line separating the two fields. Note that something else happens in this picture as well, which is why it is included. At this point, there are enough pixels that I have surpassed the human eye's ability to distinguish them individually. The eye just does not have the spatial resolution to do so. As such, you probably see a black bar in the center that is similar in size to the original black bar. If you move your eyes closer to and farther from this particular picture, you will see the location of that bar shift. THAT effect is what happens when you have too many frequencies in a small area. Notice that the blur will still exist on the edges at any distance - but now you have an added effect where the image can change based on viewpoint.
So at this point we can conclude with 100% confidence that upscaling - at least by this algorithm - blurs the upscaled image. There is no question as to whether or not it happens, no amount of math that can claim it doesn't, no "distance based frequency domain of the human eye" to say it doesn't exist. The pictures clearly show it exists and clearly show that it can be seen by the human eye.
That in and of itself ends the original argument. Upscaling (at least by interpolation) does indeed blur images. As I said clear back in the beginning, some algorithms blur less than others. The generalized way to see this deals with averages: when you take an average to fill in a pixel, you necessarily lower the amplitude of the nearest-neighbor (highest) frequencies. That is unfortunately a real effect of taking an average.
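A two-line demonstration of that last point (a sketch, again assuming numpy): take the highest-frequency signal there is - a pure nearest-neighbor alternation - and fill in its midpoints by averaging. The alternation vanishes entirely.

    import numpy as np

    x = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])  # pure nearest-neighbor alternation
    midpoints = (x[:-1] + x[1:]) / 2                 # the values interpolation inserts
    print(midpoints)                                 # [0. 0. 0. 0. 0.] - completely flattened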
2) If we define blur as a type of low-pass filter, does upscaling suppress high frequencies?
Frequencies here are defined by the intensity differences between pixels. So a "high" frequency produces "sharp" lines - in other words, a high frequency corresponds to a sudden change from one intensity to another much higher or lower one. Those definitions just give context to the discussion.
Let us just start with that definition and look at the first two pictures. If frequency is defined as the rise and fall of intensities between pixels, we can just compute the difference between each neighboring pair of pixels to get:
{0,4,-3,0}
and
{0, 0, 2, 2, -1.5, -1.5, 0, 0}
Now, the largest-magnitude difference in the second series is 2. The largest in the first is 4. So we could definitely describe this situation as a suppression of high frequencies. Low frequencies made it through fine, but high frequencies tended to be attenuated.
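These difference series are trivial to compute yourself; in numpy it is a single call:

    import numpy as np

    print(np.diff([1, 1, 5, 2, 2]))                  # [ 0  4 -3  0]
    print(np.diff([1, 1, 1, 3, 5, 3.5, 2, 2, 2]))    # [ 0.  0.  2.  2. -1.5 -1.5  0.  0.]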
This is admittedly not mathematically rigorous, so we need to develop another method for doing this comparison. At this point it is worthwhile to stop and talk about Discrete Fourier Transforms. What a DFT does, at its most basic level, is transform a set of points into a set of numbers corresponding to the amplitudes and phases of sine functions whose sum makes up the picture. A practical example might be in order, so let us take the DFT of our original picture and consider it first. Here is the full DFT (using the 1/sqrt(n) normalization convention):
{4.91935 + 0.i, -1.67082 + 0.363271i, 0.32918 - 1.53884i, 0.32918 + 1.53884i, -1.67082 - 0.363271i}
Each of these points has an amplitude and a phase corresponding to a particular frequency. While I won't go through the math here, the first value is the LOWEST frequency. It is important to understand that when we start talking about plots in frequency space. Note that each number is an amplitude - a large value does NOT indicate a high frequency. These numbers are hard to compare directly. However, we can take the amplitudes of these values and normalize them. That gives us:
{0.831522, 0.289018, 0.265996, 0.265996, 0.289018}
Please note that the normalization won't affect anything from this point on; the results would be the same with or without it. Normalizing the vector just gives smaller numbers to compare. All of the DFT values I have given are normalized amplitudes. Phase isn't really important for us in this discussion. Before doing any labeling, it is also worthwhile to note that some numbers are repeated here. If you look at the original set of complex numbers, why that happens becomes apparent: these are conjugate pairs - frequencies where the amplitude is the same but the phase is opposite. This is expected; it always happens for real-valued input. This generally leads people to write DFTs shifted so that the lowest frequency (the first number) is in the middle.
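You can verify the conjugate pairing directly (a sketch; note that numpy uses the opposite sign convention in the exponent from some math packages, so the imaginary parts may come out flipped - the amplitudes are identical):

    import numpy as np

    dft = np.fft.fft([1, 1, 5, 2, 2]) / np.sqrt(5)   # 1/sqrt(n) convention, as above
    print(dft)                                       # bins k and n-k are complex conjugates
    amps = np.abs(dft)
    print(amps / np.linalg.norm(amps))               # {0.8315, 0.2890, 0.2660, 0.2660, 0.2890}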
If you ever see a frequency diagram that is symmetric about the middle rather than about the edges, the point in the middle is always the lowest frequency. If the author tells you differently, he misunderstood his own work.
Back to the example. We can label the frequency terms as:
0.831522 - Low
0.289018 - Middle
0.265996 - High
Remember that these are amplitudes, so we can make direct comparisons. Now, let's take a look at the smallest interpolation. Extracting the same values (remember that each corresponds to the same frequency in the sine-wave expansion) we get:
0.873131 - Low
0.284212 - Mid
0.177299 - High
Now, we can immediately see what has happened. The low frequency was slightly amplified - i.e., its amplitude increased. The mid and high frequencies were suppressed - their amplitudes decreased. Note that even the sum of all the smaller high-frequency amplitudes that appear as a byproduct of enlarging the frequency domain would still not equal the original high-frequency mark. Maybe it was just a fluke though. Let's take the second set:
0.882523 - Low
0.283273 - Mid
0.159256 - High
The same pattern exists! As a matter of fact, it is amplified this time. We can take the 10 pixel insert as well and look at it:
0.890959 - Low
0.283260 - Mid
0.141029 - High
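All three of these comparisons can be reproduced in a few lines (the same numpy sketch as before, made self-contained; k is the number of pixels inserted per gap):

    import numpy as np

    original = np.array([1.0, 1.0, 5.0, 2.0, 2.0])
    for k in (0, 1, 2, 10):
        n = (len(original) - 1) * (k + 1) + 1
        new_x = np.linspace(0, len(original) - 1, n)
        ups = np.interp(new_x, np.arange(len(original)), original)
        amps = np.abs(np.fft.fft(ups))
        amps /= np.linalg.norm(amps)
        print(k, amps[:3])   # the low, mid, and high bins shared with the original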
What have we now shown? Simple - that when upscaling by linear interpolation, high frequencies are suppressed. At this point, there really is no counterargument; it is clear from the numbers what happened. Does it make sense with what we see visually? Of course - the pictures still look blurred. The last question to ask is whether we can represent this visually.
3) How do we visually represent the frequency domain, and how do we interpret those representations?
Because we have amplitudes for each of the frequency components, the easiest thing to do would just be to plot those amplitudes. Here is that done for our original picture:
Notice that the color represents the amplitude of each frequency. The relative position indicates whether a frequency is lower or higher. The problem is that, using our nomenclature from above, this graph would represent:
Low, Middle, High, High, Middle
Because the conjugate frequency pairs appear at opposite ends of the sequence, this diagram is not easy to read. So we generally shift it so the LOWEST frequency is in the middle and the high frequencies are all on the outside - like this:
(note - post limited to 6 images, so these next 2 are just links to the images)
Frequency shifted graph for original picture
Now we have a graph that represents:
High, Middle, Low, Middle, High
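That reordering is common enough that numpy provides it directly as fftshift; a quick sketch:

    import numpy as np

    amps = np.abs(np.fft.fft([1, 1, 5, 2, 2]))
    amps /= np.linalg.norm(amps)
    print(amps)                   # Low, Middle, High, High, Middle
    print(np.fft.fftshift(amps))  # High, Middle, Low, Middle, High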
This shifted layout is useful. We can see the low frequencies clearly in the center. If we do the same thing for one of our more complicated graphs - say the inbetween=50 graph - we get:
Frequency Shifted graph for inbetween=50
Notice the huge blue bars on either end here. This is the first place we can actually talk about Nyquist's theorem. Nyquist's theorem tells you the minimum sampling rate you need to capture a wave of a given frequency; turned around, it tells you the maximum frequency you can measure from a given number of samples. As we increase the number of pixels, we increase the number of frequencies the image can represent. But we are sampling areas where there was no information in the original picture, so the only possible result is almost nothing in those new high-frequency bins - i.e., you should get amplitudes that are very low or next to 0, just as is shown in this bar.
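You can confirm this numerically for the 50-pixel insert (same sketch as before): the outermost, highest-frequency bins are tiny compared to the center.

    import numpy as np

    original = np.array([1.0, 1.0, 5.0, 2.0, 2.0])
    n = (len(original) - 1) * 51 + 1                  # 50 inserted pixels per gap -> n = 205
    ups = np.interp(np.linspace(0, 4, n), np.arange(5), original)
    amps = np.abs(np.fft.fft(ups))
    amps /= np.linalg.norm(amps)
    shifted = np.fft.fftshift(amps)                   # lowest frequency now in the middle
    print(shifted[:20].max())                         # outer (highest) frequencies: nearly 0
    print(shifted[n // 2])                            # the low-frequency bin in the center: ~0.9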
However, look at what happens near the middle bar. In our original picture, the middle bar was a red stripe surrounded by blue. In this picture, there are clear white stripes around the blue bar. Remember that these are in the low-frequency area. What do they indicate? They indicate that low frequencies were shifted up slightly - which, since the amplitudes are normalized, is exactly the suppression of high frequencies shown above.
Notice that just claiming the line exists in both pictures does not mean the high frequencies are the same - especially since that line does not represent the high frequencies at all. Just as if you had a word - say "beta" - embedded in the images, the location of the word would not tell you anything; it should be there. What will tell you something is the colors around that word. If they go from, say, yellow in the original to blue and red lines in the upscaled version, that is an obvious suppression of high-frequency components.
So, all 3 sources are consistent. You can see visually that blurring appears. You can see in the DFT that blurring appears. You can see in the plotted frequency diagrams that blurring appears. Consistent start to finish, one interpretation, and no room for "yeah buts". There really is no wiggle room here. The effect is clearly visible to the eye. It shows up in the math. It shows up in the frequency diagrams. It is in peer-reviewed published journals. You can find it in introductory signal analysis textbooks.
Now, we can go back to bickering over math, or you can just look at the pictures in this post and see that the effect exists. If the effect exists, it MUST appear somewhere in the math. I've already shown you where, but if you think you have a better explanation that would give this effect without changing the math, I'd love to hear it.