You know what? Several things you posted are self-contradictory, and I am not sure what you are trying to argue anymore. So let's start with a very simple picture, go through the upscaling process one step at a time, and then you can tell me where you think it is wrong. This image will be the basis for the discussion.
Image 1:
n=5
intensity values = {1, 1, 5, 2, 2}
Normalized DFT values = {0.831522, 0.289018, 0.265996, 0.265996, 0.289018}
This picture will serve as our baseline - the picture we compare everything to. Our upscaling process is going to be simple first-order (linear) interpolation, meaning we draw a line between every pair of pixels and calculate the inserted values along that line. I've written a computer program to do it; however, I will keep the first couple of upscalings small so you can calculate all of the inserted pixels yourself and make sure they are correct. These will be categorized by the number of pixels we insert in between each pixel.
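For reference, here is a minimal sketch of the whole procedure in Python with numpy - not the exact program I used, and the helper names (upscale_linear, normalized_dft_amplitudes) are just illustrative, but it reproduces the same numbers:

    import numpy as np

    def upscale_linear(pixels, k):
        # Insert k linearly interpolated pixels between each original pair.
        n_new = (len(pixels) - 1) * (k + 1) + 1
        old_x = np.arange(len(pixels))
        new_x = np.linspace(0, len(pixels) - 1, n_new)
        return np.interp(new_x, old_x, pixels)

    def normalized_dft_amplitudes(pixels):
        # Amplitude of each DFT bin, scaled so the amplitude vector has length 1.
        amps = np.abs(np.fft.fft(pixels))
        return amps / np.linalg.norm(amps)

    original = np.array([1.0, 1.0, 5.0, 2.0, 2.0])
    print(normalized_dft_amplitudes(original))             # {0.8315, 0.2890, 0.2660, ...}
    print(upscale_linear(original, 1))                     # {1, 1, 1, 3, 5, 3.5, 2, 2, 2}
    print(normalized_dft_amplitudes(upscale_linear(original, 1)))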
1 pixel insert:
n=9
intensity values = {1, 1, 1, 3, 5, 7/2, 2, 2, 2}
Normalized DFT values = {0.873131, 0.284212, 0.177299, 0.0767834, 0.0267113, 0.0267113, 0.0767834, 0.177299, 0.284212}
2 pixel insert:
n=13
intensity values = {1, 1, 1, 1, 7/3, 11/3, 5, 4, 3, 2, 2, 2, 2}
Normalized DFT values = {0.882523, 0.283273, 0.159256, 0.0555189, 0.0197686, 0.0199317, 0.0331723, 0.0331723, 0.0199317, 0.0197686, 0.0555189, 0.159256, 0.283273}
10 pixel insert:
n=45
intensity values = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 15/11, 19/11, 23/11, 27/11, 31/11, 35/11, 39/11, 43/11, 47/11, 51/11, 5, 52/11, 49/11, 46/11, 43/11, 40/11, 37/11, 34/11, 31/11, 28/11, 25/11, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2}
Normalized DFT values = {0.890959, 0.28326, 0.141029, 0.0376519, 0.0155979, 0.0168026, 0.0189081, 0.0110707, 0.00812032, 0.00812838, 0.00895059, 0.00684143, 0.00579968, 0.00573862, 0.00614919, 0.00534881, 0.00480776, 0.00476649, 0.0050594, 0.00476095, 0.0044059, 0.00439216, 0.0046779, 0.0046779, 0.00439216, 0.0044059, 0.00476095, 0.0050594, 0.00476649, 0.00480776, 0.00534881, 0.00614919, 0.00573862, 0.00579968, 0.00684143, 0.00895059, 0.00812838, 0.00812032, 0.0110707, 0.0189081, 0.0168026, 0.0155979, 0.0376519, 0.141029, 0.28326}
50 pixel insert:
*Note: This one has enough pixels that posting the corresponding numbers would take pages. As such, it is just for visual reference.
Now we can start going through individual points raised in this thread.
1) Does upscaling blur an image when using interpolation to fill in the missing pixels?
Visual check:
Starting with the original image, we have 2 areas separated by a hard black line. Inserting 1 pixel, we already get 2 areas separated by 2 lighter gray lines and a black line. As we go through to an upscaling factor of 50, we get a very blurry center line separating the two fields. Note that something else happens in this picture as well, which is why it is included. At this point, there are enough pixels that I have surpassed the human eye's ability to distinguish them individually. The eye just does not have the spatial resolution to do so. As such, you probably see a black bar in the center that is similar in size to the original black bar. If you move your eyes closer to and farther from this particular picture, you will see the location of that bar shift. THAT effect is what happens when you have too many frequencies in a small area. Notice that the blur will still exist on the edges at any distance - but now you have an added effect where the image can change based on viewpoint.
So at this point we can conclude with 100% confidence that upscaling - at least by this algorithm - blurs the upscaled image. There is no question as to whether or not it happens, no amount of math that can claim it doesn't, no "distance based frequency domain of the human eye" to say it doesn't exist. The pictures clearly show it exists and clearly show that it can be seen by the human eye.
That in and of itself ends the original argument. Upscaling (at least by interpolation) does indeed blur images. As I said clear back in the beginning, some algorithms blur less than others. The generalized way to see this deals with averages: when you take an average to fill in a pixel, you necessarily lower the amplitude of the nearest-neighbor (highest) frequencies. That is unfortunately a real effect of taking an average.
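A two-line demonstration of that last point (a sketch, again assuming numpy): take the highest-frequency signal there is - a pure nearest-neighbor alternation - and fill in its midpoints by averaging. The alternation vanishes entirely.

    import numpy as np

    x = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])  # pure nearest-neighbor alternation
    midpoints = (x[:-1] + x[1:]) / 2                 # the values interpolation inserts
    print(midpoints)                                 # [0. 0. 0. 0. 0.] - completely flattened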
2) If we define blur as a type of low-pass filter, does upscaling suppress high frequencies?
Frequencies here are defined by the intensity differences between pixels. So a "high" frequency produces "sharp" lines - in other words, a high frequency corresponds to a sudden change from one intensity to another much higher or lower one. Those definitions just give context to the discussion.
Let us just start with that definition and look at the first two pictures. If frequency is defined as the rise and fall of intensities between pixels, we can just compute the difference between each neighboring pair of pixels to get:
{0,4,-3,0}
and
{0, 0, 2, 2, -1.5, -1.5, 0, 0}
Now, the largest-magnitude difference in the second series is 2. The largest in the first is 4. So we could definitely describe this situation as a suppression of high frequencies. Low frequencies made it through fine, but high frequencies tended to be attenuated.
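These difference series are trivial to compute yourself; in numpy it is a single call:

    import numpy as np

    print(np.diff([1, 1, 5, 2, 2]))                  # [ 0  4 -3  0]
    print(np.diff([1, 1, 1, 3, 5, 3.5, 2, 2, 2]))    # [ 0.  0.  2.  2. -1.5 -1.5  0.  0.]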
This is admittedly not mathematically rigorous, so we need to develop another method for doing this comparison. At this point it is worthwhile to stop and talk about Discrete Fourier Transforms. What a DFT does, at its most basic level, is transform a set of points into a set of numbers corresponding to the amplitudes and phases of sine functions whose sum makes up the picture. A practical example might be in order, so let us take the DFT of our original picture and consider it first. Here is the full DFT (using the 1/sqrt(n) normalization convention):
{4.91935 + 0.i, -1.67082 + 0.363271i, 0.32918 - 1.53884i, 0.32918 + 1.53884i, -1.67082 - 0.363271i}
Each of these points has an amplitude and a phase corresponding to a particular frequency. While I won't go through the math here, the first value is the LOWEST frequency. It is important to understand that when we start talking about plots in frequency space. Note that each number is an amplitude - a large value does NOT indicate a high frequency. These numbers are hard to compare directly. However, we can take the amplitudes of these values and normalize them. That gives us:
{0.831522, 0.289018, 0.265996, 0.265996, 0.289018}
Please note that the normalization won't affect anything from this point on; the results would be the same with or without it. Normalizing the vector just gives smaller numbers to compare. All of the DFT values I have given are normalized amplitudes. Phase isn't really important for us in this discussion. Before doing any labeling, it is also worthwhile to note that some numbers are repeated here. If you look at the original set of complex numbers, why that happens becomes apparent: these are conjugate pairs - frequencies where the amplitude is the same but the phase is opposite. This is expected; it always happens for real-valued input. This generally leads people to write DFTs shifted so that the lowest frequency (the first number) is in the middle.
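You can verify the conjugate pairing directly (a sketch; note that numpy uses the opposite sign convention in the exponent from some math packages, so the imaginary parts may come out flipped - the amplitudes are identical):

    import numpy as np

    dft = np.fft.fft([1, 1, 5, 2, 2]) / np.sqrt(5)   # 1/sqrt(n) convention, as above
    print(dft)                                       # bins k and n-k are complex conjugates
    amps = np.abs(dft)
    print(amps / np.linalg.norm(amps))               # {0.8315, 0.2890, 0.2660, 0.2660, 0.2890}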
If you ever see a frequency diagram that is symmetric about the middle rather than about the edges, the point in the middle is always the lowest frequency. If the author tells you differently, he misunderstood his own work.
Back to the example. We can label the frequency terms as:
0.831522 - Low
0.289018 - Middle
0.265996 - High
Remember that these are amplitudes, so we can make direct comparisons. Now, let's take a look at the smallest interpolation. Extracting the same values (remember that each corresponds to the same frequency in the sine-wave expansion) we get:
0.873131 - Low
0.284212 - Mid
0.177299 - High
Now, we can immediately see what has happened. The low frequency was slightly amplified - i.e., its amplitude increased. The mid and high frequencies were suppressed - their amplitudes decreased. Note that even the sum of all the smaller high-frequency amplitudes that appear as a byproduct of enlarging the frequency domain would still not equal the original high-frequency mark. Maybe it was just a fluke though. Let's take the second set:
0.882523 - Low
0.283273 - Mid
0.159256 - High
The same pattern exists! As a matter of fact, it is amplified this time. We can take the 10 pixel insert as well and look at it:
0.890959 - Low
0.283260 - Mid
0.141029 - High
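All three of these comparisons can be reproduced in a few lines (the same numpy sketch as before, made self-contained; k is the number of pixels inserted per gap):

    import numpy as np

    original = np.array([1.0, 1.0, 5.0, 2.0, 2.0])
    for k in (0, 1, 2, 10):
        n = (len(original) - 1) * (k + 1) + 1
        new_x = np.linspace(0, len(original) - 1, n)
        ups = np.interp(new_x, np.arange(len(original)), original)
        amps = np.abs(np.fft.fft(ups))
        amps /= np.linalg.norm(amps)
        print(k, amps[:3])   # the low, mid, and high bins shared with the original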
What have we now shown? Simple - that when upscaling by linear interpolation, high frequencies are suppressed. At this point, there really is no counterargument; it is clear from the numbers what happened. Does it make sense with what we see visually? Of course - the pictures still look blurred. The last question to ask is whether we can represent this visually.
3) How do we visually represent the frequency domain, and how do we interpret those representations?
Because we have amplitudes for each of the frequency components, the easiest thing to do would just be to plot those amplitudes. Here is that done for our original picture:
Notice that the color represents the amplitude of each frequency. The relative position indicates whether a frequency is lower or higher. The problem is that, using our nomenclature from above, this graph would represent:
Low, Middle, High, High, Middle
Because the conjugate frequency pairs appear at opposite ends of the sequence, this diagram is not easy to read. So we generally shift it so the LOWEST frequency is in the middle and the high frequencies are all on the outside - like this:
(note - post limited to 6 images, so these next 2 are just links to the images)
Frequency shifted graph for original picture
Now we have a graph that represents:
High, Middle, Low, Middle, High
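That reordering is common enough that numpy provides it directly as fftshift; a quick sketch:

    import numpy as np

    amps = np.abs(np.fft.fft([1, 1, 5, 2, 2]))
    amps /= np.linalg.norm(amps)
    print(amps)                   # Low, Middle, High, High, Middle
    print(np.fft.fftshift(amps))  # High, Middle, Low, Middle, High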
This shifted layout is useful. We can see the low frequencies clearly in the center. If we do the same thing for one of our more complicated graphs - say the inbetween=50 graph - we get:
Frequency Shifted graph for inbetween=50
Notice the huge blue bars on either end here. This is the first place we can actually talk about Nyquist's theorem. Nyquist's theorem tells you the minimum sampling rate you need to capture a wave of a given frequency; turned around, it tells you the maximum frequency you can measure from a given number of samples. As we increase the number of pixels, we increase the number of frequencies the image can represent. But we are sampling areas where there was no information in the original picture, so the only possible result is almost nothing in those new high-frequency bins - i.e., you should get amplitudes that are very low or next to 0, just as is shown in this bar.
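You can confirm this numerically for the 50-pixel insert (same sketch as before): the outermost, highest-frequency bins are tiny compared to the center.

    import numpy as np

    original = np.array([1.0, 1.0, 5.0, 2.0, 2.0])
    n = (len(original) - 1) * 51 + 1                  # 50 inserted pixels per gap -> n = 205
    ups = np.interp(np.linspace(0, 4, n), np.arange(5), original)
    amps = np.abs(np.fft.fft(ups))
    amps /= np.linalg.norm(amps)
    shifted = np.fft.fftshift(amps)                   # lowest frequency now in the middle
    print(shifted[:20].max())                         # outer (highest) frequencies: nearly 0
    print(shifted[n // 2])                            # the low-frequency bin in the center: ~0.9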
However, look at what happens near the middle bar. In our original picture, the middle bar was a red stripe surrounded by blue. In this picture, there are clear white stripes around the blue bar. Remember that these are in the low-frequency area. What do they indicate? They indicate that low frequencies were shifted up slightly - which, since the amplitudes are normalized, is exactly the suppression of high frequencies shown above.
Notice that just claiming the line exists in both pictures does not mean the high frequencies are the same - especially since that line does not represent the high frequencies at all. Just as if you had a word - say "beta" - embedded in the images, the location of the word would not tell you anything; it should be there. What will tell you something is the colors around that word. If they go from, say, yellow in the original to blue and red lines in the upscaled version, that is an obvious suppression of high-frequency components.
So, all 3 sources are consistent. You can see visually that blurring appears. You can see in the DFT that blurring appears. You can see in the plotted frequency diagrams that blurring appears. Consistent start to finish, one interpretation, and no room for "yeah buts". There really is no wiggle room here. The effect is clearly visible to the eye. It shows up in the math. It shows up in the frequency diagrams. It is in peer-reviewed published journals. You can find it in introductory signal analysis textbooks.
Now, we can go back to bickering over math, or you can just look at the pictures in this post and see that the effect exists. If the effect exists, it MUST appear somewhere in the math. I've already shown you where, but if you think you have a better explanation that would give this effect without changing the math, I'd love to hear it.