The Nature of Scaling (Signal Processing Edition)

The shape I used was chosen for mathematical simplicity. It demonstrates the effect - and demonstrates it very well. The averaging that happens when you upscale happens regardless of pixel shape. It is really not worth our time to get hung up on what shape was used to illustrate the point.
It is, because perception is all that matters. Averaging can happen in the display device as well as in the eye. When an upscaling algorithm performs the same averaging, the perceived result is also the same.

Take for example a monochrome CRT without a shadow mask. It shows each line as a continuous signal. If you saw one line you could determine a lower bound for the horizontal resolution of the digital image displayed, but you couldn't give an upper bound. In other words, you could not tell whether the image was upscaled or not.

On the other hand, if you had two monitors of the same size which showed pixels as perfect flat-colored squares, one with double the resolution in both dimensions, then simple pixel doubling would make the higher resolution monitor show exactly the same image as the other one.
 
It is, because perception is all that matters. Averaging can happen in the display device as well as in the eye. When an upscaling algorithm performs the same averaging, the perceived result is also the same.

Just because the eye cannot see it doesn't mean it isn't happening. Now, there is a commonly used chart to tell you if you can see an effect like this. Studies have shown that the eye can distinguish about 1/60th of a degree. Effects that are smaller than that tend to get washed out. While individual displays will react differently (and EVERY individual display will act differently, so we only have a few hundred thousand different models to discuss if you really want to go through display by display), this is where most people would draw the limit for seeing changes based solely on screen size.
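As a rough back-of-the-envelope sketch of that limit (assuming the 1/60th of a degree figure and a small-angle approximation - individual eyes and displays will vary):

Code:
import math

# One arcminute (~1/60 degree) of visual acuity, expressed as the smallest
# detail size resolvable at a given viewing distance (small-angle approximation).
ARCMINUTE_RAD = math.radians(1.0 / 60.0)

def smallest_resolvable_mm(viewing_distance_m):
    return viewing_distance_m * ARCMINUTE_RAD * 1000.0

for d in (0.5, 1.0, 2.0, 3.0):  # metres
    print(f"{d:.1f} m -> ~{smallest_resolvable_mm(d):.2f} mm per resolvable detail")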

This is the most famous chart used to display this:
resolution_chart.png


Once again, the actual mathematical effect of high frequency suppression from upscaling by any form of interpolation is separate from the discussion of what an individual might see. What an individual sees is affected by their own eyesight, the monitor they are looking at, the light in the room, their distance from the monitor, the weather outside, what they ate for dinner, how long they've been home from work, the average ion density of the solar corona (ok, maybe that's stretching it a bit, but you get the picture), etc. There are too many individual factors to consider. For every instance that someone might say a certain monitor won't show it, another monitor can be shown that will.

Instead, it is far more productive to point out that the effect will exist in frequency space for any image you can represent as a set of pixels. Then you can look at individual devices to see what their tolerance for frequency shifts is. Also keep in mind that while I used a vertical line, the effect is equally present for horizontal or even angled lines.

On the other hand, if you had two monitors of the same size which showed pixels as perfect flat-colored squares, one with double the resolution in both dimensions, then simple pixel doubling would make the higher resolution monitor show exactly the same image as the other one.

While line doubling doesn't depress high frequencies so to speak, it ONLY works well for resolutions that are integer multiples in BOTH directions. This is rarely the case, so line doubling can end up stretching or skewing images. I do think it is worth noting that many displays still employ line doubling as a cheap method to fit images to screens. However, that is not the case for the products that were originally in question in this thread, and indeed is not the case for most consumer electronic devices marketed today as upscaling.
 
Instead, it is far more productive to point out that the effect will exist in frequency space for any image you can represent as a set of pixels. Then you can look at individual devices to see what their tolerance for frequency shifts is. Also keep in mind that while I used a vertical line, the effect is equally present for horizontal or even angled lines.
Your original claim was that upscaling "destroys hard lines". Compared to what? And how do you know there were hard lines in the original image to begin with?

Take your example above:
Image 1:
barsexample0.jpg

n=5
intensity values = {1, 1, 5, 2, 2}
Here you choose a specific upscaling algorithm to illustrate (you could even say "reconstruct") the signal: nearest neighbour sampling. By doing so you're already biasing your argument. The image shows the samples as big flat-colored squares with hard edges. These edges do not exist in the actual sampled signal, they're only there because you used that particular upscaling algorithm to generate the image.
 
Your original claim was that upscaling "destroys hard lines". Compared to what? And how do you know there were hard lines in the original image to begin with?

Read the next part of that post:

Xalion said:

Frequencies here are defined as the intensity difference between pixels. So a "high" frequency produces "sharp" lines. In other words, a high frequency corresponds to a sudden change from one intensity to another much higher or lower. Those definitions just give context to the discussion.

Following is an example of performing DFT on a set of pixels that demonstrates mathematically the frequency drop. As I have said many times, some algorithms do better than others. However, the shift is caused by the averaging that takes place when you try to guess at information that was not there to begin with. You will find that the Discrete Fourier Transform I performed makes no assumption about size or shape of pixel. As a matter of fact, the only assumptions it makes are that the pixels are in a grid and that their intensities can be represented by a numeric value.

Examples were specifically designed to show this effect. They were chosen for conceptual and mathematical simplicity, so that other people could replicate the math themselves, and because of the context of the discussion at the time they were used. That is kind of the point of having examples. I could have chosen other algorithms. For example bilinear sampling tends to do the same thing with line art. On the other hand, 2xSaI or the hqnx algorithms linked earlier in the thread do a great job with it. They don't do so well with photos generally though.

I've provided samples of this happening with real pictures during the thread as well. I've linked to papers describing super sampling and wavelet transforms that also do a better job of keeping high frequency information. Along with being interesting, they also mention that the specific need for this type of algorithm is to address this exact problem. I believe there is more than enough information in the thread already to show that this is a real effect.
 
As I have said many times, some algorithms do better than others. However, the shift is caused by the averaging that takes place when you try to guess at information that was not there to begin with.
(Emphasis mine)
You do exactly that in your visual comparison. You compare one image generated with nearest neighbour sampling to another generated with linear interpolation, then you claim that the latter "blurs" the image because it doesn't produce hard edges like the former. But both are guesses. Neither of them is inherently more correct than the other – the information of what is supposed to be in between the samples just isn't there!

Digital image data typically doesn't fit the sampling theorem because it wasn't sampled from a (sufficiently) band-limited signal. Perfect reconstruction of the original signal is therefore impossible, that information is lost. Every monitor "guesses" when it displays/reconstructs an image. Even one that does not upscale.

Thus I will ask again: your original claim was that upscaling "destroys hard lines" – compared to what?

For example bilinear sampling tends to do the same thing with line art. On the other hand, 2xSaI or the hqnx algorithms linked earlier in the thread do a great job with it. They don't do so well with photos generally though.
See, that is exactly what I'm talking about. Line art is usually meant to be "hard-edged", photos aren't. But that is contextual information, it is not present in the actual pixel data. A monitor cannot infer from pixel data that the image is a photo or line art. Pixel data for line art is inherently aliased because the finite sample rate cannot represent the hard edges the artist wanted to achieve.
 
(Emphasis mine)
You do exactly that in your visual comparison. You compare one image generated with nearest neighbour sampling to another generated with linear interpolation, then you claim that the latter "blurs" the image because it doesn't produce hard edges like the former. But both are guesses. Neither of them is inherently more correct than the other – the information of what is supposed to be in between the samples just isn't there!

Absolute nonsense. Really, this makes no sense at all.

Are you trying to argue the semantics of the word "hard"? Well, let me repeat since you seem to have skipped it. In this discussion, "hard" and "soft" were defined according to the following:

Xalion said:
Frequencies here are defined as the intensity difference between pixels. So a "high" frequency produces "sharp" lines. In other words, a high frequency corresponds to a sudden change from one intensity to another much higher or lower. Those definitions just give context to the discussion.

Do I need to make it bigger or perhaps repeat it a 4th time? There is no guessing involved in the definition. When I say something is "blurred", I am specifically referring to the fact that high frequencies are suppressed. You may take issue with that definition, but it has been the definition the entire time. "Hard" lines are lines with high frequencies. "Soft" lines are those with low frequencies. There is no question about that.

There is absolutely NO guessing in what I did. I spelled out the EXACT conditions for the experiment. I show visually and mathematically exactly what happens. Several different ways. There is no guessing.

Let us go over what an example is. An example can be defined as "an instance (as a problem to be solved) serving to illustrate a rule or precept or to act as an exercise in the application of a rule". Now, I wanted to illustrate what happens with linear interpolation. So I set up an example. Like most examples, I looked for some key qualities:

1) Simple. The example should not complicate matters beyond what is needed to demonstrate the effect. You can always make things more complicated. However, you rarely should - especially when you are trying to demonstrate an effect.
2) Repeatable. Someone else should be able to take your calculation and repeat it for themselves. Note that this requires you not use any special tools, and normally that you keep the assumed knowledge down to a minimum.
3) Easy to interpret. You don't want people guessing at what your example means.

So let us review my example. First, I chose a 1 dimensional set of points. Those points consisted of 2 color fields with a line between them, defined by the set:

{1,1,5,2,2}

That is simple. The small number of points makes it repeatable. It is very easy to interpret. Each number corresponds to an intensity of a pixel. So this suits the need for an example very well. Note that the example is independent of representation.

I chose to examine this example using 3 separate methods. The first was visual. You MUST make some assumptions to represent this visually. I spelled those assumptions out. To make it large enough to see, I used square pixels and expanded it from each number representing 1 point to each number representing a 5x5 grid. This is simple. It is easy to repeat. It is easy to interpret. For upscaling, I chose linear interpolation. Once again, this is simple. You can perform it with a piece of paper if you need to.
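For anyone who wants to rebuild that kind of figure, here is one way to do it (just a sketch assuming numpy; the 5x5 block size is only the one I described above):

Code:
import numpy as np

# Turn the 1-D example into a viewable 2-D block image: each sample becomes
# a flat-colored 5x5 block so the example is big enough to see.
row = np.array([1, 1, 5, 2, 2])
image = np.repeat(np.repeat(row[np.newaxis, :], 5, axis=0), 5, axis=1)

print(image.shape)  # (5, 25): five rows tall, each sample five pixels wide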

Then there is no guessing whatsoever involved. You can step through the example yourself if you need to. It isn't hard. Visually, there is absolutely no guessing. Mathematically, there is absolutely no guessing. You can calculate for yourself the exact moment things happen and why they happen.

Now, the visual comparison does assume square pixels. I realized people might take issue with that. So I did a purely mathematical comparison. Note that the mathematical comparison depends on only 2 assumptions. A: That the picture can be represented by a series of pixels. B: That those pixels are arranged in a grid. That is it. From there, you may need a calculator to actually perform the DFT, but you can repeat the example exactly. There is no guessing. There is no gray area. You can do the math yourself and you will discover the exact same result.
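If you want to repeat the calculation without doing the arithmetic by hand, here is a rough sketch (it assumes numpy, that the upscale inserts linearly interpolated midpoints, and that the magnitudes are normalized by the Euclidean norm of the magnitude vector; the printed values appear to line up with the Low/Mid/High amplitudes quoted later in the thread):

Code:
import numpy as np

# DFT comparison for the {1,1,5,2,2} example: original vs. a linear
# interpolation upscale that inserts one midpoint between each pair of samples.
def normalized_spectrum(samples):
    mags = np.abs(np.fft.fft(samples))
    return mags / np.linalg.norm(mags)

original = np.array([1, 1, 5, 2, 2], dtype=float)
positions = np.arange(0, len(original) - 0.5, 0.5)            # 0, 0.5, ..., 4.0
upscaled = np.interp(positions, np.arange(len(original)), original)

# Bins 0, 1, 2 are the low, mid and high frequency components (cycles across the image).
print(normalized_spectrum(original)[:3])   # ~[0.83, 0.29, 0.27]
print(normalized_spectrum(upscaled)[:3])   # ~[0.87, 0.28, 0.18] - the high end drops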

I then gave a method for visually comparing large samples using the above mathematical comparison. Once again, no guessing. Once again, everything comes from exactly where it comes from.

Perfect reconstruction of the original signal is therefore impossible, that information is lost.

No kidding? I mean, it's almost as if everyone involved in this thread hasn't been saying this from the very beginning, right? Oh wait, we have.

Thus I will ask again: your original claim was that upscaling "destroys hard lines" – compared to what?

And I have answered you 3 times now. We have been discussing frequency shifts caused by upscaling - specifically linear upscaling. So we have been comparing relative frequencies between an image and its upscaled counterpart.

You are trying to turn this discussion into a subjective wasteland. The problem is there are objective definitions involved. Frequency has been defined as the shift in intensities between pixels. Hard and soft are defined as high and low frequencies respectively. Blur has been defined as a suppression of high frequency.

All of these definitions are independent of pixel shape and monitor. That should explain to you why we can define line art as hard edged regardless of the monitor we show it on. Because the definition does not involve pixel shape or monitor.

What you are trying to argue is basically the equivalent of "certain broken monitors cannot show red. Therefore, red content in pictures can never be compared because it might not always show up!".

The definitions are there. They are independent of pixel shape. They are independent of monitor. I have shown that at least with linear upscaling, my original statement is correct. As a matter of fact, because of the general nature of the DFT proof, to claim my original statement was incorrect would require you to disprove one of the two assumptions. Here they are:

A) Pictures can be represented by a series of pixels
B) Those pixels are arranged in a grid

For the first, I think you are going to have a hard time proving that. For the second, there are indeed some applications where you can't make this assumption. Unfortunately, computer monitors and televisions don't count themselves among these applications.
 
You don't have to make it bigger, nor repeat it ad infinitum as you are doing. At least if you want to keep posting at all. Come on now, this isn't the place for ranting until you're blue in the face, even if you think the people peering at your words via their grid of non-square pixels can't understand you unless you do.

Tone down the aggression please.
 
You may take issue with that definition, but it has been the definition the entire time.
I do indeed take issue with that definition. If you sample a 1 Hz sine wave at a rate of 4 Samples/s you get much higher differences from one sample to another compared to sampling at 4000 Samples/s. Yet the frequency of the signal doesn't change at all.
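To make that concrete, a quick sketch (assuming numpy; the rates are just the ones from my example):

Code:
import numpy as np

# A 1 Hz sine has a fixed frequency, but the sample-to-sample differences
# depend entirely on how densely you sample it.
def max_adjacent_difference(sample_rate_hz, duration_s=1.0, freq_hz=1.0):
    t = np.arange(0.0, duration_s, 1.0 / sample_rate_hz)
    samples = np.sin(2 * np.pi * freq_hz * t)
    return np.max(np.abs(np.diff(samples)))

print(max_adjacent_difference(4))     # ~1.0: big jumps between neighbouring samples
print(max_adjacent_difference(4000))  # ~0.0016: tiny jumps, same 1 Hz signal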

As a matter of fact, because of the general nature of the DFT proof, to claim my original statement was incorrect would require you to disprove one of the two assumptions. Here they are:

A) Pictures can be represented by a series of pixels
B) Those pixels are arranged in a grid

For the first, I think you are going to have a hard time proving that.
Actually, the assumption should be
A) Pictures can be perfectly reconstructed from a finite series of sample points
and that can be trivially disproved: a finite number of sample points cannot represent an infinite number of frequencies.

You're really completely missing my point, so I will try to be more precise, and maybe you can tell me where you disagree:

1. Digital image data is typically a lossy representation of an image because the original image was not band-limited to less than half the sample frequency.

2. Applying the DFT to a lossy representation of a signal does not give you the frequency distribution of the original signal. That information is lost.

3. Since the original data is lost there is no gold standard against which you could compare an image reconstructed from the lossy representation.

4. Thus you cannot objectively judge the accuracy of reconstruction of an image.

5. Any attempt at reconstructing the original image from the lossy representation involves guessing the missing data. This reconstruction can change any part of the frequency spectrum compared to the frequency distribution of the sampled data.

6. A monitor that displays a digital image performs reconstruction. It takes a sampled signal and generates a continuous signal of light spread over an area which then can be picked up by the eyes.

7. From 5 and 6 it follows that every monitor guesses when reconstructing an image. For most monitors the method of guessing is fixed by the display technology, e.g. most LCDs "guess" that each pixel should be represented as three small rectangles.

8. Upscaling prior to display can be considered part of the reconstruction process.

9. The quality of reconstruction of a monitor is subjective and is influenced by display technology and possible upscaling. What is perceived as more or less correct depends on the actual image contents. A monitor with a certain upscaling algorithm may be perceived as better than another monitor that shows the image at native resolution, while a third monitor with the same upscaling algorithm may look worse because it uses different display technology.
 
I do indeed take issue with that definition. If you sample a 1 Hz sine wave at a rate of 4 Samples/s you get much higher differences from one sample to another compared to sampling at 4000 Samples/s. Yet the frequency of the signal doesn't change at all.

Unfortunately, the frequency of a Sine wave has never been defined as the relative intensity between samples. So this statement is wrong. The sampling rate of a Sine wave will not change the defined frequency unless that frequency is above the Nyquist limit (i.e. you do not have enough samples to represent a frequency that high).

On the other hand, the colors of pixels do not generally follow Sine waves. So you must find another way to define frequency. For image processing, this is almost always done by using the definition that spatial frequency is the change of intensities between pixels. This is not my definition, and I encourage you to find out where it comes from.

A) Pictures can be perfectly reconstructed from a finite series of sample points
and that can be trivially disproved: a finite number of sample points cannot represent an infinite number of frequencies.

Absolutely unnecessary for the current discussion. No one has ever claimed that the picture needs to be perfectly reconstructed. It was never an issue. The issue here is what happens when you upscale an image from resolution x to resolution y. Forget about the concept of "original perfect image". It doesn't matter if it exists or not. The only thing important in this discussion is what happens when you take a lower resolution image to a higher resolution.

2. Applying the DFT to a lossy representation of a signal does not give you the frequency distribution of the original signal. That information is lost.

This is where the problem is. The "original" image in question here is the encoded lower resolution image. I have never tried to argue that this image is a perfect replica of whatever it is meant to represent. That does not matter for the discussion. You have an image on DVD. That image is 720x480. You must make that image go to 1920x1080. The question is simple:

What happens to that image under different upscaling techniques?

4. Thus you cannot objectively judge the accuracy of reconstruction of an image.

We are not comparing the reconstruction of an image. Once again, we are comparing the original (unscaled) image to the SAME image that has been upscaled. Objectively, this corresponds to comparing the frequency distribution of the image.

5. Any attempt at reconstructing the original image from the lossy representation involves guessing the missing data. This reconstruction can change any part of the frequency spectrum compared to the frequency distribution of the sampled data.

Once again, no reconstruction involved. What we are trying to do is upscale the image. An image exists - regardless of how you got it, it exists. The image must be drawn on a larger display than it was originally intended for. That is the topic in question. You started with the assumption that ALL upscaled images are representations of some real life thing. That assumption is not true.

We are not concerned with making the object look exactly like the tree outside our window. The ONLY concern here is what happens with the tree on the DVD when we make it larger. We can objectively define the original image as the image on the DVD. We can objectively define the frequency shift as the change in relative intensities between pixels. We can objectively define hard as high frequencies, and soft as low frequencies. We can then objectively discuss the difference between two different scaling algorithms.
 
On the other hand, the colors of pixels do not generally follow Sine waves. So you must find another way to define frequency. For image processing, this is almost always done by using the definition that spatial frequency is the change of intensities between pixels.
That definition is only useful if the sample rate doesn't matter. Upscaling means resampling* though, and here the sample rate matters.

Following that definition, to maintain the frequencies while upscaling you'd have to perform nearest neighbour sampling. That effectively means that you define pixels as squares, and I disagree with that definition. Maybe colours don't follow sine waves, but they don't follow squares either.


* which can be defined as signal reconstruction followed by sampling. So there you have reconstruction again.

Once again, no reconstruction involved. What we are trying to do is upscale the image. An image exists - regardless of how you got it, it exists. The image must be drawn on a larger display than it was originally intended for. That is the topic in question. You started with the assumption that ALL upscaled images are representations of some real life thing. That assumption is not true.
That is not my assumption. My assumption is that the data represents a visible image. Reconstruction is necessary to show that image. A digital image cannot be seen without reconstruction because the samples are points. They don't have an area. There is nothing in between them.

You were claiming that upscaling destroys hard lines. The sampled data doesn't contain hard lines. It doesn't even contain lines at all. It contains only sample points. Lines and shapes only come into existence by reconstruction. So if you say there is no reconstruction involved, then there are no hard lines to destroy.
 
That definition is only useful if the sample rate doesn't matter. Upscaling means resampling* though, and here the sample rate matters.

No, that definition DEFINES each pixel as a SAMPLE, which DEFINES the sample rate for the picture. That is why it is useful in image processing: you cannot predefine sampling rate or pixel size, and MUST work with the information given. Hence it is useful for showing what happens during upscaling. It treats each pixel as an independent sample of intensity and assumes no prior knowledge of what is contained in the image or the shape of the pixels. Like I said, that is not my definition. This is the most common definition used in journals, textbooks, scholarly articles, and even image processing in general.

Following that definition, to maintain the frequencies while upscaling you'd have to perform nearest neighbour sampling.

Part of this is my original claim. As I said, upscaling tends to depress high frequencies. Like I stated in the beginning - line doubling (which is a form of nearest neighbor sampling) maintains high frequencies. However, because of differing aspect ratios it can stretch images.

So, let us just state this clearly for everyone. Using the common definition of frequency in image processing that frequency is the change in intensities between pixels, are you ready to admit that high frequencies are not maintained during some forms of upscaling?

I say some forms because there are algorithms (see 2xSaI or hqnx) that actually do a fairly good job at maintaining high frequencies. They have other problems when dealing with real images though.
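To put a number on that with the thread's working definition (frequency as the change in intensity between neighbouring pixels), here is a quick sketch assuming numpy; the exact values are only meant as an illustration:

Code:
import numpy as np

# Compare the hardest edge (largest neighbouring-pixel difference) in the
# {1,1,5,2,2} example before and after two simple upscales.
src = np.array([1, 1, 5, 2, 2], dtype=float)

nearest = np.repeat(src, 2)                               # pixel/line doubling
positions = np.arange(0, len(src) - 0.5, 0.5)             # 0, 0.5, ..., 4.0
linear = np.interp(positions, np.arange(len(src)), src)   # midpoint interpolation

print(np.max(np.abs(np.diff(src))))      # 4.0 - the original hardest edge
print(np.max(np.abs(np.diff(nearest))))  # 4.0 - doubling keeps the jump intact
print(np.max(np.abs(np.diff(linear))))   # 2.0 - linear interpolation halves it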

That effectively means that you define pixels as squares, and I disagree with that definition.

No, it means you define pixels as samples. They are individual samples of the intensity of whatever it is you are trying to represent. It does not matter if they are square, round, triangular, hexagonal, etc. The DFT is independent of pixel shape. It does assume you can represent the samples in a grid - but that is the only assumption it makes.

Actually, we can really end this argument about squares quickly. If I assumed the pixels were squares, please explain the picture for the two pixel case (where each pixel is clearly represented by a rectangle). In reality, I defined the pixels as samples. Then to go from 1 dimensional to 2 dimensional I used a fixed horizontal aspect ratio and copied the horizontal line a sufficient number of times to make the pictures visible. If you check, you will find I represented all of the pixels as rectangles - not squares. The first case is the only one where those rectangles happened to add into perfect squares.

This is the fundamental flaw with your entire argument. The pixels are samples, they are not squares - even in the pictures I provided. The DFT treats them as independent samples, it does not treat them as squares. Nowhere in my math does it assume a square. It always assumes a sample.

You were claiming that upscaling destroys hard lines. The sampled data doesn't contain hard lines. It doesn't even contain lines at all. It contains only sample points. Lines and shapes only come into existence by reconstruction. So if you say there is no reconstruction involved, then there are no hard lines to destroy.

The example was defined by a 1 dimensional set of points. That set of points was clearly given as {1,1,5,2,2}. Hard lines were CLEARLY defined as high frequency shifts and frequency was CLEARLY defined as the change of intensities between points. That image was then upscaled. It was CLEARLY shown how the frequencies shifted. The ONLY reason you could bother to claim that there weren't "lines" is because the example was one dimensional. You are right, in a horizontal one dimensional example there aren't vertical lines. In general terms however the definition of a line is independent of reconstruction. For example, take the sample set:

{{1,1,5,2,2},{1,1,5,2,2}, {1,1,5,2,2}}

Now, you do have to assign a coordinate system. In this case, it makes sense to start with (1,1) as the first pixel and (5,3) as the last with those coordinates representing (x,y) respectively. In this case there is a clear vertical line of the value 5 defined by the equation x=3. Notice - no reconstruction necessary. The line exists anyway! Moreover, if you define the frequency shift between 1 and 5 as "high" it is perfectly valid to call that a hard line. Line and hard have always been used in this sense in the thread. At this point, you cannot redefine line to mean something else and use the new definition to claim I am wrong.

At this point, this is nothing more than a semantics game. You will actually find that this game was already played earlier in the thread. It led to the very precise definition for the effect I am talking about. Here is the definition one more time:

Frequency is defined as the change of intensities between points. High frequencies represent large changes. Low frequencies represent small changes. Upscaling by linear interpolation suppresses high frequencies. Upscaling by averaging in general tends to have the same effect, although the actual results depend on algorithm. These are not definitions I just made up and are fairly common. They are the basis for this discussion.
 
No, that definition DEFINES each pixel as a SAMPLE, which DEFINES the sample rate for the picture.
Sample rate is the number of samples per distance. You have to define a location of the samples to have a sample rate. Upscaling increases the sample rate, i.e. the sample density; therefore defining frequency as the difference between samples only makes sense if you assume a constant sample rate.

If you're claiming that sampling a given image at a higher rate suppresses high frequencies compared to sampling it at a lower rate (hence my sine wave example) then I simply disagree with your definition.

No, it means you define pixels as samples.
You say that nearest neighbour sampling is required to maintain the frequencies of the sampled image, i.e. to preserve the contents. Infinite upsampling with nearest neighbour samples results in each pixel becoming a square (or rectangles, depending on the aspect ratio). Hence you're claiming a pixel should be reconstructed as a square. And I disagree with that.

This has nothing to do with your DFT, your math or your example. It's about your definition.

In this case, it makes sense to start with (1,1) as the first pixel and (5,3) as the last with those coordinates representing (x,y) respectively. In this case there is a clear vertical line of the value 5 defined by the equation x=3. Notice - no reconstruction necessary. The line exists anyway!
Three points don't make a line. Samples are points. They are not connected. You don't know what is between them, that data is lost. You have no knowledge about what the value at, say, (1.5, 3) is supposed to be. You have to guess.

That's why reconstruction is necessary.
 
Sample rate is the number of samples per distance. You have to define a location of the samples to have a sample rate. Upscaling increases the sample rate, i.e. the sample density; therefore defining frequency as the difference between samples only makes sense if you assume a constant sample rate.

If you're claiming that sampling a given image at a higher rate suppresses high frequencies compared to sampling it at a lower rate (hence my sine wave example) then I simply disagree with your definition.

Once again - upscaling does NOT resample an image. There is no "original" image to resample. If you take a picture of a tree with a camera that samples at a rate of 720x480, upscaling does not go out and find that tree to resample at a rate of 1920x1080. It can't. You are not sampling at a higher rate. You are averaging existing samples to increase size.

I think your confusion stems from what sample rate actually is. Let us take your Sine wave for example. If you sample the wave at 10 Hz, and then interpolate to fill in values at 20 Hz, you will not get a perfect Sine wave - especially if you use linear interpolation. Interpolation is not resampling. You need to understand that before we continue. You do not have the original object to resample. If the image was computer generated, you do not rerender. If the image was taken with a camera, you don't go out and retake the image. If the image was painstakingly designed pixel by pixel you don't go get the artist to redo their work. You do some form of averaging what is already there.
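If you want to see this numerically, here is a rough sketch (assuming numpy; the 3 Hz wave is picked only for illustration):

Code:
import numpy as np

# Sample a 3 Hz sine at 10 samples/s, linearly interpolate up to 20 samples/s,
# and compare against actually sampling the sine at 20 samples/s. The
# interpolated values are guesses, not a higher-rate sampling of the wave.
freq = 3.0
t_low = np.arange(0.0, 1.0, 1.0 / 10.0)    # 10 samples/s
t_high = np.arange(0.0, 1.0, 1.0 / 20.0)   # 20 samples/s

low_rate_samples = np.sin(2 * np.pi * freq * t_low)
interpolated = np.interp(t_high, t_low, low_rate_samples)   # filled-in guesses
true_high_rate = np.sin(2 * np.pi * freq * t_high)          # genuine higher-rate sampling

print(np.max(np.abs(interpolated - true_high_rate)))  # ~0.41: the guesses miss badly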

You say that nearest neighbour sampling is required to maintain the frequencies of the sampled image,

No - YOU said that it was required. I specifically pointed to two other forms of upscaling that do a good job of maintaining frequency. Once again, those are 2xSaI or hqnx. Once again though, those have their own problems with upscaling real images. They tend to do "well" on computer games - especially slightly older ones. They don't do as well on DVDs. I have said that several times now.

i.e. to preserve the contents. Infinite upsampling with nearest neighbour samples results in each pixel becoming a square (or rectangles, depending on the aspect ratio). Hence you're claiming a pixel should be reconstructed as a square. And I disagree with that.

No, actually you will find that I have said that nearest neighbor sampling has serious problems with upscaling this entire time. Yes, it maintains frequency. However, I have never claimed that maintaining frequency is always a good thing. I have pointed out several times that it has undesirable effects, especially for non-fixed aspect ratios.

Once again, review the discussion. I said that upscaling tends to "soften" images when it averages between pixels. That statement specifically referred to the suppression of high frequencies in an image when you average to increase size. If you read the original statement, you will even see that I specified that some people like this effect, some people don't. The question is whether or not the effect exists. It is not whether it is good or bad. It isn't about what form of upscaling should be used.

This has nothing to do with your DFT, your math or your example. It's about your definition.

Once again, this is not MY definition. This is THE definition used to represent images in frequency space. In ANY publication where you have seen a frequency map, this is the definition that was used. This is the implied definition used when you encode a jpeg. As a matter of fact, jpeg encoding uses a Discrete Cosine Transform that is very similar in function to a DFT and then intentionally suppresses certain frequencies. This definition is the definition used when doing most lossy encoding.

As I have said before, you should really take some time to find out why this definition is used. It is well worth understanding the benefits involved.

Three points don't make a line.

You can make a line between any 2 points. 3 points actually make a really well defined line. As a matter of fact, the ability to connect points with lines is the basis for first order interpolation. You connect the two points with a line where the slope is the change in intensity and pick in-between points based on that line.

Note that this whole discussion on what a line is doesn't really apply to the original subject. When I said hard line, it was obvious what it referred to. Trying to say that the picture has no line now is a hand-waving argument. It won't change the frequency shift I demonstrated in the slightest.
 
It seems to me, as a casual visitor, that this discussion is very much at cross purposes with itself, and the involved parties aren't following/communicating well enough to understand each other's points such that they can present arguments for and against the aspects of scaling theory.

Just taking this one post as example :
I think your confusion stems from what sample rate actually is. Let us take your Sine wave for example. If you sample the wave at 10 Hz, and then interpolate to fill in values at 20 Hz, you will not get a perfect Sine wave - especially if you use linear interpolation.
Why do you think xmas was getting confused with interpolation? He said...
xmas said:
I do indeed take issue with that definition. If you sample a 1 Hz sine wave at a rate of 4 Samples/s you get much higher differences from one sample to another compared to sampling at 4000 Samples/s. Yet the frequency of the signal doesn't change at all.
He's comparing sampling rates of a fixed frequency, and pointing out that the frequencies within those samples differ. There's nothing there about interpolation or original signal recreation. It's all about the frequencies of the samples.

And again:
You can make a line between any 2 points. 3 points actually make a really well defined line.
Are you saying that this
...
is a continuous line? You're right in that it is possible to mathematically define a line with two or more points, or to find a curve that three points sit on, but that is not the same thing as having three discrete points and calling them a line.

As a casual observer, I'm not seeing two+ people talking about the same thing within the same contextual understanding. We've already had some heated moments. I know it can be frustrating when someone isn't following, but that doesn't mean they're being thick, nor obtuse. Any knowledgeable people should be able to talk about the same subject on the same level without feeling like they're hitting their heads against a brick wall. To me, I feel that the people here know what they're talking about but are coming at it from such different directions, due to the innate inaccuracies of the English language, that the discussion isn't really flying. I hope future posts pay more consideration to how ideas are being expressed, rather than what those ideas are that are trying to be expressed, to avoid ambiguities and lost meaning.
 
Once again - upscaling does NOT resample an image. There is no "original" image to resample. If you take a picture of a tree with a camera that samples at a rate of 720x480, upscaling does not go out and find that tree to resample at a rate of 1920x1080. It can't. You are not sampling at a higher rate. You are averaging existing samples to increase size.
Resampling is the process of changing the sampling rate of a discrete signal. I hope we can agree on that definition. It does not involve sampling the original image again.

Resampling is involved e.g. when you convert 44.1kHz digital audio to 48kHz sample rate. Based on that definition I consider upscaling a form of resampling. If you disagree with that definition it would be nice if you could point out why.
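As a sketch of what I mean (assuming scipy is available; the tone and duration are arbitrary):

Code:
import numpy as np
from scipy import signal

# Resampling in the audio sense: change the sample rate of an already-sampled
# signal. 48000/44100 reduces to 160/147, so a polyphase resampler can convert
# between the two without ever touching the original analogue signal again.
t = np.arange(0, 0.1, 1 / 44100.0)
tone_44k1 = np.sin(2 * np.pi * 440.0 * t)        # a 440 Hz tone at 44.1 kHz

tone_48k = signal.resample_poly(tone_44k1, 160, 147)

print(len(tone_44k1), len(tone_48k))  # 4410 samples in, ~4800 samples out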

I think your confusion stems from what sample rate actually is. Let us take your Sine wave for example. If you sample the wave at 10 Hz, and then interpolate to fill in values at 20 Hz, you will not get a perfect Sine wave - especially if you use linear interpolation.
Since in practice an ideal sinc does not exist, perfect reconstruction is indeed just a theoretical construct. But you can get close enough for any practical purpose. I did not mention linear interpolation here.

Interpolation is not resampling. You need to understand that before we continue.
Interpolation would be reconstruction, which is only one half of resampling. The other half is sampling the reconstructed signal.

Once again, this is not MY definition. This is THE definition used to represent images in frequency space. In ANY publication where you have seen a frequency map, this is the definition that was used.
"Computer Graphics: Principles and Practice, Second Edition in C" (Foley et al.) does not define frequency as "the intensity difference between pixels". It simply uses the usual definition of sine wave cycles per unit distance which most people interested in signal processing are familiar with.

You can make a line between any 2 points.
I agree that you can make a line between 2 points by connecting them. You fill in the blanks between them. That is called reconstruction.
 
Resampling is the process of changing the sampling rate of a discrete signal.

This I can agree with.

It does not involve sampling the original image again.

Resampling is involved e.g. when you convert 44.1kHz digital audio to 48kHz sample rate. Based on that definition I consider upscaling a form of resampling. If you disagree with that definition it would be nice if you could point out why.

This is where the problem occurs. For instance, take the following picture.
sinwaves.jpg


Now, if you originally sampled the purple wave (higher frequency) at the points 0, .25, .5, .75, and 1.0, then you will get a maximum frequency of 1. This gives you the blue Sine wave.

At this point, you want to increase the sampling rate. If you sample the original frequency at a much higher sampling rate - say every .01, then you will get the purple Sine wave and a frequency of .2. On the other hand, look at what happens if you just try to fill in values based on your original small sample rate. Without prior knowledge of what the function is, there is no way to return the purple wave. If you assumed it was a Sine wave, you would get the blue wave back - which actually is not terribly good at representing the purple wave. Also at issue here is how you actually fit the points to increase the number of samples. If you did not know it was a Sine wave and just used linear interpolation between points, you would not even get the blue curve back.
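Setting the exact waves in that picture aside, here is a minimal numerical sketch of the same effect (assuming numpy; the 11 Hz/1 Hz pair is just a convenient choice):

Code:
import numpy as np

# An 11 Hz sine sampled at only 10 samples/s produces exactly the same sample
# values as a 1 Hz sine. Once sampled this coarsely, no amount of upscaling or
# interpolation can tell the two apart, let alone recover the faster wave.
sample_rate = 10.0
t = np.arange(0, 10) / sample_rate           # one second of samples

high_freq = np.sin(2 * np.pi * 11.0 * t)
low_freq = np.sin(2 * np.pi * 1.0 * t)

print(np.allclose(high_freq, low_freq))      # True: the samples are indistinguishable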

So these situations must be distinguished. The highest possible frequency in a sample is defined by the original sampling rate. I believe that this is a better way of defining "changing sampling rate". With sound, the highest frequency humans can hear generally clocks in at around 15-20 kHz, which corresponds to a sampling rate of between 33 and 45 kHz or so. So I generally have no problem with people saying they are "resampling" as long as they are at sampling rates high enough that you get similar results by increasing the number of samples from the original.

With graphics, that is rarely the case. You rarely get information sampled at a high enough rate to preserve all detail. So I think it disingenuous to say you are "increasing the sampling rate". While you increase the overall number of samples, you really didn't increase the rate that the original picture was sampled at. You have lost information, and are not really recreating that information. You are just making what you have larger. As sample rate is tied in so closely with the maximum possible frequency of an image, I do not believe you should refer to upscaling as resampling.

"Computer Graphics: Principles and Practice, Second Edition in C" (Foley et al.) does not define frequency as "the intensity difference between pixels". It simply uses the usual definition of sine wave cycles per unit distance which most people interested in signal processing are familiar with.

Just out of curiosity, how does Foley get those "sine wave cycles per unit distance"?

If he uses a discrete Fourier transform (or discrete Cosine or Sine transforms), that is the definition I gave, just stated another way. To be much more precise:

The discrete Fourier transform takes a "sampled" value (like the intensity) and maps it to a set of complex numbers that represent the amplitude and phase of Sine functions with frequencies determined by the number of samples (k/N generally, where k ranges from 0 to N-1). In other words, the frequency (which is the Sine wave cycles per unit distance) is defined by the differences between samples. For graphics, the most commonly sampled value is intensity because of the way the human eye works. Sampling color works the same way though.

If you need a better explanation, you can see A Simplified Approach to Image Processing by Randy Crane or I can link you to several web sites. However, once again - it would be WELL worth your time to understand where my definition comes from. I am not saying that to be condescending - it is just easier to understand something when you work it out yourself. Here is a site with a series of examples that will help you link the definition you just gave with the one I gave. Specifically, watch what happens to the image when he cuts out low or high frequencies. You will find that sudden changes in intensity correspond to high frequencies, and gradual changes correspond to low frequencies.

If you don't like his examples (there are many pages with similar examples, he just provides very clear pictures so you can see and follow along), there are others. For another example, it would be useful to work through the encoding of a jpeg image.

One way or another, you must have SOME way to extract frequencies from a set of discrete points. If you think it is some method other than the DFT or DCT, I would love to hear that method. Until then, the DFT and DCT return the definition I gave.
 
With graphics, that is rarely the case. You rarely get information sampled at a high enough rate to preserve all detail. So I think it disingenuous to say you are "increasing the sampling rate". While you increase the overall number of samples, you really didn't increase the rate that the original picture was sampled at.
I very much doubt Xmas needs that explained... :) The problem here is simply that of the definition of 'resampling'; i.e. does it imply sampling the original data again, or are you actually sampling the already-sampled data? So clearly if resampling means sampling the original data again, then there must be a word for sampling/interpolating/... the sampled data to change some of its characteristics; so what's that word? Assuming there is indeed one, that'd make this discussion much simpler I suspect...
 
I'll add my own then ... a pixel is not a little sinc either.
I won't disagree with you.

Even a hard edge blurred by a 3-5 pixel wide truncated gaussian is still far too hard
By "hard edges" I was implying a discontinuity.
 
With graphics, that is rarely the case. You rarely get information sampled at a high enough rate to preserve all detail. So I think it disingenuous to say you are "increasing the sampling rate". While you increase the overall number of samples, you really didn't increase the rate that the original picture was sampled at. You have lost information, and are not really recreating that information. You are just making what you have larger. As sample rate is tied in so closely with the maximum possible frequency of an image, I do not believe you should refer to upscaling as resampling.
I am aware of the effects of sampling below the Nyquist rate, and I agree that the information on any frequencies above half the sample rate is lost (or turned into aliasing).

Indeed that was the basis of my objection to the statement that "upsampling destroys hard lines": since the maximum frequency in the low-resolution image is lower than what a high-res image could represent, I find it difficult to consider lines in the low-resolution image "hard". Methods such as nearest neighbour upscaling actually make edges "harder" than they are in the low-res source image.

Yes, resampling is not recreating information, but it is still defined as changing the sample rate. And while sample rate determines the maximum representable frequency, it is not a claim that a given signal contains that frequency. It is just the number of samples per unit length. That is why I consider upscaling resampling.

If he uses a discrete Fourier transform (or discrete Cosine or Sine transforms), that is the definition I gave, just stated another way.
The problem I see with the definition of frequency as "the intensity difference between pixels" is that it makes no statement about distance (and therefore sample rate), which the definition I gave does.

That is where the example I gave of sampling a sine wave of a given frequency at different sample rates (above the Nyquist rate) comes in. Using a higher sample rate in this case means the differences between samples will be smaller. Yet I don't think you could say that this changed the frequency content of the signal.
 
Indeed that was the basis of my objection to the statement that "upsampling destroys hard lines": since the maximum frequency in the low-resolution image is lower than what a high-res image could represent, I find it difficult to consider lines in the low-resolution image "hard". Methods such as nearest neighbour upscaling actually make edges "harder" than they are in the low-res source image.

This is arguing the definition of the word "hard" though. This was done before. I was more specific than the general statement. "Hard lines" refer to high frequency signals in the Fourier domain. So the issue in question became not the definition of hard line, but whether or not upscaling suppresses high frequencies in the Fourier domain. Let's not turn this back into a game of semantics.

The problem I see with the definition of frequency as "the intensity difference between pixels" is that it makes no statement about distance (and therefore sample rate), which the definition I gave does.

It does make a statement about distance though. It is actually very precise - it defines distance based on the number of samples. Once again, that is why that definition is so useful. The easiest way to see this is to look at a set of examples. Take the following:
Code:
Ref #   Sample                             0     1/5     2/5     3/5     4/5 
1        {1,1,1,1,1}                      1      0       0      0        0 
2        {100,100,100,100,100}            1      0       0      0        0
3        {1,1,1,1,5}                     .75   .33     .33    .33       .33
4        {1,2,3,4,5}                     .90   .25     .15    .15       .25
5        {1,1,1,1,20}                    .53   .42     .42    .42       .42
6        {1,2,20,4,5}                    .68   .37     .36    .36       .37
* DFT is normalized to make it easier to read

I have given a number of samples and the DFT frequencies for each. To be precise, the numbers given are the magnitude of the sinusoidal component at the frequency given in the top row. In other words, we can treat these as the magnitudes for the frequencies listed across the top. Note also the periodicity in the second half of the table. We normally consider the frequency range as running from -2/5 to 2/5 instead of 0 to 4/5; I just wrote them out this way so that if you were doing the DFT by hand it would be easier to follow.
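If you want to reproduce the table, here is a short sketch (assuming numpy, and assuming the normalization is division by the Euclidean norm of the magnitude vector, which is what the values above appear to use):

Code:
import numpy as np

# Reproduce the normalized DFT magnitudes from the table above.
samples = [
    [1, 1, 1, 1, 1],
    [100, 100, 100, 100, 100],
    [1, 1, 1, 1, 5],
    [1, 2, 3, 4, 5],
    [1, 1, 1, 1, 20],
    [1, 2, 20, 4, 5],
]

for row in samples:
    mags = np.abs(np.fft.fft(np.array(row, dtype=float)))
    print(np.round(mags / np.linalg.norm(mags), 2))   # frequencies 0, 1/5, 2/5, 3/5, 4/5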

Look at examples 1 and 2 with no change between samples. The only frequency is the low base frequency, and it does not change just because the sample values are larger. Look at 1 and 3. If you change one of the samples, the frequencies shift. The high frequency components go up and the base frequency goes down. This brings us to example 4. Once again, changing numbers makes the frequency shift. However, the really high frequencies are not shifted as much as in example 3 and the base frequency regains some of its previous stature. The numbers here changed gradually - even though the highest and lowest sample were exactly the same. So obviously the frequency content has to be tied somehow to the magnitude of intensity changes between pixels.

At this point, you should see where the definition comes from. Sudden changes in intensity correspond to high frequencies in the Fourier domain. So you can say that spatial frequency is a result of the changes in intensities between pixels. This is true as long as you are using the DFT (or a form of it like the DCT) to calculate your frequencies. So the definition is akin to saying that frequencies are defined by the discrete Fourier transform. It takes us back to your definition by Foley. He said that the frequency was sine wave cycles per unit distance. The Discrete Fourier transform maps the image onto a series of Sine waves sampled at a specific rate.

So the definitions are the same as long as you are using a DFT or DCT to get your frequencies. As I have stated before, if you have a better way of fitting discrete points to individual frequencies without a priori knowledge of the function they were generated with - I would love to hear it.

That is where the example I gave of sampling a sine wave of a given frequency at different sample rates (above the Nyquist rate) comes in. Using a higher sample rate in this case means the differences between samples will be smaller. Yet I don't think you could say that this changed the frequency content of the signal.

We dealt with the pure Sine wave example earlier in the thread. However, I think your assertion is better answered with a question:

Xalion said:
So at this point is is worthwhile to stop and talk about Discrete Fourier Transforms. What a DFT does at its most basic level is to transform a set of points into a set of numbers that correspond to amplitudes and phase shifts for Sine functions whose frequencies can make up the picture.
...
All of the DFT values I have given are normalized amplitudes.
...

We can label the frequency terms as:
0.831522 - Low
0.289018 - Middle
0.265996 - High

Remember that these are amplitudes so we can make direct comparisons. Now, let's take a look at the smallest interpolation. Extracting the same values (remember that each corresponds to the same frequency in a Sine wave expansion) we get:

0.873131 - Low
0.284212 - Mid
0.177299 - High

Now, we can immediately see what has happened. The Low frequency was slightly amplified. The mid and high frequencies were depressed. Their amplitudes decreased. Let's take the second set:

0.882523 - Low
0.283273 - Mid
0.159256 - High

The same pattern exists. Next sample

0.890959 - Low
0.283260 - Mid
0.141029 - High

Compare the High values, which are the high frequency component of my image after a series of upscales. Now, the frequency content of the image clearly changed (high frequencies are suppressed) through nothing more than upscaling. So, the question becomes:

If you are claiming the frequency content is not changing, why do their amplitudes decrease?
 