Image quality - a blast from the past

Bigus Dickus said:
No, I didn't miss that statement, I just don't agree with it. Perhaps it's just a difference of philosophical opinions. As I said above, "correctness" to me indicates some absolute assessment. As in 2 + 2 = 4 is "correct." It's almost as if you're saying that, if the reference rasterizer insists that 2 + 2 = 5, then so long as you match that, you are also correct. That seems ludicrous.

Call it "compliancy with the refrast" or "agreement with the refrast," but not "correctness," and for God's sake not "quality."

Arguing over semantics now, eh? :) Correctness, schmancness, whatever. If you don't like the word "correctness", fine. Either way, your alternatives show that you understood my idea. Although you keep bringing up "quality" when I have stated more than a couple of times that I no longer believe you can quantify it. Anyway, you seem to believe that you shouldn't quantify the difference between the reference rasterizer and hardware vendors' rendering methods. I believe we should. So, we differ in opinion. Oh well, not the first time or the last.


Bigus Dickus said:
The root of the problem is that there just isn't a guarantee that MS or anyone else having input into the refrast chose the "most correct" way to render something. Theirs is but one way, and just because another method is different does not make it incorrect. It makes it DIFFERENT.

Remember, I don't care about the specifics of how a particular feature is implemented. I'm only interested in the result. Please tell me you at least agree that specific (though not all) kinds of rendering can be defined mathematically? For example, the z-buffer is defined mathematically, agreed? The answer or result of using a z-buffer should be easily determined mathematically. However, different vendors may choose different ways to reach that result. Depending on the rendering feature, most may not reach the exact same result. However, they should be close (by varying degrees depending on the feature). And that's what can be quantified.
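To make "that's what can be quantified" concrete, here is a minimal sketch of the kind of metric I have in mind, assuming you can capture the refrast output and the hardware output of the same frame as arrays of RGB pixels. The function name and the simple mean-absolute-difference metric are illustrative only, not a proposed standard:

```python
# Illustrative sketch only: quantify how far a hardware render deviates
# from the reference rasterizer's render of the same frame. Both frames
# are assumed to be equally sized nested lists of (R, G, B) tuples, 0-255.

def mean_absolute_difference(refrast_frame, hardware_frame):
    total = 0
    count = 0
    for ref_row, hw_row in zip(refrast_frame, hardware_frame):
        for ref_px, hw_px in zip(ref_row, hw_row):
            for ref_ch, hw_ch in zip(ref_px, hw_px):
                total += abs(ref_ch - hw_ch)
                count += 1
    return total / count  # 0.0 = identical, 255.0 = maximally different

# Example: two 1x2 "frames" that differ by one unit in one channel.
ref = [[(10, 20, 30), (40, 50, 60)]]
hw = [[(10, 20, 31), (40, 50, 60)]]
print(mean_absolute_difference(ref, hw))  # ~0.167
```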

The next thing you bring up is whether or not Microsoft chose the correct definition for each rendering feature. I can understand that this can be a problem. However, I contend that if a definition in the reference rasterizer is shown (objectively) to be vastly different from what vendors commonly agree on, then that in itself should be enough to show Microsoft that they need to update the reference rasterizer to the more commonly agreed-upon result. Will it actually be done? That's a totally different discussion. However, I will say that I seem to remember that one of 3D Winbench's quality tests did get Microsoft to change how a rendering feature was defined. Can't remember which one off the top of my head. Maybe somebody else remembers.


Bigus Dickus said:
So you have two parts: a subjective judgement on quality, and a compliance to a particular method of rendering something, which was likely chosen subjectively when image quality was involved, or in many cases for reasons completely unrelated to IQ at all.

How could you possibly derive an objective assessment of IQ from those two components?

AGAIN, I'm no longer saying you can do an objective comparison on image quality. Since my second post in this thread I've stated I'm no longer interested in quantifying IQ, but instead only the "compliance with the reference rasterizer" as you state. Is that spelled out enough for you now? :D

Tommy McClain
 
AzBat said:
Anyway, you seem to believe that you shouldn't quantify the difference between the reference rasterizer and hardware vendors' rendering methods. I believe we should. So, we differ in opinion.
Actually, I think such a comparison could be useful. If a good algorithm were derived, such as the one Fred, Joe, and others are suggesting, it could be a very useful tool. My concern is that as soon as such comparisons are made, people will take the results out of context and attempt to relate them to image quality, which is fundamentally flawed.

This isn't some paranoid overconcern either - it happened in the past when legion88 (don't remember his real name at the time) so thoroughly botched up the XOR test comparison.

So we don't differ in opinion on whether such a test should be done (ideally), or whether it would be useful. I'm just very aware of how the results will likely be interpreted by the masses, and I'm not sure that type of misinformation is better than just not knowing.

Please tell me you at least agree that specific (though not all) kinds of rendering can be defined mathematically? For example, the z-buffer is defined mathematically, agreed? The answer or result of using a z-buffer should be easily determined mathematically. However, different vendors may choose different ways to reach that result. Depending on the rendering feature, most may not reach the exact same result. However, they should be close (by varying degrees depending on the feature). And that's what can be quantified.
Yes. But to reiterate my point, I'll elaborate a bit on your example. Suppose that IHV X intentionally chose a method of calculating Z values that was different from the results obtained by the reference rasterizer. Perhaps they wanted a slightly non-linear (relative to the refrast) set of values which, used in combination with some other nifty depth-blurring feature, gives a subjectively better simulation of depth of field. In that case, why would they care about matching the refrast, and why should we care? It's only the subjectively judged final image that is important.

OK, so I still think having the comparison would be useful, if only as a tool to identify how each IHV does certain things. But I don't think any label of correctness, accuracy, or quality should be applied to the results... and you know that's what people would do.

AGAIN, I'm no longer saying you can do an objective comparison on image quality. Since my second post in this thread I've stated I'm no longer interested in quantifying IQ, but instead only the "compliance with the reference rasterizer" as you state. Is that spelled out enough for you now? :D
Perfectly clear. :) And I agree there. Such tests should be theoretically possible (comparisons with any reference, not just the refrast, should be as well), and I think it would be interesting to know the results.

But if it's "compliance with the refrast" then how do you think people will react to a review labeling card X as "non-compliant with the refrast?" ;)

If consumers could be educated, I'd be all for it, but that seems unlikely.
 
My concern is that as soon as such comparisons are made, people will take the results out of context and attempt to relate them to image quality, which is fundamentally flawed.

Heh...nothing lost then. "Fundamentally flawed results out of context" is what happens today. ;)
 
Bigus Dickus said:
Take any scene that is being rendered in OGL. The rules for rendering that scene are deterministic (AFAIK), and there is only one "perfect image" as specified by those rules. Before rasterization, a continuous form of the image is, in theory, created. If you were to objectively compare the output of a given 3D card to that image, you should, in theory, be able to quantitatively assess its deviation from the reference. I think this is the point that many of you are making.

EXACTLY! Where was this comment when I was mentioning OpenGL in my second post in this thread!? :D


Bigus Dickus said:
Actually, I think such a comparison could be useful. If a good algorithm were derived, such as the one Fred, Joe, and others are suggesting, it could be a very useful tool. My concern is that as soon as such comparisons are made, people will take the results out of context and attempt to relate them to image quality, which is fundamentally flawed.

This isn't some paranoid overconcern either - it happened in the past when legion88 (don't remember his real name at the time) so thoroughly botched up the XOR test comparison.

So we don't differ in opinion on whether such a test should be done (ideally), or whether it would be useful. I'm just very aware of how the results will likely be interpreted by the masses, and I'm not sure that type of misinformation is better than just not knowing.

Wow, we both think such a tool would be useful! Imagine that. :) I'm not so concerned that the results would be misinterpreted. It goes without saying that some people will do that, and more than likely on purpose, but I believe the tests' intrinsic value outweighs those possibilities. I mean, the same can be said of any type of performance benchmark used today.


Bigus Dickus said:
Yes. But to reiterate my point, I'll elaborate a bit on your example. Suppose that IHV X intentionally chose a method of calculating Z values that was different from the results obtained by the reference rasterizer. Perhaps they wanted a slightly non-linear (relative to the refrast) set of values which, used in combination with some other nifty depth-blurring feature, gives a subjectively better simulation of depth of field. In that case, why would they care about matching the refrast, and why should we care? It's only the subjectively judged final image that is important.

True, but aren't vendors required by Microsoft to at least match the reference rasterizer for compatibility or certification reasons? I can see where Microsoft would want them to provide a fallback in addition to the new calculated method.


Bigus Dickus said:
OK, so I still think having the comparison would be useful, if only as a tool to identify how each IHV does certain things. But I don't think any label of correctness, accuracy, or quality should be applied to the results... and you know that's what people would do.

So you suggest we make the comparison and just show how the reference rasterizer does it, then show how IHV X does it, and just leave it there? No passing, no failing, no values assigned showing how different they are, etc.? Hmm. That's better than nothing, but it's not very fun. :)


Bigus Dickus said:
Perfectly clear. :) And I agree there.

:) Cool.


Bigus Dickus said:
But if it's "compliance with the refrast" then how do you think people will react to a review labeling card X as "non-compliant with the refrast?" ;)

If that happened, then I would say good for them! :) All kidding aside, I believe that to a certain extent that happened with 3D Winbench's quality tests. It made some vendors mad, but in the end I believe it was the best thing, since it fixed some gross cheating and made them aware that people were concerned with the results.

Tommy McClain
 
I'd just like to add a little bit more on "correctness."

The way I see it, there are but three major ways that video cards can vary from "correctness" and still be good:

1. FSAA. There are tons of ways to do it, and the best is a stochastic sampling pattern, which simply cannot be judged with any sort of correctness (since two renders of the same frame with stochastic won't look the same).

2. Texture filtering. Again, lots of ways to do it.

3. Color combination. Using some gamma correction can help, but will screw with "correctness."

Now, the first two are readily quantifiable in how they work. It's generally not hard to look at two different FSAA methods and decide which one is better. There is the natural tradeoff between supersampling and multisampling (and I really do wish nVidia had gone for more MS samples per pixel, as well as better sampling patterns, instead of the supersampling/multisampling hybrids...), but FSAA is generally very easy to analyze.
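To make the stochastic point from item 1 concrete, here is a minimal sketch of one common stochastic pattern, a jittered grid. It is purely illustrative; real hardware bakes its pattern into silicon rather than regenerating it like this:

```python
import random

# Minimal, purely illustrative sketch of one common stochastic pattern:
# a jittered grid, where each of a pixel's n x n sub-samples is placed
# randomly within its own grid cell. This also shows why two stochastic
# renders of the same frame need not match sample-for-sample.

def jittered_samples(n, rng=random):
    """Return n*n (x, y) sample positions within a unit pixel."""
    cell = 1.0 / n
    return [((ix + rng.random()) * cell, (iy + rng.random()) * cell)
            for iy in range(n) for ix in range(n)]

# Two calls give two different, equally valid patterns for the same pixel.
print(jittered_samples(2))
print(jittered_samples(2))
```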

Texture filtering is also not terribly hard. First, there is currently only one advanced type of texture filtering in use today: anisotropic filtering (in the future we may see more, such as bicubic for magnified textures, though magnified textures are getting less and less common). So, there are again some simple ways to quantify visual quality here, though it does get a bit more complex. What needs looking at are the texture LOD selection algorithm, default LOD levels, texture aliasing, maximum degree of anisotropy, and the uniformity of anisotropy across different surfaces. Again, all of this is quantifiable and measurable. Similar to FSAA, while you might not be able to produce one "score" that would satisfy everybody, the tradeoffs are generally rather obvious.
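As an illustration of what a test could measure here, the sketch below computes two of the key numbers, degree of anisotropy and mip LOD, from the screen-space derivatives of the texture coordinates. It follows roughly the formulation in the OpenGL EXT_texture_filter_anisotropic extension; the point is that each IHV approximates these quantities differently, and that difference is measurable:

```python
import math

# Illustrative sketch of two numbers an anisotropic filtering test would
# probe: degree of anisotropy and the mip LOD chosen for a pixel. Inputs
# are the screen-space derivatives of the texture coordinates, in texels.
# Roughly the EXT_texture_filter_anisotropic formulation; real hardware
# approximates these quantities in vendor-specific ways.

def anisotropy_and_lod(dudx, dvdx, dudy, dvdy, max_aniso=16):
    px = math.hypot(dudx, dvdx)     # footprint length along screen x
    py = math.hypot(dudy, dvdy)     # footprint length along screen y
    p_major, p_minor = max(px, py), min(px, py)
    n = min(math.ceil(p_major / max(p_minor, 1e-9)), max_aniso)  # taps
    lod = math.log2(max(p_major / n, 1e-9))  # mip level used per tap
    return n, lod

# A surface viewed nearly edge-on: long footprint in x, short in y.
print(anisotropy_and_lod(8.0, 0.0, 0.0, 1.0))  # (8, 0.0): 8 taps, base mip
```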

But I'd like to comment on the last one. Due to the programmable nature of modern graphics processors, it really is necessary for the hardware to give complete control over any special color-combination stages in order for there to be maximum image quality. At the same time, I don't see any problem in enabling automatic gamma-correct combination for legacy games, but it should still be user-adjustable. The reason is that the developer may have designed artwork with a specific gamma in mind. This sort of freedom should not be taken from the developer, and the developer should not need to tweak such things for each different video card.
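For those wondering what gamma-correct combination actually does to the output, here is a minimal single-channel sketch (a display gamma of 2.2 is assumed purely for illustration) showing why it can't help but break bit-exact agreement with a non-gamma-aware reference:

```python
# Minimal sketch of gamma-correct color combination versus naive blending.
# Values are a single channel in [0, 1]; the gamma of 2.2 is an assumed
# illustrative figure. Blending gamma-encoded values directly darkens the
# result; decoding to linear light first is the "gamma-correct" path, and
# it visibly changes the output, which is why it breaks bit-exact
# "correctness" against a reference that blends encoded values.

GAMMA = 2.2

def blend_naive(a, b):
    return (a + b) / 2.0                   # average the encoded values

def blend_gamma_correct(a, b):
    lin = (a ** GAMMA + b ** GAMMA) / 2.0  # decode, average in linear light
    return lin ** (1.0 / GAMMA)            # re-encode for the display

print(blend_naive(0.0, 1.0))          # 0.5
print(blend_gamma_correct(0.0, 1.0))  # ~0.73, the perceptual midpoint
```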

Everything else, however, absolutely must adhere to the spec. If it's different from the refrast in DX, then it's wrong and needs to be fixed. If it doesn't match the spec of a supported extension in OpenGL, then it's wrong and needs to be fixed. This holds whether it comes to z-buffer calculations or the accuracy of pixel shader calculations. 3D programs depend on certain parameters, and if those are not identical among all cards, it creates major headaches. That's what specs are for.

And I'd like to go ahead and point out one other way in which the Radeon 9700 doesn't render correctly. Enabling anisotropic filtering forces bilinear or trilinear (depending on whether performance or quality is set). While you might think this would be a good thing, developers today set point sampling on certain things in games for a reason. I don't like blurry text.
 
Chalnoth said:
since two renders of the same frame with stochastic won't look the same
Do you mean 'Two renders of the same frame on hardware with different stochastic patterns won't look the same'?

The stochastic pattern would not typically be varied on a frame-by-frame basis, or the image would fail to pass repeatability requirements and would 'move' even when the data has not changed.
 
Specs

And I'd like to go ahead and point out one other way in which the Radeon 9700 doesn't render correctly. Enabling anisotropic filtering forces bilinear or trilinear (depending on whether performance or quality is set).

Is this moan actually based on some spec or just another nitpick :?:
 
Dio said:
Do you mean 'Two renders of the same frame on hardware with different stochastic patterns won't look the same'?

The stochastic pattern would not typically be varied on a frame-by-frame basis, or the image would fail to pass repeatability requirements and would 'move' even when the data has not changed.

Well, the best stochastic pattern would be totally random, which means that it must change on a frame-by-frame basis. If the sampling rate is high enough, this movement won't be very noticeable (though I would like to see what it looks like). I would suspect that as long as the stochastic method uses enough samples (~20 or so), there won't be a problem. Provided there's no regular pattern to how the stochastic patterns change, at upwards of 30 fps the difference shouldn't be noticeable.

Anyway, truly random stochastic sampling probably couldn't ever be done in hardware (except maybe with quantum computers), as you'd need to store the sample pattern of each pixel each frame, which could be quite a bit of data.
 
Re: Specs

Heathen said:
Is this moan actually based on some spec or just another nitpick :?:

Well, yes. The program specifies that it wants point sampling. The Radeon 9700 does bilinear or trilinear. And it is a problem, because it sometimes causes blurry text.

And this is even more of a problem due to the texture aliasing of the Radeon's anisotropic filtering. I can't lower the LOD at all to fix the aliasing, because doing so blurs the text.
 
Some people don't like blurry text, while others can't stand blurry textures. I guess both being sharp would be preferred, except some people probably don't like sharp textures; why else would anyone use Quincunx? Also, speaking of text: right now I am on my GF2 MX400 and all my text is blurry. Me too hate blurry text; me too hate my GF2 MX400.

This discussion is about doing an IQ comparison for reviews, right? Just keep it simple and show the different aspects or components that make an image look good to people in general. Not everyone will agree with someone's assessment, but it will probably be in the ballpark.

Still images, I think, work all right for showing a number of aspects of IQ. A two-to-three-frame animation, or just a slight movement captured in an image, can show aliasing or the lack of it; it doesn't take much to show this aspect by flipping between two or three images. K.I.S.S.
 
Chalnoth said:
Well, the best stochastic pattern would be totally random, which means that it must change on a frame-by-frame basis.
Just because the _pattern_ is random doesn't mean that it has to change. All you're trying to do is get the Fourier transform of the resulting distribution to look right.
 
Dio said:
Just because the _pattern_ is random doesn't mean that it has to change. All you're trying to do is get the Fourier transform of the resulting distribution to look right.

Right, but the goal of stochastic sampling is to break up regularity in the image, which, in essence, eliminates aliasing. The absolute best way is to let the technique be random not only in the two screen dimensions, but also in the dimension of time. Obviously you need more samples if you're going to do it this way, as you don't want the whole screen to be visibly changing color all the time (if it's a supersampling technique).

The way I see it, a more primitive stochastic sampling method will not change from frame to frame, while a more advanced one (using more samples) will. Another requirement for this to look good is having high framerates, so that any changes that do occur will be blended together for a very nice final image (in essence, a 20-sample changing stochastic pattern might simulate a 160-sample static stochastic pattern at 60 fps).

Update: The reason it would simulate a 160-sample pattern is that the receptors in our eyes respond about every 1/8th of a second, so a changing pattern would be averaged in our eyes over that amount of time. But, just as we can see things happening more quickly than 1/8th of a second because the receptors in our eyes are not all synchronized, the actual effect may be a bit less.
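For what it's worth, the arithmetic behind the "20 simulates 160" figure, under the 1/8th-of-a-second assumption above, comes out to roughly 150 rather than 160, since 60 fps times 1/8 s is 7.5 frames:

```python
# Back-of-the-envelope check of the claim above. The 1/8 s integration
# time is the assumption stated in the post, not a measured figure.
samples_per_frame = 20
fps = 60
eye_integration_time = 1.0 / 8.0                 # seconds (assumed)

frames_integrated = fps * eye_integration_time   # 7.5 frames
effective_samples = samples_per_frame * frames_integrated
print(frames_integrated, effective_samples)      # 7.5 150.0
```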

Anyway, you're right. Two different renderers, even if they have the exact same static stochastic rendering technique, would most likely produce different images because of differing choice of the stochastic pattern for each pixel.
 
So

you don't like it, therefore it's a problem. Think again, Chalnoth: what you see is purely subjective, not objective. And considering your level of objectivity over this issue (9700 IQ)...

I've played NWN to death over the last couple of months; compared to the IQ and speed of the GF3 I had in before, the 9700 is leaps ahead. Silky smooth performance at 1280x1024 (4x AA & 16x AF) and brilliant IQ. You may not like the blurry text, but my guess is that most people either a) don't notice it, or b) don't care. Which brings me to another point: have any of NWN's programmers or artists complained about the 9700 rendering the text incorrectly? No, because it's a non-issue.
 
Certainly compared to the 8500's SSAA, I don't consider the text blurry in either NWN or DAoC when I have AA enabled. As for texture aliasing, again with the 9700 Pro in 8x or 16x quality mode I would say the aliasing is almost non-existent, whereas on the 8500 trilinear alone was preferable to bilinear aniso.
 
Chalnoth said:
Right, but the goal of stochastic sampling is to break up regularity in the image, which, in essence, eliminates aliasing. The absolute best way is to let the technique be random not only in the two screen dimensions, but also in the dimension of time. Obviously you need more samples if you're going to do it this way, as you don't want the whole screen to be visibly changing color all the time (if it's a supersampling technique).
From which literature do you get this? It seems quite at odds with the definition of stochastic sampling I have in "Advanced Animation and Rendering Techniques" by Watt/Watt, which has been mentioned several times on this board as one of the key pieces of reference literature. Can you point me at the (to me, seemingly flawed) document you've got this from?

I paraphrase: The purpose of stochastic sampling is to introduce _controlled_ randomness in order to change the Fourier transform to something more useful and thereby to trade off aliasing with some noise function.

The sample points should not change between frames. Changing them will introduce a different noise function on every frame, and therefore cause visual artifacts.

Edit: Having read and thought through this, I believe that the sampling points can actually change from pixel to pixel, but not from frame to frame (or for repeated accesses to the same subpixel).
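To illustrate Dio's "controlled randomness" point, here is a minimal dart-throwing sketch of Poisson-disc sampling, a classic example from the stochastic sampling literature: points are random, but candidates too close to an existing point are rejected, and that minimum-distance constraint is what shapes the noise spectrum rather than plain white noise. The parameters are illustrative; the pattern would be generated once and reused every frame, per the repeatability point above:

```python
import math
import random

# Illustrative dart-throwing Poisson-disc sampler: random points in the
# unit square, rejecting any candidate closer than min_dist to an
# existing point. "Controlled randomness": random placement, but with a
# constraint that pushes the noise energy to high frequencies.

def poisson_disc(n_samples, min_dist, max_tries=10000, rng=random):
    points = []
    tries = 0
    while len(points) < n_samples and tries < max_tries:
        tries += 1
        candidate = (rng.random(), rng.random())
        if all(math.dist(candidate, p) >= min_dist for p in points):
            points.append(candidate)
    return points

# Generated once, then reused for every frame so the image is repeatable.
pattern = poisson_disc(16, 0.15)
print(len(pattern), pattern[:2])
```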
 
Chalnoth said:
And I'd like to go ahead and point out one other way in which the Radeon 9700 doesn't render correctly. Enabling anisotropic filtering forces bilinear or trilinear (depending on whether performance or quality is set). While you might think this would be a good thing, developers today set point sampling on certain things in games for a reason. I don't like blurry text.

This is just plain wrong. I'm sure you would like to go ahead and point out lots of other places where the 9700 doesn't render correctly, but please get the facts straight ;)

If you force anisotropic filtering on with the control panel, then you will get either bilinear or trilinear depending on whether you choose performance or quality. Since you are doing something external to the API (fiddling with a control panel option that forces 'incorrect' behaviour), nothing about rendering correctness is guaranteed, nor can it be.

You have chosen to engage an option that is outside what the application writers specified when they created the app - but at the same time you obviously want a 'psychic driver' that combines the application writer's intent and your desires and always comes up with what you regard as the right answer.

- On what criteria do I, as a driver writer, decide where to force any of the possible different filtering behaviours?

- On which specific textures should I ignore your apparent desire for higher quality filtering (eg. trilinear anisotropic) as indicated from the control panel, and instead give preference to what the application requests?

- Do I have to mystically detect that the desire in a particular configuration is to plot text on the screen?

If trilinear were not forced on in quality mode but instead taken from the application's settings, then I guarantee that there would be complaints that ATI was only doing bilinear anisotropy on legacy applications (that don't enable trilinear). Control panel options are a no-win situation - no matter what you do and how you handle them, someone will always complain. (In this case, you ;))

Naturally, if the application itself sets up anisotropic filtering through the API and the control panel is set to 'application preference' (which is, of course, the default state), then it gets whatever set of filters it requests. Point sampled, bilinear or trilinear - 'what you set is what you get', WYSIWYG(tm) ;)

This is completely correct behaviour. If you want the application to look exactly as it was created to look then don't mess with the control panel.
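A minimal sketch of the decision logic described above (all names hypothetical, not actual driver code): with the panel at its default, the application's filter state passes straight through, and forcing anisotropy necessarily overrides whatever the application asked for:

```python
# Hypothetical restatement of the control-panel behaviour described above.
# None of these names come from a real driver; the point is just that the
# driver cannot honour both a forced panel setting and the application's
# filter state at once, and has no way to know a texture is "text".

def effective_filter(app_filter, panel_mode):
    if panel_mode == "application preference":  # the default setting
        return app_filter                       # WYSIWYG: app gets its way
    if panel_mode == "performance":
        return "bilinear anisotropic"           # forced, app state ignored
    if panel_mode == "quality":
        return "trilinear anisotropic"          # forced, app state ignored
    raise ValueError("unknown panel mode: " + panel_mode)

print(effective_filter("point sampled", "application preference"))  # point
print(effective_filter("point sampled", "quality"))  # trilinear anisotropic
```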
 
Or, in other words, blame the developer for not giving aniso options in NWN's video options.

I wonder why they didn't? I mean, v-sync, Quincunx, and shiny water are all there.
 
Randell said:
Or, in other words, blame the developer for not giving aniso options in NWN's video options.

I wonder why they didn't? I mean, v-sync, Quincunx, and shiny water are all there.
Good question.
However, remember that even the shiny water was not working on ATI hardware.
Maybe a case of "game developed for one IHV's hardware"?
 