If you dig deep, you will find old threads on this very topic discussing the "IQ tests" in previous releases of the benchmark.
The discussion touched on a huge number of factors and led to much debate.
In previous versions of 3DMark, the IQ tests would compare the reference rasterizer (which was basically identical to the GF3's renderer, given its DX8.0 roots) against stills taken from the benchmark and grade accordingly.
This basically made the "IQ score" a measure of how closely the underlying hardware could match the reference rasterizer, with no regard to color accuracy, truncation, or other issues.
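To make that concrete, here's a minimal sketch of that style of grading, assuming a simple per-pixel difference metric. The function name, the scoring formula, and the 0-100 scale are my own illustration, not 3DMark's actual implementation:

```python
import numpy as np

def iq_score(reference: np.ndarray, captured: np.ndarray) -> float:
    """Grade a captured frame purely by how closely its pixels match
    the reference rasterizer's output (0 = no match, 100 = identical).
    Both inputs are HxWx3 uint8 images of the same size."""
    diff = np.abs(reference.astype(np.int16) - captured.astype(np.int16))
    mean_error = diff.mean()  # average per-channel deviation, 0..255
    return 100.0 * (1.0 - mean_error / 255.0)
```

Note that nothing in a metric like this asks whether the captured image looks better than the reference; it only asks whether it is different.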
Unfortunately, this led many websites to construct arguments to "dock" IQ on 3D cards that were delivering what many would consider far superior IQ. So the debate really became: what was the developer's intention? And did the rasterization process and 3D hardware deviate from, enhance, or degrade the final quality relative to the original concept?
More on the website thing. The easiest way to explain the farce of the time is through a good illustration:
If these are the shaded renderings used in an IQ comparison test like the one 3DMark employs, you can see that IHV-B would "fail" the image quality test precisely because it employs higher color accuracy and delivers better gradients. You would have test results failing IHV-B while giving a perfect "A" score to IHV-A.
After all, in this example IHV-B deviates greatly from the reference rendering, so it fails.
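As a toy demonstration of why that outcome is baked into the methodology (again using the hypothetical per-pixel metric sketched above, not 3DMark's actual code), a smooth, higher-precision gradient scores worse against a banded reference than a bit-exact copy of the banding does:

```python
import numpy as np

def iq_score(reference: np.ndarray, captured: np.ndarray) -> float:
    diff = np.abs(reference.astype(np.int16) - captured.astype(np.int16))
    return 100.0 * (1.0 - diff.mean() / 255.0)

# The reference and IHV-A render a coarsely banded gradient;
# IHV-B renders the same ramp smoothly with higher color accuracy.
smooth = np.tile(np.arange(256, dtype=np.uint8), (64, 1))  # 64x256 smooth ramp
banded = (smooth // 32) * 32                                # 8 visible bands

ref   = np.stack([banded] * 3, axis=-1)
ihv_a = np.stack([banded] * 3, axis=-1)
ihv_b = np.stack([smooth] * 3, axis=-1)

print("IHV-A:", iq_score(ref, ihv_a))  # 100.0, matches the reference exactly
print("IHV-B:", iq_score(ref, ihv_b))  # about 93.9, penalized for the smoother gradient
```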
So it's a much bigger topic once you start to scratch the surface.