Hmm...
As far as I understand, the benefit of aniso comes from adding detail compared to basic bilinear and trilinear filtering, and from increasing the number of texture subsamples represented in each pixel to reduce aliasing/crawling/error relative to an infinite-resolution render scaled down to screen resolution.
It is possible to simultaneously do less sampling work, but from a more detailed mip map than normal. This would result in an image that looks very similar, but with error and aliasing that become apparent in motion.
This would be akin to turning up LOD while at the same time lowering aniso levels. There isn't a problem if a user wants to make that decision to increase performance, but there is something wrong with calling that aniso...the sampling isn't being done. It is similar to the ATI "128x" aniso modes, except ATI didn't call it 128x in their control panel like nVidia is calling this 8x in theirs.
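To make that tradeoff concrete, here's a minimal sketch with made-up numbers, using a textbook-style anisotropic LOD formula along the lines of what the aniso filtering extensions describe...not nVidia's actual hardware path. It shows how capping aniso while biasing LOD sharper can land on the same detailed mip with fewer taps:

```python
import math

def mip_lod(du, dv, max_aniso, lod_bias=0.0):
    # du, dv: texel-space footprint of one screen pixel along two axes
    p_major = max(du, dv)
    p_minor = min(du, dv)
    # Degree of anisotropy actually applied, clamped by the driver/app setting
    ratio = min(p_major / max(p_minor, 1e-6), max_aniso)
    n_taps = math.ceil(ratio)  # samples taken along the line of anisotropy
    lod = math.log2(max(p_major / ratio, 1.0)) + lod_bias
    return lod, n_taps

# "Real" 4x aniso on a 4:1 stretched footprint: detailed mip, 4 taps
print(mip_lod(du=8.0, dv=2.0, max_aniso=4))                 # (1.0, 4)

# The suspected shortcut: cap aniso at 2x but bias LOD sharper...
# same detailed mip, half the sampling work, more aliasing in motion
print(mip_lod(du=8.0, dv=2.0, max_aniso=2, lod_bias=-1.0))  # (1.0, 2)
```

The still frames look close because the mip level matches; the missing taps only show up as crawling once things move.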
The diffs would be highlighting the aliasing errors, which would be seen in motion. Refrast comparison wouldn't work here...only comparison to the hardware's own aniso algorithm would. They would show that it was deviating from the default behavior, with the impact on aliasing/error determined by other means...some sort of motion-based evaluation would need to be used in conjunction with detail comparison to evaluate things.
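For what it's worth, the blunt way to do the still-frame part of that comparison is something like the sketch below (hypothetical file names; it only shows *where* the frames differ, not whether the difference is extra aliasing, which still has to be judged in motion):

```python
from PIL import Image, ImageChops

detected = Image.open("3dmark03_frame.png").convert("RGB")  # frame captured with the real exe name
renamed  = Image.open("3dmurk03_frame.png").convert("RGB")  # same frame, exe renamed to dodge detection

diff = ImageChops.difference(detected, renamed)
diff = diff.point(lambda v: min(v * 8, 255))  # amplify subtle filtering differences
diff.save("aniso_diff.png")
```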
The only way to do this now, that I know of, is to subjectively evaluate aliasing. This wouldn't necessarily be apparent in anything outside of the game test(s) in question, because we don't know how many layers the "3dmark03.exe" branch of driver behavior modification might have.
The issue here seems to be: when it detects something unique to 3dmark03.exe, it is doing one thing, and when that detection characteristic is changed, it is doing something else that adds more detail in some spots and lowers performance significantly. We don't have a tool that facilitates aniso comparison in conjunction with performance levels, and this seems to point to something simple like a special "bake" of mip map LOD + aniso levels that someone decided was close enough.
Actually, for "optimal" filtering settings in 3dmark03.exe, this would be a much more "gray" issue, except as far as Futuremark's application detection rules preclude its existence in any case. In fact, except for application detection, the differences in images wouldn't matter as long as some achievement of the minimum Futuremark "optimal" specifications are accomplished.
You could argue that it might never go "up to" the "Max Anisotropy - 4" that 3dmark03 defaults to, but that is only a problem if Futuremark defined that...it isn't clear to me that "Optimal" stipulates that (it, in fact, seems exactly like a mechanism to allow some leeway in aniso, which I think is quite defensible outside of the application detection problem).
Has this been tested with "anisotropic" selected instead of "optimal" and I just missed it? Remember, nVidia doesn't "happen" to have a label for "application preference" anymore, at least AFAIK.
If it occurs based on application name detection with aniso stated to be selected (whether by the application, or by the driver saying "application preference" and not following through), then any deviation from the hardware's aniso defaults is based on deception, because they are misrepresenting how closely the output matches aniso of the named degree. Sort of like the "Application" and "Balanced"/"Aggressive" bait and switch. They're violating the integrity of the labels "Aniso" and "4x" because it looks different in the exact same scene when the application detection is defeated. Very much like Quack, except Quake III didn't have clearly established rules against application detection for performance comparisons.
Could even be a bug, except all the other effort nVidia has demonstrated in targeting 3dmark03 performance boosts by changing behavior from what was asked for makes that rather conclusively unlikely.
However, if it only occurs with "optimal" selected, and not "anisotropic filtering" selected from within 3dmark, then the image differences are secondary except as the user evaluates image quality, and it is the application detection that is the problem.
The same way "application" specific nVidia "adaptive" aniso could be bad
if the "adaptivity" depends on nVidia testing for the application and putting assumptions in the driver, but doesn't tell the user that is what is going on. Unlike pixel shader output, the range of variation allowed before being "wrong and indefensible" is a lot wider, and the lower quality itself might be valid under certain circumstances...some more evaluation is required to see how far afoul of those circumstances nVidia is beyond the application detection problem.