my own re[h]ash
The question for the methodology is one of actually following through on the implications of the approach's limited scope, and of keeping the presentation clearly focused on its strengths. Things seem on track for AF and AA, but the programmable featureset looks to be opening a new can of worms (we are already seeing some of this, in the form of increased fragility with patches and new game content).
What I am most concerned about are certain statements and representations in the article itself, which indicate to me that the basis on which decisions relating to this matter are going to be made is fallacious...
...
It spends a great deal of time on what I can only evaluate as revising the history of what occurred when [H] allowed, paraphrasing, "cheaters to cheat [H] and its readers". The resulting description of events bears little resemblance to what actually happened, and offers no real justification for some of the statements it makes.
According to the article, "synthetics" led them astray concerning cards in the past, yet for one example of "enlightenment" on the matter it mentions the article by Extremetech that exposed the truth...inconvenient for the conclusion drawn about synthetics, Extremetech exposed it using "synthetic" testing. How synthetics were discredited by the event were when sites, most prominently [H] itself, attacked the Extremetech article and said what the synthetics demonsrated were false. The article somehow fails to convey that accurately to me.
According to the article, colored mip map levels can lead you astray concerning filtering issues, as in UT2k3, but in the actuality of that issue, colored mip maps were in fact what was used to illustrate the problem (a rough sketch of the technique appears below).
It was [H]'s own specific decisions, in selecting textures and still screenshots to erroneously state there was no difference, and in ignoring any issues shown by colored mip maps, that led people astray, when both synthetics and other choices of where to look for in-game issues did not.
This is a central inconsistency...the actual blame for [H]'s errors around the issues the article represents as enlightening, and as 'naturally' leading them to the conclusion not to use synthetics, lies with the site's own decisions (decisions to ignore synthetic evaluations, actually). At the time, other people weren't making the same decisions, and the error of those decisions was fairly obvious to others even as [H] was making and defending them.
The tools that actually sparked the opportunity for enlightenment in those issues...were synthetic tools.
From this, it concludes and proposes that the problem is "synthetic" tools failing to represent actuality, and that getting rid of them removes the issue!? This isn't supported by what actually happened, only by selectivity and/or revisionism about what happened.
What it actually shows is that thought, investigation, and standards have to be applied to any information gathered. Correcting a lack of those helps prevent people from being led astray; distorting where the error was in the past does not.
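(As an aside, for anyone unfamiliar with the colored mip map technique referenced above: the idea is simply to upload a texture whose mip levels are each a distinct flat color, so the hardware's filtering and LOD transitions show up on screen as visible colored bands. Here is a minimal sketch in Python with PyOpenGL; it assumes an already-created GL context, and the function and constant names below are mine for illustration, not from any particular review tool:

from OpenGL.GL import (
    glGenTextures, glBindTexture, glTexImage2D, glTexParameteri,
    GL_TEXTURE_2D, GL_RGBA, GL_UNSIGNED_BYTE,
    GL_TEXTURE_MIN_FILTER, GL_TEXTURE_MAG_FILTER,
    GL_LINEAR_MIPMAP_LINEAR, GL_LINEAR)

# One flat color per mip level; the band transitions on screen show where
# the hardware switches (or blends) between levels.
LEVEL_COLORS = [
    (255, 0, 0, 255), (0, 255, 0, 255), (0, 0, 255, 255),
    (255, 255, 0, 255), (255, 0, 255, 255), (0, 255, 255, 255),
    (255, 255, 255, 255), (128, 128, 128, 255), (0, 0, 0, 255)]

def make_colored_mipmap_texture(base_size=256):
    tex = glGenTextures(1)
    glBindTexture(GL_TEXTURE_2D, tex)
    size, level = base_size, 0
    while size >= 1:
        color = LEVEL_COLORS[min(level, len(LEVEL_COLORS) - 1)]
        pixels = bytes(color) * (size * size)  # solid RGBA buffer for this level
        glTexImage2D(GL_TEXTURE_2D, level, GL_RGBA, size, size, 0,
                     GL_RGBA, GL_UNSIGNED_BYTE, pixels)
        size //= 2
        level += 1
    # Trilinear filtering makes the level-to-level blending itself visible.
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR)
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR)
    return tex

Rendered with a texture like that, full trilinear shows smooth blends between the colored bands, while reduced filtering shows abrupt or shifted transitions, which is exactly the kind of thing that made the UT2k3 behavior visible.)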
...
Here is some fairly simple logic that some of the article seems to contradict...
Comparing synthetics to other data and determining where they fit is more thought and investigation than not doing so. That, for example, is how many people knew how the NV30 and NV35 compared for PS 2.0 long before [H] informed them, and where [H] failed was in directly stating that the people saying so were wrong, motivated by sour grapes, or lying just to hurt nVidia (these actions seem to have been forgotten when analyzing how people could have been led astray). This is how [H] failed on the matter of readers being "cheated by the cheaters" and on their stance on 3DMark03: by simply looking away from the information. They further assisted in leading people astray by mirroring nVidia's PR statement of "don't look behind the curtain" so closely, and I think they even linked it as justification for the article...as justification for their decisions, not as an illustration of the mistake of echoing IHV PR nearly verbatim, point for point, at full article length.
Ignoring synthetic measurements by not looking at them is less thought and investigation, not more. The harm isn't in the places where such a policy of less investigation happens to overlap with where synthetics fail; it is in the places where the policy simultaneously overrides where synthetics do not fail, and ignores where the reviewer's selection and perception do fail to apply for the user. Again, the above examples of the article's revelations about where synthetic criteria fail are actually cases where [H] decided to ignore synthetics and led readers astray!
...
Moving away from the article's issues, and on to concerns about certain aspects of the outlined methodology itself:
Examining only the most popular games? That's exactly where IHVs have a financial interest in engineering a representation that doesn't reflect the hardware's general abilities. It gives them the maximum opportunity for fragile corner cutting, and moves backward from the progress that has been made toward encouraging more general "cut corners" that can actually survive scrutiny in the general case.
A general case "cut corner" is far more likely to be an optimization than a cheat, and a fragile "cut corner" is far more likely to be a cheat than an optimization.
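To make that distinction concrete, one crude test of fragility that follows from it is defeating name-based application detection: copy the benchmark under a different executable name and see whether the score moves. A hedged sketch in Python; the benchmark command line, the "--timedemo" flag, and the fps output format here are all hypothetical stand-ins, not any real tool's interface:

import re
import shutil
import subprocess

def run_and_parse_fps(exe_path):
    # Hypothetical benchmark assumed to print a line like "avg fps: 87.3".
    out = subprocess.run([exe_path, "--timedemo"], capture_output=True,
                         text=True, check=True).stdout
    match = re.search(r"avg fps:\s*([\d.]+)", out)
    if match is None:
        raise RuntimeError("could not find an fps line in benchmark output")
    return float(match.group(1))

def fragility_check(exe_path, tolerance=0.03):
    # Run once under the real name, once under a neutral name, and compare.
    baseline = run_and_parse_fps(exe_path)
    renamed_path = "renamed_copy.exe"  # arbitrary name the driver won't recognize
    shutil.copy(exe_path, renamed_path)
    renamed = run_and_parse_fps(renamed_path)
    delta = abs(baseline - renamed) / baseline
    return baseline, renamed, delta > tolerance  # True = suspiciously fragile

A big swing from a mere rename points at detection-based special-casing, the fragile kind of "cut corner"; a score that holds up suggests the optimization is general.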
Also, how will it serve readers who happen to play a game (or several) that wasn't popular enough to be represented in the selection of top sellers? Actually, that question sparks a bit of deja vu...this situation is a return to many reviews being more a review of the effort IHVs spent targeting the most popular games than of the hardware itself.
Finally, it seems to actively encourage the bad aspects of the various IHV marketing campaigns, by giving maximum exposure to the picture IHVs manage to engineer with them.
However, if you're going to limit usefulness the way many reviews always have, it is better to do it well and to educate your readers about what you're doing. Where the article actually explains that this might be the outcome, instead of trying to vindicate the site and revise the context of past failures, it is an encouraging sign.