My early take (from Page 5, haven't caught up yet):
The NV35's 330 score is most likely using floating point; the NV30's 330 score most likely is not. The NV35's 330 would then be a really big improvement, because it is actually following the spec, though the number alone doesn't tell that story. If benchmarks bear out similar results between the cards, this would fit the distinction I recall of the 44.03 drivers being WHQL-certified for the NV35 only: the "register combiner" usage would only be up to DX9 spec for that particular card (out of those released so far).
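To make the precision point concrete, here is a toy C++ sketch (my own illustration, not driver code; the FX12 behavior is simplified to floor-quantization) of why register combiner math falls short of the spec: DX9 PS 2.0 requires at least FP24, or FP16 under the partial precision hint, while FX12 is a 12-bit fixed-point format, and its error compounds across dependent instructions:

```cpp
#include <cstdio>
#include <cmath>

// Hypothetical illustration (not actual driver code): why "register
// combiner" precision falls short of DX9. PS 2.0 requires at least FP24
// (or FP16 with the _pp hint); NV30-style register combiners work in
// FX12, a signed fixed-point format with 1/1024 resolution clamped to
// roughly [-2, 2).
float fx12(float x)                 // quantize to FX12 fixed point
{
    if (x >  2047.0f / 1024.0f) x =  2047.0f / 1024.0f;
    if (x < -2.0f)              x = -2.0f;
    return std::floor(x * 1024.0f) / 1024.0f;
}

int main()
{
    // A toy specular term, pow(N.H, 32), a common shader computation.
    float ndoth = 0.97f;
    float full  = std::pow(ndoth, 32.0f);            // FP32 path
    float comb  = ndoth;
    for (int i = 0; i < 5; ++i)                      // 2^5 = 32
        comb = fx12(comb * comb);                    // FX12 path
    std::printf("fp32: %f  fx12: %f\n", full, comb); // results diverge
    return 0;
}
```

Run it and the two paths print visibly different values for the same input, which is exactly the sense in which a score computed through the combiners is not measuring the same thing as one computed in floating point.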
It seems clear that nVidia is cheating on a large scale, in a way that doesn't seem compatible with the term optimization at all.
It seems clear that ATI is doing a "bad" optimization, i.e., a benchmark-specific one. Given the architecture, it is easy to believe the output is absolutely identical. But it would still remain that ATI needs to improve their low-level parser to find the same optimization opportunity generically, so it can be applied "legally" within Futuremark's rules. Kudos to Futuremark for their standards providing the incentive to do exactly that.
I.e., it is quite easy for me to believe it is a 100% legitimate optimization done in a 100% illegitimate way for 3DMark03. What they did wrong would then not be lying about their hardware, but lying about their current drivers' ability to adapt low-level shader code to it. I'd be curious to see a technically detailed explanation of whether that analysis is beyond what ATI can currently accomplish, or whether they have plans to introduce improved analysis in the future.
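For the skeptical, a trivial C++ sketch (entirely my invention; the "instructions" are made up) of why a reordered shader can be bit-identical: when two instructions write different registers and neither consumes the other's output, swapping them cannot change any final register value, and a compiler that can prove that dependency fact gets the speedup on any shader, not just 3DMark03's:

```cpp
#include <cstdio>

// Toy sketch (hypothetical, not ATI's actual compiler) of the kind of
// reordering at issue: two independent instructions are swapped so they
// schedule better, and the final register contents are provably
// identical because neither reads the other's result.
struct Regs { float r0, r1, r2, r3; };

// Original instruction order, as issued by the application.
Regs original(Regs r)
{
    r.r2 = r.r0 * r.r1;   // mul r2, r0, r1
    r.r3 = r.r0 + r.r1;   // add r3, r0, r1   (independent of the mul)
    return r;
}

// Reordered: the add is hoisted above the mul. The two instructions
// share no destination register and neither consumes the other's
// output, so every register ends up bit-identical.
Regs reordered(Regs r)
{
    r.r3 = r.r0 + r.r1;   // add r3, r0, r1
    r.r2 = r.r0 * r.r1;   // mul r2, r0, r1
    return r;
}

int main()
{
    Regs in{1.5f, 2.25f, 0.0f, 0.0f};
    Regs a = original(in), b = reordered(in);
    std::printf("identical: %d\n", a.r2 == b.r2 && a.r3 == b.r3);
    return 0;
}
```

The distinction I'm drawing is that ATI appears to have substituted the reordered sequence by recognizing the application, rather than by proving this kind of equivalence in their parser.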
Call it a cheat? For 3DMark, at current, it is, despite the fact that such optimizations might be quite easy to deliver for games through developer relations, even with absolutely identical output and workload. The problem for ATI is that 3DMark's definition of identical workload seems to preclude application-specific detection.
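To pin down that distinction, a hypothetical C++ sketch (all names invented; no vendor's driver looks like this) of the two ways a driver can arrive at the same faster shader:

```cpp
#include <string>

// Purely illustrative pseudo-driver code: every name here is made up.
// Both paths could emit the very same reordered, bit-identical shader;
// Futuremark's rules forbid the first because it keys on the
// application rather than on the shader itself.
struct Shader { /* parsed instruction stream would live here */ };

// Path 1: application-specific detection -- a cheat under 3DMark's rules.
bool isBenchmark(const std::string& exeName)
{
    return exeName == "3DMark03.exe";   // keyed to one application
}

// Path 2: generic analysis -- legal, and it benefits every DX
// application that submits a suboptimally ordered shader.
Shader optimizeGenerically(const Shader& in)
{
    // ...build dependency graph, prove independence, reschedule...
    return in;   // stub: a real pass would return the reordered shader
}

int main()
{
    Shader s{};
    // The rules ask the driver to take the second path unconditionally.
    Shader out = optimizeGenerically(s);
    (void)isBenchmark("some_game.exe");
    (void)out;
    return 0;
}
```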
Call it the same thing as what nVidia is doing? I don't think there is any way to justify that by rational means, as it seems clear that the 44.03 drivers drastically alter the workload by any remotely reasonable metric.
Let the battle be between each vendor's hardware and its drivers' runtime shader analysis, so the benefits can be delivered universally to all DX shader applications. Thumbs up to Futuremark for working to make sure their application facilitates IHVs delivering exactly that.