Another demalion monster post
First, let me distinguish between DX "functionality" and performance "level". "Functionality" refers to the features a card can render, while "level" refers to a card designed to the performance standard of that DX generation.
Crusher said:
Well, you can look at it a couple of ways.
First, you can say that my computer getting 30 3DMarks shows how useless it's going to be on DX9 games. The fact that it's going to be useless on DX9 games is probably true. However, that score is based only on Game Test 1, since that's the only test my computer can run. Game Test 1 is supposed to be a DX7 benchmark.
A DX 7 "functionality" portion of a DX 9 "level" benchmark. This does not mean that a DX 7 "level" card will necessarily perform well, but that it is testing the DX 7 "functionality" of DX 9 "level" cards. I view it as an input to the dynamic of the overall score that is not computationally bound, but other than that I tend to agree that in isolation it is exactly as useless/useful as the 3dmark 2001 tests have been, in my opinion (I'm not a fan of the focus of 3dmark 2001 scoring).
Well, last time I checked, the GeForce 2 was one of the best DX7 cards around.
DX 7 level cards, maybe.
I think the peak framerate I saw displayed was 8 fps, and it usually was below 3 fps. That's not indicative of any DX7 game I know.
Well, it might be if you ran all games using just DX 7 functionality at maximum settings with it. That strikes me as a valid difference between benchmark and game behavior. The thing is, there are DX 7 functionality games that came out, and they require DX 8 level performance to utilize maximum settings. You simply don't use those settings, and you are neglecting that, as a result, an apples-to-apples comparison will run into CPU limitations before you could demonstrate the difference in performance capability of a DX 9 level card, let alone the cards that come after it, which the benchmark is trying to represent.
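To illustrate the CPU limitation point, here is a minimal sketch (all numbers are hypothetical, just to show the shape of the problem): a frame takes roughly as long as the slower of the CPU work and the GPU work, so once the CPU dominates, a faster card can't demonstrate its advantage in fps.

    # Rough model, in Python: a frame takes as long as the slower of CPU and GPU work.
    def fps(cpu_ms, gpu_ms):
        return 1000.0 / max(cpu_ms, gpu_ms)

    print(fps(10.0, 12.0))   # slower card, GPU-bound:  ~83 fps
    print(fps(10.0, 4.0))    # faster card, CPU-bound:  100 fps
    print(fps(10.0, 2.0))    # even faster card:        100 fps -- no visible gain

A benchmark that wants to tell the last two cards apart has to pile on GPU work, which is exactly why it stops resembling the settings you would actually play at.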
Even UT2003 and NOLF 2 run much better than that on my computer, and they're about the most demanding DX7 games I've seen.
Do you really run them at maximum settings?
That tells me that Game Test 1 is clearly not indicative of the type of situation it's meant to portray.
I disagree about the type of situation it is meant to portray. Futuremark recommends you use their prior DX 7 level and DX 8 level (but mostly DX 7 functionality) benchmarks for better representation. You have to admit it makes some sense that they recommend this in lieu of Game 1, doesn't it? Well, if you only care about getting fps like you would in your DX 7 games from the same era.
If GT1 can't properly judge the abilities of a DX7 card running a DX7 game, how is it going to properly judge the abilities of a DX8 or DX9 card running a DX7 game?
GT1 tries to represent (not judge) the ability of the card in question when performing DX 7 functionality. DX 8 level cards, DX 9 level cards, and beyond have to be represented, and as a result many DX 7 level cards' performance will be low. It is the non-computational representation in the benchmark...as I've said, I consider it dismissible in isolation as a DX 7 functionality test, but I consider it something that makes sense as a contributing factor in the scoring.
I agree that this test does not properly judge the abilities of any card running a DX 7 game, and I don't think prior 3dmark benchmarks did either. Fortunately for 3dmark03, however, there are some other tests with which this test makes sense to be associated. I'll discuss that when I mention shader testing.
Now you might say, "who cares? there are DX7 games out to test with if you want to know how well a card works with DX7 games. 3DMark03 is supposed to compare cards with DX8 & DX9 level games."
No, it is supposed to benchmark DX 8 and 9 level cards, with DX 8 and DX 9 functionality and a bit of DX 7 functionality representation. The DX 7 functionality test is scaled to DX 8 and DX 9 performance levels, and DX 7 performance level cards suffer as a result. This is natural and logical in my view.
That's all well and good, except that Game Test 1 still counts towards the final score on DX8 and DX9 cards.
To represent a card's ability to handle a simple, non-computational workload. Since not all games will be as computationally bound as the rest of the benchmark suite, this makes sense in my opinion. Another way to view it is as a simple performance yardstick that by necessity scales from DX 7 to DX 9 and beyond. The low fps values for a GF 2 don't seem too surprising to me in this regard.
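As a rough sketch of what I mean by "contributing factor in the scoring" (the weights and fps values below are invented for illustration; I am not claiming they are Futuremark's actual coefficients): each game test's fps feeds into the overall number, so a simple, less computationally bound test like GT1 still moves the score without dominating it, and a card that can only run GT1 ends up with a very low total.

    # Hypothetical weighted combination of per-test fps into one score (Python).
    # The weights are made up for illustration, not Futuremark's real formula.
    weights = {"GT1": 7.0, "GT2": 35.0, "GT3": 45.0, "GT4": 40.0}

    def overall_score(fps_by_test):
        # Tests a card cannot run (e.g. GT2-GT4 on a DX 7 level card) contribute nothing.
        return sum(w * fps_by_test.get(test, 0.0) for test, w in weights.items())

    geforce2 = {"GT1": 4.0}   # only runs the DX 7 functionality test
    dx9_card = {"GT1": 150.0, "GT2": 30.0, "GT3": 25.0, "GT4": 20.0}
    print(overall_score(geforce2))   # 28.0  -- a tiny total, most tests contribute zero
    print(overall_score(dx9_card))   # 4025.0 -- dominated by the shader-heavy tests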
So here we have a clear example of how the final 3DMark03 score is being based in part off an irrelevant and inaccurate test.
I don't agree with the summary in the way you mean it: though this test does seem irrelevant and inaccurate for judging DX 7 game performance, that is not its purpose. For fear of being accused of playing semantics games later, I'll clarify again here that "judge" and "represent" have significantly different meanings in my discussion.
Then you might extend the argument to say that, if the DX7 test is not indicative of even the worst-case scenarios of DX7 applications, how can you trust that the DX8 and DX9 tests will be any more accurate?
That is a purely circumstantial association, where you say "this person in the family is totally unlike that person in the family, but this person is dishonest, so how can you trust the rest of the family?"
Indeed, NVIDIA's argument is that the methods of rendering scenes in the last 3 game tests are not efficient, and not what game developers are going to be doing in the future.
And there can be a whole valid discussion of this, but you are trying to circumvent it with the prior illogic. I think I've had this discussion at length elsewhere, but Dave H has a much more succinct summary somewhere around...
And why would you expect them to be? How can a company designing an application in 2002 predict what methods game developers are going to be using in 2003 and 2004? I don't think it's possible.
The road you travelled to get here is very winding. My argument, repeated very briefly, is that what has changed is that there is now a common factor for measurement that can serve as a very useful predictor, and that is shader performance. All GPUs moving forward will concentrate on executing shaders as quickly as possible, and similar to how one benchmark of fairly representative assembly instructions can show performance differences between CPUs with good representation, I think this serves to simplify a lot of the issues of predictability for 3dmark03.
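A sketch of the kind of prediction I mean (the throughput ratios and the "mostly shader-bound" assumption are mine, purely for illustration): if the workload is dominated by shader execution, relative shader throughput alone gives a first-order estimate of relative fps, much like a representative instruction mix does for CPUs.

    # If a test is mostly shader-bound, relative shader throughput is a
    # first-order predictor of relative fps. Numbers are hypothetical (Python).
    def predicted_fps(baseline_fps, throughput_ratio):
        return baseline_fps * throughput_ratio

    baseline = 20.0                       # reference card in a shader-heavy test
    print(predicted_fps(baseline, 2.5))   # card with 2.5x the shader throughput: ~50 fps
    print(predicted_fps(baseline, 0.5))   # card with half the throughput:        ~10 fps

Of course that only holds to the degree the tests really are shader-bound, which is the whole point of building the benchmark around shaders.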
Now, the efficiency of their implementation is an interesting question, discussed at length elsewhere. I don't consider the matter settled, but Futuremark has offered some credible rebuttals to nvidia's comments (I think), and we could perhaps test their validity.
And that brings up the other twist--the fact that the majority of game developers inherently try to do their best to make sure games perform acceptably on all makes of video cards.
Exactly the difference between games and apples-to-apples benchmarks...the two have to behave differently when seeking a wide range of scalability (for the DX 8 functionality tests, this results in some DX 8 level cards performing as poorly as your GF2 does in the first test).
If a developer does things the way 3DMark03 does them and finds tremendous discrepancies between how different cards from different vendors and/or different generations perform those actions, they will probably change the way they're doing things.
Games do change the way they are doing things. You act like you play all your games on a GF2 at maximum settings. 3dmark03 does adapt, but since equivalent output is the goal of a benchmark, its adaptations are to expose functionality so that performance can be measured, not to drop functionality and scale back to achieve acceptable performance.
As I've said before, I think 3dmark03 has turned into one big collection of both simple and complex synthetic tests, and that, due to the dependence on shaders, they've managed (pretty likely, in my estimation at this time) to get it right.
You can argue the extent to which things will change, or the number of developers who will ultimately make such decisions, but that doesn't change the fact that it's a relevant variable.
I don't think people should confuse 3dmark scores with fps values, and I don't think they should have in the past. That doesn't mean this new benchmark is useless, though.
...
Don't reviews have a responsibility to protect the public from themselves in this regard?
Snipped a bunch of text I agree with. Yes, they do. I think they should do this by education.
In contrast, I have a problem with, for example, recognizing that responsibility only after an article that makes the very same points, and leaves exactly the same questions unanswered, as a message an interested party has been sending around to reviewers. This is the same interested party who used to benefit from reviewers ignoring that responsibility, and whose message has gaping inconsistencies and plentiful misinformation.
Compelling arguments can and have been made for both sides, but I think ultimately history and logic point towards the decisions [H] is making with their reviewing philosophy being more or less in the right direction. And that's hard to say coming from someone who's never been particularly fond of [H] as a whole.
However I view the above, I must consistently point out that I recognize I may be wrong in my evaluation, and that the 3dmark03 score might still not be representative. I've given my reasons above and in plenty of places why I don't think I am wrong...take it for what it's worth, and I look forward to investigations that confirm things either way.