Patric Ojala/FM on benchmarking games vs benchmarking applications

This seems to be a good, well-reasoned article. But I'm still left feeling doubtful about the usefulness of synthetic benchmarks designed solely to measure raw hardware performance.

How much value is there in determining the relative performance of different pieces of hardware, if that performance can't be extrapolated to practical applications (i.e. games, in the case of consumer graphics cards)? The data in ATI's benchmarking paper, which seems to match up with what most web sites are reporting, gives an example where two products have similar 3DMark03 scores, yet one clearly outperforms the other in a long list of games. So having the highest 3DMark03 score in this case might allow you to brag about having the fastest graphics card, but in reality it's nowhere near the fastest when you are actually using it for its intended purpose.

I suspect this is why 3DMark was originally marketed as "The Gamer's Benchmark". If it isn't designed to closely resemble the rendering engine of an actual game, then why not just get rid of the game tests and stick to measures of raw power like fillrate and vertex throughput? Probably because then you'd be stuck with nothing more than theoretical numbers, which aren't all that useful to someone shopping for a product that will make their games play better.
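For what it's worth, those theoretical numbers can be computed straight from the spec sheet, which is part of why they say so little on their own. A quick sketch in Python, using made-up specs purely for illustration:

Code:
# Peak spec-sheet numbers; the figures below are invented for illustration
# and do not describe any real card.
core_clock_mhz = 400      # core clock in MHz
pixel_pipelines = 8       # pixels written per clock
vertex_units = 4          # parallel vertex engines
clocks_per_vertex = 4     # cycles each engine spends per transformed vertex

fillrate = core_clock_mhz * pixel_pipelines                      # Mpixels/s
vertex_rate = core_clock_mhz * vertex_units / clocks_per_vertex  # Mvertices/s

print(f"peak fillrate:    {fillrate} Mpixels/s")
print(f"peak vertex rate: {vertex_rate:.0f} Mvertices/s")

No game ever sustains both peaks at once, which is exactly why a shopper can't turn these figures into expected frame rates.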
 
I would think that if you used both 3DMark01 and 3DMark03, you would get results similar to what is in ATI's PDF. Isn't this what Futuremark suggests?
 
GraphixViolence said:
If it isn't designed to closely resemble the rendering engine of an actual game....

Well, the post makes it pretty clear that it IS designed to resemble the rendering engine of an actual game. It's just that it's a "future" rendering engine, one that puts much higher stresses on GPUs than today's games.

then why not just get rid of the game tests and stick to measures of raw power like fillrate and vertex throughput?

Because synthetic tests that stress one aspect of the GPU are different than synthetic tests that stress multiple, simultaneous aspects of the GPU working together.

Raw fill rate is one measurement.
Raw vertex throughput is another.

Lots of fill rate, with lots of vertex information, with pixel shaders piled on top at the same time is completely different.

Probably because then you'd be stuck with nothing more than theoretical numbers, which aren't all that useful to someone shopping for a product that will make their games play better.

Again, "one dimensional" synthetic tests have their uses too. However, the "game tests" are not one dimenstional, they are multidimensional synthetic tests, designed to stress the GPU in ways that are speculated to be similar to how future games will stress the GPU.
 
Joe DeFuria said:
Well, the post makes it pretty clear that it IS designed to resemble the rendering engine of an actual game. It's just that it's a "future" rendering engine, one that puts much higher stresses on GPUs than today's games.
Well, it's hard to see how Game Test 1 in 3DMark03 could be considered representative of future rendering engines, considering there were already many games available with more advanced graphics features than this when it was released. Now that several DX9 games have hit the market, the same thing could be argued for the other three game tests as well.

If the goal of a synthetic benchmark is to help evaluate performance in future games, then it's only relevant as long as it stays ahead of the games that are already on the market. This might have been true earlier this year, but not so much anymore.

Because synthetic tests that stress one aspect of the GPU are different than synthetic tests that stress multiple, simultaneous aspects of the GPU working together.
But again, what does such a test tell me, if the performance can't be extrapolated to real games? At least a synthetic test that isolates one aspect of the hardware tells me something interesting and allows me to draw some conclusions. It would be much harder to extract any useful information from, say, a test with 10 million polygons per frame with massive overdraw and 100+ instruction pixel shaders on every surface that runs at <1 fps on high end hardware.
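To put rough numbers on that: even before overdraw and shaders enter the picture, geometry alone caps the frame rate. A quick illustrative calculation (the triangle throughput is an assumed figure, not a measurement):

Code:
# Geometry-only ceiling on frame rate; both figures are illustrative.
polygons_per_frame = 10_000_000   # the hypothetical stress test above
triangle_rate = 300_000_000       # assumed peak triangles/s for a high-end card

print(f"geometry-limited ceiling: {triangle_rate / polygons_per_frame:.0f} fps")
# ~30 fps from geometry alone; pile massive overdraw and 100+ instruction
# shaders on top and the result drops to the <1 fps described above,
# without revealing which of those loads is the actual bottleneck.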

Again, "one dimensional" synthetic tests have their uses too. However, the "game tests" are not one dimenstional, they are multidimensional synthetic tests, designed to stress the GPU in ways that are speculated to be similar to how future games will stress the GPU.
I agree that in the absence of games with a futuristic rendering workload, synthetic benchmarks are the only tool we have to help make reliable predictions about performance on a given piece of hardware. What I'm not convinced of is the relevance of synthetic benchmarks with workloads that attempt to match today's games.
 
Regarding FRAPS repeatability, I for one don't care if it is an online game. I would like to see reviewers use 'actual play' if it is an online game.
Take BF 1942, since it is so popular: join a server and benchmark three or four maps... that is reality.

The end user then knows exactly how that card will perform in a 'real world scenario'. No, the results are not repeatable... but they are real.
 
GraphixViolence said:
This seems to be a good, well-reasoned article. But I'm still left feeling doubtful about the usefulness of synthetic benchmarks designed solely to measure raw hardware performance.

How much value is there in determining the relative performance of different pieces of hardware, if that performance can't be extrapolated to practical applications (i.e. games, in the case of consumer graphics cards)?...

First, who says it cannot be extrapolated? nVidia found the TR:AoD benchmark (in a "real" 3d game) to be so threatening to the sale of its current nV3x product line that it took steps to have the benchmark completely removed from the software (I've never seen a publisher do this to its own software before.) If the results could not be "extrapolated" to real games supporting these features, why on earth would nVidia have made such a fuss over *all* such benchmarks, based on real games or not, all year long? If they couldn't be extrapolated then nVidia would have nothing to worry about, right? What worried nVidia was not that they couldn't be extrapolated--but that they could be. Of course, that should be obvious by now.

Secondly, what possible difference does it make whether the 3d engine in a synthetic benchmark is in a "real" game? All 3d engines in "real games" are different from each other. In fact, even the performance of the "same" game engines used in different games can be much different (e.g., Q3 and Wolfenstein, or UT2K3 and U2, etc.) Is it reasonable to suggest that because you can't estimate the performance of UT2K3 by running U2, that UT2K3 is an "invalid" benchmark, or vice-versa? Would you say that Q3 is an "invalid" benchmark simply because you cannot extrapolate from it how UT2K3 will run, or vice-versa? Of course not. What you do is run them all.

That's where synthetics fit in. You run them in addition to all of the other "real" 3d games you run. They are especially of value in testing *API feature support* not yet in evidence in current 3d game engines. It seems to me that the 3dMk03 results have been more than validated by everything nVidia has done to obstruct use of the benchmark this year, and by the emerging DX9 game engines which we have seen (as well as nVidia's attempted obstruction of those along with where it did succeed in obstructing them, as in the TR:AoD benchmark.)

While it's easy to think of many things that 3dmk03 asserted this year which have been verified, I can think of nothing nVidia has asserted to the contrary which has been verified. What nVidia has actually done is to prove the worth and validity of synthetic benchmarks beyond a shadow of a doubt. Of course, that was not the company's intention when it began its obstructionist tactics, but obstructionist policy in the face of easily verifiable fact is rarely intelligent, or successful...:)
 
I'd wanted to comment in this thread, but perhaps I'd said all I wanted to say in my original email to Patric when he previously asked for feedback:

Reverend said:
Well, I had wanted to type out a novel-length commentary but I think the very brief summary of what I wanted to say is that the gist of it comes down to the responsibility and focus of a particular hardware review website.

While I can ignore the obvious promotion of Futuremark/3DMark in what you wrote (and I won't fault you for that; it's understandable to me), I think you need to touch on the topic of what a hardware review website wants to do in its reviews.

Is it to show a video card in all possible (or as much as possible) gaming scenarios, by using popular games and/or a wide suite of games?

Or is it to study a video card's technology?

One of the above is Beyond3D's main focus (can you guess which?), and at most times the two are mutually exclusive.

Which would present more truthful data? Which would present more useful data?

Should a hardware review website look at the now and ignore the future? By this, I mean use existing games/benchmarks.

Or should a hardware review website set out to prove whether an IHV's claims about its products are truthful or realistically possible to implement?

You see Patric, both reviewing approaches have their virtues. The best scenario would be to incorporate both approaches in reviews... touching on the now (existing games), as well as examining the future (tech as introduced/evangelized by IHVs about their new products).

If we have to follow both approaches, then using games as a way of benchmarking video cards is what readers can immediately appreciate and relate to. The same goes for something like 3DMark, an application that is more in line with the "technology approach" (i.e. the future, not the now).

I think it is unwise to "denounce" (for lack of a better word... it came out this way to me from what you wrote) using any game as a form of benchmark in reviews, and to only focus on "best selling" games.

A million+ purchasers of Tomb Raider AOD** is worth a lot of money, not only to the developers and publishers, but also to IHVs if the game is used as a benchmark. And that is a million reasons to use the game in reviews, yes?

We should not ignore the "minority" in favour of short reviews or attempts to use something (like 3DMark) as the all-encompassing overall representation of a piece of hardware. The more the better... the more games used the better... the more benchmark applications used the better. Sadly, time is such a huge factor in the reviewing industry, and just like many others, Beyond3D cannot spend a month reviewing a video card to produce a review that is as wide and varied as possible, which is what we all want.

I'm not sure if this is the kind of feedback you expected or wanted but it's something that has been on my mind for a long time and is what I think a large majority of the consumers do not know.

**Patric had originally included some comments specifically wrt TRAOD.
 
I think Futuremark should make benchmark plug-ins/scripts for games that don't include a benchmarking function. :rolleyes:
 
I remember rewriting quite a bit of my story after reading the feedback from Reverend, but I don't remember if I actually emailed back to him. Anyway, to offer you guys more to read, I'll comment some of Reverend's feedback here:

Reverend said:
While I can ignore the obvious promotion of Futuremark/3DMark in what you wrote (and I won't fault you for that; it's understandable to me), I think you need to touch on the topic of what a hardware review website wants to do in its reviews.

Quite true. I presented some recommendations on how to benchmark and what those results mean, but in the end the editor chooses how to review the hardware. It is also true that I am a bit partial; it is my job, after all. Then again, we make 3DMark the way we believe gives correct measurements. This means we don't promote 3DMark only because Futuremark pays our salaries; we really do believe in our product very strongly.

Is it to show a video card in all possible (or as much as possible) gaming scenarios, by using popular games and/or a wide suite of games?

Or is it to study a video card's technology?

Both of these are important and that was what I tried to bring out. The users will most likely play games with their high end hardware, therefore game benchmarks are an important part of a hw review. Then again, game benchmarks only speak for the games out now, and there is a limit to how many game benchmarks can be used in a review. Therefore a study of the video card's technology and overall performance is also important. That should give a picture of how that card manages in other games than those benchmarked in the review.

Or should a hardware review website set out to prove whether an IHV's claims about its products are truthful or realistically possible to implement?

This is a touchy one :D
I think this is very important. Each IHV wants to sell as many of their products as they can. The mission of these companies' marketing departments is to give the consumer the impression that their product is superior to the competition's. It is no easy task, but I think it is the job of the hw press to reveal whether the marketing hype of an IHV is true or not. With all the hassle after the 3DMark03 launch, people wanted to give us at Futuremark the responsibility of revealing which IHV marketing talk is true and which is not. I believe this role is not ours; we only make tools for the hw press to carry out such investigations. We naturally want to help the hw press use our products. After the end of this month, we'll put up a list on our site of which gfx drivers we consider to generate comparable 3DMark results. We already gave out the optimization guidelines; now we'll continue that work.

You see Patric, both reviewing approaches have their virtues. The best scenario would be to incorporate both approaches in reviews... touching on the now (existing games), as well as examining the future (tech as introduced/evangelized by IHVs about their new products).

I quite agree. By using game benchmarks, benchmark applications like 3DMark and feature specific benchmarks like Shadermark or Rightmark, you should be well covered.

I think it is unwise to "denounce" (for lack of a better word... it came out this way to me from what you wrote) using any game as a form of benchmark in reviews, and to only focus on "best selling" games.

A million+ purchasers of Tomb Raider AOD** is worth a lot of money, not only to the developers and publishers, but also to IHVs if the game is used as a benchmark. And that is a million reasons to use the game in reviews, yes?

That TRAOD comment I removed, because it is true that a game benchmark should be used if the game is played widely enough. My initial impression was that TRAOD sold badly, since the reviews of that game were very poor. Still, the TR brand obviously guarantees good sales, so indeed, that benchmark is valid. Despite all the bad reviews, that game is played a lot, and it is therefore relevant how fast the gfx hw runs it.

We should not ignore the "minority" in favour of short reviews or attempts to use something (like 3DMark) as the all-encompassing overall representation of a piece of hardware. The more the better... the more games used the better... the more benchmark applications used the better. Sadly, time is such a huge factor in the reviewing industry, and just like many others, Beyond3D cannot spend a month reviewing a video card to produce a review that is as wide and varied as possible, which is what we all want.

You should use as large a variety of game benchmarks as possible; I agree with that. On the other hand, I do not think a game benchmark should be used just because it is new or "it is DX9" or something. Since game benchmarks mostly just measure the performance of that particular game, there is no reason to benchmark using a game hardly anybody will play. Each game benchmark scales a bit differently, and therefore the game benchmark results that are valid to the readers of the reviews are those of games they will play themselves; in other words, those that sell well. The results of game benchmarks are very much affected by the various code paths implemented for different hardware, and thereby do not represent generic hardware performance. If you can use a TON of game benchmark results, you can experiment by adding results from less popular games, or upcoming games that will most likely sell less. But since the resources of the reviewer are limited, the more relevant benchmarks should be used first.

FRAPS could widen the selection of usable benchmarks among games that sell well, but so far at least I don't know how to get genuinely repeatable results with it. But as somebody said, play BF with a bunch of guys online and measure the performance. I can't think of anything more real world than that.
 
Doomtrooper said:
Regarding FRAPS repeatability, I for one don't care if it is an online game. I would like to see reviewers use 'actual play' if it is an online game.
Take BF 1942, since it is so popular: join a server and benchmark three or four maps... that is reality.

The end user then knows exactly how that card will perform in a 'real world scenario'. No, the results are not repeatable... but they are real.
I mostly agree with it, but IMO there are two requirements: the map sequence has to be identical (otherwise the influence of the maps, hence the systematic error, would be too big) and the gameplay has to be very dynamic (no camping, etc.).

Maybe someone could perform a comprehensive test of several 10 minute runs using the same maps, and derive mean and standard deviation from it. That way you could simply statistically "prove" how useful the result from one 10 minute run would be.
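A minimal sketch of that calculation in Python, with placeholder run averages standing in for real FRAPS results:

Code:
import statistics

# Average fps from several 10-minute runs at identical settings.
# These values are placeholders, not real measurements.
run_averages = [53.1, 55.2, 54.5, 53.8, 57.9, 54.5]

mean = statistics.mean(run_averages)
stdev = statistics.stdev(run_averages)  # sample standard deviation

print(f"mean: {mean:.1f} fps, stdev: {stdev:.1f} fps")
# If runs scatter roughly normally, a single run lands within about two
# standard deviations of the mean ~95% of the time, so a small stdev means
# one 10-minute run is already a fairly trustworthy number.
print(f"a single run should usually fall in "
      f"{mean - 2 * stdev:.1f}..{mean + 2 * stdev:.1f} fps")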
 
Well, I've agreed with a lot of this over time (trip through the Wayback Machine), but I disagree that Futuremark's "recipe" of GPU stress and control is uniquely suited for comparing graphics hardware independent of representing gameplay.

Games are precluded from being useful for this when an equivalent workload is absent or misrepresented, as with the unique per-hardware code paths mentioned above and the inability to specify a universal one.

However, if a universal code path can be specified, then game benchmarking can be as potentially useful as 3dmark for comparing hardware performance (at least down to the level of abstraction for which it can be made universal), by virtue of adding more unique workloads ("datapoints") for evaluation of hardware performance.

This seems to me to be regardless of how many people play the game, and dependent on the merit of the workloads...just like 3dmark should be evaluated.

I am, however, glad that Futuremark is focusing more effort on preserving that, along lines that make sense to me (listing drivers that give results useful for comparison, etc.), as this seems to indicate a recognition that this is what is required to distinguish 3dmark's usefulness along with the above.
 
WaltC said:
First, who says it cannot be extrapolated? nVidia found the TR:AoD benchmark (in a "real" 3d game) to be so threatening to the sale of its current nV3x product line that it took steps to have the benchmark completely removed from the software (I've never seen a publisher do this to its own software before.) If the results could not be "extrapolated" to real games supporting these features, why on earth would nVidia have made such a fuss over *all* such benchmarks, based on real games or not, all year long? If they couldn't be extrapolated then nVidia would have nothing to worry about, right? What worried nVidia was not that they couldn't be extrapolated--but that they could be. Of course, that should be obvious by now.
I'm not sure how you managed to relate Nvidia's position on TRAoD into my comments on 3DMark03. In any case, I'd be surprised if Nvidia's motivation for attempting to discredit these benchmarks was based on anything more than a desire to avoid negative press for their products.

Secondly, what possible difference does it make whether the 3d engine in a synthetic benchmark is in a "real" game? All 3d engines in "real games" are different from each other. In fact, even the performance of the "same" game engines used in different games can be much different (e.g., Q3 and Wolfenstein, or UT2K3 and U2, etc.) Is it reasonable to suggest that because you can't estimate the performance of UT2K3 by running U2, that UT2K3 is an "invalid" benchmark, or vice-versa? Would you say that Q3 is an "invalid" benchmark simply because you cannot extrapolate from it how UT2K3 will run, or vice-versa? Of course not. What you do is run them all.
I think the point you're trying to make is that every game behaves differently, so how can you extrapolate. The answer, of course, is that you have to benchmark as wide a variety of games as possible. That still doesn't address the issue of why it would be advantageous to include 3DMark03 in that list of games, considering that it cannot be played.

That's where synthetics fit in. You run them in addition to all of the other "real" 3d games you run. They are especially of value in testing *API feature support* not yet in evidence in current 3d game engines.
Agreed. But as I pointed out, now that all of the API features used in 3DMark03 are also available in real games, it's hard to see the relevance.

It seems to me that the 3dMk03 results have been more than validated by everything nVidia has done to obstruct use of the benchmark this year, and by the emerging DX9 game engines which we have seen (as well as nVidia's attempted obstruction of those along with where it did succeed in obstructing them, as in the TR:AoD benchmark.)
Once again, I think Nvidia's actions were primarily driven by marketing/PR concerns, rather than genuine concern about the validity of the benchmark.
 
GraphixViolence said:
I'm not sure how you managed to relate Nvidia's position on TRAoD into my comments on 3DMark03. In any case, I'd be surprised if Nvidia's motivation for attempting to discredit these benchmarks was based on anything more than a desire to avoid negative press for their products.

You talked about "synthetic benchmarks" in the quote, and a company doesn't need to worry about "negative press" for its products from benchmarks unless there's something negative about those products which the benchmarks reveal, right?

The point was that the 3dMk03 synthetic results extrapolated nicely into the TR:AoD game, which included a benchmark of a real game that also supported 3dMk03's conclusions.

I think the point you're trying to make is that every game behaves differently, so how can you extrapolate. The answer, of course, is that you have to benchmark as wide a variety of games as possible. That still doesn't address the issue of why it would be advantageous to include 3DMark03 in that list of games, considering that it cannot be played.

No, the point was that every 3d game has a different game engine, and that even with the same engine, different games do not perform identically. 3dMk03 has a 3d engine, just like any game you'd care to talk about. But it's not a game--it's a benchmark...:) Benchmarks have been around for a long time and they aren't going anywhere. The point is that simply because you cannot use 3dMK03 or UT2K3 to extrapolate how Q3 will run, that does not invalidate UT2K3 as a game or 3dMk03 as a benchmark.

Agreed. But as I pointed out, now that all of the API features used in 3DMark03 are also available in real games, it's hard to see the relevance.

Its relevance, of course, is proven by the fact that the 3dMk03 results are underscored by the API feature support in real 3d games, which means that the benchmark has much more in common with 3d games than it doesn't. What would have hurt the legitimacy of 3dMk03 would have been 3d games emerging whose DX9 feature support did not mirror DX9 feature support performance in the benchmark. Emerging DX9 games simply add to the benchmark's credibility, as opposed to rendering it useless.


Once again, I think Nvidia's actions were primarily driven by marketing/PR concerns, rather than genuine concern about the validity of the benchmark.

I think this is illogical: if the benchmark had not been valid, there would have been no need to oppose it, since emerging DX9 3d games would have proven nVidia's point. However, since these games are proving 3dMk03's point instead, and not nVidia's, it becomes obvious that nVidia's concern was over the negative press it would receive by having the real weaknesses of its products exposed.
 
Xmas said:
Maybe someone could perform a comprehensive test of several 10 minute runs using the same maps, and derive mean and standard deviation from it. That way you could simply statistically "prove" how useful the result from one 10 minute run would be.
Someone already did...
 
Xmas said:
Doomtrooper said:
Regarding FRAPS repeatability, I for one don't care if it is an online game. I would like to see reviewers use 'actual play' if it is an online game.
Take BF 1942, since it is so popular: join a server and benchmark three or four maps... that is reality.

The end user then knows exactly how that card will perform in a 'real world scenario'. No, the results are not repeatable... but they are real.
I mostly agree with it, but IMO there are two requirements: the map sequence has to be identical (otherwise the influence of the maps, hence the systematic error, would be too big) and the gameplay has to be very dynamic (no camping, etc.).

Maybe someone could perform a comprehensive test of several 10 minute runs using the same maps, and derive mean and standard deviation from it. That way you could simply statistically "prove" how useful the result from one 10 minute run would be.

There are results from extensive multi-player testing I did with BF1942 on a Radeon 9800 Pro back in August. My goal was to determine which settings were "playable" on the Berlin map. Most of the settings below are "playable", but I was more competitive when the minimum was closer to 40fps.

Note that I did not disable frame rate logging in FRAPS while waiting to respawn :)


BERLIN MAP


1920x1440 - NO AA - NO AF

2003-08-18 20:02:59 - BF1942
Frames: 11288 - Time: 212686ms - Avg: 53.073 - Min: 32 - Max: 79

2003-08-18 20:06:35 - BF1942
Frames: 5348 - Time: 96919ms - Avg: 55.180 - Min: 41 - Max: 85


1600x1200 - 0X AA - 8X AF

2003-08-17 08:42:24 - bf1942
Frames: 21753 - Time: 398934ms - Avg: 54.527 - Min: 30 - Max: 148

2003-08-19 20:58:37 - BF1942
Frames: 12491 - Time: 232224ms - Avg: 53.788 - Min: 30 - Max: 239

2003-08-19 21:02:55 - BF1942
Frames: 15252 - Time: 263378ms - Avg: 57.909 - Min: 26 - Max: 195

2003-08-19 21:07:30 - BF1942
Frames: 13203 - Time: 242188ms - Avg: 54.515 - Min: 29 - Max: 195


1024x768 - 4X AA - 8X AF

2003-08-17 22:35:09 - bf1942
Frames: 18939 - Time: 234837ms - Avg: 80.647 - Min: 59 - Max: 139

2003-08-17 22:45:48 - bf1942
Frames: 40550 - Time: 469435ms - Avg: 86.380 - Min: 46 - Max: 159

2003-08-17 22:53:39 - bf1942
Frames: 57539 - Time: 763878ms - Avg: 75.324 - Min: 46 - Max: 137

2003-08-17 23:06:27 - bf1942
Frames: 19883 - Time: 230131ms - Avg: 86.398 - Min: 44 - Max: 158

2003-08-17 23:10:19 - bf1942
Frames: 18639 - Time: 199196ms - Avg: 93.571 - Min: 61 - Max: 202


1280x960 - 4X AA - 8X AF

2003-08-18 17:58:57 - bf1942
Frames: 19273 - Time: 378804ms - Avg: 50.878 - Min: 27 - Max: 99

2003-08-18 18:08:50 - bf1942
Frames: 26680 - Time: 442095ms - Avg: 60.349 - Min: 31 - Max: 104

2003-08-18 18:21:02 - bf1942
Frames: 35722 - Time: 601385ms - Avg: 59.399 - Min: 37 - Max: 108

2003-08-18 18:33:04 - bf1942
Frames: 23757 - Time: 421046ms - Avg: 56.423 - Min: 35 - Max: 110

2003-08-18 18:40:07 - bf1942
Frames: 25522 - Time: 429708ms - Avg: 59.393 - Min: 31 - Max: 115


1024x768 - 6X AA - 8X AF

2003-08-18 19:04:45 - bf1942
Frames: 17834 - Time: 306240ms - Avg: 58.235 - Min: 35 - Max: 100

2003-08-18 19:31:09 - bf1942
Frames: 20105 - Time: 299831ms - Avg: 67.054 - Min: 39 - Max: 100

2003-08-18 19:45:03 - bf1942
Frames: 25507 - Time: 370713ms - Avg: 68.805 - Min: 26 - Max: 140

2003-08-18 19:51:46 - bf1942
Frames: 15459 - Time: 226276ms - Avg: 68.319 - Min: 32 - Max: 123


ABERDEEN MAP

1024x768 - 6X AA - 8X AF

2003-08-19 22:42:44 - bf1942
Frames: 37272 - Time: 275166ms - Avg: 135.452 - Min: 58 - Max: 231



WAKE ISLAND MAP

1024x768 - 6X AA - 8X AF

2003-08-19 22:49:19 - bf1942
Frames: 90980 - Time: 724852ms - Avg: 125.515 - Min: 64 - Max: 289

2003-08-19 23:01:28 - bf1942
Frames: 75990 - Time: 805187ms - Avg: 94.375 - Min: 28 - Max: 268

2003-08-19 23:14:56 - bf1942
Frames: 20644 - Time: 213637ms - Avg: 96.631 - Min: 56 - Max: 174
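For anyone who wants to run the mean/standard deviation exercise Xmas suggested over logs like these, here is a small Python sketch that parses the FRAPS summary lines above. The pattern is inferred from the format shown, not from any official FRAPS specification:

Code:
import re
import statistics

# Matches FRAPS summary lines like:
# "Frames: 11288 - Time: 212686ms - Avg: 53.073 - Min: 32 - Max: 79"
LINE = re.compile(
    r"Frames:\s*(\d+)\s*-\s*Time:\s*(\d+)ms\s*-\s*"
    r"Avg:\s*([\d.]+)\s*-\s*Min:\s*(\d+)\s*-\s*Max:\s*(\d+)"
)

def aggregate(log_text):
    """Summarize all runs found in a FRAPS log excerpt."""
    matches = list(LINE.finditer(log_text))
    avgs = [float(m.group(3)) for m in matches]
    mins = [int(m.group(4)) for m in matches]
    return {
        "runs": len(avgs),
        "mean_avg": round(statistics.mean(avgs), 1),
        "stdev_avg": round(statistics.stdev(avgs), 1) if len(avgs) > 1 else 0.0,
        "worst_min": min(mins),
    }

# Example with the two 1920x1440 Berlin runs from above:
sample = """
Frames: 11288 - Time: 212686ms - Avg: 53.073 - Min: 32 - Max: 79
Frames: 5348 - Time: 96919ms - Avg: 55.180 - Min: 41 - Max: 85
"""
print(aggregate(sample))  # {'runs': 2, 'mean_avg': 54.1, 'stdev_avg': 1.5, 'worst_min': 32}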
 
I guess I just missed this

Didn't realize they'd offered a good improvement for Pro users: a refrast database. It would be a pretty significant step forward for general 3DMark03 usage if downloading and comparison were automated as part of 3DMark03.

Seems relevant, and maybe I'm not the only one who overlooked/forgot it.
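For context, the value of such a database would be automated image comparison: render a frame, fetch the reference rasterizer's image of the same frame, and flag deviations. A hypothetical sketch of the comparison step in Python with Pillow; the file names, tolerance, and workflow are all assumptions, not Futuremark's actual mechanism:

Code:
from PIL import Image, ImageChops

def frame_matches_reference(capture_path, reference_path, tolerance=8.0):
    """Return True if a captured frame stays close to the refrast image.

    tolerance is the allowed mean absolute per-channel error on a 0-255
    scale; the default here is an invented threshold for illustration.
    """
    captured = Image.open(capture_path).convert("RGB")
    reference = Image.open(reference_path).convert("RGB")
    diff = ImageChops.difference(captured, reference)

    # histogram() returns 256 buckets per RGB channel; weight each bucket
    # by its error value to total the absolute error over the frame.
    buckets = diff.histogram()
    total_error = sum(value * count
                      for value, count in zip(list(range(256)) * 3, buckets))
    samples = captured.size[0] * captured.size[1] * 3
    return total_error / samples <= tolerance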

EDIT: :oops:
 
WaltC said:
You talked about "synthetic benchmarks" in the quote, and a company doesn't need to worry about "negative press" for its products from benchmarks unless there's something negative about those products which the benchmarks reveal, right?

The point was that the 3dMk03 synthetic results extrapolated nicely into the TR:AoD game, which included a benchmark of a real game that also supported 3dMk03's conclusions.
I can only assume you are referring to results with Nvidia's "pre-cheat" drivers here, because with their currently available drivers they are apparently outperforming ATI in 3DMark03 but are well behind in TRAoD. I guess there's not much more to say about this until Futuremark releases their approved drivers list.

The point is that simply because you cannot use 3dMK03 or UT2K3 to extrapolate how Q3 will run, that does not invalidate UT2K3 as a game or 3dMk03 as a benchmark.
If a UT2k3 benchmark tells you how fast a system will run UT2k3, then it is providing useful information whether or not that performance can be extrapolated to other games. On the other hand, if a 3DMark03 benchmark only tells you how fast a system will run 3DMark03, which is not a game, then that information isn't really useful unless it can be extrapolated to other games.

Its relevance, of course, is proven by the fact that the 3dMk03 results are underscored by the API feature support in real 3d games, which means that the benchmark has much more in common with 3d games than it doesn't. What would have hurt the legitimacy of 3dMk03 would have been 3d games emerging whose DX9 feature support did not mirror DX9 feature support performance in the benchmark. Emerging DX9 games simply add to the benchmark's credibility, as opposed to rendering it useless.
How do you know that the DX9 features in 3DMark03 are the same ones implemented in the same way in current DX9 games? I'd actually be quite surprised if this were true, given how flexible DX9 is.

I think this is illogical: if the benchmark had not been valid, there would have been no need to oppose it, since emerging DX9 3d games would have proven nVidia's point. However, since these games are proving 3dMk03's point instead, and not nVidia's, it becomes obvious that nVidia's concern was over the negative press it would receive by having the real weaknesses of its products exposed.
Once again, I'm referring to results which show NV being on par with ATI in 3DMark03 but behind in DX9 games (with their latest driver releases), while you are referring to results which show ATI ahead in both (with Det44.03 and earlier). Until Futuremark officially rules results from Nvidia's newer drivers as being invalid, these are what people are seeing and will continue to see.
 