Hi guys,
With so many different debates concerning benchmarking going on at the same time, it's hard (or next to impossible) to answer them all extensively. Having now read more of the B3D forums than ever before (thanks for giving me a nudge, worm!), I do think most of the points have already been more or less evaluated, but I thought I'd join the conversation regardless.
Here are some of the points of view that we (Futuremark) see as important:
- Performance measurement is a crucial part of the PC industry. Manufacturers spend a lot of time, resources and money on how new technologies are designed, and measuring the end result is the final step before a customer purchases hardware. From a manufacturer's point of view the important part is how hardware is being bought and sold.
- Any application can be used as a benchmark. The applications that are widely used will become important for the previously mentioned purchasing process. Most manufacturers will do whatever they can to ensure their hardware is shown in a favorable light with at least a dozen of the most important applications like this. As benchmark developers we get constant pressure from all manufacturers, and in an evenly applied environment that's a good thing. Please note that many game developers also live with this pressure. (I hope that any game developers here in the forums might actually enlighten the rest of us with one or two examples?)
- Cheating is possible with any application. With games there are a number of things you can do which will improve your performance profile (dropping frames, funny stuff with AA or filtering, texture resolutions, etc.) without improving the end user experience, or even while actually degrading it. While the argument that games should be used as benchmarks does have a lot of merit, the conclusion that synthetic benchmarks work against the industry is false. Synthetic benchmarks like 3DMark03 make manufacturers do the same optimisation work, not only for the application but also for the DirectX API. There are currently few games on the market that aggressively use DX8 features. 3DMark03 is the first tool in a row of many that will enable manufacturers to work with these kinds of issues. In the end both developers and gamers benefit if manufacturers optimise for the APIs, because all developers get the same benefits from this.
- Our aim in developing benchmarks (not just 3DMark, but all benchmarks that we design) is to ensure that objective apples-to-apples comparisons are possible. In order to do this we give all manufacturers an equal opportunity to be involved in our development through beta program cooperation. We also disclose all technical information about our benchmarks to the public. How many other developers do the same? We have policies in place with regard to what DirectX features we adopt, when we adopt them, and so on. Finally, we work very closely with Microsoft to ensure that we are measuring DX in a proper manner. Summing this up, we realize that developing benchmarks is a significant responsibility and do our best to meet this absolute requirement for objectivity.
- We began developing benchmarks back in 1997, when the industry was still very confused about what was fast 3D hardware and what was not. Back then the only thing that we had for performance metrics was games. The confusion was due to the fact that each and every manufacturer was always able to show a game that performed fast with their hardware or, if everything else failed, they'd show a game that wasn't even published yet and claim something about future performance.
I believe most of the readers here in the B3D forums were already involved back in 1995 to 1997 and remember what kind of jungle the industry was back then. With such a confused market, selling hardware was dictated almost solely by the size of your marketing budget and the skill of your PR department.
When benchmarks emerged from Ziff-Davis and us, the net result was that everyone sold more hardware. This was not necessarily because they received good benchmark scores (all manufacturers offer both high end and low end products) but because consumers felt safer with their purchase decisions. When in doubt one tends to postpone a purchase, and confusion hurts everyone in the industry.
- Frankly, after last week it feels like we're going back to 1997. Games as benchmarks certainly have a lot of positive aspects, but the scary part is that then every manufacturer can highlight a game that they do well with. Looking at the examples so far I'm on the verge of despair. While people demand an amazing level of technical disclosure from the benchmarks, I've seen very few technical breakdowns about _any_ of the games that are 'commonly accepted' as solid benchmarks. Additionally, there does not seem to be any logical or at least openly disclosed way to explain why one game is used as a benchmark while another is not (Serious Sam and Comanche 4 being good examples: while both are excellent games, neither really qualifies as a triple A title, nor to my knowledge has either publicly disclosed any technical benchmark information).
- Deliberately creating confusion in the marketplace is a very short-term focused goal which is not beneficial to anyone. It first and foremost will lead to unhappy customers who will move over from the PC world to PS2 or somewhere where they have a warm and happy feeling about how to spend their money.
- Futuremark as a company is dedicated to creating objective and technically accurate benchmark software that is publicly available with technical disclosures. Solid and time-tested standards for performance measurement are an integral part of the computer industry and in part ensure that computer hardware manufacturers focus more on their actual engineering budgets instead of relying simply on their marketing budgets.
Cheers,
AJ
P.S. Given the level of criticism over the last week I’ll have to pre-empt a few follow-up questions:
Yes, games are good as benchmarks and we support their use. They, as well as other tools, may be used to mislead, so demand public technical disclosures about them also (incidentally, John Carmack does this very well in his plan files).
No, 3DMark03 did not invalidate 3DMark2001. They are different tools that should be used for measuring different generations of hardware. 3DMark99 and 3DMark2000 have now been retired as legacy products.