A look at benchmarking

Nite_Hawk

A lot of people have been talking lately about the 3DMark situation, and it's gotten me thinking about benchmarking. What is the purpose of benchmarks? What do they help us do or realize? How can we best use them? I've gone back and forth on the issue a couple of times, but I think I've come to a couple of conclusions.

First, it's important to realize that a video card (or any piece of hardware) is built on the concept of abstractions. The hardware is abstracted by the video drivers, which may be abstracted by an API or shading language (HLSL, OpenGL, DirectX, etc.), which is finally abstracted by the game engine. This is important, because at each abstraction layer we can measure performance, and the performance at each layer is going to be important to different people.

Let's begin by talking about the hardware itself. What is the architecture of the card, and what information can we gather about the card that will tell us about how it *should* perform? This is important because it gives you a lot of good general information about the card. If you look deeper though, it's going to be most useful for the people who write the drivers for the card. It tells them how they should go about writing drivers that expose the greatest potential of the hardware.

Next, let's talk about the drivers. At least for Windows users, developers are usually limited to using the drivers released by the card manufacturers. Thus, it's important for the drivers to expose as much of the potential of the hardware as possible. Determining how well the drivers expose the hardware is going to be most important to developers, as it will affect how they go about optimizing their code.

The next layer is somewhat similar to the layer above. HLSL, or any other language that abstracts the developer away from the hardware, is only going to be as good as its compiler. If the HLSL compiler does a really good job at compiling optimized code, it may run faster than what the developer could write in assembly. Again, this is going to affect how developers optimize their code.

The final layer is that of the game engines themselves. At this layer, the speed at which the game runs is dependent on the programmers that have written the engine. Have they used optimization tricks for the card to improve performance, or only met the baseline specs to reduce complexity and support the widest range of cards? How the game runs is going to be really important to the end user, because gamers are the ones that eventually end up buying the cards.

Given the above analysis, it seems that performance at different layers is going to be important to everyone in some sense, but each layer is going to be most important to a different kind of person. Still, what should we, as consumers of the card, be most concerned about? Is the current gaming experience the only important thing, or should we also look at the potential of the card and what developers could do with it in the future?

This of course is going to be different depending on the individual, but I think that most can agree that to a certain extent at least, both are important. First, the architecture of the card is important. It helps give a good impression of what the card is capable of with good drivers. It's important though, not to let it overly distract you. Specs can be misleading.

Next, the drivers and compiler optimizations for a given card are important, because they're what developers are limited by when they make a game. Highly optimized demos and benchmarks can be useful, because they help us see what kind of potential is available to the developer: what a developer could do if they had the inclination. Because developers are interested in making their games perform well, this should be an indication of how future games will likely perform.

Finally, the performance in games is going to be important because it shows the card in a variety of situations. Some games will be optimized for it, others won't be at all. It should help give a good overall impression of what the card is capable of given the amount of work developers are willing to do to support it. An example might be a card that is extremely fast, but only when optimized for. If no one is willing to take the time to do the optimizations, it might not be a good buy even though it has a lot of potential. On the other hand, if developers are willing to support it, it might be the best thing since sliced bread.

In conclusion, I'd like to say that I think analyzing the performance of a card at multiple layers is important. It can tell us the potential of a card, and also how the card is being supported by developers. I personally tend to think that *any* benchmark is useful to a certain extent as long as you know exactly what it is testing. Thanks for reading all the way through this if you've actually reached the bottom here. :)

Nite_Hawk
 
This may sound harsh - but my opinion is that many people who run these so-called benchmarks do not understand the meaning behind the word. Successful performance measurement tests/benchmarks need to give detailed descriptions of requirements and objectives, not to mention planning activities, preparing resources, executing tests and reporting results.

None of which can be done without a full understanding of what you are benchmarking. Bottom line, in my opinion, is that there is no real "defined" graphics benchmark. It seems that 3DMark is trying - but it's still not an industry standard. Then everyone else uses a set series of games...

Maybe this is what is lacking - a standard.
 
saf1:

That brings up some interesting points. When benchmarking, what exactly are you benchmarking, and what do the scores you get mean? This is a big pet peeve of mine. While the benchmark itself might yield useful data, a lot of times it's interpreted incorrectly, or presented in such a way that it's useless. We really need standard deviations and histograms to tell us what the heck is going on, rather than just average numbers (especially in the case of fps). I never really understood why so many reviewers don't at least mention this when doing reviews. Perhaps it's simply impossible (or hard to do) at this point, but I think it's really necessary for us to make worthwhile comparisons in a lot of cases.
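
To make the point about averages concrete, here's a rough Python sketch. It assumes you have a list of fps samples from a run (say, one reading per second); the function name `summarize_fps`, the bucket width, and the sample numbers are all made up for illustration and aren't from any real benchmark.

```python
import statistics
from collections import Counter

def summarize_fps(fps_samples, bucket_width=10):
    """Return mean, standard deviation, minimum, and a histogram of fps samples."""
    # Bucket each sample by the lower edge of its fps range (e.g. 80-89 -> 80).
    histogram = Counter(int(f // bucket_width) * bucket_width for f in fps_samples)
    return {
        "mean": statistics.mean(fps_samples),
        "stdev": statistics.stdev(fps_samples),  # sample standard deviation
        "min": min(fps_samples),
        "histogram": dict(sorted(histogram.items())),
    }

# Two hypothetical runs with the same average but very different behaviour.
steady = [50, 50, 50, 50]
spiky = [80, 80, 20, 20]

for name, run in (("steady", steady), ("spiky", spiky)):
    print(name, summarize_fps(run))
```

Both runs average 50 fps, but the second one spends half its time at 20 fps. The standard deviation and the histogram show that right away, while the average alone hides it.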

Nite_Hawk
 