Revisiting the synthetics discussion
Brent said:
Althornin said:
Brent, i understand the focus of your site.
What i don't understand is how you can shrug off the predictive benefits of synthetic benchmarks - they can predict how future games will run, and most gamers want to know how their card will handle future games as well!
Please, don't ignore this, or consider it a bash, i just don't get it. I'd like some evidence that synthetic benchmarks fail to predict future game performance (which is gonna be hard to come by, because all the shadermark/3dmark tests let us predict performance in games like TR:AOD just fine...) - if that's not your reason, then what is?
Apparently you think gamers cannot understand how to predict future performance? Or what?
Well, you see, while synthetics may be able to tell you which card is faster at whatever PS version etc.. That doesn't necessarily mean which card will be faster in an actual game.
But it does mean which hardware is faster at a particular workload. A good synthetic test gives you detailed information on what the workloads are.
Replicating that level of specific information and isolation is indeed just as possible for a game, if you go out and do the work and analysis to get the same information synthetics give you. Are you going to be doing this?
Developers and drivers from IHVs do use valid optimizations to increase performance.
They use, as some examples, optimizations that do the same workload differently (like different AA and AF methods can be done by hardware), and optimizations that replace a given workload with something else to misrepresent fps (like "on the rail" clipping plane hand tuning for time demos).
Valid and invalid are determined by how these optimizations are represented, and what they deliver to the user. The latter, for example, is represented as "in game" experience, but actually only delivers faster timedemos/fixed playback.
What your discussion actually confuses me about is whether you are reviewing hardware or games, and I'll explain.
Sure, the synthetic may tell you the raw power, but it doesn't always work out to mean the same thing when it comes time to test a game.
As far as the game is using the same thing the "synthetic" tests, yes it does. That's why the information it works to give you puts synthetics at an advantage.
Let's discuss your hypothetical situation.
Here is a hypothetical example, Tomb Raider AOD uses PS 2.0. The 9600XT in PS 2.0 according to synthetics vs. the 5700U shows a big difference.
Yes it does.
However, in the game the difference of the minimum FPS of a recorded demo in a level is only on the order of 4 FPS in difference.
What is the relevance of specifying minimum FPS? Minimum FPS is representative of a specific game experience on a system at one point in time...what about the other points in time? You're introducing that in an arbitrary context.
A synthetic doesn't begin to pretend to define what the minimum fps will be in a given game, so why evaluate it based on that? There is no "minimum FPS for games" synthetic, is there? I'm aware of pixel shader, fillrate, vertex shader, CPU, and AGP bus synthetics though...they apply to minimum fps when the associated factor(s) do.
What it sounds like (assuming the 4 fps difference is proportionately insignificant, occurs at similar times, etc) is that either the system (CPU, AGP, hard disk, etc), or some other GPU factor besides PS 2.0 shading, is the determining factor when this hypothetical minimum occurs.
How does that make the times when the fps difference is more than "just" 4 fps any less significant?
Synthetics don't claim to represent every second of performance, so why do you hypothetically test them by that criterion and dismiss them entirely with regard to "game experience" when they fail it?
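One way to picture the point above is a toy bottleneck model, sketched here in Python. Everything in it is a made-up assumption for illustration: the per-frame stage times, the card names `card_a` and `card_b`, and the simplification that a frame takes as long as its slowest stage. It is not measured data from any real card or game.

```python
def fps(stage_ms):
    """Frames per second if the slowest stage alone sets the frame time (a simplification)."""
    return 1000.0 / max(stage_ms.values())

# Hypothetical per-frame costs in milliseconds; not measurements.
# card_a shades twice as fast as card_b; the CPU cost is identical for both.
shader_heavy = {
    "card_a": {"cpu": 10.0, "shader": 20.0},
    "card_b": {"cpu": 10.0, "shader": 40.0},
}
cpu_heavy = {
    "card_a": {"cpu": 50.0, "shader": 20.0},
    "card_b": {"cpu": 50.0, "shader": 40.0},
}

def fps_gap(frame):
    """Absolute fps difference between the two cards on one frame."""
    return fps(frame["card_a"]) - fps(frame["card_b"])

print(fps_gap(shader_heavy))  # shader-limited frame: 50 fps vs 25 fps
print(fps_gap(cpu_heavy))     # CPU-limited frame: both cards at 20 fps
```

In this toy model the minimum fps lands on the CPU-limited frame, where the two cards tie, even though one of them shades twice as fast on every other frame.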
The percentage of the difference in the FPS in the game is less than the percentage of the difference in the synthetic tests.
What "games", doing what? Which "synthetic tests"? I guarantee you that there are situations where "the percentage of the difference in the FPS in the game a user will play is less than the percentage of the difference in <any game chosen instead of a synthetic>", such that choosing a game is at least as much of a failure of your criteria as a synthetic.
Except...that a good synthetic will set out to inform you of when this discrepancy will happen and why, and how this applies to the hardware. Also, if you don't dismiss it out of hand, it will tell you when what it measures will actually be indicative of gaming performance.
It seems clear that a synthetic benchmark will be indicative when similar things are the bottleneck. I.e., a DX 9 benchmark should be indicative of the features unique to DX 9, like pixel shader and vertex shader 2.0 models at the current time.
3dmark 03, as one example, does demonstrate success in this, when it is actually allowed to use (as one example) its DX 9 workload, and when compared to games stressing (in correlation) DX 9 workloads.
Not in predicting fps, but in indicating specifically how hardware compares in executing such workloads.
The problem that can lead to error here is when you fail to be informed of when either of these parties isn't doing what you think it does, such as DX9 features not actually being stressed.
But this applies to both "synthetics" and "'real' games". But it is with synthetics that you are likely to have effort being put forth by the authors on correcting that error, because such error defeats their purpose...this is not true with games.
The distinction here is not important for reviewing games, but it is crucial for reviewing hardware and the drivers that expose its features and performance.
Now, you could indeed try and duplicate all the detail of information of synthetics with a game, and isolate the specific aspects of performance involved.
Are you going to do this anew for every game? Are game authors going to always set aside time to help you with extracting information? If they do, it just became a synthetic benchmark...it is that isolation that is the distinction.
So according to the synthetic tests the 9600XT should be a lot faster in TRAOD,
When comparing PS 2.0 shaders being executed, and equivalent features being offered, yes.
but in actual reality the gameplay performance isn't that much faster.
Are you still just comparing the minimum fps?
Are you comparing PS 2.0 shaders?
See what I'm getting at, the synthetics will tell you the raw performance differences, but in a game it doesn't always parallel with the synthetic tests, for whatever reasons.
Sure, but why do you ignore the times it does parallel? That is why synthetics are not proposed by anyone (well, that I've encountered recently, AFAIK) to replace measuring game performance, but to be an information source used to get more information from game performance measurements.
For example, in your hypothetical situation, it might be telling you other things: pixel shader 2.0 is not the limiting factor in the game in that situation; that one of the cards might not be doing the same pixel shader workload; that one of the cards might be removing other limiting factors and you don't know about it. Synthetic pixel shader 2.0 (and vertex shader 2.0, and CPU speed, and bandwidth, and fillrate...etc) tests will help you find out which is the case, or if something else might be occurring.
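A rough Python sketch of that cross-check, assuming you have a speed ratio between two cards from a synthetic test and the same ratio from a game run. The function name `explain_gap`, the 0.8 tolerance, and the sample numbers are all hypothetical. The returned list only repeats the possibilities named above; it does not decide between them.

```python
def explain_gap(synthetic_ratio, game_ratio, tolerance=0.8):
    """Compare card A / card B speed ratios from a synthetic test and a game.

    Returns an empty list when the game roughly tracks the synthetic,
    otherwise the candidate explanations worth investigating.
    The tolerance is an arbitrary illustrative threshold.
    """
    if game_ratio >= synthetic_ratio * tolerance:
        return []
    return [
        "the tested feature is not the limiting factor in this scene",
        "one card may not be running the same workload",
        "one card may be removing other limiting factors",
    ]

print(explain_gap(2.0, 1.9))  # game tracks the synthetic
print(explain_gap(2.0, 1.1))  # game diverges: three things to check
```

The point of the sketch is only that the synthetic number is what makes the divergence visible in the first place; which explanation applies still takes further targeted tests.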
However, as long as you ensure that you are comparing the same workloads in games (harder when you throw out an information source, but you can depend on other sites to investigate for you and report on it) and don't ever offer an opinion on the information you ignored and propose it was an informed opinion, you can indeed do "this is how this hardware runs these games and these games alone" reviews accurately, because that is the effort you extended. Perhaps even excellent ones, if you take steps to ensure that something like the timedemo issue is avoided (as I understand you are).
I'm not shrugging off synthetics, I'm just saying synthetics don't represent gameplay experience, and gameplay experience is the focus of HardOCP.
Do you call them gameplay experience reviews or hardware reviews? Do you still compare different hardware and draw conclusions?
For example, I wouldn't be surprised if a 9600 non Pro was around the same speed running PS 2.0 in Halo as a 8500 when running PS 1.4, perhaps even slower, and there should be quite a few areas where the image quality will be, subjectively, very close. This comparison is indeed a valid gameplay experience comparison, but it is uniquely uninformative about the actual differences in the hardware involved.
The Direct Question: "How do current games predict future performance and longevity of a video card?"
The short answer, they don't. They can give you an idea because a future game may use the same gaming engine, but as everyone knows just cause it uses the same engine doesn't mean it will perform the same. So what can you do? Pretty much stay on top of the game and make sure you are using the latest most popular games out there using the most features, i.e. what will become heavy feature usage like heavy shader usage and the like and which gaming engines will be used the most.
Sounds like a perfect place to use a good synthetic, which has these advantages over a game:
- It measures and isolates a variety of things.
- It clearly informs what it is measuring.
- It has a vested interest in successfully measuring what it says it tries to measure (to be a good synthetic).
It shares this disadvantage:
- Any IHV can target it to misrepresent their hardware in comparison to other hardware.
If you care about providing accurate information, it seems to me that there is always a time and place to utilize those advantages, because it helps in overcoming the common disadvantage games and synthetics share.
If that's all you use a synthetic benchmark for, it has served its purpose. That's all any "synthetics advocate" has claimed AFAICS, including Futuremark. It is simply a matter of using them in a way that makes sense...as the minimum fps comparison does not, for example.
Out of curiosity: are you going to talk to the programmers for every game introduced and get the same level of information and detail "synthetics" will give you up front? This is actually possible with some game authors, and that would indeed serve well, as long as it happens.
Now, me personally, I like synthetics, even Kyle will tell ya, I always used to include synthetics in the reviews, I've used the PS Precision test, I've used Shadermark from the very beginning, and of course I've been using Futuremark/Remedy since Final Reality came out a long time ago! I even started to use Humus's demos and Demo Scene demos. I agree that they have a use and I do believe they have a place.
OK. But that's philosophy. It is the application of it to providing information to readers that this discussion is about.
However, that place is not at HardOCP.
Then HardOCP is a place more easily exploited to misinform about hardware, and a place failing to use all available information and tools to independently ensure that it is not doing so, and educate its readers accordingly. Unless you avoid comparing hardware based on what you are willing to do, and warn users not to do the same.
The focus of HardOCP's reviewing of video cards has changed, and the focus is now on gameplay experience, and so I have taken up the reins and am heading that up at HardOCP and I think we've done a great job so far and I will continue to focus on games and the experience they deliver for as long as I work at HardOCP.
You are choosing arbitrarily more guesswork in place of information available to you, and against your stated philosophy. You will depend on the good will of IHVs to honestly present their products to you in comparison to their competition (or on their incompetence in misrepresentation), and the work of other people's investigation to correct misinformation.
Actually, the latter should work if you honestly go back and correct any oversights, but that leaves your readers with a lack of timely veracity. You'd have to bear the possibility of often looking wrong if IHV(s) decide to repeatedly take advantage of your popularity while you follow such a practice of giving them a smaller set of information necessary to distort.
There is no wrong or right here IMHO, just simply a different method.
Also, there is doing your best to be accurate, and there is specifically choosing to give up on the endeavor in one or more areas.
Whether one results in "correct or incorrect" information depends on whether happenstance or some party with a vested interest is working against accurate information being represented, and how well the inaccurate factor is hidden.
Whether one is "wrong or right" depends on whether you tell the truth about the information you did and did not have available when you made your choice in investigation, and what you propose your investigation represents.
Either way, there is "wrong and right" involved.