My first choice would be what cho said: record new timedemos for each review/shootout, and release them with the review. That way the IHVs have no chance to optimize for that particular demo, but the results are still verifiable and repeatable by third parties (you know, that whole scientific method thing).
If this is too much work (and it doesn't
seem like it should be all that much work to me, but I really don't know), then yeah, the next best option is to keep your demos secret. I can't speak for everyone on this board, but I'm going to anyway: we all trust a B3D review 100%, not only to be conducted properly but also to draw insightful conclusions from the data. But even so, sometimes just the appearance of transparency brings its own rewards.
Of course the
ideal option would be what has been mentioned in some other threads: a randomizable benchmarking utility. That is, based on a seed number, the utility would randomly generate, at the very least, a new "script" for the demo (by "script" I mean the path taken through the level plus whatever occurrences take place: weapon fire, NPC actions, explosions, etc.); and possibly also randomized geometry counts or even randomized shader code. Results would still be comparable between cards, and verifiable by third parties: just use the same seed number. But this way there's no extra work: you just pick a new number for each review (and publish it).
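To make the idea concrete, here's a minimal sketch in Python rather than anything engine-specific; generate_demo_script and the event names are made up for illustration, not taken from any real utility. The property that matters is that one published seed deterministically regenerates the entire script:

```python
import random

# Hypothetical occurrence types -- illustrative names only.
EVENTS = ["weapon_fire", "npc_action", "explosion"]

def generate_demo_script(seed, num_steps=300):
    """Generate a reproducible demo 'script': a path through the level
    plus a timeline of occurrences. The same seed always yields the same
    script, so results stay comparable between cards and third parties
    can verify a review by re-running with the published seed."""
    rng = random.Random(seed)  # isolated PRNG; global state untouched
    script = []
    x, y = 0.0, 0.0
    for tick in range(num_steps):
        # Random-walk the camera/player path through the level.
        x += rng.uniform(-1.0, 1.0)
        y += rng.uniform(-1.0, 1.0)
        entry = {"tick": tick, "pos": (round(x, 2), round(y, 2))}
        # Occasionally trigger a scripted occurrence.
        if rng.random() < 0.1:
            entry["event"] = rng.choice(EVENTS)
        script.append(entry)
    return script

# Publish the seed with the review; anyone can regenerate the identical run:
assert generate_demo_script(20030523) == generate_demo_script(20030523)
```

Randomized geometry counts or shader code would work the same way: just feed more of the content-generation pipeline from the same seeded PRNG.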
Ironically, the best chance I can see for such a system anytime in the near future would be if Futuremark develops it, perhaps for 3dM04 or something. Unfortunately, the one false note on FM's part (IMO) in a day of otherwise saying exactly the right thing came when AJ wrote that
any attempts to special-case 3dMark are illegal and thus that the benchmark itself need not be designed to take possible cheating into account. His comment that, because 3dMark is closed source, he doesn't expect IHVs to try to reverse-engineer it struck me as security through obscurity, which is particularly naive in a situation like this one. And the whole attitude indicates to me that FM doesn't feel a randomized benchmark should be necessary (although perhaps he was just defending the decisions they made for 3dM03, and this incident might cause FM to realize the wisdom of designing one's benchmark with the assumption that everyone will try to cheat).
Another possibility to achieve a similar goal would be if the seed number were to seed existing AI bot code in a game (plus determine a random starting location); the bot would then just "play through" the game for a little while, and the benchmark would either record the framerate immediately (to bench with AI, physics, etc.) or record a demo and benchmark that (to bench rendering speed only). This might not be too difficult for a developer to add to an upcoming game, assuming they care enough about preventing benchmark cheating to go to the effort.
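Again purely as a sketch of what I mean (simulate and render below are stand-in stubs, not any real engine's API), the two modes might look something like this:

```python
import random
import time

def simulate(state, action):
    """Stub for AI/physics: advance the bot's position by its chosen move."""
    return (state[0] + action[0], state[1] + action[1])

def render(state):
    """Stub for drawing one frame; the sleep stands in for GPU work."""
    time.sleep(0.001)

def run_bot_benchmark(seed, mode="live", ticks=500):
    """Seed-driven bot benchmark: the seed picks the starting location and
    drives every bot 'decision', so the same seed reproduces the same
    playthrough on any card. 'live' mode times AI/physics plus rendering;
    'demo' mode replays the recorded run, timing rendering only."""
    rng = random.Random(seed)
    state = (rng.uniform(0, 100), rng.uniform(0, 100))  # seeded start location

    # Record the bot's playthrough; same seed -> identical state sequence.
    playthrough = []
    for _ in range(ticks):
        action = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        state = simulate(state, action)
        playthrough.append(state)

    if mode == "live":
        # Bench with AI/physics on the clock: redo the simulation while timing.
        rng2 = random.Random(seed)
        state = (rng2.uniform(0, 100), rng2.uniform(0, 100))
        t0 = time.perf_counter()
        for _ in range(ticks):
            state = simulate(state, (rng2.uniform(-1, 1), rng2.uniform(-1, 1)))
            render(state)
    else:  # "demo": replay the recorded run, so only rendering is timed
        t0 = time.perf_counter()
        for recorded_state in playthrough:
            render(recorded_state)

    return ticks / (time.perf_counter() - t0)  # average FPS

print(f"live: {run_bot_benchmark(42, 'live'):.1f} fps")
print(f"demo: {run_bot_benchmark(42, 'demo'):.1f} fps")
```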
But as either of those solutions is some time off...record your own demos; release them if you have the time to record new ones for each review, but keep 'em secret if you must.