Whatever happened with Futuremark and nVidia with 53.03?

worm[Futuremark] said:
If a driver does not get our approval to be used with 3DMark03, it simply won't be listed there. We test all new official WHQL drivers as they are released, and we will keep that list of approved drivers updated.

Why isn't there a field in ORB scores saying whether the score is "approved" or "unapproved"? That would of course depend on the driver and 3DMark versions. If "unapproved" is legally unthinkable, you could just name the fields "approved setup" and "experimental setup" or something. That would make it easier to distinguish approved scores from the others.
 
aapo said:
Why isn't there a field in ORB scores saying whether the score is "approved" or "unapproved"? That would of course depend on the driver and 3DMark versions. If "unapproved" is legally unthinkable, you could just name the fields "approved setup" and "experimental setup" or something. That would make it easier to distinguish approved scores from the others.

Why let someone publish a score they've gotten with unsanctioned drivers?
The ORB is, afaik, a tool with which users of 3DMark can compare scores across different systems, including graphics cards...
Thus it would invalidate the system to let unsanctioned drivers into it, even if they were labeled "unsanctioned drivers", if you could search for those results...

It is in Futuremark's interest, imo, to keep the ORB as "clean" as possible, as they use it to give numbers to OEMs++

And by leaving the "unapproved" part out of the ORB, they make sure that no one is misled by the numbers...
 
Hanners said:
Are there no plans to announce drivers that fail the tests publicly, and give reasons for their failure? I think most people (quite rightly) assumed that this would be the way Futuremark would deal with things.
Sorry, but at the moment there are no plans to start posting any specific reasons why some drivers didn't make it to the list. Still, I can't say if we will start doing so at some point. I know that there are some people (more or less tech-people) who would be very interested to know why some drivers fail our tests. But as I said, at the moment we haven't planned to do so.

digitalwanderer said:
That's IT?!? You won't put 'em on the approved list? What about when websites start benching 'em for reviews? Will you say anything? Or will you happily assume that since the drivers are not on your approved drivers list that people reading the reviews will check your list and realize that the results shouldn't be taken into consideration?

Do you really bloody mean that's it?!?!?!
We sent out the updated 3DMark03 Benchmarking Guidelines to all of our online and offline contacts (media), and we sincerely hope that everyone who got it follows it when using 3DMark03 in their reviews. When/if a website uses drivers in a review with 3DMark03 which we haven't approved, we will of course contact them. We have encouraged the media to only use the approved drivers when using 3DMark03. If you check out the guidelines we sent out, it clearly says what we ask the reviewer to do when/if he uses 3DMark03. Besides, even if we did post the non-approved drivers & reasons for not being approved, people reading the reviews would STILL need to check our lists. It wouldn't really do any good. We really hope that all professional reviewers out there understand the situation, and follow our 3DMark03 guidelines whenever they use it. That way the reader is not misinformed, and knows what he/she is reading. The guidelines aren't that complicated & difficult to follow.

If anyone is interested, please check out the guidelines here:

http://www.futuremark.com/companyinfo/reviewers_guide_3dmark03.pdf

Whenever we come up with some new things we think are important when benchmarking with 3DMark03, we will update the guide, and inform the media about it.

Nite_Hawk said:
I have a question about the policy you quoted above though. Given that I'm sure it takes a fair amount of time to actually test the drivers to make sure there aren't any performance-altering bugs in them, how should we as readers distinguish between drivers that you have tested and failed (which do not conform to Futuremark's guidelines) versus drivers that you have not yet tested or have not finished testing (which may conform to Futuremark's guidelines)? This distinction is something that I'm guessing a lot of your users would be interested in knowing.
Well, first of all we only test official WHQL drivers. If you see any website using any leaked/unofficial (even leaked WHQL) drivers and 3DMark03 results, the results are not verified by us in any way. Same thing with drivers which aren't listed on our approved drivers page. We can't be sure that those results are valid. We test all official WHQL drivers the very instant we get our hands on them, and at the latest the very next day (after the release) we update the approved list according to our findings. Hope this answered your question?

aapo said:
Why isn't there a field in ORB scores saying whether the score is "approved" or "unapproved"? That would of course depend on the driver and 3DMark versions. If "unapproved" is legally unthinkable, you could just name the fields "approved setup" and "experimental setup" or something. That would make it easier to distinguish approved scores from the others.
We are currently working on the ORB in order to make it "filter out" the non-approved ones. In other words, there will be more search parameters, and the default search parameter will be using only approved drivers. I am not 100% sure when this new parameter will be available, but I am sure the ORB team are doing the best they can to get it working as soon as possible. :)
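
As an aside, a minimal sketch of the kind of default "approved drivers only" search parameter worm describes might look like the following. The field names, parameter name, and list entries are assumptions for illustration only, not the ORB's actual schema or Futuremark's approved list:

```python
# Hypothetical sketch of a default "approved drivers only" ORB search filter.
# Field names and list entries are illustrative assumptions, not the real ORB schema.

APPROVED_DRIVERS = {
    ("NVIDIA", "52.16"),      # example entries only
    ("ATI", "CATALYST 3.9"),
}

def search_orb(results, approved_only=True):
    """Return ORB results, keeping only approved vendor/driver pairs by default."""
    if not approved_only:
        return list(results)
    return [r for r in results
            if (r["card_vendor"], r["driver_version"]) in APPROVED_DRIVERS]

# Made-up example results, just to show the filter in action.
results = [
    {"card_vendor": "NVIDIA", "driver_version": "53.03", "score": 1000},
    {"card_vendor": "NVIDIA", "driver_version": "52.16", "score": 990},
]
print(search_orb(results))         # default search: only the approved 52.16 result
print(search_orb(results, False))  # the "show everything" search parameter
```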
 
Thanks for coming here and taking the time to address all the questions Worm. I'm sure I'm not the only person here who wants FM to take a much more aggressive stance on driver cheats, but I don't want to scream at you about it anymore since you've made your position quite clear.

Thank you for your time.
 
digitalwanderer said:
Thanks for coming here and taking the time to address all the questions Worm. I'm sure I'm not the only person here who wants FM to take a much more aggressive stance on driver cheats, but I don't want to scream at you about it anymore since you've made your position quite clear.

Thank you for your time.
What exactly do you mean by being more aggressive? I mean, as far as I know, we are the only benchmark company taking ANY stance in topics about driver optimizations, and trying to do something about it.
 
I agree, but looking at your technical partners like DELL, you would think they would also want an honest representation of the products they base their OEM deals on.

In this example, none of your partners seem to care...in fact they were manipulated greatly by a company that left your program. I agree Nvidia needs to be there, but the patch to fix cheating is getting old...in fact every patch so far was released just for that, plus a minor fix/addition here and there.
Disallowing any driver that hasn't been approved fixes the problem FAST, as that is what counts...the score.
 
worm[Futuremark] said:
digitalwanderer said:
Thanks for coming here and taking the time to address all the questions Worm. I'm sure I'm not the only person here who wants FM to take a much more aggressive stance on driver cheats, but I don't want to scream at you about it anymore since you've made your position quite clear.

Thank you for your time.
What exactly do you mean by being more aggressive? I mean, as far as I know, we are the only benchmark company taking ANY stance in topics about driver optimizations, and trying to do something about it.
"More aggressive" as in "a whole lot less pansy-assed" in your approach to stating if an IHV is cheating or not. I think you should at LEAST put out a statement saying if you find drivers unacceptable and why.

I'd also not allow posting scores with those drivers to the ORB in any way, shape, or form. I freaked the first time someone pointed out to me some un-official scores they'd posted to the ORB; there was no way to tell that it wasn't an official score. :(

Oh, I'd also take some nVidia upper-management out back and work 'em over with a rubber hose for a while...but I tend to be a bit more extreme than most. :)
 
worm[Futuremark] said:
Nite_Hawk said:
I have a question about the policy you quoted above though. Given that I'm sure it takes a fair amount of time to actually test the drivers to make sure there aren't any performance-altering bugs in them, how should we as readers distinguish between drivers that you have tested and failed (which do not conform to Futuremark's guidelines) versus drivers that you have not yet tested or have not finished testing (which may conform to Futuremark's guidelines)? This distinction is something that I'm guessing a lot of your users would be interested in knowing.
Well, first of all we only test official WHQL drivers. If you see any website using any leaked/unofficial (even leaked WHQL) drivers and 3DMark03 results, the results are not verified by us in any way. Same thing with drivers which aren't listed on our approved drivers page. We can't be sure that those results are valid. We test all official WHQL drivers the very instant we get our hands on them, and at the latest the very next day (after the release) we update the approved list according to our findings. Hope this answered your question?

Hi Worm,

Thanks for trying to answer my question. I'm still a bit concerned though. Say that nVidia or ATI releases a new set of drivers that are WHQL certified, and they don't appear on the approved driver list after two weeks. At what point should we assume that they have failed Futuremark's tests? Simply the absence of the driver from your list is ambiguous. It would be very useful to be able to claim with certainty that a certain revision of a driver is not suitable to be used (nor will it ever be) with a specific version of 3DMark03. With the current system, you can prove that future tests using a specific driver and a specific version of 3DMark03 will continue to be valid (if the driver is on the list), but you cannot claim that a driver which is absent from the list will never be valid to test with, because it could be put on the list at any time.

Does this make any sense?

Thanks,
Nite_Hawk
 
Nite_Hawk said:
Hi Worm,

Thanks for trying to answer my question. I'm still a bit concerned though. Say that nVidia or ATI releases a new set of drivers that are WHQL certified, and they don't appear on the approved driver list after two weeks. At what point should we assume that they have failed Futuremark's tests? Simply the absence of the driver from your list is ambiguous. It would be very useful to be able to claim with certainty that a certain revision of a driver is not suitable to be used (nor will it ever be) with a specific version of 3DMark03. With the current system, you can prove that future tests using a specific driver and a specific version of 3DMark03 will continue to be valid (if the driver is on the list), but you cannot claim that a driver which is absent from the list will never be valid to test with, because it could be put on the list at any time.

Does this make any sense?

Thanks,
Nite_Hawk

I agree. Even if Futuremark doesn't want to publicly state reasons why a driver has not been approved, I think they need to have an "Unapproved Driver List" with newly released drivers appearing on it if they fail to pass Futuremark's certification. That way people will know once and for all whether a driver has been tested and if it is approved or not. The average person will probably just assume that new drivers (like NVIDIA 53.03) may not have been tested yet and that is why they aren't approved.
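
To make the ambiguity concrete, here is a small illustrative sketch; the lists and driver names are placeholders, not actual Futuremark test results. With only an approved list published, a driver that failed and a driver that simply hasn't been checked yet look exactly the same:

```python
# Illustrative sketch of the three states being distinguished in this thread.
# The lists are placeholders, not actual test results.

from enum import Enum

class DriverStatus(Enum):
    APPROVED = "passed Futuremark's tests"
    FAILED = "tested and rejected"
    UNTESTED = "not tested yet (or still in testing)"

approved_list = {"ATI CATALYST 3.9"}   # placeholder entry
failed_list = set()                    # the list Futuremark does not publish

def status(driver):
    if driver in approved_list:
        return DriverStatus.APPROVED
    if driver in failed_list:
        return DriverStatus.FAILED
    return DriverStatus.UNTESTED       # absence is ambiguous without a failed list

print(status("NVIDIA 53.03"))  # UNTESTED here - the reader can't tell "failed" from "unchecked"
```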
 
I guess we could all individually email Futuremark and ask whether or not drivers have undergone the testing. If they refuse to answer, or say yes, we know they failed. If they say no, then we yell (ask nicely) at them to hurry up.
 
worm[Futuremark] said:
What exactly do you mean by being more aggressive? I mean, as far as I know, we are the only benchmark company taking ANY stance in topics about driver optimizations, and trying to do something about it.

What is your stance? All companies should use the same path for running all benchmarks? No "advantages". This is what confuses me. It's a benchmark; companies should actively try and "optimize" - and I use that word loosely. That's like telling AMD and Intel to disable SSE and 3DNow! It's honestly not going to work. Nvidia is going to be flatheaded and continue to break whatever your company is doing in each patch. It's a war you really can't win.

I'd personally love to hear your stance on this optimization-free benchmark world that Futuremark members apparently live in.
 
No, not quite right. If someone runs an FPU benchmark and for some reason Intel has an application running on your PC, it shouldn't hack the benchmark's code in memory to replace the FPU code with SSE code, or modify the application in memory, period.
 
Waltar said:
What is your stance? All companies should use the same path for running all benchmarks? No "advantages". This is what confuses me. It's a benchmark; companies should actively try and "optimize" - and I use that word loosely. That's like telling AMD and Intel to disable SSE and 3DNow! It's honestly not going to work. Nvidia is going to be flatheaded and continue to break whatever your company is doing in each patch. It's a war you really can't win.

I'd personally love to hear your stance on this optimization-free benchmark world that Futuremark members apparently live in.
This has been the standard stance for most of the benchmarking industry in the past - it has generally been regarded as illegitimate to detect the benchmark that is running and to alter your execution or codepath based on implicit knowledge of the code.

This is hardly the first time that this has happened - the most famous incidents have been instances of compilers detecting the standard source code for benchmarks like Whetstone or Dhrystone and altering the code that they generate to make it appear that they can generate faster compiled code than their competitors. Of course this was only true in the specific case of the benchmark that was being run -

http://www.ebenchmarks.com/benchmarkmyths.html

For reference I would also direct you to the following interview with nVidia themselves at Tom's Hardware regarding benchmark detection and optimisation, from way back in the days when Winbench 98 and the Riva 128 were new:

http://www.tomshardware.com/graphic/19980214/3dhype-03.html

5) Is it correct that drivers can be easily optimized for one application, e.g. 3D Winbench 98?

Yes, drivers can be tuned specifically for a benchmark. But trying to fool customers is a terrible practice and just won't work for long for any company that tries it. Real customers run real applications. 3D Winbench is just an approximation of a whole bunch of applications. It's a good first step but by no means as good as testing a bunch of real applications.

6) Has nVidia optimized the drivers of the Riva 128 for 3D Winbench98?

Yes! But that's the wrong question. What you should have asked is whether we have optimized our drivers to exaggerate our performance on 3D Winbench98 relative to Riva 128 performance on real 3D applications. The answer to that is an absolute *NO*. We have certainly used 3D Winbench as one of the tools for improving our drivers. 3D Winbench does things with D3D that no existing applications mimic. Any company that has not extensively studied 3D Winbench is arrogant and has likely produced a set of D3D drivers that are far less robust than they should be. Similarly, any company that has high 3D Winbench scores with disproportionately less performance on real applications should be viewed with _great_ suspicion. NVIDIA is extremely self-conscious about Riva 128 performance and quality on real applications and has made a huge investment in testing and quality assurance. Looking great on benchmarks while looking slow on real apps is a slimy way to build a business. We won't do it.
Enough said.
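
For illustration only, the practice being objected to boils down to something like the following sketch. This is not any vendor's actual driver code; the executable names and codepaths are placeholders, and it only shows why the guidelines forbid the pattern - the "optimized" path exists solely for the test:

```python
# Purely illustrative sketch of application detection: pick a special codepath
# only when a known benchmark is the running application. Placeholder names only.

import os
import sys

BENCHMARK_EXECUTABLES = {"3dmark03.exe", "dhrystone.exe"}  # hypothetical detection list

def select_codepath():
    app = os.path.basename(sys.argv[0]).lower()
    if app in BENCHMARK_EXECUTABLES:
        return "benchmark-only path (inflates the score, never used by real applications)"
    return "general path (what every real application gets)"

print(select_codepath())
```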
 
Yes! But that's the wrong question. What you should have asked is whether we have optimized our drivers to exaggerate our performance on 3D Winbench98 relative to Riva 128 performance on real 3D applications. The answer to that is an absolute *NO*

Man oh man, that's one stupid statement. Guess we should have seen this coming, huh? :)
 
digitalwanderer said:
"More aggressive" as in "a whole lot less pansy-assed" in your approach to stating if an IHV is cheating or not. I think you should at LEAST put out a statement saying if you find drivers unacceptable and why.
Releasing statements on all drivers we test would be very time consuming, and therefore not possible. As I said earlier, currently we do not have plans on posting the drivers which didn't pass, and why they didn't. In what way do you personally think it would add value to the enforcement process? I mean for the big audience, and not only for the tech-gurus. I would prefer that reviewers would take the initiative and follow the 3DMark03 usage guidelines. That way the reader (both mainstream users and tech-gurus) would easily see what has been tested and what the numbers really mean.

digitalwanderer said:
I'd also not allow posting scores with those drivers to the ORB in any way, shape, or form. I freaked the first time someone pointed out to me some un-official scores they'd posted to the ORB; there was no way to tell that it wasn't an official score. :(
This is something we are thinking about, and need to look into. At the moment we are implementing the approved filter to the search, and we'll work from there. At least then you will be able to see whether the result was obtained using approved drivers or not.

Nite_Hawk said:
Thanks for trying to answer my question. I'm still a bit concerned though. Say that nVidia or ATI releases a new set of drivers that are WHQL certified, and they don't appear on the approved driver list after two weeks. At what point should we assume that they have failed Futuremark's tests? Simply the absence of the driver from your list is ambiguous. It would be very useful to be able to claim with certainty that a certain revision of a driver is not suitable to be used (nor will it ever be) with a specific version of 3DMark03. With the current system, you can prove that future tests using a specific driver and a specific version of 3DMark03 will continue to be valid (if the driver is on the list), but you cannot claim that a driver which is absent from the list will never be valid to test with, because it could be put on the list at any time.

Does this make any sense?
If any IHV releases official WHQL drivers, you will see some changes on the approved list at the latest the next working day. Either the driver is approved, which means it will be listed, or else you will at least see that the date has changed (Example: We have inspected all official SiS Xabre WHQL drivers through December 9th, 2003.). Not sure if I explained this very clearly, but if you take a look at the approved driver lists, you should spot what I mean. :)
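
Read that way, the approved list plus its "inspected through" date lets a reader infer a driver's status roughly as in this sketch; the dates and driver names are illustrative, not taken from Futuremark's pages:

```python
# A small sketch of the inference described above: combine the approved list
# with its "inspected through" date to tell a failed driver from an untested one.
# Dates and driver names are placeholders.

from datetime import date

INSPECTED_THROUGH = date(2003, 12, 9)   # "inspected all official WHQL drivers through..."
APPROVED_LIST = {"SiS Xabre 3.60"}      # placeholder entry

def infer_status(driver_name, whql_release_date):
    if driver_name in APPROVED_LIST:
        return "approved"
    if whql_release_date <= INSPECTED_THROUGH:
        return "inspected but not approved"   # released before the cutoff, yet not listed
    return "not yet tested"

print(infer_status("SiS Xabre 3.60", date(2003, 11, 20)))  # approved
print(infer_status("Example 1.23", date(2003, 12, 1)))     # inspected but not approved
print(infer_status("Example 4.56", date(2003, 12, 15)))    # not yet tested
```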

StealthHawk said:
I agree. Even if Futuremark doesn't want to publicly state reasons why a driver has not been approved, I think they need to have an "Unapproved Driver List" with newly released drivers appearing on it if they fail to pass Futuremark's certification. That way people will know once and for all whether a driver has been tested and if it is approved or not.
Again, at the moment we have no plans on posting the drivers which didn't pass. However, as this seems to be a very hot topic I will ask about this as soon as I get a chance to do so. Can't say what the outcome will be, but let's see.

bloodbob said:
I guess we could all individually email Futuremark and ask whether or not drivers have undergone the testing. If they refuse to answer, or say yes, we know they failed. If they say no, then we yell (ask nicely) at them to hurry up.
If you take a look at the approved driver lists, you should be able to spot dates (Example: We have inspected all official SiS Xabre WHQL drivers through December 9th, 2003.) which should indicate which drivers we have tested. But in any case, you are all welcome to email us if you have any questions!

Waltar said:
What is your stance? All companies should use the same path for running all benchmarks? No "advantages". This is what confuses me. It's a benchmark; companies should actively try and "optimize" - and I use that word loosely. That's like telling AMD and Intel to disable SSE and 3DNow! It's honestly not going to work. Nvidia is going to be flatheaded and continue to break whatever your company is doing in each patch. It's a war you really can't win.
Not really sure what you mean by asking what our stance is? Our stance on the subject has been stated several times, and can be read in various of our released PDFs. Please keep in mind that 3DMark is a DirectX benchmark, which means it has no special extensions for any IHVs. We follow the DirectX (D3D) standards, and work from there.
 
Looking great on benchmarks while looking slow on real apps is a slimy way to build a business. We won't do it.

Ha ha! I like that quote! :D

I suppose that technically they have now built their business up, so the current benchmarking shenanigans are designed to keep their business at its current level! Therefore they aren't 'breaking their word' with regard to that particular interview, but rather dodging round the edges. :p
 
worm[Futuremark] said:
Releasing statements on all drivers we test would be very time consuming, and therefore not possible. As I said earlier, currently we do not have plans on posting the drivers which didn't pass, and why they didn't. In what way do you personally think it would add value to the enforcement process? I mean for the big audience, and not only for the tech-gurus. I would prefer that reviewers would take the initiative and follow the 3DMark03 usage guidelines. That way the reader (both mainstream users and tech-gurus) would easily see what has been tested and what the numbers really mean.

I agree that releasing statements would be prohibitively time-consuming - however, surely a lot would be gained by all parties involved by publicly running a list of drivers that failed approval alongside the approved ones? It would certainly clear up any confusion about the current state of play of a particular driver set, and also take the pressure off Futuremark as far as testing drivers quickly goes. With your current system you are basically pressuring yourselves into publishing approved drivers quickly to avoid people assuming that the drivers had failed - for example, imagine if for some reason there was a hold-up with you testing ATi's next driver release; we would be assuming that the drivers had failed when that isn't necessarily the case.

Maybe even a monthly mailing list sent out to hardware sites with a list of approved, unapproved and 'testing in progress' drivers wouldn't be such a bad idea - Then you could safely leave any further announcement of what drivers have and haven't passed up to the sites in question.
 
Hanners said:
Maybe even a monthly mailing list sent out to hardware sites with a list of approved, unapproved and 'testing in progress' drivers wouldn't be such a bad idea - Then you could safely leave any further announcement of what drivers have and haven't passed up to the sites in question.
Hmmm, not a bad idea, though in one month so much might happen that the list we have sent out could be outdated within a couple of days. :? I need to talk with the guys here at the offices and let's see what we come up with.
 
Didn't notice those dates, but you might wanna make it one day after the release, or people might worry that you updated it prior to the new release.
 
worm[Futuremark] said:
Hmmm, not a bad idea, though in one month so much might happen that the list we have sent out could be outdated within a couple of days. :? I need to talk with the guys here at the offices and let's see what we come up with.

True, the list becoming outdated would quite possibly be a problem, although I imagine if things were set up in the right way you could probably automate the process, then maybe send out a weekly mail, or even a mail every time the status of a driver is changed/added?
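
A rough sketch of the kind of automation being suggested, assuming a simple driver-status table and placeholder addresses; the status data and mail server are not real:

```python
# Compare the current driver-status table against the previously published one
# and mail out whatever changed. All data, addresses, and hosts are placeholders.

import smtplib
from email.message import EmailMessage

SEND_FOR_REAL = False  # flip to True once a real SMTP host is configured

previous = {"NVIDIA 52.16": "approved"}
current = {"NVIDIA 52.16": "approved", "EXAMPLE 99.99": "in testing"}  # placeholder entries

def diff_statuses(old, new):
    """Return (driver, status) pairs that are new or whose status changed."""
    return [(d, s) for d, s in new.items() if old.get(d) != s]

def build_mail(changes, recipients="hardware-sites@example.com"):
    msg = EmailMessage()
    msg["Subject"] = "3DMark03 driver status update"
    msg["From"] = "driver-status@example.com"
    msg["To"] = recipients
    msg.set_content("\n".join(f"{d}: {s}" for d, s in changes))
    return msg

changes = diff_statuses(previous, current)
if changes:
    mail = build_mail(changes)
    if SEND_FOR_REAL:
        with smtplib.SMTP("localhost") as smtp:  # placeholder SMTP host
            smtp.send_message(mail)
    else:
        print(mail)  # just show what would be sent
```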
 