Use of Custom Demos In Reviews

Discussion in 'Graphics and Semiconductor Industry' started by Dave Baumann, Jun 16, 2003.

  1. BRiT

    The PCX / PCX-2 ... ?
     
  2. banksie

    The problem I have with entirely exclusive demos is that it becomes impossible to verify the reviewer's results for yourself. I am not implying you are fudging your figures in the slightest, but the only check we have as readers is access to the demos used in the review.

    Perhaps a half-and-half solution?

    Use one demo publicly and keep a private, unreleased demo of the same game to sanity-check the public demo's figures.

    Philip
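
    (A minimal sketch of the kind of cross-check banksie is suggesting: benchmark both a public and a private demo of the same game, then flag any card whose relative standing shifts suspiciously between the two. The card names, figures, and 10% tolerance below are invented for illustration, not taken from any review.)

```python
def relative_gap(scores):
    """Return each card's FPS as a fraction of the fastest card's FPS."""
    best = max(scores.values())
    return {card: fps / best for card, fps in scores.items()}

def cross_check(public_scores, private_scores, tolerance=0.10):
    """Warn if a card's relative standing differs by more than `tolerance`
    between the public demo and the private, unreleased one."""
    pub = relative_gap(public_scores)
    priv = relative_gap(private_scores)
    for card in pub:
        drift = abs(pub[card] - priv[card])
        if drift > tolerance:
            print(f"{card}: public/private results diverge by {drift:.0%} - "
                  f"possible demo-specific 'optimization'")

# Hypothetical figures:
cross_check(
    public_scores={"Card A": 90.0, "Card B": 60.0},
    private_scores={"Card A": 62.0, "Card B": 58.0},
)
```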
     
  3. micron

    Shit. Every time I think I know what I'm talking about (happens rarely), I don't know what I'm talking about. :(
     
  4. StealthHawk

    Well, yeah. Now you are proving your point :)

    But your final sentence is still wrong. The no-AF scores do not show the same speed improvement. In fact, in the original review there are only no-AF, no-FSAA scores, which increased from 40/60 in the old review to 60/90 in the new review, respectively. While the FSAA/8xAF scores show 2.5 times the performance, the no-FSAA/no-AF scores show only a +50% improvement.
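
    (For reference, the arithmetic on the figures quoted above works out as follows; the numbers are exactly those from the post, nothing more.)

```python
# No-FSAA/no-AF scores went from 40/60 fps in the old review to 60/90 fps in
# the new one, versus the roughly 2.5x jump seen with FSAA/8xAF enabled.
for old, new in [(40, 60), (60, 90)]:
    print(f"{old} -> {new} fps: {new / old:.2f}x ({new / old - 1:.0%} faster)")
# 40 -> 60 fps: 1.50x (50% faster)
# 60 -> 90 fps: 1.50x (50% faster)
```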

    Semantics aside, I think this thread proves without a doubt that nvidia is detecting and "optimizing" UT2003: http://www.beyond3d.com/forum/viewtopic.php?t=6480.
     
  5. Hanners

    1-2 FPS?! Luxury!! In my day.... :lol:
     
  6. micron

    NOooooo :D
     
  7. Doomtrooper

  8. Sharkfood

    Yes, notice Quake1 + Voodoo. I'm talking about the Voodoo, PVR, Rendition, Rage days... looong before even the Voodoo2s and GeForces. PCX1 versus Monster3D was the website shootout at the time.
     
  9. Sharkfood

    I don't think Mike's numbers were meant to be used as end-all, be-all numbers. They were just some "I slapped the card in my system and ran a few quick benchmarks" figures, and they weren't in a review, just casual information posted in a forum thread. He also explained he's not using the full retail version of UT2003 (most likely just the freebie download/demo version), which is most definitely an old revision and may differ from benchmarks elsewhere.
     
  10. Reverend

    I think FRAPS is probably most useful for in-game checking. For an absolutely fair benchmark of an app, you need a demo run with a fixed number of frames that must all be rendered. FRAPS doesn't work that way.
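
    (A toy illustration of the distinction being drawn here, assuming a timedemo replays a fixed list of frames while a live session just renders whatever fits into a wall-clock window; the per-frame costs and card speeds below are invented.)

```python
# Toy contrast between a fixed-frame timedemo and measuring a real-time session.
DEMO_FRAMES = [2.0 + 0.01 * i for i in range(1000)]   # recorded demo, cheap frames first

def timedemo_fps(card_speed):
    """Every recorded frame must be rendered; only the elapsed time changes."""
    seconds = sum(cost / card_speed for cost in DEMO_FRAMES)
    return len(DEMO_FRAMES) / seconds

def live_session_fps(card_speed, wall_clock=30.0):
    """Real-time play: the card renders whatever fits into the wall-clock window,
    so a faster card samples more (and different) frames than a slower one."""
    frames, elapsed, i = 0, 0.0, 0
    while elapsed < wall_clock:
        elapsed += DEMO_FRAMES[i % len(DEMO_FRAMES)] / card_speed
        frames += 1
        i += 1
    return frames / elapsed

for speed in (100.0, 150.0):
    print(f"card speed {speed}: timedemo {timedemo_fps(speed):.1f} fps, "
          f"live session {live_session_fps(speed):.1f} fps")
# The faster card's relative advantage differs between the two measures because
# the live session's workload is not fixed.
```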
     
  11. nelg

    Is it possible for a driver to cause FRAPS to report incorrectly? Is there no end to what a reviewer has to look out for nowadays?
     
  12. Reverend

    Actually, I do look out for the wife when I'm doing this hobby of mine.

    Time for bed!
     
  13. Sharkfood

    No, there is definitely an "end" - but the degree of knowledge that is assumed is directly the responsibility of the reviewer. It's totally self-inflicted.

    An example: look at typical gaming magazines and their hardware reviews. They list the product and price, run a 3DMark and a Quake3 timedemo, then score the hardware "86%" or "9 out of 10" and they are done. They haven't bitten into anything tangible or technical. Just a joe-generic, low-tech, medium-value hardware review.

    The problem starts when a reviewer decides to try to create some sort of technical baseline of comparison: when they start running histograms, providing "comprehensive" image quality analysis, and running side-by-side featureset comparisons and charts... all with less than stellar knowledge of the hardware they are comparing. It leads to putting less-than-scientific results in print and building on a basis which MAY not be correct. It's the time-honored "biting off more than one can chew" syndrome and should only be attempted by trained stuntmen. :)

    So that is the reality of the situation - you have MANY sources trying to create scientific/comprehensive comparisons when they lack the knowledge or savvy to attempt such a feat. It's the 10% rule: you have to be 10% smarter than your readers, and if you are instead 50% less knowledgeable, sparks are going to fly.

    Benchmarking is no joke, and to do it objectively and thoroughly you need to understand the elements at play, create controls to validate your results, and test, test, test and retest! When revealing comprehensive results, a reviewer needs to act as if some multi-million dollar decision is riding on the accuracy of their results - because in the long run, there very well might be. A review that creates a (false) image of a product shortcoming might cause a delta of >$1 million in lost sales of one product and increased sales of another, with the end consumer being the real victim. I'm sure that if there were accountability for reviewers, you would see MUCH fewer false benchmarks and fewer AA/AF tests where neither is actually enabled on one IHV versus another, and much more actual in-depth study/research into what featuresets are supposed to do and how they are delivered by different IHVs.
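
    (In that spirit, even something as simple as repeating every run and refusing to publish a figure whose run-to-run spread is too wide goes a long way. A minimal sketch follows; the 3% threshold and all figures are made up.)

```python
import statistics

def report(label, runs, max_spread=0.03):
    """Refuse to publish a figure whose run-to-run spread exceeds max_spread.
    `runs` holds FPS results from repeated, identical benchmark runs."""
    mean = statistics.mean(runs)
    spread = (max(runs) - min(runs)) / mean
    if spread > max_spread:
        print(f"{label}: runs vary by {spread:.1%} - investigate before publishing")
    else:
        print(f"{label}: {mean:.1f} fps (+/- {statistics.stdev(runs):.1f})")

# Hypothetical repeated runs of the same demo at the same settings:
report("Card A, 1024x768 4xAA/8xAF", [61.2, 60.8, 61.0, 61.4])
report("Card B, 1024x768 4xAA/8xAF", [88.0, 95.5, 72.1, 90.3])
```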
     
  14. MikeC

    Yes, you are correct. My results will not match those from the typical timedemos since they are from "gameplay" sessions. During these sessions, I play the game and use the logging mode in FRAPS to capture frame rates.

    When I publish the results in a review, I always make sure the settings I use are provided. This allows the reader to conduct similar tests on their PC in order to make a comparison. For example, in UT2003 I normally play against six skilled bots on DM-Antalus. I also list the in-game graphics settings and enable high quality sound.
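
    (A minimal sketch of summarizing that kind of gameplay log, assuming a plain list of once-per-second FPS samples rather than FRAPS' actual file format; the sample figures are invented.)

```python
def summarize(samples):
    """Return min / average / max and a crude 1% low for a list of FPS samples."""
    s = sorted(samples)
    n = len(s)
    return {
        "min": s[0],
        "avg": round(sum(s) / n, 1),
        "max": s[-1],
        "1% low": s[max(0, n // 100 - 1)],   # low percentiles track playability better than the average
    }

# Hypothetical samples from a DM-Antalus session against six bots:
fps_log = [74, 81, 79, 35, 68, 88, 90, 77, 41, 83, 85, 79]
print(summarize(fps_log))
```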
     
  15. MikeC

    I finally purchased the full version of UT2003 yesterday :) Now I am eager to get started testing the Radeon 9800 Pro under the same gameplay scenarios that I used with the NV35.
     
  16. Doomtrooper

    Taking that one step further and jumping online for a few matches on a populated server would be the cat's ass IMO.
     
  17. YeuEmMaiMai

    The question becomes (I doubt that you would do this, Brent): what prevents Kyle from giving your recorded timedemo to his friend at nVidia?

     
  18. MikeC

    Sounds like a great idea! Will probably do the same with Wolfenstein Enemy Territory. I'm one heck of a medic :)
     
  19. Miksu

    eVGA e-GeForce FX 5900 Ultra Review

    Anyone read the new FX5900U review from FiringSquad? They've again recorded new timedemos and the scores have again changed: the FX is beating the R9800 Pro by a wide margin. And yep, this card is clocked higher than the card in the last test.

    In Serious Sam2 they went from this http://firingsquad.gamers.com/hardware/msi_geforce_fx5900-td128_review/page8.asp to this http://firingsquad.gamers.com/hardware/evga_e-geforce_fx_5900_ultra_review/page8.asp

    For example, in Quality mode at 1024x768x32 with 4xAA and 8xAF the GeForce scored 161 fps and the Radeon 9800 Pro 114 fps. It's strange, though, that in four days they have recorded a new demo and changed the testing methods completely: you can't find SS2 scores with aniso and AA turned on in the review posted a few days ago, and now you can't find SS2 scores without aniso or AA. Same goes for the UT2k3 tests.
     
  20. Randell

    Re: eVGA e-GeForce FX 5900 Ultra Review

    Yes, it's strange - did I read it correctly: NASCAR is now more stressful, but SS2 and Q3A are less stressful?
     