[H] 3D Testing Methodology Discussion

Discussion in '3D Hardware, Software & Output Devices' started by Arty, Jan 28, 2008.

Thread Status:
Not open for further replies.
  1. Arty

    Arty KEPLER
    Veteran

    Joined:
    Jun 16, 2005
    Messages:
    1,906
    Likes Received:
    55
    HardOCP's review is laughable; at the end it reads like Brent was excusing his laziness for not retesting with the latest driver.
     
    FrgMstr likes this.
  2. Slyne

    Newcomer

    Joined:
    Jul 26, 2004
    Messages:
    101
    Likes Received:
    3
    I didn't think much of [H]'s review either. Real world review is all well and good but with only 2 cards and 4 titles while not even using the latest drivers, that review doesn't feel finished.

    And I feel that Anand's review came up a little short too, mostly because it makes no sense to leave out Crossfire in the X2 review.

    Scott Wasson's review (TR) seemed pretty good though. He even showed results of 3dm6 Pixel and Vertex shader tests.

    But I must note that no other review mentioned complaints found in Damien Triolet's review (in French, at Hardware.fr): HD decoding was broken trying to read an HD-DVD, and the fan speed was constantly oscillating between low and high and the corresponding change in fan noise was annoying to him.

    Which brings me to the main point of my post, WHERE IS BEYOND3D's REVIEW?
    3870/3850 are still MIA, as are 8800GT/GTS512. Is the site going to be renamed BeyondReviews now? I was always looking forward to B3D's different take on GPU reviewing. I totally appreciate articles such as the one on PerfHUD, but is it incompatible with reviewing hardware?
     
  3. Geo

    Geo Mostly Harmless
    Legend

    Joined:
    Apr 22, 2002
    Messages:
    9,116
    Likes Received:
    214
    Location:
    Uffda-land
    [H]'s methods, as Kyle will tell you at length, take longer (and, therefore, he'd also tell you, are more expensive). This would tend to make mid-life driver changes an even bigger pain in their ass than other people's (not that anyone enjoys it).

    I think that's part of why he can be so shrill about advocating their methods. They cost him more and make reviews later, so he has an emotional need to see them as far superior to justify that.
     
  4. Slyne

    Newcomer

    Joined:
    Jul 26, 2004
    Messages:
    101
    Likes Received:
    3
    And I'm saying, if it forces [H] to reduce the scope of their reviews, it makes them different, not better. While I like different, not all alternatives deserve to be explored if doing so would prevent you from providing a finished product.

    Oh well, I'll shut up now. We're veering off-track.
     
  5. FrgMstr

    Newcomer

    Joined:
    Jun 26, 2002
    Messages:
    223
    Likes Received:
    4
    Location:
    Lucas, TX
    Yes, the resources it takes to evaluate video cards by actually gaming with them are considerable.
     
  6. FrgMstr

    Newcomer

    Joined:
    Jun 26, 2002
    Messages:
    223
    Likes Received:
    4
    Location:
    Lucas, TX
    No emotional need. I know they are better, and our track record since adopting them pretty much proves it. Case in point: go back and read Anandtech's 2900 XT conclusion and then read ours and tell me who you think was spot on. You might see it differently than me, but months later, it seems as though we called it 100% correct when others did not.
     
  7. FrgMstr

    Newcomer

    Joined:
    Jun 26, 2002
    Messages:
    223
    Likes Received:
    4
    Location:
    Lucas, TX
    It might be laughable but if you actually read the conclusion page you will see where we addressed the driver being used.


    And yes, the evaluation was "done" when we got the THIRD driver from AMD for the card.
     
  8. FrgMstr

    Newcomer

    Joined:
    Jun 26, 2002
    Messages:
    223
    Likes Received:
    4
    Location:
    Lucas, TX
    To be clear we use NO timedemos to evaluate the gameplay experience provided by a video card. We use nothing other than real gameplay. The graphed run-throughs you see are simply there to give our readers a look or "proof" of the experience we were provided by the hardware. All conclusions and opinions are formed through actually playing the game, nothing else.
     
  9. digitalwanderer

    digitalwanderer Dangerously Mirthful
    Legend

    Joined:
    Feb 19, 2002
    Messages:
    18,603
    Likes Received:
    3,121
    Location:
    Winfield, IN USA
    I still think it's a bit over-the-top to go accusing others of bias just for differing in their choice of testing methodology, Kyle. Calling out specific questionable benchmarks is one thing, but accusing people of being irresponsible just for using canned benchies seems just a tad irresponsible...IMHO.
     
  10. Brent

    Regular

    Joined:
    Apr 11, 2002
    Messages:
    584
    Likes Received:
    4
    Location:
    Irving, TX
    I think this is a point to emphasize. It appears many believe we derive our evaluation from the 5-10 minutes of graphed data we show for each game, but this is not the case. We sit down with each video card and simply play the game with it, actually playing several levels all the way through, and in some games, the entire game. For example, if we are having some difficulty nailing down what's playable, we may have to physically try out every single level in the game to make sure a certain setting is actually playable from start to finish. It is from this process that we derive our experiences and determine the highest playable settings. As you can imagine, this is extremely time-consuming, which is why no one else does it. Then we make the graph, in a certain level that's pretty intense and encompasses as many of the game's attributes as possible, as visual data to back up our experiences. Hopefully that helps explain things a bit better.
     
  11. Skrying

    Skrying S K R Y I N G
    Veteran

    Joined:
    Jul 8, 2005
    Messages:
    4,815
    Likes Received:
    61
    Do you do that for every level? No. That's the problem: you're presenting it as the full cake when it's only a piece of it. The frame rate you average in the first half of the game is certainly not the same as, say, inside the alien ship, on the carrier, or in the frozen forest. You could play the level, determine the settings, and then record a demo so the run-through for each card is the same when you show minimum, average, maximum, and the graph. While your impressions might be read by a number of people, I'd wager that the majority probably doesn't read it all. The graph and chart then become a misrepresentation, because people expect them to show what the video cards run at in the same scenes, and they don't. And while I understand the issue with canned benchmarks (including those that are not real game scenes), I don't understand why you don't record an in-game demo of said level and then graph that; without it, the difference in frame rate between runs could be sizable. Especially so in a game such as Crysis, where your movements could easily set off a chain of building collapses and explosions.
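
    As a rough illustration of the minimum/average/maximum figures discussed here, the sketch below summarizes one run from its per-frame render times. The function name and the frame times are made up for the example; they are not taken from any review or from [H]'s tooling.

```python
def summarize_fps(frame_times_ms):
    """Return (min, avg, max) FPS for one run, given per-frame render times in ms."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    # Instantaneous FPS for a frame is 1000 divided by its render time in ms.
    per_frame_fps = [1000.0 / t for t in frame_times_ms]
    # Average FPS is total frames over total seconds, not the mean of per-frame FPS.
    avg_fps = len(frame_times_ms) / (sum(frame_times_ms) / 1000.0)
    return min(per_frame_fps), avg_fps, max(per_frame_fps)


# Hypothetical frame times (ms) for the same recorded demo on two cards.
card_a = [20.0, 25.0, 40.0, 20.0]
card_b = [10.0, 20.0, 50.0, 20.0]

for name, times in (("card A", card_a), ("card B", card_b)):
    lo, avg, hi = summarize_fps(times)
    print(f"{name}: min {lo:.1f}, avg {avg:.1f}, max {hi:.1f} fps")
```

    The comparison only means something if both cards rendered the same sequence, which is the argument for a recorded demo.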
     
  12. Sound_Card

    Regular

    Joined:
    Nov 24, 2006
    Messages:
    936
    Likes Received:
    4
    Location:
    San Antonio, TX
    So what qualifies as a "canned" benchmark? I hear mention of "time demo" or something of that nature... But not all reviewers do generic time demos. So I'm curious which reviewers and which methods you label as "canned".
     
  13. Skrying

    Skrying S K R Y I N G
    Veteran

    Joined:
    Jul 8, 2005
    Messages:
    4,815
    Likes Received:
    61
    To me a canned benchmark is a demo whose sole purpose is to test the performance of a system: the one included in Crysis, for instance, and the flybys in UT3. That's different from recording your own while playing the game. That's my definition at least.
     
  14. Dave Baumann

    Dave Baumann Gamerscore Wh...
    Moderator Legend

    Joined:
    Jan 29, 2002
    Messages:
    14,090
    Likes Received:
    694
    Location:
    O Canada!
    While Crysis has a canned benchmark it does actually have run-throughs available for each level in the title.
     
  15. digitalwanderer

    digitalwanderer Dangerously Mirthful
    Legend

    Joined:
    Feb 19, 2002
    Messages:
    18,603
    Likes Received:
    3,121
    Location:
    Winfield, IN USA
    With a self-recorded demo or do you play it through live for the fps chart?
     
  16. hughJ

    Regular

    Joined:
    Feb 7, 2002
    Messages:
    860
    Likes Received:
    415
    My understanding is that they play it through live, and try to replicate the same path and actions each run.

    Are the results from a custom timedemo and a live run of the same actions really that different?
     
  17. digitalwanderer

    digitalwanderer Dangerously Mirthful
    Legend

    Joined:
    Feb 19, 2002
    Messages:
    18,603
    Likes Received:
    3,121
    Location:
    Winfield, IN USA
    Sometimes, yeah. The leading argument is that in most timedemos enemy AI and such doesn't come into play.

    I was just curious as to how they ensure their play-throughs are the same or as close as they can get. I have a bit of a bitch of a time with that one myself.
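
    One way to put a number on "as close as they can get" is to repeat the same run several times and look at the spread. This is only a sketch: the helper name and all of the figures below are invented for illustration, and nothing here is a method [H] has described.

```python
from statistics import mean, stdev


def run_to_run_spread(avg_fps_per_run):
    """Return (mean FPS, coefficient of variation in percent) across repeated runs."""
    if len(avg_fps_per_run) < 2:
        raise ValueError("need at least two runs to measure spread")
    m = mean(avg_fps_per_run)
    # Sample standard deviation relative to the mean, expressed as a percentage.
    return m, 100.0 * stdev(avg_fps_per_run) / m


# Made-up average FPS from five attempts at the same path, for illustration only.
live_runs = [38.0, 41.0, 35.0, 42.0, 39.0]
timedemo_runs = [40.0, 40.5, 39.5, 40.0, 40.0]

for name, runs in (("live", live_runs), ("timedemo", timedemo_runs)):
    m, cv = run_to_run_spread(runs)
    print(f"{name}: mean {m:.1f} fps, spread {cv:.1f}%")
```

    If the spread between repeated live runs is larger than the gap between two cards, the live numbers alone cannot separate them.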
     
  18. WaltC

    Veteran

    Joined:
    Jul 22, 2002
    Messages:
    2,710
    Likes Received:
    8
    Location:
    BelleVue Sanatorium, Billary, NY. Patient privile
    I actually agree with many of the things both you and Brent are saying here. I think the comments about the "3dfx way of doing things" were a bit over the top, though, because generally in the 3dfx era fps charts would often merely show you the difference between a game that was playable on one piece of hardware while running as a slide show on a competing product, and that was the essential value of the fps method that was born in the 3dfx era.

    I also agree that we've long since departed that era, but even though we have, the fps (frames per second) method of product review is still very much alive to this day. Frankly, I think that bar charts which show one product maxing out at 180 fps and another product maxing out at 100 fps are extremely misleading and I agree with you and Brent 100% about that. Regardless of the connotation such bar charts present--that the card with higher frame rate offers a better game-playing experience--I think you are absolutely correct and helpful and informative to let people know that substantively in such cases there is absolutely no difference between such products in terms of the gaming experience they offer the consumer. I agree with [H] completely about that.
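
    The 180 fps versus 100 fps case can be worked through as frame times. The sketch assumes a 60 Hz display with vsync enabled, which is an assumption made for the example and not something stated in the reviews.

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def frames_displayed_per_second(fps, refresh_hz):
    """With vsync on, the display cannot show more frames than it refreshes."""
    return min(fps, refresh_hz)


REFRESH_HZ = 60.0  # assumption: a typical 60 Hz monitor with vsync enabled

for fps in (180.0, 100.0):
    shown = frames_displayed_per_second(fps, REFRESH_HZ)
    print(f"{fps:.0f} fps -> {frame_time_ms(fps):.2f} ms per frame, {shown:.0f} shown per second")
```

    Under that assumption both cards finish a frame well inside the 16.67 ms refresh interval, so the display shows the same 60 frames per second for each.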

    But...;)...where I see the [H] reviews I've recently read coming up short is specifically in the area of image quality--I'm talking about [H]'s tendency to only test a game in one, maybe two, resolutions, and render lip service, if that much, to all the remaining game resolutions and filtering settings that the cards tested are capable of. [By no means am I suggesting that [H] is alone in doing this--it unfortunately has become a trend that nearly everyone has adopted.]

    For instance, I recall in the "Witcher Performance Comparison" that [H] recently published, an implication to the effect that either in 1280x1024 or else at 1900x1200, the game internally supports no more than 2x FSAA. (Pardons if my memory is incorrect on that point, but that's what I recall.) On my system, however, running an HD 3870 512 at the time, I had no trouble setting the Witcher to 4x FSAA @ 1280x1024--higher than that, though, and the game would only let me choose 2x FSAA.

    Anyway, my thought at the time was that Brent was a little mixed up there possibly because he hadn't actually run the game that much if at all at 1280x1024, and had simply assumed that 4x FSAA was not selectable in the Witcher at that resolution, as it is not at > 1280 x1024.

    Here's a quote from the article:

    "Gamers with smaller monitors (limited to 1280x1024 or lower) can easily get great performance and an excellent gameplay experience out of a Radeon HD 3850."

    At no time, however, was it ever mentioned that playing the game at 1280x1024 allows one to choose a higher than 2x FSAA setting from within the game, which is what led me to believe that although he mentioned it Brent never actually tested at 1280x1024--or he'd have known that and certainly would have mentioned it, as he mentions several times that at the resolutions he tested he was not able to set 4x FSAA from within the game. Also, gamers with "larger monitors" should have no trouble running the game at 1280x1024 or *lower* with higher filtering settings if that is what they desire to do.

    My point here is simply that I do not understand why it is that 3d-card reviews are written which seemingly ignore 75% of the resolutions and filtering settings that the IHVs who manufacture these products go to great lengths of time and trouble to support in those products. For instance, why should I insist on running Crysis at the highest resolution my monitor will support, with no filtering, when the result is little better than a slide show? Crysis, in particular, seems a game that is ideal for testing much lower gaming resolutions with much higher filtering settings. I think such tests might well be eye-openers, and might well inform people of the fact that it is possible to run the game at lower resolutions with higher filtering and get as good if not superior image quality while obtaining much more satisfactory gaming frame-rate performance at the same time. If it isn't possible to do that in the case of a given game inside a given 3d-card review, then I think it is incumbent on the review to cover that and then to also justify it with screen shots and appropriate explanatory commentary.

    This is a dimension in playing Crysis that I have yet to see even a single 3d-card review to date explore in a revelatory manner. Basically, it's as if both the 3d-cards being tested and Crysis itself were limited to only one or two resolutions and only one or two filtering settings--mainly FSAA and Anisotropic filtering settings. Of course, such is not the case. Both the game and the products support lower resolutions and higher filtering settings that often are not being tested and evaluated at all. I think this does short service to [H]'s readers, not to mention the readers of everyone else's 3d-card reviews--not to forget the developers of these games who also take a lot of time to ensure that they run as expected in lower resolutions and with higher filtering settings.

    When [H] reviewed the HD 3870 and the Crysis demo, the *only* resolution for which you presented frame-rate results was 1280x1024 with no FSAA and no AF, if I recall correctly. This you said was done to present the "highest resolution possible" for playing the game at an acceptable frame rate. OK, so what is "real world" about that, exactly? What if I'm a gamer who chooses FSAA and AF filtering at < 1280x1024 resolutions in order to get "playable frame rates"? How exactly would I not be considered "real world" in that event? Just so you won't misunderstand, I see nothing wrong with talking about "real world" results *provided* you aren't assuming a very narrow point of view as to what is "real world" and what is not. IE, playing Crysis at < 1280x1024 with higher levels of FSAA and AF and higher-quality textures set in game would seem to me to be just as "real world" as finding out how high you can set the resolution with no FSAA and AF and medium textures and still comfortably play the game. So it seems to me that such "real world" approaches should cover all of the possible bases. I think that a lot of very interesting and valuable insight is left out of reviews which don't do this.

    It used to be, you know, that when you read a 3d-card review everything from 800x600 and up was tested and evaluated and expounded upon, with example screenshots and so on. I do not think that talking about LCDs and their native resolutions is especially helpful, either, because LCD technology has reached a stage where most LCDs can scale down from their native resolutions as well as any CRT I ever owned could scale down from its max res--and I owned a bunch of CRTs...;) My current monitor, a 27.5" 1900x1200 native LCD has absolutely no trouble scaling downwards beautifully, and I often do it just to test things out myself--the things that today's 3d-card reviews just don't tell me.

    Just as with CRTs, though, often the higher resolutions are better, but the irony is that with this 27.5" LCD I *cannot* do higher than 1900x1200--yet many of the current 3d-card reviews I read often seem dedicated to testing at > 1900x1200 as if *everyone* reading the review could even do that. I rather doubt that I am odd or unusual in this respect, and think it very likely that *most* people reading these reviews are in the same boat I am in with respect to being able to exceed 1900x1200, if even they can reach that resolution with the monitor they are presently using. It's ironic because even with my old 20" CRT I could exceed 1900x1200, but rarely did I ever use 1900x1200 because at 20" of screen I found the pixels at even 1900x1200 to be too small for comfort, and so of course I rarely if ever ventured beyond 1900x1200 even though I could have.

    Last, I'm not saying that there's anything amiss with reviewing product performance and game play at 1900x1200 and higher, of course. What I am saying is that it seems to me that the 3d-card reviews I am reading today are much more remarkable for all it is that they *do not* tell me as opposed to the comparatively small number of things that they *do* tell me. For instance, I would really like to read about how various cards do running Crysis at 800x600 at 4-8xFSAA and 16x AF, or some combination thereof with maximum texture quality. I find it notable that although I haven't yet myself read a Crysis review which explored the game at 800x600 with most everything maxed out, a quick search in Google of "Crysis at 800x600" reveals quite a few other parties who have...;)
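
    For a sense of scale on the resolution versus FSAA trade-off, here is the raw pixel and sample arithmetic for two of the settings mentioned. Sample count is only a crude proxy for GPU load (multisampling does not shade every sample, and textures, shaders and memory all matter), so treat this as a back-of-the-envelope sketch with hypothetical helper names.

```python
def pixels(width, height):
    """Number of pixels in one frame at the given resolution."""
    return width * height


def coverage_samples(width, height, aa_samples=1):
    """Pixels times anti-aliasing samples per pixel; a crude proxy, not a load model."""
    return pixels(width, height) * aa_samples


high_res_no_aa = coverage_samples(1280, 1024)       # 1280x1024, no FSAA
low_res_4x = coverage_samples(800, 600, aa_samples=4)  # 800x600, 4x FSAA

print(f"1280x1024 no AA: {high_res_no_aa:,} samples")
print(f"800x600 4x AA:   {low_res_4x:,} samples")
print(f"pixel ratio 1280x1024 / 800x600: {pixels(1280, 1024) / pixels(800, 600):.2f}")
```

    The only point of the numbers is that the two settings are not interchangeable on paper; which one plays or looks better still has to be measured and shown.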
     
  19. Arty

    Arty KEPLER
    Veteran

    Joined:
    Jun 16, 2005
    Messages:
    1,906
    Likes Received:
    55
    The evaluation that was done resulted in a gain of +2 fps, and that's a night-and-day difference from what AMD promised as fixed in the new release. Something isn't right: either AMD's promise or your test bench.

    Also, if the evaluation was done, why not include it even if it made only a very small difference to the end result? After all, the testing, as you say, was complete.

    Sorry, I didn't realise my post would derail the thread. :oops:
     
  20. Mark

    Mark aka Ratchet
    Regular

    Joined:
    Apr 12, 2002
    Messages:
    604
    Likes Received:
    33
    Location:
    Newfoundland, Canada
    I don't really get the whole aversion to timedemos either, never did. When I reviewed graphics cards I always recorded custom demos of real gameplay sequences of me playing the game (these timedemos were always very long, sometimes of entire levels). All the data that my timedemos spit out accurately reflected the results of the "real-world" tests I would verify them against. If the real-world tests and the timedemo results didn't jibe with each other, I'd record new timedemos and test again.

    The good thing about doing it that way was I was able to test a lot of cards over a lot of different tests and gather a ton of very accurate results. The disadvantage about doing it that way was... ridicule from [H]?

    I don't really see any advantage to doing it the [H] way. Not only do you have a very long and tedious process that takes up so much time you're forced to greatly reduce the number of tests and samples you can present to the reader, you also introduce a lot of potential for human error that, in the end, makes the results you do manage to gather, the supposedly "real" results, unreliable.
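
    The cross-check described above (timedemo results verified against real-world runs, with a re-record when they disagree) could be sketched like this. The function name, the 5% tolerance and the sample figures are all assumptions made for the example; the post does not say what threshold was used.

```python
def timedemo_matches_live(timedemo_avg_fps, live_avg_fps_runs, tolerance_pct=5.0):
    """Return True if the timedemo average is within tolerance of the live-run mean."""
    if not live_avg_fps_runs:
        raise ValueError("need at least one live run to compare against")
    live_mean = sum(live_avg_fps_runs) / len(live_avg_fps_runs)
    # Relative difference between the timedemo and the live mean, in percent.
    diff_pct = 100.0 * abs(timedemo_avg_fps - live_mean) / live_mean
    return diff_pct <= tolerance_pct


# Hypothetical figures: three live play-throughs and two candidate timedemos.
live = [39.0, 41.0, 40.0]
print(timedemo_matches_live(40.0, live))  # agrees with the live runs
print(timedemo_matches_live(50.0, live))  # disagrees, so record a new timedemo
```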
     

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.