Doom3 benches revisited.

Borsti said:
In HighQ:
No AA: 55.0
2xAA: 47.2
4xAA: 40.2
8xA: 54.5
4xAA+8xA: 39.6

Yes, that looks as expected (except for 8xA, unless it's a horrible quality mode)... unfortunately, that's not what the charts ON YOUR WEB SITE REVIEW say. See above post.

I don't see ANY DOOM 3 AA scores for the FX in your review, other than 4X High Quality, and you report the 4X High Quality scores in your review as 57.1, not the 40.2 you state above.

What's going on here?

You even gushed about NV35 AA performance based on the charts in your review: "When 4xFSAA is enabled, not a single card can hold a candle to the FX 5900 Ultra."

And yet going by the results you just posted above, the 9800 BEATS the FX in high quality mode with AA.

Me confused, Lars. :?
 
Borsti said:
But there's also no big performance difference between

Remember, the 256MB board is running 350MHz RAM, while the 128MB is running 340MHz; that could quite easily account for the small performance difference noted there.

However, at one point you concluded that the extra 128MB of memory does not make any difference (in a 1600x1200 with 4X FSAA test) based on the Radeon's scores, while ignoring the massive performance delta between the 128MB NV30 and 256MB NV35 despite their relative closeness in other cases - this should have thrown up some warning signs IMO. Look at any of our fillrate graphs here and you'll see that nearly all current games show a fill-rate performance drop at 1600x1200 with 4X FSAA, indicating that AGP texturing is occurring - if this is the case in current games, it's most assuredly going to be the case with DoomIII and the detailed textures it uses.
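To put some rough numbers on that point - this is a back-of-the-envelope sketch only, on my own assumptions (32-bit colour, 32-bit Z/stencil, AA buffers stored at four times the pixel count), not figures from the article; real driver allocations carry extra overhead:

```c
/* Back-of-the-envelope only (not figures from the article): how much of a
 * 128MB card the render targets alone consume at 1600x1200 with 4X FSAA,
 * assuming 32-bit colour, 32-bit Z/stencil and AA buffers stored at four
 * times the pixel count. Real drivers add further overhead. */
#include <stdio.h>

int main(void)
{
    const double MB = 1024.0 * 1024.0;
    const double w = 1600.0, h = 1200.0, samples = 4.0, bytes = 4.0;

    double aa_colour = w * h * samples * bytes / MB;   /* ~29.3 MB */
    double aa_depth  = w * h * samples * bytes / MB;   /* ~29.3 MB */
    double resolved  = w * h * bytes * 2.0 / MB;       /* front + back, ~14.6 MB */
    double total     = aa_colour + aa_depth + resolved;

    printf("AA buffers + front/back: ~%.0f MB\n", total);
    printf("Left for textures on a 128 MB card: ~%.0f MB\n", 128.0 - total);

    /* The 340 vs 350 MHz RAM clocks on the two Radeon boards are a much
     * smaller factor by comparison: */
    printf("350 vs 340 MHz = %.1f%% more raw bandwidth\n",
           (350.0 / 340.0 - 1.0) * 100.0);
    return 0;
}
```

On those assumptions the render targets alone eat well over half of a 128MB card at those settings, so detailed textures spilling into AGP memory is exactly what you'd expect there, whereas the 10MHz RAM clock difference is worth barely 3% in bandwidth.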
 
Joe DeFuria said:
Borsti said:
In my results NV35 was not faster with FSAA. Here are some results in 10x7 Medium quality... Looks as expected to me.

Did you read your whole article? ;)

http://www6.tomshardware.com/graphic/20030512/geforce_fx_5900-12.html

http://www6.tomshardware.com/graphic/20030512/geforce_fx_5900-13.html

1024x768

HIGH QUALITY:
Radeon: 61
GeForceFX 5900: 55

HIGH QUALITY + 4X FSAA:
Radeon: 43.3
GeForceFX 5900: 57.1

Radeon acts as one would expect, certainly not the GeForceFX.

About the ATI driver again. I wonder why the driver would detect the larger framebuffer but won't use it...

Who knows...there can be any number of reasons. Detecting more memory is a far cry from having drivers that utilize it efficiently, or at all. Cat 3.2 was never an official driver for the DDR-II based Radeon. The facts are, ATI told [H] that the 3.2's don't utilize 256 MB. Do you only believe nVidia when they tell you there are driver issues?

But there's also no big performance difference between 128/256

You simply can't say that, unless we know that the Radeon 9800 256 MB version is actually using 256 MB. All you know is that the 9800 with 128 MB acts the same as a 9800 with 256 MB that is only using 128 MB. No surprises there.

Ouch... that's a copy error in the charts. The FSAA results in that table were run in medium quality, not HQ.

I did not want to post more HQ results until I knew more about what happens in Doom III's HQ modes. At the time we published the article I was still waiting for a response from John. My guess was that AF is on in HQ but I was not sure at that time.

ATI may have told [H] about the cat32 and 256MB but they did not tell me.

"Do you only believe nVidia when they tell you there are driver issues?"... do I read a imputation here? What do you want to say with this in detail? But to give you an answer: Yes, I often to not believe information I get from ATI or NV marketing people prior to the launch of a new card. Simply because they often don´t know. I don´t know where [H]´s info came from but I guess it will be correct. I´ll try to run some tests with 128/256 and cat32 in high-res FSAA to see if it´s using the larger framebuffer or not.

Lars
 
DaveBaumann said:
However, at one point you concluded that the extra 128MB of memory does not make any difference (in a 1600x1200 with 4X FSAA test) based on the Radeon's scores, while ignoring the massive performance delta between the 128MB NV30 and 256MB NV35 despite their relative closeness in other cases - this should have thrown up some warning signs IMO. Look at any of our fillrate graphs here and you'll see that nearly all current games show a fill-rate performance drop at 1600x1200 with 4X FSAA, indicating that AGP texturing is occurring - if this is the case in current games, it's most assuredly going to be the case with DoomIII and the detailed textures it uses.

Hi Dave,

That's a good point. The lower results of the NV30 compared to the NV35 did not surprise me; I thought it was due to the fact that the NV30 has much lower memory bandwidth. But you're right. At that resolution there should be a difference between 128MB and 256MB cards.

The biggest problem with all the Doom III benchmarking was the limited time. We only had one day to play with it.

I apologize for the error in the FSAA charts and the confusion it caused. They were run in medium quality, not HQ.

Lars
 
DaveBaumann said:
Remember, the 256MB board is running 350MHz RAM, while the 128MB is running 340MHz; that could quite easily account for the small performance difference noted there.

However, at one point you concluded that the extra 128MB of memory does not make any difference (in a 1600x1200 with 4X FSAA test) based on the Radeon's scores, while ignoring the massive performance delta between the 128MB NV30 and 256MB NV35 despite their relative closeness in other cases - this should have thrown up some warning signs IMO.
AFAIK the delta between the 128MB NV30 & 256MB NV35 is based on the "interleaving" used to access memory (the same as was used in some 8500s)
 
Hey Lars,

Borsti said:
That's a good point. The lower results of the NV30 compared to the NV35 did not surprise me; I thought it was due to the fact that the NV30 has much lower memory bandwidth. But you're right. At that resolution there should be a difference between 128MB and 256MB cards.

OK. I hope you've double-checked with ATI to clarify the situation with the 256MB issue, and if it warrants it I assume the article will be updated (although most of the people who are going to read it probably have read it by now).

The biggest problem with all the Doom III benchmarking was the limited time. We only had one day to play with it.

I don't mean to sound pissy or anything but honestly, why did you accept? Whenever something like this is done you are bound to end up with errors of the kind that have evidently occurred here, and only having it for one day means you now can't go back and double-check where those errors occurred. Add to that the fact that one of the vendors you are comparing clearly had no idea this was happening, has issues with their latest drivers and (although you didn't realise it) issues with their previous drivers on their latest board - surely allowing some time for it to be a representative test would have been the fairest thing to do?
 
I agree. I'm sure there was a lot of excitement, but people rely on these websites to help them with their purchasing decisions... and the most important part should always be accuracy.
 
Borsti said:
Ouch... that's a copy error in the charts. The FSAA results in that table were run in medium quality, not HQ.

That's a pretty big error.

For the record, you also made a typographical error here:

http://www6.tomshardware.com/graphic/20030512/geforce_fx_5900-11.html

All of the charts on this page say "Medium Quality", though you typed "High Quality" as the title of the page.

I did not want to post more HQ results until I knew more about what happens in Doom III's HQ modes. At the time we published the article I was still waiting for a response from John. My guess was that AF is on in HQ but I was not sure at that time.

I don't understand. Whether AF is on or off in HQ, why should that prevent you from doing more tests? Or do you think it's only being turned on for one card, and not others?

For the record, it has been said that HQ turns on "8X AF" and turns off texture compression. (I can't of course verify that myself.) It sounds to me like, assuming the AF that is turned on with the FX is its "application" AF, the FX AF just takes a much bigger performance hit than ATI's AF. That is nothing new, and it explains the results quite nicely, actually.
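(For context, "application" AF just means the game requesting anisotropy itself through the GL_EXT_texture_filter_anisotropic extension, rather than the driver control panel forcing it behind the game's back. A minimal sketch of that generic OpenGL mechanism - not id's code:)

```c
/* Illustrative sketch only - not Doom III code. This is the generic way an
 * OpenGL application requests anisotropic filtering per texture; a "forced"
 * AF setting in the driver control panel overrides or clamps this value
 * behind the application's back. */
#include <GL/gl.h>

#ifndef GL_TEXTURE_MAX_ANISOTROPY_EXT
#define GL_TEXTURE_MAX_ANISOTROPY_EXT     0x84FE
#define GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT 0x84FF
#endif

void set_texture_aniso(GLuint texture, GLfloat requested)
{
    /* Real code would first check glGetString(GL_EXTENSIONS) for
     * "GL_EXT_texture_filter_anisotropic". */
    GLfloat max_aniso = 1.0f;
    glGetFloatv(GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT, &max_aniso);
    if (requested > max_aniso)
        requested = max_aniso;

    glBindTexture(GL_TEXTURE_2D, texture);
    /* A "High Quality" preset that enables 8X AF would pass requested = 8.0f. */
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAX_ANISOTROPY_EXT, requested);
}
```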

I assume that, realizing this, nVidia will in the future not (at least not easily) allow their "high quality" AF to be turned on by Doom3... anyone care to make a wager on that?

ATI may have told [H] about the cat32 and 256MB but they did not tell me.

Did you ask ATI what the possible implications were of not using the drivers they gave you, and reverting back to older drivers?

In other words, why were the HQ benchmarks basically discarded? Why wasn't AA tested at those settings? These are 256 MB $500 cards. That would be one of the FIRST things I tested.

This question doesn't pertain just to you, but to H and Anand as well... NO HQ benchmarks?! For these cards? Interestingly, Anand claimed that HQ scores didn't change much from Medium quality scores. YOUR benchmarks seem to go against his claims, at least with nVidia hardware: nVidia hardware takes a major hit, and ATI takes a relatively minor hit.

Seems to me that would be a highly relevant characteristic to take note of, and yet not one web site picked up on it. Your review came closest, because at least you ran one set of tests - although you seem to pass the large hit off as a bug, rather than assuming it isn't one.

Honest question: did nVidia request that you benchmark medium quality, or suggest that you not benchmark high quality?

"Do you only believe nVidia when they tell you there are driver issues?"... do I read a imputation here? What do you want to say with this in detail?

I'm saying, when the "High Quality" results didn't go as "planned" (nVidia was "only" on par with ATI), you asked nVidia... "Why?" They said it was a possible bug, which was apparently enough for you not to bother doing further high quality testing?

And yet, when the Cat 3.4's didn't work as expected, you just used the 3.2's without apparently asking ATI what the implications of that could be?

The biggest problem with all the Doom III benchmarking was the limited time. We only had one day to play with it.

I completely understand this problem.

And this is one main reason why I so detest the manner in which the whole Doom3 benchmark thing was handled. Especially when you have a brand new benchmark, such limited time, and no ability to go back and investigate issues... things like this are assured to happen. I'm not saying I blame you for running the benchmarks (who wouldn't, given the chance to run Doom3?), but nVidia and Id did the whole community a HUGE disservice by introducing Doom3 performance this way. It's not fair to you, and it's not fair to us, the consumers.
 
chavvdarrr said:
AFAIK the delta between the 128MB NV30 & 256MB NV35 is based on the "interleaving" used to access memory (the same as was used in some 8500s)
I highly doubt it.
nVidia had a far more efficient memory controller with both the GeForce3 and GeForce4 than ATI had with the Radeon 8500.
 
Joe DeFuria said:
Sigh....

It really does sadden me that it looks as if B3D doesn't get Doom3 to do proper benchmarking with...

But it is completely understandable. What puzzles me is why anyone would waste time contemplating anything about this demo software.

(1) It's an established fact the original demo was assembled entirely by nVidia

(2) Carmack admits to "not liking" the nVidia form of the demo, and admits to changing it--but says nothing as to what effect his changes, if any, had on nv35 scores relative to the original made-by-nVidia format. He carefully avoids this topic, in fact. Therefore, whether Carmack's changes improved nv35 numbers, or degraded them, simply isn't known.

(3) Although nVidia had weeks if not months to prepare drivers for the demo, ATi was given no advance notice of it and did not, as nVidia did, prepare an OpenGL driver specific to it.

(4) Not only was the demo not made available to the public, most web sites are cut out of it, too. Hence, there is very little PR value to ID Software here--at least no more than the company might have obtained by simply releasing some new screen shots of Doom III. So the PR value here is largely for the benefit of nVidia's nv35 products--which takes us in circular fashion back to item (1) above.

(5) Why is it that ID Software considers the demo adequate for a 3D card comparison, but inadequate for public consumption? Indeed, inadequate for the consumption of most web sites? If it's not polished enough for ID Software to feel comfortable with it as a genuine representation of the game which can be released to the public, why is it suitable for a 3D card comparison (even if you forget the biased situation relative to ATi)? To me, this appears as a contradiction (but only if it is assumed that the demo had some other purpose apart from making R350 look bad by comparison.)

For these reasons I see this demo as nothing except a publicity event sponsored for nVidia's nv35 marketing benefit, with little to no value apart from that. Releasing this demo to the public where it could be meticulously dissected and analyzed, however, would answer many of the questions I raise. But I suspect that this is the actual reason behind the so-called "security" concerns which in effect do nothing except protect whatever interesting "optimizations" exist within this demo.
 
...so in the end this whole doom3 benchmark stunt is a total mess. Wrong numbers, wrong settings, wrongly interpreted.
 
tEd said:
...so in the end this whole doom3 benchmark stunt is a total mess. Wrong numbers, wrong settings, wrongly interpreted.

Yes and no.

Some of the numbers may be right, some wrong, and some are misinterpreted. Problem is, we have no real way of finding out "the truth." Total mess indeed.
 
WaltC said:
Why is it that ID Software considers the demo adequate for a 3D card comparison, but inadequate for public consumption? Indeed, inadequate for the consumption of most web sites? If it's not polished enough for ID Software to feel comfortable with it as a genuine representation of the game which can be released to the public, why is it suitable for a 3D card comparison (even if you forget the biased situation relative to ATi)?

100% agreement.

Even the way that Epic did the UT 2003 benchmark introduction was better (though far from ideal as well). At least Anand had complete access to run all kinds of UT2K3 benchmarks, with no time constraints, and direct access to Sweeney to answer questions.

Indeed, one of Carmack's "concerns" was apparently that the visuals (art) at this stage don't necessarily reflect those of the final game, and he didn't want people looking at the demos / screenshots as final quality stuff.

Well, if the artwork is going to change that drastically (including adding and taking away lights, I imagine), that can significantly impact performance results.

I still have to ask Id software "why now." Why at this point in time was it "important" to have a hardware comparison? Why not 6 months ago? Will it be important again in 2-3 months when the ATI R360 is launched?
 
I'm a bit disappointed with this act by ID and especially Carmack. His objectivity has been reliable so far, but who knows if his opinions can be 'trusted' in the future? I only hope that this was just a single case of payback for last year's Doom3 leak...
 
This thread, and others like it, get way too much attention.

WaltC pretty much summed it up - good job. As others did as well.

Only reason this stuff was released, in my opinion, was to get attention and/or hits. The tests are invalid and should have been thrown out. Anyone with half a brain could have figured that out. Hell - look at the damn Splinter Cell results, shader tests, then Doom 3 - wtf? Something is amiss. But no - got to release the attention getter.

Bah! This crap from Tom's, H, and Anand's is nothing more than amateur porn... rookies for sure.
 
Interesting how all the mistrust nVidia engendered due to their 3DM03 hijinks has somehow turned around and bit JC/Id on the ass. The only two people I'm newly upset with are nVidia, for their driver stunts, and Anand, for what appears to be errors on almost every single benchmark in his review (Doom 3, Quake 3, *and* Splinter Cell--gimme a break!). I'm sorry to say, Lars, but I still don't entirely trust you/Tom's for benchmark numbers with correct analysis. Sadly, I'm lumping Anand into that list.

I think none of us would have had a problem with D3 benches if reviewers had omitted the scores of cards that had trouble with it, like they do in most other benchmarks. But it's sort of understandable why they'd be anxious to publish the numbers anyway--more hits means more advertising $ means making a profit. Unfortunately this understanding goes out the window when we see what are probably the two most (only?!) profitable HW sites on the web, Anand's and Tom's, commit such amateurish errors. It's a sad reflection of how reliant our reviewers are on the various IHVs for review samples. I'd think both sites would have enough money and clout to buy cards at retail and tell their readers to wait for their (potentially less biased) reviews using retail cards and shipping drivers.

As for ATi coming up smelling like roses, it doesn't seem too far-fetched to me that they might have known their cards perform worse than nV's in D3 and thus purposefully sabotaged their Cat 3.4's in order to have an excuse for their low numbers or lack of optimization. Sad that I now have to consider evil motivations behind every questionable result.
 
So when can we expect the review to be updated?
While you are updating you might as well put up all of the numbers you have too. ;)
 
WaltC said:
(1) It's an established fact the original demo was assembled entirely by nVidia

The demo I tested with was recorded by Tim.

(2) Carmack admits to "not liking" the nVidia form of the demo, and admits to changing it--but says nothing as to what effect his changes, if any, had on nv35 scores relative to the original made-by-nVidia format. He carefully avoids this topic, in fact. Therefore, whether Carmack's changes improved nv35 numbers, or degraded them, simply isn't known.

There were still some missing or wrong textures in the demo level. id did not want us to see that. That's why the demos were changed. My demo was recorded on Saturday... not much time before Sunday to optimize a driver for it. But you're right that we can't be 100% sure that it happened that way.

(3) Although nVidia had weeks if not months to prepare drivers for the demo, ATi was given no advance notice of it and did not, as nVidia did, prepare an OpenGL driver specific to it.

NVIDIA and ATI always tell us that they are working very closely with id. I was very surprised to hear from ATI that there's no optimization for Doom III implemented in their drivers yet.

(4) Not only was the demo not made available to the public, most web sites are cut out of it, too. Hence, there is very little PR value to ID Software here--at least no more than the company might have obtained by simply releasing some new screen shots of Doom III. So the PR value here is largely for the benefit of nVidia's nv35 products--which takes us in circular fashion back to item (1) above.

(5) Why is it that ID Software considers the demo adequate for a 3D card comparison, but inadequate for public consumption? Indeed, inadequate for the consumption of most web sites? If it's not polished enough for ID Software to feel comfortable with it as a genuine representation of the game which can be released to the public, why is it suitable for a 3D card comparison (even if you forget the biased situation relative to ATi)? To me, this appears as a contradiction (but only if it is assumed that the demo had some other purpose apart from making R350 look bad by comparison.)

I would like to comment on that as well, but I would be violating NDA information, so I can't :(

One reason for id was that many people have performance numbers from the leaked Alpha in mind. They wanted to show that reality is different. id was also VERY careful this time. They did not want to see another leak. The game was dongled as well.

We'll see how useful those results were when the first demo is released.

@Joe
My results are correct - as far as I can tell. I messed up the table description but that does not change anything about the results (numbers). I don't know why Anand said that there's no difference between HQ and medium; you have to ask him. I found a big difference between those settings. And John confirmed that AF and no TC is being used in that mode. The reason I did not run more HQ tests for the article was that I did not know what was happening. There was a performance difference between the AF the app requests and the AF setting the driver forces. I spoke to JC about this and he is thinking about solutions for that.

The default AF rendering in Detonator FX is Quality - the same as in ATI's driver. There was a small difference between HQ and the AF driver settings (about 0.5 to 0.2 FPS depending on the resolution). So HQ shows correct numbers for the high quality AF modes of both cards. One other reason why I did not do more HQ testing was simply time. I thought it was more important to run tests with as many cards as I could, and it takes a lot of time to run these tests. I had to decide between detailed tests with two cards or running as many cards as possible. In the end I did a mix of both.

I posted the HQ numbers since I think they are important and interesting. The Radeon wins at lower resolutions; that's what I wanted to show. There were also no rules from NV that we had to test only specific quality modes or anything else.

I now know that I could have done things better and I learned from it. id wanted to show some Doom III numbers. From what I have learned of id's policies so far, they do not publish anything until it's done. So I still trust those numbers. JC wouldn't have allowed this if there were still trouble with the codepaths. But in the end it did not change anything in the NV35 review. Those DIII numbers were just a bonus. There's no word about them in my final conclusion.

I know that ATI feels handicapped here. But it's not my fault that they haven't optimized the driver yet (as they say). I remember one nice ATI PR from last year on that:
http://www.ati.com/companyinfo/press/2002/4495.html

I also know that ATI would have liked to do the same thing with Half-Life 2, since it's developed on ATI cards and the drivers are already optimized for the game. The reason that did not happen was that Valve did not want to do this (yet). I wouldn't have any problem running HL2 benchmarks at this time. If the game developer says they're ready for a performance evaluation, there's no reason not to do it. If one company has neglected their optimization for a certain game so far - well, shame on them!

ATI is well aware of the development status of Doom III. They also know that we'll see a test version of it in the near future. So shame on ATI if they have not optimized their driver yet. The same goes for NVIDIA if what ATI says is true, that the shader code in HL2 runs 10x faster on ATI cards than on NV cards. Shame on NV if that's the case.

I don't want to worry about whether it fits one company's marketing when I run benchmarks with a game, even if it's not yet released. Why should I care? Everybody says that they are working very closely with the game developers, but when you run some benchmarks everything is suddenly different?

Let's take HL2 as a (hypothetical) example. Say you want to buy a new card right now and you're waiting for that game. If I run benchmarks at this time with an alpha, the ATI cards might look better. So what's the conclusion for that guy? He'll buy an ATI card. Is that unfair? Unfair to whom? To NV or to the consumer? Who is more important? If NV didn't optimize for that game, it's their own fault. And in the end: if ATI started optimizing a long time ago and NV starts right now... which card will have better performance when the game is released? And how long will it take NV to close the gap?

One can argue that there might be an optimization that's just not implemented in the public drivers yet. But why would they make such a silly decision?

Lars
 