First few GFX benches

Now Reverend.. :)

Your comments really open up a can of worms, but I have to say I side with your feelings on making no-nonsense comparisons. It all comes down to the age-old decision of: who is the audience for a given piece concerning hardware?

I look at genuine hardware reviews falling into two distinct categories-
1) Technology/potential illustration.
2) Buyer value/delivery evaluation.

I find that between yourself, and Matt + the folks at 3dgpu, category #2 is the template you use and has the most value for people looking for hardware to play X game at setting Y and performance Z. Using this target audience, it's almost certain that subjective commentary will be crucial, IQ baselines for similar featuresets need to be created, and commentary will be of a very personal nature. It has to be, since you need to explain and illustrate the reasoning behind your baselines of comparison in order to dispel the troves of other sources that are nothing more than expensive marketing campaigns.

Technology is more the stuff of a Beyond3D analysis. Straight-forward, to-the-point description of featuresets, no-nonsense benchmarks to show chipset *behavior*, and a comprehensive look at the features so that there is little room for misconception.

I don't believe the two styles can be compatible, which is where the struggle begins. Either you lay out features and potentials with lots of data and no stipulations.. *OR* you try to create baselines of value for those looking for a new videocard for gaming and create a suite of examples to illustrate the pros/cons. You can't do both since they are mutually exclusive. Trying to do both will also alienate and reduce your readership/target audience.

So I look at your typical Beyond3D analysis and your typical "Reverend" review as being two different beasties.. and I think they should stay that way, as the two together can be a valuable tool for consumers. I also hope, for this reason, you will continue to coax hosts to put up your Rev-style reviews on all future products (R300 is sorely missed) so that the value from this can also be added to the fray.

I think between Beyond3D, 3DGPU and a Rev article, a gamer and a tech head can pretty much extrapolate much of what they need to know about a particular piece of hardware. They can then visit the Anands, Firing Squads, Toms and digitlifes to have a good laugh, knee slap and chuckle. :)
 
Democoder-
No, the problem is that IQ isn't quantifiable.

This is where we disagree.. and the heart of the conflict. It's a cheap, wanna-be Jedi mind trick to try and convince sheep that IQ can't be quantified, when in reality it surely can.

I'm not talking the requirement to ultra-zoom in on an image at 700x to pick out a few stray pixels, I'm talking stark differences that an "average" configuration and an "average" person can surely notice. And by "average" I mean something realistic and tangible.

We can go over signal theory and over-FUD the crap out of a valid discussion, or home in on the real world. It's up to you. I usually don't waste my time with such juvenile antics, but it can also be entertaining.

The point being- there are some (read- many) conditions where IQ is quantifiable by the "majority"- as in like what Doomtrooper posted. To suggest that an average Joe couldn't tell the difference between those two screenshots, or even better yet, the two conditions in motion, is just lunacy. Again, the really staunch advocates might need to append to their statements a hidden/missing (*) = (on a B&W 13" TV using composite TV-OUT at 640x480) after each of their sentences of "The average Joe wouldn't notice that (*)" and "There is no difference between 2x and 4x AA (*)" kind of nonsense.

I've heard it all before though. I remember specifically people clearly explaining they noticed no difference in Quake between software mode and 3D Accelerated mode. The same anti-3D people also kept this illusion alive well into Quake2. There were even those that went so far as to explain Quake2 in software mode looked "far superior" - living that myth of "IQ is subjective" nonsense and stretching subjectivity beyond the voids of infinity. It really doesn't fool anyone nor does it gain any followers as far as objectivity goes.

Obviously the fuzzy grey area of how large a variance must be before it counts is what varies (usually from person to person), but it is rather ignorant to suggest the bounds of variance can be infinite. That is the point. A pin prick vs. getting your finger chopped off with a machete is an obviously noticeable difference, about as stark as some of the "no difference" comparisons some would try to create. There are some (not all) universally acceptable deltas in image quality, and to try and fuzz them up with "subjective" nonsense is just foolhardy.

It's either that or those folks join the "Quake2 in Software Mode" annals of the history of objectiveness.
 
Although it's out of date, because the style is different, could B3D please consider hosting Rev's R300 article? If not as a favour to him, then as a favour to those B3D regulars who appreciate his POV.

Obviously he has long standing links to this site and I hope any issues are a thing of the past.
 
Randell said:
Although it's out of date, because the style is different, could B3D please consider hosting Rev's R300 article? If not as a favour to him, then as a favour to those B3D regulars who appreciate his POV.

Obviously he has long standing links to this site and I hope any issues are a thing of the past.

I'll second that.

In fact, when I come to think of it, it could be a nice gesture of peace over those troubled waters of the past. :idea:
 
LeStoffer said:
Randell said:
Although it's out of date, because the style is different, could B3D please consider hosting Rev's R300 article? If not as a favour to him, then as a favour to those B3D regulars who appreciate his POV.

Obviously he has long standing links to this site and I hope any issues are a thing of the past.

I'll second that.

In fact, when I come to think of it, it could be a nice gesture of peace over those troubled waters of the past. :idea:

I'll add my voice to the choir..........
 
I think we might be seeing some of Rev's stuff sooner or later.
Reverend said:
Hopefully, the New Year will treat all of us well and wars will be avoided. Having spent some time thinking about the now and the future I cannot imagine anything more important than Peace right now... my almost-4-year-old son is a constant reminder of how important being alive is. Happy 2003 everyone.

PS. I shall return.

He posted that on New Year's Eve.
 
This is where we disagree.. and the heart of the conflict. It's a cheap, wanna-be Jedi mind trick to try and convince sheep that IQ can't be quantified, when in reality it surely can.

Main Entry: quan·ti·fy
Pronunciation: -"fI
Function: transitive verb
Inflected Form(s): -fied; -fy·ing
Etymology: Medieval Latin quantificare, from Latin quantus how much
Date: circa 1840
1 a (1) : to limit by a quantifier (2) : to bind by prefixing a quantifier b : to make explicit the logical quantity of
2 : to determine, express, or measure the quantity of


When you understand the difference between quantity and quality, rejoin the discussion. Otherwise, please produce an objective procedure which can quantify image quality.

I'll also be waiting for your Movie-Review-o-Matic algorithm which can automatically and objectively assign numerical ratings to any movie, dispensing with the Academy and Roger Ebert, so that we will have absolute mathematical proof that Citizen Kane is better than Freddy Got Fingered.
 
DemoCoder said:
Main Entry: quan·ti·fy
Pronunciation: -"fI
Function: transitive verb
Inflected Form(s): -fied; -fy·ing
Etymology: Medieval Latin quantificare, from Latin quantus how much
Date: circa 1840
1 a (1) : to limit by a quantifier (2) : to bind by prefixing a quantifier b : to make explicit the logical quantity of
2 : to determine, express, or measure the quantity of


When you understand the difference between quantity and quality, rejoin the discussion.

I'm sure there must be some far-out statistical way of quantifying the qualitative feeling of a subjective experience in an objective manner... :?
 
DemoCoder said:
Perhaps the other suggestion is better: equalize the framerates, compare the IQ. E.g. @ 1600x1200 @ ~60fps, who's got the best IQ.
Nice try .. but failed again. Then you get people of kind E, who come claiming that 60fps is not the framerate of choice (it might not be the IQ/speed sweet spot for vendor A or B's HW) and say that the average gamer is not able to tell the difference between 40fps and 60fps, or the other way around.. whichever is the trend of the day.
Still, such a section in a comparison/shootout would be interesting to see, though. i.e. take app N:
~30fps,1024 best IQ screenshots for card A & B
~30fps,1600 best IQ screenshots for card A & B
~60fps,1024 best IQ screenshots for card A & B
~60fps,1600 best IQ screenshots for card A & B

But even then you get people who come claiming that for card A you didn't use the best settings (one selected higher-level AA over AF, or turned off trilinear, etc.) whereas the person in question would have opted for different settings.
 
Guys, thanks for the vote of confidence. I had already approached Dave about returning to B3D, and his first and only suggestion was for me to work on an R300 IQ article; I gave him a sample of my 9700Pro-vs-GF4Ti4600 performance + IQ shootout that was originally due to appear at VE. We have both agreed that I can expand on the article (the original was too short for my liking due to VE's word count limitation), something which I haven't done due to the past Xmas-to-New-Year holiday-feeling period.

I'll get started on that once I receive a reply from Dave about a reverend@beyond3d email request. Dave, you didn't reply to my PM of such a request...

I'm disinclined to re-do my original VE 9700Pro review for it to appear at B3D (if I'm officially accepted), something which should be done (re-doing it, that is) but would basically mean I'll have to re-write the entire review, something I view as tiresome.

To be honest, however, I'm just simply currently enjoying my time of not having to work at any site for the moment... the 9700Pro really is a lovely card to game with and my current fav-game is 007 Nightfire (a bit too much NOLF-ish but that's a good thing!). 1280x1024x32bit + 6xAA + 16xAF running extremely well is a very nice gaming feeling.
 
Democoder,

Why do you think that antialiasing image quality isn't quantifiable?

For instance, I could make the assumption that AA error ~ AA IQ.

From there it should be straightforward to take something like the following:

sqrt((ideal - actual)^2) at each sample point in Fourier transform space. Numerical integration should be possible here.

I could even throw in a human bias kernel, preferentially adding a larger measure to certain trouble spots in the spectrum (like near horizontal and vertical lines).

That's one aspect of antialiasing image quality; of course there are others (like, say, intensity differences between ideal and actual, gamma correction, etc. Another error measure could be invented there, etc. etc.).
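
To make the idea concrete, here's a rough sketch of what I mean (Python/numpy, purely illustrative - the aa_error name and the weighting kernel are just stand-ins I made up, not anything any vendor or tool actually uses):

Code:
import numpy as np

def aa_error(ideal, actual):
    # Hypothetical AA error metric: weighted RMS difference in Fourier space.
    # ideal/actual are equal-sized greyscale arrays (reference render vs. HW output).
    f_ideal = np.fft.fftshift(np.fft.fft2(ideal))
    f_actual = np.fft.fftshift(np.fft.fft2(actual))

    # Squared error at each sample point of the spectrum.
    err = np.abs(f_ideal - f_actual) ** 2

    # Illustrative "human bias" kernel: weight frequencies near the horizontal
    # and vertical axes more heavily, where edge aliasing tends to be most visible.
    h, w = ideal.shape
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    weight = 1.0 + 2.0 * np.exp(-50.0 * np.minimum(np.abs(fx), np.abs(fy)))

    # "Numerical integration" over the spectrum, then the square root.
    return np.sqrt(np.sum(weight * err) / err.size)

A lower score would mean the hardware output sits closer to the reference, at least in this purely numerical sense.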
 
I suggested such a method earlier (create ideal software rasterizer, compare HW against idealized scene, calculate error)

But an RMS of the difference won't take into account psycho-perceptual qualities in how we perceive the distribution of that error. For example, in dithering algorithms and image quantizers we try to achieve error dispersion (e.g. Floyd-Steinberg). In AA, a fixed jittered grid might have the same RMS as truly stochastic AA, but the stochastic version might be perceived as better because the error distribution isn't a fixed pattern throughout the scene. Edge aliasing is perceived differently than texture aliasing, so those distributions might have to be weighted differently as well.


I just don't think any reliable algorithmic method can be used, which is why groups like ISO/MPEG/JPEG use human trials for audio and video codecs, because something like an RMS of the error just doesn't fully encapsulate what we perceive as quality difference.
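
A quick, hand-waving illustration of the point (synthetic numbers, nothing measured from any actual card): the two error fields below have identical RMS, but one concentrates the error into a fixed repeating pattern while the other disperses it as noise, and most viewers would not judge them equal.

Code:
import numpy as np

rng = np.random.default_rng(0)
h, w = 256, 256

# Error field A: all of the energy concentrated in a fixed, regular grid of artefacts.
pattern = np.zeros((h, w))
pattern[::8, ::8] = 1.0

# Error field B: the same energy dispersed as unstructured noise, scaled to match RMS.
noise = rng.standard_normal((h, w))
noise *= np.sqrt((pattern ** 2).mean() / (noise ** 2).mean())

def rms(e):
    return np.sqrt((e ** 2).mean())

print(rms(pattern), rms(noise))  # numerically identical, perceptually very different

An RMS-style metric reports the same number for both, which is exactly why the perceptual weighting (and ultimately human trials) matters.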
 
DemoCoder said:
This is where we disagree.. and the heart of the conflict. It's a cheap, wanna-be Jedi mind trick to try and convince sheep that IQ can't be quantified, when in reality it surely can.

Main Entry: quan·ti·fy
Pronunciation: -"fI
Function: transitive verb
Inflected Form(s): -fied; -fy·ing
Etymology: Medieval Latin quantificare, from Latin quantus how much
Date: circa 1840
1 a (1) : to limit by a quantifier (2) : to bind by prefixing a quantifier b : to make explicit the logical quantity of
2 : to determine, express, or measure the quantity of


When you understand the difference between quantity and quality, rejoin the discussion. Otherwise, please produce an objective procedure which can quantify image quality.

I'll also be waiting for your Movie-Review-o-Matic algorithm which can automatically and objectively assign numerical ratings to any movie, dispensing with the Academy and Roger Ebert, so that we will have absolute mathematical proof that Citizen Kane is better than Freddy Got Fingered.

"quantity: The measurable, countable, or comparable property or aspect of a thing"

Image quality isn't comparable? If it is, your sidetrack doesn't hold water. If it is not, please illustrate. To me it looks like you are attempting an end run around the argument by playing word games, and the definitions of the words you are using don't support it. Which, to me, would mean that text of yours is what semantic dishonesty looks like; some people seem to think that any "semantic" discussion is an attempt at dishonesty. Or is it only when other people discuss semantics that it is dishonest, never mind the content or purpose of the discussion?

As for your movie review analogy, people are not arguing for a "Movie-Review-o-Matic"; they are arguing for efforts akin to "the Academy and Roger Ebert". I'm not sure where you are going with that; the analogy doesn't seem to hold together.
 
I'll also be waiting for your Movie-Review-o-Matic algorithm which can automatically and objectively assign numerical ratings to any movie, dispensing with the Academy and Roger Ebert, so that we will have absolute mathematical proof that Citizen Kane is better than Freddy Got Fingered.

Thank you for quantifying just how much BS someone is willing to shovel just to get around a known fact concerning comparisons of quality. That quantity of shovel fodder is obviously quite large.

So then, by your basis of judgement, there is no quantifiable difference possible between these two images:
q2comp.txt


And albeit to a narrower scale of quantifiable difference than shown above, it is impossible to quantify the quality difference between:
comp2.txt


And no, nobody has to consult Roger and Ebert to point out which of the pairs of images above has a quantifiable improvement between the two. The amount is of no concern. The general hypothesis of A > B can be derived. No review board of scholars is needed. No consulting of 500X magnified pixels is needed.. and no delta image analysis tools are required. Any "average Joe" walking off the street can deduce the same opinion concerning quality.

It is a pretty desperate and dire tactic to simply truncate all noise into nullspace, regardless of how much detail is lost in the noise, when it comes to quantifying image quality.

It is absurd, if not downright laughable, to suggest linear physics or quantum mechanics are needed to discern such sizeable, quantifiable differences in IQ when the delta is so large that there is no need for intense scrutiny.

So A > B to the entire species, except when IHV = DC's favorite, at which point it's time to stipulate that IQ is obviously subjective and that discerning quantifiable deltas requires rocket science and therefore shouldn't be brought to the table.
 
DemoCoder said:
I suggested such a method earlier (create ideal software rasterizer, compare HW against idealized scene, calculate error)

But an RMS of the difference won't take into account psycho-perceptual qualities in how we perceive the distribution of that error. For example, in dithering algorithms and image quantizers we try to achieve error dispersion (e.g. Floyd-Steinberg). In AA, a fixed jittered grid might have the same RMS as truly stochastic AA, but the stochastic version might be perceived as better because the error distribution isn't a fixed pattern throughout the scene. Edge aliasing is perceived differently than texture aliasing, so those distributions might have to be weighted differently as well.


I just don't think any reliable algorithmic method can be used, which is why groups like ISO/MPEG/JPEG use human trials for audio and video codecs, because something like an RMS of the error just doesn't fully encapsulate what we perceive as quality difference.

Right, and besides, what would be more noticeable - one glaring pixel error every 100 pixels, or 10 pixel errors that are barely noticeable in a 700x zoomed-in shot every 100 pixels?
If you try to mathematically quantify image quality, your algorithm will have points where it would say the error is "equal"... when one looks much better.

Nice try .. but failed again. Then you get people of kind E, who come claiming that 60fps is not the framerate of choice (it might not be the IQ/speed sweet spot for vendor A or B's HW) and say that the average gamer is not able to tell the difference between 40fps and 60fps, or the other way around.. whichever is the trend of the day.
Still, such a section in a comparison/shootout would be interesting to see, though. i.e. take app N:
~30fps,1024 best IQ screenshots for card A & B
~30fps,1600 best IQ screenshots for card A & B
~60fps,1024 best IQ screenshots for card A & B
~60fps,1600 best IQ screenshots for card A & B

But even then you get people who come claiming that for card A you didn't use the best settings (one selected higher-level AA over AF, or turned off trilinear, etc.) whereas the person in question would have opted for different settings.

As for this, sorry, it's not a failure just because someone else thinks the sweet spot is in a different place.
At least you are basing your subjective comparison upon an actual equality of speed, instead of trying to base speed upon some meaningless (subjective) IQ equality.

You can't argue that 60 fps avg != 60 fps avg.
Your baseline for comparison needs to be something that is NOT subjective - IQ, unfortunately, is.
 
On "measuring image quality."

I pretty much agree with DemoCoder that trying to "objectively measure" image quality is pretty much impossible. For the sole reason that no matter what type of "error" calculation one might come up with, it cannot tell the whole story. Two images might have similar "absolute error" values when compared to some ideal reference image, but the images can still have the error distributed differently. And one person might prefer one image over the other, and vice versa.

That being said, I fully agree with Walt and Shark that the reviewer's subjective opinion on relative image quality is imperative.

Going with the movie review analogy....

There are two different movies, A and B, that I want to see. I can only see one of them. So I take a quick look at the "star" ratings of each movie from 5 different reviewers:

A few things can happen:

1) Movie A universally scores higher than B.
2) Movie B universally scores higher than A.
3) Opinions are mixed.

If number one or two happens, I'll go and see the relevant movie, without even reading many of the reviews in detail.

If number 3 happens, that's when, as the consumer, I check the reviews in detail, to see why some reviewers scored each movie the way that they did. Some reviewers may have scored movie B low, for example, for some reason that I don't care about. I try to find the reviewers that based their scores on the things that I personally find important.

Now, here's the real important part.

After reading several reviews over several weeks, and watching some of the movies they rate, I get to "know" each of the reviewers. I find myself tending to agree with the "analysis" of one or two of them more consistently than not. I find that these two particular reviewers' subjective analysis is typically in line with my own.

So, now, instead of looking at many, many reviews each week... I only look at the one or two. I have a high degree of confidence that what they like, I will also like. No guarantees of course.

And this is where 3D card product reviews should be headed, IMO.

Each "reviewer" (Anand, Tom, Rev, Matt...) should be supplying subjective image qualty opinions (and overall subjective product opinions) backed up with numbers / screeshots to support their opinions. The point is NOT for every reviewer to come up with the "same" answer.

The point is that the reviewers should start to build up some confidence with their readers. Not only that the objective numbers are "correctly" measured and presented, but that the subjective opinions have a basis that the readers can identify with.

Each reviewer may have different personal feelings about "what is most important." And hopefully, that stays consistent over all of their product reviews. And there lies the current problem. What many "fan sites" (and some purported 'neutral sites') seem to find most important is "who makes the card." :devilish: Depending on the manufacturer, "what is more important" can change from review to review. Is it top speed with reduced image quality? Top quality with a minimum FPS? How fast it can run next-gen feature tech demos?

So, what needs to happen is:
1) Reviewers start to include subjective opinions, and provide reasons for those opinions.
2) Do it in a consistent manner.

If that happens, we'll still end up with different reviewers "liking" different products, but then we can all start to gravitate toward the reviewers that more often than not provide reasonable answers.

And the reviewers that most often resonate with the populace, should become the more popular reads....
 
You can't argue that 60 fps avg != 60 fps avg

No, but there are also problems with trying to compare on the basis of identical frame rate:

1) Just as it can be kinda "impossible" to have identical image quality as a starting point, it can be impossible to have identical FPS. What if you are at max quality, and the closest you can get is one card getting 70 FPS, and the other card getting 78 FPS? And if you decrease the quality "one notch" of the 70 FPS card, you get 85 FPS. How do you compare them?

2) You can't guarantee that 60 FPS avg = 60 FPS avg. The instantaneous FPS profile is likely different.
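
To put some (made-up) numbers on that: both runs below average roughly 60 FPS over the same six frames, but one of them has a hitch any player would feel.

Code:
# Hypothetical frame times in milliseconds; both runs sum to the same total.
frame_times_a = [16.7, 16.7, 16.7, 16.7, 16.7, 16.7]   # perfectly even pacing
frame_times_b = [10.0, 10.0, 10.0, 10.0, 10.0, 50.2]   # same average, one big hitch

for name, times in (("A", frame_times_a), ("B", frame_times_b)):
    avg_fps = 1000.0 * len(times) / sum(times)
    worst_fps = 1000.0 / max(times)
    print(f"run {name}: avg {avg_fps:.1f} fps, worst frame {worst_fps:.1f} fps")

Both print an average of about 60 FPS, but run B's worst frame is under 20 FPS - the average alone tells you nothing about that.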
 
The most irritating part about the reviews from the big sites is their FSAA comparisons back when ATI was using supersampling, with no regard to IQ (two different implementations, yet benchmarked according to the slider setting)..

which of course is a joke...

Quincunx was the real laugh: blur vs. texture filtering on supersampling. Then they post nice graphs that show multisampling smoking supersampling, not once thinking "hey, I should raise the FSAA level to try to match the IQ of the supersampling card".

I disagree with anyone who thinks a baseline for IQ could not be set: you set a resolution with a desired frame rate and start playing with the settings for good IQ and performance.

Say 1024x768 at 100 fps on one DX title and one OGL title - tweak whatever it takes to match the IQ (LOD, AF, FSAA), and once that's initially set (it could be very time-consuming initially), the party can start with benchmarking, never touching those settings again.
Common sense needs to come into play here; everyone's taste is slightly different, but generally we all want good texture quality (no blur), good AA and good depth perception (AF).
 
Joe DeFuria said:
And the reviewers that most often resonate with the populace, should become the more popular reads....

Ideally, yes. But I think that we're all old enough to realize that popularity does not equal "truth". In this day and age, marketing has reached the point where corporations have no small bearing on what may or may not become "popular".
 
Going with the movie review analogy....

There are two different movies, A and B, that I want to see. I can only see one of them. So I take a quick look at the "star" ratings of each movie from 5 different reviewers

I don't think the "movie review" analogy applies to the variances that I am describing.

They aren't matters of opinion or subjectively arguable.. or at least not with any amount of common sense or logic.

If you were to use the "movie review" analogy- it would be:
Situation A) Air conditioned theatre with *some* form of movie playing (be it bad or good)
Situation B) Sitting in an empty theatre that is 120 degrees with a ripped-up canvas screen, yet the movie projector is never turned on nor is a movie being shown. You sit there in the sweltering heat for 2 hours and the ushers finally kick you out.

The thing is, those who might suggest the "movie" in situation (b) could arguably be superior would just simply be reminded that there was no movie shown. There was no cast of actors, no foreshadowing, no interesting camera angles or lighting, no symbolic meaning or plot. Nada.

There are some things in the world that are unanimously objective to declare as devoid or null. There is no subjective analysis needed to point these out.

We have "sources" willing to compare as "comparable" (due to this whole "subjective" nonsense theorem) images of smooth and correct color, smooth edges, near perfect alpha blends, sharp and correct texture translation/mapping versus absolutely horrible banding, absurd display errors, missing geometry from z-buffer problems, and aliasing that makes the whole scene look like a sparkler on the 4th of July.

IQ can indeed be subjective... to a point. The problem is when that point is stretched beyond the surreal in order to maintain equivalence in favor of one IHV versus another, or otherwise allow completely apples-vs.-oranges comparisons that would obviously draw unanimous dissent from anyone.. well, other than Stevie Wonder or Helen Keller.
 