New way of presenting benchmark results possible?

I always thought a histogram would be nice, like the ones Storage Review uses for access time, but with some modifications. The bars would be coloured by frame-rate range: blue for above 60 fps, yellow for between 30 and 60, and red for below 30.
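
A rough sketch of that colouring in Python, in case it helps. The function names, the 10 fps bin width, and the choice to colour each bin by its lower edge (so a sample at exactly 60 fps lands in a blue bin) are assumptions made for illustration only:

```python
from collections import Counter

def colour_for_bin(bin_start):
    """Colour bands from the post: blue from 60 up, yellow for 30 to 60, red below 30.
    Assumes the 30 and 60 fps limits fall on bin edges."""
    if bin_start >= 60:
        return "blue"
    if bin_start >= 30:
        return "yellow"
    return "red"

def coloured_histogram(fps_samples, bin_width=10):
    """Return (bin_start, count, colour) tuples sorted by bin_start."""
    counts = Counter((int(f) // bin_width) * bin_width for f in fps_samples)
    return [(start, n, colour_for_bin(start)) for start, n in sorted(counts.items())]

# Text rendering, one '#' per sample in the bin.
for start, n, colour in coloured_histogram([25, 31, 45, 58, 61, 64, 72]):
    print(f"{start:3d}-{start + 10:<3d} {'#' * n:<4s} {colour}")
```

The same tuples could be handed to any plotting tool; the text output is only there to keep the sketch self-contained.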

But now that I've seen the box plot, I think it would be awesome if you could plot a histogram by either colour or opacity. You could even superimpose a line or box for anyone who doesn't care about histograms.

Can FRAPS be adjusted so that it could take data about every tenth of a second? I'm guessing it'll have to do a bit of interpolation to get the timing right, or do a 1-second running average which would also be sort of nice, actually.
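
For what it's worth, if you had a timestamp for every frame, the 1-second running average sampled every tenth of a second could be computed after the fact. This is only a sketch under that assumption: `frame_times` is a hypothetical list of frame completion times in seconds, not a format FRAPS is known to produce, and it counts frames in a sliding window rather than interpolating.

```python
def running_fps(frame_times, window=1.0, step=0.1):
    """frame_times: sorted timestamps (seconds) at which each frame finished.

    Returns (t, fps) pairs, one every `step` seconds, where fps is the
    number of frames finished in the interval (t - window, t] divided
    by the window length.
    """
    samples = []
    if not frame_times:
        return samples
    eps = 1e-9          # tolerance for floating-point timestamps
    lo = hi = 0
    k = 0
    while True:
        t = window + k * step
        if t > frame_times[-1] + eps:
            break
        # Advance hi past every frame finished at or before t.
        while hi < len(frame_times) and frame_times[hi] <= t + eps:
            hi += 1
        # Advance lo past every frame that has left the window.
        while lo < hi and frame_times[lo] <= t - window + eps:
            lo += 1
        samples.append((round(t, 6), (hi - lo) / window))
        k += 1
    return samples

# A steady 50 fps for two seconds gives 50.0 at every sample point.
steady = [(i + 1) / 50 for i in range(100)]
print(running_fps(steady)[:3])
```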
 
PP said:
But tell me if I'm wrong, because maths isn't my strong point. You are taking into account FPS both above and below the average, right? My method only takes into account FPS below a given point (which can't be the average, because other cards' results at the same resolution will have their own averages), simply because the interesting thing is what falls below it, not everything. Did I explain it correctly? (Sorry for my English.)
Make sure you read Xmas's point.

For the sequence [61, 61, 61, 61, 31], you get avg 55, and avg fall of 29.

For the sequence [64, 64, 58, 58, 31], you get avg 55, and avg fall of 11.

If you really want to use this idea, then when you do your average, take the sum of all the drops below X (X minus the fps, for each frame under X) and divide by the total number of frames (including those over 60 fps). The only problem is that your numbers won't be big enough for a dramatic effect. With X = 60, the first sequence above will give you 5.8, and the second 6.6, and this is when one fifth of the frames are at 31 fps. Of course you can do some arbitrary scaling and call it your "drop index".
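
The arithmetic above can be checked with a few lines of Python. This is just a sketch: the function names are made up for illustration, and the 60 fps threshold is inferred from the numbers quoted rather than stated anywhere.

```python
def average(fps):
    return sum(fps) / len(fps)

def average_fall(fps, threshold=60):
    """Mean drop below the threshold, averaged over the slow frames only."""
    drops = [threshold - f for f in fps if f < threshold]
    return sum(drops) / len(drops) if drops else 0.0

def drop_index(fps, threshold=60):
    """Sum of drops below the threshold, divided by ALL frames."""
    return sum(threshold - f for f in fps if f < threshold) / len(fps)

for seq in ([61, 61, 61, 61, 31], [64, 64, 58, 58, 31]):
    print(seq, average(seq), average_fall(seq), drop_index(seq))
```

Both sequences average 55, but the average fall is 29 for the first and 11 for the second, while the drop index comes out at 5.8 and 6.6.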
 
example 1:
Average : 55
Standard Deviation : 12
Median : 61
Min : 31
Max : 61

example 2:
Average : 55
Standard Deviation : 12.296
Median : 58
Min : 31
Max : 64

The Median seems a good replacement for Min/Max. (Xmas's idea.)
So Average, Standard Deviation & Median should be enough.
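
A minimal sketch of those three figures using Python's standard `statistics` module. The `summarise` helper is a made-up name; the population form of the standard deviation (`pstdev`, dividing by N) is used because it is the one that matches the 12 and 12.296 quoted for the two examples.

```python
import statistics

def summarise(fps):
    """Average, standard deviation and median for a list of FPS samples."""
    return {
        "average": statistics.mean(fps),
        "stdev": statistics.pstdev(fps),   # population form (divide by N)
        "median": statistics.median(fps),
    }

example_1 = summarise([61, 61, 61, 61, 31])
example_2 = summarise([64, 64, 58, 58, 31])
print(example_1)
print(example_2)
```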

Counting frames slower than a given threshold can be interesting, but then you need to have each and every frame duration and give the number as '% of frames under threshold'.
(I believe it's what you are trying to get back from what FRAPS gives you.)
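
As a sketch of the '% of frames under threshold' idea, assuming you already have a list of per-frame durations in milliseconds (`frame_ms` is a hypothetical input here, not something FRAPS is known to hand over in this form). A frame counts as "under" a threshold of N fps when it took longer than 1000/N ms. The second function weights by time instead of by frame count, which is one possible variation.

```python
def percent_frames_below(frame_ms, threshold_fps):
    """Percentage of frames whose instantaneous rate is under threshold_fps."""
    limit_ms = 1000.0 / threshold_fps
    slow = sum(1 for d in frame_ms if d > limit_ms)
    return 100.0 * slow / len(frame_ms)

def percent_time_below(frame_ms, threshold_fps):
    """Percentage of total run time spent in frames under threshold_fps."""
    limit_ms = 1000.0 / threshold_fps
    slow_time = sum(d for d in frame_ms if d > limit_ms)
    return 100.0 * slow_time / sum(frame_ms)

durations = [10, 10, 10, 50]   # three fast frames and one 50 ms hitch
print(percent_frames_below(durations, 30))   # 25.0
print(percent_time_below(durations, 30))     # 62.5
```

The gap between the two numbers is the point: one slow frame in four is 25% of the frames but most of the elapsed time.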
 
Ingenu said:
Counting frames slower than a given threshold can be interesting, but then you need to have each and every frame duration and give the number as '% of frames under threshold'.
(I believe it's what you are trying to get back from what FRAPS gives you.)

FRAPS doesn't work very well for this.

But if you have the frame duration for every frame you can do things like this:



Sorry for the German, but it is from a German print magazine. I hope that most parts are understandable.
 
Understandable to me, yes, but I should imagine the average consumer will look at those bar charts and be misled into thinking that the 6800 Ultra and the X850 XTPE outperform the 7800 GTX (by virtue of the notion that bigger bars = more performance).
 
Neeyik said:
Understandable to me, yes, but I should imagine the average consumer will look at those bar charts and be misled into thinking that the 6800 Ultra and the X850 XTPE outperform the 7800 GTX (by virtue of the notion that bigger bars = more performance).

The magazine included a description of how to read these new "Spielbarkeits" (playability) charts: green bars are good, red bars are bad, and so on. Maybe we have the advantage that readers of print magazines normally read the text and don't only look at the bars.
 
Demirug said:
Sorry for the German, but it is from a German print magazine. I hope that most parts are understandable.
That's quite handy. A simplified histogram such as that seems an acceptable compromise between bulky timeplots and simple averages. The minimums in that excerpt seem accurate, too, and considering the MSRP comparison, you've got pretty much everything you need to determine which card is worth what (to you).

Well, except for the IQ comparison, obviously. In a print mag, you'll probably have to just trust the reviewer, whereas online you can (to varying degrees) verify. Then again, even online it's up to the reviewer to communicate aliased textures and possibly MIP map boundaries. (I just played through Halo and my 9800P's default bilinear spec map blends were horribly distracting. Thankfully, I was able to force tri on all layers.)
 
Nice to see so many people interested in this discussion :smile:


For the sequence [61, 61, 61, 61, 31], you get avg 55, and avg fall of 29.

For the sequence [64, 64, 58, 58, 31], you get avg 55, and avg fall of 11.

If you really want to use this idea, then when you do your average, take the sum of all the drops below X (X minus the fps, for each frame under X) and divide by the total number of frames (including those over 60 fps). The only problem is that your numbers won't be big enough for a dramatic effect. With X = 60, the first sequence above will give you 5.8, and the second 6.6, and this is when one fifth of the frames are at 31 fps. Of course you can do some arbitrary scaling and call it your "drop index".


Yes, I think that could happen in some rare cases. I have changed my mind, though not because of those cases. My idea was to change the "drop index" for every test if needed (you can see it in the chart I posted), but that's not practical. My second attempt used the average as the drop index, with the result expressed as a % proportional to the average, so that results are comparable. That, or just the standard deviation (also proportional to the average), would be my bet: just an average and one of those values. Too much data could be a negative, as someone said, and two values can easily be represented in one bar. I'm against using min or max or anything derived from them.
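
One way to read "standard deviation proportional to the average" is the standard deviation expressed as a percentage of the average. The sketch below assumes that reading; `relative_deviation` is a made-up name, and the population form of the standard deviation is an arbitrary choice.

```python
import statistics

def relative_deviation(fps):
    """Population standard deviation as a percentage of the average."""
    return 100.0 * statistics.pstdev(fps) / statistics.mean(fps)

# Both sequences average 55 fps, so the percentages are directly comparable.
print(round(relative_deviation([61, 61, 61, 61, 31]), 1))
print(round(relative_deviation([64, 64, 58, 58, 31]), 1))
```

Because the result is a percentage of each card's own average, a fast card and a slow card can share the same scale.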
 
I have always advocated the use of cumulative distribution charts for the FRAPS data.
For those who aren't familiar, you can make one by taking the [H]-like FRAPS data, sorting the FPS numbers from lowest to highest, and then changing the x-axis from time units so that it spans 0-100%. From this, you can read the percentage of time the game is running below any given FPS. Also, it is much easier to tell if one card is faster than another, since their cumulative distribution curves would be offset, or one may be hitting low FPS more often, etc.

ERK
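
The recipe above can be sketched in a few lines of Python. It assumes the FPS samples are evenly spaced in time (for example one reading per second), so that a share of samples equals a share of time; the function names are illustrative only.

```python
from bisect import bisect_left

def cumulative_curve(fps):
    """Sort the samples and pair each with its position on a 0-100% x-axis."""
    ordered = sorted(fps)
    n = len(ordered)
    return [(100.0 * (i + 1) / n, f) for i, f in enumerate(ordered)]

def percent_below(fps, x):
    """Percentage of samples strictly below x fps, read off the sorted data."""
    ordered = sorted(fps)
    return 100.0 * bisect_left(ordered, x) / len(ordered)

samples = [64, 64, 58, 58, 31]
print(cumulative_curve(samples))
print(percent_below(samples, 60))   # 60.0
```

Plotting the pairs from `cumulative_curve` for two cards on the same axes gives the offset curves described above.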
 
ERK, that sounds like a good idea (and is basically what I meant when I wrote probability curve, p(fps < x), just with flipped axes ;)).
 
New charts! (It was mentioned here, but I thought it deserved a link here.) Intel's suggestions don't vary wildly from some of ours. The % below a threshold framerate seems to be a consensus* pick.

* The consensus may be just me, but it's just so darn sensible! ;-)
 
ERK said:
I have always advocated the use of cumulative distribution charts for the FRAPS data.
For those who aren't familiar, you can make one by taking the [H]-like FRAPS data, sorting the FPS numbers from lowest to highest, and then changing the x-axis from time units so that it spans 0-100%. From this, you can read the percentage of time the game is running below any given FPS. Also, it is much easier to tell if one card is faster than another, since their cumulative distribution curves would be offset, or one may be hitting low FPS more often, etc.

ERK
Can you repeat that, not only to me, but to the millions who generally buy video cards based on reviews offered by the majority of sites/printed media?
 
Intel's suggestions don't vary wildly from some of ours. The % below a threshold framerate seems to be a consensus* pick.

I think the same, but once I started to gather results I found that this method is less practical than just a standard deviation, for example.

Here's my review; I made charts with average and standard deviation:

http://www.gpumania.com/varticulo.php?indicea=21&pag=6

The 6800 Ultra's frame rates are less consistent in most cases, and the X1800 XL wins by a small margin in that respect, but the 7800 GT scores better on average FPS alone.

Such data is also interesting from another point of view: it is a simple way to see how stable the frame rates are in each game. From an (amateur) level designer's perspective, this is very useful.
 
This type of chart is hard to make, and I think the average will still be the most important benchmark result.

We could use video recordings to show the gaming experience :)
 
PP said:
I'm preparing a product review for my site (in Spanish), and I was thinking of a new way of presenting benchmark results. We have averages, max and min values. IMO, the max and minimum FPS that some sites show in their reviews are completely useless. Graphs of FPS logs like HardOCP's or PCPer's don't give you precise data, just the occasional headache when you try to find out which FPS averages are more consistent.

I did some investigation with Excel and a FRAPS FPS log, and made this (some figures aren't real, because I only have a 6800 GT now):

gpumania31fc.png


The top chart is an FPS average; the bottom chart is a "new" (at least I haven't seen it before) average FPS fall below a given FPS point (<60, <40, etc.). This is a way to show how solid the FPS are. Do you think I should use it?


Just stick with the normal, straightforward approach and all will be well. No need to make it overly complicated.
 
The charts aren't the hard part, or at least not the time-consuming one. Recording FPS with FRAPS and producing deviation data is what could make life worse for reviewers. I don't know if it's worthwhile, but we need new approaches to tell people which gaming experience is better (like Intel says :LOL:). That's what this thread is about, and some very interesting ideas have been presented here.
 