Personally, I think this topic is very interesting.
It brings up a very specific contrast between optimizing for benchmarks and optimizing for game performance. That is, it may be better for 99.9% of all benchmarks to just allow the textures to overflow for one frame out of 1000. This way, the average framerate will be high most of the time, but there will be one or two frames of massive stuttering.
Instead, it may be possible with the exact same video card to always load a few textures over AGP memory, so that most frames are a little slower than before, but no single frame stutters massively. In the end, this second method may end up with a lower average framerate (which is typically the only number reported), but it will be much more playable.
This is why I feel there are three numbers that are important in determining how well a card performs:
1. Average framerate
2. Framerate deviation
3. Number of massive drops in framerate
This way one could encompass all aspects of performance: how fast it usually is, how much the framerate deviates from that speed, and how frequently there are significant performance issues.
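To make these concrete, here is a minimal sketch of how all three numbers could be computed from a log of per-frame render times. The function name, the millisecond input format, and the 100 ms cutoff for a "massive drop" are my own assumptions for illustration, not taken from any real benchmark tool.

```python
import statistics

def frame_metrics(frame_times_ms, drop_threshold_ms=100.0):
    """Summarize per-frame render times (in milliseconds) into the three
    numbers above. The 100 ms drop threshold (anything slower than 10 fps)
    is an arbitrary choice for illustration."""
    total_s = sum(frame_times_ms) / 1000.0
    per_frame_fps = [1000.0 / t for t in frame_times_ms]
    return {
        # 1. Average framerate, the way benchmarks usually report it:
        #    total frames divided by total time.
        "average_fps": len(frame_times_ms) / total_s,
        # 2. Framerate deviation: standard deviation of the
        #    instantaneous (per-frame) framerate.
        "fps_std_dev": statistics.stdev(per_frame_fps),
        # 3. Number of massive drops: frames slower than the threshold.
        "massive_drops": sum(1 for t in frame_times_ms if t > drop_threshold_ms),
    }

# 999 smooth frames at ~16.7 ms (about 60 fps) plus one 500 ms stutter,
# versus a steady 17.5 ms per frame with no stutter at all.
stuttering = [16.7] * 999 + [500.0]
steady = [17.5] * 1000
print(frame_metrics(stuttering))
print(frame_metrics(steady))
```

On these made-up traces the stuttering run still reports the higher average (about 58 fps versus 57 fps), even though it is the one with the 500 ms hitch; only the deviation and the drop count expose the difference.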