Entropy: Hi,
You make a good observation; a number of the attributes are not independent in this data set. Things that otherwise wouldn't be good predictors are actually encoding information from other attributes (like your example with chipsets actually encoding information about cpu type/speed). This is actually the primary reason why I haven't been using Bayesian filtering: the naive form assumes the attributes are independent, so it can't deal with these kinds of problems.
To get around this, and to really tell how well the chipset predicts the score, you could first examine how well cpu and chipset predict the score individually (probably along with memory speed!), and then see how well they predict it together. From that you should be able to see how much redundant information is encoded in each attribute.
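A minimal sketch of that comparison, using only numpy. The data here is synthetic and the attribute names (cpu, chipset, score) are just stand-ins for the real benchmark columns; the point is that if the individual R^2 values sum to much more than the joint R^2, the two attributes are carrying redundant information:

```python
import numpy as np

def r2(X, y):
    # Ordinary least squares with an intercept; return R^2 on the training data.
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n = 200
cpu = rng.normal(size=n)
# Hypothetical: the chipset attribute mostly echoes cpu speed,
# plus a small independent contribution of its own.
chipset = 0.8 * cpu + 0.2 * rng.normal(size=n)
score = 2.0 * cpu + 1.0 * chipset + rng.normal(scale=0.5, size=n)

r2_cpu = r2(cpu.reshape(-1, 1), score)
r2_chip = r2(chipset.reshape(-1, 1), score)
r2_both = r2(np.column_stack([cpu, chipset]), score)

# If the attributes were independent predictors, r2_cpu + r2_chip
# would be close to r2_both; the excess is a rough redundancy measure.
redundancy = r2_cpu + r2_chip - r2_both
```

With the correlated data above, each attribute looks like a strong predictor on its own, but adding the second one barely improves the joint fit, so `redundancy` comes out well above zero.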
As for prediction, I'm really curious how a multivariate regression model would work out. Unfortunately my statistics background is pretty limited, so I need to take some more classes or at least study up on things a bit more. I'm actually wondering if M5 Rules is doing something like this (I believe you can approximate multivariate regression by using a number of linear regression models in tandem?)
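That's roughly the idea behind M5: it builds a tree that partitions the data and fits a separate linear model in each region, so many simple linear models combine into one nonlinear predictor. A toy sketch of that principle, with a made-up benchmark where the score responds differently below and above a split point:

```python
import numpy as np

def fit_line(x, y):
    # Least-squares intercept and slope for a single attribute.
    A = np.column_stack([np.ones_like(x), x])
    (b0, b1), *_ = np.linalg.lstsq(A, y, rcond=None)
    return b0, b1

rng = np.random.default_rng(2)
x = rng.uniform(-2, 2, 400)
# Hypothetical: gpu-bound regime below 0 (shallow slope),
# cpu-bound regime above 0 (steep slope).
y = np.where(x < 0, 1.0 + 0.5 * x, 1.0 + 3.0 * x)
y = y + rng.normal(scale=0.1, size=400)

# One global linear model over everything...
b0, b1 = fit_line(x, y)
global_err = np.mean((y - (b0 + b1 * x)) ** 2)

# ...versus two local linear models split at x = 0 (M5 picks such
# splits automatically; here the split is hard-coded for illustration).
lo = x < 0
b0l, b1l = fit_line(x[lo], y[lo])
b0h, b1h = fit_line(x[~lo], y[~lo])
pred = np.where(lo, b0l + b1l * x, b0h + b1h * x)
local_err = np.mean((y - pred) ** 2)
```

The piecewise fit tracks the kinked data far better than the single line, which is why model trees can handle benchmarks that switch between cpu-bound and gpu-bound behavior.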
Something I've noticed is that across the benchmarks I've studied (UT2003, SS:SE, Quake3, RTCW, 3DMark01, 3DMark03), some are cpu-dependent and others are gpu-dependent. Memory speed plays into things too, and of course the various quality settings have an effect. When I have time, it would be nice to go through and exhaust all possible subsets of the 15 attributes (2^15 of them) and see when certain attributes start becoming less important at predicting the score. This should help tell when information is becoming redundant, and which attributes really are the good predictors.
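A small sketch of that subset search on synthetic data, again with numpy only. Here only four hypothetical attributes are used to keep it fast (the full 2^15 enumeration is the same loop); the score truly depends on two of them, and a redundant chipset column echoes cpu:

```python
import itertools
import numpy as np

def r2(X, y):
    # Least-squares fit with intercept; R^2 on training data.
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())

rng = np.random.default_rng(1)
n = 300
# Hypothetical attributes: cpu, gpu, mem, chipset.
X = rng.normal(size=(n, 4))
X[:, 3] = 0.9 * X[:, 0] + 0.1 * rng.normal(size=n)  # chipset echoes cpu
score = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

# For each subset size, record the best-scoring attribute subset.
best = {}
for k in range(1, 5):
    for cols in itertools.combinations(range(4), k):
        s = r2(X[:, list(cols)], score)
        if s > best.get(k, ((), -1.0))[1]:
            best[k] = (cols, s)
```

Plotting the best R^2 against subset size shows a big jump going from one attribute to two, then a plateau: once the genuinely informative attributes are in, the redundant ones stop adding anything, which is exactly the signal you'd look for in the real data.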
Nite_Hawk