Rating = (PC x MACh) x [((MACh / FACh) - 1) x (MPPD / FPPD) + 1]
PC = Member's postcount
MACh = Member's average characters per post (post length)
FACh = Forum's average characters per post (post length)
MPPD = Member's average number of posts per day
FPPD = Forum's average number of posts per day (per user)
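For anyone who wants to try the numbers themselves, here is a minimal Python sketch of the equation exactly as written above. The function name, argument names, and sample values are made up for illustration; they are not from any forum software.

```python
def rating(pc, mach, fach, mppd, fppd):
    """Rating = (PC x MACh) x [((MACh / FACh) - 1) x (MPPD / FPPD) + 1]"""
    total_chars = pc * mach                         # (PC x MACh)
    relative_length = (mach / fach) - 1             # ((MACh / FACh) - 1)
    relative_rate = mppd / fppd                     # (MPPD / FPPD)
    return total_chars * (relative_length * relative_rate + 1)


# Hypothetical member: 100 posts averaging 400 characters, 2 posts per day,
# on a forum averaging 200 characters per post and 1 post per day per user.
print(rating(pc=100, mach=400, fach=200, mppd=2, fppd=1))
```

The sketch assumes FACh and FPPD are non-zero, since both are used as divisors.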
Now, my rationale behind the equation:
(PC x MACh) x [((MACh / FACh) - 1) x (MPPD / FPPD) + 1]
(PC x MACh) This is simply the product of number of posts and average characters per post giving the total number of characters posted by a given user. I'll call this "bulk volume."
(PC x MACh) x [((MACh / FACh) - 1) x (MPPD / FPPD) + 1]
(MACh / FACh) This is the ratio of a member's average characters per post to the forum's average characters per post. The result is greater than one if a user is above average in post length, and less than one if they are below average. Subtracting 1 from this term gives the fraction by which the member sits above or below the forum average (0.25 means 25% above, -0.25 means 25% below). Therefore, above average members are positive, below average members are negative. This is important. I'll call this the "post rating."
(PC x MACh) x [((MACh / FACh) - 1) x (MPPD / FPPD) + 1]
(MPPD / FPPD) This is the ratio of a member's posts-per-day average to the forum's posts-per-day (per user) average. A value greater than one means they post more often than the average user, less than one means less often. Call this their "post frequency."
(PC x MACh) x [[red]((MACh / FACh) - 1) x (MPPD / FPPD)[/red] + 1]
((MACh / FACh) - 1) x (MPPD / FPPD) This is the interesting part. Recall that the first term, the "post rating," can return a positive, zero, or negative value. Multiplying this by the member's relative "post frequency" has a most curious result, which I call the member's "spam rating."
If a member has an average post length roughly the same as the forum average, then the first term will be nearly zero. No matter how frequently or infrequently they post, the product of the two will remain close to zero.
If a member has an above average post length, then their "post rating" will be positive. If they post infrequently (relative to the forum average), then that positive number will be reduced (but still positive). If they post frequently, then that positive number will be increased accordingly.
If a member has a below average post length, then their "post rating" will be negative. If they post infrequently, it will become "less negative," and if they post frequently, it will become "more negative."
It is important to note that post frequency can't change a negative "post rating" into a positive "spam rating" or a positive "post rating" into a negative "spam rating." It can only increase or decrease the magnitude of the positive or negative value.
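The three cases can be checked with a quick sketch. This is only an illustration: the forum averages (200 characters per post, 1 post per day) and the member values are invented, and the function name is hypothetical. It assumes both posts-per-day figures are positive, which is why the sign of the "post rating" always carries through.

```python
def spam_rating(mach, fach, mppd, fppd):
    post_rating = (mach / fach) - 1    # sign says above or below average length
    post_frequency = mppd / fppd       # always positive for positive inputs
    return post_rating * post_frequency


FACH, FPPD = 200, 1.0  # made-up forum averages

for label, mach in [("short", 100), ("average", 200), ("long", 400)]:
    for mppd in (0.5, 1.0, 4.0):
        value = spam_rating(mach, FACH, mppd, FPPD)
        print(f"{label:8s} posts, {mppd} per day: spam rating {value:+.2f}")
```

Short posters come out negative at every posting rate, average posters stay at zero, and long posters stay positive; the posting rate only scales the size of the number.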
I think this part of the equation very nicely captures the essence of what defines "spam." Intuitively, if a member posts fairly long messages, then a high post rate indicates very useful activity in the forum, and their "spam rating" will be significantly greater than zero (good). If they're posting fairly short messages at a high rate, then it is likely a great many of them fall under the "spam" category, and their "spam rating" will be significantly less than zero (bad).
For members that have average post length, no rate of posting can affect their "spam rating," as it hovers near zero (as it should). For members that post infrequently (or have been around for ages), the effect of their average post length is reduced, and their "spam rating" is brought closer to zero.
The final term is the addition of a "1," which turns the "spam rating" into a multiplier, so the equation returns a percentage of the member's "bulk volume." For spammers, their "bulk volume" is reduced by their "spam rating," and for members who spend a great deal of time on posts their "bulk volume" is increased. Put simply, if you post short posts, then posting often hurts you. If you post long posts, then posting often helps you. The baseline is members with average post length, in which case post frequency doesn't matter, and their "bulk volume" is unmodified (as it should be for an average poster).
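To put numbers on that, here is a rough worked comparison in Python. All values are hypothetical: three members with 100 posts each, posting 1.5 times per day, on a forum that averages 200 characters per post and 1 post per day per user. The helper name is invented for the example.

```python
def rating(pc, mach, fach, mppd, fppd):
    bulk_volume = pc * mach
    spam_rating = ((mach / fach) - 1) * (mppd / fppd)
    return bulk_volume * (spam_rating + 1)


FACH, FPPD = 200, 1.0  # made-up forum averages

short_poster = rating(100, 100, FACH, 1.5, FPPD)    # bulk 10000, multiplier 0.25
average_poster = rating(100, 200, FACH, 1.5, FPPD)  # bulk 20000, multiplier 1.0
long_poster = rating(100, 400, FACH, 1.5, FPPD)     # bulk 40000, multiplier 2.5
print(short_poster, average_poster, long_poster)

# Same short poster at 4 posts per day: spam rating is -2, multiplier is -1.
heavy_short_poster = rating(100, 100, FACH, 4.0, FPPD)
print(heavy_short_poster)
```

One thing the last line shows: as the equation is written, a "spam rating" below -1 makes the multiplier negative, so a very frequent short poster ends up with a negative rating rather than just a reduced one.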