On Wed, Dec 14, 2005 at 12:01:19PM -0500, Eben King wrote:
> I put Spamassassin on my machine to deal with the spam issue (which is not
> severe, by "abuse@" standards, but still), and it doesn't seem to be getting
> better. Correction, the two-week average of "% spams caught" has gone from
> 38% two weeks ago to 44% now. There have been no false positives, for which
> I'm grateful, but I'd put up with a few false positives for a better kill
> rate. I got fed up with it, so I made a spreadsheet to see _how bad_ it
> was. It's at http://24.94.123.65:81/spam.xls . I've fed every missed spam
> to it by "| sa-learn --spam" from pine. What am I doing wrong? Shouldn't
> it be improving?
As mentioned elsewhere, you must train it on _both_ spam and ham. And
there's a lower limit below which it won't make any difference: by
default, SpamAssassin ignores its Bayes database until it has learned at
least 200 spam and 200 ham messages. But let me make a point here about
Bayesian spam filtering. Your spam/ham database will grow until it takes
over the whole internet! ;-}
Seriously, every time you feed it a spam or ham message to learn from,
the database grows. I've had a bogofilter file reach 18M before (of
course, I get 500-600 emails a day, 95% of which are spam). If you think
that doesn't slow things down, you're wrong. This is why I started
writing my own procmail spam recipes, in addition to using SpamAssassin
(without its Bayesian capabilities).
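To make the two points above concrete (a filter needs both classes to compare, and every learned message can add tokens to the database), here is a minimal Python sketch of a token-count filter. It is a toy under stated assumptions, not how SpamAssassin or bogofilter actually tokenize, store, or score mail; the class name TinyBayes and all of its methods are made up for illustration.

```python
import math
import re
from collections import Counter


class TinyBayes:
    """Toy token-count store (hypothetical; not SpamAssassin's or bogofilter's design)."""

    def __init__(self):
        # Per-class count of messages in which each token appeared.
        self.counts = {"spam": Counter(), "ham": Counter()}
        # Number of messages learned per class.
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        # Assumption: lowercase alphanumeric words are good enough tokens here.
        return set(re.findall(r"[a-z0-9$']+", text.lower()))

    def learn(self, text, label):
        """Record one message as 'spam' or 'ham'."""
        self.messages[label] += 1
        self.counts[label].update(self.tokens(text))

    def db_size(self):
        """Number of distinct tokens stored; this only ever grows with training."""
        return len(set(self.counts["spam"]) | set(self.counts["ham"]))

    def spam_probability(self, text):
        """Naive Bayes estimate, or None if either class is still untrained."""
        n_spam, n_ham = self.messages["spam"], self.messages["ham"]
        if n_spam == 0 or n_ham == 0:
            return None  # nothing to compare against: train on both spam and ham
        log_odds = math.log(n_spam / n_ham)
        for tok in self.tokens(text):
            # Add-one smoothing so unseen tokens do not produce zero probabilities.
            p_spam = (self.counts["spam"][tok] + 1) / (n_spam + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


if __name__ == "__main__":
    f = TinyBayes()
    f.learn("cheap pills buy now", "spam")
    print(f.spam_probability("cheap pills"))  # None: no ham learned yet
    f.learn("meeting notes for the list", "ham")
    print(f.spam_probability("cheap pills"), f.db_size())
```

With only spam learned, the sketch declines to score at all, which mirrors the advice to feed the filter ham as well. Calling db_size() after each learn() shows the store growing with every message that brings new tokens.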
Paul