Re: [SLUG] Bogofilter and spam/ham files

From: Paul M Foster (paulf@quillandmouse.com)
Date: Tue Apr 01 2003 - 23:27:43 EST


On Tue, Apr 01, 2003 at 06:32:20PM -0500, Derek Glidden wrote:

> On Tue, 2003-04-01 at 18:07, Paul M Foster wrote:
> > On Tue, Apr 01, 2003 at 11:07:39AM -0500, Derek Glidden wrote:
>
> > > It's a good thing. The more words in each list, the more accurate it
> > > is.
> >
> > This I know. The problem is that at some point (now), it starts to slow
> > down receipt of mail because there is such a large file(s) to read. What
> > happens when your files get to be 50M?
>
> Err, you get a beefier server? You upgrade to a newer bogofilter that
> doesn't have the problem? You write to the bogofilter list and
> complain?
>
> I haven't really had the problem. My mail server is an Athlon 700,
> though.
>

Ah, well, my machine is not worthy to lick the surface mount components
of your video card, then. Perhaps not even worthy to feel the breeze
from your CPU fan. <sniffle>

> The files are Berkeley DB files in the version I'm running anyway, so
> I'm surprised that you would see any sort of slowdown during mail
> delivery due to bogofilter. Granted, it'll have to do some
> calculations, but that really shouldn't affect it much unless you have
> really huge, wordy emails that it has to generate lots of statistics
> for.

Well, there is that guy from Africa that keeps sending me emails about
this money he wants me to get for him. And then there are all those women
that... well, we won't go into that. Let's just say I'm thankful I use
mutt, and images don't display in my MUA.

> Looking things up in the db files should be quite speedy.

It's gone from, say, 1/20 of a second when first installed to almost a
second now per email, with a lot of disk thrashing. Better than
spambouncer or junkfilter's performance, but not as perky as it used to
be.
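
For anyone who wants to put a number on this rather than go by feel, here
is a rough sketch of how the per-message time could be measured. It is not
anything bogofilter ships; time_filter is a made-up helper name, and it
assumes the filter reads the message on standard input (which is how
bogofilter is normally fed from procmail). The exit status is ignored,
since a filter may use it to report its verdict.

```python
import subprocess
import time


def time_filter(command, message, runs=5):
    """Return the average wall-clock seconds per run of `command`.

    `command` is an argument list (for example ["bogofilter"], assuming it
    is on PATH); `message` is the raw email as bytes, fed on standard input.
    """
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded and the exit status is not checked: only the
        # elapsed time matters here.
        subprocess.run(
            command,
            input=message,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        total += time.perf_counter() - start
    return total / runs
```

Calling time_filter(["bogofilter"], raw_message) with the bytes of a saved
email would give the average seconds per message. The first run will often
be slower than the rest because the word lists are not yet in the disk
cache, so comparing the first run against later ones is one way to tell
disk thrashing apart from plain CPU time.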

> If you're using procmail, maybe you'd want to do 'strace bogofilter'
> once just to see what it's doing. Maybe you've got a bad block on your
> filesystem right under one of its db files or something... ??
>
> What version of bogofilter are you using?

0.11.1.3; I think it's the version that comes with Debian testing.

I could readily see that the spam/ham files would grow indefinitely,
unhindered, so the point of my question was whether others were seeing
the same problem, and whether any solution within bogofilter itself was
planned, known, or in the works. I guess not.

Paul


