
gav (2710)


Hacker in NYC.

Journal of gav (2710)

Sunday April 11, 2004
09:22 PM

Bayes Training for SpamAssassin

[ #18292 ]

I've been too lazy to train SpamAssassin's Bayesian classifier with spams that didn't get marked as spam. After messing with things for a bit instead of doing some real work (like laundry or something; have you ever noticed how much more productive you are when you can put off doing other things?), this is the way I set things up. This may be helpful to somebody; it's probably going to be helpful to me when I forget how I did it.

Firstly I created a folder called SPAM in Apple Mail. This is where I drag any spams I want to train SpamAssassin with.

Then I spent a while being annoyed at SA for not doing what I wanted it to do. See, my mail doesn't go to me; it goes to another user who can't log in. There isn't an obvious way to tell sa-learn that you want it to work for another user. To get around this, I set bayes_path to an absolute path in that user's ~/.spamassassin/user_prefs.
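For anyone who would rather script that change than edit the file by hand, here is a rough sketch. The function name set_bayes_path and both paths in the usage line are made up for illustration; the only SpamAssassin-specific part is the bayes_path setting itself, which SpamAssassin treats as a base path and appends _toks, _seen and so on to.

```shell
#!/bin/sh
# Hypothetical helper (not part of SpamAssassin): make sure the prefs file
# given as $1 ends up with exactly one bayes_path line pointing at $2.
set_bayes_path() {
    prefs=$1
    db_path=$2
    tmp="$prefs.tmp.$$"
    if [ -f "$prefs" ]; then
        # Keep every existing line except an old bayes_path setting.
        # grep exits non-zero when nothing is left, which is fine here.
        grep -v '^[[:space:]]*bayes_path[[:space:]]' "$prefs" > "$tmp" || true
    else
        : > "$tmp"
    fi
    # Append the absolute path so sa-learn finds the same database
    # no matter which account runs it.
    printf 'bayes_path %s\n' "$db_path" >> "$tmp"
    mv "$tmp" "$prefs"
}
```

Something like set_bayes_path /home/sa_user/.spamassassin/user_prefs /home/sa_user/.spamassassin/bayes would do it, with sa_user standing in for whatever the mail user is really called.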

Then I wrote this little shell script:

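# Assumes $mbox, $user, $host and $sa_user are already set, and that
# $mbox points at a file that is itself named "mbox".
rsync -zv -e ssh "$mbox" "$user@$host:/home/$sa_user/tmp"
# Train on the copy, using the other user's prefs (and so its bayes_path).
ssh "$user@$host" sa-learn \
   -p "/home/$sa_user/.spamassassin/user_prefs" \
   --showdots --mbox --spam "/home/$sa_user/tmp/mbox"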
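One optional extra, which is not part of my script: a quick check that the mailbox actually has something in it before bothering the server. This is only a sketch. count_mbox_messages is a made-up name, and it simply counts the "From " separator lines, so it will over-count if a message body has an unescaped line starting with "From ".

```shell
#!/bin/sh
# Hypothetical helper: print the number of messages in the mbox file $1,
# counted as lines beginning with the mbox separator "From ".
count_mbox_messages() {
    # A missing file counts as an empty mailbox.
    [ -f "$1" ] || { echo 0; return 0; }
    # grep -c prints 0 but exits non-zero when there are no matches.
    grep -c '^From ' "$1" || true
}
```

Running [ "$(count_mbox_messages "$mbox")" -gt 0 ] || exit 0 ahead of the rsync line would skip the upload when the SPAM folder is empty.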

This seems neater than setting up a mailbox to receive spams because I don't have to worry that any other headers have sneaked in there.
