SpamAssassin Learning

I don’t recall where I read it: probably on the Symbiosis forum. Wherever it was, I found a setup that got SpamAssassin to trawl through the mailbox folders each night, learning from what had been marked as spam (and what hadn’t).
This only worked to a limited extent, because users tend to delete spam instead of marking it as junk. But it was still worth having. I just can’t remember how I set it up.
Can anyone help?

I’ve done something similar in the past, and read sent mailboxes as additional training for ‘ham’.

The thing is that I currently have it working on a server I plan to wipe tomorrow. I can’t remember how I set it up or where the files are. So unless I remember soon, or someone reminds me…

Local mirror of the Bytemark forum to the rescue!

Splendid! Wiping that forum was a disgusting trick. OhNoMart didn’t like people complaining about the drop in service, so they made it much worse.

I’ve also posted the same command on this forum, in the thread ‘Spam not being tagged nor moved to the Spam folder’.

I’m still using it, having recently migrated to Sympl on a Mythic Beasts VM. It works well, though a friend once had to resend something from another email address because the original was bounced as spam.

Wouldn’t this classify new (unread and spammy) email as ham?

I’ve tried training SpamAssassin on .Sent folders before now.

In previous installations, I’ve found it necessary to turn off SpamAssassin’s bayes learning, and just rely on network tests and a few bespoke rules.

It would initially. However, once the message has been marked as spam, SpamAssassin will relearn the tokens appropriately.

From https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html

--ham

Learn the input message(s) as ham. If you have previously learnt any of the messages as spam, SpamAssassin will forget them first, then re-learn them as ham. Alternatively, if you have previously learnt them as ham, it’ll skip them this time around. If the messages have already been filtered through SpamAssassin, the learner will ignore any modifications SpamAssassin may have made.

--spam

Learn the input message(s) as spam. If you have previously learnt any of the messages as ham, SpamAssassin will forget them first, then re-learn them as spam. Alternatively, if you have previously learnt them as spam, it’ll skip them this time around. If the messages have already been filtered through SpamAssassin, the learner will ignore any modifications SpamAssassin may have made.

I’ve had it working this way for years, and it’s worked well. I’ll get the same type of spam for a few days or weeks, and it then reduces or stops as I mark it as spam. Even if something lands in my inbox between me going to bed and midnight, when the job runs, the spam score for similar items increases a few days later, and the amount of that type of spam falls over time.
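For anyone rebuilding a job like this from scratch, here is a minimal Python sketch of a nightly learner. It is not the command from the linked thread. The folder names (`.Junk`, `.Spam`, `.Sent`) and the helper names are assumptions for illustration, and the Maildir path has to come from your own setup. It only relies on the documented `sa-learn` options `--spam`, `--ham`, `--no-sync` and `--sync`, and it reads only each folder's `cur/` directory, so mail still sitting in `new/` (delivered but not yet picked up by any mail client) is not learnt as ham.

```python
import subprocess
from pathlib import Path

# Assumed folder names; adjust to whatever your mail client actually creates.
SPAM_FOLDERS = (".Junk", ".Spam")
HAM_FOLDERS = ("", ".Sent")  # "" stands for the inbox at the Maildir root


def learn_commands(maildir):
    """Build sa-learn argument lists for one Maildir: spam folders first, then ham."""
    maildir = Path(maildir)
    commands = []
    for option, folders in (("--spam", SPAM_FOLDERS), ("--ham", HAM_FOLDERS)):
        for folder in folders:
            base = maildir / folder if folder else maildir
            # cur/ holds messages a client has already picked up; new/ is skipped.
            cur = base / "cur"
            if cur.is_dir():
                commands.append(["sa-learn", "--no-sync", option, str(cur)])
    return commands


def run_nightly(maildir):
    """Run the learn commands, then sync the Bayes database once at the end."""
    for command in learn_commands(maildir):
        subprocess.run(command, check=False)
    subprocess.run(["sa-learn", "--sync"], check=False)
```

Calling `run_nightly()` with the path to a user's Maildir from a midnight cron entry would reproduce the schedule described above. It needs to run as whichever user owns the Bayes database that SpamAssassin consults when filtering, which depends on the installation.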

I use the ‘show headers’ option in my email client to see what spam score a message got, which is how I know that similar messages score progressively higher before they are eventually blocked.
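If you would rather not read headers by eye, the score can be pulled out of a saved message with a few lines of Python. This is a sketch that assumes the usual `X-Spam-Status: No, score=3.1 required=5.0 tests=...` header layout. Older SpamAssassin versions wrote `hits=` instead of `score=`, which this does not handle, and `spam_score` is a name made up for the example.

```python
import re
from email import message_from_string

# Matches "score=3.1 required=5.0" inside the X-Spam-Status header value.
_STATUS_RE = re.compile(r"score=(-?\d+(?:\.\d+)?)\s+required=(-?\d+(?:\.\d+)?)")


def spam_score(raw_message):
    """Return (score, required) from X-Spam-Status, or None if it is missing."""
    message = message_from_string(raw_message)
    status = message.get("X-Spam-Status")
    if status is None:
        return None
    # The header is often folded across lines; collapse the whitespace first.
    status = " ".join(status.split())
    match = _STATUS_RE.search(status)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))
```

Running it over the same kind of message on successive days gives a rough view of the score creeping towards the threshold.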