Improving Spam Filtering

Category: E-Mail
Created: 2019-11-25

This document describes how spam filtering with LiveConfig works, and gives some hints on improving the spam detection rate.

Overview

Ideally, there are several levels of spam filtering:

  1. DNS Blacklist - the IP address of every incoming e-mail connection is checked (in real-time) against one or more DNS blacklists. If the sender is known to be a spammer, the mail is rejected with an appropriate error message.

  2. Greylisting - if an e-mail arrives from a previously unknown sender IP, it is temporarily rejected for a few minutes. Every “normal” mail server retries delivery; many spam senders do not (for performance reasons).

    E-mails from large, well-known senders (like Google Mail, Microsoft, and so on) can be accepted immediately by configuring a DNS Whitelist.

  3. Content Analysis - the mail content is checked using a large list of rules, resulting in a certain score. The higher the score, the more likely the mail is “spam”.

The order is very important: a DNS blacklist check is very lightweight and can easily be done for hundreds of connections at the same time, while a content analysis is very resource-hungry. So the goal is to filter out as much spam as possible before running the “expensive” content checks.
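
The ordering argument above can be illustrated with a minimal sketch. It is not LiveConfig or Postfix code: the three stage functions, the mail dictionary keys and the sample values are hypothetical stand-ins. It only shows that a cheap check which already yields a verdict keeps the expensive checks from running at all.

```python
from typing import Callable, Optional

# A stage returns a verdict ("reject" or "defer"), or None to pass the mail on.
Stage = Callable[[dict], Optional[str]]

def run_stages(mail: dict, stages: list) -> tuple:
    """Run the stages in order and stop at the first verdict.

    Returns (verdict, names of the stages that actually ran).
    """
    executed = []
    for stage in stages:
        executed.append(stage.__name__)
        verdict = stage(mail)
        if verdict is not None:
            return verdict, executed
    return "accept", executed

# Hypothetical stand-ins for the three levels, cheapest first.
def dns_blacklist(mail):
    return "reject" if mail["client_ip"] in {"203.0.113.7"} else None

def greylisting(mail):
    return None if mail.get("seen_before") else "defer"

def content_analysis(mail):
    return "reject" if mail.get("score", 0.0) > 5.0 else None

STAGES = [dns_blacklist, greylisting, content_analysis]
```

For a blacklisted IP, run_stages() returns after the first stage, so the content analysis is never invoked for that mail.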

Configuration

To enable and configure spam filtering, log in as admin and go to Server Management -> Mail. Edit the settings of Postfix:

Screenshot E-Mail Spam Filter Configuration

DNS Blacklist

There are numerous DNS blacklists available, most of them free of charge. But be careful when selecting a DNS blacklist: it effectively decides which e-mails your server accepts and which it rejects. Only use reputable DNS blacklists that meet requirements such as

  • transparent unlisting process
  • active maintenance (no “dead” lists)
  • short update intervals

You can use multiple DNS blacklists with LiveConfig; they are checked in the order in which they appear. We recommend adding no more than 2-3 DNS blacklists.
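
To make the lookup more concrete, here is a minimal sketch of how a DNS blacklist query is usually formed for an IPv4 address: the octets are reversed and the blacklist zone is appended, and any answer to that name means "listed". The zone bl.example.org, the function names and the injected resolve callback are assumptions for illustration; the real lookups are done by Postfix, not by code like this.

```python
import ipaddress

def dnsbl_query_name(client_ip: str, zone: str) -> str:
    """Build the DNS name to query for an IPv4 client: reversed octets + zone."""
    addr = ipaddress.IPv4Address(client_ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

def first_listing(client_ip, zones, resolve):
    """Check the zones in order and return the first one listing the IP, else None.

    `resolve` is a caller-supplied function that returns an answer for a DNS
    name, or None if the name does not exist (i.e. the IP is not listed).
    """
    for zone in zones:
        if resolve(dnsbl_query_name(client_ip, zone)) is not None:
            return zone
    return None
```

Because the zones are tried one after another, every additional blacklist can add another DNS round trip per connection, which is one reason to keep the list short.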

Some examples for DNS blacklists are:

Greylisting

The idea behind greylisting is that spammers try to send out their mails as fast as possible, without checking for errors. With greylisting, a mail server saves a triple consisting of sender IP, sender address and recipient address. Each mail with a previously unknown triple is temporarily rejected for a few minutes. Every correctly configured mail server retries delivery several times, so the mail is only delayed but is eventually delivered.

Each triple of a successful mail delivery is saved in a database, so subsequent e-mails from the same sender are usually delivered immediately.
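
The triple logic described above can be sketched in a few lines. This is an illustration only, not the postgrey implementation: the class name, the in-memory storage and the 300-second delay are assumptions chosen for the example.

```python
class Greylist:
    """Minimal in-memory greylist keyed on (sender IP, sender, recipient)."""

    def __init__(self, delay_seconds=300):
        self.delay = delay_seconds
        self.first_seen = {}  # triple -> timestamp of the first delivery attempt
        self.passed = set()   # triples that already completed a delivery

    def check(self, client_ip, sender, recipient, now):
        """Return "accept" or "defer" for a delivery attempt at time `now`."""
        triple = (client_ip, sender.lower(), recipient.lower())
        if triple in self.passed:
            return "accept"
        first = self.first_seen.setdefault(triple, now)
        if now - first >= self.delay:
            # The sender retried after the delay: remember the triple.
            self.passed.add(triple)
            return "accept"
        return "defer"
```

The first attempt and any retry within the delay are deferred; a retry after the delay is accepted, and later mails with the same triple pass immediately.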

Greylisting requires that the package postgrey is installed.

With LiveConfig, users can enable greylisting per mailbox. So for example a user can generally enable greylisting for all mailboxes, but also create an e-mail address like emergency-support@example.org with greylisting disabled.

DNS Whitelist

When greylisting is enabled, you can optionally configure a DNS whitelist. If the IP address of an incoming mail connection is found on a whitelist, it is considered to be legitimate and excluded from blacklisting and greylisting.

A well-known DNS whitelist is dnswl.org. To add the dnswl.org DNS whitelist while only considering “real” matches (no error matches), use the following syntax:

list.dnswl.org=127.0.[0..255].[0..3]
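
To show what the filter after the equals sign expresses, the following sketch checks a DNS reply address against such a pattern. It is a simplified matcher written for this article and handles only literal octets and [low..high] ranges; the function name is an assumption and this is not how Postfix is implemented internally.

```python
import re

def reply_matches(pattern: str, reply_ip: str) -> bool:
    """Return True if a DNS reply such as '127.0.5.1' matches a filter such
    as '127.0.[0..255].[0..3]'. Each octet of the pattern is either a
    literal number or a [low..high] range."""
    tokens = re.findall(r"\[\d+\.\.\d+\]|\d+", pattern)
    octets = reply_ip.split(".")
    if len(tokens) != 4 or len(octets) != 4:
        return False
    for token, octet in zip(tokens, octets):
        if not octet.isdigit():
            return False
        value = int(octet)
        if token.startswith("["):
            low, high = (int(n) for n in token[1:-1].split(".."))
            if not low <= value <= high:
                return False
        elif value != int(token):
            return False
    return True
```

With the pattern above, a reply of 127.0.5.1 counts as a whitelist hit, while a reply whose last octet lies outside 0..3 (used here as a stand-in for an error reply) is ignored.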

Content Analysis: SpamAssassin

SpamAssassin is free software for analyzing the content of e-mails. You need to have the package spamassassin installed and running. When you enable SpamAssassin with LiveConfig, a service called lcsam (LiveConfig SpamAssassin Milter) is registered and started, which connects Postfix with SpamAssassin.

The operation of SpamAssassin is quite simple: it is fed with an e-mail, and returns a score between -999 and +999 indicating how likely the mail is to be spam. So if a user enables SpamAssassin for a mailbox, they have to define two thresholds: one for marking e-mails as “suspicious” (possible spam), and another one for rejecting e-mails.

If a mail exceeds the reject threshold, it is not accepted by the mail server - i.e. the sending mail server will report an error to the sender. When a mail exceeds the warn threshold, it is delivered, but the mail subject is prefixed with ***Suspicious SPAM*** (localized to the actual language of the LiveConfig user creating/editing that mailbox).
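
The two-threshold decision can be sketched as follows. This is an illustration, not the lcsam source: the function name is hypothetical, and both the strict "greater than" comparison (taken from the word "exceeds") and the space after the subject prefix are assumptions.

```python
SUBJECT_PREFIX = "***Suspicious SPAM***"

def classify(score, warn_threshold, reject_threshold, subject):
    """Return (action, subject) for a scored mail.

    action is "reject" (refused at SMTP time) or "deliver".
    """
    if score > reject_threshold:
        return "reject", subject
    if score > warn_threshold:
        # Delivered, but marked so the user can spot it in the inbox.
        return "deliver", f"{SUBJECT_PREFIX} {subject}"
    return "deliver", subject
```

With a warn threshold of 2.7 and a reject threshold of 5.0, a mail scoring 3.0 is delivered with the prefixed subject, and a mail scoring 6.1 is refused.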

Detailed Spam Report

Every e-mail scanned with SpamAssassin gets three additional mail headers:

  • X-Spam-Flag - always NO (all mails with YES are automatically rejected)

  • X-Spam-Score - the spam probability score as calculated by SpamAssassin

  • X-Spam-Status - a compact spam status, containing the score, the configured thresholds and the matched tests. Example:

    X-Spam-Status: No score=-0.7 tagged_above=3.0 required=5.0 tests=[RCVD_IN_DNSWL_LOW]
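
When evaluating many mails (for example to see which tests fire most often), it can help to parse this header automatically. The following is a minimal sketch that assumes the header value looks like the example above; the function name and the returned dictionary layout are made up for illustration.

```python
import re

def parse_spam_status(value: str) -> dict:
    """Parse the value of an X-Spam-Status header (the text after the colon)."""
    flag = value.split(None, 1)[0].rstrip(",")
    # key=value pairs; the tests value is a bracketed, comma-separated list
    fields = dict(re.findall(r"(\w+)=(\[[^\]]*\]|\S+)", value))
    tests = fields.get("tests", "[]").strip("[]")
    return {
        "is_spam": flag.lower() == "yes",
        "score": float(fields["score"]),
        "tagged_above": float(fields["tagged_above"]),
        "required": float(fields["required"]),
        "tests": [t.strip() for t in tests.split(",") if t.strip()],
    }
```

Applied to the example header, this yields a score of -0.7, a required score of 5.0 and a single matched test, RCVD_IN_DNSWL_LOW.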

Optionally, lcsam may add a detailed spam report to every e-mail (using the X-Spam-Report: header). To enable this, run the command systemctl edit lcsam.service and create the following service override:

[Service]
ExecStart=
ExecStart=/usr/lib/liveconfig/lcsam -g spamd -U postfix -r

This effectively adds the option -r to lcsam. Then restart the service (service lcsam restart).

All e-mails scanned by SpamAssassin now have an additional X-Spam-Report: header, for example:

X-Spam-Report: Spam detection software, running on the system "mx.example.org",
        has NOT identified this incoming email as spam.  The original
        message has been attached to this so you can view it or label
        similar future email.  If you have any questions, see
        @@CONTACT_ADDRESS@@ for details.
        
        Content preview:  [...]
           
        
        Content analysis details:   (-0.7 points, 5.0 required)
        
         pts rule name              description
        ---- ---------------------- --------------------------------------------------
        -0.7 RCVD_IN_DNSWL_LOW      RBL: Sender listed at http://www.dnswl.org/,
                                    low trust
                                    [198.51.100.100 listed in list.dnswl.org]

The list of matched rules is the most important part for tuning SpamAssassin. You can increase and decrease rule scores individually by appending them to /etc/spamassassin/local.cf (don’t forget to reload SpamAssassin afterwards).

Example:

# tag mails immediately as SPAM if they appear in any URI blacklist:
score XPRIO 0
score URIBL_DBL_SPAM 6.0
score URIBL_BLACK 6.0
score RCVD_IN_BRBL_LASTEXT 6.0
score URIBL_ABUSE_SURBL 6.0

# skip URI blacklist checks for certain well-known domains:
uridnsbl_skip_domain googleapis.com goo.gl googlegroups.com docs.google.com
uridnsbl_skip_domain youtu.be linkedin.com fbcdn.net licdn.com twimg.com redbox.com
uridnsbl_skip_domain amazon.ca amazonses.com amazonaws.com ssl-images-amazon.com images-amazon.com media-amazon.com
uridnsbl_skip_domain instagram.com pinterest.com pinimg.com facebookmail.com yahoodns.net tumblr.com
uridnsbl_skip_domain groupon.com grouponcdn.com office365.com booking.com

Default Values

All default values for new mailboxes (e.g. if greylisting is to be enabled or threshold values for SpamAssassin) are configurable via LCDefaults - see mail.greylisting.enabled, mail.spam.enabled etc.

Recommendations

In our experience, greylisting can filter up to 80% of all spam mails; it is nearly as accurate as SpamAssassin in its default configuration and much less resource-intensive. We also think that Bayes filters are highly overrated.

We recommend that you

  • enable DNS blacklists (2-3, only trustworthy ones!)
  • enable Greylisting
  • use a DNS whitelist
  • enable SpamAssassin; warn threshold around 2.6-2.8, reject threshold around 5.0

Last but not least, we strongly advise not to automatically sort suspicious e-mails into a “spam folder”. Most users don’t regularly look into that folder and so might miss important e-mails. LiveConfig intentionally doesn’t provide that feature, though it can be implemented with Sieve scripts.

Troubleshooting

/var/run/spamd.sock is not created, but SpamAssassin is running

Probably you have re-installed SpamAssassin or overwritten the configuration.

  1. Stop SpamAssassin: service spamassassin stop

  2. Edit the file /etc/default/spamassassin. Search for the line beginning with OPTIONS=, comment it out, and then insert the following line:

    OPTIONS="-m 5 -H --socketpath=/var/run/spamd.sock --socketowner=root --socketgroup=spamd --socketmode=0660 -x --virtual-config-dir=/var/lib/spamassassin/%u/ -u spamd"

  3. Start SpamAssassin again: service spamassassin start