Slow sending and receiving of email

Problem Description

For the last few days one of my Sympl hosts has been incredibly slow when sending and receiving email. Usually when I launch Thunderbird it loads my IMAP inbox instantly, but it’s currently taking closer to a minute. Likewise, emails used to be sent immediately but are now taking a minute or two, and I can’t do anything else in Thunderbird whilst this is happening (they do get sent eventually).

I’ve also received emails (copied below) about services failing and entries in the paniclog. Usually I get another email saying all the services have PASSED a few minutes later, but the slow email problem continues.

There’s nothing in the Exim log (/var/log/exim4/mainlog), other than there’s a gap of about a minute between the email being received and then being sent out.

Things I’ve tried/checked:

  • Rebooting the server: Doesn’t seem to have any effect.
  • Deleting the paniclog file: This seems to restore everything to normal for a few hours, then the problem recurs.
  • Process list: Nothing unusual here, no processes I don’t expect or any taking up large amounts of CPU or RAM.
  • User list: Again, nothing unusual - my SSH sessions are the only ones shown.

The other things I haven’t tried yet are:

  • Upgrading the VPS to have more RAM, in case this is an out of memory issue (I have had this problem with spamd/SpamAssassin on my previous bespoke mail setup)
  • Upgrading to Debian Bullseye, though I’m not sure if this would fix the problem.
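
Before paying for more RAM, it may be worth checking whether the kernel has actually been killing processes. This is only a rough sketch: the function name find_oom_events is made up for illustration, and it assumes the kernel log is readable through journalctl or, on a Debian box running rsyslog, /var/log/kern.log.

```shell
#!/bin/sh
# find_oom_events: hypothetical helper. Filters kernel log text on stdin
# for lines the Linux OOM killer typically writes.
find_oom_events() {
    # "|| true" keeps the exit status at 0 when nothing matches
    grep -iE 'out of memory|oom-killer|oom_reaper' || true
}

# Typical use on the server (needs root to read the kernel log):
#   sudo journalctl -k | find_oom_events            # current boot only
#   sudo cat /var/log/kern.log | find_oom_events    # assumes rsyslog writes this file
```

No output would suggest the OOM killer has not fired, in which case more RAM on its own may not change much.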

In my /srv/domain/config folder I have the following zero-sized files:

antispam
blacklists/zen.spamhaus.org

These have been in place since I started using Sympl for mail.

I do have another Sympl host with the same mail configuration and a similar volume of incoming mail (99% of which is spam) which isn’t experiencing problems.

Any Error Messages

2022-01-13 11:04:09 1n7xth-0002U4-49 spam acl condition: error reading from spamd [127.0.0.1]:783, socket: Connection reset by peer
2022-01-13 11:04:09 1n7xth-0002U4-49 spam acl condition: all spamd servers failed
2022-01-13 11:04:09 1n7xth-0002U4-49 spam acl condition: all spamd servers failed
2022-01-13 11:04:09 1n7xth-0002U4-49 spam acl condition: all spamd servers failed
2022-01-13 11:04:09 1n7xtg-0002U2-H5 spam acl condition: error reading from spamd [127.0.0.1]:783, socket: Connection reset by peer
2022-01-13 11:04:09 1n7xtg-0002U2-H5 spam acl condition: all spamd servers failed
2022-01-13 11:04:09 1n7xtg-0002U2-H5 spam acl condition: all spamd servers failed
2022-01-13 11:04:09 1n7xtg-0002U1-4u spam acl condition: error reading from spamd [127.0.0.1]:783, socket: Connection reset by peer
2022-01-13 11:04:09 1n7xtg-0002U1-4u spam acl condition: all spamd servers failed
2022-01-13 11:04:09 1n7xtg-0002U1-4u spam acl condition: all spamd servers failed
Started Sympl service monitor.
 INFO Runner: spamassassin: Checking service is enabled
 INFO Runner: spamassassin: Checking process
 INFO Runner: spamassassin: Testing connection to localhost:spamd
 INFO Runner: spamassassin: > PING SPAMC/1.3
 INFO Runner: spamassassin: Connection test temporarily failed: execution expired
 INFO Runner: spamassassin: Attempting to stop spamassassin
 INFO Runner: spamassassin: Attempting to start spamassassin
 WARN Runner: spamassassin: RETRYING (following Temporary failure)
 INFO Runner: spamassassin: Checking service is enabled
 INFO Runner: spamassassin: Checking process
 INFO Runner: spamassassin: Testing connection to localhost:spamd
 INFO Runner: spamassassin: > PING SPAMC/1.3
 INFO Runner: spamassassin: Connection test temporarily failed: execution expired
 INFO Runner: spamassassin: Attempting to stop spamassassin
 INFO Runner: spamassassin: Attempting to start spamassassin
 WARN Runner: spamassassin: FAILED: Temporary failure
 INFO Runner: RESULT: 9/10 passed.
sympl-monit.service: Main process exited, code=exited, status=1/FAILURE
sympl-monit.service: Failed with result 'exit-code'.
sympl-monit.service: Triggering OnFailure= dependencies.
$ free -m
              total        used        free      shared  buff/cache   available
Mem:           1995         457         431          42        1106        1323
Swap:             0           0           0
$ uptime
 08:39:53 up 1 day, 21:33,  1 user,  load average: 0.07, 0.03, 0.00
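
Since deleting the paniclog seems to buy a few hours, it might help to see when the spamd failures cluster. A minimal sketch, assuming the "spam acl condition" lines above come from /var/log/exim4/paniclog; the function name spamd_failures_per_hour is hypothetical.

```shell
#!/bin/sh
# spamd_failures_per_hour: hypothetical helper. Counts "all spamd servers
# failed" lines per hour. Reads the named files, or stdin if none are given.
# Assumes the log format shown above: date in field 1, time in field 2.
spamd_failures_per_hour() {
    awk '/all spamd servers failed/ { n[$1 " " substr($2, 1, 2) ":00"]++ }
         END { for (h in n) print h, n[h] }' "$@" | sort
}

# Typical use (path is an assumption, adjust if your paniclog lives elsewhere):
#   sudo cat /var/log/exim4/paniclog | spamd_failures_per_hour
```

Each output line is a date, an hour and a count, so a sudden jump a few hours after a restart would stand out.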

Environment

  • Sympl Version: 10.0
  • Sympl Testing Version? No
  • Debian Version: Buster
  • Hardware Type? Virtual
  • Hosted On? Mythic Beasts

I’m hosting on a Mythic Beasts VPS. I’ve found 4GB of RAM, with the other things that are running on there, to be just not quite enough memory: I got some of the same symptoms, and processes such as spamassassin would struggle and die. Upgrading to 6GB of memory made the problem go away.

Worth trying a memory upgrade; it’s always possible to contact support and get a downgrade again.

Unlikely to be the case here, but I recently found a ‘slow’ Thunderbird instance (Windows) which was down to a corrupt cache.

Operationally, the only thing deleting the paniclog will do is remove a log entry. Despite the filename, it’s fairly routine to see entries when services become (temporarily) unavailable.

Of the two config files you list, the first (antispam) invokes spamassassin (resource heavy) and the second (blacklists/zen.spamhaus.org) a dnsrbl service (minimal impact). /srv/*/config/antivirus triggers clamd, which is the usual candidate for resource problems.

Fwiw, I’ve been running a 2GB symbiosis stretch machine with custom dnsrbl, spamassassin and clamd - with extra signatures from sanesecurity - and eventually hit memory problems. However, these were instantly solved by adding a 2GB swap file and the thing is still chugging away, years later. (Lazy, but I’m trying to avoid moving to sympl/beasts/bullseye until dns is more automated … but it’s getting trickier).

admin@vm1:~$ sudo /srv/.all-sites/utils/rblinfo
   23 rbl services currently configured
   14 show rejections (exim4 logs with 10 day history)
      non-spamhaus.org services might only 'tag'

  service                         sites     rejections
--------------------------------------------------------
  zen.dq.spamhaus.net                33            288
  dbl.dq.spamhaus.net                33            118
  combined.mail.abusix.zone          19            114
  hostkarma.junkemailfilter.com      19             52
  bl.mailspike.net                   20             19
  b.barracudacentral.org             21             14
  [...]
  TOTAL                               -            637
--------------------------------------------------------
  spamassassin                       33             14
  clamav                             33             27
--------------------------------------------------------
  clamav logs (oldest file last modified 81 days ago)
  - Sanesecurity.Phishing             -            101
  - Sanesecurity.Spam                 -             32
  - Heuristics.Phishing               -             10
  - Sanesecurity.Jurlbl               -              6
  - Porcupine.Junk                    -              3
  - Sanesecurity.Junk                 -              2
  TOTAL                               -            154
--------------------------------------------------------
  v20200214 : ~0.20s
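
For anyone wanting to try the swap file route mentioned above, this is roughly how it is usually done. It is a sketch rather than a Sympl-specific recipe: /swapfile is an assumed path, fallocate assumes a filesystem such as ext4, and the helper name swap_total_mb is made up.

```shell
#!/bin/sh
# swap_total_mb: hypothetical helper. Reads `free -m` output on stdin and
# prints the total swap in MB (0 means no swap is configured).
swap_total_mb() {
    awk '/^Swap:/ { print $2 }'
}

# Check the current state first:
#   free -m | swap_total_mb
#
# One common way to add a 2GB swap file (run as root):
#   fallocate -l 2G /swapfile
#   chmod 600 /swapfile
#   mkswap /swapfile
#   swapon /swapfile
#   echo '/swapfile none swap sw 0 0' >> /etc/fstab   # persist across reboots
```

The `free -m` output earlier in the thread shows a Swap total of 0, so that host currently has nothing to fall back on if memory does run short.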

If upping the memory doesn’t fix it, try running spamd in debug mode in case, say, a recent update is causing issues.
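
A possible way to go about that, assuming the Debian service is called spamassassin (as in the monitor output above) and that spamc is installed alongside spamd. The helper name spamd_is_alive is hypothetical. Note that the Sympl monitor may try to start the service again while you have it stopped.

```shell
#!/bin/sh
# spamd_is_alive: hypothetical helper. Succeeds if the reply text on stdin
# contains PONG, which is what spamd is expected to answer to a PING.
spamd_is_alive() {
    grep -q 'PONG'
}

# Roughly the same check the monitor does, using spamc's keep-alive option:
#   spamc -d 127.0.0.1 -p 783 -t 10 -K | spamd_is_alive && echo "spamd answered"
#
# Run spamd in the foreground with debug output (as root):
#   systemctl stop spamassassin
#   spamd -D 2>&1 | tee /tmp/spamd-debug.log
#
# Rule or config problems after an update can also show up with:
#   spamassassin -D --lint 2>&1 | tail -n 40
```

If the ping hangs for the full timeout while the debug log is stuck at one step, that step is a reasonable place to start looking.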