Fetchmail Multiple Instances Increasing Load Average

Hi,

I’m running CentOS 5.7 with sendmail-8.13.8-8.1.el5_7, fetchmail-6.3.6-4.el5 and procmail-3.22-17.1.el5.CentOS. The server hosts around 2000 mailboxes and uses fetchmail to pull mail for all of these users from the other MX server. I’ve configured the cron job below, using Webmin, to download the mail:

5,10,15,20,25,30,35,40,45,50,55 * * * * /etc/webmin/fetchmail/check.pl --file /var/log/fetchmaillog

But the load average on the server climbs every 4 to 5 hours because of multiple fetchmail instances, and I then have to either kill all the fetchmail jobs or restart the server to get the system working again.

Kindly suggest how I can reduce this load average problem, or any other way to download the mail from the parent server, which is also running sendmail.

Warm Regards, Anshul Chauhan

4 thoughts on - Fetchmail Multiple Instances Increasing Load Average

  • Get cron to call a script that establishes a lock file, so that the next round of cron will not start fetchmail again until the first invocation completes.
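
The lock-file idea above could be sketched as a small wrapper that cron calls instead of check.pl directly. This is only an illustration: the function name `run_locked`, the lock path and the exit status 75 are assumptions made here, not anything shipped with Webmin or fetchmail.

```shell
# Hypothetical helper: run a command only if no earlier run holds the lock.
# mkdir is used because creating a directory either succeeds or fails atomically.
run_locked() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        return 75   # arbitrary "try again later" status (assumption)
    fi
    status=0
    "$@" || status=$?   # run the real command and remember its exit status
    rmdir "$lockdir"    # release the lock once the command has finished
    return "$status"
}

# Example call from a cron-driven script (the lock path is an assumption):
# run_locked /var/lock/fetchmail-check.lock \
#     /etc/webmin/fetchmail/check.pl --file /var/log/fetchmaillog
```

If the wrapper is killed while the command is still running, the lock directory stays behind and has to be removed by hand. A lock held through flock(1) is released when the process exits, which is one reason it is often preferred.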

  • Kindly suggest whether this is the right way to start the cron job with a lock, if I’ve not misunderstood.

    */5 * * * * /usr/bin/flock -n /etc/webmin/fetchmail/check.pl --file /var/log/fetchmaillog

    Warm Regards, Anshul Chauhan
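
For reference, flock(1) takes the lock file as its first non-option argument and the command to run after it, so an entry that names only check.pl would most likely treat the script itself as the lock file. A hedged sketch of the intended form, as a single crontab line, assuming a lock path of /var/lock/fetchmail-check.lock (any path writable by the cron user would do):

```
*/5 * * * * /usr/bin/flock -n /var/lock/fetchmail-check.lock /etc/webmin/fetchmail/check.pl --file /var/log/fetchmaillog
```

With -n, flock gives up immediately instead of waiting when an earlier run still holds the lock, so overlapping runs are skipped and do not queue up.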

  • Honestly, the best solution to your problem is to fix this setup. The standard configuration for a backup MX is to configure it to accept mail for your domain, but treat those messages as non-local. That is, the server should queue that mail and then deliver it to the primary servers via SMTP. You should not be using fetchmail for this purpose. Using fetchmail means that all of your users’ passwords are stored in plain text somewhere on your primary server for no good reason. The setup you’ve got is bad for security and as you are seeing, it is bad for reliability.

    Load is a number that indicates how many processes are in a non-sleeping state. By itself, it is not an indication of a problem. The first thing you have to do is identify the actual problem.

    Start with “ps ax” and look at the processes that *don’t* have an ‘S’ in their STAT column. Load is a count of those processes. From there, you have to figure out why they aren’t sleeping.
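
The check described above can be scripted so that the offending commands stand out. The following is a rough sketch, not a definitive tool: the function name `count_busy` is made up here, and it narrows the filter to the R (runnable) and D (uninterruptible sleep) states, which are the ones Linux counts toward the load average.

```shell
# Hypothetical helper: reads "STAT COMMAND" lines on stdin and prints
# how many processes per command are in state R or D, busiest first.
count_busy() {
    awk '$1 ~ /^[RD]/ { n[$2]++ } END { for (c in n) print n[c], c }' | sort -rn
}

# Typical use on a live system (the empty "name=" form suppresses the ps headers):
# ps ax -o stat= -o comm= | count_busy
```

If fetchmail dominates the output, the next question is whether those processes are in R (using CPU) or in D (usually waiting on disk or network IO).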

    Is there a large number of processes eating a lot of CPU? Use “top” or some variant to find out. How many processes are there? Is it actually fetchmail, or something else? How much CPU time is each one using?

    Is there a lot of disk IO? Use “iostat -x” to find out. Which disks are seeing a lot of IO?

    Fetchmail already uses a lock file, so that’s probably not the issue. I’ve never seen fetchmail cause a high load, so I can’t be sure of the cause, but my first guess, based on what little you’ve said about your setup, is that you have a mail loop somewhere. Either one user has fetchmail configured to check the local server, so that fetchmail is pulling messages from the local server, feeding them back in, and then fetching them again in an eternal loop, or one user is forwarding email in such a way that it heads out to your backup MX and loops that way. You’re going to have to go over your logs to see if the high-load events happen when a particular user is checked, or if there’s some other common trigger.