Cleanup Maildir folders (archive/delete old mails)

Maildir is a huge improvement over mbox for storing users' local mail. Why? I will not go into a long explanation, because it is outside the scope of this post, but consider that saving each mail in its own file on disk (Maildir), as opposed to saving all mails in a single file (mbox), is much faster for most operations: changing or deleting one message does not require locking and rewriting one large shared file. It is also much easier to manipulate the files (which are individual mails) on the system. For more details on Maildir vs. mbox, see http://www.courier-mta.org/mbox-vs-maildir/

Even though Maildir copes much better with many emails kept on the server, access to a folder becomes considerably slower once it holds a huge number of files. I am not talking about a couple of hundred mails, but about huge mailboxes with many thousands of mails and several gigabytes in size (over 3-5GB). You would be amazed how many people do that… They are most probably using IMAP and keeping all their mails on the server, or using POP3 with the option to leave a copy of each mail on the server.
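To find out which folders have grown this large, a few lines of Python are enough. The following is only a minimal sketch: the function name `maildir_stats` is made up for this post, and it assumes the standard Maildir layout in which delivered messages sit in the `new` and `cur` subdirectories of a folder.

```python
import os

def maildir_stats(folder):
    """Return (message_count, total_bytes) for a single Maildir folder."""
    count = 0
    total = 0
    # 'new' holds unseen mail and 'cur' holds seen mail;
    # 'tmp' only contains deliveries in progress, so it is skipped.
    for sub in ("new", "cur"):
        path = os.path.join(folder, sub)
        if not os.path.isdir(path):
            continue
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_file(follow_symlinks=False):
                    count += 1
                    total += entry.stat(follow_symlinks=False).st_size
    return count, total

if __name__ == "__main__":
    root = os.path.expanduser("~/Maildir")  # assumed location, adjust as needed
    messages, size = maildir_stats(root)
    print(f"{root}: {messages} messages, {size / 1024 ** 2:.1f} MiB")
```

Running it against each folder of a maildir gives a quick picture of where a cleanup would pay off the most.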

Since Maildir saves each mail in a separate file, it is much easier than it used to be to manipulate the mails. Anyone can write a simple script to do some cleanup based on their needs. A while ago I stumbled across a Python script, cleanup-maildir, that does most of the things I needed to clean up Maildir folders. What features was I looking for?

  • ability to archive or delete older mails (configurable age of mails)
  • archiving old mails in Maildir format, so they can still be used from the email client.
  • usage that can be automated in a cronjob.

cleanup-maildir has all the features I needed (and some others), so I would recommend this little script as a quick solution for cleaning up maildirs. Here are some useful options:
-n, --trial-run
Do not actually touch any files; just say what would be done.
-a, --age=N
Only touch messages older than N days. Default is 14 days.
--archive-folder=F
Use F as the base for constructing archive folders. For example, if F is ‘Archive’, messages from 2004 might be put in the folder ‘Archive.2004’.
-d, --archive-hierarchy-depth=N
Specify number of subfolders in archive hierarchy; 1 is just the year, 2 is year/month (default), 3 is year/month/day.
--maildir-root=F
Specifies folder that contains mail folders. Default is “$HOME/Maildir”.

COMMANDS
archive - move old messages to subfolders based on message date
trash - move old messages to trash folder
delete - permanently delete old messages

For all the available options, check out the help page (cleanup-maildir --help).
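To make the archive hierarchy depth more concrete, here is a small sketch of how such folder names could be built from a message date. This is not the script's own code: the function `archive_folder_name`, the "." separator between levels and the zero-padded month and day are assumptions. Only the ‘Archive.2004’ form is taken from the help text above.

```python
from datetime import date

def archive_folder_name(base, msg_date, depth=2, sep="."):
    """Build an archive folder name for a message date.

    depth 1 = year, 2 = year/month, 3 = year/month/day.
    The separator and zero padding are assumptions for illustration.
    """
    if depth not in (1, 2, 3):
        raise ValueError("depth must be 1, 2 or 3")
    parts = [
        f"{msg_date.year:04d}",
        f"{msg_date.month:02d}",
        f"{msg_date.day:02d}",
    ]
    # Keep only as many date components as the requested depth.
    return sep.join([base] + parts[:depth])

if __name__ == "__main__":
    sample = date(2004, 7, 9)
    for depth in (1, 2, 3):
        print(depth, archive_folder_name("Archive", sample, depth))
```

A trial run (-n) shows the folder names that cleanup-maildir really uses.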

Sample usages

Archive messages in ‘Sent Items’ folder over 30 days old (from the maildir of the running user):

cleanup-maildir --age=30 archive 'Sent Items'

Delete messages older than 60 days in the current maildir of the running user:

cleanup-maildir --age=60 delete ''

Archive mails older than 3 months from a specific user maildir (/home/someuser/.Maildir) and place them in yearly archive folders in the form: Archive.YEAR:

cleanup-maildir --age=90 --archive-folder=Archive --archive-hierarchy-depth=1 --maildir-root='/home/someuser/.Maildir' archive ''
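If you want to drive the script from your own automation (for example a nightly job that handles several users), a thin wrapper can assemble the command line. The sketch below is only an illustration: the helper names `build_command` and `run_cleanup` are made up, it uses only the options listed above, and it assumes that cleanup-maildir is on the PATH. The wrapper defaults to a trial run so that nothing is touched by accident.

```python
import subprocess

ACTIONS = ("archive", "trash", "delete")

def build_command(action, folder, age=None, archive_folder=None,
                  depth=None, maildir_root=None, trial_run=True):
    """Return the cleanup-maildir argument list for the given settings."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    cmd = ["cleanup-maildir"]
    if trial_run:
        cmd.append("--trial-run")  # only report what would be done
    if age is not None:
        cmd.append(f"--age={age}")
    if archive_folder is not None:
        cmd.append(f"--archive-folder={archive_folder}")
    if depth is not None:
        cmd.append(f"--archive-hierarchy-depth={depth}")
    if maildir_root is not None:
        cmd.append(f"--maildir-root={maildir_root}")
    cmd += [action, folder]
    return cmd

def run_cleanup(**settings):
    """Run cleanup-maildir and raise CalledProcessError if it fails."""
    return subprocess.run(build_command(**settings), check=True)

if __name__ == "__main__":
    # Mirrors the last sample above, but as a trial run.
    print(build_command("archive", "", age=90, archive_folder="Archive",
                        depth=1, maildir_root="/home/someuser/.Maildir"))
```

Passing the arguments as a list avoids shell quoting problems with folder names such as 'Sent Items'. Once the trial output looks right, call `run_cleanup(..., trial_run=False)`.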

Notes:

  • the -n parameter can be very useful while you are familiarizing yourself with the script (it only shows what it would do, without actually touching any file).
  • when running the script against a different user's maildir, the script creates the folders with the ownership and permissions of the running user. So if the archived mails should also be usable from an email client, you will have to run the proper chown/chmod command afterwards to fix the ownership and permissions.
  • after the script finishes, you will be presented with some general statistics on what it has done. This looks like:
    INFO:cleanup-maildir:Total messages:     51507
    INFO:cleanup-maildir:Affected messages:  37611 (moved to archive)
    INFO:cleanup-maildir:Untouched messages: 13896
    
  • the script takes the dates from the mail headers, not from the timestamps of the actual files on the system.
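The last note is worth illustrating, because file timestamps change when mails are copied or restored from a backup, while the Date header stays the same. The following sketch shows the general idea using the Python standard library. It is not taken from cleanup-maildir itself; the helper names `header_date` and `is_older_than` are made up, and treating a Date header without a usable timezone as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone
from email.parser import BytesParser
from email.utils import parsedate_to_datetime

def header_date(path):
    """Return the Date header of a mail file as an aware datetime, or None."""
    with open(path, "rb") as fh:
        # Only the headers are needed, so the body is not parsed.
        msg = BytesParser().parse(fh, headersonly=True)
    raw = msg.get("Date")
    if raw is None:
        return None
    try:
        parsed = parsedate_to_datetime(str(raw))
    except (TypeError, ValueError):
        return None  # unparsable Date header
    if parsed.tzinfo is None:
        # Assumption: treat dates without a usable timezone as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed

def is_older_than(path, days, now=None):
    """True if the mail's Date header is more than `days` days in the past."""
    now = now or datetime.now(timezone.utc)
    sent = header_date(path)
    return sent is not None and sent < now - timedelta(days=days)
```

In this sketch, mails without a parsable Date header are never treated as old, which is the cautious choice for anything that deletes files.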