Cleanup Maildir folders (archive/delete old mails)
Maildir is a huge improvement over mbox for storing users' local mail. Why? I will not go into a long explanation, because that is outside the scope of this post, but consider that saving each mail in its own file on disk (Maildir), as opposed to saving all mails in a single file (mbox), is much faster. It is also much easier to manipulate the files (which are individual mails) on the system. For more details on Maildir vs. mbox see http://www.courier-mta.org/mbox-vs-maildir/
Now, even though Maildir is much faster when many emails are kept on the server, access to a folder becomes considerably slower once it holds a huge number of files. I am not talking about a couple of hundred mails, but about huge mailboxes with thousands of mails and a total size of over 3-5GB. You would be amazed that there are people who do that… They are most probably using IMAP and keeping all their mails on the server, or POP3 configured to leave a copy of each mail on the server.
Since Maildir saves each mail in a separate file, it is much easier than it used to be to manipulate the mails. Anyone can write a simple script to do some cleanup based on their needs. A while ago I stumbled across a Python script, cleanup-maildir, that does most of the things I needed to clean up Maildir folders. What features was I looking for?
- ability to archive or delete older mails (configurable age of mails)
- archiving old mails in Maildir format, so they can still be used from the email client.
- usage that can be automated in a cronjob.
cleanup-maildir has all the features that I needed (and some others), so I would recommend this little script as a quick way to clean up maildirs. Here are some useful options:
-n, --trial-run
Do not actually touch any files; just say what would be done.
-a, --age=N
Only touch messages older than N days. Default is 14 days.
--archive-folder=F
Use F as the base for constructing archive folders. For example, if F is 'Archive', messages from 2004 might be put in the folder 'Archive.2004'.
-d, --archive-hierarchy-depth=N
Specify number of subfolders in archive hierarchy; 1 is just the year, 2 is year/month (default), 3 is year/month/day.
--maildir-root=F
Specifies folder that contains mail folders. Default is "$HOME/Maildir".
COMMANDS
archive - move old messages to subfolders based on message date
trash - move old message to trash folder
delete - permanently delete old messages
For all the available options check out the help page (cleanup-maildir --help)
Sample usages
Archive messages in ‘Sent Items’ folder over 30 days old (from the maildir of the running user):
cleanup-maildir --age=30 archive 'Sent Items'
Delete messages in the current maildir of the running user older than 60 days:
cleanup-maildir --age=60 delete ''
Archive mails older than 3 months from a specific user maildir (/home/someuser/.Maildir) and place them in yearly archive folders in the form: Archive.YEAR:
cleanup-maildir --age=90 --archive-folder=Archive --archive-hierarchy-depth=1 --maildir-root='/home/someuser/.Maildir' archive ''
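To make it harder to run a destructive command by accident, the sample usages above can be wrapped in a small Python helper that defaults to a trial run. This is only a sketch: the names build_cleanup_command and run_cleanup are made up for illustration, it assumes cleanup-maildir is on the PATH, and it only uses the options listed in this post.

```python
import subprocess

# Commands documented in this post.
VALID_COMMANDS = ("archive", "trash", "delete")


def build_cleanup_command(command, folder="", age=14, maildir_root=None,
                          archive_folder=None, depth=None, trial_run=True,
                          script="cleanup-maildir"):
    """Return the argument list for one cleanup-maildir call.

    Hypothetical helper: trial_run defaults to True, so nothing is
    touched unless the caller explicitly asks for it.
    """
    if command not in VALID_COMMANDS:
        raise ValueError("unknown command: %r" % (command,))
    argv = [script, "--age=%d" % age]
    if trial_run:
        argv.append("--trial-run")
    if archive_folder is not None:
        argv.append("--archive-folder=%s" % archive_folder)
    if depth is not None:
        argv.append("--archive-hierarchy-depth=%d" % depth)
    if maildir_root is not None:
        argv.append("--maildir-root=%s" % maildir_root)
    # The command and the folder name ('' for the top-level maildir) go last.
    argv += [command, folder]
    return argv


def run_cleanup(*args, **kwargs):
    """Run cleanup-maildir and return its exit status."""
    return subprocess.call(build_cleanup_command(*args, **kwargs))
```

For example, build_cleanup_command('archive', 'Sent Items', age=30) gives the trial-run version of the first sample; pass trial_run=False once the reported actions look right.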
Notes:
- the -n parameter can be very useful while you are familiarizing yourself with the script (it will only show you what it would do, without actually touching any file).
- when running the script against a different user's maildir, the script will create the folders with the ownership and permissions of the running user. So if the archived mails should also be usable from an email client, you will have to run the proper chown (and, if needed, chmod) command afterwards to fix them.
- after the script finishes you will be presented with some general statistics on what it has done. This looks like:
INFO:cleanup-maildir:Total messages: 51507
INFO:cleanup-maildir:Affected messages: 37611 (moved to archive)
INFO:cleanup-maildir:Untouched messages: 13896
- the script reads message dates from the mail headers rather than from the file timestamps (although a later comment below reports that the age check uses the file's modification time).
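The ownership fix from the notes above could also be done from Python instead of a shell command. A minimal sketch, assuming the archive folder should simply get the same owner and group as the maildir it belongs to; chown_tree and owner_of are hypothetical helper names, and changing ownership to another user normally requires root.

```python
import os


def owner_of(path):
    """Return the (uid, gid) that currently owns path."""
    st = os.stat(path)
    return st.st_uid, st.st_gid


def chown_tree(path, uid, gid):
    """Give path and everything below it to uid:gid.

    Returns the number of entries changed. lchown is used so that
    symlinks themselves are changed instead of being followed.
    """
    os.lchown(path, uid, gid)
    changed = 1
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            os.lchown(os.path.join(root, name), uid, gid)
            changed += 1
    return changed
```

Typical use would be to read the owner from the user's maildir with owner_of() and then call chown_tree() on the archive folder that the script has just created.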
Tags: maildir
11th May 2007, 19:29
I just need to delete any job in the /tmp/jo sub-directory that is older than 7 days from the current date. Is there a bash script for that?
4th October 2007, 14:43
I'd use the find command to locate the files over 7 days old (by ctime) and then pipe it through xargs rm… like this
find /tmp/jo -type f -ctime +7 -print0 | xargs -0 rm
To test, replace the rm with ls -la so you can see the files and ages.
12th October 2007, 19:31
I don’t see any mention of the metadata that courier appears to store about messages. There is the courierimapkeywords/:list file that holds keyword data. If the files are archived, won’t this information be lost?
11th December 2007, 13:33
Hi, this is interesting, but what if we want to do it for all the users? What if my user base is 10000 or more?
11th December 2007, 15:23
Jayen: if you want to use this particular script for your purpose, you will need to write a wrapper and call it similarly to my last example (with --maildir-root='/home/someuser/.Maildir'), then follow it with a chown -R to the particular user on the resulting archive folder. Do all of this in a loop that iterates over each of the existing users.
ps. maybe there are other similar scripts available that better fit your need… no idea. Let me know what you end up using in your situation.
11th December 2007, 18:21
Hi Marius,
Thanks for your prompt reply. Frankly, I do not know how to write a wrapper. Let me try to find out from Google… if you have any ready script, would you give it to me?
Regards,
11th December 2007, 18:37
Marius, another thing: can't we use a wildcard like %U?
The scenario is like this: I have virtual domains, and my Maildir is /home/vpopmail/domain/xyx.com/user/Maildir, and there are several users… it may reach 10k or more… what would you do in that situation?
11th December 2007, 22:09
Jayen: sorry no I don’t have such a script available. Here is an idea to get you started: You need to get a list of all the users, something like this:
Again this is just an idea to get you started. hth.
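One possible shape for such a wrapper, as a Python sketch rather than a tested tool: it assumes the vpopmail layout described above (/home/vpopmail/domain/<domain>/<user>/Maildir), treats a directory as a Maildir only if it has cur, new and tmp subfolders, and the helper names find_maildirs and commands_for are invented for this example.

```python
import glob
import os


def find_maildirs(pattern):
    """Return the directories matching pattern that look like a Maildir."""
    found = []
    for path in sorted(glob.glob(pattern)):
        subdirs = [os.path.join(path, sub) for sub in ("cur", "new", "tmp")]
        if all(os.path.isdir(sub) for sub in subdirs):
            found.append(path)
    return found


def commands_for(maildirs, age=90):
    """Build one cleanup-maildir 'archive' call per maildir.

    Mirrors the last sample usage in the post: yearly Archive.YEAR
    folders, with --maildir-root pointing at each user in turn.
    """
    return [
        ["cleanup-maildir", "--age=%d" % age, "--archive-folder=Archive",
         "--archive-hierarchy-depth=1", "--maildir-root=%s" % root,
         "archive", ""]
        for root in maildirs
    ]


if __name__ == "__main__":
    # Only print the commands; run them (e.g. with subprocess.call) and
    # fix the ownership of the new archive folders once the list looks right.
    for argv in commands_for(find_maildirs("/home/vpopmail/domain/*/*/Maildir")):
        print(" ".join(argv))
```

Printing the commands first keeps the wrapper harmless while you check that the glob pattern really matches only the maildirs you expect.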
13th December 2007, 06:18
Thanks for your input, I have given this project to a programmer who can write it for me.
Thanks again for your reply & inputs.
Regards,
Jayen
2nd July 2008, 13:17
A quick note. For python 2.5 you need to change message.fp.name to message.fp._file.name
3rd July 2008, 21:47
I can't get the script working on a computer with a newer version of Python. Could someone explain more exactly how to get it to work? I get this error message:
Cleaning Spam Folder for patrik
Traceback (most recent call last):
  File "/bin/cleanup-maildir", line 499, in <module>
    cleaner.clean(mode, dir, minAge)
  File "/bin/cleanup-maildir", line 363, in clean
    if (msg.getAge() >= minAge) and ((not self.keepRead) or (self.keepRead and msg.isNew())):
  File "/bin/cleanup-maildir", line 261, in getAge
    msgTime = self.getDateRecd()
  File "/bin/cleanup-maildir", line 247, in getDateRecd
    return os.stat(self.fp.name)[8]
AttributeError: _ProxyFile instance has no attribute 'name'
[root@server cron.daily]# python --version
Python 2.5.2
/Albin
3rd July 2008, 21:52
Albin: Look at my comment. You need to change fp.name to fp._file.name
3rd July 2008, 22:05
I have seen your comment, but where do i find fp.name?
3rd July 2008, 22:16
I find this in the code:
tryCount = 1
srcFile = msg.fp.name;
in line 122
is it this one?
3rd July 2008, 23:18
Well, there are many. I remember message.fp.name and self.fp.name. You need to change all the occurrences of it. Using a text editor you can replace the string.
4th July 2008, 07:36
I'll try it out. Thanks for your reply =)
4th July 2008, 08:09
I got it working =)
here is the new code working with python 2.5
http://prologic.se/cleanup-maildir.tar.gz
/Albin
21st July 2008, 06:30
I'm trying it on a VPS with Python 2.5 and it seems that even the modified version doesn't recognize the date of messages. For example:
$ /usr/local/bin/cleanup-maildir -n -v --age=30 --maildir-root=/var/spool/imap/test/Maildir archive ''
…
INFO:cleanup-maildir:Total messages: 102
INFO:cleanup-maildir:Affected messages: 0
INFO:cleanup-maildir:Untouched messages: 102
but I'm sure that the Maildir also contains messages older than 30 days!
If that can help, messages are delivered into Maildir/new/ by Procmail and my IMAP server is Dovecot.
Any suggestions?
Thanks,
Corrado Fiore
21st July 2008, 07:56
Ok, found the trick: the cleanup script looks at the date on the filesystem, not inside each message. So, if I copied a Maildir without keeping the original modification datetimes, I would end up with the results above: all files look new to the script.
Kind regards,
Corrado Fiore