Maildir++

In this document:

HOWTO.maildirquota

The remaining portion of this document is a technical description of the maildir quota extension. This section is a brief overview of this extension.

What is a maildirquota?

If you would like to have a quota on your maildir mailboxes, the best solution is to always use filesystem-based quotas: per-user usage quotas that is enforced by the operating system.

This is the best solution when the default Maildir is located in each account's home directory. This solution will NOT work if Maildirs are stored elsewhere, or if you have a large virtual domain setup where a single userid is used to hold many individual Maildirs, one for each virtual user.

This extension to the maildir format allows a "voluntary" maildir quota implementation that does not rely on filesystem-based quotas.

When maildirquota will not work.

For this quota mechanism to work, all software that accesses a maildir must observe this quota protocol. It follows that this quota mechanism can be easily circumvented if users have direct (shell) access to the filesystem containing the users' maildirs.

Furthermore, this quota mechanism is not 100% effective. It is possible to have a situation where someone may go over quota. This quota implementation uses a deliverate trade-off. It is necessary to use some form of locking in order to have a complete bulletproof quota enforcement, but maildirs mail stores were explicitly designed to avoid any kind of locking. This quota approach does not use locking, and the tradeoff is that sometimes it is possible for a few extra messages to be delivered to the maildir, before the door is permanently shot.

For best performance, all maildir clients should support this quota extension, however there's a wide degree of tolerance here. As long as the mail delivery agent that puts new messages into a Maildir uses this extension, the quota will be enforced without excessive degradation.

In the worst case scenario, quotas are automatically recalculated every fifteen minutes. If a maildir goes over quota, and a mail client that does not support this quota extension removes enough mail from the maildir, the mail delivery agent will not be immediately informed that the maildir is now under quota. However, eventually the correct quota will be recalculated and mail delivery will resume.

Mail user agents sometimes put messages into the maildir themselves. Messages added to a maildir by a mail user agent that does not understand the quota extension will not be immediately counted towards the overall quota, and may not be counted for an extensive period of time. Additionally, if there are a lot of messages that have been added to a maildir from these mail user agents, quota recalculation may impose non-trivial load on the system, as the quota recalculator will have to issue the stat system call for each message.

How to implement the quota

The best way to do that is to modify your mail server to implement the protocol defined by this document. Not everyone, of course, has this ability. Therefore, an alternate approach is available.

This package builds two small utility programs: "maildirmake" and "deliverquota". maildirmake is an extended version of the Maildir creation utility, with some additional options, including quota support.

The -qoptions to maildirmake installs the maildirsize file in an existing Maildir, which enables quota support:

maildirmake -q 10000000S ./Maildir

./Maildir is an existing maildir, and this -q options sets a quota of about 10 megabytes.

deliverquota reads the message from standard input, then delivers it to the maildir specified by the first argument to deliverquota, observing any quota that's set for the maildir. If the maildir is over quota, deliverquota terminates with exit code 77. Otherwise, it delivers the message, updates the quota, and terminates with exit code 0.

You will need to configure your mail server to use deliverquota instead of delivering directly to maildirs. The instructions for doing so depends on which mail server you use. For example, if you use Qmail and your maildirs are all located in $HOME/Maildir, replace the './Maildir/' argument to qmail-start with the following:

'| /usr/local/bin/deliverquota ./Maildir'

Then, run maildirmake with the -q option to set up quotas on all the maildirs.

That's pretty much it. If you handle a moderate amount of mail, I have one more suggestion. If possible, use deliverquota to deliver mail for a few weeks beforing setting up any quotas. Even if quotas are not used, deliverquota uses certain optimizations that permit very fast quota recalculation. Messages delivered by deliverquota have their message size encoded in their filename; this makes it possible to avoid stat-ing all files in the Maildir, when recalculating the quota. Then, after most messages in your maildirs have been delivered by deliverquota, activate the quotas.

maildirquota-enhanced applications

This is a list of applications that have been enhanced to support the maildirquota extension:

Quotas and deleted messages

The default application configuration that uses this maildirquota library does not count deleted messages, and any contents of the Trash folder, against the quota. Messages that are marked as deleted (but not yet actually removed), or messages that are moved to the Trash folder (which is subject to automatic purging) do not count towards the set quota.

It is possible to recompile the library to include all messages in the Maildir against the quota. This is done by using the --with-trashquota option to the configure script. Note that this option MUST be used to compile EVERY application that uses this maildirquota library. So, for example, if you have both maildrop and SqWebMail installed, you must use this option to recompile both applications.


Mission statement

Maildir++ is a mail storage structure that's based on the Maildir structure, first used in the Qmail mail server. Actually, Maildir++ is just a minor extension to the standard Maildir structure.

For more information, see http://www.courier-mta.org/maildir.html. I am not going to include the definition of a Maildir in this document. Consider it included right here. This document only describes the differences.

Maildir++ adds a couple of things to a standard Maildir: folders and quotas.

Quotas enforce a maximum allowable size of a Maildir. In many situations, using the quota mechanism of the underlying filesystem won't work very well. If a filesystem quota mechanism is used, then when a Maildir goes over quota, Qmail does not bounce additional mail, but keeps it queued, changing one bad situation into another bad situation. Not only do you have an account that's backed up, but now your queue starts to back up too.

Definitions, and goals

Maildir++ and Maildir shall be completely interchangeable. A Maildir++ client will be able to use a standard Maildir, automatically "upgrading" it in the process. A Maildir client will be able to use a Maildir++ just like a regular Maildir. Of course, a plain Maildir client won't be able to enforce a quota, and won't be able to access messages stored in folders.

Folders are created as subdirectories under the main Maildir. The name of the subdirectory always starts with a period. For example, a folder named "Important" will be a subdirectory called ".Important". You can't have subdirectories that start with two periods.

A Maildir++ client ignores anything in the main Maildir that starts with a period, but is not a subdirectory.

Each subdirectory is a fully-fledged Maildir of its own, that is you have .Important/tmp, .Important/new, and .Important/cur. Everything that applies to the main Maildir applies equally well to the subdirectory, including automatically cleaning up old files in tmp. A Maildir++ enhancement is that a message can be moved between folders and/or the main Maildir simply by moving/renaming the file (into the cur subdirectory of the destination folder). Therefore, the entire Maildir++ must reside on the same filesystem.

Within each subdirectory there's an empty file, maildirfolder. Its existence tells the mail delivery agent that this Maildir is a really a folder underneath a parent Maildir++.

Only one special folder is reserved: Trash (subdirectory .Trash). Instead of marking deleted messages with the D flag, Maildir++ clients move the message into the Trash folder. Maildir++ readers are responsible for expunging messages from Trash after a system-defined retention interval.

When a Maildir++ reader sees a message marked with a D flag it may at its option: remove the message immediately, move it into Trash, or ignore it.

Can folders have subfolders, defined in a recursive fashion? The answer is no. If you want to have a client with a hierarchy of folders, emulate it. Pick a hierarchy separator character, say ":". Then, folder foo/bar is subdirectory .foo:bar.

This is all that there's to say about folders. The rest of this document deals with quotas.

The purpose of quotas is to temporarily disable a Maildir, if it goes over the quota. There is one and only major goal that this quota implementation tries to achieve:

That's it. To achieve that goal, certain compromises are made: This won't be a perfect solution, but it will hopefully be good enough. Maildirs are simply designed to rely on the filesystem to enforce individual quotas. If a filesystem-based quota works for you, use it.

A Maildir++ may contain the following additional file: maildirsize.

Contents of maildirsize

maildirsize contains two or more lines terminated by newline characters.

The first line contains a copy of the quota definition as used by the system's mail server. Each application that uses the maildir must know what it's quota is. Instead of configuring each application with the quota logic, and making sure that every application's quota definition for the same maildir is exactly the same, the quota specification used by the system mail server is saved as the first line of the maildirsize file. All other application that enforce the maildir quota simply read the first line of maildirsize.

The quota definition is a list, separate by commas. Each member of the list consists of an integer followed by a letter, specifying the nature of the quota. Currently defined quota types are 'S' - total size of all messages, and 'C' - the maximum count of messages in the maildir. For example, 10000000S,1000C specifies a quota of 10,000,000 bytes or 1,000 messages, whichever comes first.

All remaining lines all contain two whitespace-delimited integers. The first integer is interpreted as a byte count. The second integer is interpreted as a file count. A Maildir++ writer can add up all byte counts and file counts from maildirsize and enforce a quota based either on number of messages or the total size of all the messages.

The current implementation of Maildir++ in Courier inserts whitespace padding on each line so that each line (including the terminating \n) is 14 bytes in size. This minimizes the impact of appending-related bugs in some NFS implementations.

Calculating maildirsize

In most cases, changes to maildirsize are recorded by appending an additional line. Under some conditions maildirsize has to be recalculated from scratch. These conditions are defined later. This is the procedure that's used to recalculate maildirsize:

  1. If we find a maildirfolder within the directory, we're delivering to a folder, so back up to the parent directory, and start again.
  2. Read the contents of the new and cur subdirectories. Also, read the contents of the new and cur subdirectories in each Maildir++ folder, except Trash. Before reading each subdirectory, stat() the subdirectory itself, and keep track of the latest timestamp you get.
  3. If the filename of each message is of the form xxxxx,S=nnnnn or xxxxx,S=nnnnn:xxxxx where "xxxxx" represents arbitrary text, then use nnnnn as the size of the file (which will be conveniently recorded in the filename by a Maildir++ writer, within the conventions of filename naming in a Maildir). If the message was not written by a Maildir++ writer, stat() it to obtain the message size. If stat() fails, a race condition removed the file, so just ignore it and move on to the next one.
  4. When done, you have the grand total of the number of messages and their total size. Create a new maildirsize by: creating the file in the tmp subdirectory, observing the conventions for writing to a Maildir. Then rename the file as maildirsize. Afterwards, stat all new and cur subdirectories again. If you find a timestamp later than the saved timestamp, either remove maildirsize and proceed, or repeat the recalculation.
  5. Before running this calculation procedure, the Maildir++ user wanted to know the size of the Maildir++, so return the calculated values. This is done even if maildirsize was removed.

Calculating the quota for a Maildir++

This is the procedure for reading the contents of maildirsize for the purpose of determine if the Maildir++ is over quota.

  1. If maildirsize does not exist, or if its size is at least 5120 bytes, recalculate it using the procedure defined above, and use the recalculated numbers. Otherwise, read the contents of maildirsize, and add up the totals.
  2. The most efficient way of doing this is to: open maildirsize, then start reading it into a 5120 byte buffer (some broken NFS implementations may return less than 5120 bytes read even before reaching the end of the file). If we fill it, which, in most cases, will happen with one read, close it, and run the recalculation procedure.
  3. In many cases the quota calculation is for the purpose of adding or removing messages from a Maildir++, so keep the file descriptor to maildirsize open. A file descriptor will not be available if quota recalculation ended up removing maildirsize due to a race condition, so the caller may or may not get a file descriptor together with the Maildir++ size.
  4. If the numbers we got indicated that the Maildir++ is over quota, some additional logic is in order: if we did not recalculate maildirsize, if the numbers in maildirsize indicated that we are over quota, then if maildirsize was more than one line long, or if the timestamp on maildirsize indicated that it's at least 15 minutes old, throw out the totals, and recalculate maildirsize from scratch.

Eventually the 5120 byte limitation will always cause maildirsize to be recalculated, which will compensate for any race conditions which previously threw off the totals. Each time a message is delivered or removed from a Maildir++, one line is added to maildirsize (this is described below in greater detail). Most messages are less than 10K long, so each line appended to maildirsize will be either between seven and nine bytes long (four bytes for message count, space, digit 1, newline, optional minus sign in front of both counts if the message was removed). This results in about 640 Maildir++ operations before a recalculation is forced. Since most messages are added once and removed once from a Maildir, expect recalculation to happen approximately every 320 messages, keeping the overhead of a recalculation to a minimum. Even if most messages include large attachments, most attachments are less than 100K long, which brings down the average recalculation frequency to about 150 messages.

Also, the effect of having non-Maildir++ clients accessing the Maildir++ is reduced by forcing a recalculation when we're potentially over quota. Even if non-Maildir++ clients are used to remove messages from the Maildir, the fact that the Maildir++ is still over quota will be verified every 15 minutes.

Delivering to a Maildir++

Delivering to a Maildir++ is like delivering to a Maildir, with the following exceptions:

  1. Follow the usual Maildir conventions for naming the filename used to store the message, except that append ,S=nnnnn to the name of the file, where nnnnn is the size of the file. This eliminates the need to stat() most messages when calculating the quota. If the size of the message is not known at the beginning, append ,S=nnnnn when renaming the message from tmp to new.
  2. As soon as the size of the message is known (hopefully before it is written into tmp), calculate Maildir++'s quota, using the procedure defined previously. If the message is over quota, back out, cleaning up anything that was created in tmp.
  3. If a file descriptor to maildirsize was opened for us, after moving the file from tmp to new append a line to the file containing the message size, and "1".

Reading from a Maildir++

Maildir++ readers should mind the following additional tasks:

  1. Make sure to create the maildirfolder file in any new folders created within the Maildir++.
  2. When moving a message to the Trash folder, append a line to maildirsize, containing a negative message size and a '-1'.
  3. When moving a message from the Trash folder, follow the steps described in "Delivering to Maildir++", as far as quota logic goes. That is, refuse to move messages out of Trash if the Maildir++ is over quota.
  4. Moving a message between other folders carries no additional requirements.