History of MH

The Section Overview summarizes the development of MH since the late 1970s. The Section From Bruce Borden was written by Bruce S. Borden, the person responsible for much of the early MH programming. The Section From Stockton Gaines was written by R. Stockton Gaines after he read the section that Bruce wrote. Stock, along with Norman Z. Shapiro, wrote a memo that laid out some of the design principles for MH. The Section The Original MH Proposal contains that memo (thanks to the RAND Corporation for permission to include it in the online version of this book).

Overview

Early in 1977, R. Stockton Gaines and Norman Z. Shapiro of the RAND Corporation laid out the MH principles in a way that's been followed amazingly well since. At that time RAND had an electronic mail system called MS. MS worked the way most mail software still does today: it was a monolithic system which didn't take advantage of the UNIX file and directory structure. Among the ideas laid out in the MH memo were: storing messages in a directory as normal text files, which could then be read by other UNIX programs as well as MH; deleting a message by changing its name (moving it to another directory); and having a "user environment" file that keeps track of what the user did last. The MH commands were a lot like MS commands except that they became individual programs, one for each task, executed with a UNIX shell. This original memo is in the Section The Original MH Proposal.
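The three ideas from the memo can be illustrated with a minimal Python sketch. This is not MH code (MH itself is written in C); the function names, the "discarded" subdirectory name, and the two-line context file layout are all assumptions made up for this illustration. The numbering is also simplified: a number can be reused after the highest message is discarded.

```python
import os
import tempfile


def add_message(folder, text):
    """Store `text` as the next numbered plain-text file in the folder directory."""
    os.makedirs(folder, exist_ok=True)
    numbers = [int(name) for name in os.listdir(folder) if name.isdigit()]
    number = max(numbers, default=0) + 1
    with open(os.path.join(folder, str(number)), "w") as f:
        f.write(text)
    return number


def discard_message(folder, number, trash_name="discarded"):
    """'Delete' a message by renaming it into a subdirectory (hypothetical name)."""
    trash = os.path.join(folder, trash_name)
    os.makedirs(trash, exist_ok=True)
    os.rename(os.path.join(folder, str(number)),
              os.path.join(trash, str(number)))


def save_context(context_file, folder, number):
    """Record the folder and message last used (assumed layout: one value per line)."""
    with open(context_file, "w") as f:
        f.write(f"{folder}\n{number}\n")


def load_context(context_file):
    """Read back the folder and message number written by save_context."""
    with open(context_file) as f:
        folder, number = f.read().splitlines()
    return folder, int(number)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        inbox = os.path.join(root, "inbox")
        first = add_message(inbox, "Subject: hello\n\nFirst message.\n")
        second = add_message(inbox, "Subject: again\n\nSecond message.\n")
        save_context(os.path.join(root, "context"), inbox, second)
        discard_message(inbox, first)
        print(sorted(os.listdir(inbox)))
        print(load_context(os.path.join(root, "context")))
```

Because each message is an ordinary file, any other program can read it, and a discarded message can be recovered by moving it back until a cleanup step removes it.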

By 1979, Bruce S. Borden had developed MH; it has remained conceptually the same ever since. Of course, some changes and a fair number of additions have been made to MH since it was created. Starting in 1982, Marshall T. Rose, aided by John L. Romine and with some help from Einar Stefferud, Jerry Sweet, and others at the University of California, Irvine (UCI), extended and maintained MH. (Marshall Rose has since left UCI.) Performance enhancements were also made at the University of California, Berkeley, and MH has been included with later versions of Berkeley UNIX (4BSD). Versions of MH also come with Digital's ULTRIX, IBM's AIX, and others. People at UCI, with help from contributors, updated MH until the late 1990s.

In 1997, Richard Coleman started work on nmh, the "new MH". As MH development at UCI ended, the nmh developers -- first Richard Coleman, then a team spread across the Internet -- have revised some MH programs, added new features, and worked to make the code simpler and more portable. At this writing, in mid-2003, nmh development has slowed but not stopped.

From Bruce Borden

I joined the Rand Corporation in 1978. My first assignment was to "improve" the MS mail system, which had been developed over the previous two years by Dave Crocker and others at Rand. MS was synthesized from the various mail packages the authors had used and researched on other systems, most notably, Tenex. It was the ultimate in monolithic mail packages, attempting to provide every feature provided by all other packages. It was terrible. It was so unlike common UNIX programs that I found it totally unusable. It was also huge and slow. (We were running on a PDP-11/70!) I was supposed to speed it up and make it more robust. After about a month, I gave up. I went to my management and recommended that MS be discarded, and a much simpler package built from the ground up. MS was developed on government contract, and Rand was committed to delivering a product.

At that point, I started talking with Stockton Gaines and Norm Shapiro about a memo they had written, in which they had proposed that standard UNIX files and directories be used for mail messages, along with standard UNIX commands like ls and cat to list and display messages. They also proposed that UNIX environments be used to hold things like current message number. Finally, they suggested that the user chdir into a working folder to operate on it. They had proposed these ideas at the start of the MS project, but they were not able to convince anyone that such a system would be fast enough to be usable. I proposed a very short project to prove the basic concepts, and my management agreed.

Looking back, I realize that I had been very lucky with my first design. Without nearly enough design work, I built a working environment and some header files with key structures and wrote the first few MH commands: inc, show/next/prev, and comp. show/next/prev were one command -- it looked at its name to determine which flavor to be. With these three, I was able to convince people that the structure was viable. This took about three weeks.
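(Illustration only, not Bruce's code, which was written in C.) The trick of one program behaving as show, next, or prev depending on the name it was run under might be sketched in Python as follows. The function names are hypothetical, and the sketch assumes messages are numbered 1 through `last` with no gaps, which is a simplification.

```python
import os


def command_name(argv0):
    """Return the name the program was invoked under, without its directory."""
    return os.path.basename(argv0)


def target_message(name, current, last):
    """Choose which message to display based on the invoked name."""
    if name == "next":
        target = current + 1
    elif name == "prev":
        target = current - 1
    else:
        # "show" (or any other name) displays the current message.
        target = current
    if not 1 <= target <= last:
        raise ValueError(f"no {name} message")
    return target
```

A single installed file could then be given three names (with links, for example), and a small driver would call `target_message(command_name(sys.argv[0]), current, last)` to decide what to display.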

About this time, I also came up with the name MH -- Mail Handler; I needed a name, and I couldn't think of anything better. I've never liked the name!

Over the next six months, I completed the basic MH commands: inc, show, next, prev, comp, repl, forw, dist (which Steve Tepper wrote), rmm, rmf, folder(s), scan, refile, and pick. I then wrote mhmail, anno, ali, and prompter (because I was tired of using vi to do simple composes).

There were so many "small" decisions made during this process, it is amazing how consistent MH turned out to be. For example, I needed a way to name a folder as an argument to the MH commands, and I didn't want the user to have to type -folder foo. Even with abbreviations (a very non-UNIX design decision), this was too cumbersome. So, I introduced the +folder syntax. This also simplified the syntax when two folders could be specified (refile, for example). Because everything was modularized, I was able to add message names, like first and last, without changing anything but a library routine. Many initial users wanted shorter names for commands -- even the mostly four-letter lengths were too long for most users. Rather than rename the basic commands, I designed MH for use with shell aliases. Most users preferred n and p for next and prev, for example. Another common request was to combine rmm and next, which was commonly aliased as rn, or, for me, as , (that's right, a comma).

There are a few other design decisions which have been very successful. Default switches and global settings in the .mh_profile file worked very well. Pulling files out of the user's mail drop into an MH folder with inc provided a clean interface between the external mail delivery environment and MH.

Some early decisions have been changed by later developers of MH. For example, I felt that the backquote conventions of the shell were too clumsy for most users, so I didn't provide an mhpath program, and pick had scan and file switches to make it useful. I also kept most changeable information in the .mh_profile file in an attempt to speed up MH operations. Most of these variables have been moved into other context files within the MH tree.

I think MH worked and has survived for many reasons. First, it is very UNIX-like. There isn't much to learn to use it. Second, it keeps its own context, which is almost completely independent from anything else the user is doing. A user can run inc or comp anywhere and any time without affecting his current context. Mail isn't something you stop to do -- mail processing is interwoven into the fabric of a user's daily activities. You're running a program and discover a bug, you send a quick mail message, perhaps piping the output of the program into mail. No other package that I know of makes this type of interwoven mail handling so easy and intuitive. Finally, the structure of the source tree and the implementation of a comprehensive support library have made MH command development and support very easy. Any good UNIX programmer can modify an MH command, fix a bug, or add a new command with a few hours of source tree review.

I have a few regrets with MH. After using MH for a few years, I decided that some fundamental functionality of e-mail communication was missing. For example, I'd send a message to someone asking some simple question, and when I finally got a "yes" message back, I had no idea what the original question was, and no easy way to find it with MH. The bigger requirement here is for conversation support. Embedding replied-to messages in the body of a reply message is insanity. E-mail packages should provide automatic retrieval of the replied-to message. The In-reply-to: component is sufficient for this. Imagine being able to walk down a multi-branching tree of messages which represent a long-running conversation on some topic and its related topics. This is still missing from MH and other mail packages.

For many years, MH was limited to 999 messages in a folder. I made this decision consciously -- anyone with that many messages in one folder needed to divide it up into subfolders. I'm not sure I should have imposed my own views this way.

It was many years from the time MH was completed until it was put in the public domain. I developed MH on Rand's own money (the MS development contract had been completed), and Rand worried about legal ramifications of releasing MH to the world. I'm very glad that MH has become public domain and that it is so widely used. Although I've done many exciting things in my career, I get the most satisfaction from MH, knowing how widely it is used and how well it has aged. I am also thankful to all the people who have worked on MH and enhanced it over the years. MH still has the same flavor, and when I look at the source tree, it is still familiar after 14 years!

Bruce Borden, July 1992

From Stockton Gaines

It is now 15 years since the beginning of MH, and inevitably there are some differences in what we all remember about those days. Herewith I include some of my own recollections...

The memo from Norm and me speaks for itself. After the memo, there was a meeting to discuss it, at which almost everyone present (who shall remain nameless) opposed it. Arguments were given about inefficiencies, etc. Bruce arrived at Rand a month or so after this.

When he discovered our memo in the late spring of 1977, he came to talk to me and told me that he thought it would be pretty straightforward to create a mail system such as Norm and I had described. At that time, I headed a project funded by the Air Force, and I thought that this work would be appropriate, so I provided the support for Bruce.

My recollection is that six days after our conversation, Bruce showed us an initial version with about six commands working! I was extremely impressed with what Bruce was able to do, and naturally pleased that the ideas from the memo were validated. Bruce suggests that there was an initial working version in about three weeks, so probably what he demonstrated earlier wasn't complete enough to use.

The next several months were quite exciting. It is a prime example of experimental computer science, and it is impossible to imagine that MH would have evolved to what it became with more formal software engineering practices. To have begun with a full requirements specification and a top-level design would have been to rob the whole project of its creative energy.

During the initial period of development, all of the work and most of the ideas came from Bruce. However, others did contribute, including Norm and me, and also Bob Anderson.

Bruce made one significant invention that I found particularly impressive. The various commands for handling messages (for example, forw) needed to be able to work on subsets of the messages in a directory. Specifying a range was easy, but specifying by date or other contents of a message or header was not. We appeared to be in danger of ending up with an extremely complex command format for MH.

Bruce's elegant solution was to define a separate function, pick, to do the selection. The initial implementation simply linked all selected messages into a subfolder, from which the desired activities could be carried out (also an elegant idea). Subsequently, other ways of using the results of the pick command have been devised, but the insight of making pick a separate function was profound and has contributed greatly to the success of MH.
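(Illustration only, not the original implementation.) The idea of linking selected messages into a subfolder can be sketched in Python with hard links. The function name and the `matches` callback are hypothetical, and keeping the original message numbers in the subfolder is an assumption of this sketch.

```python
import os


def pick_into_subfolder(folder, subfolder_name, matches):
    """Hard-link every message whose text satisfies `matches` into a subfolder.

    Returns the list of selected message numbers in ascending order.
    """
    target = os.path.join(folder, subfolder_name)
    os.makedirs(target, exist_ok=True)
    picked = []
    numbers = sorted(int(name) for name in os.listdir(folder) if name.isdigit())
    for number in numbers:
        source = os.path.join(folder, str(number))
        if not os.path.isfile(source):
            continue
        with open(source, errors="replace") as f:
            text = f.read()
        if matches(text):
            link = os.path.join(target, str(number))
            if not os.path.exists(link):
                # A hard link adds a second name for the same file; nothing is copied.
                os.link(source, link)
            picked.append(number)
    return picked
```

Since the subfolder is just another directory of message files, commands that operate on a folder can work on the selection without any special selection syntax of their own.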

Stock Gaines, July 1992

The Original MH Proposal

To: Bob Anderson
From: Stock Gaines, Norm Shapiro
Subject: THE NEXT MESSAGE SYSTEM
Copies: Dave Crocker, Dave Farber, Carl Sunshine, Steve Tepper, Steve Zucker

While the creators of MS are to be congratulated in having produced a substantial advance over SND and MSG, the current system, in a couple of ways, falls short of the software for dealing with messages that we should have in UNIX. MS as it stands is in two fundamental and important ways at odds with the UNIX philosophy and approach. We think that another iteration on message software should take place which will provide us with software dealing with messages that is again an advance over MS and will fit in naturally with UNIX in a way that MS does not, from which a number of practical advantages will follow. The two ways in which MS is basically incompatible with the UNIX approach are first that it is a monolithic system rather than being a set of functions which are callable from wherever is appropriate, and second that the storage of messages is not done by making appropriate use of the file and directory structure (an exceedingly elegant, simple and powerful one) already existing in UNIX.

Let us discuss the UNIX way of storing messages first. As an alternative to the clumsy method of using a text file and a structure file, we suggest that instead a mailbox be simply a directory. Each message would then be a separate file in that directory. If it is necessary to keep additional information about the files in the directory, that can be done by entering in the message directory a file containing information about the messages in the directory. Notice how many of the things we are trying to do with the structure file get handled automatically if this occurs. For instance, each time a message is written or read, the file system already automatically updates this information. Therefore, a clear indication that we have a new message in a mail folder is that the instant of writing and reading is the same. If they are different then we can test the time last written to see if the message was received recently or not. Dave Crocker has in the past pointed out that the rm command has the disadvantage that it throws a file away. It would be quite appropriate to add a shell command called, say, dis (for discard) which moves a message from the directory it is in to a subdirectory of that directory which we may think of as the discarded messages directory. These messages can be cleaned out by some sort of a cleanup command or by software that carries out this task at appropriate times. The point is that IF the garbage retrieval function is desirable for messages, then it is so for files. Of course, in the directory structure we have no information concerning the contents of messages. However, there is some reason to believe that the current design which retains pointers to each of the components of a message is of no advantage and may be more costly in execution time than if no such information were available. In any event, it is merely an effort towards efficiency and one which appears to have little value.

The additional value which would accrue if messages were files is substantial. They then become accessible to other software in the system in a natural, convenient and highly useful way. The lack of such accessibility of messages is currently one of the major deficiencies of MS. As Steve Tepper has suggested, the draft message might itself be a directory to expedite its processing, although it is not clear that the advantages of this outweigh the advantages of leaving the whole draft as a single file.

The second major difference we are suggesting between the current MS and the approach we believe is appropriate for UNIX is that the functions for dealing with messages should be embodied in individual command level routines which can be executed by themselves rather than only being available through a subsystem. The subsystem approach is appropriate for special situations such as NED, but inappropriate where there is not some overriding consideration such as the consistency which must be maintained between the different functions in a special environment. It is, of course, desirable to maintain a certain amount of consistency between the functions. Right now, for instance, it is nice (but not critical) that MS remembers which message a user last referred to, and will show the next message without his having to remember what number to type next. However, there is a natural and useful way of achieving that effect without a subsystem; have a "user environment" file available which the message software (in contrast to a message system) knows about, updates and understands. In such a user environment there could reasonably be a description of which message was last examined (and in which directory). This approach has the advantage that such information is not lost, as it currently is when one exits from MS.

It is quite evident how to implement most of the current MS functions as individual subroutines. For instance, the scan routine must examine a mailbox which is now a directory and summarize the messages in it. This is nothing but an extension of ls, which reads some information from the header of each message in addition to reading the directory itself, and would be very straightforward to implement. Reply clearly initializes the draft message in a very straightforward way. The show command is nothing but a variety of 1. Next and previous work in a straightforward manner if the user message environment file is maintained.

There are other advantages to this approach. Users who learn UNIX would not have to become familiar with a whole new language but only with three or four new functions. As message handling software evolves much of it will be applicable to other text handling functions. For example, a program to display all messages, in a given directory or set of directories, from, to or about a given UNIX user, would also be usable to display all files in a "source" directory of C programs which use a given function. Dually, as general UNIX software evolves, it will tend to be more applicable to message handling.

MS has made important contributions to our ideas concerning messages and how to handle them. It is not the subsystem itself, but the basic ideas about messages underlying it, which represents the important contribution of its creators. It seems likely that breaking the functions of MS out of a subsystem into a set of separately executable subroutines would not be a terribly difficult task, would give us an opportunity to redo some of those in ways that correct some of the existing flaws, and would integrate message handling into UNIX in a much more natural and useful way. We suggest that this approach should be followed in contrast to investing very much more effort into upgrading MS.

We would be delighted to discuss these ideas more fully.