pmail Manual


One time set up.
Click here to see/save the Perl script. It is just a TEXT file. Under UNIX/Linux just create an executable file containing the text; under MacOS, create a Droplet containing the text.

Open the Perl script (pmail), set the shebang line appropriately, and then find and set one hash, one array and two strings to appropriate values. (They are currently set to "XXXX".) Neither the hash nor the array need have more than one entry.

The entries in the hash '%mail_servers' are used by 'pmail' to set a default SMTP server address to use to send your messages. There is a way, the Via:-line, to override it, but here we describe the default. Usually, SMTP servers are linked to a collection of IP numbers. Recall that an IP number is a collection of four numbers separated by periods: 'X.X.X.X', where each X is a number between 0 and 255. The SMTP server expects the machine making the mail request to have an IP number whose first several numbers are fixed. Each SMTP server you use should have a '%mail_servers' entry. The key for the entry should be as much of the IP number as the server will insist be matched and the value should be the SMTP server address. As an example, the Notre Dame SMTP server is 'smtp.nd.edu' and expects IP numbers to begin '129.74.' so its entry would be '129.74.'=>'smtp.nd.edu'. Actually, you only need enough of the IP number to make it unique amongst the ones you are entering, so '129.'=>'smtp.nd.edu' would work if you have no other servers with IP numbers starting with 129.

When 'pmail' starts up, it gets the IP number of the machine on which it is running. (You will be warned that you are not connected to the internet if 'pmail' fails to find a local IP number - most often on Macs not currently connected to the network, but the script continues so you can compose off-line and only go on-line to send the message.) Then the default SMTP server address will be the value of the '%mail_servers' entry whose key appears in the local IP number. If there is more than one such key, you will get the first match, but the search order is the usual Perl keys order and so which one you get is effectively undetermined. If there is no such key, then the empty string is the default SMTP server address and if one is not set using the Via:-line, no messages will be sent.

If you have no idea what your IP number is, just start 'pmail' with no message and you will get a warning that your IP number can't be found, and in particular, your current IP number is printed. You are on your own as far as determining your SMTP server but 'smtp.domain_name' is a common choice.
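
As a rough illustration of the lookup just described, the following sketch picks a default server by checking which key of the hash appears in the local IP number. Only the hash name and the Notre Dame entry come from the text; the sub name default_smtp and the sample IP numbers are assumptions made for this example, and the real script may do the search differently.

```perl
use strict;
use warnings;

# Entry taken from the example in the text; add one entry per server.
my %mail_servers = ('129.74.' => 'smtp.nd.edu');

# Hypothetical helper: return the server whose key appears in $ip,
# or the empty string when no key matches.
sub default_smtp {
    my ($ip, %servers) = @_;
    for my $key (keys %servers) {    # Perl's keys order, so ties are arbitrary
        return $servers{$key} if index($ip, $key) >= 0;
    }
    return '';
}

print default_smtp('129.74.10.5', %mail_servers), "\n";    # smtp.nd.edu
print "no default server\n" if default_smtp('10.0.0.7', %mail_servers) eq '';
```

Because the test is "the key appears in the IP number", longer keys such as '129.74.' are less likely to match by accident than short ones.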

You are done with the set up. Write message files and use pmail to send them. Users in a hurry might want to read a couple of the sample files and get started.


An Overview.
'pmail' expects to get a list of message files as it starts up, either by the usual @ARGV mechanism from UNIX or drag&drop in the MAC world. Each message file is processed independently, so we will only describe how one file is processed.

The simplest function that 'pmail' can perform is to allow you to compose an email message using any text editor and then send it using pmail. But 'pmail' can do much more. A good example to keep in mind is mail-merge.

Recall the basic format of a mail merge. First there is the main-document, in our case the contents of the file being processed. Then there are data-records: each data-record contains information which is substituted into the main-document to produce the different messages. 'pmail' assembles its data-information off the first line of the main-document, the To:-line. Then comes a Subject:-line. Then there is the option of having a From:-line and/or a Via:-line to manage the multiple SMTP servers and return addresses. Then comes the text of the message. We support CC: (and BCC:) in a fairly robust way. Finally, the message can be ended with an !END!-line, after which you may have information in the file which will not be emailed.

Finally, 'pmail' can also evaluate Perl code embedded in the message which gives it even more power.


Format of a main-document file.
The To:-line
The first line of the message contains information needed to assemble a list of email addresses, and perhaps other data as well. The line can begin with anything you like, but the important data occurs after a 'To:' is found. Actually, the 'To:' can be a 'XX:', a 'TT:' or an 'Ag:', but we will refer to this line as the To:-line.

The data on the 'To:' line is either a list of email addresses, separated by spaces or tabs, or else it is a list of file names separated by spaces or tabs. 'pmail' distinguishes these options as follows. The 'To:'-line is collected as is. If it contains a @, then it is assumed to be a list of email addresses. If not, then it is a list of files.

Since Mac and PC files can have spaces in their names we support a compromise. If there is a tab in the 'To:'-line, we assume that tabs are being used to delineate the file names and then file names can have spaces, but if no tabs are found, we assume that space is the delineator and then files can not have spaces in their names. We never try to open files with blank file names, so if you only have one file but it has spaces in its name, just end the name with a TAB and it will be read as one file, not several. We do not support file names that begin or end with spaces.
The data after the 'To:' can continue over several lines if desired, indeed, the address data continues until the 'Subject:'-line is encountered. The 'To:'-line also supports the ability to append fields to the records as discussed here. There is also a way to filter the list of addresses, the ~m filtering mechanism discussed here.
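
The rules above for reading the 'To:'-line can be sketched as follows. This is only an illustration under stated assumptions: the sub name parse_to_line is made up, it takes the text after the 'To:' as a single string, and it ignores the field-appending and ~m features.

```perl
use strict;
use warnings;

# Hypothetical helper: classify the text after 'To:' and split it into items.
sub parse_to_line {
    my ($data) = @_;
    # An @ anywhere means email addresses, otherwise file names.
    my $kind = index($data, '@') >= 0 ? 'addresses' : 'files';
    # File names are split on tabs when a tab is present, so they may
    # contain spaces; everything else is split on spaces or tabs.
    my $sep = ($kind eq 'files' && $data =~ /\t/) ? qr/\t/ : qr/[ \t]+/;
    my @items;
    for my $raw (split $sep, $data) {
        my $item = $raw;
        $item =~ s/^ +//;    # leading and trailing spaces are not supported
        $item =~ s/ +$//;
        push @items, $item if length $item;    # never keep a blank name
    }
    return ($kind, @items);
}

my ($kind, @items) = parse_to_line("class list.txt\t");
print "$kind: @items\n";    # files: class list.txt
```

The trailing TAB in the usage line is what keeps 'class list.txt' from being read as two files.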

The stuff before the 'To:' on the first line does serve a function. The string is saved and then this "comment-string" is removed from the start of every line on which it occurs.

As an example, when composing TEX files to pmail, one does not want to be always TEX'ing the To: and Subject: part, so if we use a '%To:' on the 'To:' line, then we can start all the lines we don't want TEX'ed with a %. To get a TEX comment line in the mailed file, start the line with a '%%'. The first '%' will be removed before the file is sent, but not the second. Click message.tex for a sample TEX file. To pmail a C program that you want to compile as you write, check out message.c.
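
A minimal sketch of the comment-string mechanism, assuming the message is already held as a list of lines; the sub name strip_comment_string is hypothetical and the actual script may organize this differently.

```perl
use strict;
use warnings;

# Hypothetical helper: find the comment-string on the first line and
# strip it from the start of every line that begins with it.
sub strip_comment_string {
    my @lines = @_;
    return @lines unless @lines;
    # Everything before the first To:, XX:, TT: or Ag: is the comment-string.
    my ($comment) = $lines[0] =~ /^(.*?)(?:To|XX|TT|Ag):/;
    return @lines unless defined $comment && length $comment;
    s/^\Q$comment\E// for @lines;    # only one copy is removed per line
    return @lines;
}

my @out = strip_comment_string('%To: joe@nd.edu', '%Subject: draft',
                               '%%a TEX comment', 'Body text');
print "$_\n" for @out;
```

With '%' as the comment-string, the '%%' line keeps one '%' and so remains a TEX comment in the mailed text.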

Finally, we explain the different starts to the 'To:'-line. First we need to remark that after a message is sent, lines are added at the end of the file to indicate to whom it was sent, when it was sent, to whom it was CC'ed or BCC'ed, which return address was used and which SMTP server was used.

The To:-Ag: differentiation works well in practice. If you need to send the message to a new list of people, just add the list on the To: line and all the new people on the list will get the message, but none of the people to whom you have already mailed it will get a second copy. I installed this locking mechanism after inadvertently resending messages to the same people, either because I forgot to change the addresses on the To:-line or because these people were also on the new list that I used. This filtering of email addresses is case insensitive as are email addresses. Adding addresses to the CC:-list and CC'ing old messages is NOT supported.

The Subject:-line
The 'Subject:' line is the first line that contains 'Subject:'. It can have stuff before it, but the stuff on the line after the 'Subject:' becomes the 'Subject:' line of the email message. The 'Subject:' cannot be spread over more than one line.

The From:-line
The 'From:' line is optional. If there is none, the first address in your '@return_addresses' array is used as the return address. If there is a 'From:' line, it can have several formats. If the text after the 'From:' is two or fewer characters, then it is assumed to be a number and that entry in the '@return_addresses' array is used as the return address. If it is more than two characters long, then the '@return_addresses' array is searched to see if any entries contain this text. The first entry that does is used as the return address. If there is no match, then the entire text is used as the return address. The 'From:' and 'Via:' lines can be in either order.

One use of an alternate return address is to have a "No Spam" return address for certain emails. Things like "meNOSPAM@wherever" are popular but spammers have caught on and just delete the "NOSPAM" part to get a working address. Something like "me.XXXX.@wherever" should be OK but spaces do not work well since email addresses are not supposed to have spaces. 'pmail' enforces this by removing all spaces in the text. Adding the line
From:XXXX
will use "me.XXXX.@wherever" as your return address if "me.XXXX.@wherever" is one of your '@return_addresses' array entries.
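
The 'From:' rules can be sketched as below. The sub name resolve_from is hypothetical, and the text does not say whether a numeric 'From:' counts entries from 0 or from 1; the sketch assumes 0, in line with the C/Perl numbering used elsewhere in this manual.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the rules described above.
sub resolve_from {
    my ($text, @return_addresses) = @_;
    # No From: line at all: the first array entry is the return address.
    return $return_addresses[0] unless defined $text;
    $text =~ s/ //g;    # all spaces are removed from the text
    # Two or fewer characters: treat the text as an index into the array.
    # ASSUMPTION: the index is 0-based; non-digits fall through to the search.
    return $return_addresses[$text] if length($text) <= 2 && $text =~ /^\d+$/;
    # Otherwise the first entry containing the text wins ...
    for my $address (@return_addresses) {
        return $address if index($address, $text) >= 0;
    }
    return $text;    # ... and with no match the text itself is used
}

my @return_addresses = ('me@wherever', 'me.XXXX.@wherever');
print resolve_from('XXXX', @return_addresses), "\n";    # me.XXXX.@wherever
```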

The Via:-line
The 'Via:' line is usually optional. If there is none, the default entry is used as your SMTP server address, but there are some cases in which the default entry is not a valid address and then the 'Via:' line is required. If there is a 'Via:' line, the text after the 'Via:' is assumed to be an SMTP server address, or at least part of one. First the '%mail_servers' hash is searched to see if the text occurs in any of the SMTP server addresses. If it does, then the first one found is used as the address. If the text is not found in any of the known addresses then the entire text is used as your SMTP server address. The 'Via:' and 'From:' lines can be in either order.

The 'Via:' line is of little use under UNIX since usually there is only one SMTP server that will work with the UNIX machine you are on and that will be set automatically if you have set your '%mail_servers' hash correctly. Under MacOS, where your IP number may change as you move your portable around, the '%mail_servers' hash is designed to handle the usual places you use for email, but if you have plugged into a new network, the 'Via:' line can be used to get to the SMTP server for this network.

The message
After the 'To:' line, 'Subject:' line and any From: or Via: lines comes the message. This can be as simple as just an ordinary email message or much more complicated as described below. It continues until the end of the file or until a line with just an !END! on it is found.

The message can contain !-!-substitutions and Perl code.

CC: and BCC: At the end of the message, just before the !END! if there is one, you can have a CC: list or a BCC: list. Start with a 'CC:' (or a 'BCC:') at the start of a line. The rest of the message down to the end of the file or to the !END!-line is a list of email addresses, or a list of files which contain addresses. The format of the list is the same as for the list on the 'To:'-line except that we do not support the [file{list}=>{list}]-option. Indeed, we only collect the email addresses for a CC: list even if other data is present. We will call the returned list the 'CC:'-list even if it comes from a 'BCC:'. We do support the ~m filtering mechanism in these lists, but not any scheme to add to the CC:-list after a message has been sent.

If a message has a 'CC:'-list, the message is assembled as usual: first the !-!-substitutions are done, then the Perl code is evaluated and then the message is sent off to the address. Then the identical message is mailed to all the addresses on the 'CC:'-list. We do NOT re-substitute or reevaluate Perl code. The people on the 'CC:'-list see the identical message that the main recipient sees.

The difference between 'CC:' and 'BCC:' is that the addresses on the 'CC:'-list are appended to the message after the Perl Code is evaluated and before the message is sent. They are listed (and sent) in the same order in which they were input. For a 'BCC:', no addresses are appended. Compare the CC-version to the BCC-version. We do not support a mixed 'CC:'-'BCC:' possibility.

Indeed, BCC: is rarely needed. If you want to send a generic message to a list of people, this can be done via the To:-line. One use is to BCC yourself, so that a copy of the message is emailed to you without the recipient being aware of it, although you already have the original. It is also sometimes useful for letting people know that you have handled some message.

Both CC and BCC also need to be used with caution. If you email to a long list of addresses and there is a CC or BCC in the message, each message in the list is also sent to the CC:-list. You should ponder how annoying this will be to these people before you send. See the warning in the CC-version and BCC-version examples.

The !END!-line.
No !END! is required, but if there is one on a line by itself, the message ends just above it. Since the "comment-string" is removed at the start, this line can actually be 'comment-string!END!'. Being able to end the message before the end of the file has several uses. One use is that if we need certain data to compose the email message, we can keep it with the message file below the !END! and we can see it when composing the message, but it will not be sent in the email message. See message.c and message.tex for examples of this use of the !END! mechanism.

The Sent-list.
A second use of the !END! mechanism is that whenever a message is finally emailed, the address, date and time that the message was sent is appended to the file. (If there is no !END!-line, one is created.) This Sent-list gives you a record of having sent the message "to whom when" and what return address and what SMTP server. Actually, this time-stamp is the time the message was successfully passed to the mail program; we have no way of knowing what happened after that. The addresses are listed in the order they were shipped to the mail program and the main recipient is distinguished from those on the CC:-list or the BCC:-list.


Assembling each message.
After the data-information is collected for a message-file, it is time to assemble and mail the messages. Recall that the data is now a hash of records with the hash-key being the email address to which to send the message. We loop through the records while sorting the keys with the usual Perl ASCII sort.

For each data-record, we make a copy of the message and do the !-!-substitutions. Then we evaluate the Perl code. If there is a CC:-list, we append the email addresses to which we will CC. Finally we email the message (or send it to the screen), and do any CC or BCC emails required.

Then we get the next record and repeat.

!-!-substitutions.
Each record in the message-data consists of a list of fields. One field is an email address, but just as in traditional mail merges, there can be other fields as well. We can refer to the fields by number if they were collected as an array, or by name if they were collected as a hash. To refer to a field if you have collected by array, use ![number]!; to refer to a field if you have collected by hash, use !{name}!. See here for an example. Array fields have their traditional C/Perl numbering, so the first field is field 0. We support negative numbers as usual: -1 is the last column, -2 is the next to the last, etc. If the 'number' is out of range or the 'name' hash does not exist, you get a warning. If we are actually going to mail the messages, the program shuts down after such an error, but if we are merely sending the messages to the screen, we plow on. If the field happens to be empty, you will get a warning, but the program continues and will email these messages.
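
The substitution step might look roughly like the sketch below. The names substitute_fields and missing_field are hypothetical, and the sketch simply warns and leaves an unknown field in place; it does not reproduce the shut-down behavior described above.

```perl
use strict;
use warnings;

# Hypothetical helper: warn about an unknown field and leave it in place.
sub missing_field {
    my ($token) = @_;
    warn "no such field: $token\n";
    return $token;
}

# Hypothetical helper: $record is an array ref (fields by number) or a
# hash ref (fields by name).
sub substitute_fields {
    my ($message, $record) = @_;
    if (ref $record eq 'ARRAY') {
        my $count = scalar @$record;
        # Negative numbers count from the end, as usual in Perl.
        $message =~ s/!\[(-?\d+)\]!/
            ($1 >= -$count && $1 < $count) ? $record->[$1] : missing_field("![$1]!")
        /ge;
    } else {
        $message =~ s/!\{(\w+)\}!/
            exists $record->{$1} ? $record->{$1} : missing_field("!{$1}!")
        /ge;
    }
    return $message;
}

print substitute_fields('Dear ![1]!, your score is ![-1]!.',
                        [ 'joe@nd.edu', 'Joe', '93' ]), "\n";
print substitute_fields('Dear !{first}!.', { first => 'Ann' }), "\n";
```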

Perl code.
A cool feature of pmail is that it is possible to include Perl code in the message which will execute before the message is sent. The mechanism is the following. The script searches for a [[[ and then for a ]]]. All the text between these two symbols is evaluated as Perl code. You may have more than one instance of [[[...]]]. The current values of the message-record are passed as $fields[0], $fields[1], etc. if the data was collected as an array, or as $fields{name1}, $fields{name2}, etc. if the data was collected as a hash.

The gory details are as follows. We locate the first [[[...]]] and proceed as follows. First we divide the message into $start, the part before the [[[, and $end, the part after the ]]], and then we evaluate the text between [[[ and ]]] as a Perl program. You may use $start, $end and other global variables in your code. As your code returns, the value of the last statement executed is saved in $middle and the message now reads $start.$middle.$end. (If you just want the message to be $start.$end, just have your last statement return("");. ) After we have assembled the message, we check for the next instance of [[[...]]] starting after the substring $start.$middle and evaluate it. We continue to process the text of the message until we reach the end. Since each search for a '[[[' begins after the previous ']]]', it is difficult, but not impossible, to write infinite loops.
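
The loop just described can be sketched as follows. The variable names $start, $middle and $end come from the text; the sub name expand_code, the error handling and the treatment of an unmatched [[[ are assumptions of this sketch, not a description of the script itself.

```perl
use strict;
use warnings;

# Hypothetical helper: evaluate each [[[...]]] in turn and splice the
# result back into the message.
sub expand_code {
    my ($message) = @_;
    my $pos = 0;    # where the next search for [[[ begins
    while (1) {
        my $open = index($message, '[[[', $pos);
        last if $open < 0;
        my $close = index($message, ']]]', $open + 3);
        last if $close < 0;    # ASSUMPTION: an unmatched [[[ is left alone
        my $start  = substr($message, 0, $open);
        my $code   = substr($message, $open + 3, $close - $open - 3);
        my $end    = substr($message, $close + 3);
        my $middle = eval $code;    # the code can see $start and $end
        die "embedded Perl code failed: $@" if $@;
        $middle = '' unless defined $middle;
        $message = $start . $middle . $end;
        $pos = length($start . $middle);    # never rescan what was produced
    }
    return $message;
}

print expand_code('You owe [[[ 2 * 21 ]]] dollars.[[[ return(""); ]]]'), "\n";
```

Restarting the search after $start.$middle is what lets a block return a literal [[[ without it being evaluated again.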

The array (or hash) 'fields' is global to the message, so it can be used to pass values computed in one [[[...]]] to a successor. It is also true that the address to which the message is GOING TO BE SENT is one of the fields, but by now, it has been saved elsewhere and the copy will be used. We do not support computed addresses. You also have no access to the CC:-list data, so you can not compute a CC:-list either. We run under 'strict', so your variables need to be my'ed. To be doubly safe on variable conflicts, no variable used in 'pmail' starts with an 'x' or an 'X'.

You do have access to some global variables in addition to $fields. It is not intended that you alter the values of any of these.

You do have access to all of the functions in 'pmail', including several that are designed to make some standard tasks easier and/or possible.

Recall that the !-!-substitutions are done before the Perl code is evaluated, so a bit of code like '$xx=![0]!;' has much the same result as '$xx=$fields[0];' except that by the time the Perl code is executed, ![0]! has been replaced by a constant whereas $fields[0] can be changed by the executing code, including some bit that executed from an earlier [[[...]]] in the message.

If you absolutely need [[['s or ![1]!'s in your message, it can be done: for example, just write [[[return("\[\[\[");]]]. Because the search for the next [[[ begins after $start.$middle, infinite loops are avoided. Likewise [[[return("!\[1\]!");]]] is not a substitution when the !-!-substitutions are done.


Format of the record data.
The record-data maintained by 'pmail' for a fixed message-file is a hash of records, %emails, with the records being themselves either arrays or hashes. The hash-key for a record is the email address to which the message for this record is going to be sent.

An individual record can be either an array or a hash. If it is an array, then individual fields in the record are referred to by numbers. If it is a hash, then the individual fields in the record are referred to by name.
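
To make the two layouts concrete, here is a small sketch of what the data might look like in each case. The addresses and field names are invented for the example; only the idea of a hash keyed by email address, with array or hash records, comes from the text.

```perl
use strict;
use warnings;

# Collected by array: fields are referred to by number, e.g. ![1]!
my %emails_by_array = (
    'joe@nd.edu' => [ 'joe@nd.edu', 'Joe', 'Smith' ],
);

# Collected by hash: fields are referred to by name, e.g. !{first}!
my %emails_by_hash = (
    'joe@nd.edu' => { email => 'joe@nd.edu', first => 'Joe', last => 'Smith' },
);

print $emails_by_array{'joe@nd.edu'}[1], "\n";        # Joe
print $emails_by_hash{'joe@nd.edu'}{first}, "\n";     # Joe
```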

These examples should help.


Format of a data-document file.

The #!address line.
A data-document file must begin with a '#!address' on the first line - otherwise an error will occur. The #!address-line can have several switches.

The switches.

The -t switch is the most interesting. There are two ways to collect the data-information. The data-information is a hash of records. The hash-key is the email address to which the message is to be sent. Each record is kept either as an array or as a hash. If each record is being kept as an array, then we say that the data-information is being collected as an array; if each record is being kept as a hash, then we say that the data-information is being collected as a hash.

If the data-information is being collected as an array, then each line after the #!address line in the file is a record. If the data is being collected as a hash, then the second line is an array of field names, separated by the delineator. Each line after this is a record. In either case, records are separated by the delineator in the file. The data is then assembled as follows. The hash-key for a record is the email address and the record is stored as an array if we are collecting by array or as a hash if we are collecting by hash. The hash-keys for a record are the entries in the columns on line 2 and the item associated to a particular hash-key comes from the corresponding entry in the record.

If the -a switch is present, the entry is the number of the column for the email addresses, in C/Perl style with the first being 0. Use the number even if you want to collect by hash. Since you are looking at the file in order to enter this item, finding the number should be no problem. For files with many columns, you may enter a negative number: -1 means the last column, -2 the next to the last, etc.

The -d switch is straightforward. It can be any string and then that string is assumed to be the field delineator for the records. TAB and SPACE are two popular choices.
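
Putting the collection scheme, the address column and the delineator together, reading a data-document file might be sketched as below. The sub collect_records and its named arguments are hypothetical; the sketch starts from the lines after the #!address line and does not parse the switches themselves.

```perl
use strict;
use warnings;

# Hypothetical helper. Named arguments (all names are assumptions):
#   lines    - array ref of the lines after the #!address line
#   delim    - the field delineator string
#   addr_col - column number of the email address (may be negative)
#   by_hash  - true to collect by hash, false to collect by array
sub collect_records {
    my (%arg) = @_;
    my @lines = @{ $arg{lines} };
    my $split = qr/\Q$arg{delim}\E/;
    # When collecting by hash, the first remaining line holds the field names.
    my @names = $arg{by_hash} ? split($split, shift @lines) : ();
    my %emails;
    for my $line (@lines) {
        my @cols = split $split, $line, -1;
        next unless @cols;
        my $address = $cols[ $arg{addr_col} ];
        if ($arg{by_hash}) {
            my %record;
            @record{@names} = @cols;
            $emails{$address} = \%record;
        } else {
            $emails{$address} = [@cols];
        }
    }
    return %emails;
}

my %emails = collect_records(
    lines    => [ "email\tfirst", "joe\@nd.edu\tJoe" ],
    delim    => "\t",
    addr_col => 0,
    by_hash  => 1,
);
print $emails{'joe@nd.edu'}{first}, "\n";    # Joe
```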

The -x switch is designed to help keep address files up-to-date. If you have an address file whose entries are date sensitive, say a class list or a committee list, you can add a -x'mm/dd/yyyy' to the #!address-line and after mm/dd/yyyy the file cannot be used. Of course the data is still there and the file can be rendered usable by changing the date to the future, but it prevents you from mailing to an old class or committee because you forgot to update the file.
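
An expiry check of this kind can be sketched as below. The sub name is_expired is hypothetical, and the sketch assumes the file is still usable on the given date itself and only expires on the following day.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the date ($year, $month, $day) falls
# after the mm/dd/yyyy date given with the -x switch.
sub is_expired {
    my ($expiry, $year, $month, $day) = @_;
    my ($em, $ed, $ey) = $expiry =~ m{^(\d{1,2})/(\d{1,2})/(\d{4})$}
        or die "expected mm/dd/yyyy, got '$expiry'\n";
    # yyyymmdd strings of equal width compare correctly as strings.
    return sprintf('%04d%02d%02d', $year, $month, $day)
        gt sprintf('%04d%02d%02d', $ey, $em, $ed);
}

my @now = localtime;
print "this address file has expired\n"
    if is_expired('05/15/2003', $now[5] + 1900, $now[4] + 1, $now[3]);
```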

Data collection problems.
There is a potential problem if data is being collected from more than one file: the various parameters might not match up correctly. So once we have begun to collect data, a sanity check is performed on subsequent files. First, the collection scheme, array or hash, must be the same for all the files or else we just collect the email addresses. If we are collecting by array at the time of this error, then all the data is collected by array in column 0. Otherwise we collect by hash, but with only the email field, which has the name it had in the initial file.

If we are collecting by array, all the files must have the same number of columns and the email addresses must be in the same column for all the files, or again we just collect the email addresses in column 0.

Collecting by hash is a bit more robust. If ever the hash-key for the email addresses differs from its value in the first file, then again we just collect email addresses. As long as this key remains constant, we collect all the remaining fields that are common to all the files.

The switch defaults.
The default behavior when one or more of the various switches is not present is as follows.

Emails on the To:-line.
This hash of records setup is in place even if there are no data-information files and all the information was read off the To:-line as a list of email addresses. In this case, the data is collected as a 1-element array and the entry in this 1-element array is, redundantly, the email address.

Indeed, the email address is always present redundantly and you are free to modify the copy in the record as you wish. But the address to which the message is to be sent has been saved elsewhere so any modification on the record copy of the email address has no effect on the destination of the message. We do not support calculated email addresses.

Handling multiple copies of the same address.
Because the hash-key for the data is an email address, we will never end up sending the same message twice no matter how many times the address appears in the file(s). If there is additional data associated with the address, then the copy of the data that you have is the one in the last file that you read containing that address. More usefully, you can safely say "To: big_list1 big_list2" and know that only one copy of the message will be sent to each individual address. For those rare occasions when several people are sharing an email account and you want to send a message to each of them individually, we support the following mechanism. In the email field, append a string to each copy of an address. The simplest way to distinguish two copies of xx@yy is to append the strings ‹1› and ‹2› but any strings starting with ‹, ending with › and otherwise containing only letters and numbers which make the various addresses distinct will do. 'pmail' will use xx@yy‹string› for the hash-key, but before emailing, the '‹string›' will be stripped off.

In a 'CC:'-list this mechanism can lead to weird results if you CC to the same address twice. How would the two CC'ed recipients know the difference? The CC list included with the message contains the <...>'s so the two recipients might be able to figure out what is going on, but if you've done a BCC all that will happen is that two copies of the same message will appear at the address.

Since the <...> shows up in 'CC:'-lists, you should be careful what you actually put here.


Appending fields to the data-records.
This section is only relevant when the data on the 'To:'-line is a list of file names and we are collecting data by hash. Then the record-data is being collected as described above. Occasionally, one wishes to add data from another file to each record and we support this as follows. On the 'To:'-line, one can write '[file name {xxx,yyy,...}=>{XXX,YYY,...}]' (written without the two 's). 'file name' is the name of a file.

Initial and final spaces in the file name text are deleted: we support neither initial nor terminal spaces in file names.
The file is assumed to be data, one record to a line. An #!address line is not required and if one is present it is largely ignored. It is a requirement that the first, un-ignored line be a list of field names. The remaining lines of the file are the fields we wish to append, identified by the names on the first line. The records from the file are hashed into an array with the hash-key computed from the second {}-pair as follows. First, there must be one or more strings between the {}'s separated by commas. The hash-key is computed by first extracting the entry for field 'XXX', then the one for 'YYY' if it is present, and then the remaining entries: then the strings are joined, in order, separated by a #, to produce the hash-key. We call this hashed array the "data-to-be-appended". The delineator for the file will be set from the #!address line if it is present and by the usual TAB-SPACE-COLON scheme if it is not.

The current data-records are then modified as follows. We loop through the data-records and for each data-record, we compute the string obtained from the entry in the field 'xxx', followed by the entry, if present, in the field 'yyy', followed by any other entries from the first {}-pair. These strings are joined by a # into a single string and then the data with that hash-key from the "data-to-be-appended" array is added to the current data-record.

As an example [scores{last,first}=>{last,first}] assumes we have data records with at least two fields named last and first. It also assumes that in the file named 'scores', there are also two fields named last and first. Each data-record is examined and the fields in the 'scores' for the person with the same first and last names are appended to the data-records. This example illustrates a point. There may be fields with the same names in both the data-records and the "data-to-be-appended". Since we are collecting by hash, for any field name common to both the data-records and the "data-to-be-appended", the value in the "data-to-be-appended" will replace the value currently in the data-records. New field names in the "data-to-be-appended" will actually just be appended. Here is an example of this feature in action.
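
The matching and merging described in this section can be sketched as follows, assuming the extra file has already been read into a list of hash records. The sub append_fields, its arguments and the sample data are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: merge rows from the extra file into matching records.
#   $records  - hash ref: email address => data-record (hash ref)
#   $rows     - array ref of hash refs read from the extra file
#   $rec_keys - field names from the first {}-pair
#   $row_keys - field names from the second {}-pair
sub append_fields {
    my ($records, $rows, $rec_keys, $row_keys) = @_;
    # Build the "data-to-be-appended", keyed by the key fields joined with #.
    my %to_append;
    $to_append{ join '#', @{$_}{@$row_keys} } = $_ for @$rows;
    for my $record (values %$records) {
        my $extra = $to_append{ join '#', @{$record}{@$rec_keys} } or next;
        # Fields with the same name are replaced; new ones are added.
        $record->{$_} = $extra->{$_} for keys %$extra;
    }
    return $records;
}

my %records = (
    'joe@nd.edu' => { email => 'joe@nd.edu', last => 'Smith', first => 'Joe' },
);
my @scores = ( { last => 'Smith', first => 'Joe', score => 93 } );
append_fields(\%records, \@scores, [qw(last first)], [qw(last first)]);
print $records{'joe@nd.edu'}{score}, "\n";    # 93
```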


Filtering and address lines.
When collecting email address data from files (or CC/BCC data from files) we can filter the data as we go. The mechanism is the following. On the To:-line, separated from other data by spaces, add a Perl ~m test and this test will be applied to each line of the files being read in and these lines will be added to our data only if they pass the test. The To:-line (and CC/BCC lines if any) are read in order and you may change the ~m test before a file is read. However, once set, a test remains in force until it is overridden by a new ~m test. The test ~m// will result in no test being done.

This is not intended for complicated filtering, but if you have a file of faculty members containing email address and first and last names, then
To: ~m/Joe/ faculty
will send your message to Joe (more precisely, to all people on the faculty whose email address, last name or first name contains the string Joe).
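
A sketch of how such a line filter could be applied, assuming the text between the slashes has already been extracted. The sub name filter_lines and the sample lines are hypothetical. The empty pattern is handled explicitly because in Perl a match against an empty pattern reuses the last successful pattern rather than matching everything.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the lines that match the text between
# the slashes of the ~m test; an empty pattern means no test at all.
sub filter_lines {
    my ($pattern, @lines) = @_;
    return @lines if $pattern eq '';
    my $re = qr/$pattern/;    # no modifiers, so the test is case sensitive
    return grep { $_ =~ $re } @lines;
}

my @faculty = ( "joe\@nd.edu\tSmith\tJoe", "ann\@nd.edu\tLee\tAnn" );
print "$_\n" for filter_lines('Joe', @faculty);
```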

We only do the test between the slashes, so we can not do a case insensitive test. More involved tests can be done using Perl code in the body of the message. Continuing our example, if we put
[[[if($fields{first} ne "Joe") {Do_not_send();}]]]
Then we will not send the message to anyone whose first name is not Joe. The ~m on the To:-line is included because it is shorter than adding this code and it runs more quickly. The ~m test filters out email addresses before we start, whereas the [[[...]]] test requires that the message be tried for every address on the list.

As remarked above, ~m tests are not permitted if you have entered a list of email addresses on the line rather than reading them from a file.


A List of error messages and warnings.
Here is a (hopefully) complete list of possible error messages and warnings with some discussion as to what might have caused one to appear.