
E-mail with Attachments

Dan Sugalski

Packages Used:

MIME::Lite

Recommended:  libnet

You know the drill. Someone in accounting has asked that daily reports get e-mailed to him, or your manager has decided that your sales e-mail autoresponder needs to be sending off Microsoft Word documents, or marketing has this great idea for an e-mail newsletter just like the one that Amazon sends out, HTML and all. To do that, of course, you need to send out MIME-encoded mail. But MIME is a dark and mysterious thing, almost impossible to do properly, right?

Well, no. It's pretty easy to MIME-encode mail, far simpler than you'd think given how many programmers get it wrong. What we're going to do in this article is cover the basics of MIME and show you how to build and send your own MIME mail.

One thing this article isn't going to do is show you how to build a MIME mail body by hand. While building MIME mail isn't enormously tough, there are a couple of modules available on CPAN that'll do it for you. Since there's no pressing reason to rewrite one, we won't. Instead, we'll use MIME::Lite, available from CPAN. If you need heavier-duty tools, or need to decode MIME messages, check out MIME::Tools instead.

What is MIME, and why do we care?

MIME is short for Multipurpose Internet Mail Extensions. It's an Internet standard, currently documented in RFCs 2045 through 2049. If you want to read them yourself, and you probably should if you're going to do a lot of work with MIME, the RFCs are available on the web at http://www.rfc-editor.org, or via FTP at ftp://ftp.isi.edu/in-notes.

Put simply, MIME is a way of imposing structure and encoding data in the body of a mail message. When mail was standardized way back in 1982 with RFC822, most of the attention was focused on the mail headers. The body of a mail message was very intentionally left alone. The only limits placed on the body were the character set (7-bit ASCII) and the maximum line length (1000 characters).

The MIME standard covers both the encoding and the structure of a mail message. There are two times when you'd want to use MIME mail -- when your message isn't entirely plain 7-bit US-ASCII or when you need to attach files to a mail message. We'll show you how to do both.

How does MIME encode data?

Table 1: Sending with Mail::Mailer

Since normal mail is restricted to the 7-bit US-ASCII character set, there's obviously a lot of data that can't go into a mail message. To get around this problem, MIME encodes the data to make it safe to use in mail. There are three different MIME encodings: none, quoted-printable, and base64. These are called content transfer encodings.

The "no encoding" type is the simplest encoding method--it's not really an encoding method at all. It's used for multi-part messages that have a text component, or for folks willing to take the chance that their mail server handles 8-bit data. It's also used by webservers when they transfer data to clients. (HTTP transfers use MIME encoded data.) There are three different keywords: 7bit, 8bit, and binary, that tell the recipient program how to interpret the message at its lowest level: byte-by-byte.
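As a rough illustration of when the 7bit label is appropriate, here is a minimal sketch of a check you might run before choosing it. The subroutine name is_7bit_safe and the 998-character default (a line limit that leaves room for the line terminator) are assumptions made for this example, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $data looks safe to label as 7bit,
# meaning no NUL bytes, no bytes above 127, and no overlong lines.
sub is_7bit_safe {
    my ($data, $max_line) = @_;
    $max_line = 998 unless defined $max_line;    # assumed limit, excluding the line ending

    # Anything outside \x01-\x7F rules out 7bit.
    return 0 if $data =~ /[^\x01-\x7F]/;

    # Every line must fit within the limit.
    for my $line (split /\r?\n/, $data) {
        return 0 if length($line) > $max_line;
    }
    return 1;
}

for my $sample ("Just some plain text\n", "caf\xE9 au lait\n") {
    print is_7bit_safe($sample) ? "7bit is fine\n" : "needs 8bit or an encoding\n";
}
```

Data that fails a check like this is a candidate for quoted-printable or base64, described next.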

Quoted-printable encoding hex-escapes illegal bytes. Any byte outside the printable 7-bit US-ASCII set is replaced by an equals sign and a two-digit hex value. Space, for example, can be encoded as =20, and an equals sign as =3D. Data that's been encoded as quoted-printable must have lines no longer than 76 characters.
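To see the hex-escaping in action, here is a small sketch using the MIME::QuotedPrint module that ships with recent Perl distributions. The sample string is made up for illustration.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(encode_qp decode_qp);

# An equals sign, a Latin-1 byte (e-acute, 0xE9), and ordinary text.
my $text = "1+1=2 at the caf\xE9\n";

# The equals sign and the high byte come back hex-escaped;
# the plain ASCII characters pass through untouched.
my $encoded = encode_qp($text);
print $encoded;

# Decoding restores the original bytes.
my $decoded = decode_qp($encoded);
print "round trip ok\n" if $decoded eq $text;
```

Because most of the text survives as-is, a quoted-printable body is still largely readable in a mail client that knows nothing about MIME.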

Finally, base64 encoding takes arbitrary data and encodes it using a set of 65 characters (64 for the data itself, plus the equals sign for padding). The encoded data ends up being one-third larger and, unlike quoted-printable data, is unreadable by humans. The encoded data is guaranteed to go through gateways unmangled, and will even travel unscathed through EBCDIC systems.
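The one-third growth is easy to check with the MIME::Base64 module. This is only a sketch; the 300 bytes of sample data are arbitrary.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# 300 arbitrary bytes, including values well outside 7-bit ASCII.
my $raw = join '', map { chr($_ % 256) } 0 .. 299;

# Default form: output is broken into lines of at most 76 characters.
my $encoded = encode_base64($raw);

# Passing an empty line ending gives one unbroken string, which makes
# the size comparison simple: every 3 bytes become 4 characters.
my $one_line = encode_base64($raw, '');
printf "%d bytes in, %d characters out\n", length($raw), length($one_line);

# Decoding restores the original bytes exactly.
my $decoded = decode_base64($encoded);
```

Every three input bytes become four output characters, which is where the one-third overhead comes from; the line breaks in the default form add a little more.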

In addition to encoding data, MIME tags it with a media type that indicates exactly what has been encoded. You've probably seen these in mail, or as messages from your web browser--things like application/x-zip or image/jpeg. It's the first part of the Content-type header.

MIME marks data with both a media type and subtype. There's a full list of types in Table 2.

Each of those types can have a list of different subtypes. The full list can be found at ftp://ftp.isi.edu/in-notes-iana/assignments/media-types/media-types.

Table 3 lists the most common type/subtype combinations.

It's important that you use the right MIME media type when you're building a MIME message. A good, standards-conforming MIME decoder will use this information to determine how it should handle each chunk of a MIME message. Unfortunately, there are a depressingly large number of non-compliant MIME decoders, and on most systems the media type information is lost as soon as the file is saved, so it's a good idea to also tag individual pieces with filenames that have standard extensions.

Content-type: text/plain
Content-transfer-encoding: 7bit

This is just a plain old piece of text

Figure 1 shows a symbolic representation of a one-part MIME message.

A subtype of octet-stream indicates that this part of the MIME message is filled with data of some sort, and that any MIME application that deals with it should treat it like a raw stream of bytes with no intrinsic meaning. A good MIME application is also supposed to treat any subtype that it doesn't understand as if it were octet-stream.

If you find that you need a media subtype not on the standard list, it's perfectly acceptable to roll your own. All subtypes beginning with x- are reserved for general use, much like the X- mail headers. So if, for example, you were MIME-encoding a tar file, it'd be reasonable to tag it as application/x-tar.

Content-Type: multipart/mixed
Content-type: text/plain

Just some plain text
There's an image attached to this message

Content-type: image/gif
Content-transfer-encoding: base64

asd44DF*CKKTewmntn8845HHURKKMMGHHEWRNG
FOWLLJJTJ436llslkk62kkj62kkj6sfdg99g99wgg

Figure 2 shows a sample multipart/mixed MIME message.

Multiple pieces of MIME

Encoding and media typing are useful, and are among the underpinnings of the web, but one of the main uses for MIME in mail is to send attachments. Two multipart media types matter here: multipart/mixed and multipart/alternative. Both indicate that the MIME message has multiple parts.

With multipart/mixed, the parts are all considered separate--this is the media type used for mail messages with attachments. A section in a multipart/mixed message can itself be multipart/mixed, which means you can nest them. Many mail clients don't handle this all that well, so it's probably best not to do this unless you know the receiver can handle it.

Content-type: multipart/alternative

Content-type: text/plain

My home page
This is what my home page looks like!

Content-type: text/html

<HEAD>My home page</HEAD>
<BODY>
This is what <B>my</B> home page looks like!
</BODY>

Figure 3 shows what you might get if you pasted a very simple HTML document into a MIME-aware mail client.

If you do nest them, each multipart/mixed section (not the pieces inside, mind you, but the thing as a whole) can't be encoded with quoted-printable or base64. That'd require the MIME client to decode the multipart/mixed section and then handle each individual piece inside, which might themselves be multipart/mixed. The folks that wrote the standards decided that this was an awful lot to expect of a client, so it's explicitly forbidden.

multipart/alternative lets you attach multiple versions of the same thing. You might, for example, have one part of a multipart/alternative be text/plain and another be audio/basic with a spoken version of the message if you were sending it to someone who had impaired vision.

So, how do I create a MIME message?

Creating a MIME message is pretty simple with MIME::Lite. In fact, it can be as simple as three Perl statements:

# Build the message
$message = MIME::Lite->new(
        From     => 'me@here.org',
        To       => 'you@there.com',
        Subject  => "A test",
        Type     => "text/plain",
        Encoding => '7bit',
        Data     => "Just a test message");
# Tell MIME::Lite to use Net::SMTP instead of sendmail
MIME::Lite->send('smtp', 'localhost', Timeout => 20);
# Send the message
$message->send;

That code builds and sends a mail message via Net::SMTP. If you prefer to use sendmail, you can omit the second statement (the MIME::Lite->send call). It's okay to have multiple folks in the To list--just pass them in an anonymous array. MIME::Lite::new takes the standard RFC822 headers, so if you want to specify a Cc or Bcc list, or set the Sender field, you can.

In that last example, the message we were sending was specified explicitly with Data. If you'd like to get fancier, you can try something like this:

# Build the message
$message = MIME::Lite->new(
         From     => 'me@here.org',
         To       => 'you@there.com',
         Subject  => "The Net::SMTP docs",
         Type     => "text/plain",
         Encoding => '7bit',
         Filename => 'Net_SMTP.txt',
         Path     => 'perldoc Net::SMTP |');
# Tell MIME::Lite to use Net::SMTP instead of sendmail
MIME::Lite->send('smtp', 'localhost', Timeout => 20);
# Send the message
$message->send;

In this example, we've substituted Path for Data. Path specifies the full path to be passed to open(). It can be a plain file but, as the example shows, doesn't have to be. In this case, we're using some of open()'s magic to spawn off perldoc to get the plain text version of the Net::SMTP docs. The Filename option above provides the MIME section that's created with a filename of Net_SMTP.txt. While this doesn't affect where the data comes from, MIME clients will often use it as a default filename for extracting the data.

There's no real reason that the message you build needs to be text, of course. You could, for example, use a JPEG image or PDF file instead. Normally, though, you'll start with plain text and, if you need to send non-text stuff, attach other things to the end.

Building a mail message with attachments is nearly as simple as building a single-part mail message. For example, take this chunk of code from a mythical tech support autoresponder:

# Build the message
$message = MIME::Lite->new(
           From     => 'me@here.org',
           To       => 'you@there.com',
           Subject  => "Technote PDF Autoresponder",
           Type     => "text/plain",
           Encoding => '7bit',
           Data     => "This is the technote you requested");
# Add the attachment
$message->attach(
           Type     => "application/pdf",
           Encoding => "base64",
           Path     => $technote_requested,
           Filename => "Technote.pdf");
# Tell MIME::Lite to use Net::SMTP instead of sendmail
MIME::Lite->send('smtp', 'localhost', Timeout => 20);
# Send the message
$message->send;

Pretty simple. You can attach as many files as you like this way just by adding extra calls to attach().
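The same attach() call can also build the multipart/alternative structure described earlier. What follows is a minimal sketch, assuming that a top-level Type of multipart/alternative with no Data is acceptable to your version of MIME::Lite. The addresses and text are placeholders, and the message is printed rather than sent.

```perl
use strict;
use warnings;
use MIME::Lite;

# The container: no body of its own, just the multipart type.
my $message = MIME::Lite->new(
    From    => 'me@here.org',
    To      => 'you@there.com',
    Subject => 'My home page',
    Type    => 'multipart/alternative',
);

# Plain text version first, for clients that can't show HTML.
$message->attach(
    Type => 'text/plain',
    Data => "My home page\nThis is what my home page looks like!\n",
);

# HTML version of the same content.
$message->attach(
    Type => 'text/html',
    Data => "<HTML><HEAD><TITLE>My home page</TITLE></HEAD>\n"
          . "<BODY>This is what <B>my</B> home page looks like!</BODY></HTML>\n",
);

# Look at the result instead of mailing it.
my $text = $message->as_string;
print $text;
```

Printing the message is a handy way to see the boundary lines that separate the parts, which the symbolic figures leave out.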

Going an alternate route

MIME::Lite has support built in for sending out its MIME messages either via sendmail or through Net::SMTP. That's not always what you want to do, though. Instead, you might want to save the resulting message for later use, or send it with some alternate mailer that has less overhead -- such as the simple SMTP subroutine that appears elsewhere in this issue. The easiest way to extract the message is with the as_string() method, like so:

# Build the message
$message = MIME::Lite->new(
          From     => 'me@here.org',
          To       => 'you@there.com',
          Subject  => "A test",
          Type     => "text/plain",
          Encoding => '7bit',
          Data     => "Just a test message");
$message_text = $message->as_string;

This sticks the entire message, headers and all, into $message_text. From there you can do what you like with it. If instead you'd rather have just the headers, or just the body, you can use the header_as_string() or body_as_string() methods, respectively.
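As a sketch of the "save it for later" case, the following pulls the headers and body apart and writes the whole message to a file. The file name saved_message.txt and its location in the system temporary directory are assumptions chosen for the example.

```perl
use strict;
use warnings;
use File::Spec;
use MIME::Lite;

my $message = MIME::Lite->new(
    From     => 'me@here.org',
    To       => 'you@there.com',
    Subject  => 'A test',
    Type     => 'text/plain',
    Encoding => '7bit',
    Data     => "Just a test message\n",
);

# The two halves of the message, separately.
my $headers = $message->header_as_string;
my $body    = $message->body_as_string;

# Hypothetical spool file; pick whatever location suits your setup.
my $spool_file = File::Spec->catfile(File::Spec->tmpdir, 'saved_message.txt');

# Write the complete message, headers and all, for later delivery.
open my $fh, '>', $spool_file or die "Can't write $spool_file: $!";
print {$fh} $message->as_string;
close $fh or die "Can't close $spool_file: $!";
```

A separate mailer can later read that file and hand it to whatever delivery routine you prefer.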

A full-blown example

Finally, let's take a look at a full program that sends MIME mail. The program in Listing 1 is mail_attach.pl, a variation of which I've got in production on my VMS cluster. We use it for email delivery of some daily reports, but it can be easily modified to do most anything else you like.

At the top of the program there's a hash with a set of common extensions and their associated media types. This list was lifted out of the config files for the webserver I run on my cluster and works well for me, but you might want to tweak it for the sorts of files you're sending. Having the list embedded in the program saves us from having to educate every analyst on which MIME types go with which files.

Next it takes the parameters off the command line. It could (and probably should) use switches rather than positional parameters, but this way works out okay for us, and the folks I work with consider positional parameters the norm.

We then read in the file with the body text of the message we're sending and create the MIME::Lite object. This text doesn't need to be anything fancy, but it's a good idea to give the recipient at least some idea of what the attachments are. This is especially useful for things that get mailed out less than once a week.

Then the program runs through all the files that need to be attached. A regex extracts the extension and we run it through the extension hash to see if we've got an entry. If so, we use the media type and encoding that goes with it. Otherwise, we default to text/plain and 8bit. A different default, such as application/octet-stream and base64, might be desirable depending on what you're mailing out.
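The lookup step can be sketched roughly as follows. This is not the code from the listing; the table entries and the subroutine name type_for_file are made up for illustration, with only the text/plain and 8bit default taken from the description above.

```perl
use strict;
use warnings;

# Hypothetical extension table: media type and transfer encoding.
my %types = (
    txt  => ['text/plain',      '8bit'],
    html => ['text/html',       '8bit'],
    pdf  => ['application/pdf', 'base64'],
    gif  => ['image/gif',       'base64'],
    jpg  => ['image/jpeg',      'base64'],
);

# Return (media type, encoding) for a filename, falling back to
# text/plain and 8bit when the extension is missing or unknown.
sub type_for_file {
    my ($filename) = @_;
    my ($ext) = $filename =~ /\.([^.\/\\]+)$/;
    my $entry = defined $ext ? $types{ lc $ext } : undef;
    return $entry ? @$entry : ('text/plain', '8bit');
}

for my $file (qw(report.PDF notes.txt README)) {
    my ($type, $encoding) = type_for_file($file);
    print "$file: $type ($encoding)\n";
}
```

The pair returned for each file would then feed the Type and Encoding arguments of attach(). Swapping the fallback to application/octet-stream and base64 is a one-line change if you mostly mail binary files.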

Finally we send the message. This version uses Net::SMTP to do the sending. You might prefer to use sendmail if you're on a Unix box, or another mail routine entirely, depending on your setup. My production copy uses a variation on the standalone mail subroutine (appearing elsewhere in this issue) to avoid loading in Net::SMTP and the entire IO family of modules. While they're quite nice, the overhead really isn't justified for a simple mailing program like this.

And in conclusion...

Well, that's it. Sending MIME mail is pretty straightforward, but if you peek under the hood of Lite.pm you'll see the details are tricky enough to justify not rolling your own.

__END__


Dan Sugalski is the VMS Systems Administrator for the Oregon University System's ITS department. He's been involved in the VMS Perl port for a few years, likes threading, mail, doing obscure things in XS, and tilting at windmills. To reach him you can either leave your message and a plate of cookies under the bushes by your front door or send mail to dan@sidhe.org.

