enmime

v0.0.0-...-9d33ba6
Published: Jan 8, 2018 License: MIT Imports: 24 Imported by: 0

README

enmime

enmime is a MIME parsing library for Go. It's built on top of Go's included mime/multipart support, but is geared towards parsing MIME encoded emails.

It is being developed in tandem with the Inbucket email service.

API documentation can be found here: http://godoc.org/github.com/jhillyerd/enmime

A brief guide to migrating from the old go.enmime API is available here: https://github.com/jhillyerd/enmime/wiki/Enmime-Migration-Guide

API Change Warning

Part readers: Part.Read() and Part.Utf8Reader are now deprecated. Please use Part.Content instead. The deprecated readers will be removed in April 2018.

Development Status

enmime is approaching beta quality: it works but has not been tested with a wide variety of source data. It's possible the API will evolve slightly before an official release.

Please see CONTRIBUTING.md if you'd like to contribute code to the project.

About

enmime is written in Google Go.

enmime is open source software released under the MIT License. The latest version can be found at https://github.com/jhillyerd/enmime

Documentation

Overview

Package enmime implements a MIME parsing library for Go. It's built on top of Go's included mime/multipart support, but is geared towards parsing MIME encoded emails.
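To make the relationship to the standard library concrete, here is a minimal sketch that walks one level of a multipart message using only net/mail, mime and mime/multipart. It does not use enmime, and it is not how enmime is implemented; the partSummary type, the listParts function and the sampleMessage variable are hypothetical names invented for this illustration.

```go
package main

import (
	"fmt"
	"io"
	"mime"
	"mime/multipart"
	"net/mail"
	"strings"
)

// partSummary is a hypothetical helper type for this sketch; it is not part of enmime.
type partSummary struct {
	ContentType string
	Body        string
}

// sampleMessage is a small two-part message with CRLF line endings.
var sampleMessage = strings.Join([]string{
	"From: user@inbucket.org",
	"Subject: Example message",
	"Content-Type: multipart/alternative; boundary=Enmime-100",
	"",
	"--Enmime-100",
	"Content-Type: text/plain",
	"",
	"hello!",
	"--Enmime-100",
	"Content-Type: text/html",
	"",
	"<b>hello!</b>",
	"--Enmime-100--",
	"",
}, "\r\n")

// listParts reads the message header, extracts the boundary parameter and
// collects the immediate child parts. Nested multiparts are not descended into.
func listParts(raw string) ([]partSummary, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return nil, err
	}
	mediaType, params, err := mime.ParseMediaType(msg.Header.Get("Content-Type"))
	if err != nil {
		return nil, err
	}
	if !strings.HasPrefix(mediaType, "multipart/") {
		return nil, fmt.Errorf("not a multipart message: %s", mediaType)
	}
	var parts []partSummary
	mr := multipart.NewReader(msg.Body, params["boundary"])
	for {
		p, err := mr.NextPart()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		body, err := io.ReadAll(p)
		if err != nil {
			return nil, err
		}
		parts = append(parts, partSummary{
			ContentType: p.Header.Get("Content-Type"),
			Body:        strings.TrimSpace(string(body)),
		})
	}
	return parts, nil
}

func main() {
	parts, err := listParts(sampleMessage)
	if err != nil {
		fmt.Println(err)
		return
	}
	for i, p := range parts {
		fmt.Printf("part %d: %s %q\n", i+1, p.ContentType, p.Body)
	}
}
```

Everything beyond this single level (nesting, charset conversion, attachment sorting, error collection) is the kind of work an email-oriented layer has to add on top.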

The enmime API has two conceptual layers. The lower layer is a tree of Part structs, representing each component of a decoded MIME message. The upper layer, called an Envelope, provides an intuitive way to interact with a MIME message.

Part Tree

Calling ReadParts causes enmime to parse the body of a MIME message into a tree of Part objects, each of which is aware of its content type, filename and headers. Each Part implements io.Reader, providing access to the content it represents. If the part was encoded in quoted-printable or base64, it is decoded prior to being accessed by the Reader.
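As an illustration of the transfer-decoding step mentioned above, the following stdlib-only sketch decodes a body according to its Content-Transfer-Encoding value. The decodeBody function is a hypothetical helper written for this example; it is an assumption about what such a step can look like, not enmime's actual code.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"mime/quotedprintable"
	"strings"
)

// decodeBody is a hypothetical helper: it undoes quoted-printable or base64
// transfer encoding and returns any other encoding (7bit, 8bit, ...) unchanged.
func decodeBody(encoding, body string) ([]byte, error) {
	switch strings.ToLower(strings.TrimSpace(encoding)) {
	case "quoted-printable":
		return io.ReadAll(quotedprintable.NewReader(strings.NewReader(body)))
	case "base64":
		// DecodeString ignores \r and \n, so wrapped lines are accepted.
		return base64.StdEncoding.DecodeString(body)
	default:
		return []byte(body), nil
	}
}

func main() {
	qp, err := decodeBody("quoted-printable", "caf=C3=A9")
	if err != nil {
		fmt.Println(err)
		return
	}
	b64, err := decodeBody("Base64", "aGVsbG8h\r\n")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%s / %s\n", qp, b64)
}
```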

If you need to locate a particular Part, you can pass a custom PartMatcher function into the BreadthMatchFirst() or DepthMatchFirst() methods to search the Part tree. BreadthMatchAll() and DepthMatchAll() will collect all Parts matching your criteria.
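The difference between the breadth-first and depth-first variants is easiest to see on a small tree. The sketch below uses a hypothetical node type linked by FirstChild and NextSibling pointers, mirroring the shape of the Part struct, plus a matcher function type. It is a stand-in written for this explanation and makes no claim about the traversal order enmime uses internally.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical stand-in for a tree element; it is not enmime's Part.
type node struct {
	Label       string
	ContentType string
	FirstChild  *node
	NextSibling *node
}

// matcher reports whether a node satisfies the caller's criteria.
type matcher func(n *node) bool

// depthMatchAll visits root first, then each child subtree in order.
func depthMatchAll(root *node, m matcher) []*node {
	if root == nil {
		return nil
	}
	var out []*node
	if m(root) {
		out = append(out, root)
	}
	for c := root.FirstChild; c != nil; c = c.NextSibling {
		out = append(out, depthMatchAll(c, m)...)
	}
	return out
}

// breadthMatchAll visits the tree level by level using a FIFO queue.
func breadthMatchAll(root *node, m matcher) []*node {
	var out []*node
	queue := []*node{root}
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		if n == nil {
			continue
		}
		if m(n) {
			out = append(out, n)
		}
		for c := n.FirstChild; c != nil; c = c.NextSibling {
			queue = append(queue, c)
		}
	}
	return out
}

// sampleTree builds: root(mixed) -> [alt(alternative) -> [inner, html], outer].
func sampleTree() *node {
	html := &node{Label: "html", ContentType: "text/html"}
	inner := &node{Label: "inner", ContentType: "text/plain", NextSibling: html}
	outer := &node{Label: "outer", ContentType: "text/plain"}
	alt := &node{Label: "alt", ContentType: "multipart/alternative", FirstChild: inner, NextSibling: outer}
	return &node{Label: "root", ContentType: "multipart/mixed", FirstChild: alt}
}

// labels joins node labels so results are easy to compare.
func labels(ns []*node) string {
	names := make([]string, 0, len(ns))
	for _, n := range ns {
		names = append(names, n.Label)
	}
	return strings.Join(names, ",")
}

func main() {
	isPlain := func(n *node) bool { return n.ContentType == "text/plain" }
	fmt.Println("depth:  ", labels(depthMatchAll(sampleTree(), isPlain)))
	fmt.Println("breadth:", labels(breadthMatchAll(sampleTree(), isPlain)))
}
```

With this tree, the depth-first search reaches the nested text/plain node before the top-level one, while the breadth-first search finds the top-level one first. That ordering is what decides which node a "match first" search returns.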

The Envelope

ReadEnvelope returns an Envelope struct. Behind the scenes, a Part tree is constructed and then sorted into the correct fields of the Envelope.

The Envelope contains both the plain text and HTML portions of the email. If there was no plain text Part available, the HTML Part will be downconverted using the html2text library. The root of the Part tree, as well as slices of the inline and attachment Parts, is also available.

Please note that enmime parses messages into memory, so it is not likely to perform well with multi-gigabyte attachments.

Constants

const (
	// ErrorMalformedBase64 name
	ErrorMalformedBase64 = "Malformed Base64"
	// ErrorMalformedHeader name
	ErrorMalformedHeader = "Malformed Header"
	// ErrorMissingBoundary name
	ErrorMissingBoundary = "Missing Boundary"
	// ErrorMissingContentType name
	ErrorMissingContentType = "Missing Content-Type"
	// ErrorCharsetConversion name
	ErrorCharsetConversion = "Character Set Conversion"
	// ErrorContentEncoding name
	ErrorContentEncoding = "Content Encoding"
	// ErrorPlainTextFromHTML name
	ErrorPlainTextFromHTML = "Plain Text from HTML"
)

Variables

var AddressHeaders = map[string]bool{
	"bcc":             true,
	"cc":              true,
	"delivered-to":    true,
	"from":            true,
	"reply-to":        true,
	"to":              true,
	"sender":          true,
	"resent-bcc":      true,
	"resent-cc":       true,
	"resent-from":     true,
	"resent-reply-to": true,
	"resent-to":       true,
	"resent-sender":   true,
}

AddressHeaders is the set of SMTP headers that contain email addresses, used by Envelope.AddressList(). Key characters must be all lowercase.
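Because the keys are lowercase while header names in real messages are usually mixed case, a lookup needs to normalize the name first. A minimal sketch, using a shortened local copy of the map and a hypothetical isAddressHeader helper (neither is part of enmime):

```go
package main

import (
	"fmt"
	"strings"
)

// addressHeaders is a shortened local copy of the map, for illustration only.
var addressHeaders = map[string]bool{
	"cc":       true,
	"from":     true,
	"reply-to": true,
	"to":       true,
}

// isAddressHeader lowercases the name before the lookup, since all keys are lowercase.
func isAddressHeader(name string) bool {
	return addressHeaders[strings.ToLower(name)]
}

func main() {
	for _, name := range []string{"From", "REPLY-TO", "Subject"} {
		fmt.Printf("%s: %v\n", name, isAddressHeader(name))
	}
}
```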

Functions

This section is empty.

Types

type Envelope

type Envelope struct {
	Text        string   // The plain text portion of the message
	HTML        string   // The HTML portion of the message
	Root        *Part    // The top-level Part
	Attachments []*Part  // All parts having a Content-Disposition of attachment
	Inlines     []*Part  // All parts having a Content-Disposition of inline
	OtherParts  []*Part  // All parts not in Attachments and Inlines
	Errors      []*Error // Errors encountered while parsing
	// contains filtered or unexported fields
}

Envelope is a simplified wrapper for MIME email messages.

Example

ExampleEnvelope demonstrates the relationship between Envelope and Parts.

package main

import (
	"fmt"
	"strings"

	"github.com/jhillyerd/enmime"
)

func main() {
	// Create sample message in memory
	raw := `From: user@inbucket.org
Subject: Example message
Content-Type: multipart/alternative; boundary=Enmime-100

--Enmime-100
Content-Type: text/plain
X-Comment: part1

hello!
--Enmime-100
Content-Type: text/html
X-Comment: part2

<b>hello!</b>
--Enmime-100
Content-Type: text/plain
Content-Disposition: attachment;
filename=hi.txt
X-Comment: part3

hello again!
--Enmime-100--
`

	// Parse message body with enmime.ReadEnvelope
	r := strings.NewReader(raw)
	env, err := enmime.ReadEnvelope(r)
	if err != nil {
		fmt.Print(err)
		return
	}

	// The root Part contains the message header, which is also available via the
	// Envelope.GetHeader() method.
	fmt.Printf("Root Part Subject: %q\n", env.Root.Header.Get("Subject"))
	fmt.Printf("Envelope Subject: %q\n", env.GetHeader("Subject"))
	fmt.Println()

	// The text from part1 is consumed and placed into the Envelope.Text field.
	fmt.Printf("Text Content: %q\n", env.Text)

	// But part1 is also available as a child of the root Part.  Only the headers may be accessed,
	// because the content has been consumed.
	part1 := env.Root.FirstChild
	fmt.Printf("Part 1 X-Comment: %q\n", part1.Header.Get("X-Comment"))
	fmt.Println()

	// The HTML from part2 is consumed and placed into the Envelope.HTML field.
	fmt.Printf("HTML Content: %q\n", env.HTML)

	// And part2 is available as the second child of the root Part. Only the headers may be
	// accessed, because the content has been consumed.
	part2 := env.Root.FirstChild.NextSibling
	fmt.Printf("Part 2 X-Comment: %q\n", part2.Header.Get("X-Comment"))
	fmt.Println()

	// Because part3 has a disposition of attachment, it is added to the Envelope.Attachments
	// slice
	fmt.Printf("Attachment 1 X-Comment: %q\n", env.Attachments[0].Header.Get("X-Comment"))

	// And is still available as the third child of the root Part
	part3 := env.Root.FirstChild.NextSibling.NextSibling
	fmt.Printf("Part 3 X-Comment: %q\n", part3.Header.Get("X-Comment"))

	// The content of Attachments, Inlines and OtherParts are available as a slice of bytes
	fmt.Printf("Part 3 Content: %q\n", part3.Content)

	// part3 contained a malformed header line, enmime has attached an Error to it
	p3error := part3.Errors[0]
	fmt.Println(p3error.String())
	fmt.Println()

	// All Part errors are collected and placed into Envelope.Errors
	fmt.Println("Envelope errors:")
	for _, e := range env.Errors {
		fmt.Println(e.String())
	}

}
Output:

Root Part Subject: "Example message"
Envelope Subject: "Example message"

Text Content: "hello!"
Part 1 X-Comment: "part1"

HTML Content: "<b>hello!</b>"
Part 2 X-Comment: "part2"

Attachment 1 X-Comment: "part3"
Part 3 X-Comment: "part3"
Part 3 Content: "hello again!"
[W] Malformed Header: Continued line "filename=hi.txt" was not indented

Envelope errors:
[W] Malformed Header: Continued line "filename=hi.txt" was not indented

func EnvelopeFromPart

func EnvelopeFromPart(root *Part) (*Envelope, error)

EnvelopeFromPart uses the provided Part tree to build an Envelope, downconverting HTML to plain text if needed, and sorting the attachments, inlines and other parts into their respective slices. Errors are collected from all Parts and placed into the Envelope's Errors slice.

func ReadEnvelope

func ReadEnvelope(r io.Reader) (*Envelope, error)

ReadEnvelope is a wrapper around ReadParts and EnvelopeFromPart. It parses the content of the provided reader into an Envelope, downconverting HTML to plain text if needed, and sorting the attachments, inlines and other parts into their respective slices. Errors are collected from all Parts and placed into the Envelope.Errors slice.

Example
package main

import (
	"fmt"
	"os"

	"github.com/jhillyerd/enmime"
)

func main() {
	// Open a sample message file
	r, err := os.Open("testdata/mail/qp-utf8-header.raw")
	if err != nil {
		fmt.Print(err)
		return
	}

	// Parse message body with enmime
	env, err := enmime.ReadEnvelope(r)
	if err != nil {
		fmt.Print(err)
		return
	}

	// Headers can be retrieved via Envelope.GetHeader(name)
	fmt.Printf("From: %v\n", env.GetHeader("From"))

	// Address-type headers can be parsed into a list of decoded mail.Address structs
	alist, _ := env.AddressList("To")
	for _, addr := range alist {
		fmt.Printf("To: %s <%s>\n", addr.Name, addr.Address)
	}

	// enmime can decode quoted-printable headers
	fmt.Printf("Subject: %v\n", env.GetHeader("Subject"))

	// The plain text body is available as env.Text
	fmt.Printf("Text Body: %v chars\n", len(env.Text))

	// The HTML body is stored in env.HTML
	fmt.Printf("HTML Body: %v chars\n", len(env.HTML))

	// env.Inlines is a slice of inlined attachments
	fmt.Printf("Inlines: %v\n", len(env.Inlines))

	// env.Attachments contains the non-inline attachments
	fmt.Printf("Attachments: %v\n", len(env.Attachments))

}
Output:

From: James Hillyerd <jamehi03@jamehi03lx.noa.com>, André Pirard <PIRARD@vm1.ulg.ac.be>
To: Mirosław Marczak <marczak@inbucket.com>
Subject: MIME UTF8 Test ¢ More Text
Text Body: 1300 chars
HTML Body: 1736 chars
Inlines: 0
Attachments: 0

func (*Envelope) AddressList

func (e *Envelope) AddressList(key string) ([]*mail.Address, error)

AddressList returns a mail.Address slice with RFC 2047 encoded names converted to UTF-8.
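For readers unfamiliar with RFC 2047 names in address headers, this stdlib-only sketch shows the kind of result to expect. It relies on net/mail, whose address parser decodes encoded words in display names; the decodeAddressList wrapper is a hypothetical name for this example and says nothing about how AddressList is implemented.

```go
package main

import (
	"fmt"
	"net/mail"
)

// decodeAddressList is a hypothetical wrapper around the standard library parser,
// which decodes RFC 2047 encoded words found in display names.
func decodeAddressList(value string) ([]*mail.Address, error) {
	return mail.ParseAddressList(value)
}

func main() {
	raw := "=?UTF-8?Q?Miros=C5=82aw_Marczak?= <marczak@inbucket.com>, plain@example.com"
	addrs, err := decodeAddressList(raw)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, a := range addrs {
		fmt.Printf("%q <%s>\n", a.Name, a.Address)
	}
}
```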

func (*Envelope) GetHeader

func (e *Envelope) GetHeader(name string) string

GetHeader processes the specified header for RFC 2047 encoded words and returns the result as a UTF-8 string.
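To show what decoding RFC 2047 encoded words means in practice, here is a small sketch built on the standard library's mime.WordDecoder. The decodeHeader helper, including its choice to fall back to the raw value on failure, is an assumption made for this example rather than a description of GetHeader's behavior.

```go
package main

import (
	"fmt"
	"mime"
)

// decodeHeader is a hypothetical helper: it decodes every RFC 2047 encoded word
// in value and falls back to the raw value if decoding fails.
func decodeHeader(value string) string {
	dec := new(mime.WordDecoder)
	decoded, err := dec.DecodeHeader(value)
	if err != nil {
		return value
	}
	return decoded
}

func main() {
	fmt.Println(decodeHeader("=?UTF-8?Q?MIME_UTF8_Test_=C2=A2_More_Text?="))
	fmt.Println(decodeHeader("Re: =?iso-8859-1?Q?caf=E9?="))
	fmt.Println(decodeHeader("Plain subject"))
}
```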

type Error

type Error struct {
	Name   string // The name or type of error encountered, from Error consts
	Detail string // Additional detail about the cause of the error, if available
	Severe bool   // Indicates that a portion of the message was lost during parsing
}

Error describes an error encountered while parsing.

func (*Error) String

func (e *Error) String() string

String formats the enmime.Error as a string.
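The example output earlier on this page shows warnings rendered as "[W] Name: Detail". The sketch below reproduces that shape with a local parseError type that mirrors the three documented fields. The "[E]" marker for severe errors and the hasSevere helper are assumptions made for illustration; only the "[W]" form appears in the documented output.

```go
package main

import "fmt"

// parseError is a local stand-in mirroring the documented fields of enmime.Error.
type parseError struct {
	Name   string
	Detail string
	Severe bool
}

// String renders "[W] Name: Detail" for warnings. The "E" marker used for
// severe errors is an assumption; the documented output only shows "[W]".
func (e *parseError) String() string {
	sev := "W"
	if e.Severe {
		sev = "E"
	}
	return fmt.Sprintf("[%s] %s: %s", sev, e.Name, e.Detail)
}

// hasSevere is a hypothetical helper reporting whether any error lost message content.
func hasSevere(errs []*parseError) bool {
	for _, e := range errs {
		if e.Severe {
			return true
		}
	}
	return false
}

func main() {
	errs := []*parseError{
		{Name: "Malformed Header", Detail: `Continued line "filename=hi.txt" was not indented`},
	}
	for _, e := range errs {
		fmt.Println(e.String())
	}
	fmt.Println("severe:", hasSevere(errs))
}
```

A caller might use a check like hasSevere to decide whether a parsed message is complete enough to process, since the Severe field indicates that a portion of the message was lost.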

type Part

type Part struct {
	PartID      string               // PartID labels this part's position within the tree
	Header      textproto.MIMEHeader // Header for this Part
	Parent      *Part                // Parent of this part (can be nil)
	FirstChild  *Part                // FirstChild is the top most child of this part
	NextSibling *Part                // NextSibling of this part
	ContentType string               // ContentType header without parameters
	Disposition string               // Content-Disposition header without parameters
	FileName    string               // The file-name from disposition or type header
	Charset     string               // The content charset encoding label
	Errors      []Error              // Errors encountered while parsing this part
	Content     []byte               // Content after decoding, UTF-8 conversion if applicable
	Epilogue    []byte               // Epilogue contains data following the closing boundary marker
	Utf8Reader  io.Reader            // DEPRECATED: The decoded content converted to UTF-8
	// contains filtered or unexported fields
}

Part represents a node in the MIME multipart tree. The Content-Type, Disposition and File Name are parsed out of the header for easier access.

func NewPart

func NewPart(parent *Part, contentType string) *Part

NewPart creates a new Part object. It does not update the parent's FirstChild attribute.

func ReadParts

func ReadParts(r io.Reader) (*Part, error)

ReadParts reads a MIME document from the provided reader and parses it into a tree of Part objects.

func (*Part) BreadthMatchAll

func (p *Part) BreadthMatchAll(matcher PartMatcher) []*Part

BreadthMatchAll performs a breadth-first search of the Part tree and returns all parts that cause the given matcher to return true.

func (*Part) BreadthMatchFirst

func (p *Part) BreadthMatchFirst(matcher PartMatcher) *Part

BreadthMatchFirst performs a breadth-first search of the Part tree and returns the first part that causes the given matcher to return true.

func (*Part) DepthMatchAll

func (p *Part) DepthMatchAll(matcher PartMatcher) []*Part

DepthMatchAll performs a depth-first search of the Part tree and returns all parts that cause the given matcher to return true.

func (*Part) DepthMatchFirst

func (p *Part) DepthMatchFirst(matcher PartMatcher) *Part

DepthMatchFirst performs a depth-first search of the Part tree and returns the first part that causes the given matcher to return true.

func (*Part) Read

func (p *Part) Read(b []byte) (n int, err error)

Read returns the decoded & UTF-8 converted content; implements io.Reader.

type PartMatcher

type PartMatcher func(part *Part) bool

PartMatcher is a function type that you must implement to search for Parts using the BreadthMatch* and DepthMatch* methods. Implementers should inspect the provided Part and return true if it matches their criteria.

Directories

Path Synopsis
cmd
  mime-dump: Package main outputs a markdown formatted document describing the provided email
  mime-extractor: Package main extracts attachments from the provided email
internal
