enmime

package module
v0.0.0-...-88a344d Latest Latest
Published: Nov 6, 2020 License: MIT Imports: 29 Imported by: 1

README

enmime

enmime is a MIME parsing library for Go. It's built on top of Go's included mime/multipart support, but is geared towards parsing MIME-encoded emails.

It is being developed in tandem with the Inbucket email service.

API documentation can be found here: http://godoc.org/github.com/jhillyerd/go.enmime

Development Status

enmime is alpha quality: it works but has not been tested with a wide variety of source data, and it's likely the API will evolve some before an official release.

About

enmime is written in Google Go.

enmime is open source software released under the MIT License. The latest version can be found at https://github.com/jhillyerd/go.enmime

Forked Version

This fork of enmime uses forked copies of the standard library's mime, multipart, quotedprintable and textproto packages so that it can parse some badly formatted emails. This increases the chances of parsing an email successfully.

Documentation

Overview

Package enmime implements a MIME parsing library for Go. It's built on top of Go's included mime/multipart support, but is geared towards parsing MIME-encoded emails.

The basics:

Calling ParseMIMEBody causes enmime to parse the body of the message object into a tree of MIMEPart objects, each of which is aware of its content type, filename and headers. If the part was encoded in quoted-printable or base64, it is decoded before being stored in the MIMEPart object.

ParseMIMEBody returns a MIMEBody struct. The struct contains both the plain text and HTML portions of the email (if available). The root of the tree, as well as slices of the email's inlines and attachments are available in the struct.

If you need to locate a particular MIMEPart, you can pass a custom MIMEPartMatcher function into BreadthMatchFirst() or DepthMatchFirst() to search the MIMEPart tree. BreadthMatchAll() and DepthMatchAll() will collect all matching parts.

Please note that enmime parses messages into memory, so it is not likely to perform well with multi-gigabyte attachments.

Example
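file, err := os.Open("test-data/mail/qp-utf8-header.raw")
if err != nil {
	log.Fatal(err)
}
defer file.Close()

msg, err := mail.ReadMessage(file) // Read email using Go's net/mail
if err != nil {
	log.Fatal(err)
}

mime, err := enmime.ParseMIMEBody(msg) // Parse message body with enmime
if err != nil {
	log.Fatal(err)
}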

// Headers are in the net/mail Message
fmt.Printf("From: %v\n", msg.Header.Get("From"))

// enmime can decode quoted-printable headers
fmt.Printf("Subject: %v\n", mime.GetHeader("Subject"))

// The plain text body is available as mime.Text
fmt.Printf("Text Body: %v chars\n", len(mime.Text))

// The HTML body is stored in mime.HTML
fmt.Printf("HTML Body: %v chars\n", len(mime.HTML))

// mime.Inlines is a slice of inlined attachments
fmt.Printf("Inlines: %v\n", len(mime.Inlines))

// mime.Attachments contains the non-inline attachments
fmt.Printf("Attachments: %v\n", len(mime.Attachments))
Output:

From: James Hillyerd <jamehi03@jamehi03lx.noa.com>
Subject: MIME UTF8 Test ¢ More Text
Text Body: 1300 chars
HTML Body: 1736 chars
Inlines: 0
Attachments: 0

Variables

var AddressHeaders = []string{"From", "To", "Delivered-To", "Cc", "Bcc", "Reply-To"}

AddressHeaders enumerates the mail headers that contain email addresses.

Functions

func ConvertToUTF8String

func ConvertToUTF8String(charset string, textBytes []byte) (string, error)

ConvertToUTF8String uses the provided charset to decode a slice of bytes into a normal UTF-8 string.

func DecodeHeader

func DecodeHeader(input string) string

DecodeHeader decodes a header (per RFC 2047) using Go's mime.WordDecoder.
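As a rough illustration of the RFC 2047 decoding that DecodeHeader relies on, the sketch below calls the standard library's mime.WordDecoder directly. The helper name decodeHeader and the fallback to the raw input on error are assumptions made for this example, not enmime's own code.

```go
package main

import (
	"fmt"
	"mime"
)

// decodeHeader is a hypothetical stand-in, not enmime's implementation.
// It decodes RFC 2047 encoded-words and returns the raw input on error.
func decodeHeader(input string) string {
	dec := new(mime.WordDecoder)
	decoded, err := dec.DecodeHeader(input)
	if err != nil {
		return input
	}
	return decoded
}

func main() {
	// A quoted-printable encoded-word: "_" stands for a space and
	// "=C2=A2" is the UTF-8 encoding of the cent sign.
	fmt.Println(decodeHeader("=?UTF-8?Q?MIME_UTF8_Test_=C2=A2_More_Text?="))
}
```

Text without encoded-words passes through unchanged, so a helper like this can be applied to any header value.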

func DecodeToUTF8Base64Header

func DecodeToUTF8Base64Header(input string) string

DecodeToUTF8Base64Header decodes a MIME header per RFC 2047, re-encoding the result as a UTF-8 Base64 encoded-word (=?utf-8?b?...?=).

func IsAttachment

func IsAttachment(header mail.Header) bool

IsAttachment returns true if the given header defines an attachment. It first checks whether the Content-Disposition header defines an attachment. If it does not, the Content-Type header is checked.

Valid Attachment-Headers:

Content-Disposition: attachment; filename="frog.jpg"
Content-Type: attachment; filename="frog.jpg"
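The check order described above can be sketched with the standard library alone. This is a minimal approximation under stated assumptions, not the library's implementation: looksLikeAttachment and newHeader are hypothetical names, and the sketch only compares the parsed value against "attachment".

```go
package main

import (
	"fmt"
	"mime"
	"net/mail"
)

// newHeader is a hypothetical helper that builds a mail.Header for the demo.
func newHeader(disposition, contentType string) mail.Header {
	return mail.Header{
		"Content-Disposition": {disposition},
		"Content-Type":        {contentType},
	}
}

// looksLikeAttachment is a hypothetical approximation of the documented
// behaviour: check Content-Disposition first, then fall back to Content-Type.
func looksLikeAttachment(header mail.Header) bool {
	for _, name := range []string{"Content-Disposition", "Content-Type"} {
		// ParseMediaType lowercases the value and splits off parameters
		// such as filename; it returns an error for an empty header.
		value, _, err := mime.ParseMediaType(header.Get(name))
		if err == nil && value == "attachment" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(looksLikeAttachment(newHeader(`attachment; filename="frog.jpg"`, "")))
	fmt.Println(looksLikeAttachment(newHeader("", `attachment; filename="frog.jpg"`)))
	fmt.Println(looksLikeAttachment(newHeader("inline", "text/plain")))
}
```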

func IsBinaryBody

func IsBinaryBody(mailMsg *mail.Message) bool

IsBinaryBody returns true if the mail header defines a binary body.

func IsMultipartMessage

func IsMultipartMessage(mailMsg *mail.Message) bool

IsMultipartMessage returns true if the message has a recognized multipart Content-Type header. You don't need to check this before calling ParseMIMEBody; it can handle non-multipart messages.
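For readers who want to see what such a check might look like, here is a small sketch using only the standard library. The helper isMultipart is hypothetical, and it assumes that any multipart/* media type counts as "recognized"; the library may accept a narrower set.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// isMultipart is a hypothetical helper, not enmime's IsMultipartMessage.
// Assumption: every multipart/* media type is treated as multipart.
func isMultipart(contentType string) bool {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false // missing or malformed Content-Type
	}
	return strings.HasPrefix(mediaType, "multipart/")
}

func main() {
	fmt.Println(isMultipart(`multipart/mixed; boundary="frontier"`))
	fmt.Println(isMultipart("text/plain; charset=utf-8"))
}
```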

func IsPlain

func IsPlain(header mail.Header, emptyContentTypeIsPlain bool) bool

IsPlain returns true if the MIME headers define a valid 'text/plain' or 'text/html' part. If emptyContentTypeIsPlain is set to true, a missing Content-Type header is also treated as a plain part.

func NewCharsetReader

func NewCharsetReader(charset string, input io.Reader) (io.Reader, error)

NewCharsetReader generates charset-conversion readers, converting from the provided charset into UTF-8. The CharsetReader signature is defined by Go's mime.WordDecoder.

This function is similar to: https://godoc.org/golang.org/x/net/html/charset#NewReaderLabel

func NewMIMEPart

func NewMIMEPart(parent MIMEPart, contentType string) *memMIMEPart

NewMIMEPart creates a new memMIMEPart object. It does not update the parent's FirstChild attribute.

Types

type Base64Cleaner

type Base64Cleaner struct {
	// contains filtered or unexported fields
}

Base64Cleaner helps work around bugs in Go's built-in base64 decoder by stripping out whitespace that would cause Go to lose count of things and issue an "illegal base64 data at input byte..." error.
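The idea of cleaning the stream before decoding can be sketched as a small io.Reader wrapper. The type whitespaceStripper and the helper decodeBase64 are hypothetical names for this illustration; which bytes the real Base64Cleaner removes is not documented here, so the sketch assumes spaces, tabs, CR and LF.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// whitespaceStripper is a hypothetical reader (not enmime's Base64Cleaner)
// that drops spaces, tabs, CR and LF from the underlying stream.
type whitespaceStripper struct {
	r io.Reader
}

func (s *whitespaceStripper) Read(p []byte) (int, error) {
	if len(p) == 0 {
		return 0, nil
	}
	for {
		n, err := s.r.Read(p)
		kept := 0
		for _, b := range p[:n] {
			if b == ' ' || b == '\t' || b == '\r' || b == '\n' {
				continue
			}
			p[kept] = b // compact the kept bytes in place
			kept++
		}
		// Return once there is data or an error; otherwise the chunk
		// was all whitespace, so read again.
		if kept > 0 || err != nil {
			return kept, err
		}
	}
}

// decodeBase64 decodes s after stripping whitespace from it.
func decodeBase64(s string) (string, error) {
	clean := &whitespaceStripper{r: strings.NewReader(s)}
	out, err := io.ReadAll(base64.NewDecoder(base64.StdEncoding, clean))
	return string(out), err
}

func main() {
	text, err := decodeBase64("SGVs bG8s\tIHdv\r\ncmxkIQ==")
	fmt.Println(text, err)
}
```

Wrapping the source reader keeps the cleaning step streaming, so the encoded data never has to be buffered in full before decoding.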

func NewBase64Cleaner

func NewBase64Cleaner(r io.Reader) *Base64Cleaner

NewBase64Cleaner returns a Base64Cleaner object for the specified reader. Base64Cleaner implements the io.Reader interface.

func (*Base64Cleaner) Read

func (qp *Base64Cleaner) Read(p []byte) (n int, err error)

Read method for io.Reader interface.

type Base64Combiner

type Base64Combiner struct {
	// contains filtered or unexported fields
}

Base64Combiner helps work around a bug where base64 data that is split across line breaks causes an "illegal base64 data at input byte..." error when the data contains padding in the middle.

func NewB64SoftCombiner

func NewB64SoftCombiner(r io.Reader) *Base64Combiner

func NewBase64Combiner

func NewBase64Combiner(r io.Reader) *Base64Combiner

NewBase64Combiner returns a Base64Combiner that reads from a base64-encoded source and produces the original data, no matter how the source is split by line breaks or carriage returns.
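To make the padding problem concrete, the sketch below decodes input in which each line was encoded separately, so padding appears mid-stream. It is only an illustration of the failure mode, not the Base64Combiner algorithm: decodeLines is a hypothetical helper and assumes every line is a complete, independently padded chunk.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// decodeLines is a hypothetical helper, not enmime's Base64Combiner.
// Assumption: each line is a complete base64 chunk with its own padding.
func decodeLines(encoded string) (string, error) {
	isBreak := func(r rune) bool { return r == '\r' || r == '\n' }
	var out []byte
	for _, line := range strings.FieldsFunc(encoded, isBreak) {
		chunk, err := base64.StdEncoding.DecodeString(line)
		if err != nil {
			return "", err
		}
		out = append(out, chunk...)
	}
	return string(out), nil
}

func main() {
	input := "SGVsbG8=\r\nd29ybGQ=" // "Hello" and "world", encoded separately

	// Decoding the whole input at once fails because data follows padding.
	_, err := base64.StdEncoding.DecodeString(input)
	fmt.Println("single pass error:", err)

	text, err := decodeLines(input)
	fmt.Println(text, err)
}
```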

func (*Base64Combiner) Read

func (b *Base64Combiner) Read(p []byte) (int, error)

Read method for io.Reader interface.

type MIMEBody

type MIMEBody struct {
	Text           string // The plain text portion of the message
	TextCharset    string
	HTML           string // The HTML portion of the message
	HTMLCharset    string
	IsTextFromHTML bool       // Plain text was empty; down-converted HTML
	Root           MIMEPart   // The top-level MIMEPart
	Attachments    []MIMEPart // All parts having a Content-Disposition of attachment
	Inlines        []MIMEPart // All parts having a Content-Disposition of inline
	OtherParts     []MIMEPart // All parts not in Attachments and Inlines
	// contains filtered or unexported fields
}

MIMEBody is the outer wrapper for MIME messages.

func ParseMIMEBody

func ParseMIMEBody(mailMsg *mail.Message) (*MIMEBody, error)

ParseMIMEBody parses the body of the message object into a tree of MIMEPart objects, each of which is aware of its content type, filename and headers. If the part was encoded in quoted-printable or base64, it is decoded before being stored in the MIMEPart object.

func ParseMIMEBodyWithUTF8QPCorrection

func ParseMIMEBodyWithUTF8QPCorrection(mailMsg *mail.Message) (*MIMEBody, error)

ParseMIMEBodyWithUTF8QPCorrection is like ParseMIMEBody, but it tries to correct emails with invalid UTF-8 quoted-printable content so that they can be parsed successfully.

func (*MIMEBody) AddressList

func (m *MIMEBody) AddressList(key string) ([]*mail.Address, error)

AddressList returns a mail.Address slice with RFC 2047 encoded names.

func (*MIMEBody) GetHeader

func (m *MIMEBody) GetHeader(name string) string

GetHeader processes the specified header for RFC 2047 encoded words and returns the result.

type MIMEPart

type MIMEPart interface {
	Parent() MIMEPart             // Parent of this part (can be nil)
	FirstChild() MIMEPart         // First (top most) child of this part
	NextSibling() MIMEPart        // Next sibling of this part
	Header() textproto.MIMEHeader // Header as parsed by textproto package
	ContentType() string          // Content-Type header without parameters
	Disposition() string          // Content-Disposition header without parameters
	FileName() string             // File Name from disposition or type header
	Charset() string              // Content Charset
	Content() []byte              // Decoded content of this part (can be empty)
}

MIMEPart is the primary interface enmime clients will use. Each MIMEPart represents a node in the MIME multipart tree. The Content-Type, Disposition and File Name are parsed out of the header for easier access.

TODO Content should probably be a reader so that it does not need to be stored in memory.

func BreadthMatchAll

func BreadthMatchAll(p MIMEPart, matcher MIMEPartMatcher) []MIMEPart

BreadthMatchAll performs a breadth first search of the MIMEPart tree and returns all parts that cause the given matcher to return true

func BreadthMatchFirst

func BreadthMatchFirst(p MIMEPart, matcher MIMEPartMatcher) MIMEPart

BreadthMatchFirst performs a breadth first search of the MIMEPart tree and returns the first part that causes the given matcher to return true

func DepthMatchAll

func DepthMatchAll(p MIMEPart, matcher MIMEPartMatcher) []MIMEPart

DepthMatchAll performs a depth first search of the MIMEPart tree and returns all parts that cause the given matcher to return true

func DepthMatchFirst

func DepthMatchFirst(p MIMEPart, matcher MIMEPartMatcher) MIMEPart

DepthMatchFirst performs a depth first search of the MIMEPart tree and returns the first part that causes the given matcher to return true

func ParseMIME

func ParseMIME(reader *bufio.Reader) (MIMEPart, error)

ParseMIME reads a MIME document from the provided reader and parses it into a tree of MIMEPart objects.

type MIMEPartMatcher

type MIMEPartMatcher func(part MIMEPart) bool

MIMEPartMatcher is a function type that you must implement to search for MIMEParts using the BreadthMatch* and DepthMatch* functions. Implementers should inspect the provided MIMEPart and return true if it matches their criteria.
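The difference between the breadth-first and depth-first searches is easiest to see on a small tree. The sketch below uses a hypothetical node type with first-child and next-sibling links, mirroring the shape of the MIMEPart interface; the names node, matcher, sampleTree, depthMatchFirst and breadthMatchFirst are assumptions for illustration and the traversal order of the real functions may differ in detail.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a MIMEPart.
type node struct {
	name        string
	contentType string
	firstChild  *node
	nextSibling *node
}

// matcher plays the role of a MIMEPartMatcher for node values.
type matcher func(n *node) bool

// depthMatchFirst visits a node, then each child subtree in order.
func depthMatchFirst(root *node, m matcher) *node {
	if root == nil {
		return nil
	}
	if m(root) {
		return root
	}
	for c := root.firstChild; c != nil; c = c.nextSibling {
		if found := depthMatchFirst(c, m); found != nil {
			return found
		}
	}
	return nil
}

// breadthMatchFirst visits nodes level by level using a queue.
func breadthMatchFirst(root *node, m matcher) *node {
	queue := []*node{root}
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		if n == nil {
			continue
		}
		if m(n) {
			return n
		}
		for c := n.firstChild; c != nil; c = c.nextSibling {
			queue = append(queue, c)
		}
	}
	return nil
}

// sampleTree builds: root -> [alt -> [nested], top].
func sampleTree() *node {
	top := &node{name: "top", contentType: "text/plain"}
	nested := &node{name: "nested", contentType: "text/plain"}
	alt := &node{name: "alt", contentType: "multipart/alternative",
		firstChild: nested, nextSibling: top}
	return &node{name: "root", contentType: "multipart/mixed", firstChild: alt}
}

func main() {
	isPlain := func(n *node) bool { return n.contentType == "text/plain" }
	fmt.Println(depthMatchFirst(sampleTree(), isPlain).name)
	fmt.Println(breadthMatchFirst(sampleTree(), isPlain).name)
}
```

With this tree, the depth-first search reaches the nested text/plain part before the top-level one, while the breadth-first search finds the top-level part first. That ordering is the main reason to prefer one search over the other.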
