core

package
v0.0.0-...-73e6ce1
Warning

This package is not in the latest version of its module.

Published: May 27, 2023 License: AGPL-3.0, AGPL-3.0-only Imports: 32 Imported by: 0

Documentation

Overview

Package core defines and implements the primitive PDF object types in Go, and provides functionality for parsing them from a PDF file stream. This includes I/O handling, cross references, repairs, encryption, encoding and other core capabilities.


Constants

const (
	// XrefTypeTableEntry indicates a normal xref table entry.
	XrefTypeTableEntry xrefType = iota

	// XrefTypeObjectStream indicates an xref entry in an xref object stream.
	XrefTypeObjectStream xrefType = iota
)
const (
	StreamEncodingFilterNameFlate     = "FlateDecode"
	StreamEncodingFilterNameLZW       = "LZWDecode"
	StreamEncodingFilterNameDCT       = "DCTDecode"
	StreamEncodingFilterNameRunLength = "RunLengthDecode"
	StreamEncodingFilterNameASCIIHex  = "ASCIIHexDecode"
	StreamEncodingFilterNameASCII85   = "ASCII85Decode"
	StreamEncodingFilterNameCCITTFax  = "CCITTFaxDecode"
	StreamEncodingFilterNameJBIG2     = "JBIG2Decode"
	StreamEncodingFilterNameJPX       = "JPXDecode"
	StreamEncodingFilterNameRaw       = "Raw"
)

Stream encoding filter names.

const (
	// DefaultJPEGQuality is the default quality produced by JPEG encoders.
	DefaultJPEGQuality = 75
)
const JB2ImageAutoThreshold = -1.0

JB2ImageAutoThreshold is the constant passed to the GoImageToJBIG2 function to request that the threshold be determined automatically from the image histogram.

Variables

var (
	// ErrUnsupportedEncodingParameters error indicates that encoding/decoding was attempted with unsupported
	// encoding parameters.
	// For example when trying to encode with an unsupported Predictor (flate).
	ErrUnsupportedEncodingParameters = errors.New("unsupported encoding parameters")
	ErrNoCCITTFaxDecode              = errors.New("CCITTFaxDecode encoding is not yet implemented")
	ErrNoJBIG2Decode                 = errors.New("JBIG2Decode encoding is not yet implemented")
	ErrNoJPXDecode                   = errors.New("JPXDecode encoding is not yet implemented")
	ErrNoPdfVersion                  = errors.New("version not found")
	ErrTypeError                     = errors.New("type check error")
	ErrRangeError                    = errors.New("range check error")
	ErrNotSupported                  = errors.New("feature not currently supported")
	ErrNotANumber                    = errors.New("not a number")
)

Common errors that may occur on PDF parsing/writing.

Functions

func DecodeStream

func DecodeStream(streamObj *PdfObjectStream) ([]byte, error)

DecodeStream decodes the stream data and returns the decoded data. An error is returned upon failure.

func EncodeStream

func EncodeStream(streamObj *PdfObjectStream) error

EncodeStream encodes the stream data using the encoder specified by the stream's dictionary.

func EqualObjects

func EqualObjects(obj1, obj2 PdfObject) bool

EqualObjects returns true if `obj1` and `obj2` have the same contents.

NOTE: It is a good idea to flatten obj1 and obj2 with FlattenObject before calling this function so that contents, rather than references, can be compared.

func GetBoolVal

func GetBoolVal(obj PdfObject) (b bool, found bool)

GetBoolVal returns the bool value within a *PdfObjectBool represented by a PdfObject interface directly or indirectly. If the PdfObject does not represent a bool value, a default value of false is returned (with found = false).

func GetFloatVal

func GetFloatVal(obj PdfObject) (val float64, found bool)

GetFloatVal returns the float64 value represented by the PdfObject directly or indirectly if contained within an indirect object. On type mismatch the found bool flag returned is false and the zero value is returned.

func GetIntVal

func GetIntVal(obj PdfObject) (val int, found bool)

GetIntVal returns the int value represented by the PdfObject directly or indirectly if contained within an indirect object. On type mismatch the found bool flag returned is false and the zero value is returned.

func GetNameVal

func GetNameVal(obj PdfObject) (val string, found bool)

GetNameVal returns the string value represented by the PdfObject directly or indirectly if contained within an indirect object. On type mismatch the found bool flag returned is false and an empty string is returned.

func GetNumberAsFloat

func GetNumberAsFloat(obj PdfObject) (float64, error)

GetNumberAsFloat returns the contents of `obj` as a float if it is an integer or float, or an error if it isn't.

func GetNumberAsInt64

func GetNumberAsInt64(obj PdfObject) (int64, error)

GetNumberAsInt64 returns the contents of `obj` as an int64 if it is an integer or float, or an error if it isn't. This is for cases where an integer is expected but some implementations actually store the number in a floating point format.
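The integer-or-float fallback can be sketched with a type switch. This is only an illustration: `object`, `integerObj` and `floatObj` are hypothetical stand-ins for the package's types, and truncating the float toward zero is an assumption rather than documented behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// object, integerObj and floatObj are hypothetical stand-ins for
// PdfObject and the package's integer and float object types.
type object interface{}
type integerObj int64
type floatObj float64

var errNotANumber = errors.New("not a number")

// numberAsInt64 accepts either numeric kind and returns an int64.
// Assumption: a float is truncated toward zero.
func numberAsInt64(obj object) (int64, error) {
	switch v := obj.(type) {
	case integerObj:
		return int64(v), nil
	case floatObj:
		return int64(v), nil
	}
	return 0, errNotANumber
}

func main() {
	for _, o := range []object{integerObj(7), floatObj(3.0), "seven"} {
		n, err := numberAsInt64(o)
		fmt.Println(n, err)
	}
}
```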

func GetNumbersAsFloat

func GetNumbersAsFloat(objects []PdfObject) (floats []float64, err error)

GetNumbersAsFloat converts a list of pdf objects representing floats or integers to a slice of float64 values.

func GetStringBytes

func GetStringBytes(obj PdfObject) (val []byte, found bool)

GetStringBytes is like GetStringVal except that it returns the string as a []byte. It is for convenience.

func GetStringVal

func GetStringVal(obj PdfObject) (val string, found bool)

GetStringVal returns the string value represented by the PdfObject directly or indirectly if contained within an indirect object. On type mismatch the found bool flag returned is false and an empty string is returned.

func IsDecimalDigit

func IsDecimalDigit(c byte) bool

IsDecimalDigit checks if the character is a part of a decimal number string.

func IsDelimiter

func IsDelimiter(c byte) bool

IsDelimiter checks if a character represents a delimiter.

func IsFloatDigit

func IsFloatDigit(c byte) bool

IsFloatDigit checks if a character can be a part of a float number string.

func IsNullObject

func IsNullObject(obj PdfObject) bool

IsNullObject returns true if `obj` is a PdfObjectNull.

func IsOctalDigit

func IsOctalDigit(c byte) bool

IsOctalDigit checks if a character can be part of an octal digit string.

func IsPrintable

func IsPrintable(c byte) bool

IsPrintable checks if a character is printable. Regular characters that are outside the range EXCLAMATION MARK (21h) (!) to TILDE (7Eh) (~) should be written using the hexadecimal notation.

func IsWhiteSpace

func IsWhiteSpace(ch byte) bool

IsWhiteSpace checks if byte represents a white space character.
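The character-class helpers above map onto the lexical conventions of the PDF specification (section 7.2.2). The following is a minimal sketch of equivalent predicates, written from the specification's character tables; the lowercase names are illustrative and the package's own functions may treat edge cases differently.

```go
package main

import "fmt"

// isWhiteSpace reports whether c is one of the six PDF white-space bytes:
// NUL, TAB, LF, FF, CR and SPACE.
func isWhiteSpace(c byte) bool {
	switch c {
	case 0x00, 0x09, 0x0A, 0x0C, 0x0D, 0x20:
		return true
	}
	return false
}

// isDelimiter reports whether c is one of the PDF delimiter characters.
func isDelimiter(c byte) bool {
	switch c {
	case '(', ')', '<', '>', '[', ']', '{', '}', '/', '%':
		return true
	}
	return false
}

// isOctalDigit reports whether c can appear in an octal escape such as \053.
func isOctalDigit(c byte) bool { return c >= '0' && c <= '7' }

// isPrintable reports whether c lies in the range '!' (21h) to '~' (7Eh).
func isPrintable(c byte) bool { return c >= 0x21 && c <= 0x7E }

func main() {
	for _, c := range []byte("/Name (str)") {
		fmt.Printf("%q ws=%t delim=%t printable=%t\n",
			c, isWhiteSpace(c), isDelimiter(c), isPrintable(c))
	}
}
```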

func PdfCryptNewEncrypt

func PdfCryptNewEncrypt(cf crypto.Filter, userPass, ownerPass []byte, perm security.Permissions) (*PdfCrypt, *EncryptInfo, error)

PdfCryptNewEncrypt makes the document crypt handler based on a specified crypt filter.

func ResolveReferencesDeep

func ResolveReferencesDeep(o PdfObject, traversed map[PdfObject]struct{}) error

ResolveReferencesDeep recursively traverses through object `o`, looking up and replacing references with indirect objects. Optionally a map of already deep-resolved objects can be provided via `traversed`. The `traversed` map is updated while traversing the objects to avoid traversing same objects multiple times.
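The role of the `traversed` map can be illustrated with a small stand-alone sketch. The `node`, `ref` and `array` types below are hypothetical and much simpler than the package's object model; the point is only that recording visited objects lets the recursion terminate on cyclic structures.

```go
package main

import "fmt"

// node is a hypothetical stand-in for PdfObject.
type node interface{}

// ref is a hypothetical reference; target is what a lookup would return.
type ref struct{ target node }

// array is a hypothetical container holding nodes.
type array struct{ elems []node }

// resolveDeep replaces refs held in arrays with their targets, recursively.
// traversed records visited nodes so shared or cyclic objects are handled once.
func resolveDeep(n node, traversed map[node]struct{}) {
	if traversed == nil {
		traversed = map[node]struct{}{}
	}
	if _, seen := traversed[n]; seen {
		return
	}
	traversed[n] = struct{}{}

	arr, ok := n.(*array)
	if !ok {
		return
	}
	for i, e := range arr.elems {
		if r, isRef := e.(*ref); isRef {
			e = r.target
			arr.elems[i] = e
		}
		resolveDeep(e, traversed)
	}
}

func main() {
	// Two arrays that refer to each other form a cycle.
	a := &array{}
	b := &array{elems: []node{&ref{target: a}}}
	a.elems = []node{&ref{target: b}, 42}

	resolveDeep(a, nil)
	fmt.Println(a.elems[0] == node(b), b.elems[0] == node(a))
}
```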

Types

type ASCII85Encoder

type ASCII85Encoder struct {
}

ASCII85Encoder implements ASCII85 encoder/decoder.

func NewASCII85Encoder

func NewASCII85Encoder() *ASCII85Encoder

NewASCII85Encoder makes a new ASCII85 encoder.

func (*ASCII85Encoder) DecodeBytes

func (enc *ASCII85Encoder) DecodeBytes(encoded []byte) ([]byte, error)

DecodeBytes decodes a byte array encoded with ASCII85. Every 5 ASCII characters decode to 4 raw binary bytes.
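The 5-to-4 ratio can be seen with Go's standard encoding/ascii85 package. This is a sketch and not the package's own implementation: the helper names are illustrative, and appending or stripping the `~>` end-of-data marker by hand is an assumption made because the standard library leaves delimiters to the caller.

```go
package main

import (
	"bytes"
	"encoding/ascii85"
	"fmt"
)

// encodeASCII85 encodes data and appends the "~>" end-of-data marker.
func encodeASCII85(data []byte) []byte {
	buf := make([]byte, ascii85.MaxEncodedLen(len(data)))
	n := ascii85.Encode(buf, data)
	return append(buf[:n], '~', '>')
}

// decodeASCII85 strips an optional trailing "~>" and decodes the rest.
func decodeASCII85(encoded []byte) ([]byte, error) {
	encoded = bytes.TrimSuffix(bytes.TrimSpace(encoded), []byte("~>"))
	// A single 'z' expands to four zero bytes, so size the buffer generously.
	out := make([]byte, 4*len(encoded))
	n, _, err := ascii85.Decode(out, encoded, true)
	if err != nil {
		return nil, err
	}
	return out[:n], nil
}

func main() {
	enc := encodeASCII85([]byte("test")) // 4 input bytes become 5 characters plus "~>"
	dec, err := decodeASCII85(enc)
	fmt.Printf("%s -> %q (err=%v)\n", enc, dec, err)
}
```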

func (*ASCII85Encoder) DecodeStream

func (enc *ASCII85Encoder) DecodeStream(streamObj *PdfObjectStream) ([]byte, error)

DecodeStream implements ASCII85 stream decoding.

func (*ASCII85Encoder) EncodeBytes

func (enc *ASCII85Encoder) EncodeBytes(data []byte) ([]byte, error)

EncodeBytes encodes data into ASCII85 encoded format.

func (*ASCII85Encoder) GetFilterName

func (enc *ASCII85Encoder) GetFilterName() string

GetFilterName returns the name of the encoding filter.

func (*ASCII85Encoder) MakeDecodeParams

func (enc *ASCII85Encoder) MakeDecodeParams() PdfObject

MakeDecodeParams makes a new instance of an encoding dictionary based on the current encoder settings.

func (*ASCII85Encoder) MakeStreamDict

func (enc *ASCII85Encoder) MakeStreamDict() *PdfObjectDictionary

MakeStreamDict makes a new instance of an encoding dictionary for a stream object.

func (*ASCII85Encoder) UpdateParams

func (enc *ASCII85Encoder) UpdateParams(params *PdfObjectDictionary)

UpdateParams updates the parameter values of the encoder.

type ASCIIHexEncoder

type ASCIIHexEncoder struct {
}

ASCIIHexEncoder implements ASCII hex encoder/decoder.

func NewASCIIHexEncoder

func NewASCIIHexEncoder() *ASCIIHexEncoder

NewASCIIHexEncoder makes a new ASCII hex encoder.

func (*ASCIIHexEncoder) DecodeBytes

func (enc *ASCIIHexEncoder) DecodeBytes(encoded []byte) ([]byte, error)

DecodeBytes decodes a slice of ASCII hex encoded bytes and returns the result.

func (*ASCIIHexEncoder) DecodeStream

func (enc *ASCIIHexEncoder) DecodeStream(streamObj *PdfObjectStream) ([]byte, error)

DecodeStream implements ASCII hex decoding.

func (*ASCIIHexEncoder) EncodeBytes

func (enc *ASCIIHexEncoder) EncodeBytes(data []byte) ([]byte, error)

EncodeBytes ASCII hex encodes the passed in slice of bytes.
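For orientation, ASCII hex encoding can be sketched with the standard encoding/hex package. The helpers below are illustrative and follow the PDF filter rules as generally specified (white space is ignored, `>` ends the data, and a trailing odd digit is padded with `0`); the exact output layout of the package's encoder may differ.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// encodeASCIIHex writes two hex digits per byte, followed by the '>' marker.
func encodeASCIIHex(data []byte) string {
	return strings.ToUpper(hex.EncodeToString(data)) + ">"
}

// decodeASCIIHex skips white space, stops at '>' and pads an odd final digit.
func decodeASCIIHex(encoded string) ([]byte, error) {
	var digits strings.Builder
scan:
	for _, r := range encoded {
		switch r {
		case '>':
			break scan
		case ' ', '\t', '\n', '\r', '\f', 0:
			continue
		}
		digits.WriteRune(r)
	}
	s := digits.String()
	if len(s)%2 == 1 {
		s += "0"
	}
	return hex.DecodeString(s)
}

func main() {
	enc := encodeASCIIHex([]byte("Hi"))
	dec, err := decodeASCIIHex("48 69>")
	fmt.Println(enc, string(dec), err)
}
```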

func (*ASCIIHexEncoder) GetFilterName

func (enc *ASCIIHexEncoder) GetFilterName() string

GetFilterName returns the name of the encoding filter.

func (*ASCIIHexEncoder) MakeDecodeParams

func (enc *ASCIIHexEncoder) MakeDecodeParams() PdfObject

MakeDecodeParams makes a new instance of an encoding dictionary based on the current encoder settings.

func (*ASCIIHexEncoder) MakeStreamDict

func (enc *ASCIIHexEncoder) MakeStreamDict() *PdfObjectDictionary

MakeStreamDict makes a new instance of an encoding dictionary for a stream object.

func (*ASCIIHexEncoder) UpdateParams

func (enc *ASCIIHexEncoder) UpdateParams(params *PdfObjectDictionary)

UpdateParams updates the parameter values of the encoder.

type CCITTFaxEncoder

type CCITTFaxEncoder struct {
	K                      int
	EndOfLine              bool
	EncodedByteAlign       bool
	Columns                int
	Rows                   int
	EndOfBlock             bool
	BlackIs1               bool
	DamagedRowsBeforeError int
}

CCITTFaxEncoder implements Group3 and Group4 facsimile (fax) encoder/decoder.

func NewCCITTFaxEncoder

func NewCCITTFaxEncoder() *CCITTFaxEncoder

NewCCITTFaxEncoder makes a new CCITTFax encoder.

func (*CCITTFaxEncoder) DecodeBytes

func (enc *CCITTFaxEncoder) DecodeBytes(encoded []byte) ([]byte, error)

DecodeBytes decodes the CCITTFax encoded image data.

func (*CCITTFaxEncoder) DecodeStream

func (enc *CCITTFaxEncoder) DecodeStream(streamObj *PdfObjectStream) ([]byte, error)

DecodeStream decodes the stream containing CCITTFax encoded image data.

func (*CCITTFaxEncoder) EncodeBytes

func (enc *CCITTFaxEncoder) EncodeBytes(data []byte) ([]byte, error)

EncodeBytes encodes the image data using either Group3 or Group4 CCITT facsimile (fax) encoding. `data` is expected to be 1 color component, 1 byte per component.

func (*CCITTFaxEncoder) GetFilterName

func (enc *CCITTFaxEncoder) GetFilterName() string

GetFilterName returns the name of the encoding filter.

func (*CCITTFaxEncoder) MakeDecodeParams

func (enc *CCITTFaxEncoder) MakeDecodeParams() PdfObject

MakeDecodeParams makes a new instance of an encoding dictionary based on the current encoder settings.

func (*CCITTFaxEncoder) MakeStreamDict

func (enc *CCITTFaxEncoder) MakeStreamDict() *PdfObjectDictionary

MakeStreamDict makes a new instance of an encoding dictionary for a stream object.

func (*CCITTFaxEncoder) UpdateParams

func (enc *CCITTFaxEncoder) UpdateParams(params *PdfObjectDictionary)

UpdateParams updates the parameter values of the encoder.

type DCTEncoder

type DCTEncoder struct {
	ColorComponents  int // 1 (gray), 3 (rgb), 4 (cmyk)
	BitsPerComponent int // 8 or 16 bit
	Width            int
	Height           int
	Quality          int
}

DCTEncoder provides DCT (JPG) encoding/decoding functionality for images.

func NewDCTEncoder

func NewDCTEncoder() *DCTEncoder

NewDCTEncoder makes a new DCT encoder with default parameters.

func (*DCTEncoder) DecodeBytes

func (enc *DCTEncoder) DecodeBytes(encoded []byte) ([]byte, error)

DecodeBytes decodes a slice of DCT encoded bytes and returns the result.

func (*DCTEncoder) DecodeStream

func (enc *DCTEncoder) DecodeStream(streamObj *PdfObjectStream) ([]byte, error)

DecodeStream decodes a DCT encoded stream and returns the result as a slice of bytes.

func (*DCTEncoder) EncodeBytes

func (enc *DCTEncoder) EncodeBytes(data []byte) ([]byte, error)

EncodeBytes DCT encodes the passed in slice of bytes.

func (*DCTEncoder) GetFilterName

func (enc *DCTEncoder) GetFilterName() string

GetFilterName returns the name of the encoding filter.

func (*DCTEncoder) MakeDecodeParams

func (enc *DCTEncoder) MakeDecodeParams() PdfObject

MakeDecodeParams makes a new instance of an encoding dictionary based on the current encoder settings.

func (*DCTEncoder) MakeStreamDict

func (enc *DCTEncoder) MakeStreamDict() *PdfObjectDictionary

MakeStreamDict makes a new instance of an encoding dictionary for a stream object. Has the Filter set. Some other parameters are generated elsewhere.

func (*DCTEncoder) UpdateParams

func (enc *DCTEncoder) UpdateParams(params *PdfObjectDictionary)

UpdateParams updates the parameter values of the encoder.

type DrawableImage

type DrawableImage interface {
	ColorModel() gocolor.Model
	Bounds() goimage.Rectangle
	At(x, y int) gocolor.Color
	Set(x, y int, c gocolor.Color)
}

DrawableImage is the same as the Image interface in Go's image/draw package, which allows drawing on images.

type EncryptInfo

type EncryptInfo struct {
	// Version is the minimal PDF version that supports the specified encryption algorithm.
	Version
	// Encrypt is an encryption dictionary that contains all necessary parameters.
	// It should be stored in all copies of the document trailer.
	Encrypt *PdfObjectDictionary
	// ID0 and ID1 are IDs used in the trailer. Older algorithms such as RC4 use them for encryption.
	ID0, ID1 string
}

EncryptInfo contains information generated by the document encrypter.

type FlateEncoder

type FlateEncoder struct {
	Predictor        int
	BitsPerComponent int
	// For predictors
	Columns int
	Colors  int
}

FlateEncoder represents Flate encoding.

func NewFlateEncoder

func NewFlateEncoder() *FlateEncoder

NewFlateEncoder makes a new flate encoder with default parameters, predictor 1 and bits per component 8.

func (*FlateEncoder) DecodeBytes

func (enc *FlateEncoder) DecodeBytes(encoded []byte) ([]byte, error)

DecodeBytes decodes a slice of Flate encoded bytes and returns the result.

func (*FlateEncoder) DecodeStream

func (enc *FlateEncoder) DecodeStream(streamObj *PdfObjectStream) ([]byte, error)

DecodeStream decodes a Flate encoded stream object and returns the decoded bytes.

func (*FlateEncoder) EncodeBytes

func (enc *FlateEncoder) EncodeBytes(data []byte) ([]byte, error)

EncodeBytes encodes a byte array and returns the encoded value based on the encoder parameters.

func (*FlateEncoder) GetFilterName

func (enc *FlateEncoder) GetFilterName() string

GetFilterName returns the name of the encoding filter.

func (*FlateEncoder) MakeDecodeParams

func (enc *FlateEncoder) MakeDecodeParams() PdfObject

MakeDecodeParams makes a new instance of an encoding dictionary based on the current encoder settings.

func (*FlateEncoder) MakeStreamDict

func (enc *FlateEncoder) MakeStreamDict() *PdfObjectDictionary

MakeStreamDict makes a new instance of an encoding dictionary for a stream object. Has the Filter set and the DecodeParms.

func (*FlateEncoder) SetPredictor

func (enc *FlateEncoder) SetPredictor(columns int)

SetPredictor sets the predictor function. Specify the number of columns per row, that is, the number of samples per row. The predictor is used for grouping data together for compression.

func (*FlateEncoder) UpdateParams

func (enc *FlateEncoder) UpdateParams(params *PdfObjectDictionary)

UpdateParams updates the parameter values of the encoder.

type JBIG2CompressionType

type JBIG2CompressionType int

JBIG2CompressionType defines the enum compression type used by the JBIG2Encoder.

const (
	// JB2Generic is the JBIG2 compression type that uses generic region see 6.2.
	JB2Generic JBIG2CompressionType = iota
	// JB2SymbolCorrelation is the JBIG2 compression type that uses symbol dictionary and text region encoding procedure
	// with the correlation classification.
	// NOT IMPLEMENTED YET.
	JB2SymbolCorrelation
	// JB2SymbolRankHaus is the JBIG2 compression type that uses symbol dictionary and text region encoding procedure
	// with the rank hausdorff classification. RankHausMode uses the rank Hausdorff method that classifies the input images.
	// It is more robust, more susceptible to confusing components that should be in different classes.
	// NOT IMPLEMENTED YET.
	JB2SymbolRankHaus
)

type JBIG2Encoder

type JBIG2Encoder struct {
	// These values are required to be set for the 'EncodeBytes' method.
	// ColorComponents defines the number of color components for provided image.
	ColorComponents int
	// BitsPerComponent is the number of bits stored per color component.
	BitsPerComponent int
	// Width is the width of the image to encode
	Width int
	// Height is the height of the image to encode.
	Height int

	// Globals are the JBIG2 global segments.
	Globals jbig2.Globals
	// IsChocolateData defines if the data is encoded such that
	// binary data '1' means black and '0' white.
	// otherwise the data is called vanilla.
	// Naming convention taken from: 'https://en.wikipedia.org/wiki/Binary_image#Interpretation'
	IsChocolateData bool
	// DefaultPageSettings are the settings parameters used by the jbig2 encoder.
	DefaultPageSettings JBIG2EncoderSettings
	// contains filtered or unexported fields
}

JBIG2Encoder implements both the jbig2 encoder and the decoder. The encoder can encode the provided images (it works best on document scans) in several ways. By default it uses the single-page generic encoder, which stores lossless data as a single segment. To store multiple image pages, use 'FileMode', which allows more pages to be stored within a single jbig2 document.

WIP: To obtain better compression results, the encoder would allow encoding the input in a lossy or lossless way using a component (symbol) mode. It divides the image into components, checks whether any component is 'similar' to the others, and maps similar components together. The symbol classes are stored in a dictionary. The encoder then creates text regions that use the related symbol classes to fill their space. Similarity is defined by the 'Threshold' variable (default: 0.95). The lower the value, the more components match a single class, so the compression is better, but the result might become lossy.

func NewJBIG2Encoder

func NewJBIG2Encoder() *JBIG2Encoder

NewJBIG2Encoder creates a new JBIG2Encoder.

func (*JBIG2Encoder) AddPageImage

func (enc *JBIG2Encoder) AddPageImage(img *JBIG2Image, settings *JBIG2EncoderSettings) (err error)

AddPageImage adds a page with the image 'img' to the encoder context so that it is encoded into the jbig2 document. The 'settings' define which encoding type the encoder should use.

func (*JBIG2Encoder) DecodeBytes

func (enc *JBIG2Encoder) DecodeBytes(encoded []byte) ([]byte, error)

DecodeBytes decodes a slice of JBIG2 encoded bytes and returns the results.

func (*JBIG2Encoder) DecodeGlobals

func (enc *JBIG2Encoder) DecodeGlobals(encoded []byte) (jbig2.Globals, error)

DecodeGlobals decodes the 'encoded' byte stream and returns its globally defined segments ('Globals').

func (*JBIG2Encoder) DecodeImages

func (enc *JBIG2Encoder) DecodeImages(encoded []byte) ([]image.Image, error)

DecodeImages decodes the page images from the jbig2 'encoded' data input. The jbig2 document may contain multiple pages, thus the function can return multiple images. The images order corresponds to the page number.

func (*JBIG2Encoder) DecodeStream

func (enc *JBIG2Encoder) DecodeStream(streamObj *PdfObjectStream) ([]byte, error)

DecodeStream decodes a JBIG2 encoded stream and returns the result as a slice of bytes.

func (*JBIG2Encoder) Encode

func (enc *JBIG2Encoder) Encode() (data []byte, err error)

Encode encodes the previously prepared jbig2 document and returns it as a byte slice.

func (*JBIG2Encoder) EncodeBytes

func (enc *JBIG2Encoder) EncodeBytes(data []byte) ([]byte, error)

EncodeBytes encodes a slice of bytes into the JBIG2 encoding format. The input 'data' must be an image. To decode it, the caller is responsible for loading the matching codec ('png', 'jpg'). Returns the byte slice of a single-page jbig2 encoded document. The encoder uses DefaultPageSettings to encode the given image.

func (*JBIG2Encoder) EncodeImage

func (enc *JBIG2Encoder) EncodeImage(img image.Image) ([]byte, error)

EncodeImage encodes 'img' golang image.Image into jbig2 encoded bytes document using default encoder settings.

func (*JBIG2Encoder) EncodeJBIG2Image

func (enc *JBIG2Encoder) EncodeJBIG2Image(img *JBIG2Image) ([]byte, error)

EncodeJBIG2Image encodes 'img' into jbig2 encoded bytes stream, using default encoder settings.

func (*JBIG2Encoder) GetFilterName

func (enc *JBIG2Encoder) GetFilterName() string

GetFilterName returns the name of the encoding filter.

func (*JBIG2Encoder) MakeDecodeParams

func (enc *JBIG2Encoder) MakeDecodeParams() PdfObject

MakeDecodeParams makes a new instance of an encoding dictionary based on the current encoder settings.

func (*JBIG2Encoder) MakeStreamDict

func (enc *JBIG2Encoder) MakeStreamDict() *PdfObjectDictionary

MakeStreamDict makes a new instance of an encoding dictionary for a stream object.

func (*JBIG2Encoder) UpdateParams

func (enc *JBIG2Encoder) UpdateParams(params *PdfObjectDictionary)

UpdateParams updates the parameter values of the encoder. Implements StreamEncoder interface.

type JBIG2EncoderSettings

type JBIG2EncoderSettings struct {
	// FileMode defines if the jbig2 encoder should return full jbig2 file instead of
	// shortened pdf mode. This adds the file header to the jbig2 definition.
	FileMode bool
	// Compression is the setting that defines the compression type used for encoding the page.
	Compression JBIG2CompressionType
	// DuplicatedLinesRemoval encodes the generic region so that duplicated lines
	// are not stored twice.
	DuplicatedLinesRemoval bool
	// DefaultPixelValue is the initial bit value of every pixel in the page.
	DefaultPixelValue uint8
	// ResolutionX optional setting that defines the 'x' axis input image resolution - used for single page encoding.
	ResolutionX int
	// ResolutionY optional setting that defines the 'y' axis input image resolution - used for single page encoding.
	ResolutionY int
	// Threshold defines the threshold of the image correlation for
	// non Generic compression.
	// Used only for JB2SymbolCorrelation and JB2SymbolRankHaus methods.
	// Best results in range [0.7 - 0.98] - the less the better the compression would be
	// but the more lossy.
	// Default value: 0.95
	Threshold float64
}

JBIG2EncoderSettings contains the parameters and settings used by the JBIG2Encoder. Current version works only on JB2Generic compression.

func (JBIG2EncoderSettings) Validate

func (s JBIG2EncoderSettings) Validate() error

Validate validates the page settings for the JBIG2 encoder.

type JBIG2Image

type JBIG2Image struct {
	// Width and Height defines the image boundaries.
	Width, Height int
	// Data is the byte slice data for the input image
	Data []byte
	// HasPadding is the attribute that defines if the last byte of the data in the row contains
	// 0 bits padding.
	HasPadding bool
}

JBIG2Image is the image structure used by the jbig2 encoder. Its Data must have 1 bit per component and 1 component per pixel (1bpp). To create a binary image, use the GoImageToJBIG2 function. If the image data contains row padding bytes, set HasPadding to true.

func GoImageToJBIG2

func GoImageToJBIG2(i image.Image, bwThreshold float64) (*JBIG2Image, error)

GoImageToJBIG2 creates a binary image based on the Go image.Image 'i'. If the image is not a black/white image, the function converts the input into a JBIG2Image with 1bpp. Non-grayscale images are first converted to a temporary grayscale image. Each gray pixel value is then checked against the black/white threshold. The 'bwThreshold' value should be in the range (0.0, 1.0). The threshold checks whether the grayscale pixel (uint) value is greater or smaller than 'bwThreshold' * 255. Pixels inside the range will be white, and the others will be black. If 'bwThreshold' is equal to -1.0 (JB2ImageAutoThreshold), its value is set from the image histogram using the Triangle method. For more information, see:

https://www.mathworks.com/matlabcentral/fileexchange/28047-gray-image-thresholding-using-the-triangle-method

func (*JBIG2Image) ToGoImage

func (j *JBIG2Image) ToGoImage() (image.Image, error)

ToGoImage converts the JBIG2Image to the golang image.Image.

type JPXEncoder

type JPXEncoder struct{}

JPXEncoder implements the JPX encoder/decoder. It is a dummy for now (FIXME: implement).

func NewJPXEncoder

func NewJPXEncoder() *JPXEncoder

NewJPXEncoder returns a new instance of JPXEncoder.

func (*JPXEncoder) DecodeBytes

func (enc *JPXEncoder) DecodeBytes(encoded []byte) ([]byte, error)

DecodeBytes decodes a slice of JPX encoded bytes and returns the result.

func (*JPXEncoder) DecodeStream

func (enc *JPXEncoder) DecodeStream(streamObj *PdfObjectStream) ([]byte, error)

DecodeStream decodes a JPX encoded stream and returns the result as a slice of bytes.

func (*JPXEncoder) EncodeBytes

func (enc *JPXEncoder) EncodeBytes(data []byte) ([]byte, error)

EncodeBytes JPX encodes the passed in slice of bytes.

func (*JPXEncoder) GetFilterName

func (enc *JPXEncoder) GetFilterName() string

GetFilterName returns the name of the encoding filter.

func (*JPXEncoder) MakeDecodeParams

func (enc *JPXEncoder) MakeDecodeParams() PdfObject

MakeDecodeParams makes a new instance of an encoding dictionary based on the current encoder settings.

func (*JPXEncoder) MakeStreamDict

func (enc *JPXEncoder) MakeStreamDict() *PdfObjectDictionary

MakeStreamDict makes a new instance of an encoding dictionary for a stream object.

func (*JPXEncoder) UpdateParams

func (enc *JPXEncoder) UpdateParams(params *PdfObjectDictionary)

UpdateParams updates the parameter values of the encoder.

type LZWEncoder

type LZWEncoder struct {
	Predictor        int
	BitsPerComponent int
	// For predictors
	Columns int
	Colors  int
	// LZW algorithm setting.
	EarlyChange int
}

LZWEncoder provides LZW encoding/decoding functionality.

func NewLZWEncoder

func NewLZWEncoder() *LZWEncoder

NewLZWEncoder makes a new LZW encoder with default parameters.

func (*LZWEncoder) DecodeBytes

func (enc *LZWEncoder) DecodeBytes(encoded []byte) ([]byte, error)

DecodeBytes decodes a slice of LZW encoded bytes and returns the result.

func (*LZWEncoder) DecodeStream

func (enc *LZWEncoder) DecodeStream(streamObj *PdfObjectStream) ([]byte, error)

DecodeStream decodes a LZW encoded stream and returns the result as a slice of bytes.

func (*LZWEncoder) EncodeBytes

func (enc *LZWEncoder) EncodeBytes(data []byte) ([]byte, error)

EncodeBytes implements support for LZW encoding. Currently not supporting predictors (raw compressed data only). Only supports the Early change = 1 algorithm (compress/lzw) as the other implementation does not have a write method. TODO: Consider refactoring compress/lzw to allow both.

func (*LZWEncoder) GetFilterName

func (enc *LZWEncoder) GetFilterName() string

GetFilterName returns the name of the encoding filter.

func (*LZWEncoder) MakeDecodeParams

func (enc *LZWEncoder) MakeDecodeParams() PdfObject

MakeDecodeParams makes a new instance of an encoding dictionary based on the current encoder settings.

func (*LZWEncoder) MakeStreamDict

func (enc *LZWEncoder) MakeStreamDict() *PdfObjectDictionary

MakeStreamDict makes a new instance of an encoding dictionary for a stream object. Has the Filter set and the DecodeParms.

func (*LZWEncoder) UpdateParams

func (enc *LZWEncoder) UpdateParams(params *PdfObjectDictionary)

UpdateParams updates the parameter values of the encoder.

type MultiEncoder

type MultiEncoder struct {
	// contains filtered or unexported fields
}

MultiEncoder supports serial encoding.

func NewMultiEncoder

func NewMultiEncoder() *MultiEncoder

NewMultiEncoder returns a new instance of MultiEncoder.

func (*MultiEncoder) AddEncoder

func (enc *MultiEncoder) AddEncoder(encoder StreamEncoder)

AddEncoder adds the passed in encoder to the underlying encoder slice.

func (*MultiEncoder) DecodeBytes

func (enc *MultiEncoder) DecodeBytes(encoded []byte) ([]byte, error)

DecodeBytes decodes a multi-encoded slice of bytes by passing it through the DecodeBytes method of the underlying encoders.

func (*MultiEncoder) DecodeStream

func (enc *MultiEncoder) DecodeStream(streamObj *PdfObjectStream) ([]byte, error)

DecodeStream decodes a multi-encoded stream by passing it through the DecodeStream method of the underlying encoders.

func (*MultiEncoder) EncodeBytes

func (enc *MultiEncoder) EncodeBytes(data []byte) ([]byte, error)

EncodeBytes encodes the passed in slice of bytes by passing it through the EncodeBytes method of the underlying encoders.

func (*MultiEncoder) GetFilterName

func (enc *MultiEncoder) GetFilterName() string

GetFilterName returns the names of the underlying encoding filters, separated by spaces.
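Serial encoding can be pictured as a chain of codecs. In the sketch below, `codec` and `chain` are hypothetical types built on the standard encoding/hex and encoding/ascii85 packages. The ordering is an assumption that mirrors how a PDF /Filter array is read: decoding applies the filters in listed order, so encoding applies them in reverse.

```go
package main

import (
	"encoding/ascii85"
	"encoding/hex"
	"fmt"
	"strings"
)

// codec is a hypothetical, minimal stand-in for a stream encoder.
type codec struct {
	name   string
	encode func([]byte) ([]byte, error)
	decode func([]byte) ([]byte, error)
}

var hexCodec = codec{
	name:   "ASCIIHexDecode",
	encode: func(d []byte) ([]byte, error) { return []byte(hex.EncodeToString(d)), nil },
	decode: func(d []byte) ([]byte, error) { return hex.DecodeString(string(d)) },
}

var a85Codec = codec{
	name: "ASCII85Decode",
	encode: func(d []byte) ([]byte, error) {
		buf := make([]byte, ascii85.MaxEncodedLen(len(d)))
		return buf[:ascii85.Encode(buf, d)], nil
	},
	decode: func(d []byte) ([]byte, error) {
		out := make([]byte, 4*len(d))
		n, _, err := ascii85.Decode(out, d, true)
		return out[:n], err
	},
}

// chain applies several codecs one after another.
type chain []codec

// decode runs the codecs in listed order.
func (c chain) decode(data []byte) ([]byte, error) {
	var err error
	for _, cd := range c {
		if data, err = cd.decode(data); err != nil {
			return nil, err
		}
	}
	return data, nil
}

// encode runs the codecs in reverse order so that decode undoes it.
func (c chain) encode(data []byte) ([]byte, error) {
	var err error
	for i := len(c) - 1; i >= 0; i-- {
		if data, err = c[i].encode(data); err != nil {
			return nil, err
		}
	}
	return data, nil
}

// filterName joins the codec names with spaces.
func (c chain) filterName() string {
	names := make([]string, len(c))
	for i, cd := range c {
		names[i] = cd.name
	}
	return strings.Join(names, " ")
}

func main() {
	c := chain{hexCodec, a85Codec}
	enc, _ := c.encode([]byte("hello"))
	dec, err := c.decode(enc)
	fmt.Println(c.filterName(), string(enc), string(dec), err)
}
```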

func (*MultiEncoder) MakeDecodeParams

func (enc *MultiEncoder) MakeDecodeParams() PdfObject

MakeDecodeParams makes a new instance of an encoding dictionary based on the current encoder settings.

func (*MultiEncoder) MakeStreamDict

func (enc *MultiEncoder) MakeStreamDict() *PdfObjectDictionary

MakeStreamDict makes a new instance of an encoding dictionary for a stream object.

func (*MultiEncoder) UpdateParams

func (enc *MultiEncoder) UpdateParams(params *PdfObjectDictionary)

UpdateParams updates the parameter values of the encoder.

type PdfCrypt

type PdfCrypt struct {
	// contains filtered or unexported fields
}

PdfCrypt provides PDF encryption/decryption support. The PDF standard supports encryption of strings and streams (Section 7.6).

func PdfCryptNewDecrypt

func PdfCryptNewDecrypt(parser *PdfParser, ed, trailer *PdfObjectDictionary) (*PdfCrypt, error)

PdfCryptNewDecrypt makes the document crypt handler based on the encryption dictionary and trailer dictionary. Returns an error on failure to process.

func (*PdfCrypt) Decrypt

func (crypt *PdfCrypt) Decrypt(obj PdfObject, parentObjNum, parentGenNum int64) error

Decrypt an object with specified key. For numbered objects, the key argument is not used and a new one is generated based on the object and generation number. Traverses through all the subobjects (recursive).

Does not look up references. That should be done prior to calling.

func (*PdfCrypt) Encrypt

func (crypt *PdfCrypt) Encrypt(obj PdfObject, parentObjNum, parentGenNum int64) error

Encrypt an object with specified key. For numbered objects, the key argument is not used and a new one is generated based on the object and generation number. Traverses through all the subobjects (recursive).

Does not look up references. That should be done prior to calling.

func (*PdfCrypt) GetAccessPermissions

func (crypt *PdfCrypt) GetAccessPermissions() security.Permissions

GetAccessPermissions returns the PDF access permissions as a security.Permissions value.

func (*PdfCrypt) String

func (crypt *PdfCrypt) String() string

String returns a descriptive information string about the encryption method used.

type PdfIndirectObject

type PdfIndirectObject struct {
	PdfObjectReference
	PdfObject
}

PdfIndirectObject represents the primitive PDF indirect object.

func GetIndirect

func GetIndirect(obj PdfObject) (ind *PdfIndirectObject, found bool)

GetIndirect returns the *PdfIndirectObject represented by the PdfObject. On type mismatch the found bool flag is false and a nil pointer is returned.

func MakeIndirectObject

func MakeIndirectObject(obj PdfObject) *PdfIndirectObject

MakeIndirectObject creates a PdfIndirectObject with a specified direct object PdfObject.

func (*PdfIndirectObject) String

func (ind *PdfIndirectObject) String() string

String returns a string describing `ind`.

func (*PdfIndirectObject) WriteString

func (ind *PdfIndirectObject) WriteString() string

WriteString outputs the object as it is to be written to file.

type PdfObject

type PdfObject interface {
	// String outputs a string representation of the primitive (for debugging).
	String() string

	// WriteString outputs the PDF primitive as written to file as expected by the standard.
	// TODO(dennwc): it should return a byte slice, or accept a writer
	WriteString() string
}

PdfObject is an interface which all primitive PDF objects must implement.

func FlattenObject

func FlattenObject(obj PdfObject) PdfObject

FlattenObject returns the contents of `obj`. In other words, `obj` with indirect objects replaced by their values. The replacements are made recursively to a depth of traceMaxDepth. NOTE: Dicts are sorted to make objects with same contents have the same PDF object strings.
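The note about sorted dictionaries can be shown with a small stdlib-only sketch. It uses a plain `map[string]string` as a stand-in for a PDF dictionary, and `writeDict` is a hypothetical serializer, not the package's WriteString. Go randomizes map iteration order, so sorting the keys is what makes equal contents produce equal strings.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// writeDict is a hypothetical serializer for a map-based stand-in of a PDF
// dictionary. Keys are sorted so that equal contents give equal output.
func writeDict(d map[string]string) string {
	keys := make([]string, 0, len(d))
	for k := range d {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var sb strings.Builder
	sb.WriteString("<<")
	for _, k := range keys {
		fmt.Fprintf(&sb, " /%s %s", k, d[k])
	}
	sb.WriteString(" >>")
	return sb.String()
}

func main() {
	a := map[string]string{"Type": "/Pages", "Count": "3"}
	b := map[string]string{"Count": "3", "Type": "/Pages"}
	fmt.Println(writeDict(a) == writeDict(b))
	fmt.Println(writeDict(a))
}
```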

func ParseNumber

func ParseNumber(buf *bufio.Reader) (PdfObject, error)

ParseNumber parses a numeric object from a buffered stream. Section 7.3.3. Integer or Float.

An integer shall be written as one or more decimal digits optionally preceded by a sign. The value shall be interpreted as a signed decimal integer and shall be converted to an integer object.

A real value shall be written as one or more decimal digits with an optional sign and a leading, trailing, or embedded PERIOD (2Eh) (decimal point). The value shall be interpreted as a real number and shall be converted to a real object.

Regarding exponential numbers: 7.3.3 Numeric Objects: A conforming writer shall not use the PostScript syntax for numbers with non-decimal radices (such as 16#FFFE) or in exponential format (such as 6.02E23). Nonetheless, we sometimes get numbers with exponential format, so we will support it in the reader (no confusion with other types, so no compromise).
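The integer/real distinction above, including the tolerated exponential form, can be sketched on an already isolated token. This is an illustration only: `parseNumberToken` is a hypothetical helper, it works on a string rather than a buffered stream, and `strconv` accepts some forms that PDF syntax does not.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseNumberToken is a hypothetical helper, not the package's ParseNumber.
// It returns an int64 for integer tokens and a float64 for real tokens.
func parseNumberToken(tok string) (interface{}, error) {
	// A period or an exponent marker makes the token a real number.
	if strings.ContainsAny(tok, ".eE") {
		f, err := strconv.ParseFloat(tok, 64)
		if err != nil {
			return nil, err
		}
		return f, nil
	}
	// Otherwise the token is a signed decimal integer.
	i, err := strconv.ParseInt(tok, 10, 64)
	if err != nil {
		return nil, err
	}
	return i, nil
}

func main() {
	for _, tok := range []string{"123", "+17", "-.002", "34.", "6.02E23"} {
		v, err := parseNumberToken(tok)
		fmt.Printf("%q -> %T %v (err: %v)\n", tok, v, v, err)
	}
}
```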

func ResolveReference

func ResolveReference(obj PdfObject) PdfObject

ResolveReference resolves the reference if `obj` is a *PdfObjectReference and returns the referenced object. Otherwise it returns `obj` unchanged.

func TraceToDirectObject

func TraceToDirectObject(obj PdfObject) PdfObject

TraceToDirectObject traces a PdfObject to a direct object, for example a direct object contained in an indirect object (possibly through more than one level of indirection).
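The idea of tracing through nested indirection can be sketched with local stand-in types. The `indirect` type, the `traceToDirect` helper, and the depth limit of 10 are all assumptions made for this illustration and are not taken from the package.

```go
package main

import "fmt"

// object stands in for PdfObject in this sketch.
type object interface{}

// indirect stands in for an indirect object wrapping another object.
type indirect struct{ inner object }

// maxDepth is an assumed limit that guards against reference cycles.
const maxDepth = 10

// traceToDirect unwraps nested indirect objects until a direct object is
// found. It returns nil when the depth limit is exceeded.
func traceToDirect(obj object) object {
	for depth := 0; depth < maxDepth; depth++ {
		ind, ok := obj.(*indirect)
		if !ok {
			return obj
		}
		obj = ind.inner
	}
	return nil
}

func main() {
	nested := &indirect{inner: &indirect{inner: 42}}
	fmt.Println(traceToDirect(nested))
}
```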

type PdfObjectArray

type PdfObjectArray struct {
	// contains filtered or unexported fields
}

PdfObjectArray represents the primitive PDF array object.

func GetArray

func GetArray(obj PdfObject) (arr *PdfObjectArray, found bool)

GetArray returns the *PdfObjectArray represented by the PdfObject directly or indirectly within an indirect object. On type mismatch the found bool flag is false and a nil pointer is returned.

func MakeArray

func MakeArray(objects ...PdfObject) *PdfObjectArray

MakeArray creates a PdfObjectArray from a list of PdfObjects.

func MakeArrayFromFloats

func MakeArrayFromFloats(vals []float64) *PdfObjectArray

MakeArrayFromFloats creates a PdfObjectArray from a slice of float64s, where each array element is a PdfObjectFloat.

func MakeArrayFromIntegers

func MakeArrayFromIntegers(vals []int) *PdfObjectArray

MakeArrayFromIntegers creates a PdfObjectArray from a slice of ints, where each array element is a PdfObjectInteger.

func MakeArrayFromIntegers64

func MakeArrayFromIntegers64(vals []int64) *PdfObjectArray

MakeArrayFromIntegers64 creates a PdfObjectArray from a slice of int64s, where each array element is a PdfObjectInteger.

func (*PdfObjectArray) Append

func (array *PdfObjectArray) Append(objects ...PdfObject)

Append appends PdfObject(s) to the array.

func (*PdfObjectArray) Clear

func (array *PdfObjectArray) Clear()

Clear resets the array to an empty state.

func (*PdfObjectArray) Elements

func (array *PdfObjectArray) Elements() []PdfObject

Elements returns a slice of the PdfObject elements in the array.

func (*PdfObjectArray) Get

func (array *PdfObjectArray) Get(i int) PdfObject

Get returns the i-th element of the array or nil if out of bounds (by index).

func (*PdfObjectArray) GetAsFloat64Slice

func (array *PdfObjectArray) GetAsFloat64Slice() ([]float64, error)

GetAsFloat64Slice returns the array as []float64 slice. Returns an error if not entirely numeric (only PdfObjectIntegers, PdfObjectFloats).

func (*PdfObjectArray) Len

func (array *PdfObjectArray) Len() int

Len returns the number of elements in the array.

func (*PdfObjectArray) Set

func (array *PdfObjectArray) Set(i int, obj PdfObject) error

Set sets the PdfObject at index i of the array. An error is returned if the index is outside bounds.

func (*PdfObjectArray) String

func (array *PdfObjectArray) String() string

String returns a string describing `array`.

func (*PdfObjectArray) ToFloat64Array

func (array *PdfObjectArray) ToFloat64Array() ([]float64, error)

ToFloat64Array returns a slice of all elements in the array as a float64 slice. An error is returned if the array contains non-numeric objects (each element can be either PdfObjectInteger or PdfObjectFloat).
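The conversion rule (integers and floats are accepted, anything else is an error) can be sketched with Go's built-in types standing in for PdfObjectInteger and PdfObjectFloat. `toFloat64s` is a hypothetical helper written for this illustration.

```go
package main

import "fmt"

// toFloat64s is a hypothetical helper. int64 and float64 stand in for
// PdfObjectInteger and PdfObjectFloat; any other element type is an error.
func toFloat64s(elems []interface{}) ([]float64, error) {
	out := make([]float64, 0, len(elems))
	for i, e := range elems {
		switch v := e.(type) {
		case int64:
			out = append(out, float64(v))
		case float64:
			out = append(out, v)
		default:
			return nil, fmt.Errorf("element %d is not numeric: %T", i, e)
		}
	}
	return out, nil
}

func main() {
	vals, err := toFloat64s([]interface{}{int64(1), 2.5})
	fmt.Println(vals, err)

	_, err = toFloat64s([]interface{}{int64(1), "name"})
	fmt.Println(err)
}
```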

func (*PdfObjectArray) ToInt64Slice

func (array *PdfObjectArray) ToInt64Slice() ([]int64, error)

ToInt64Slice returns a slice of all array elements as an int64 slice. An error is returned if the array contains non-integer objects. Each element can only be PdfObjectInteger.

func (*PdfObjectArray) ToIntegerArray

func (array *PdfObjectArray) ToIntegerArray() ([]int, error)

ToIntegerArray returns a slice of all array elements as an int slice. An error is returned if the array contains non-integer objects. Each element can only be PdfObjectInteger.

func (*PdfObjectArray) WriteString

func (array *PdfObjectArray) WriteString() string

WriteString outputs the object as it is to be written to file.

type PdfObjectBool

type PdfObjectBool bool

PdfObjectBool represents the primitive PDF boolean object.

func GetBool

func GetBool(obj PdfObject) (bo *PdfObjectBool, found bool)

GetBool returns the *PdfObjectBool object that is represented by a PdfObject directly or indirectly within an indirect object. The bool flag indicates whether a match was found.

func MakeBool

func MakeBool(val bool) *PdfObjectBool

MakeBool creates a PdfObjectBool from a bool value.

func (*PdfObjectBool) String

func (bool *PdfObjectBool) String() string

String returns the state of the bool as "true" or "false".

func (*PdfObjectBool) WriteString

func (bool *PdfObjectBool) WriteString() string

WriteString outputs the object as it is to be written to file.

type PdfObjectDictionary

type PdfObjectDictionary struct {
	// contains filtered or unexported fields
}

PdfObjectDictionary represents the primitive PDF dictionary/map object.

func GetDict

func GetDict(obj PdfObject) (dict *PdfObjectDictionary, found bool)

GetDict returns the *PdfObjectDictionary represented by the PdfObject directly or indirectly within an indirect object. On type mismatch the found bool flag is false and a nil pointer is returned.

func MakeDict

func MakeDict() *PdfObjectDictionary

MakeDict creates and returns an empty PdfObjectDictionary.

func (*PdfObjectDictionary) Clear

func (d *PdfObjectDictionary) Clear()

Clear resets the dictionary to an empty state.

func (*PdfObjectDictionary) Get

Get returns the PdfObject corresponding to the specified key. Returns a nil value if the key is not set.

func (*PdfObjectDictionary) GetString

func (d *PdfObjectDictionary) GetString(key PdfObjectName) (string, bool)

GetString is a helper for Get that returns a string value. Returns false if the key is missing or a value is not a string.

func (*PdfObjectDictionary) Keys

func (d *PdfObjectDictionary) Keys() []PdfObjectName

Keys returns the list of keys in the dictionary. If `d` is nil returns a nil slice.

func (*PdfObjectDictionary) Merge

Merge merges in the key/value pairs from another dictionary, overwriting the values of keys that already exist. The mutated dictionary (d) is returned in order to allow method chaining.

func (*PdfObjectDictionary) Remove

func (d *PdfObjectDictionary) Remove(key PdfObjectName)

Remove removes an element specified by key.

func (*PdfObjectDictionary) Set

func (d *PdfObjectDictionary) Set(key PdfObjectName, val PdfObject)

Set sets the dictionary's key -> val mapping entry. Overwrites if key already set.

func (*PdfObjectDictionary) SetIfNotNil

func (d *PdfObjectDictionary) SetIfNotNil(key PdfObjectName, val PdfObject)

SetIfNotNil sets the dictionary's key -> val mapping entry only if val is not nil. Note that a type switch is performed: a nil pointer of a concrete type, e.g. (*PdfObjectArray)(nil), is not equal to PdfObject(nil) and would otherwise get set.
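The typed-nil pitfall described above is a general Go behavior and can be reproduced without the package. In this sketch `object` and `array` are stand-ins for PdfObject and PdfObjectArray, and `isNil` is a hypothetical check that only handles the one stand-in type.

```go
package main

import "fmt"

// object and array are stand-ins for PdfObject and PdfObjectArray.
type object interface{ String() string }

type array struct{}

func (a *array) String() string { return "array" }

// isNil reports whether val is nil, including a nil *array wrapped in the
// interface. A full version would list every concrete pointer type.
func isNil(val object) bool {
	if val == nil {
		return true
	}
	switch t := val.(type) {
	case *array:
		return t == nil
	}
	return false
}

func main() {
	var arr *array // typed nil pointer
	var obj object = arr

	fmt.Println(obj == nil) // false: the interface still carries the type *array
	fmt.Println(isNil(obj)) // true: the type switch sees the nil pointer
}
```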

func (*PdfObjectDictionary) String

func (d *PdfObjectDictionary) String() string

String returns a string describing `d`.

func (*PdfObjectDictionary) WriteString

func (d *PdfObjectDictionary) WriteString() string

WriteString outputs the object as it is to be written to file.

type PdfObjectFloat

type PdfObjectFloat float64

PdfObjectFloat represents the primitive PDF floating point numerical object.

func GetFloat

func GetFloat(obj PdfObject) (fo *PdfObjectFloat, found bool)

GetFloat returns the *PdfObjectFloat represented by the PdfObject directly or indirectly within an indirect object. On type mismatch the found bool flag is false and a nil pointer is returned.

func MakeFloat

func MakeFloat(val float64) *PdfObjectFloat

MakeFloat creates a PdfObjectFloat from a float64.

func (*PdfObjectFloat) String

func (float *PdfObjectFloat) String() string

func (*PdfObjectFloat) WriteString

func (float *PdfObjectFloat) WriteString() string

WriteString outputs the object as it is to be written to file.

type PdfObjectInteger

type PdfObjectInteger int64

PdfObjectInteger represents the primitive PDF integer numerical object.

func GetInt

func GetInt(obj PdfObject) (into *PdfObjectInteger, found bool)

GetInt returns the *PdfObjectInteger object that is represented by a PdfObject either directly or indirectly within an indirect object. The bool flag indicates whether a match was found.

func MakeInteger

func MakeInteger(val int64) *PdfObjectInteger

MakeInteger creates a PdfObjectInteger from an int64.

func (*PdfObjectInteger) String

func (int *PdfObjectInteger) String() string

func (*PdfObjectInteger) WriteString

func (int *PdfObjectInteger) WriteString() string

WriteString outputs the object as it is to be written to file.

type PdfObjectName

type PdfObjectName string

PdfObjectName represents the primitive PDF name object.

func GetName

func GetName(obj PdfObject) (name *PdfObjectName, found bool)

GetName returns the *PdfObjectName represented by the PdfObject directly or indirectly within an indirect object. On type mismatch the found bool flag is false and a nil pointer is returned.

func MakeName

func MakeName(s string) *PdfObjectName

MakeName creates a PdfObjectName from a string.

func (*PdfObjectName) String

func (name *PdfObjectName) String() string

String returns a string representation of `name`.

func (*PdfObjectName) WriteString

func (name *PdfObjectName) WriteString() string

WriteString outputs the object as it is to be written to file.

type PdfObjectNull

type PdfObjectNull struct{}

PdfObjectNull represents the primitive PDF null object.

func MakeNull

func MakeNull() *PdfObjectNull

MakeNull creates a PdfObjectNull.

func (*PdfObjectNull) String

func (null *PdfObjectNull) String() string

String returns a string describing `null`.

func (*PdfObjectNull) WriteString

func (null *PdfObjectNull) WriteString() string

WriteString outputs the object as it is to be written to file.

type PdfObjectReference

type PdfObjectReference struct {
	ObjectNumber     int64
	GenerationNumber int64
	// contains filtered or unexported fields
}

PdfObjectReference represents the primitive PDF reference object.

func (*PdfObjectReference) GetParser

func (ref *PdfObjectReference) GetParser() *PdfParser

GetParser returns the parser for lazy-loading or comparing references.

func (*PdfObjectReference) Resolve

func (ref *PdfObjectReference) Resolve() PdfObject

Resolve resolves the reference and returns the indirect or stream object. If the reference cannot be resolved, a *PdfObjectNull object is returned.

func (*PdfObjectReference) String

func (ref *PdfObjectReference) String() string

String returns a string describing `ref`.

func (*PdfObjectReference) WriteString

func (ref *PdfObjectReference) WriteString() string

WriteString outputs the object as it is to be written to file.

type PdfObjectStream

type PdfObjectStream struct {
	PdfObjectReference
	*PdfObjectDictionary
	Stream []byte
}

PdfObjectStream represents the primitive PDF Object stream.

func GetStream

func GetStream(obj PdfObject) (stream *PdfObjectStream, found bool)

GetStream returns the *PdfObjectStream represented by the PdfObject. On type mismatch the found bool flag is false and a nil pointer is returned.

func MakeStream

func MakeStream(contents []byte, encoder StreamEncoder) (*PdfObjectStream, error)

MakeStream creates a PdfObjectStream with the specified contents and encoder. If encoder is nil, raw encoding is used (i.e. no encoding is applied).

func (*PdfObjectStream) String

func (stream *PdfObjectStream) String() string

String returns a string describing `stream`.

func (*PdfObjectStream) WriteString

func (stream *PdfObjectStream) WriteString() string

WriteString outputs the object as it is to be written to file.

type PdfObjectStreams

type PdfObjectStreams struct {
	PdfObjectReference
	// contains filtered or unexported fields
}

PdfObjectStreams represents the primitive PDF object streams. 7.5.7 Object Streams (page 45).

func GetObjectStreams

func GetObjectStreams(obj PdfObject) (objStream *PdfObjectStreams, found bool)

GetObjectStreams returns the *PdfObjectStreams represented by the PdfObject. On type mismatch the found bool flag is false and a nil pointer is returned.

func MakeObjectStreams

func MakeObjectStreams(objects ...PdfObject) *PdfObjectStreams

MakeObjectStreams creates a PdfObjectStreams from a list of PdfObjects.

func (*PdfObjectStreams) Append

func (streams *PdfObjectStreams) Append(objects ...PdfObject)

Append appends PdfObject(s) to the streams.

func (*PdfObjectStreams) Elements

func (streams *PdfObjectStreams) Elements() []PdfObject

Elements returns a slice of the PdfObject elements in the array. Preferred over accessing the array directly as type may be changed in future major versions (v3).

func (*PdfObjectStreams) Len

func (streams *PdfObjectStreams) Len() int

Len returns the number of elements in the streams.

func (*PdfObjectStreams) Set

func (streams *PdfObjectStreams) Set(i int, obj PdfObject) error

Set sets the PdfObject at index i of the streams. An error is returned if the index is outside bounds.

func (*PdfObjectStreams) String

func (streams *PdfObjectStreams) String() string

String returns a string describing `streams`.

func (*PdfObjectStreams) WriteString

func (streams *PdfObjectStreams) WriteString() string

WriteString outputs the object as it is to be written to file.

type PdfObjectString

type PdfObjectString struct {
	// contains filtered or unexported fields
}

PdfObjectString represents the primitive PDF string object.

func GetString

func GetString(obj PdfObject) (so *PdfObjectString, found bool)

GetString returns the *PdfObjectString represented by the PdfObject directly or indirectly within an indirect object. On type mismatch the found bool flag is false and a nil pointer is returned.

func MakeEncodedString

func MakeEncodedString(s string, utf16BE bool) *PdfObjectString

MakeEncodedString creates a PdfObjectString with encoded content, which can be either UTF-16BE or PDFDocEncoding depending on whether `utf16BE` is true or false respectively.
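For the UTF-16BE case, the byte layout can be sketched with the standard library. The sketch assumes the encoded content starts with the 0xFE, 0xFF byte order mark, which matches the detection rule documented for Decoded. `encodeUTF16BE` is a hypothetical helper, and PDFDocEncoding is not covered.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// encodeUTF16BE is a hypothetical helper that encodes s as big-endian
// UTF-16 with a leading byte order mark (an assumption of this sketch).
func encodeUTF16BE(s string) []byte {
	units := utf16.Encode([]rune(s))
	out := make([]byte, 0, 2+2*len(units))
	out = append(out, 0xFE, 0xFF) // byte order mark
	for _, u := range units {
		// High byte first, then low byte.
		out = append(out, byte(u>>8), byte(u))
	}
	return out
}

func main() {
	fmt.Printf("% X\n", encodeUTF16BE("Hi"))
}
```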

func MakeHexString

func MakeHexString(s string) *PdfObjectString

MakeHexString creates a PdfObjectString from a string intended for output as a hexadecimal string.

func MakeString

func MakeString(s string) *PdfObjectString

MakeString creates a PdfObjectString from a string. NOTE: PDF does not use utf-8 string encoding like Go so `s` will often not be a utf-8 encoded string.

func MakeStringFromBytes

func MakeStringFromBytes(data []byte) *PdfObjectString

MakeStringFromBytes creates a PdfObjectString from a byte array. This is more natural than MakeString as `data` is usually not utf-8 encoded.

func (*PdfObjectString) Bytes

func (str *PdfObjectString) Bytes() []byte

Bytes returns the PdfObjectString content as a []byte array.

func (*PdfObjectString) Decoded

func (str *PdfObjectString) Decoded() string

Decoded returns the PDFDocEncoding or UTF-16BE decoded string contents. UTF-16BE is applied when the first two bytes are 0xFE, 0xFF, otherwise decoding of PDFDocEncoding is performed.
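The detection rule can be sketched as follows. `decodeTextString` is a hypothetical helper. The UTF-16BE branch uses the standard library, while the fallback branch is a deliberate simplification that maps each byte to the code point of the same value, which agrees with PDFDocEncoding for ASCII but not for every byte.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// decodeTextString is a hypothetical helper, not the package's Decoded.
func decodeTextString(b []byte) string {
	// A leading 0xFE, 0xFF byte order mark selects UTF-16BE.
	if len(b) >= 2 && b[0] == 0xFE && b[1] == 0xFF {
		b = b[2:]
		units := make([]uint16, 0, len(b)/2)
		for i := 0; i+1 < len(b); i += 2 {
			units = append(units, uint16(b[i])<<8|uint16(b[i+1]))
		}
		return string(utf16.Decode(units))
	}
	// Simplified fallback (assumption): one byte per code point. A real
	// PDFDocEncoding decoder needs a lookup table for some byte values.
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return string(runes)
}

func main() {
	fmt.Println(decodeTextString([]byte{0xFE, 0xFF, 0x00, 'H', 0x00, 'i'}))
	fmt.Println(decodeTextString([]byte("Hi")))
}
```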

func (*PdfObjectString) Str

func (str *PdfObjectString) Str() string

Str returns the string value of the PdfObjectString. Defined in addition to String() function to clarify that this function returns the underlying string directly, whereas the String function technically could include debug info.

func (*PdfObjectString) String

func (str *PdfObjectString) String() string

String returns a string representation of the *PdfObjectString.

func (*PdfObjectString) WriteString

func (str *PdfObjectString) WriteString() string

WriteString outputs the object as it is to be written to file.

type PdfParser

type PdfParser struct {
	ObjCache objectCache
	// contains filtered or unexported fields
}

PdfParser parses a PDF file and provides access to the object structure of the PDF.

func NewParser

func NewParser(rs io.ReadSeeker) (*PdfParser, error)

NewParser creates a new parser for a PDF file via ReadSeeker. Loads the cross reference stream and trailer. An error is returned on failure.

func NewParserFromString

func NewParserFromString(txt string) *PdfParser

NewParserFromString is used for testing purposes.

func (*PdfParser) CheckAccessRights

func (parser *PdfParser) CheckAccessRights(password []byte) (bool, security.Permissions, error)

CheckAccessRights checks access rights and permissions for a specified password. If either user/owner password is specified, full rights are granted, otherwise the access rights are specified by the Permissions flag.

The bool flag indicates that the user can access and view the file. The returned security.Permissions value shows what access the user has for editing etc. An error is returned if there was a problem performing the authentication.

func (*PdfParser) Decrypt

func (parser *PdfParser) Decrypt(password []byte) (bool, error)

Decrypt attempts to decrypt the PDF file with a specified password. Also tries to decrypt with an empty password. Returns true if successful, false otherwise. An error is returned when there is a problem with decrypting.

func (*PdfParser) GetCrypter

func (parser *PdfParser) GetCrypter() *PdfCrypt

GetCrypter returns the PdfCrypt instance which has information about the PDF's encryption.

func (*PdfParser) GetFileOffset

func (parser *PdfParser) GetFileOffset() int64

GetFileOffset returns the current file offset, accounting for buffered position.

func (*PdfParser) GetObjectNums

func (parser *PdfParser) GetObjectNums() []int

GetObjectNums returns a sorted list of object numbers of the PDF objects in the file.

func (*PdfParser) GetTrailer

func (parser *PdfParser) GetTrailer() *PdfObjectDictionary

GetTrailer returns the PDF's trailer dictionary. The trailer dictionary is typically the starting point for a PDF, referencing other key objects that are important in the document structure.

func (*PdfParser) GetXrefOffset

func (parser *PdfParser) GetXrefOffset() int64

GetXrefOffset returns the offset of the xref table.

func (*PdfParser) GetXrefTable

func (parser *PdfParser) GetXrefTable() XrefTable

GetXrefTable returns the PDF's xref table.

func (*PdfParser) GetXrefType

func (parser *PdfParser) GetXrefType() *xrefType

GetXrefType returns the type of the first xref object (table or stream).

func (*PdfParser) Inspect

func (parser *PdfParser) Inspect() (map[string]int, error)

Inspect analyzes the document object structure. Returns a map of object types (by name) with the instance count as value.

func (*PdfParser) IsAuthenticated

func (parser *PdfParser) IsAuthenticated() bool

IsAuthenticated returns true if the PDF has already been authenticated for accessing.

func (*PdfParser) IsEncrypted

func (parser *PdfParser) IsEncrypted() (bool, error)

IsEncrypted checks if the document is encrypted. A bool flag is returned indicating the result. On the first call, it checks whether the Encrypt dictionary is accessible through the trailer dictionary. If the document is encrypted, it prepares a crypt data structure which can be used to authenticate and decrypt the document. On failure, an error is returned.

func (*PdfParser) LookupByNumber

func (parser *PdfParser) LookupByNumber(objNumber int) (PdfObject, error)

LookupByNumber looks up a PdfObject by object number. Returns an error on failure.

func (*PdfParser) LookupByReference

func (parser *PdfParser) LookupByReference(ref PdfObjectReference) (PdfObject, error)

LookupByReference looks up a PdfObject by a reference.

func (*PdfParser) ParseDict

func (parser *PdfParser) ParseDict() (*PdfObjectDictionary, error)

ParseDict reads and parses a PDF dictionary object enclosed with '<<' and '>>'

func (*PdfParser) ParseIndirectObject

func (parser *PdfParser) ParseIndirectObject() (PdfObject, error)

ParseIndirectObject parses an indirect object from the input stream. Can also be an object stream. Returns the indirect object (*PdfIndirectObject) or the stream object (*PdfObjectStream).

func (*PdfParser) PdfVersion

func (parser *PdfParser) PdfVersion() Version

PdfVersion returns version of the PDF file.

func (*PdfParser) ReadAtLeast

func (parser *PdfParser) ReadAtLeast(p []byte, n int) (int, error)

ReadAtLeast reads at least n bytes into slice p. Returns the number of bytes read (should always be == n), and an error on failure.

func (*PdfParser) ReadBytesAt

func (parser *PdfParser) ReadBytesAt(offset, len int64) ([]byte, error)

ReadBytesAt reads byte content at specific offset and length within the PDF.

func (*PdfParser) Resolve

func (parser *PdfParser) Resolve(obj PdfObject) (PdfObject, error)

Resolve resolves a PdfObject to a direct object, looking up and resolving references as needed (unlike TraceToDirectObject).

func (*PdfParser) SetFileOffset

func (parser *PdfParser) SetFileOffset(offset int64)

SetFileOffset sets the file to an offset position and resets buffer.

type RawEncoder

type RawEncoder struct{}

RawEncoder implements Raw encoder/decoder (no encoding, pass through)

func NewRawEncoder

func NewRawEncoder() *RawEncoder

NewRawEncoder returns a new instance of RawEncoder.

func (*RawEncoder) DecodeBytes

func (enc *RawEncoder) DecodeBytes(encoded []byte) ([]byte, error)

DecodeBytes returns the passed in slice of bytes. The purpose of the method is to satisfy the StreamEncoder interface.

func (*RawEncoder) DecodeStream

func (enc *RawEncoder) DecodeStream(streamObj *PdfObjectStream) ([]byte, error)

DecodeStream returns the passed in stream as a slice of bytes. The purpose of the method is to satisfy the StreamEncoder interface.

func (*RawEncoder) EncodeBytes

func (enc *RawEncoder) EncodeBytes(data []byte) ([]byte, error)

EncodeBytes returns the passed in slice of bytes. The purpose of the method is to satisfy the StreamEncoder interface.

func (*RawEncoder) GetFilterName

func (enc *RawEncoder) GetFilterName() string

GetFilterName returns the name of the encoding filter.

func (*RawEncoder) MakeDecodeParams

func (enc *RawEncoder) MakeDecodeParams() PdfObject

MakeDecodeParams makes a new instance of an encoding dictionary based on the current encoder settings.

func (*RawEncoder) MakeStreamDict

func (enc *RawEncoder) MakeStreamDict() *PdfObjectDictionary

MakeStreamDict makes a new instance of an encoding dictionary for a stream object.

func (*RawEncoder) UpdateParams

func (enc *RawEncoder) UpdateParams(params *PdfObjectDictionary)

UpdateParams updates the parameter values of the encoder.

type RunLengthEncoder

type RunLengthEncoder struct {
}

RunLengthEncoder represents Run length encoding.

func NewRunLengthEncoder

func NewRunLengthEncoder() *RunLengthEncoder

NewRunLengthEncoder makes a new run length encoder

func (*RunLengthEncoder) DecodeBytes

func (enc *RunLengthEncoder) DecodeBytes(encoded []byte) ([]byte, error)

DecodeBytes decodes a byte slice from Run length encoding.

7.4.5 RunLengthDecode Filter

The RunLengthDecode filter decodes data that has been encoded in a simple byte-oriented format based on run length. The encoded data shall be a sequence of runs, where each run shall consist of a length byte followed by 1 to 128 bytes of data. If the length byte is in the range 0 to 127, the following length + 1 (1 to 128) bytes shall be copied literally during decompression. If length is in the range 129 to 255, the following single byte shall be copied 257 - length (2 to 128) times during decompression. A length value of 128 shall denote EOD.
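The run rules quoted above can be sketched with the standard library only. This is a minimal illustration and not the package's implementation: `decodeRunLength` is a hypothetical helper, and both the error handling for truncated input and the acceptance of input without an EOD marker are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// decodeRunLength is a hypothetical helper that applies the RunLengthDecode
// rules from section 7.4.5. It is not part of package core.
func decodeRunLength(encoded []byte) ([]byte, error) {
	var out []byte
	for i := 0; i < len(encoded); {
		length := int(encoded[i])
		i++
		switch {
		case length == 128:
			// EOD marker: stop decoding.
			return out, nil
		case length < 128:
			// Copy the next length+1 bytes literally.
			n := length + 1
			if i+n > len(encoded) {
				return nil, errors.New("run length: truncated literal run")
			}
			out = append(out, encoded[i:i+n]...)
			i += n
		default:
			// Repeat the next single byte 257-length times.
			if i >= len(encoded) {
				return nil, errors.New("run length: missing byte to repeat")
			}
			for j := 0; j < 257-length; j++ {
				out = append(out, encoded[i])
			}
			i++
		}
	}
	// Assumption: input without an explicit EOD marker is accepted as-is.
	return out, nil
}

func main() {
	// 2 => copy 3 literal bytes; 254 => repeat 'x' 3 times; 128 => EOD.
	data := []byte{2, 'a', 'b', 'c', 254, 'x', 128}
	decoded, err := decodeRunLength(data)
	fmt.Println(string(decoded), err) // abcxxx <nil>
}
```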

func (*RunLengthEncoder) DecodeStream

func (enc *RunLengthEncoder) DecodeStream(streamObj *PdfObjectStream) ([]byte, error)

DecodeStream decodes a run-length-encoded stream object and returns the decoded bytes.

func (*RunLengthEncoder) EncodeBytes

func (enc *RunLengthEncoder) EncodeBytes(data []byte) ([]byte, error)

EncodeBytes encodes a byte slice and returns the encoded value based on the encoder parameters.

func (*RunLengthEncoder) GetFilterName

func (enc *RunLengthEncoder) GetFilterName() string

GetFilterName returns the name of the encoding filter.

func (*RunLengthEncoder) MakeDecodeParams

func (enc *RunLengthEncoder) MakeDecodeParams() PdfObject

MakeDecodeParams makes a new instance of an encoding dictionary based on the current encoder settings.

func (*RunLengthEncoder) MakeStreamDict

func (enc *RunLengthEncoder) MakeStreamDict() *PdfObjectDictionary

MakeStreamDict makes a new instance of an encoding dictionary for a stream object.

func (*RunLengthEncoder) UpdateParams

func (enc *RunLengthEncoder) UpdateParams(params *PdfObjectDictionary)

UpdateParams updates the parameter values of the encoder.

type StreamEncoder

type StreamEncoder interface {
	GetFilterName() string
	MakeDecodeParams() PdfObject
	MakeStreamDict() *PdfObjectDictionary
	UpdateParams(params *PdfObjectDictionary)

	EncodeBytes(data []byte) ([]byte, error)
	DecodeBytes(encoded []byte) ([]byte, error)
	DecodeStream(streamObj *PdfObjectStream) ([]byte, error)
}

StreamEncoder represents the interface for all PDF stream encoders.

func NewEncoderFromStream

func NewEncoderFromStream(streamObj *PdfObjectStream) (StreamEncoder, error)

NewEncoderFromStream creates a StreamEncoder based on the stream's dictionary.

type Version

type Version struct {
	Major int
	Minor int
}

Version represents a version of a PDF standard.

func (Version) String

func (v Version) String() string

String returns the PDF version as a string. Implements interface fmt.Stringer.

type XrefObject

type XrefObject struct {
	XType        xrefType
	ObjectNumber int
	Generation   int
	// For normal xrefs (defined by OFFSET)
	Offset int64
	// For xrefs to object streams.
	OsObjNumber int
	OsObjIndex  int
}

XrefObject defines a cross reference entry which is a map between object number (with generation number) and the location of the actual object, either as a file offset (xref table entry), or as a location within an xref stream object (xref object stream).
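The two kinds of entry can be illustrated with a local mirror of the struct. The lower-case type and field names below are stand-ins chosen for this sketch, and `locate` is a hypothetical helper that only describes where an object would be found.

```go
package main

import "fmt"

// xrefKind and xrefEntry are local stand-ins that mirror the documented
// xrefType and XrefObject for illustration purposes.
type xrefKind int

const (
	tableEntry xrefKind = iota
	objectStreamEntry
)

type xrefEntry struct {
	kind         xrefKind
	objectNumber int
	offset       int64 // file offset, used by table entries
	osObjNumber  int   // object number of the containing object stream
	osObjIndex   int   // index of the object within that stream
}

// locate is a hypothetical helper describing where an entry points.
func locate(e xrefEntry) string {
	switch e.kind {
	case tableEntry:
		return fmt.Sprintf("object %d at file offset %d", e.objectNumber, e.offset)
	case objectStreamEntry:
		return fmt.Sprintf("object %d at index %d of object stream %d",
			e.objectNumber, e.osObjIndex, e.osObjNumber)
	default:
		return "unknown xref type"
	}
}

func main() {
	fmt.Println(locate(xrefEntry{kind: tableEntry, objectNumber: 4, offset: 1024}))
	fmt.Println(locate(xrefEntry{kind: objectStreamEntry, objectNumber: 7, osObjNumber: 12, osObjIndex: 2}))
}
```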

type XrefTable

type XrefTable struct {
	ObjectMap map[int]XrefObject // Maps object number to XrefObject
	// contains filtered or unexported fields
}

XrefTable represents the cross references in a PDF, i.e. the table of objects and information on where to access them within the PDF file.
