pdfcpu

package
v0.4.2
Published: Mar 15, 2023 License: Apache-2.0 Imports: 38 Imported by: 0

Documentation

Overview

Package pdfcpu is a PDF processing library written in Go with support for encryption. It provides an API and a command line interface. All PDF versions up to PDF 1.7 (ISO-32000) are supported.

The commands are:

annotations   list, remove page annotations
attachments   list, add, remove, extract embedded file attachments
booklet       arrange pages onto larger sheets of paper to make a booklet or zine
boxes         list, add, remove page boundaries for selected pages
changeopw     change owner password
changeupw     change user password
collect       create custom sequence of selected pages
config        print configuration
create        create PDF content including forms via JSON
crop          set cropbox for selected pages
decrypt       remove password protection
encrypt       set password protection
extract       extract images, fonts, content, pages or metadata
fonts         install, list supported fonts, create cheat sheets
form          list, remove fields, lock, unlock, reset, export, fill form via JSON or CSV
grid          rearrange pages or images for enhanced browsing experience
images        list images for selected pages
import        import/convert images to PDF
info          print file info
keywords      list, add, remove keywords
merge         concatenate PDFs
nup           rearrange pages or images for reduced number of pages
optimize      optimize PDF by getting rid of redundant page resources
pages         insert, remove selected pages
paper         print list of supported paper sizes
permissions   list, set user access permissions
portfolio     list, add, remove, extract portfolio entries with optional description
properties    list, add, remove document properties
resize        scale selected pages
rotate        rotate selected pages
selectedpages print definition of the -pages flag
split         split up a PDF by span or bookmark
stamp         add, remove, update Unicode text, image or PDF stamps for selected pages
trim          create trimmed version of selected pages
validate      validate PDF against PDF 32000-1:2008 (PDF 1.7)
version       print version
watermark     add, remove, update Unicode text, image or PDF watermarks for selected pages

Constants

const (

	// ObjectStreamMaxObjects limits the number of objects within an object stream written.
	ObjectStreamMaxObjects = 100
)
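To picture what a per-stream object limit implies, here is a minimal sketch that splits a list of object numbers into groups of at most 100. It is not pdfcpu's writer code: `chunkObjNrs` and the local constant `objectStreamMaxObjects` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// objectStreamMaxObjects is a local copy of the documented limit,
// declared here only so the sketch is self-contained.
const objectStreamMaxObjects = 100

// chunkObjNrs is a hypothetical helper that splits objNrs into
// consecutive groups holding at most limit entries each.
func chunkObjNrs(objNrs []int, limit int) [][]int {
	if limit < 1 {
		return nil
	}
	var chunks [][]int
	for len(objNrs) > 0 {
		n := limit
		if len(objNrs) < n {
			n = len(objNrs)
		}
		chunks = append(chunks, objNrs[:n])
		objNrs = objNrs[n:]
	}
	return chunks
}

func main() {
	objNrs := make([]int, 250)
	for i := range objNrs {
		objNrs[i] = i + 1
	}
	for i, c := range chunkObjNrs(objNrs, objectStreamMaxObjects) {
		fmt.Printf("object stream %d: %d objects\n", i, len(c))
	}
}
```

With 250 objects this yields two full groups and one partial group.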

Variables

var (
	ErrUnknownEncryption = errors.New("pdfcpu: PDF 2.0 encryption not supported")
)
var (
	ErrUnsupported16BPC = errors.New("unsupported 16 bits per component")
)

Errors to be identified.

var (
	ErrWrongPassword = errors.New("pdfcpu: please provide the correct password")
)

Functions

func AddAnnotation

func AddAnnotation(
	ctx *model.Context,
	pageDictIndRef *types.IndirectRef,
	pageDict types.Dict,
	pageNr int,
	ar model.AnnotationRenderer,
	incr bool) (bool, error)

AddAnnotation adds ar to pageDict.

func AddAnnotations

func AddAnnotations(ctx *model.Context, selectedPages types.IntSet, ar model.AnnotationRenderer, incr bool) (bool, error)

AddAnnotations adds ar to selected pages.

func AddAnnotationsMap

func AddAnnotationsMap(ctx *model.Context, m map[int][]model.AnnotationRenderer, incr bool) (bool, error)

AddAnnotationsMap adds annotations in m to corresponding pages.

func AddBookmarks

func AddBookmarks(ctx *model.Context, bms []Bookmark) error

AddBookmarks adds bms to ctx.

func AddPageTreeWithSamplePage

func AddPageTreeWithSamplePage(xRefTable *model.XRefTable, rootDict types.Dict, p model.Page) error

func AddPages

func AddPages(ctxSrc, ctxDest *model.Context, pageNrs []int, usePgCache bool) error

AddPages adds pages and corresponding resources from ctxSrc to ctxDest.

func AddWatermarks

func AddWatermarks(ctx *model.Context, selectedPages types.IntSet, wm *model.Watermark) error

AddWatermarks adds watermarks to all pages selected.

func AddWatermarksMap

func AddWatermarksMap(ctx *model.Context, m map[int]*model.Watermark) error

AddWatermarksMap adds watermarks in m to corresponding pages.

func AddWatermarksSliceMap

func AddWatermarksSliceMap(ctx *model.Context, m map[int][]*model.Watermark) error

AddWatermarksSliceMap adds watermarks in m to corresponding pages.

func Annotation

func Annotation(xRefTable *model.XRefTable, d types.Dict) (model.AnnotationRenderer, error)

Annotation returns an annotation renderer. Validation sets up a cache of annotation renderers.

func AppendStatsFile

func AppendStatsFile(ctx *model.Context) error

AppendStatsFile appends a stats line for ctx's xRefTable to the configured CSV file.

func BookletFromImages

func BookletFromImages(ctx *model.Context, fileNames []string, nup *model.NUp, pagesDict types.Dict, pagesIndRef *types.IndirectRef) error

BookletFromImages creates a booklet version of the image sequence represented by fileNames.

func BookletFromPDF

func BookletFromPDF(ctx *model.Context, selectedPages types.IntSet, nup *model.NUp) error

BookletFromPDF creates a booklet version of the PDF represented by ctx.

func CachedAnnotationObjNrs

func CachedAnnotationObjNrs(ctx *model.Context) ([]int, error)

CachedAnnotationObjNrs returns a list of object numbers representing known annotation dict indirect references.

func CollectPages

func CollectPages(ctx *model.Context, collectedPages []int) (*model.Context, error)

CollectPages creates a new PDF Context for a custom PDF page sequence of the PDF represented by ctx.

func ColorSpaceComponents

func ColorSpaceComponents(xRefTable *model.XRefTable, sd *types.StreamDict) (int, error)

ColorSpaceComponents returns the corresponding number of used color components for sd's colorspace.

func ColorSpaceString

func ColorSpaceString(ctx *model.Context, sd *types.StreamDict) (string, error)

ColorSpaceString returns a string representation for sd's colorspace.

func CreateAcroFormDemoXRef

func CreateAcroFormDemoXRef() (*model.XRefTable, error)

CreateAcroFormDemoXRef creates an xRefTable with an AcroForm example.

func CreateAnnotationDemoXRef

func CreateAnnotationDemoXRef() (*model.XRefTable, error)

CreateAnnotationDemoXRef creates a PDF file with examples of annotations and actions.

func CreateContext

func CreateContext(xRefTable *model.XRefTable, conf *model.Configuration) *model.Context

CreateContext creates a Context for given cross reference table and configuration.

func CreateContextWithXRefTable

func CreateContextWithXRefTable(conf *model.Configuration, pageDim *types.Dim) (*model.Context, error)

CreateContextWithXRefTable creates a Context with an xRefTable without pages for given configuration.

func CreateDemoXRef

func CreateDemoXRef() (*model.XRefTable, error)

CreateDemoXRef creates a minimal single page PDF file for demo purposes.

func CreateResourceDictInheritanceDemoXRef

func CreateResourceDictInheritanceDemoXRef() (*model.XRefTable, error)

CreateResourceDictInheritanceDemoXRef creates a page tree for testing resource dict inheritance.

func CreateTestPageContent

func CreateTestPageContent(p model.Page)

CreateTestPageContent draws a test grid.

func CreateXRefTableWithRootDict

func CreateXRefTableWithRootDict() (*model.XRefTable, error)

func DefaultBookletConfig

func DefaultBookletConfig() *model.NUp

DefaultBookletConfig returns the default configuration for a booklet.

func DetectPageTreeWatermarks

func DetectPageTreeWatermarks(ctx *model.Context) error

DetectPageTreeWatermarks checks the page tree of ctx for watermarks and records the result to xRefTable.Watermarked.

func DetectWatermarks

func DetectWatermarks(ctx *model.Context) error

DetectWatermarks checks ctx for watermarks and records the result to xRefTable.Watermarked.

func ExtractImage

func ExtractImage(ctx *model.Context, sd *types.StreamDict, thumb bool, resourceId string, objNr int, stub bool) (*model.Image, error)

ExtractImage extracts an image from sd.

func ExtractPage

func ExtractPage(ctx *model.Context, pageNr int) (*model.Context, error)

ExtractPage extracts pageNr into a new single page context.

func ExtractPageContent

func ExtractPageContent(ctx *model.Context, pageNr int) (io.Reader, error)

ExtractPageContent extracts the consolidated page content stream for pageNr.

func ExtractPageImages

func ExtractPageImages(ctx *model.Context, pageNr int, stub bool) (map[int]model.Image, error)

ExtractPageImages extracts all images used by pageNr. Optionally return stubs only.

func ExtractPages

func ExtractPages(ctx *model.Context, pageNrs []int, usePgCache bool) (*model.Context, error)

ExtractPages extracts pageNrs into a new context.

func FontObjNrs

func FontObjNrs(ctx *model.Context, pageNr int) []int

FontObjNrs returns all font dict objNrs for pageNr. Requires an optimized context.

func ImageBookletConfig

func ImageBookletConfig(val int, desc string) (*model.NUp, error)

ImageBookletConfig returns an NUp configuration for booklet-ing image files.

func ImageGridConfig

func ImageGridConfig(rows, cols int, desc string) (*model.NUp, error)

ImageGridConfig returns a grid configuration for NUp-ing image files.

func ImageNUpConfig

func ImageNUpConfig(val int, desc string) (*model.NUp, error)

ImageNUpConfig returns an NUp configuration for NUp-ing image files.

func ImageObjNrs

func ImageObjNrs(ctx *model.Context, pageNr int) []int

ImageObjNrs returns all image dict objNrs for pageNr. Requires an optimized context.

func InfoDigest

func InfoDigest(ctx *model.Context, selectedPages types.IntSet) ([]string, error)

InfoDigest returns info about ctx.

func KeywordsAdd

func KeywordsAdd(xRefTable *model.XRefTable, keywords []string) error

KeywordsAdd adds keywords to the document info dict.

func KeywordsList

func KeywordsList(xRefTable *model.XRefTable) ([]string, error)

KeywordsList returns a list of keywords as recorded in the document info dict.

func KeywordsRemove

func KeywordsRemove(xRefTable *model.XRefTable, keywords []string) (bool, error)

KeywordsRemove deletes keywords from the document info dict. Returns true if at least one keyword was removed.

func ListAnnotations

func ListAnnotations(ctx *model.Context, selectedPages types.IntSet) (int, []string, error)

ListAnnotations returns a formatted list of annotations for selected pages.

func ListImages

func ListImages(ctx *model.Context, selectedPages types.IntSet) ([]string, error)

ListImages returns a list of embedded images.

func MergeXRefTables

func MergeXRefTables(ctxSource, ctxDest *model.Context) (err error)

MergeXRefTables merges Context ctxSource into ctxDest by appending its page tree.

func NUpFromMultipleImages

func NUpFromMultipleImages(ctx *model.Context, fileNames []string, nup *model.NUp, pagesDict types.Dict, pagesIndRef *types.IndirectRef) error

NUpFromMultipleImages creates pages in NUp-style rendering each image once.

func NUpFromOneImage

func NUpFromOneImage(ctx *model.Context, fileName string, nup *model.NUp, pagesDict types.Dict, pagesIndRef *types.IndirectRef) error

NUpFromOneImage creates one page with instances of one image.

func NUpFromPDF

func NUpFromPDF(ctx *model.Context, selectedPages types.IntSet, nup *model.NUp) error

NUpFromPDF creates an n-up version of the PDF represented by ctx.

func NewNUpPageForImage

func NewNUpPageForImage(xRefTable *model.XRefTable, fileName string, parentIndRef *types.IndirectRef, nup *model.NUp) (*types.IndirectRef, error)

NewNUpPageForImage creates a new page dict in xRefTable for given image filename and n-up conf.

func NewPageForImage

func NewPageForImage(xRefTable *model.XRefTable, r io.Reader, parentIndRef *types.IndirectRef, imp *Import) (*types.IndirectRef, error)

NewPageForImage creates a new page dict in xRefTable for given image reader r.

func OptimizeXRefTable

func OptimizeXRefTable(ctx *model.Context) error

OptimizeXRefTable optimizes an xRefTable by locating and getting rid of redundant embedded fonts and images.

func PDFBookletConfig

func PDFBookletConfig(val int, desc string) (*model.NUp, error)

PDFBookletConfig returns an NUp configuration for booklet-ing PDF files.

func PDFGridConfig

func PDFGridConfig(rows, cols int, desc string) (*model.NUp, error)

PDFGridConfig returns a grid configuration for NUp-ing PDF files.

func PDFNUpConfig

func PDFNUpConfig(val int, desc string) (*model.NUp, error)

PDFNUpConfig returns an NUp configuration for NUp-ing PDF files.

func PageObjFromDestinationArray

func PageObjFromDestinationArray(ctx *model.Context, dest types.Object) (*types.IndirectRef, error)

PageObjFromDestinationArray returns an IndirectRef for this destination's page object.

func ParseImageWatermarkDetails

func ParseImageWatermarkDetails(fileName, desc string, onTop bool, u types.DisplayUnit) (*model.Watermark, error)

ParseImageWatermarkDetails parses an image Watermark/Stamp command string into an internal structure.

func ParseNUpDetails

func ParseNUpDetails(s string, nup *model.NUp) error

ParseNUpDetails parses an NUp command string into an internal structure.

func ParseNUpGridDefinition

func ParseNUpGridDefinition(rows, cols int, nUp *model.NUp) error

ParseNUpGridDefinition parses NUp grid dimensions into an internal structure.

func ParseNUpValue

func ParseNUpValue(n int, nUp *model.NUp) error

ParseNUpValue parses the NUp value into an internal structure.

func ParseObject

func ParseObject(ctx *model.Context, offset int64, objNr, genNr int) (types.Object, error)

ParseObject parses an object from file at given offset.

func ParsePDFWatermarkDetails

func ParsePDFWatermarkDetails(fileName, desc string, onTop bool, u types.DisplayUnit) (*model.Watermark, error)

ParsePDFWatermarkDetails parses a PDF Watermark/Stamp command string into an internal structure.

func ParseResizeConfig

func ParseResizeConfig(s string, u types.DisplayUnit) (*model.Resize, error)

ParseResizeConfig parses a Resize command string into an internal structure. A sample command string: "scale:.5, form:A4, dim:400 200 bgcol:#D00000"

func ParseTextWatermarkDetails

func ParseTextWatermarkDetails(text, desc string, onTop bool, u types.DisplayUnit) (*model.Watermark, error)

ParseTextWatermarkDetails parses a text Watermark/Stamp command string into an internal structure.

func Permissions

func Permissions(ctx *model.Context) (list []string)

Permissions returns a list of set permissions.

func PropertiesAdd

func PropertiesAdd(ctx *model.Context, properties map[string]string) error

PropertiesAdd adds properties into the document info dict.

func PropertiesList

func PropertiesList(ctx *model.Context) ([]string, error)

PropertiesList returns a list of document properties as recorded in the document info dict.

func PropertiesRemove

func PropertiesRemove(ctx *model.Context, properties []string) (bool, error)

PropertiesRemove deletes specified properties. Returns true if at least one property was removed.

func Read

func Read(rs io.ReadSeeker, conf *model.Configuration) (*model.Context, error)

Read takes an io.ReadSeeker and generates a Context, an in-memory representation containing a cross reference table.

func ReadFile

func ReadFile(inFile string, conf *model.Configuration) (*model.Context, error)

ReadFile reads a PDF file and builds the Context, an internal structure holding its cross reference table.

func RemoveAnnotations

func RemoveAnnotations(ctx *model.Context, selectedPages types.IntSet, idsAndTypes []string, objNrs []int, incr bool) (bool, error)

RemoveAnnotations removes annotations for selected pages by id, type or object number. All annotations for selected pages are removed if neither idsAndTypes nor objNrs are provided.

func RemoveAnnotationsFromPageDict

func RemoveAnnotationsFromPageDict(
	ctx *model.Context,
	annotTypes []model.AnnotationType,
	ids []string,
	objNrSet types.IntSet,
	pageDict types.Dict,
	pageDictObjNr,
	pageNr int,
	incr bool) (bool, error)

RemoveAnnotationsFromPageDict removes an annotation by annotType, id and obj# from pageDict.

func RemoveWatermarks

func RemoveWatermarks(ctx *model.Context, selectedPages types.IntSet) error

RemoveWatermarks removes watermarks for all pages selected.

func RenderImage

func RenderImage(xRefTable *model.XRefTable, sd *types.StreamDict, thumb bool, resourceName string, objNr int) (io.Reader, string, error)

RenderImage returns a reader for a decoded image stream.

func Resize

func Resize(ctx *model.Context, selectedPages types.IntSet, res *model.Resize) error

func RotatePages

func RotatePages(ctx *model.Context, selectedPages types.IntSet, rotation int) error

RotatePages rotates all selected pages by a multiple of 90 degrees.

func StreamLength

func StreamLength(ctx *model.Context, sd *types.StreamDict) (int64, error)

StreamLength returns sd's stream length.

func Write

func Write(ctx *model.Context) (err error)

Write generates a PDF file for the cross reference table contained in Context.

func WriteImage

func WriteImage(xRefTable *model.XRefTable, fileName string, sd *types.StreamDict, thumb bool, objNr int) (string, error)

WriteImage writes a PDF image object to disk.

func WriteImageToDisk

func WriteImageToDisk(outDir, fileName string) func(model.Image, bool, int) error

WriteImageToDisk returns a closure for writing img to disk.

func WriteIncrement

func WriteIncrement(ctx *model.Context) error

WriteIncrement writes a PDF increment.

func WriteReader

func WriteReader(path string, r io.Reader) error

WriteReader consumes r's content by writing it to a file at path.

Types

type Bookmark

type Bookmark struct {
	Title    string
	PageFrom int
	PageThru int // for extraction only; >= PageFrom, extends up to the page before PageFrom of the next bookmark.
	Bold     bool
	Italic   bool
	Color    *color.SimpleColor
	Children []Bookmark
	Parent   *Bookmark
}

Bookmark represents an outline item tree.
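The PageThru comment describes a range ending just before the next bookmark starts. A minimal sketch of that rule follows, assuming a flat list sorted by PageFrom and a known page count. The local `bookmark` type and `fillPageThru` are hypothetical and do not reproduce how pdfcpu computes the field.

```go
package main

import "fmt"

// bookmark is a trimmed-down local stand-in for the Bookmark struct.
type bookmark struct {
	Title    string
	PageFrom int
	PageThru int
}

// fillPageThru sets PageThru for each entry of a flat list sorted by PageFrom:
// one page before the next bookmark starts, but never below PageFrom.
// The last bookmark runs through pageCount.
func fillPageThru(bms []bookmark, pageCount int) {
	for i := range bms {
		thru := pageCount
		if i+1 < len(bms) {
			thru = bms[i+1].PageFrom - 1
		}
		if thru < bms[i].PageFrom {
			thru = bms[i].PageFrom
		}
		bms[i].PageThru = thru
	}
}

func main() {
	bms := []bookmark{
		{Title: "Intro", PageFrom: 1},
		{Title: "Usage", PageFrom: 4},
		{Title: "Index", PageFrom: 9},
	}
	fillPageThru(bms, 12)
	for _, bm := range bms {
		fmt.Printf("%s: pages %d-%d\n", bm.Title, bm.PageFrom, bm.PageThru)
	}
}
```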

func BookmarksForOutline

func BookmarksForOutline(ctx *model.Context) ([]Bookmark, error)

BookmarksForOutline returns all ctx bookmark information recursively.

func BookmarksForOutlineItem

func BookmarksForOutlineItem(ctx *model.Context, item *types.IndirectRef, parent *Bookmark) ([]Bookmark, error)

BookmarksForOutlineItem returns the bookmarks tree for an outline item.

func (Bookmark) Style

func (bm Bookmark) Style() int

Style returns an int corresponding to the bookmark style.
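The text above only says that Style returns an int. One plausible reading, stated here as an assumption, is that it mirrors the outline item flags of PDF 32000-1, where bit position 1 marks italic and bit position 2 marks bold. The sketch uses a hypothetical local `style` function rather than the method itself.

```go
package main

import "fmt"

// Outline item flag values as defined for PDF outline items:
// bit position 1 = italic, bit position 2 = bold.
const (
	flagItalic = 1 << 0
	flagBold   = 1 << 1
)

// style is a hypothetical stand-in that combines the two booleans
// into a single flags value.
func style(bold, italic bool) int {
	s := 0
	if italic {
		s |= flagItalic
	}
	if bold {
		s |= flagBold
	}
	return s
}

func main() {
	fmt.Println(style(false, false), style(false, true), style(true, false), style(true, true))
}
```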

type Font

type Font struct {
	io.Reader
	Name string
	Type string
}

Font is a Reader representing an embedded font.

func ExtractFont

func ExtractFont(ctx *model.Context, fontObject model.FontObject, objNr int) (*Font, error)

ExtractFont extracts a font from fontObject.

func ExtractFormFonts

func ExtractFormFonts(ctx *model.Context) ([]Font, error)

ExtractFormFonts extracts all form fonts.

func ExtractPageFonts

func ExtractPageFonts(ctx *model.Context, pageNr int) ([]Font, error)

ExtractPageFonts extracts all fonts used by pageNr.

type Import

type Import struct {
	PageDim  *types.Dim        // page dimensions in display unit.
	PageSize string            // one of A0,A1,A2,A3,A4(=default),A5,A6,A7,A8,Letter,Legal,Ledger,Tabloid,Executive,ANSIC,ANSID,ANSIE.
	UserDim  bool              // true if one of dimensions or paperSize provided overriding the default.
	DPI      int               // destination resolution to apply in dots per inch.
	Pos      types.Anchor      // position anchor, one of tl,tc,tr,l,c,r,bl,bc,br,full.
	Dx, Dy   int               // anchor offset.
	Scale    float64           // relative scale factor. 0 <= x <= 1
	ScaleAbs bool              // true for absolute scaling.
	InpUnit  types.DisplayUnit // input display unit.
	Gray     bool              // true for rendering in Gray.
	Sepia    bool
	BgColor  *color.SimpleColor // background color
}

Import represents the command details for the command "ImportImage".

func DefaultImportConfig

func DefaultImportConfig() *Import

DefaultImportConfig returns the default configuration.

func ParseImportDetails

func ParseImportDetails(s string, u types.DisplayUnit) (*Import, error)

ParseImportDetails parses an Import command string into an internal structure.

func (Import) String

func (imp Import) String() string

type Metadata

type Metadata struct {
	io.Reader          // metadata
	ObjNr       int    // metadata dict objNr
	ParentObjNr int    // container object number
	ParentType  string // container dict type
}

Metadata is a Reader representing a metadata dict.

func ExtractMetadata

func ExtractMetadata(ctx *model.Context) ([]Metadata, error)

ExtractMetadata returns all metadata of ctx.

type PDFImage

type PDFImage struct {
	// contains filtered or unexported fields
}

PDFImage represents an XObject of subtype image.

Directories

Path Synopsis
Package validate implements validation against PDF 32000-1:2008.
