cmap

package
v0.4.5

Warning: This package is not in the latest version of its module.
Published: Mar 26, 2024 | License: GPL-3.0 | Imports: 22 | Imported by: 0

Documentation

Overview

Package cmap implements CMap files for embedding in PDF files.

When composite fonts are used in PDF files, glyphs are selected using a two-step process: first, character codes are mapped to character identifiers (CIDs), and then CIDs are mapped to glyph identifiers (GIDs). A CMap file describes the first step of this process, i.e. the mapping from character codes to CIDs.
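
The two-step selection described above can be sketched as follows. This is a minimal illustration using local stand-in types (CID, GID) and a hypothetical helper selectGlyph; none of these names are taken from the package, and plain maps stand in for the CMap and for the font's CID-to-GID table.

```go
package main

import "fmt"

// CID and GID are local stand-ins, not the package's own types.
type CID uint32
type GID uint16

// selectGlyph (hypothetical helper) performs the two-step lookup:
// character code -> CID (the job of the CMap), then CID -> GID
// (the job of the font). It reports false if either step fails.
func selectGlyph(cmap map[string]CID, cidToGID map[CID]GID, code []byte) (GID, bool) {
	cid, ok := cmap[string(code)]
	if !ok {
		return 0, false
	}
	gid, ok := cidToGID[cid]
	return gid, ok
}

func main() {
	cmap := map[string]CID{"\x00\x41": 34} // step 1: code -> CID
	cidToGID := map[CID]GID{34: 7}         // step 2: CID -> GID
	gid, ok := selectGlyph(cmap, cidToGID, []byte{0x00, 0x41})
	fmt.Println(gid, ok)
}
```

Only the first map corresponds to what a CMap file describes; the second belongs to the font.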

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type CIDEncoder

type CIDEncoder interface {
	// AppendEncoded appends the character code for the given glyph ID
	// to the given PDF string (allocating new codes as needed).
	// It also records the fact that the character code corresponds to the
	// given Unicode string.
	AppendEncoded(pdf.String, glyph.ID, []rune) pdf.String

	CodeAndCID(pdf.String, glyph.ID, []rune) (pdf.String, pscid.CID)

	CS() charcode.CodeSpaceRange

	Lookup(c charcode.CharCode) (pscid.CID, bool)

	// CMap returns the mapping from character codes to CID values.
	CMap() *Info

	// ToUnicode returns a PDF ToUnicode CMap.
	ToUnicode() *ToUnicode

	// Subset is the set of all GIDs which have been used with AppendEncoded.
	// The returned slice is sorted and always starts with GID 0.
	Subset() []glyph.ID

	AsText(pdf.String) []rune

	AllCIDs(pdf.String) func(yield func([]byte, pscid.CID) bool) bool
}

CIDEncoder constructs and stores mappings from character codes to CID values and from character codes to Unicode strings.

func NewCIDEncoderIdentity added in v0.3.6

func NewCIDEncoderIdentity(g2c GIDToCID) CIDEncoder

NewCIDEncoderIdentity returns an encoder where two-byte codes are used directly as CID values.
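
As an illustration of what "two-byte codes used directly as CID values" means, the sketch below converts between a CID and a two-byte code. The type CID and the helpers encodeIdentity and decodeIdentity are hypothetical stand-ins, not part of the package, and the big-endian byte order is an assumption based on how PDF's Identity CMaps store codes.

```go
package main

import "fmt"

// CID is a local stand-in, not the package's own type.
type CID uint32

// encodeIdentity (hypothetical) returns the two-byte character code for cid,
// high-order byte first (assumed byte order). It reports false if cid does
// not fit into two bytes.
func encodeIdentity(cid CID) ([]byte, bool) {
	if cid > 0xFFFF {
		return nil, false
	}
	return []byte{byte(cid >> 8), byte(cid)}, true
}

// decodeIdentity (hypothetical) reads a two-byte code back as a CID.
func decodeIdentity(code []byte) (CID, bool) {
	if len(code) != 2 {
		return 0, false
	}
	return CID(code[0])<<8 | CID(code[1]), true
}

func main() {
	code, _ := encodeIdentity(0x1234)
	cid, _ := decodeIdentity(code)
	fmt.Printf("% x %#x\n", code, cid)
}
```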

func NewCIDEncoderUTF8 added in v0.3.6

func NewCIDEncoderUTF8(g2c GIDToCID) CIDEncoder

NewCIDEncoderUTF8 returns an encoder where character codes equal the UTF-8 encoding of the text content, where possible.
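
The idea of using the UTF-8 encoding of the text as the character code might look like the sketch below. The helper utf8Code is hypothetical and not part of the package. The documentation does not say what the encoder does when this is not possible (for example, when two different glyphs carry the same text), so that fallback is deliberately not modelled here.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// utf8Code (hypothetical) returns the UTF-8 encoding of text for use as a
// character code. It reports false when text is empty or contains a rune
// that has no valid UTF-8 encoding (e.g. a surrogate half).
func utf8Code(text []rune) ([]byte, bool) {
	if len(text) == 0 {
		return nil, false
	}
	var code []byte
	buf := make([]byte, utf8.UTFMax)
	for _, r := range text {
		if !utf8.ValidRune(r) {
			return nil, false
		}
		n := utf8.EncodeRune(buf, r)
		code = append(code, buf[:n]...)
	}
	return code, true
}

func main() {
	code, ok := utf8Code([]rune{0xE9}) // U+00E9, "é"
	fmt.Printf("% x %v\n", code, ok)
}
```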

type GIDToCID added in v0.3.5

type GIDToCID interface {
	CID(glyph.ID, []rune) pscid.CID
	GID(pscid.CID) glyph.ID

	ROS() *pscid.SystemInfo

	GIDToCID(numGlyph int) []pscid.CID
}

GIDToCID encodes a mapping from Glyph Identifier (GID) values to Character Identifier (CID) values.

func NewGIDToCIDIdentity added in v0.4.0

func NewGIDToCIDIdentity() GIDToCID

NewGIDToCIDIdentity returns a GIDToCID which uses the GID values directly as CID values.

func NewGIDToCIDSequential added in v0.4.0

func NewGIDToCIDSequential() GIDToCID

NewGIDToCIDSequential returns a GIDToCID which assigns CID values sequentially, starting with 1.
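
Sequential assignment can be pictured with the sketch below: each GID receives the next free CID on first use, starting with 1. The type sequential and its methods are local stand-ins, not the package's implementation. How the package treats GID 0 is not stated in the documentation, so this sketch does not special-case it.

```go
package main

import "fmt"

// GID and CID are local stand-ins, not the package's own types.
type GID uint16
type CID uint32

// sequential (hypothetical) hands out CIDs 1, 2, 3, ... in order of first use.
type sequential struct {
	cidOf map[GID]CID
	gidOf map[CID]GID
}

func newSequential() *sequential {
	return &sequential{cidOf: map[GID]CID{}, gidOf: map[CID]GID{}}
}

// CID returns the CID for gid, allocating the next free value if needed.
func (s *sequential) CID(gid GID) CID {
	if cid, ok := s.cidOf[gid]; ok {
		return cid
	}
	cid := CID(len(s.cidOf) + 1)
	s.cidOf[gid] = cid
	s.gidOf[cid] = gid
	return cid
}

// GID returns the glyph assigned to cid, or 0 if cid was never allocated.
func (s *sequential) GID(cid CID) GID { return s.gidOf[cid] }

func main() {
	s := newSequential()
	fmt.Println(s.CID(40), s.CID(12), s.CID(40))
}
```

By contrast, an identity mapping needs no state at all, because the CID is simply the numeric value of the GID.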

type Info added in v0.3.3

type Info struct {
	Name string
	ROS  *cid.SystemInfo
	charcode.CodeSpaceRange
	CSFile  charcode.CodeSpaceRange // TODO(voss): remove this
	WMode   int
	UseCMap string
	Singles []SingleEntry
	Ranges  []RangeEntry
}

Info holds the information for a PDF CMap.

func Extract added in v0.3.5

func Extract(r pdf.Getter, obj pdf.Object) (*Info, error)

Extract reads a CMap from a PDF file.

func New added in v0.3.5

New allocates a new CMap object.

func Read added in v0.3.5

func Read(r io.Reader, other map[string]*Info) (*Info, error)

func (*Info) Embed added in v0.3.5

func (info *Info) Embed(w pdf.Putter, ref pdf.Reference, other map[string]pdf.Reference) error

func (*Info) GetMapping added in v0.3.5

func (info *Info) GetMapping() map[charcode.CharCode]cid.CID

GetMapping returns the mapping information from info.

func (*Info) IsIdentity added in v0.3.5

func (info *Info) IsIdentity() bool

IsIdentity returns true if all codes are equal to the corresponding CID.

func (*Info) IsPredefined added in v0.3.6

func (info *Info) IsPredefined() bool

IsPredefined returns true if the CMap is one of the CMaps predefined in PDF.

func (*Info) MaxCID added in v0.3.5

func (info *Info) MaxCID() cid.CID

MaxCID returns the largest CID used by this CMap.

func (*Info) SetMapping added in v0.3.5

func (info *Info) SetMapping(m map[charcode.CharCode]cid.CID)

SetMapping replaces the mapping information in info with the given mapping.

To make efficient use of range entries, the generated mapping may be a superset of the original mapping, i.e. it may contain entries for charcodes which were not mapped in the original mapping.
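
To give a feel for how a flat mapping turns into single and range entries, here is a simple compaction sketch. It uses local stand-in types and a hypothetical function compact; it is not the package's algorithm. In particular, this version only merges exactly consecutive entries and therefore never produces a superset, whereas the package may merge more aggressively as described above.

```go
package main

import (
	"fmt"
	"sort"
)

// Local stand-ins mirroring the shape of the package's types.
type CharCode int
type CID uint32

type SingleEntry struct {
	Code  CharCode
	Value CID
}

type RangeEntry struct {
	First, Last CharCode
	Value       CID
}

// compact (hypothetical) groups runs of consecutive codes with consecutive
// CIDs into ranges and emits everything else as single entries.
func compact(m map[CharCode]CID) ([]SingleEntry, []RangeEntry) {
	codes := make([]CharCode, 0, len(m))
	for c := range m {
		codes = append(codes, c)
	}
	sort.Slice(codes, func(i, j int) bool { return codes[i] < codes[j] })

	var singles []SingleEntry
	var ranges []RangeEntry
	for i := 0; i < len(codes); {
		j := i + 1
		for j < len(codes) && codes[j] == codes[j-1]+1 && m[codes[j]] == m[codes[j-1]]+1 {
			j++
		}
		if j-i == 1 {
			singles = append(singles, SingleEntry{Code: codes[i], Value: m[codes[i]]})
		} else {
			ranges = append(ranges, RangeEntry{First: codes[i], Last: codes[j-1], Value: m[codes[i]]})
		}
		i = j
	}
	return singles, ranges
}

func main() {
	singles, ranges := compact(map[CharCode]CID{65: 1, 66: 2, 67: 3, 80: 10})
	fmt.Println(singles, ranges)
}
```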

func (*Info) Write added in v0.3.5

func (info *Info) Write(w io.Writer) error

type RangeEntry added in v0.3.6

type RangeEntry struct {
	First charcode.CharCode
	Last  charcode.CharCode
	Value cid.CID
}

RangeEntry describes a range of character codes with consecutive CIDs. First and Last are the first and last code points in the range. Value is the CID of the first code point in the range.
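
Resolving a code against such range entries amounts to adding the offset within the range to Value. The sketch below uses local copies of the types and a hypothetical helper lookupRange; it is an illustration, not the package's lookup code.

```go
package main

import "fmt"

// Local stand-ins mirroring the shape of the package's types.
type CharCode int
type CID uint32

type RangeEntry struct {
	First, Last CharCode
	Value       CID
}

// lookupRange (hypothetical) returns the CID for code: the CID of the first
// code in the matching range plus the offset of code within that range.
func lookupRange(ranges []RangeEntry, code CharCode) (CID, bool) {
	for _, r := range ranges {
		if code >= r.First && code <= r.Last {
			return r.Value + CID(code-r.First), true
		}
	}
	return 0, false
}

func main() {
	ranges := []RangeEntry{{First: 0x20, Last: 0x7E, Value: 1}}
	cid, ok := lookupRange(ranges, 0x41)
	fmt.Println(cid, ok)
}
```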

type RangeTUEntry added in v0.3.6

type RangeTUEntry struct {
	First  charcode.CharCode
	Last   charcode.CharCode
	Values [][]rune
}

RangeTUEntry describes a range of character codes. First and Last are the first and last code points in the range. Values is a list of Unicode strings. If the list has length one, the single string is the value for First, and the value is incremented by one for each subsequent code point in the range. Otherwise, the list must have length Last-First+1 and specifies the value for each code point in the range.
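
The two cases can be made concrete with the sketch below, which expands a range entry into an explicit per-code mapping. The types are local stand-ins and expand is a hypothetical helper. Incrementing the last rune of the string is an assumption, modelled on how PDF bfrange entries increment the end of the destination string.

```go
package main

import "fmt"

// Local stand-ins mirroring the shape of the package's types.
type CharCode int

type RangeTUEntry struct {
	First, Last CharCode
	Values      [][]rune
}

// expand (hypothetical) lists the text for every code in the range.
// It reports false if Values has neither length 1 nor Last-First+1.
func expand(r RangeTUEntry) (map[CharCode][]rune, bool) {
	n := int(r.Last-r.First) + 1
	if n <= 0 {
		return nil, false
	}
	out := make(map[CharCode][]rune, n)
	if len(r.Values) == 1 {
		base := r.Values[0]
		if len(base) == 0 {
			return nil, false
		}
		for i := 0; i < n; i++ {
			v := append([]rune(nil), base...) // copy, then bump the last rune
			v[len(v)-1] += rune(i)
			out[r.First+CharCode(i)] = v
		}
		return out, true
	}
	if len(r.Values) != n {
		return nil, false
	}
	for i, v := range r.Values {
		out[r.First+CharCode(i)] = v
	}
	return out, true
}

func main() {
	m, ok := expand(RangeTUEntry{First: 0x41, Last: 0x43, Values: [][]rune{{'a'}}})
	fmt.Println(string(m[0x41]), string(m[0x42]), string(m[0x43]), ok)
}
```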

func (RangeTUEntry) String added in v0.3.6

func (r RangeTUEntry) String() string

type SingleEntry added in v0.3.6

type SingleEntry struct {
	Code  charcode.CharCode
	Value cid.CID
}

SingleEntry specifies that character code Code represents the given CID.

type SingleTUEntry added in v0.3.6

type SingleTUEntry struct {
	Code  charcode.CharCode
	Value []rune
}

SingleTUEntry specifies that character code Code represents the given Unicode string.

func (SingleTUEntry) String added in v0.3.6

func (s SingleTUEntry) String() string

type ToUnicode added in v0.3.6

type ToUnicode struct {
	CS      charcode.CodeSpaceRange
	Singles []SingleTUEntry
	Ranges  []RangeTUEntry
}

ToUnicode holds the information for a PDF ToUnicode CMap.

func ExtractToUnicode added in v0.3.6

func ExtractToUnicode(r pdf.Getter, obj pdf.Object, cs charcode.CodeSpaceRange) (*ToUnicode, error)

ExtractToUnicode extracts a ToUnicode CMap from a PDF file. If cs is not nil, it overrides the code space range given inside the CMap.

func NewToUnicode added in v0.3.6

func NewToUnicode(cs charcode.CodeSpaceRange, m map[charcode.CharCode][]rune) *ToUnicode

NewToUnicode constructs a ToUnicode CMap from the given mapping, which is keyed by character code.

func NewToUnicodeNew added in v0.4.0

func NewToUnicodeNew(cs charcode.CodeSpaceRange, m map[string][]rune) *ToUnicode

NewToUnicodeNew constructs a ToUnicode CMap from the given mapping, which is keyed by string.

func ReadToUnicode added in v0.3.6

func ReadToUnicode(r io.Reader, cs charcode.CodeSpaceRange) (*ToUnicode, error)

ReadToUnicode reads a ToUnicode CMap. If cs is not nil, it overrides the code space range given inside the CMap.

func (*ToUnicode) Decode added in v0.3.6

func (info *ToUnicode) Decode(s pdf.String) ([]rune, int)

Decode decodes the first character code from the given string. It returns the corresponding Unicode text and the number of bytes consumed. If the character code cannot be decoded, unicode.ReplacementChar is returned, and the length is either 0 (if the string is empty) or 1. If a valid character code is found but is not mapped by the ToUnicode CMap, unicode.ReplacementChar is also returned.
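
The return conventions described above can be illustrated with a simplified sketch. The function decode is hypothetical and assumes a fixed two-byte code space with a plain map in place of the ToUnicode data; the real method works with a charcode.CodeSpaceRange. The number of bytes reported for a valid but unmapped code is not stated in the documentation; returning the code length here is an assumption.

```go
package main

import (
	"fmt"
	"unicode"
)

// decode (hypothetical) decodes the first two-byte code in s using m.
func decode(m map[string][]rune, s []byte) ([]rune, int) {
	const codeLen = 2 // assumed fixed code length for this sketch
	if len(s) == 0 {
		// nothing to decode: replacement character, zero bytes consumed
		return []rune{unicode.ReplacementChar}, 0
	}
	if len(s) < codeLen {
		// truncated code: replacement character, skip one byte
		return []rune{unicode.ReplacementChar}, 1
	}
	if text, ok := m[string(s[:codeLen])]; ok {
		return text, codeLen
	}
	// valid code without a mapping (consumed length is an assumption)
	return []rune{unicode.ReplacementChar}, codeLen
}

func main() {
	m := map[string][]rune{"\x00\x41": {'A'}}
	text, n := decode(m, []byte{0x00, 0x41, 0x00, 0x42})
	fmt.Println(string(text), n)
}
```

A caller would typically loop, advancing by the returned length and stopping when it is 0.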

func (*ToUnicode) Embed added in v0.3.6

func (info *ToUnicode) Embed(w pdf.Putter, ref pdf.Reference) error

Embed adds the ToUnicode CMap to a PDF file.

func (*ToUnicode) GetMapping added in v0.3.6

func (info *ToUnicode) GetMapping() map[charcode.CharCode][]rune

GetMapping returns the mapping information from info.

func (*ToUnicode) GetMappingNew added in v0.4.0

func (info *ToUnicode) GetMappingNew() map[string][]rune

GetMappingNew returns the mapping information from info.

func (*ToUnicode) GetSimpleMapping added in v0.4.0

func (info *ToUnicode) GetSimpleMapping() [][]rune

func (*ToUnicode) SetMapping added in v0.3.6

func (info *ToUnicode) SetMapping(m map[charcode.CharCode][]rune)

SetMapping replaces the mapping information in info with the given mapping.

To make efficient use of range entries, the generated mapping may be a superset of the original mapping, i.e. it may contain entries for charcodes which were not mapped in the original mapping.

func (*ToUnicode) Write added in v0.3.6

func (info *ToUnicode) Write(w io.Writer) error
