cmap

package
v0.0.0-...-fb73569 Latest
Published: Feb 24, 2023 License: Apache-2.0 Imports: 13 Imported by: 0

Documentation

Constants

const MissingCodeRune = textencoding.MissingCodeRune

MissingCodeRune replaces runes that can't be decoded. '\ufffd' = �. Was '?'.

Variables

var (
	ErrBadCMap        = errors.New("bad cmap")
	ErrBadCMapComment = errors.New("comment should start with %")
	ErrBadCMapDict    = errors.New("invalid dict")
)

Functions

This section is empty.

Types

type CIDSystemInfo

type CIDSystemInfo struct {
	Registry   string
	Ordering   string
	Supplement int
}

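CIDSystemInfo holds the fields of a PDF CIDSystemInfo dictionary, which identifies a character collection. Example: Dict("Registry": Adobe, "Ordering": Korea1, "Supplement": 0).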

func NewCIDSystemInfo

func NewCIDSystemInfo(obj core.PdfObject) (info CIDSystemInfo, err error)

NewCIDSystemInfo returns the CIDSystemInfo encoded in the PdfObject `obj`.

func (*CIDSystemInfo) String

func (info *CIDSystemInfo) String() string

String returns a human readable description of `info`. It looks like "Adobe-Japan2-000".
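The "Adobe-Japan2-000" shape suggests a Registry-Ordering-Supplement layout with the supplement zero-padded to three digits. The following is a minimal sketch of that formatting, not the package's implementation: `cidSystemInfo` and `describe` are hypothetical local stand-ins, and the padding width is an assumption inferred from the example string.

```go
package main

import "fmt"

// cidSystemInfo is a hypothetical local stand-in that mirrors the
// documented CIDSystemInfo fields; it is not the package's type.
type cidSystemInfo struct {
	Registry   string
	Ordering   string
	Supplement int
}

// describe joins the three fields with hyphens and zero-pads the
// supplement to three digits (assumed from "Adobe-Japan2-000").
func describe(info cidSystemInfo) string {
	return fmt.Sprintf("%s-%s-%03d", info.Registry, info.Ordering, info.Supplement)
}

func main() {
	info := cidSystemInfo{Registry: "Adobe", Ordering: "Japan2", Supplement: 0}
	fmt.Println(describe(info)) // prints Adobe-Japan2-000
}
```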

type CMap

type CMap struct {
	// contains filtered or unexported fields
}

CMap represents a character code to unicode mapping used in PDF files. References:

https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/5411.ToUnicode.pdf
https://github.com/adobe-type-tools/cmap-resources/releases

func LoadCmapFromData

func LoadCmapFromData(data []byte, isSimple bool) (*CMap, error)

LoadCmapFromData parses the in-memory cmap `data` and returns the resulting CMap. If `isSimple` is true, it uses 1-byte encodings, otherwise it uses the codespaces in the cmap.

See section 9.10.3, ToUnicode CMaps (page 293), of the PDF specification.

func LoadCmapFromDataCID

func LoadCmapFromDataCID(data []byte) (*CMap, error)

LoadCmapFromDataCID parses the in-memory cmap `data` and returns the resulting CMap. It is a convenience function.

func NewToUnicodeCMap

func NewToUnicodeCMap(codeToUnicode map[CharCode]rune) *CMap

NewToUnicodeCMap returns an identity CMap with codeToUnicode matching the `codeToUnicode` arg.

func (*CMap) Bytes

func (cmap *CMap) Bytes() []byte

Bytes returns the raw bytes of a PDF CMap corresponding to `cmap`.

func (*CMap) CharcodeBytesToUnicode

func (cmap *CMap) CharcodeBytesToUnicode(data []byte) (string, int)

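CharcodeBytesToUnicode converts a byte array of charcodes to a unicode string representation. The second return value is an int rather than a bool flag; it appears to count the character codes that could not be converted, so 0 would indicate a fully successful conversion. NOTE: This only works for ToUnicode cmaps.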
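To illustrate the general idea of turning charcode bytes into text, here is a minimal sketch for the simple 1-byte case. It is not the package's implementation: `charCode`, `missingCodeRune` and `decodeSimple` are hypothetical local names, and returning a count of unmapped codes as the int is an assumption about what that value means.

```go
package main

import (
	"fmt"
	"strings"
)

// charCode is a local stand-in for the documented CharCode type.
type charCode uint32

// missingCodeRune mirrors the documented replacement rune '\ufffd'.
const missingCodeRune = '\ufffd'

// decodeSimple treats every byte of data as one character code, looks it
// up in codeToUnicode, and substitutes missingCodeRune for unmapped codes.
// It returns the decoded text and the number of unmapped codes (assumed
// meaning of the int result).
func decodeSimple(data []byte, codeToUnicode map[charCode]rune) (string, int) {
	var sb strings.Builder
	misses := 0
	for _, b := range data {
		r, ok := codeToUnicode[charCode(b)]
		if !ok {
			r = missingCodeRune
			misses++
		}
		sb.WriteRune(r)
	}
	return sb.String(), misses
}

func main() {
	table := map[charCode]rune{0x01: 'H', 0x02: 'i'}
	text, misses := decodeSimple([]byte{0x01, 0x02, 0x7f}, table)
	fmt.Println(text, misses) // prints Hi� 1
}
```

Multi-byte cmaps would first need to split the byte stream into codes using the cmap's codespaces, which this sketch deliberately leaves out.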

func (*CMap) CharcodeToUnicode

func (cmap *CMap) CharcodeToUnicode(code CharCode) (rune, bool)

CharcodeToUnicode converts a single character code `code` to a unicode rune. If `code` is not in the unicode map, '�' (MissingCodeRune) is returned. NOTE: CharcodeBytesToUnicode is typically more efficient.

func (*CMap) Name

func (cmap *CMap) Name() string

Name returns the name of the CMap.

func (*CMap) String

func (cmap *CMap) String() string

String returns a human readable description of `cmap`.

func (*CMap) Type

func (cmap *CMap) Type() int

Type returns the CMap type.

type CharCode

type CharCode uint32

CharCode is a character code or Unicode code point. Note that Go's rune is an int32 (https://golang.org/doc/go1#rune), whereas CharCode is a uint32.

type Codespace

type Codespace struct {
	NumBytes int
	Low      CharCode
	High     CharCode
}

Codespace represents a single codespace range used in the CMap.
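One way to read these fields: a codespace says that codes of `NumBytes` bytes whose value lies between `Low` and `High` are valid. The sketch below shows how such ranges could be used to pick the next code from a byte stream. It is a simplified illustration, not the package's algorithm: `codespace`, `charCode` and `matchCode` are hypothetical local names, the big-endian packing and inclusive numeric comparison are assumptions, and the example ranges are made up.

```go
package main

import "fmt"

// charCode and codespace are local stand-ins mirroring the documented types.
type charCode uint32

type codespace struct {
	NumBytes int
	Low      charCode
	High     charCode
}

// matchCode tries each codespace in order. For each one it packs the first
// NumBytes bytes of data big-endian into a code and accepts it if the code
// lies within [Low, High]. It returns the code, the number of bytes
// consumed, and whether any codespace matched.
func matchCode(data []byte, spaces []codespace) (charCode, int, bool) {
	for _, cs := range spaces {
		if cs.NumBytes <= 0 || cs.NumBytes > len(data) {
			continue
		}
		var code charCode
		for _, b := range data[:cs.NumBytes] {
			code = code<<8 | charCode(b)
		}
		if cs.Low <= code && code <= cs.High {
			return code, cs.NumBytes, true
		}
	}
	return 0, 0, false
}

func main() {
	// Made-up ranges: one 1-byte range and one 2-byte range.
	spaces := []codespace{
		{NumBytes: 1, Low: 0x00, High: 0x80},
		{NumBytes: 2, Low: 0x8140, High: 0x9ffc},
	}
	code, n, ok := matchCode([]byte{0x81, 0x40, 0x41}, spaces)
	fmt.Printf("%#x %d %t\n", code, n, ok) // prints 0x8140 2 true
}
```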
