Documentation ¶
Overview ¶
Package extractor provides a simple interface for quickly extracting content from PDF files. It currently supports extracting textual content only.
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Extractor ¶
type Extractor struct {
// contains filtered or unexported fields
}
Extractor stores and offers functionality for extracting content from PDF pages.
func (*Extractor) ExtractText ¶
ExtractText processes all text data in the content streams and returns the extracted text as a string. It takes character encoding into account via the CMaps in the PDF file. The text is processed linearly, i.e. in the order in which it appears in the content streams. Spaces and newlines are added on a best-effort basis.
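To illustrate what linear processing with best-effort spacing can look like, here is a minimal, self-contained sketch. It is not this package's implementation: the textRun type, the joinRuns function, and the spaceGap and lineTol thresholds are all hypothetical names introduced for this example. The sketch assumes each piece of shown text comes with a position and a width, and it deliberately does no sorting, so the output order is simply the input order.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// textRun is a hypothetical stand-in for one piece of text emitted by a
// content stream, together with where it was drawn on the page.
type textRun struct {
	X, Y  float64 // start position of the run, in page units
	Width float64 // horizontal extent of the run
	Text  string
}

// joinRuns concatenates runs strictly in the order given (no sorting).
// It inserts a newline when the vertical position changes by more than
// lineTol, and a space when the horizontal gap to the previous run on the
// same line exceeds spaceGap. Both thresholds are illustrative assumptions.
func joinRuns(runs []textRun, spaceGap, lineTol float64) string {
	var sb strings.Builder
	for i, r := range runs {
		if i > 0 {
			prev := runs[i-1]
			switch {
			case math.Abs(r.Y-prev.Y) > lineTol:
				sb.WriteByte('\n')
			case r.X-(prev.X+prev.Width) > spaceGap:
				sb.WriteByte(' ')
			}
		}
		sb.WriteString(r.Text)
	}
	return sb.String()
}

func main() {
	runs := []textRun{
		{X: 10, Y: 700, Width: 30, Text: "Hello"},
		{X: 45, Y: 700, Width: 32, Text: "world"},
		{X: 10, Y: 686, Width: 20, Text: "PDF"},
	}
	fmt.Printf("%q\n", joinRuns(runs, 2, 1)) // prints "Hello world\nPDF"
}
```

Because the runs are never reordered, text drawn out of reading order (for example, multi-column layouts or footers emitted first) would come out in stream order, which is why the spacing can only be described as best effort.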