ocrpdf

Version: v0.0.0-...-8a85526 (latest) | Published: Jul 23, 2016 | License: MIT | Imports: 7 | Imported by: 0

Repository

github.com/johnsto/ocrpdf

README ¶

ocrpdf

ocrpdf is a library that wraps Tesseract, Leptonica and gofpdf to help generate PDF files from scanned documents.

It is designed primarily for use with the goscan2pdf tool.

Documentation ¶

Index ¶

Constants
Variables
type Document
- func NewDocument(size string) *Document
type Image
- func NewImageFromFile(filename string) (*Image, error)
type Orientation
type Tess
- func NewTess(datapath string, language string) (*Tess, error)
- func (t *Tess) SetImagePix(pix *C.struct_Pix)
- func (t *Tess) Words() []Word
type TextScaling
type Word

Constants ¶

const (
	// AutoOrientation chooses orientation based on longest edge
	AutoOrientation      Orientation = "auto"
	PortraitOrientation              = "portrait"
	LandscapeOrientation             = "landscape"
)

Different orientation modes.

const (
	// NoTextScaling specifies that no text scaling will be performed.
	NoTextScaling TextScaling = "off"
	// ContainTextScaling fits the text to the detected word boundary,
	// whilst maintaining the correct aspect ratio for the font.
	ContainTextScaling = "contain"
	// MatchTextScaling fits the text to the detected word boundary exactly,
	// scaling the font if required.
	MatchTextScaling = "match"
)
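
The difference between the three text scaling modes can be sketched with a little arithmetic. The example below is only an illustration of one plausible reading of the comments above, not the library's implementation: `scaleFactors` is a hypothetical helper, and it assumes that "contain" means a single uniform factor (preserving the font's aspect ratio) while "match" stretches each axis independently.

```go
package main

import (
	"fmt"
	"math"
)

// TextScaling mirrors the string-typed modes documented above.
type TextScaling string

const (
	NoTextScaling      TextScaling = "off"
	ContainTextScaling TextScaling = "contain"
	MatchTextScaling   TextScaling = "match"
)

// scaleFactors is a hypothetical helper (not part of ocrpdf). Given the
// natural size of the rendered text (tw x th) and the size of the detected
// word box (bw x bh), it returns horizontal and vertical scale factors.
func scaleFactors(mode TextScaling, tw, th, bw, bh float64) (sx, sy float64) {
	if tw <= 0 || th <= 0 {
		return 1, 1 // nothing sensible to scale
	}
	switch mode {
	case ContainTextScaling:
		// Assumption: one uniform factor, so the aspect ratio is kept
		// and the text never exceeds the box on either axis.
		s := math.Min(bw/tw, bh/th)
		return s, s
	case MatchTextScaling:
		// Assumption: each axis is stretched independently so the text
		// fills the box exactly.
		return bw / tw, bh / th
	default:
		return 1, 1
	}
}

func main() {
	modes := []TextScaling{NoTextScaling, ContainTextScaling, MatchTextScaling}
	for _, m := range modes {
		// Text that is naturally 40x10 units, detected box of 60x20 units.
		sx, sy := scaleFactors(m, 40, 10, 60, 20)
		fmt.Printf("%-8s sx=%.2f sy=%.2f\n", m, sx, sy)
	}
}
```

Under these assumptions, "contain" leaves spare room on one axis whenever the box and the text have different aspect ratios, whereas "match" distorts the glyphs to remove that room.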

const DefaultJPEGCompression int = 75

Variables ¶

var JPEGCompression int = DefaultJPEGCompression

Functions ¶

This section is empty.

Types ¶

type Document ¶

type Document struct {
	*gofpdf.Fpdf
	// contains filtered or unexported fields
}

Document is a wrapped version of gofpdf.Fpdf which adds additional methods for constructing documents with OCR-generated text.

func NewDocument ¶

func NewDocument(size string) *Document

NewDocument returns a new Document of the specified size.

func (*Document) AddImageLayer ¶

func (d *Document) AddImageLayer(image Image, imagename string,
	format string, w, h float64)

AddImageLayer adds the specified image to the page, embedding it using the given format, so that it appears at the specified size (in page units).

func (*Document) AddPage ¶

func (d *Document) AddPage(image Image, imagename string,
	words []Word, format string) error

AddPage appends the given image to the document, annotating the document with the detected words. Ensure `imagename` is unique for each distinct image.

func (*Document) AddWords ¶

func (d *Document) AddWords(words []Word)

AddWords adds the specified words to the page.

func (*Document) GetPageConfiguration ¶

func (d *Document) GetPageConfiguration(iw, ih float64) (
	w, h float64, orientation Orientation)

GetPageConfiguration returns a suitable page size and orientation to contain an image of the specified dimensions.
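
How a page size and orientation might be derived from an image can be sketched as follows. This is not the library's code: `pageConfiguration` is a hypothetical stand-alone function, and it assumes the page is described by its portrait dimensions and that "auto" picks landscape whenever the image is wider than it is tall.

```go
package main

import "fmt"

// Orientation mirrors the string-typed orientation modes documented above.
type Orientation string

const (
	AutoOrientation      Orientation = "auto"
	PortraitOrientation  Orientation = "portrait"
	LandscapeOrientation Orientation = "landscape"
)

// pageConfiguration is a hypothetical stand-in for the documented method.
// pw and ph are the page dimensions in portrait layout; iw and ih are the
// image dimensions. It returns the page width, height and orientation.
func pageConfiguration(mode Orientation, pw, ph, iw, ih float64) (w, h float64, o Orientation) {
	o = mode
	if o == AutoOrientation {
		// Assumption: follow the image's longest edge.
		if iw > ih {
			o = LandscapeOrientation
		} else {
			o = PortraitOrientation
		}
	}
	if o == LandscapeOrientation {
		return ph, pw, o // swap edges for landscape
	}
	return pw, ph, o
}

func main() {
	// A4 in millimetres, with a scan that is wider than it is tall.
	w, h, o := pageConfiguration(AutoOrientation, 210, 297, 3508, 2480)
	fmt.Println(w, h, o)
}
```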

func (*Document) SetDebug ¶

func (d *Document) SetDebug(enabled bool)

SetDebug enables debug mode, in which detected words are outlined, and the text layer is arranged on top of the image (scan) layer.

func (*Document) SetOrientation ¶

func (d *Document) SetOrientation(orientation Orientation)

SetOrientation sets the orientation of new pages.

func (*Document) SetTextScaling ¶

func (d *Document) SetTextScaling(mode TextScaling)

SetTextScaling enables the scaling of embedded text so that it occupies the same area in which the original text was detected.

type Image ¶

type Image struct {
	// contains filtered or unexported fields
}

func NewImageFromFile ¶

func NewImageFromFile(filename string) (*Image, error)

NewImageFromFile creates and returns a new image loaded from the given file path.

func (*Image) Adjust ¶

func (i *Image) Adjust(threshold float32) *Image

Adjust improves the clarity and contrast of the image, generally reducing scanning artifacts.

func (*Image) CPIX ¶

func (i *Image) CPIX() *C.PIX

func (Image) Dimensions ¶

func (i Image) Dimensions() (int32, int32, int32)

Dimensions calculates the width, height and colour depth of the image.

func (Image) FormatString ¶

func (i Image) FormatString() string

FormatString returns the image format as a string, e.g. 'jpg'.

func (Image) Reader ¶

func (i Image) Reader(format string) (*bytes.Buffer, string, error)

Reader returns an io.Reader for the image data. If format is not specified, the reader will produce image data in the original image format. Otherwise, `format` must be either "jpeg" or "png".
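
The format rule described for Reader can be expressed as a small decision function. The sketch below is illustrative only: `resolveFormat` is a hypothetical helper, and it assumes that "not specified" means the empty string and that any other value is rejected with an error.

```go
package main

import "fmt"

// resolveFormat is a hypothetical helper (not part of ocrpdf). It returns
// the format to encode with, given the requested format and the image's
// original format.
func resolveFormat(requested, original string) (string, error) {
	switch requested {
	case "":
		// Assumption: an empty string means "keep the original format".
		return original, nil
	case "jpeg", "png":
		return requested, nil
	default:
		return "", fmt.Errorf("unsupported format %q", requested)
	}
}

func main() {
	for _, req := range []string{"", "png", "gif"} {
		got, err := resolveFormat(req, "jpg")
		fmt.Printf("requested=%q resolved=%q err=%v\n", req, got, err)
	}
}
```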

func (Image) ReaderJPEG ¶

func (i Image) ReaderJPEG(quality int, progressive bool) (*bytes.Buffer, error)

ReaderJPEG returns an io.Reader for the image data, encoded as a compressed JPEG of the specified quality (0-100).
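
To show what encoding an image into an in-memory JPEG buffer at a given quality looks like, here is a minimal sketch that uses Go's standard `image/jpeg` package rather than Leptonica, which the library wraps. The names `encodeJPEG`, `gradient` and `defaultQuality` are hypothetical; the value 75 simply echoes DefaultJPEGCompression, and falling back to it for out-of-range values is an assumption.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/jpeg"
)

// defaultQuality echoes the documented DefaultJPEGCompression value.
const defaultQuality = 75

// encodeJPEG is a hypothetical helper that encodes img into a buffer.
// The standard library accepts qualities from 1 to 100, so anything
// outside that range falls back to the default here.
func encodeJPEG(img image.Image, quality int) (*bytes.Buffer, error) {
	if quality < 1 || quality > 100 {
		quality = defaultQuality
	}
	buf := new(bytes.Buffer)
	if err := jpeg.Encode(buf, img, &jpeg.Options{Quality: quality}); err != nil {
		return nil, err
	}
	return buf, nil
}

// gradient builds a small n x n greyscale test image.
func gradient(n int) *image.Gray {
	img := image.NewGray(image.Rect(0, 0, n, n))
	for y := 0; y < n; y++ {
		for x := 0; x < n; x++ {
			img.SetGray(x, y, color.Gray{Y: uint8(x * 255 / n)})
		}
	}
	return img
}

func main() {
	buf, err := encodeJPEG(gradient(64), defaultQuality)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Printf("encoded %d bytes\n", buf.Len())
}
```

A `*bytes.Buffer` satisfies `io.Reader`, which is why the documented methods can return a buffer while describing the result as a reader.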

func (Image) ReaderPNG ¶

func (i Image) ReaderPNG(gamma float32) (*bytes.Buffer, error)

ReaderPNG returns an io.Reader for the image data, in PNG format.

func (*Image) Scale ¶

func (i *Image) Scale(w, h int32) *Image

Scale resizes the image to the specified dimensions.

func (*Image) ScaleDown ¶

func (i *Image) ScaleDown(w, h int32) *Image

ScaleDown scales down the image to the specified dimensions, returning the original image if it is already smaller (in terms of pixel count).
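
The "already smaller in terms of pixel count" rule can be sketched as a comparison of areas. The function below is a hypothetical illustration that works on dimensions only; treating an equal pixel count as "no scaling needed" is an assumption.

```go
package main

import "fmt"

// scaleDownSize is a hypothetical helper (not part of ocrpdf). It returns
// the dimensions an image of curW x curH should have after a scale-down
// request to w x h: the target size if it has fewer pixels, otherwise the
// current size unchanged.
func scaleDownSize(curW, curH, w, h int32) (int32, int32) {
	// Compare areas in int64 to avoid overflow on large scans.
	if int64(curW)*int64(curH) <= int64(w)*int64(h) {
		return curW, curH
	}
	return w, h
}

func main() {
	fmt.Println(scaleDownSize(4000, 3000, 1024, 768)) // larger scan, scaled down
	fmt.Println(scaleDownSize(800, 600, 1024, 768))   // already smaller, unchanged
}
```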

type Orientation ¶

type Orientation string

Orientation defines page orientations.

type Tess ¶

type Tess struct {
	// contains filtered or unexported fields
}

func NewTess ¶

func NewTess(datapath string, language string) (*Tess, error)

func (*Tess) SetImagePix ¶

func (t *Tess) SetImagePix(pix *C.struct_Pix)

SetImagePix sets the image to perform recognition on.

func (*Tess) Words ¶

func (t *Tess) Words() []Word

Words analyses the document and returns a list of recognised words.

type TextScaling ¶

type TextScaling string

TextScaling defines text scaling modes.

type Word ¶

type Word struct {
	Text   string
	Left   int
	Right  int
	Top    int
	Bottom int
	Width  int
	Height int
}
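
Since Word carries integer box coordinates and documents are laid out in page units, placing a word on a page involves a coordinate conversion. The sketch below shows one way that could work; it is not taken from the library. It assumes the coordinates are image pixels with the origin at the top-left, that the image fills the whole page, and that `rect` and `wordRect` are hypothetical names.

```go
package main

import "fmt"

// Word mirrors the struct documented above.
type Word struct {
	Text   string
	Left   int
	Right  int
	Top    int
	Bottom int
	Width  int
	Height int
}

// rect is a hypothetical box in page units.
type rect struct {
	X, Y, W, H float64
}

// wordRect converts a word's pixel box into page units, assuming the image
// (imgW x imgH pixels) is stretched over the full page (pageW x pageH).
func wordRect(w Word, imgW, imgH int, pageW, pageH float64) rect {
	sx := pageW / float64(imgW)
	sy := pageH / float64(imgH)
	return rect{
		X: float64(w.Left) * sx,
		Y: float64(w.Top) * sy,
		W: float64(w.Right-w.Left) * sx,
		H: float64(w.Bottom-w.Top) * sy,
	}
}

func main() {
	w := Word{Text: "invoice", Left: 100, Right: 180, Top: 40, Bottom: 60}
	// A 420x594 pixel image placed on a 210x297 unit page.
	fmt.Printf("%+v\n", wordRect(w, 420, 594, 210, 297))
}
```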

Directories ¶

Path	Synopsis
goscan2pdf
