chardet

package module
v0.0.0-...-b7413ea Latest
Warning

This package is not in the latest version of its module.

Published: Nov 20, 2021 License: MIT Imports: 4 Imported by: 68

README

chardet

chardet is a library that automatically detects the charset of text for the Go programming language. It is based on the algorithm and data in ICU's implementation.

The project was created by saintfish. In January 2015 it was forked by the gogits project in order to incorporate bugfixes and new features.

Documentation and Usage

See pkgdoc

Documentation

Overview

Package chardet ports character set detection from ICU.

Index

Constants

This section is empty.

Variables

var (
	NotDetectedError = errors.New("Charset not detected.")
)

Functions

This section is empty.

Types

type Detector

type Detector struct {
	// contains filtered or unexported fields
}

Detector implements charset detection.

func NewHtmlDetector

func NewHtmlDetector() *Detector

NewHtmlDetector creates a Detector for HTML.

func NewTextDetector

func NewTextDetector() *Detector

NewTextDetector creates a Detector for plain text.

func (*Detector) DetectAll

func (d *Detector) DetectAll(b []byte) ([]Result, error)

DetectAll returns all Results which have non-zero Confidence. The Results are sorted by Confidence in descending order.
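
Because the Results arrive sorted by Confidence in descending order, a caller that only wants reasonably confident candidates can cut the slice at the first entry below a threshold. The following is a minimal sketch under that assumption; Result is a local stand-in for the documented type, and confidentResults and the sample values are hypothetical.

```go
package main

import "fmt"

// Result is a local stand-in mirroring the fields documented for
// chardet.Result, so this sketch builds without the package.
type Result struct {
	Charset    string
	Language   string
	Confidence int
}

// confidentResults (hypothetical helper) returns the leading results whose
// Confidence is at least threshold. It assumes the input is sorted by
// Confidence in descending order, as documented for DetectAll, so it can
// stop at the first result that falls below the threshold.
func confidentResults(results []Result, threshold int) []Result {
	for i, r := range results {
		if r.Confidence < threshold {
			return results[:i]
		}
	}
	return results
}

func main() {
	// Made-up sample data in the order DetectAll is documented to return.
	all := []Result{
		{Charset: "UTF-8", Confidence: 80},
		{Charset: "ISO-8859-1", Language: "en", Confidence: 40},
		{Charset: "windows-1252", Language: "en", Confidence: 10},
	}
	for _, r := range confidentResults(all, 30) {
		fmt.Println(r.Charset, r.Confidence)
	}
}
```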

func (*Detector) DetectBest

func (d *Detector) DetectBest(b []byte) (r *Result, err error)

DetectBest returns the Result with highest Confidence.

type Result

type Result struct {
	// IANA name of the detected charset.
	Charset string
	// IANA name of the detected language. It may be empty for some charsets.
	Language string
	// Confidence of the Result. Scale from 1 to 100. The bigger, the more confident.
	Confidence int
}

Result contains all the information that the charset detector gives.
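
Since Language may be empty for some charsets, code that reports a Result probably wants to handle that case explicitly. A small sketch follows, using a local stand-in for Result and a hypothetical describe helper; the output format is an arbitrary choice for illustration.

```go
package main

import "fmt"

// Result is a local stand-in mirroring the fields documented for
// chardet.Result, so this sketch builds without the package.
type Result struct {
	Charset    string
	Language   string
	Confidence int
}

// describe (hypothetical helper) formats a Result for logging and omits
// the language part when Language is empty.
func describe(r Result) string {
	if r.Language == "" {
		return fmt.Sprintf("%s (confidence %d)", r.Charset, r.Confidence)
	}
	return fmt.Sprintf("%s, language %s (confidence %d)", r.Charset, r.Language, r.Confidence)
}

func main() {
	// Made-up sample values.
	fmt.Println(describe(Result{Charset: "UTF-8", Confidence: 100}))
	fmt.Println(describe(Result{Charset: "EUC-JP", Language: "ja", Confidence: 90}))
}
```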
