chardet
v0.1.3
Warning

This package is not in the latest version of its module.

Published: Oct 25, 2023 License: MIT, MIT Imports: 8 Imported by: 0

README

chardet

chardet is a library that automatically detects the charset of text for the Go programming language. It is based on the algorithm and data in ICU's implementation.

Documentation and Usage

See pkgdoc

Documentation

Overview

Package chardet ports character set detection from ICU.

Constants

This section is empty.

Variables

var NotDetectedError = errors.New("Charset not detected.")
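Because NotDetectedError is an exported sentinel value, callers can presumably test for it with errors.Is and fall back to a default charset. The sketch below is self-contained and does not import chardet: errNotDetected, detectStub, and charsetOrDefault are hypothetical stand-ins introduced for illustration, and it assumes the package returns this exact value when nothing matches.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotDetected is a local stand-in for chardet.NotDetectedError.
// Assumption: the package returns that exact value when no charset matches.
var errNotDetected = errors.New("Charset not detected.")

// detectStub is a hypothetical placeholder for a detection call. It reports
// "UTF-8" for any non-empty input and the sentinel error for empty input.
func detectStub(b []byte) (string, error) {
	if len(b) == 0 {
		return "", errNotDetected
	}
	return "UTF-8", nil
}

// charsetOrDefault returns def when detection fails with the sentinel error
// and wraps any other error for the caller.
func charsetOrDefault(b []byte, def string) (string, error) {
	cs, err := detectStub(b)
	if errors.Is(err, errNotDetected) {
		return def, nil
	}
	if err != nil {
		return "", fmt.Errorf("detect charset: %w", err)
	}
	return cs, nil
}

func main() {
	cs, err := charsetOrDefault(nil, "ISO-8859-1")
	fmt.Println(cs, err)
}
```

With the real package, the same pattern would compare against chardet.NotDetectedError instead of the local stand-in.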

Functions

func DecodeBIG5

func DecodeBIG5(s []byte) ([]byte, error)

DecodeBIG5 converts BIG5 to UTF-8.

func DecodeGBK

func DecodeGBK(s []byte) ([]byte, error)

DecodeGBK converts GBK to UTF-8.

func EncodeBIG5

func EncodeBIG5(s []byte) ([]byte, error)

EncodeBIG5 converts UTF-8 to BIG5.

func TransToUTF

func TransToUTF(input []byte) (output []byte, err error)

TransToUTF translates code to Simplified Chinese.

Types

type Detector

type Detector struct {
	// contains filtered or unexported fields
}

Detector implements charset detection.

func NewHtmlDetector

func NewHtmlDetector() *Detector

NewHtmlDetector creates a Detector for HTML.

func NewTextDetector

func NewTextDetector() *Detector

NewTextDetector creates a Detector for plain text.

func (*Detector) DetectAll

func (d *Detector) DetectAll(b []byte) ([]Result, error)

DetectAll returns all Results which have non-zero Confidence. The Results are sorted by Confidence in descending order.
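The ordering described above can be illustrated without the package itself. In this minimal sketch, result is a local type that mirrors the exported fields of Result, rankResults is a hypothetical helper, and the sample charsets and confidence values are made up for the example; it is not the library's implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// result mirrors the exported fields of chardet.Result, for illustration only.
type result struct {
	Charset    string
	Language   string
	Confidence int
}

// rankResults drops zero-confidence entries and sorts the rest by Confidence
// in descending order, matching the ordering documented for DetectAll.
func rankResults(in []result) []result {
	out := make([]result, 0, len(in))
	for _, r := range in {
		if r.Confidence > 0 {
			out = append(out, r)
		}
	}
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Confidence > out[j].Confidence
	})
	return out
}

func main() {
	// Hypothetical candidates; real values come from the detector.
	ranked := rankResults([]result{
		{Charset: "ISO-8859-1", Language: "en", Confidence: 20},
		{Charset: "Shift_JIS", Language: "ja", Confidence: 0},
		{Charset: "UTF-8", Confidence: 80},
	})
	for _, r := range ranked {
		fmt.Printf("%s %d\n", r.Charset, r.Confidence)
	}
}
```

Under this ordering, the first element of the slice is the most confident candidate.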

func (*Detector) DetectBest

func (d *Detector) DetectBest(b []byte) (r *Result, err error)

DetectBest returns the Result with highest Confidence.

type Result

type Result struct {
	// IANA name of the detected charset.
	Charset string
	// IANA name of the detected language. It may be empty for some charsets.
	Language string
	// Confidence of the Result. Scale from 1 to 100. The bigger, the more confident.
	Confidence int
}

Result contains all the information that charset detector gives.
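One way a caller might consume these fields is to accept a result only above some confidence threshold and to tolerate an empty Language. The sketch below uses a local result type that mirrors Result; minConfidence, describe, and trusted are hypothetical names, and the threshold of 50 is an arbitrary illustrative value, not one the package recommends.

```go
package main

import "fmt"

// result mirrors the exported fields of chardet.Result, for illustration only.
type result struct {
	Charset    string
	Language   string
	Confidence int
}

// minConfidence is an arbitrary threshold chosen for this example.
const minConfidence = 50

// describe formats a result, handling a nil pointer and an empty Language.
func describe(r *result) string {
	if r == nil {
		return "no result"
	}
	if r.Language == "" {
		return fmt.Sprintf("%s (confidence %d)", r.Charset, r.Confidence)
	}
	return fmt.Sprintf("%s [%s] (confidence %d)", r.Charset, r.Language, r.Confidence)
}

// trusted reports whether r is non-nil and meets the threshold.
func trusted(r *result) bool {
	return r != nil && r.Confidence >= minConfidence
}

func main() {
	// Hypothetical detection outcome.
	r := &result{Charset: "GB-18030", Language: "zh", Confidence: 72}
	fmt.Println(describe(r), trusted(r))
}
```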

func Detect

func Detect(b []byte) (*Result, error)

Detect detects the charset of b.
