xcharset

package
v1.5.0
Warning

This package is not in the latest version of its module.

Published: Feb 5, 2021 License: MIT Imports: 10 Imported by: 0

README

xcharset

Dependencies

  • github.com/Aoi-hosizora/ahlib
  • github.com/saintfish/chardet
  • golang.org/x/text

Documents

Types
  • type DetectResult struct
Variables
  • None
Constants
  • const IANA_UTF8 string
  • const IANA_UTF16BE string
  • const IANA_UTF16LE string
  • const IANA_UTF32BE string
  • const IANA_UTF32LE string
  • const IANA_ISO8859_1 string
  • const IANA_ISO8859_2 string
  • const IANA_ISO8859_5 string
  • const IANA_ISO8859_6 string
  • const IANA_ISO8859_7 string
  • const IANA_ISO8859_8 string
  • const IANA_ISO8859_8I string
  • const IANA_ISO8859_9 string
  • const IANA_KOI8R string
  • const IANA_WINDOWS1251 string
  • const IANA_WINDOWS1256 string
  • const IANA_IBM424RTL string
  • const IANA_IBM424LTR string
  • const IANA_IBM420RTL string
  • const IANA_IBM420LTR string
  • const IANA_SHIFTJIS string
  • const IANA_GBK string
  • const IANA_GB18030 string
  • const IANA_BIG5 string
  • const IANA_EUCJP string
  • const IANA_EUCKR string
  • const IANA_ISO2022JP string
  • const IANA_ISO2022KR string
  • const IANA_ISO2022CN string
Functions
  • func DetectBestCharset(bs []byte) (*DetectResult, bool)
  • func DetectAllCharsets(bs []byte) ([]*DetectResult, bool)
  • func EncodeString(encoding encoding.Encoding, s string) (string, error)
  • func DecodeString(encoding encoding.Encoding, s string) (string, error)
  • func EncodeBytes(encoding encoding.Encoding, bs []byte) ([]byte, error)
  • func DecodeBytes(encoding encoding.Encoding, bs []byte) ([]byte, error)
  • func GetEncoding(iana string) (encode encoding.Encoding, exist bool)
Methods
  • None

Documentation

Constants

const (
	IANA_UTF8    = "UTF-8"    // *
	IANA_UTF16BE = "UTF-16BE" // *
	IANA_UTF16LE = "UTF-16LE" // *
	IANA_UTF32BE = "UTF-32BE" // *
	IANA_UTF32LE = "UTF-32LE" // *

	IANA_ISO8859_1   = "ISO-8859-1"   // en, da, de, es, fr, it, nl, no, pt, sv
	IANA_ISO8859_2   = "ISO-8859-2"   // cs, hu, pl, ro
	IANA_ISO8859_5   = "ISO-8859-5"   // ru
	IANA_ISO8859_6   = "ISO-8859-6"   // ar
	IANA_ISO8859_7   = "ISO-8859-7"   // el
	IANA_ISO8859_8   = "ISO-8859-8"   // he
	IANA_ISO8859_8I  = "ISO-8859-8-I" // he
	IANA_ISO8859_9   = "ISO-8859-9"   // tr
	IANA_KOI8R       = "KOI8-R"       // ru
	IANA_WINDOWS1251 = "windows-1251" // ru
	IANA_WINDOWS1256 = "windows-1256" // ar
	IANA_IBM424RTL   = "IBM424_rtl"   // he
	IANA_IBM424LTR   = "IBM424_ltr"   // he
	IANA_IBM420RTL   = "IBM420_rtl"   // ar
	IANA_IBM420LTR   = "IBM420_ltr"   // ar

	IANA_SHIFTJIS  = "Shift_JIS"   // ja
	IANA_GBK       = "GBK"         // zh
	IANA_GB18030   = "GB18030"     // zh
	IANA_BIG5      = "Big5"        // zh
	IANA_EUCJP     = "EUC-JP"      // ja
	IANA_EUCKR     = "EUC-KR"      // ko
	IANA_ISO2022JP = "ISO-2022-JP" // ja
	IANA_ISO2022KR = "ISO-2022-KR" // ko
	IANA_ISO2022CN = "ISO-2022-CN" // zh
)

See https://github.com/saintfish/chardet/blob/master/detector.go and https://www.iana.org/assignments/charset-reg/charset-reg.xhtml.

Variables

This section is empty.

Functions

func DecodeBytes

func DecodeBytes(encoding encoding.Encoding, bs []byte) ([]byte, error)

DecodeBytes decodes bytes from the given encoding to UTF-8.

func DecodeString

func DecodeString(encoding encoding.Encoding, s string) (string, error)

DecodeString decodes a string from the given encoding to UTF-8.
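
To make the direction of decoding concrete, here is a minimal sketch that uses only the standard library and does not call xcharset. It handles ISO-8859-1, where every byte maps to the Unicode code point with the same value. The helper name decodeLatin1 is hypothetical and is not part of this package.

```go
package main

import "fmt"

// decodeLatin1 is a hypothetical helper (not part of xcharset). It converts
// ISO-8859-1 bytes into a UTF-8 Go string: each ISO-8859-1 byte maps to the
// Unicode code point with the same numeric value.
func decodeLatin1(bs []byte) string {
	runes := make([]rune, len(bs))
	for i, b := range bs {
		runes[i] = rune(b)
	}
	return string(runes)
}

func main() {
	// 0xE9 is "é" in ISO-8859-1; on its own it is not valid UTF-8.
	fmt.Println(decodeLatin1([]byte{'c', 'a', 'f', 0xE9}))
}
```

With the real package, the equivalent call would presumably pass an encoding.Encoding for ISO-8859-1 to DecodeBytes or DecodeString.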

func EncodeBytes

func EncodeBytes(encoding encoding.Encoding, bs []byte) ([]byte, error)

EncodeBytes encodes bytes to the given encoding.

func EncodeString

func EncodeString(encoding encoding.Encoding, s string) (string, error)

EncodeString encodes a string to the given encoding.
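
Encoding goes the other way: UTF-8 text is turned into bytes of the target charset. The sketch below illustrates this for UTF-16LE using only the standard library. The helper encodeUTF16LE is hypothetical, writes no byte order mark, and is not how xcharset is implemented.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// encodeUTF16LE is a hypothetical helper (not part of xcharset). It converts
// a Go string (UTF-8) into UTF-16LE bytes without a byte order mark.
func encodeUTF16LE(s string) []byte {
	units := utf16.Encode([]rune(s)) // UTF-16 code units, including surrogate pairs
	out := make([]byte, 0, len(units)*2)
	for _, u := range units {
		out = append(out, byte(u), byte(u>>8)) // low byte first: little endian
	}
	return out
}

func main() {
	// Print the encoded bytes of "hé" in hexadecimal.
	fmt.Printf("% x\n", encodeUTF16LE("hé"))
}
```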

func GetEncoding

func GetEncoding(iana string) (encode encoding.Encoding, exist bool)

GetEncoding returns the encoding.Encoding for the given IANA or MIME name. The exist result reports whether a matching encoding was found.
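
The (value, exist) return shape is the usual Go lookup idiom. The sketch below shows that pattern with a small hypothetical registry built on the standard library only. The names decodeFunc, registry and lookupDecoder are invented for illustration, and the case-insensitive, whitespace-trimming match is an assumption, not documented behavior of GetEncoding.

```go
package main

import (
	"fmt"
	"strings"
)

// decodeFunc is a hypothetical stand-in for encoding.Encoding.
type decodeFunc func(bs []byte) string

// registry maps lower-cased charset names to decoders (illustrative only).
var registry = map[string]decodeFunc{
	"utf-8": func(bs []byte) string { return string(bs) },
	"iso-8859-1": func(bs []byte) string {
		runes := make([]rune, len(bs))
		for i, b := range bs {
			runes[i] = rune(b)
		}
		return string(runes)
	},
}

// lookupDecoder mirrors the shape of GetEncoding: it returns the decoder
// and a flag telling whether the name was known. Normalizing the name
// before the lookup is an assumption made for this sketch.
func lookupDecoder(name string) (decode decodeFunc, exist bool) {
	decode, exist = registry[strings.ToLower(strings.TrimSpace(name))]
	return decode, exist
}

func main() {
	if decode, ok := lookupDecoder("ISO-8859-1"); ok {
		fmt.Println(decode([]byte{0xFC, 'b', 'e', 'r'})) // 0xFC is "ü" in ISO-8859-1
	}
	if _, ok := lookupDecoder("no-such-charset"); !ok {
		fmt.Println("unknown charset")
	}
}
```

Checking the boolean before using the returned value avoids calling into a nil encoding when the name is not recognized.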

Types

type DetectResult

type DetectResult struct {
	// Charset represents IANA or MIME name of the detected charset.
	Charset string

	// Language represents IANA name of the detected language. It may be empty for some charsets.
	Language string

	// Confidence represents the confidence of the result. Scale from 1 to 100.
	Confidence int
}

DetectResult contains the result of a charset detection. See chardet.Result.

func DetectAllCharsets added in v1.5.0

func DetectAllCharsets(bs []byte) ([]*DetectResult, bool)

DetectAllCharsets detects the charset of bytes and returns all candidate charsets in descending order of confidence.

func DetectBestCharset added in v1.5.0

func DetectBestCharset(bs []byte) (*DetectResult, bool)

DetectBestCharset detects the charset of bytes and returns the result with the highest confidence.
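
The relationship between the two detection functions can be sketched as ordering candidates by confidence and taking the top one. The code below uses a local result type that copies the fields of DetectResult. The type and the helpers sortByConfidence and best are hypothetical, and the candidate values are made up, not real detector output.

```go
package main

import (
	"fmt"
	"sort"
)

// result mirrors the fields of DetectResult, for illustration only.
type result struct {
	Charset    string
	Language   string
	Confidence int
}

// sortByConfidence orders results in place from most to least confident.
func sortByConfidence(rs []*result) {
	sort.SliceStable(rs, func(i, j int) bool {
		return rs[i].Confidence > rs[j].Confidence
	})
}

// best returns the most confident result, or false when there is none.
func best(rs []*result) (*result, bool) {
	if len(rs) == 0 {
		return nil, false
	}
	top := rs[0]
	for _, r := range rs[1:] {
		if r.Confidence > top.Confidence {
			top = r
		}
	}
	return top, true
}

func main() {
	// Made-up candidates, as a detector might report them.
	rs := []*result{
		{Charset: "ISO-8859-1", Language: "fr", Confidence: 40},
		{Charset: "UTF-8", Confidence: 90},
	}
	sortByConfidence(rs)
	if top, ok := best(rs); ok {
		fmt.Println(top.Charset, top.Confidence)
	}
}
```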
