Documentation ¶
Index ¶
- func GetEncodingFromCharsetName(name string) (e encoding.Encoding, err error)
- func IsValidUTF8(content []byte) bool
- func ToUtf8WithCharsetName(content []byte, charsetName string) ([]byte, error)
- func ToUtf8WithDecoder(content []byte, d Decoder) ([]byte, error)
- func ToUtf8WithEncoding(content []byte, e encoding.Encoding) ([]byte, error)
- type Decoder
- type Result
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func GetEncodingFromCharsetName ¶
GetEncodingFromCharsetName returns the encoding.Encoding for the given charset name (case-insensitive).
It returns errInvalidName if the package cannot find a corresponding encoding.Encoding.
Charset name reference:
https://encoding.spec.whatwg.org/#names-and-labels
http://www.iana.org/assignments/character-sets/character-sets.xhtml
func IsValidUTF8 ¶
IsValidUTF8 reports whether content is valid UTF-8.
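As a rough illustration of what a UTF-8 validity check does, the sketch below uses `utf8.Valid` from the standard library. It assumes the package's check behaves similarly, which this documentation does not confirm. The helper name `isValidUTF8` is a local stand-in.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isValidUTF8 is a local stand-in that delegates to the standard library.
// Whether the documented function does the same is an assumption.
func isValidUTF8(content []byte) bool {
	return utf8.Valid(content)
}

func main() {
	// "héllo" written as a Go string literal is well-formed UTF-8.
	fmt.Println(isValidUTF8([]byte("héllo"))) // true

	// 0xE9 is "é" in ISO-8859-1. In UTF-8 it starts a three-byte sequence,
	// and 0x6C is not a continuation byte, so the input is rejected.
	fmt.Println(isValidUTF8([]byte{0x68, 0xE9, 0x6C})) // false
}
```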
func ToUtf8WithCharsetName ¶
ToUtf8WithCharsetName returns content converted to UTF-8, using the decoder for the given charset name.
It returns errInvalidName if the charset name is not valid,
or errWrongDecoder if content cannot be decoded by the corresponding Decoder.
Charset name reference:
https://encoding.spec.whatwg.org/#names-and-labels
http://www.iana.org/assignments/character-sets/character-sets.xhtml
func ToUtf8WithDecoder ¶
ToUtf8WithDecoder returns content converted to UTF-8 using the given Decoder.
Types ¶
type Decoder ¶
type Decoder interface {
	transform.Transformer
}
Decoder embeds transform.Transformer, so any transform.Transformer satisfies it.
func GetDecoderFromCharsetName ¶
GetDecoderFromCharsetName returns the Decoder for the given charset name (case-insensitive).
It returns errInvalidName if the package cannot find a corresponding Decoder.
Charset name reference:
http://www.iana.org/assignments/character-sets/character-sets.xhtml
type Result ¶
type Result struct {
	// IANA name of the detected charset.
	Charset string
	// IANA name of the detected language. It may be empty for some charsets.
	Language string
	// Confidence of the Result. Scale from 1 to 100. The bigger, the more confident.
	Confidence int
	// a Decoder which can convert the Result.Charset to utf-8, default encoding.Nop.NewDecoder() which won't try to convert the charset.
	Decoder transform.Transformer
	// Whether the charset can be converted by this package
	Convertible bool
}
Result contains all the information that the charset detector gives.
func DetectAll ¶
DetectAll returns all chardet.Results which have non-zero Confidence. The Results are sorted by Confidence in descending order.
It behaves like chardet.NewTextDetector().DetectAll() from saintfish/chardet, but also stores the matching Decoder in each Result.
func DetectAndConvertToUtf8 ¶
DetectAndConvertToUtf8 detects the charset of content and converts the content to UTF-8.
func DetectEncoding ¶
DetectEncoding returns the Result with the highest Confidence.
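The ordering rule for DetectAll (non-zero Confidence only, sorted in descending order) and the "highest Confidence" selection of DetectEncoding can be sketched as follows. This is a standard-library illustration, not the package's code. The `result` type is a trimmed stand-in for Result without the Decoder and Convertible fields. The helpers `rankResults` and `bestResult` are hypothetical, and the candidate values are made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// result is a trimmed stand-in for the documented Result type.
type result struct {
	Charset    string
	Language   string
	Confidence int
}

// rankResults (hypothetical) keeps candidates with non-zero Confidence
// and sorts them by Confidence in descending order.
func rankResults(in []result) []result {
	out := make([]result, 0, len(in))
	for _, r := range in {
		if r.Confidence > 0 {
			out = append(out, r)
		}
	}
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Confidence > out[j].Confidence
	})
	return out
}

// bestResult (hypothetical) returns the top-ranked candidate, or false
// when no candidate has non-zero Confidence.
func bestResult(in []result) (result, bool) {
	ranked := rankResults(in)
	if len(ranked) == 0 {
		return result{}, false
	}
	return ranked[0], true
}

func main() {
	// Made-up candidates, purely for illustration.
	candidates := []result{
		{Charset: "ISO-8859-1", Language: "fr", Confidence: 40},
		{Charset: "UTF-8", Confidence: 0},
		{Charset: "windows-1252", Language: "fr", Confidence: 75},
	}
	for _, r := range rankResults(candidates) {
		fmt.Println(r.Charset, r.Confidence)
	}
	if top, ok := bestResult(candidates); ok {
		fmt.Println("best:", top.Charset)
	}
}
```

A stable sort keeps candidates with equal Confidence in their input order, which makes the output deterministic.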