charenc

Version: v0.0.1
Published: Sep 8, 2022 License: MIT Imports: 4 Imported by: 0

README

charenc

A simple character encoder implemented in Go.

The encoder converts text from ANSI, UTF8, BOM UTF8, and BOM UTF16 BE/LE into a target encoding. The supported target encodings are currently ANSI and UTF8.

Install

go get github.com/ChenYuTong10/charenc

Example

Convert text from another encoding to ANSI.

import (
    "log"
    "os"

    "github.com/ChenYuTong10/charenc"
)

func Foo() {
    stream, err := os.ReadFile("utf8.txt")
    if err != nil {
        log.Printf("read file error: %v", err)
        return
    }

    stream, err = charenc.ToAnsi(stream, "UTF8")
    if err != nil {
        log.Printf("ansi encode error: %v", err)
        return
    }

    // do anything you want
}

Convert text from another encoding to UTF8.

import (
    "log"
    "os"

    "github.com/ChenYuTong10/charenc"
)

func Foo() {
    stream, err := os.ReadFile("utf16BE.txt")
    if err != nil {
        log.Printf("read file error: %v", err)
        return
    }

    stream, err = charenc.ToUTF8(stream, "UTF-16 BE")
    if err != nil {
        log.Printf("utf8 encode error: %v", err)
        return
    }

    // do anything you want
}

Often you need to detect the encoding of a text before converting it. In that case, you can use the github.com/ChenYuTong10/chardet package together with this one.

import (
    "log"
    "os"

    "github.com/ChenYuTong10/chardet"
    "github.com/ChenYuTong10/charenc"
)

func Foo() {
    stream, err := os.ReadFile("example.txt")
    if err != nil {
        log.Printf("read file error: %v", err)
        return
    }

    // Detector comes from the chardet package, so it must be qualified.
    d := new(chardet.Detector)
    d.Feed(stream)

    encoding := d.Encoding

    // transform encoding to ANSI
    stream, err = charenc.ToAnsi(stream, encoding)
    if err != nil {
        log.Printf("ansi encode error: %v", err)
        return
    }

    // do anything you want
}

Documentation

Functions

func AnsiToUTF8

func AnsiToUTF8(stream []byte) ([]byte, error)

AnsiToUTF8 transforms encoding from Ansi to UTF8.

func BomUTF8ToAnsi

func BomUTF8ToAnsi(stream []byte) ([]byte, error)

BomUTF8ToAnsi transforms stream encoding from BomUTF8 to Ansi.

func BomUTF8ToUTF8

func BomUTF8ToUTF8(stream []byte) ([]byte, error)

BomUTF8ToUTF8 strips the three-byte BOM prefix from the start of the stream.
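
As a rough illustration of what removing a UTF-8 BOM involves, here is a minimal sketch using only the standard library. The helper name stripUTF8BOM is hypothetical and is not part of charenc. Unlike the description above, it leaves the stream unchanged when no BOM is present, which is an assumption about the desired behavior rather than a statement about how BomUTF8ToUTF8 works.

```go
package main

import (
	"bytes"
	"fmt"
)

// utf8BOM is the three-byte UTF-8 byte order mark (EF BB BF).
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// stripUTF8BOM is a hypothetical helper, not part of charenc. It returns
// the stream without a leading UTF-8 BOM, or unchanged if there is none.
func stripUTF8BOM(stream []byte) []byte {
	return bytes.TrimPrefix(stream, utf8BOM)
}

func main() {
	in := append([]byte{0xEF, 0xBB, 0xBF}, "hello"...)
	fmt.Printf("%q\n", stripUTF8BOM(in)) // prints "hello"
}
```

Checking for the prefix before cutting avoids silently dropping three bytes of real text when the input was mislabeled.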

func ToAnsi

func ToAnsi(stream []byte, encoding string) ([]byte, error)

ToAnsi dispatches to the appropriate handler according to the encoding. If the stream is already ANSI, it is returned unchanged. If the encoding is not one of ANSI, UTF8, BOM UTF8, or UTF16 BE/LE, it returns an error.

func ToUTF8

func ToUTF8(stream []byte, encoding string) ([]byte, error)

ToUTF8 dispatches to the appropriate handler according to the encoding. If the stream is already UTF8, it is returned unchanged. If the encoding is not one of ANSI, UTF8, BOM UTF8, or UTF16 BE/LE, it returns an error.

func UTF16BEToAnsi

func UTF16BEToAnsi(stream []byte) ([]byte, error)

UTF16BEToAnsi transforms stream encoding from UTF16BE to Ansi.

func UTF16BEToUTF8

func UTF16BEToUTF8(stream []byte) ([]byte, error)

UTF16BEToUTF8 transforms encoding from UTF16 BE to UTF8.

func UTF16LEToAnsi

func UTF16LEToAnsi(stream []byte) ([]byte, error)

UTF16LEToAnsi transforms stream encoding from UTF16LE to Ansi.

func UTF16LEToUTF8

func UTF16LEToUTF8(stream []byte) ([]byte, error)

UTF16LEToUTF8 transforms encoding from UTF16 LE to UTF8.

func UTF8ToAnsi

func UTF8ToAnsi(stream []byte) ([]byte, error)

UTF8ToAnsi transforms stream encoding from UTF8 to Ansi.

Types

type UnsupportedEncoding

type UnsupportedEncoding string

func (UnsupportedEncoding) Error

func (e UnsupportedEncoding) Error() string
