nihongo

v0.0.0-...-77c60e4 Latest
Warning

This package is not in the latest version of its module.

Published: Nov 5, 2015 License: ISC Imports: 6 Imported by: 0

README

NihonGo

NihonGo is a Japanese text utility for the Go language.

go get github.com/dogenzaka/nihongo

Features

  • Converting between Katakana and Hiragana
  • Unicode normalization
  • Detecting Katakana / Hiragana strings in text
  • Simple Japanese tokenizer ported from TinySegmenter

Examples

import (
  "fmt"
  "github.com/dogenzaka/nihongo"
)

func TestNormalize() {
  normalized := nihongo.Normalize("ﾃｽﾄテスト＋＝")
  fmt.Println(normalized) // テストテスト+=
}

func TestToHiragana() {
  hira := nihongo.ToHiragana("テストてすと")
  fmt.Println(hira) // てすとてすと
}

func TestToKatakana() {
  kana := nihongo.ToKatakana("テストてすと")
  fmt.Println(kana) // テストテスト
}

func TestTokenize() {
  words := nihongo.Tokenize("私は人間です")
  fmt.Println(words) // ["私" "は" "人間" "です"]
}

func TestContainsHiragana() {
  nihongo.ContainsHiragana("ひらがな") // true
  nihongo.ContainsHiragana("日本語") // false
}

func TestContainsKatakana() {
  nihongo.ContainsKatakana("カタカナ") // true
  nihongo.ContainsKatakana("日本語") // false
}

License

ISC

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func ContainsHiragana

func ContainsHiragana(text string) bool

ContainsHiragana returns true when text contains at least one hiragana character.

func ContainsKatakana

func ContainsKatakana(text string) bool

ContainsKatakana returns true when text contains at least one katakana character.

func Normalize

func Normalize(text string) string

Normalize applies NFKC normalization to Japanese text. Half-width (hankaku) kana become full-width (zenkaku) kana, and full-width special characters become their half-width equivalents.
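
To illustrate one part of what NFKC does, the sketch below folds only the fullwidth ASCII block (U+FF01 to U+FF5E) down to plain ASCII using the standard library. It is a deliberately partial stand-in: it does not handle half-width kana or any other compatibility mapping. Full NFKC is available in golang.org/x/text/unicode/norm. foldFullwidthASCII is a hypothetical name, not part of this package.

```go
package main

import (
	"fmt"
	"strings"
)

// foldFullwidthASCII maps fullwidth forms U+FF01..U+FF5E to their ASCII
// counterparts U+0021..U+007E. Hypothetical helper covering only a small
// subset of NFKC; it is not the nihongo implementation.
func foldFullwidthASCII(text string) string {
	return strings.Map(func(r rune) rune {
		if r >= 0xFF01 && r <= 0xFF5E {
			return r - 0xFEE0 // constant distance between the two blocks
		}
		return r
	}, text)
}

func main() {
	fmt.Println(foldFullwidthASCII("テスト＋＝")) // テスト+=
}
```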

func ToHiragana

func ToHiragana(text string) string

ToHiragana converts all katakana text to hiragana. You should normalize text before converting.

func ToKatakana

func ToKatakana(text string) string

ToKatakana converts all hiragana text to katakana. You should normalize text before converting.
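
In Unicode, the basic katakana block sits exactly 0x60 code points above the hiragana block, so a conversion like this can be sketched as a rune shift. The code below is an assumption about how such a conversion might work, not the package's source; toHiragana and toKatakana are hypothetical lower-case names.

```go
package main

import (
	"fmt"
	"strings"
)

// kanaOffset is the distance between corresponding hiragana and katakana
// code points (e.g. U+3066 て and U+30C6 テ).
const kanaOffset = 0x60

// toHiragana shifts full-width katakana U+30A1..U+30F6 down to hiragana.
// Hypothetical sketch; half-width katakana are left untouched.
func toHiragana(text string) string {
	return strings.Map(func(r rune) rune {
		if r >= 0x30A1 && r <= 0x30F6 {
			return r - kanaOffset
		}
		return r
	}, text)
}

// toKatakana shifts hiragana U+3041..U+3096 up to full-width katakana.
func toKatakana(text string) string {
	return strings.Map(func(r rune) rune {
		if r >= 0x3041 && r <= 0x3096 {
			return r + kanaOffset
		}
		return r
	}, text)
}

func main() {
	fmt.Println(toHiragana("テストてすと")) // てすとてすと
	fmt.Println(toKatakana("テストてすと")) // テストテスト
}
```

A shift of this kind only covers the full-width blocks, which may be why normalizing first is recommended: half-width katakana live in a different range and would pass through unchanged.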

func Tokenize

func Tokenize(input string) []string

Tokenize splits a Japanese sentence into a slice of words.
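
For intuition about word segmentation only, here is a much simpler technique than TinySegmenter: splitting wherever the script changes (kanji, hiragana, katakana, other). This is a swapped-in heuristic, not the algorithm this package ports, and it fails on many real sentences. splitByScript and scriptClass are hypothetical names.

```go
package main

import (
	"fmt"
	"unicode"
)

// scriptClass buckets a rune into a coarse script category.
func scriptClass(r rune) int {
	switch {
	case unicode.Is(unicode.Han, r):
		return 1
	case unicode.Is(unicode.Hiragana, r):
		return 2
	case unicode.Is(unicode.Katakana, r):
		return 3
	default:
		return 0
	}
}

// splitByScript starts a new token whenever the script category changes.
// Naive heuristic for illustration; not TinySegmenter.
func splitByScript(input string) []string {
	var words []string
	var current []rune
	prev := -1
	for _, r := range input {
		c := scriptClass(r)
		if len(current) > 0 && c != prev {
			words = append(words, string(current))
			current = current[:0]
		}
		current = append(current, r)
		prev = c
	}
	if len(current) > 0 {
		words = append(words, string(current))
	}
	return words
}

func main() {
	fmt.Printf("%q\n", splitByScript("私は人間です")) // ["私" "は" "人間" "です"]
}
```

The heuristic happens to agree with the README example here, but it cannot separate adjacent words written in the same script, which is the kind of case a trained segmenter is meant to handle.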

Types

This section is empty.
