goawabi

package module
v0.2.2
Warning

This package is not in the latest version of its module.

Published: Feb 18, 2023 License: MIT Imports: 9 Imported by: 0

README

goawabi

goawabi is a morphological analyzer written in Go that uses MeCab dictionaries.

See also awabi, the original Rust implementation: https://github.com/nakagami/awabi

Requirements and how to install

MeCab (https://taku910.github.io/mecab/) and a compatible dictionary are required.

Debian/Ubuntu
$ sudo apt install mecab
$ sudo apt install mecab-ipadic-utf8
$ go install github.com/nakagami/goawabi/cmd/goawabi@latest
macOS (Homebrew)
$ brew install mecab
$ brew install mecab-ipadic
$ go install github.com/nakagami/goawabi/cmd/goawabi@latest

How to use

goawabi command
$ echo 'すもももももももものうち' | goawabi
すもも	名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
の	助詞,連体化,*,*,*,*,の,ノ,ノ
うち	名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
EOS
$ echo 'すもももももももものうち' | goawabi -N 2
すもも	名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
の	助詞,連体化,*,*,*,*,の,ノ,ノ
うち	名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
EOS
すもも	名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
の	助詞,連体化,*,*,*,*,の,ノ,ノ
うち	名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
EOS
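
The command output above is line-oriented: each token is a surface form, a tab, and a comma-separated feature string, and each result ends with an EOS line. With -N 2, two such results are printed back to back. The following is a minimal sketch of reading that output from another Go program. The Token type and ParseOutput function are hypothetical helpers written for this example and are not part of goawabi; with the IPADIC dictionary shown here, the first feature field is the part of speech.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Token is one output line: a surface form and its feature fields.
// Hypothetical type for this sketch, not part of the goawabi API.
type Token struct {
	Surface  string
	Features []string
}

// ParseOutput groups tokens into one slice per "EOS"-terminated result.
func ParseOutput(out string) [][]Token {
	var results [][]Token
	var cur []Token
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := sc.Text()
		if line == "EOS" {
			// End of one result; start collecting the next one.
			results = append(results, cur)
			cur = nil
			continue
		}
		surface, feature, ok := strings.Cut(line, "\t")
		if !ok {
			continue // skip lines that are not surface<TAB>feature
		}
		cur = append(cur, Token{Surface: surface, Features: strings.Split(feature, ",")})
	}
	return results
}

func main() {
	// Two lines copied from the sample output, followed by EOS.
	out := "すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ\n" +
		"も\t助詞,係助詞,*,*,*,*,も,モ,モ\n" +
		"EOS\n"
	for _, tokens := range ParseOutput(out) {
		for _, t := range tokens {
			fmt.Println(t.Surface, t.Features[0])
		}
	}
}
```

Grouping by EOS means the same parser handles both the default output and the -N output without changes.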
Use as a library

See the main package under cmd/goawabi for sample code.

See also

Documentation

Index

Constants

const MAX_GROUPING_SIZE = 24

Variables

This section is empty.

Functions

This section is empty.

Types

type DicEntry

type DicEntry struct {
	// contains filtered or unexported fields
}

type Lattice

type Lattice struct {
	// contains filtered or unexported fields
}

type Node

type Node struct {
	// contains filtered or unexported fields
}

type Tokenizer

type Tokenizer struct {
	// contains filtered or unexported fields
}

func NewTokenizer

func NewTokenizer(path string) (*Tokenizer, error)

func (*Tokenizer) Tokenize

func (tok *Tokenizer) Tokenize(str string) ([][2]string, error)
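
Tokenize returns a slice of two-element string arrays. The sketch below shows one way calling code might consume that shape. It makes two assumptions that the documentation above does not confirm: that element 0 of each pair is the surface form and element 1 is the feature string (mirroring the command output), and that a value returned by NewTokenizer can be passed wherever the small interface below is expected. To keep the example runnable without MeCab dictionaries, it uses a hypothetical stubTokenizer instead of importing goawabi.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenizer mirrors the documented Tokenize signature, so a *Tokenizer
// from this package should satisfy it.
type tokenizer interface {
	Tokenize(str string) ([][2]string, error)
}

// stubTokenizer is a hypothetical stand-in that splits on whitespace.
// It exists only so this sketch runs without a dictionary installed.
type stubTokenizer struct{}

func (stubTokenizer) Tokenize(str string) ([][2]string, error) {
	var tokens [][2]string
	for _, w := range strings.Fields(str) {
		tokens = append(tokens, [2]string{w, "*"})
	}
	return tokens, nil
}

// Surfaces returns the first element of each pair.
// Assumption: element 0 is the surface form, element 1 the feature string.
func Surfaces(tok tokenizer, text string) ([]string, error) {
	pairs, err := tok.Tokenize(text)
	if err != nil {
		return nil, err
	}
	surfaces := make([]string, 0, len(pairs))
	for _, p := range pairs {
		surfaces = append(surfaces, p[0])
	}
	return surfaces, nil
}

func main() {
	words, err := Surfaces(stubTokenizer{}, "すもも も もも")
	if err != nil {
		fmt.Println("tokenize failed:", err)
		return
	}
	fmt.Println(strings.Join(words, "|"))
}
```

In real use, the stub would be replaced by the value returned from NewTokenizer; the meaning of its path argument is not documented here, so check the sample code in cmd before relying on a particular value.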

func (*Tokenizer) TokenizeNBest

func (tok *Tokenizer) TokenizeNBest(str string, n int) ([][][2]string, error)
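
TokenizeNBest adds one level of nesting: a slice of results, each of which is a slice of pairs. As a rough illustration, the sketch below renders such a value in the same layout the goawabi command prints with -N. FormatNBest is a hypothetical helper, the input is hand-written sample data rather than real analyzer output, and the {surface, feature} ordering of each pair is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// FormatNBest writes one "surface<TAB>feature" line per token and an
// "EOS" line after each result.
// Assumption: each [2]string holds {surface, feature}.
func FormatNBest(results [][][2]string) string {
	var b strings.Builder
	for _, tokens := range results {
		for _, t := range tokens {
			fmt.Fprintf(&b, "%s\t%s\n", t[0], t[1])
		}
		b.WriteString("EOS\n")
	}
	return b.String()
}

func main() {
	// Hand-written sample in the shape TokenizeNBest returns.
	results := [][][2]string{
		{
			{"もも", "名詞,一般,*,*,*,*,もも,モモ,モモ"},
			{"も", "助詞,係助詞,*,*,*,*,も,モ,モ"},
		},
		{
			{"も", "助詞,係助詞,*,*,*,*,も,モ,モ"},
			{"もも", "名詞,一般,*,*,*,*,もも,モモ,モモ"},
		},
	}
	fmt.Print(FormatNBest(results))
}
```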

Directories

Path Synopsis
cmd
