blevejieba

Version: v1.0.9
Published: Jun 6, 2020 License: MIT Imports: 14 Imported by: 0

README

blevejieba

A bleve plugin based on gojieba.

Documentation

Overview

* blevejieba is a Chinese word segmentation plugin for bleve, built on gojieba.

Index

Constants

const Name = "jieba"

Name is the jieba analyzer/tokenizer name.

Variables

This section is empty.

Functions

func AnalyzerConstructor

func AnalyzerConstructor(config map[string]interface{}, cache *registry.Cache) (*analysis.Analyzer, error)

func Dotoken added in v1.0.6

func Dotoken(word string) ([]gojieba.Word, int)

Dotoken performs word segmentation on word.

func IsChinese added in v1.0.6

func IsChinese(str string) bool

IsChinese reports whether str contains Chinese characters.
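
The source does not show how IsChinese is implemented. As a minimal sketch of the same kind of check, assuming "Chinese" means any rune in the Unicode Han script, a stand-alone helper might look like the following. The name containsHan is hypothetical and is not part of this package.

```go
package main

import (
	"fmt"
	"unicode"
)

// containsHan is a hypothetical stand-in, not the package's IsChinese.
// It reports whether s contains at least one rune in the Unicode Han script.
func containsHan(s string) bool {
	for _, r := range s {
		if unicode.Is(unicode.Han, r) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(containsHan("bleve 中文分词"))
	fmt.Println(containsHan("bleve tokenizer"))
}
```

Ranging over the string yields runes rather than bytes, so multi-byte UTF-8 characters are checked as whole code points.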

func IsExist added in v1.0.8

func IsExist(f string) bool

func Is_date added in v1.0.6

func Is_date(date_str string) bool

Is_date reports whether date_str is in a date/time format.
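
The formats that Is_date accepts are not documented. The sketch below only illustrates one common way to run such a check with the standard library, by trying a list of layouts with time.Parse. The helper name looksLikeDate and the layout list are assumptions, not the package's behaviour.

```go
package main

import (
	"fmt"
	"time"
)

// candidateLayouts is an assumed set of accepted formats; the layouts the
// package actually recognises are not documented.
var candidateLayouts = []string{
	"2006-01-02",
	"2006/01/02",
	"2006-01-02 15:04:05",
}

// looksLikeDate is a hypothetical stand-in, not the package's Is_date.
// It reports whether s parses under any of the candidate layouts.
func looksLikeDate(s string) bool {
	for _, layout := range candidateLayouts {
		if _, err := time.Parse(layout, s); err == nil {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(looksLikeDate("2020-06-06"))
	fmt.Println(looksLikeDate("jieba"))
}
```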

func Is_price added in v1.0.6

func Is_price(price string) (string, bool)

Is_price reports whether price is a price.

func JiebaTokenizerConstructor

func JiebaTokenizerConstructor(config map[string]interface{}, cache *registry.Cache) (
	analysis.Tokenizer, error)

JiebaTokenizerConstructor creates a JiebaTokenizer. Parameter config can contain the following parameters:

dict_path: optional, the path of the dictionary file.
hmm_path: optional, the path of the Hidden Markov Model file, see NewJiebaTokenizer for details.
userdict_path: optional, specify user dict file path
idf_path: optional, specify idf file path
stopdict_path: optional, specify user stop dict file path
is_search: optional, specify whether to use search mode, see NewJiebaTokenizer for details.
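
As an illustration of the keys listed above, the following sketch assembles a config map of the shape JiebaTokenizerConstructor takes. The helper tokenizerConfig and the file paths are hypothetical, and the value types (string for paths, bool for is_search) are assumptions, since the source does not state them.

```go
package main

import "fmt"

// tokenizerConfig is a hypothetical helper that builds a config map using
// the documented keys. Empty paths are omitted, since every key is optional.
// Value types are assumed: strings for paths, a bool for is_search.
func tokenizerConfig(dictPath, hmmPath, userDictPath, idfPath, stopDictPath string, isSearch bool) map[string]interface{} {
	cfg := map[string]interface{}{"is_search": isSearch}
	paths := map[string]string{
		"dict_path":     dictPath,
		"hmm_path":      hmmPath,
		"userdict_path": userDictPath,
		"idf_path":      idfPath,
		"stopdict_path": stopDictPath,
	}
	for key, p := range paths {
		if p != "" {
			cfg[key] = p
		}
	}
	return cfg
}

func main() {
	// The path below is a placeholder, not a file shipped with the package.
	cfg := tokenizerConfig("dict/jieba.dict.utf8", "", "", "", "", true)
	fmt.Println(len(cfg), cfg["dict_path"], cfg["is_search"])
}
```

A map built this way could then be passed as the config argument, together with a *registry.Cache, to JiebaTokenizerConstructor.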

func NewGoJiebaIndexMapping

func NewGoJiebaIndexMapping(opt *Options) (mapping.IndexMapping, error)

func NewJiebaTokenizer

func NewJiebaTokenizer(dictFilePath, hmm, userDictPath, idfDict, stopDict string, searchMode bool) (analysis.Tokenizer, error)

func NewMemIndexWithGoJieba

func NewMemIndexWithGoJieba(opt *Options) (bleve.Index, error)

func NewStoreIndexWithGoJieba

func NewStoreIndexWithGoJieba(store string, opt *Options) (bleve.Index, error)

func OpenStoreIndexWithGoJieba added in v1.0.5

func OpenStoreIndexWithGoJieba(store string, opt *Options) (bleve.Index, error)

func Regword added in v1.0.6

func Regword(word string) string

func StopTokenFilterConstructor

func StopTokenFilterConstructor(config map[string]interface{}, cache *registry.Cache) (analysis.TokenFilter, error)

func TokenMapConstructor

func TokenMapConstructor(config map[string]interface{}, cache *registry.Cache) (analysis.TokenMap, error)

TokenMapConstructor creates a stop word token map. Parameter config can contain the following parameter:

stopdict_path: optional, user stop dict file path

Types

type JiebaTokenizer

type JiebaTokenizer struct {
	// contains filtered or unexported fields
}

JiebaTokenizer is the bleve tokenizer for gojieba.

func (*JiebaTokenizer) Tokenize

func (jt *JiebaTokenizer) Tokenize(input []byte) analysis.TokenStream

Tokenize cuts input into bleve token stream.

type Options

type Options struct {
	// contains filtered or unexported fields
}

func NewOptions

func NewOptions() *Options

func (*Options) WithHMMPath

func (o *Options) WithHMMPath(p string) *Options

func (*Options) WithIDFDictPath

func (o *Options) WithIDFDictPath(p string) *Options

func (*Options) WithJiebaDictPath

func (o *Options) WithJiebaDictPath(p string) *Options

func (*Options) WithSearch

func (o *Options) WithSearch(search bool) *Options

func (*Options) WithStopDictPath

func (o *Options) WithStopDictPath(p string) *Options

func (*Options) WithUserDictPath

func (o *Options) WithUserDictPath(p string) *Options
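
Each With method on Options returns *Options, which allows calls to be chained. Since the fields of Options are unexported, the sketch below uses a local stand-in type to show the pattern. The type options, its fields, and its methods are hypothetical and only mirror the shape of the signatures above.

```go
package main

import "fmt"

// options is a hypothetical stand-in for the package's Options type.
// The real type's fields are unexported and not documented.
type options struct {
	dictPath string
	search   bool
}

// newOptions returns an empty options value, mirroring NewOptions.
func newOptions() *options {
	return &options{}
}

// withJiebaDictPath sets the dictionary path and returns the receiver
// so that further setters can be chained.
func (o *options) withJiebaDictPath(p string) *options {
	o.dictPath = p
	return o
}

// withSearch sets the search flag and returns the receiver.
func (o *options) withSearch(search bool) *options {
	o.search = search
	return o
}

func main() {
	// The path is a placeholder, not a file shipped with the package.
	opt := newOptions().withJiebaDictPath("dict/jieba.dict.utf8").withSearch(true)
	fmt.Println(opt.dictPath, opt.search)
}
```

With the real package, the value built by such a chain would be passed to functions like NewMemIndexWithGoJieba or NewStoreIndexWithGoJieba.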
