Documentation ¶
Types ¶
type EnglishWordsCounterPQ ¶
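type EnglishWordsCounterPQ struct {
	Map      map[string]*priority.Item
	Priority *priority.Queue
	sync.Mutex
}
EnglishWordsCounterPQ tokenizes sentences and counts word frequencies. It is intended for English text and is backed by a heap (priority queue).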
func (*EnglishWordsCounterPQ) AddSentence ¶
func (c *EnglishWordsCounterPQ) AddSentence(s string)
func (*EnglishWordsCounterPQ) PopMostCommon ¶
func (c *EnglishWordsCounterPQ) PopMostCommon() string
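The heap-backed approach can be illustrated with a minimal sketch built on the standard library's container/heap. This is not the package's actual implementation: the names heapItem, maxHeap, heapCounter, and newHeapCounter are hypothetical, and the tokenization rule (lowercase, split on non-letters), the alphabetical tie-break, and the empty-string return on an empty counter are assumptions made for the example.

```go
package main

import (
	"container/heap"
	"fmt"
	"strings"
	"sync"
	"unicode"
)

// heapItem is a hypothetical stand-in for priority.Item: a word, its count,
// and its current position in the heap.
type heapItem struct {
	word  string
	count int
	index int
}

// maxHeap orders items by descending count; ties are broken alphabetically
// (an assumption, chosen so results are deterministic).
type maxHeap []*heapItem

func (h maxHeap) Len() int { return len(h) }
func (h maxHeap) Less(i, j int) bool {
	if h[i].count != h[j].count {
		return h[i].count > h[j].count
	}
	return h[i].word < h[j].word
}
func (h maxHeap) Swap(i, j int) {
	h[i], h[j] = h[j], h[i]
	h[i].index = i
	h[j].index = j
}
func (h *maxHeap) Push(x any) {
	it := x.(*heapItem)
	it.index = len(*h)
	*h = append(*h, it)
}
func (h *maxHeap) Pop() any {
	old := *h
	n := len(old)
	it := old[n-1]
	old[n-1] = nil
	*h = old[:n-1]
	return it
}

// heapCounter keeps a map from word to heap item so that an existing word
// can be bumped and re-positioned in O(log n).
type heapCounter struct {
	mu    sync.Mutex
	items map[string]*heapItem
	heap  maxHeap
}

func newHeapCounter() *heapCounter {
	return &heapCounter{items: make(map[string]*heapItem)}
}

// AddSentence lowercases s, splits it on non-letter runes, and counts each word.
func (c *heapCounter) AddSentence(s string) {
	words := strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r)
	})
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, w := range words {
		if it, ok := c.items[w]; ok {
			it.count++
			heap.Fix(&c.heap, it.index)
			continue
		}
		it := &heapItem{word: w, count: 1}
		c.items[w] = it
		heap.Push(&c.heap, it)
	}
}

// PopMostCommon removes and returns the most frequent word, or "" if empty.
func (c *heapCounter) PopMostCommon() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.heap.Len() == 0 {
		return ""
	}
	it := heap.Pop(&c.heap).(*heapItem)
	delete(c.items, it.word)
	return it.word
}

func main() {
	c := newHeapCounter()
	c.AddSentence("the quick brown fox jumps over the lazy dog")
	c.AddSentence("The dog barks")
	fmt.Println(c.PopMostCommon()) // "the" (3 occurrences)
	fmt.Println(c.PopMostCommon()) // "dog" (2 occurrences)
}
```

Storing each item's heap index is what lets AddSentence call heap.Fix instead of rebuilding the heap when a count changes.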
type EnglishWordsCounterQS ¶
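type EnglishWordsCounterQS struct {
	Map  map[string]int // [word]: idx_in_List
	List wordsList
	sync.Mutex
}
EnglishWordsCounterQS is a counter backed by a sequential list and quicksort. In the benchmark its time and memory performance are worse than those of the PQ variant.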
func (*EnglishWordsCounterQS) AddSentence ¶
func (c *EnglishWordsCounterQS) AddSentence(s string)
func (*EnglishWordsCounterQS) PopMostCommon ¶
func (c *EnglishWordsCounterQS) PopMostCommon() string
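A list-plus-sort counter along these lines might look like the sketch below. It is an illustration under stated assumptions, not the package's code: wordCount, listCounter, and newListCounter are hypothetical names, tokenization is a plain lowercase whitespace split, and the standard library's sort.Slice is swapped in for a hand-written quicksort.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// wordCount is a hypothetical list element: a word and how often it was seen.
type wordCount struct {
	word  string
	count int
}

// listCounter mirrors the Map/List layout: index maps a word to its
// position in list.
type listCounter struct {
	mu    sync.Mutex
	index map[string]int
	list  []wordCount
}

func newListCounter() *listCounter {
	return &listCounter{index: make(map[string]int)}
}

// AddSentence splits s on whitespace (an assumed tokenization) and counts
// each lowercased word.
func (c *listCounter) AddSentence(s string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, w := range strings.Fields(strings.ToLower(s)) {
		if i, ok := c.index[w]; ok {
			c.list[i].count++
			continue
		}
		c.index[w] = len(c.list)
		c.list = append(c.list, wordCount{word: w, count: 1})
	}
}

// PopMostCommon sorts the list by descending count, removes the first
// element, and rebuilds the index. It returns "" when the counter is empty.
func (c *listCounter) PopMostCommon() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if len(c.list) == 0 {
		return ""
	}
	sort.Slice(c.list, func(i, j int) bool {
		if c.list[i].count != c.list[j].count {
			return c.list[i].count > c.list[j].count
		}
		return c.list[i].word < c.list[j].word // deterministic tie-break
	})
	top := c.list[0].word
	c.list = c.list[1:]
	delete(c.index, top)
	for i, wc := range c.list {
		c.index[wc.word] = i // positions changed after the sort
	}
	return top
}

func main() {
	c := newListCounter()
	c.AddSentence("to be or not to be")
	fmt.Println(c.PopMostCommon()) // "be" (ties with "to", wins alphabetically)
	fmt.Println(c.PopMostCommon()) // "to"
}
```

Each pop here costs a full sort plus an index rebuild, which is one plausible reason a design of this kind trails a heap when words are popped repeatedly.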
type WordsCounter ¶
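type WordsCounter interface {
	// AddSentence tokenizes the sentence and counts its words.
	AddSentence(s string)
	// PopMostCommon returns the most frequent word.
	PopMostCommon() string
}
WordsCounter counts word frequencies.
Only English implementations exist so far. EnglishWordsCounterPQ is recommended because it performs better in the benchmark (about 35.2 ms/op versus 39.8 ms/op for QS, with nearly identical memory use):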
$ go test -bench=. -benchmem
goos: darwin
goarch: amd64
pkg: spotifyplaylist/nlp
cpu: Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz
BenchmarkEnglishWordCounterPQ-4    33    35152096 ns/op    7236090 B/op    57066 allocs/op
BenchmarkEnglishWordCounterQS-4    28    39783905 ns/op    7251620 B/op    57171 allocs/op
func NewEnglishWordCounterPQ ¶
func NewEnglishWordCounterPQ() WordsCounter
func NewEnglishWordCounterQS ¶
func NewEnglishWordCounterQS() WordsCounter