package corpus

Version: v0.0.0-...-ba2758a

Warning: This package is not in the latest version of its module.
Published: Nov 20, 2019 License: Apache-2.0 Imports: 14 Imported by: 0

Documentation

Constants

This section is empty.

Variables

var (
	FakeSeeker = fakeNopSeeker{
		ReadCloser: ioutil.NopCloser(bytes.NewReader([]byte(text))),
	}
)

Functions

This section is empty.

Types

type CountModelCorpus

type CountModelCorpus struct {
	// contains filtered or unexported fields
}

CountModelCorpus stores a corpus and the co-occurrence values between its words.

func NewCountModelCorpus

func NewCountModelCorpus() *CountModelCorpus

NewCountModelCorpus creates a new *CountModelCorpus.

func (*CountModelCorpus) PairsIntoGlove

func (c *CountModelCorpus) PairsIntoGlove(window int, xmax int, alpha float64, verbose bool) ([]Pair, error)

func (*CountModelCorpus) PairsIntoLexvec

func (c *CountModelCorpus) PairsIntoLexvec(window int, relationType RelationType, smooth float64, verbose bool) (PairMap, error)

func (CountModelCorpus) Parse

func (c CountModelCorpus) Parse(f io.Reader, toLower bool, minCount int, batchSize int, verbose bool) error

type CountType

type CountType int

CountType enumerates the ways co-occurrences are counted.

const (
	INCREMENT CountType = iota
	// DISTANCE weights values for co-occurrence times.
	DISTANCE
)
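To illustrate the difference between the two counting modes, the sketch below counts pairs inside a forward window. It assumes that INCREMENT adds 1 per co-occurrence and that DISTANCE adds 1/d for words d positions apart, which is the convention GloVe uses; the source does not confirm the exact weighting. The lowercase type, constants, and function are hypothetical stand-ins, not the package's identifiers.

```go
package main

import "fmt"

// countType mirrors the idea of CountType with local, hypothetical names.
type countType int

const (
	increment countType = iota // add 1 for every co-occurrence
	distance                   // add 1/d, where d is the distance in tokens (assumed)
)

// pairKey identifies an ordered pair of word indexes.
type pairKey struct{ l1, l2 int }

// cooccurrence scans ids and, for each position, pairs the word with the
// following words that lie at most window positions ahead.
func cooccurrence(ids []int, window int, ct countType) map[pairKey]float64 {
	out := make(map[pairKey]float64)
	for i := range ids {
		for d := 1; d <= window && i+d < len(ids); d++ {
			k := pairKey{ids[i], ids[i+d]}
			switch ct {
			case increment:
				out[k]++
			case distance:
				out[k] += 1 / float64(d)
			}
		}
	}
	return out
}

func main() {
	ids := []int{0, 2, 1, 0, 1}
	// The pair (0, 1) occurs once at distance 2 and once at distance 1.
	fmt.Println(cooccurrence(ids, 2, increment)[pairKey{0, 1}]) // prints 2
	fmt.Println(cooccurrence(ids, 2, distance)[pairKey{0, 1}])  // prints 1.5
}
```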

type Pair

type Pair struct {
	// L1 and L2 store the index numbers of the two co-occurring words.
	L1, L2 int
	// F stores the measure of co-occurrence, such as PMI.
	F float64
	// Coefficient stores a coefficient for weighted matrix factorization.
	Coefficient float64
}

Pair stores co-occurrence information.

type PairMap

type PairMap map[uint64]float64

PairMap stores co-occurrences.
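A map keyed by uint64 can represent a pair of word indexes by packing both into one key. The sketch below shows one such encoding. It assumes each index fits in 32 bits and that the first word occupies the high half; the package's actual key layout is not documented here, and encodePair and decodePair are hypothetical helpers.

```go
package main

import "fmt"

// encodePair is a hypothetical helper (not part of package corpus).
// It packs two non-negative word indexes, each assumed to fit in 32 bits,
// into a single uint64 key: l1 in the high half, l2 in the low half.
func encodePair(l1, l2 int) uint64 {
	return uint64(uint32(l1))<<32 | uint64(uint32(l2))
}

// decodePair reverses encodePair.
func decodePair(key uint64) (l1, l2 int) {
	return int(key >> 32), int(key & 0xFFFFFFFF)
}

func main() {
	pairs := map[uint64]float64{}
	pairs[encodePair(3, 7)]++
	pairs[encodePair(3, 7)]++
	for k, v := range pairs {
		l1, l2 := decodePair(k)
		fmt.Println(l1, l2, v) // prints "3 7 2"
	}
}
```

A single integer key avoids allocating a struct or string per pair, which matters when the map holds millions of entries.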

type RelationType

type RelationType int

RelationType enumerates the measures of relation strength between co-occurring words.

const (
	PPMI RelationType = iota
	PMI
	CO
	LOGCO
)

func (RelationType) String

func (r RelationType) String() string

String returns the name of the relation type.

type Word2vecCorpus

type Word2vecCorpus struct {
	// contains filtered or unexported fields
}

Word2vecCorpus stores a corpus.

func NewWord2vecCorpus

func NewWord2vecCorpus() *Word2vecCorpus

NewWord2vecCorpus creates a new *Word2vecCorpus.

func (*Word2vecCorpus) HuffmanTree

func (wc *Word2vecCorpus) HuffmanTree(dimension int) (map[int]*node.Node, error)

HuffmanTree builds a map of word nodes.
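As background, a Huffman tree over word frequencies gives frequent words short paths from the root, which is what hierarchical softmax in word2vec relies on. The sketch below builds such a tree with container/heap. It is a generic construction under stated assumptions: hnode, buildHuffman, and codeLengths are hypothetical names, the dimension parameter is not modeled, and the result is a root node rather than the map[int]*node.Node the method returns.

```go
package main

import (
	"container/heap"
	"fmt"
)

// hnode is a hypothetical tree node (not the package's node.Node).
type hnode struct {
	freq        int
	word        int // word index for leaves, -1 for internal nodes
	left, right *hnode
}

// nodeHeap is a min-heap of nodes ordered by frequency.
type nodeHeap []*hnode

func (h nodeHeap) Len() int            { return len(h) }
func (h nodeHeap) Less(i, j int) bool  { return h[i].freq < h[j].freq }
func (h nodeHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *nodeHeap) Push(x interface{}) { *h = append(*h, x.(*hnode)) }
func (h *nodeHeap) Pop() interface{} {
	old := *h
	n := old[len(old)-1]
	*h = old[:len(old)-1]
	return n
}

// buildHuffman repeatedly merges the two least frequent nodes until one root remains.
func buildHuffman(freqs []int) *hnode {
	h := &nodeHeap{}
	for i, f := range freqs {
		*h = append(*h, &hnode{freq: f, word: i})
	}
	heap.Init(h)
	for h.Len() > 1 {
		a := heap.Pop(h).(*hnode)
		b := heap.Pop(h).(*hnode)
		heap.Push(h, &hnode{freq: a.freq + b.freq, word: -1, left: a, right: b})
	}
	if h.Len() == 0 {
		return nil
	}
	return heap.Pop(h).(*hnode)
}

// codeLengths records the depth of every leaf, i.e. the length of its Huffman code.
func codeLengths(n *hnode, depth int, out map[int]int) {
	if n == nil {
		return
	}
	if n.left == nil && n.right == nil {
		out[n.word] = depth
		return
	}
	codeLengths(n.left, depth+1, out)
	codeLengths(n.right, depth+1, out)
}

func main() {
	root := buildHuffman([]int{5, 1, 1, 2})
	lengths := map[int]int{}
	codeLengths(root, 0, lengths)
	fmt.Println(root.freq, lengths) // prints "9 map[0:1 1:3 2:3 3:2]"
}
```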

func (Word2vecCorpus) Parse

func (c Word2vecCorpus) Parse(f io.Reader, toLower bool, minCount int, batchSize int, verbose bool) error
