Documentation ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type CountVectoriser1 ¶
type CountVectoriser1 struct {
	Vocabulary map[string]int
	// contains filtered or unexported fields
}
func NewCountVectoriser1 ¶
func NewCountVectoriser1(removeStopwords bool) *CountVectoriser1
func (*CountVectoriser1) Fit ¶
func (v *CountVectoriser1) Fit(train ...string) *CountVectoriser1
func (*CountVectoriser1) FitTransform ¶
func (v *CountVectoriser1) FitTransform(docs ...string) (*mat64.Dense, error)
type CountVectoriser2 ¶
type CountVectoriser2 struct {
	Vocabulary map[string]int
	// contains filtered or unexported fields
}
func NewCountVectoriser2 ¶
func NewCountVectoriser2(removeStopwords bool) *CountVectoriser2
func (*CountVectoriser2) Fit ¶
func (v *CountVectoriser2) Fit(train ...string) *CountVectoriser2
func (*CountVectoriser2) FitTransform ¶
func (v *CountVectoriser2) FitTransform(docs ...string) (*mat64.Dense, error)
type CountVectoriser3 ¶
type CountVectoriser3 struct {
	Vocabulary map[string]int
	// contains filtered or unexported fields
}
func NewCountVectoriser3 ¶
func NewCountVectoriser3(removeStopwords bool) *CountVectoriser3
func (*CountVectoriser3) Fit ¶
func (v *CountVectoriser3) Fit(train ...string) *CountVectoriser3
func (*CountVectoriser3) FitTransform ¶
func (v *CountVectoriser3) FitTransform(docs ...string) (*mat64.Dense, error)
type DOKCountVectoriser1 ¶
type DOKCountVectoriser1 struct {
	Vocabulary map[string]int
	// contains filtered or unexported fields
}
func NewDOKCountVectoriser1 ¶
func NewDOKCountVectoriser1(removeStopwords bool) *DOKCountVectoriser1
func (*DOKCountVectoriser1) Fit ¶
func (v *DOKCountVectoriser1) Fit(train ...string) *DOKCountVectoriser1
func (*DOKCountVectoriser1) FitTransform ¶
func (v *DOKCountVectoriser1) FitTransform(docs ...string) (*sparse.DOK, error)
type SparseTfidfTransformer ¶
type SparseTfidfTransformer struct {
// contains filtered or unexported fields
}
func (*SparseTfidfTransformer) Fit ¶
func (t *SparseTfidfTransformer) Fit(mat mat64.Matrix) *SparseTfidfTransformer
func (*SparseTfidfTransformer) FitTransform ¶
type TfidfTransformer1 ¶
type TfidfTransformer1 struct {
// contains filtered or unexported fields
}
func (*TfidfTransformer1) Fit ¶
func (t *TfidfTransformer1) Fit(mat mat64.Matrix) Transformer
func (*TfidfTransformer1) FitTransform ¶
type TfidfTransformer2 ¶
type TfidfTransformer2 struct {
// contains filtered or unexported fields
}
TfidfTransformer takes a raw term document matrix and weights each raw term frequency according to how commonly the term occurs across all documents in the corpus. For example, a very common word like `the` is likely to occur in every document and so is weighted down. More precisely, TfidfTransformer applies a tf-idf algorithm to the matrix: each term frequency is multiplied by the term's inverse document frequency. Inverse document frequency is calculated as log(n/df), where df is the number of documents in which the term occurs and n is the total number of documents in the corpus. 1 is added to both n and df before the division to prevent division by zero, so the value actually computed is log((1+n)/(1+df)).
func NewTfidfTransformer ¶
func NewTfidfTransformer() *TfidfTransformer2
NewTfidfTransformer constructs a new TfidfTransformer.
func (*TfidfTransformer2) Fit ¶
func (t *TfidfTransformer2) Fit(mat mat64.Matrix) Transformer
Fit takes a training term document matrix, counts term occurrences across all documents and constructs an inverse document frequency transform to apply to matrices in subsequent calls to Transform().
func (*TfidfTransformer2) FitTransform ¶
FitTransform is exactly equivalent to calling Fit() followed by Transform() on the same matrix. It is a convenience for cases where separate training data is not used to fit the model, i.e. the model is fitted on the fly to the test data.
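The relationship between Fit, Transform and FitTransform might be sketched as follows. The `idfModel` type below is a hypothetical stand-in written for this example; it is not the package's transformer, it works on plain slices instead of `mat64` matrices, and its method signatures are assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// idfModel is a hypothetical stand-in for a tf-idf transformer.
type idfModel struct {
	idf []float64 // one weight per term (row), learned by Fit
}

// Fit learns one smoothed idf weight per term from a term-document matrix
// (rows are terms, columns are documents; an assumption of this sketch).
func (m *idfModel) Fit(counts [][]float64) *idfModel {
	m.idf = make([]float64, len(counts))
	for i, row := range counts {
		df := 0
		for _, c := range row {
			if c > 0 {
				df++
			}
		}
		m.idf[i] = math.Log(float64(1+len(row)) / float64(1+df))
	}
	return m
}

// Transform applies the weights learned by Fit to a matrix with the same terms.
func (m *idfModel) Transform(counts [][]float64) ([][]float64, error) {
	if len(counts) != len(m.idf) {
		return nil, fmt.Errorf("got %d terms, model was fitted on %d", len(counts), len(m.idf))
	}
	out := make([][]float64, len(counts))
	for i, row := range counts {
		out[i] = make([]float64, len(row))
		for j, c := range row {
			out[i][j] = c * m.idf[i]
		}
	}
	return out, nil
}

// FitTransform fits the model to counts and then transforms that same matrix.
func (m *idfModel) FitTransform(counts [][]float64) ([][]float64, error) {
	return m.Fit(counts).Transform(counts)
}

func main() {
	train := [][]float64{{1, 1, 1}, {0, 2, 0}}
	test := [][]float64{{1, 0}, {1, 1}}

	// Separate training data: fit once, then weight unseen documents.
	fitted, err := new(idfModel).Fit(train).Transform(test)
	fmt.Println(fitted, err)

	// No separate training data: fit on the fly to the data being transformed.
	onTheFly, err := new(idfModel).FitTransform(test)
	fmt.Println(onTheFly, err)
}
```

The two calls in `main` generally give different weights for the same `test` matrix, because the document frequencies come from different corpora.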
type TfidfTransformer3 ¶
type TfidfTransformer3 struct {
// contains filtered or unexported fields
}
func (*TfidfTransformer3) Fit ¶
func (t *TfidfTransformer3) Fit(mat mat64.Matrix) Transformer
Fit takes a training term document matrix, counts term occurrences across all documents and constructs an inverse document frequency transform to apply to matrices in subsequent calls to Transform().
func (*TfidfTransformer3) FitTransform ¶
FitTransform is exactly equivalent to calling Fit() followed by Transform() on the same matrix. It is a convenience for cases where separate training data is not used to fit the model, i.e. the model is fitted on the fly to the test data.