bay

package
v0.0.0-...-bc146ff Latest
Published: Jul 4, 2014 License: MIT Imports: 9 Imported by: 0

Documentation

Overview

Package bay implements a naive Bayes classifier.

Functions

func CSV

func CSV(filename string) [][]string

CSV reads data from a CSV file. In the returned [][]string, the first index is the row and the second index is the column, so len(output) is the total number of rows. Use the following loop to traverse all rows and read only the first column:

	for i := 0; i < len(output); i++ {
		_ = output[i][0]
	}

func GetCdC

func GetCdC(DATA []TD) []int

GetCdC gets the candidate classes from the training data. We assume that the class string consists of a single feature word (ftw).

func GetCdFt

func GetCdFt(DATA []TD, include string, exclude []string) []string

GetCdFt extracts the candidate feature words from the training data and the feature range data. It is the step that precedes mutual information filtering. For example, it retrieves useful words such as simple, easy, like, and hate. All raw data must already be processed before calling this function; it only extracts the raw feature data. The more informative words are then selected with mutual information.

func GetExcFt

func GetExcFt(filename string) []string

GetExcFt imports the "exclude" feature candidate range data from a CSV file. The data set is relatively small, so it is meant to be used with linear search.

func GetInclFt

func GetInclFt(filename string) string

GetInclFt imports the "include" feature candidate range data from a CSV file. The file may be large, so the data is returned as a single string; looking words up with strings.Contains should be faster.
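As a rough sketch of how the include string and the exclude list might be combined when picking candidate feature words, consider the following. The helpers isCandidate and candidates are hypothetical and are not part of package bay; the package's actual filtering rules may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// isCandidate is a hypothetical helper, not part of package bay.
// include is one large string of allowed words; exclude is a short list.
func isCandidate(word, include string, exclude []string) bool {
	// Linear search: acceptable because the exclude list is small.
	for _, ex := range exclude {
		if word == ex {
			return false
		}
	}
	// Substring test on the include string. This also matches partial
	// words, for example "i" inside "simple".
	return strings.Contains(include, word)
}

// candidates returns the distinct lower-cased words of text that pass
// isCandidate, in order of first appearance.
func candidates(text, include string, exclude []string) []string {
	seen := map[string]bool{}
	var out []string
	for _, w := range strings.Fields(strings.ToLower(text)) {
		if !seen[w] && isCandidate(w, include, exclude) {
			seen[w] = true
			out = append(out, w)
		}
	}
	return out
}

func main() {
	include := "simple easy like hate"
	exclude := []string{"i", "the", "a"}
	fmt.Println(candidates("I like the simple API", include, exclude)) // [like simple]
}
```

Because strings.Contains matches substrings, short words can slip through unless they are listed in exclude, which is why "i" appears there in the example.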

func InfFtWd

func InfFtWd(DATA []TD, CLASSES []int, fts []string, howmany int) []string

InfFtWd returns the howmany most informative feature words.

func InfTop5Ft

func InfTop5Ft(DATA []TD, CLASSES []int, fts []string) []string

InfTop5Ft uses mutual information to extract the most informative features.
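Selecting the most informative words amounts to ranking features by a score and keeping the top n. A minimal sketch, assuming the scores have already been computed (for instance by mutual information) and stored in a map; topN is a hypothetical helper, not part of package bay.

```go
package main

import (
	"fmt"
	"sort"
)

// topN is a hypothetical helper, not part of package bay. It returns the
// n highest-scoring features, breaking ties alphabetically.
func topN(scores map[string]float64, n int) []string {
	words := make([]string, 0, len(scores))
	for w := range scores {
		words = append(words, w)
	}
	sort.Slice(words, func(i, j int) bool {
		if scores[words[i]] != scores[words[j]] {
			return scores[words[i]] > scores[words[j]]
		}
		return words[i] < words[j]
	})
	if n < 0 {
		n = 0
	}
	if n > len(words) {
		n = len(words)
	}
	return words[:n]
}

func main() {
	// Made-up scores for illustration only.
	scores := map[string]float64{"like": 0.9, "hate": 0.8, "the": 0.01, "easy": 0.5}
	fmt.Println(topN(scores, 2)) // [like hate]
}
```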

func JtProbFC

func JtProbFC(DATA []TD, ftw string, klass int) float64

JtProbFC returns the joint probability of feature and class. P(Feature ∩ Class)

func JtProbNFC

func JtProbNFC(DATA []TD, ftw string, klass int) float64

JtProbNFC returns the joint probability of non-feature and class. P(~Feature ∩ Class)

func MutInfByFt

func MutInfByFt(DATA []TD, CLASSES []int, ftw string) float64

MutInfByFt calculates the mutual information between a feature and the classes in order to detect informative features. For example, it returns a higher value for "like" than for "the".
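For reference, the textbook mutual information between a binary feature indicator and the class sums P(f, c) * log2(P(f, c) / (P(f) * P(c))) over feature present/absent and every class. The sketch below implements that definition under two assumptions that the package does not confirm: probabilities are weight ratios, and a feature occurs when Text contains it as a substring. mutualInfo is a hypothetical name.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// TD mirrors the fields of bay.TD so the sketch is self-contained.
type TD struct {
	Class  int
	Weight int
	Text   string
}

// mutualInfo is a hypothetical sketch, not bay.MutInfByFt.
// It returns the mutual information in bits.
func mutualInfo(data []TD, classes []int, ftw string) float64 {
	var total float64
	for _, d := range data {
		total += float64(d.Weight)
	}
	if total == 0 {
		return 0
	}
	mi := 0.0
	for _, present := range []bool{true, false} {
		for _, c := range classes {
			var joint, pf, pc float64
			for _, d := range data {
				w := float64(d.Weight) / total
				match := strings.Contains(d.Text, ftw) == present
				if match {
					pf += w
				}
				if d.Class == c {
					pc += w
				}
				if match && d.Class == c {
					joint += w
				}
			}
			// Terms with zero joint probability contribute nothing.
			if joint > 0 {
				mi += joint * math.Log2(joint/(pf*pc))
			}
		}
	}
	return mi
}

func main() {
	data := []TD{
		{Class: 1, Weight: 10, Text: "i like the film"},
		{Class: 1, Weight: 10, Text: "like the cast"},
		{Class: 0, Weight: 10, Text: "hate the plot"},
		{Class: 0, Weight: 10, Text: "the worst"},
	}
	classes := []int{0, 1}
	fmt.Println(mutualInfo(data, classes, "like"), mutualInfo(data, classes, "the"))
}
```

In this made-up data, "like" appears only in class 1 and therefore scores high, while "the" appears in every row and carries no information about the class.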

func NBC

func NBC(DATA []TD, include string, exclude []string, str string) int

NBC implements a naive Bayes classifier.
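To illustrate the general technique, here is a minimal naive Bayes sketch. It is not the implementation behind NBC: the classify function, the explicit feature list, the substring test for feature occurrence, and the add-one smoothing are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// TD mirrors the fields of bay.TD so the sketch is self-contained.
type TD struct {
	Class  int
	Weight int
	Text   string
}

// classify is a hypothetical sketch, not bay.NBC. It scores each class as
// log P(class) plus the sum of log P(feature | class) over the features
// found in text, and returns the best-scoring class.
func classify(data []TD, features []string, text string) int {
	text = strings.ToLower(text)
	classWt := map[int]float64{}
	featWt := map[int]map[string]float64{}
	var total float64
	for _, d := range data {
		w := float64(d.Weight)
		total += w
		classWt[d.Class] += w
		if featWt[d.Class] == nil {
			featWt[d.Class] = map[string]float64{}
		}
		for _, f := range features {
			if strings.Contains(d.Text, f) {
				featWt[d.Class][f] += w
			}
		}
	}
	best, bestScore := 0, math.Inf(-1)
	for c, cw := range classWt {
		score := math.Log(cw / total)
		for _, f := range features {
			if strings.Contains(text, f) {
				// Add-one smoothing (an assumption) avoids log(0).
				score += math.Log((featWt[c][f] + 1) / (cw + 2))
			}
		}
		// Ties go to the smaller class so the result is deterministic.
		if score > bestScore || (score == bestScore && c < best) {
			best, bestScore = c, score
		}
	}
	return best
}

func main() {
	data := []TD{
		{Class: 1, Weight: 10, Text: "i like it"},
		{Class: 1, Weight: 10, Text: "easy and simple"},
		{Class: 0, Weight: 10, Text: "i hate it"},
		{Class: 0, Weight: 10, Text: "too hard"},
	}
	features := []string{"like", "hate", "easy"}
	fmt.Println(classify(data, features, "We like easy things"))
	fmt.Println(classify(data, features, "I hate this"))
}
```

Working in log space keeps the product of many small probabilities from underflowing, which is the usual reason for summing logarithms here.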

func Print

func Print(DATA []TD, include string, exclude []string, str string)

Print prints out the outcome.

func ProbByC

func ProbByC(DATA []TD, klass int) float64

ProbByC returns the probability of class in total cases. P(Class)

func ProbByF

func ProbByF(DATA []TD, ftw string) float64

ProbByF returns the probability of the feature in total cases. P(Feature)

func ProbByFC

func ProbByFC(DATA []TD, ftw string, klass int) float64

ProbByFC returns the conditional probability of the feature given the class, P(Feature | Class). For example, use this to get P("like"|+).

func ProbByNF

func ProbByNF(DATA []TD, ftw string) float64

ProbByNF returns the probability of feature NOT occurring. P(~Feature)

func TotalWt

func TotalWt(DATA []TD) int

TotalWt returns the total weight value.

func WtByC

func WtByC(DATA []TD, klass int) int

WtByC returns the total weight value of a given class.

func WtByF

func WtByF(DATA []TD, ftw string) int

WtByF returns the total weight value of a given feature.

func WtByFC

func WtByFC(DATA []TD, ftw string, klass int) int

WtByFC returns the total weight value by both class and feature word. For example, use this to get the weight of "like" in the "positive" class.

func WtByNF

func WtByNF(DATA []TD, ftw string) int

WtByNF returns the total weight value when the input feature does not occur. W(~"like")

func WtByNFC

func WtByNFC(DATA []TD, ftw string, klass int) int

WtByNFC returns the total weight value by class for cases where the feature word does not occur. For example, use this to get the weight of cases without "like" in the "positive" class.

Types

type TD

type TD struct {
	// Every string is treated as lower case;
	// convert it to lower case if it is not.
	// Class is a label such as positive or negative sentiment.
	// To support multiple classes, it is stored as an integer.
	// For example, a preference degree used as the class
	// 	spans from 1 to 10, where 10 is the most preferred.
	// Weight values are 10 * class.
	Class  int
	Weight int
	Text   string
}

func GetStruct

func GetStruct(filename string) []TD

GetStruct imports data from a CSV file and constructs a slice of TD values.
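The documentation does not state the column layout of the input file. Purely as an illustration, the sketch below assumes three columns in the order class, weight, text, and lower-cases the text as the TD comments suggest. toTD is a hypothetical helper, not part of package bay.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// TD mirrors the fields of bay.TD so the sketch is self-contained.
type TD struct {
	Class  int
	Weight int
	Text   string
}

// toTD is a hypothetical helper, not part of package bay. It converts
// already-parsed CSV rows into TD values.
// Assumed column order: class, weight, text.
func toTD(rows [][]string) ([]TD, error) {
	out := make([]TD, 0, len(rows))
	for i, row := range rows {
		if len(row) < 3 {
			return nil, fmt.Errorf("row %d: want 3 columns, got %d", i, len(row))
		}
		class, err := strconv.Atoi(strings.TrimSpace(row[0]))
		if err != nil {
			return nil, fmt.Errorf("row %d: class: %v", i, err)
		}
		weight, err := strconv.Atoi(strings.TrimSpace(row[1]))
		if err != nil {
			return nil, fmt.Errorf("row %d: weight: %v", i, err)
		}
		out = append(out, TD{Class: class, Weight: weight, Text: strings.ToLower(row[2])})
	}
	return out, nil
}

func main() {
	rows := [][]string{
		{"1", "10", "I LIKE it"},
		{"0", "5", "I hate it"},
	}
	data, err := toTD(rows)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", data)
}
```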
