Documentation ¶
Overview ¶
Package bay implements a Naive Bayes classifier.
Index ¶
- func CSV(filename string) [][]string
- func GetCdC(DATA []TD) []int
- func GetCdFt(DATA []TD, include string, exclude []string) []string
- func GetExcFt(filename string) []string
- func GetInclFt(filename string) string
- func InfFtWd(DATA []TD, CLASSES []int, fts []string, howmany int) []string
- func InfTop5Ft(DATA []TD, CLASSES []int, fts []string) []string
- func JtProbFC(DATA []TD, ftw string, klass int) float64
- func JtProbNFC(DATA []TD, ftw string, klass int) float64
- func MutInfByFt(DATA []TD, CLASSES []int, ftw string) float64
- func NBC(DATA []TD, include string, exclude []string, str string) int
- func Print(DATA []TD, include string, exclude []string, str string)
- func ProbByC(DATA []TD, klass int) float64
- func ProbByF(DATA []TD, ftw string) float64
- func ProbByFC(DATA []TD, ftw string, klass int) float64
- func ProbByNF(DATA []TD, ftw string) float64
- func TotalWt(DATA []TD) int
- func WtByC(DATA []TD, klass int) int
- func WtByF(DATA []TD, ftw string) int
- func WtByFC(DATA []TD, ftw string, klass int) int
- func WtByNF(DATA []TD, ftw string) int
- func WtByNFC(DATA []TD, ftw string, klass int) int
- type TD
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func CSV ¶
CSV reads data from a CSV file. In the returned [][]string, the first index is the row and the second index is the column, so len(output) is the total number of rows. Use the following loop to traverse every row and read only the first column:

	for i := 0; i < len(output); i++ {
		_ = output[i][0] // first column of row i
	}
func GetCdC ¶
GetCdC gets the candidate classes from the training data. We assume that each class string consists of only one word.
func GetCdFt ¶
GetCdFt extracts the candidate feature words from the training data and the feature range data. This is the step before mutual information filtering. For example, it retrieves useful words such as simple, easy, like, and hate. All raw data must already be processed before this function is called; the function only extracts the raw feature data. The more informative words are selected later with mutual information.
func GetExcFt ¶
GetExcFt imports the "exclude" feature candidate range data from a CSV file. The amount of data is relatively small, so it is meant to be used with a linear search.
func GetInclFt ¶
GetInclFt imports the "include" feature candidate range data from a CSV file. The file may be large, so the data is returned as a single string; searching it with strings.Contains should be faster.
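The two lookups described for GetExcFt and GetInclFt can be sketched as follows. This is an assumption about how the imported data might be consumed, not the package's code; isCandidate and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// isCandidate is a hypothetical helper, not part of package bay. It keeps a
// word when the word occurs in the include string (substring test) and is
// absent from the exclude slice (linear search).
func isCandidate(word, include string, exclude []string) bool {
	if !strings.Contains(include, word) {
		return false
	}
	for _, ex := range exclude {
		if ex == word {
			return false
		}
	}
	return true
}

func main() {
	// Sample data invented for this sketch.
	include := "simple easy like hate the"
	exclude := []string{"the", "a"}
	for _, w := range []string{"like", "the", "zebra"} {
		fmt.Println(w, isCandidate(w, include, exclude))
	}
}
```

Note that strings.Contains matches substrings, so a fragment such as "eas" would also pass the include test in this sketch.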
func JtProbNFC ¶
JtProbNFC returns the joint probability of a non-feature and a class, that is, the probability that the feature word is absent and the class is present: P(¬Feature ∩ Class)
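One way to estimate this joint probability from weighted training data is sketched below. It assumes that probabilities are ratios of summed Weight values and that a feature word matches a whole whitespace-separated token; neither assumption is confirmed by the documentation. The names hasWord and jointNotFeatureClass are hypothetical, and TD is redeclared locally to mirror the documented struct.

```go
package main

import (
	"fmt"
	"strings"
)

// TD mirrors the training-data struct documented under Types.
type TD struct {
	Class  int
	Weight int
	Text   string
}

// hasWord reports whether w is one of the whitespace-separated tokens of text.
func hasWord(text, w string) bool {
	for _, tok := range strings.Fields(text) {
		if tok == w {
			return true
		}
	}
	return false
}

// jointNotFeatureClass is a hypothetical estimate of P(¬Feature ∩ Class):
// the weight of rows in class c that lack w, divided by the total weight.
func jointNotFeatureClass(data []TD, w string, c int) float64 {
	total, match := 0, 0
	for _, d := range data {
		total += d.Weight
		if d.Class == c && !hasWord(d.Text, w) {
			match += d.Weight
		}
	}
	if total == 0 {
		return 0
	}
	return float64(match) / float64(total)
}

func main() {
	// Sample data invented for this sketch; weights follow 10 * class.
	data := []TD{
		{Class: 2, Weight: 20, Text: "i like it"},
		{Class: 2, Weight: 20, Text: "simple and easy"},
		{Class: 1, Weight: 10, Text: "i hate it"},
		{Class: 1, Weight: 10, Text: "not like this"},
	}
	// 20 of the 60 weight units are in class 2 and lack "like".
	fmt.Println(jointNotFeatureClass(data, "like", 2))
}
```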
func MutInfByFt ¶
MutInfByFt calculates the mutual information between a feature word and the classes, which is used to detect informative features. For example, it returns a higher value for "like" than for "the".
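The standard definition of mutual information between a binary feature and a class variable sums P(f, c) · log2(P(f, c) / (P(f) · P(c))) over feature present or absent and over every class. The sketch below applies that definition to weighted training data. Whether the package uses this exact formula, base-2 logarithms, or token matching is an assumption; mutualInfo and hasWord are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// TD mirrors the training-data struct documented under Types.
type TD struct {
	Class  int
	Weight int
	Text   string
}

// hasWord reports whether w is one of the whitespace-separated tokens of text.
func hasWord(text, w string) bool {
	for _, tok := range strings.Fields(text) {
		if tok == w {
			return true
		}
	}
	return false
}

// mutualInfo is a hypothetical estimate of I(Feature; Class) in bits,
// with probabilities taken as ratios of summed weights.
func mutualInfo(data []TD, classes []int, w string) float64 {
	total, withF := 0.0, 0.0
	byClass := map[int]float64{}  // weight per class
	byClassF := map[int]float64{} // weight per class where w is present
	for _, d := range data {
		wt := float64(d.Weight)
		total += wt
		byClass[d.Class] += wt
		if hasWord(d.Text, w) {
			withF += wt
			byClassF[d.Class] += wt
		}
	}
	if total == 0 {
		return 0
	}
	mi := 0.0
	for _, c := range classes {
		pc := byClass[c] / total
		// Each cell holds {joint probability, feature marginal}:
		// first for the feature present, then for the feature absent.
		cells := [2][2]float64{
			{byClassF[c] / total, withF / total},
			{(byClass[c] - byClassF[c]) / total, (total - withF) / total},
		}
		for _, cell := range cells {
			joint, pf := cell[0], cell[1]
			if joint > 0 { // a zero joint probability contributes nothing
				mi += joint * math.Log2(joint/(pf*pc))
			}
		}
	}
	return mi
}

func main() {
	// Sample data invented for this sketch; weights follow 10 * class.
	data := []TD{
		{Class: 2, Weight: 20, Text: "i like the movie"},
		{Class: 2, Weight: 20, Text: "like the plot"},
		{Class: 1, Weight: 10, Text: "hate the movie"},
		{Class: 1, Weight: 10, Text: "hate the ending"},
	}
	classes := []int{1, 2}
	fmt.Println("like:", mutualInfo(data, classes, "like"))
	fmt.Println("the: ", mutualInfo(data, classes, "the"))
}
```

In this sample, "the" occurs in every row and therefore carries no information about the class, while "like" occurs only in class 2 rows and scores higher.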
func ProbByFC ¶
ProbByFC returns the conditional probability of a feature given a class: P(Feature | Class). For example, use this to get P("like"|+).
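A conditional probability of this form can be estimated as the weight of rows that contain the feature word within a class, divided by the total weight of that class. The sketch below makes that assumption, applies no smoothing, and matches whole tokens; the names weightByFC, weightByC, and condProb are hypothetical and are not the package's functions.

```go
package main

import (
	"fmt"
	"strings"
)

// TD mirrors the training-data struct documented under Types.
type TD struct {
	Class  int
	Weight int
	Text   string
}

// hasWord reports whether w is one of the whitespace-separated tokens of text.
func hasWord(text, w string) bool {
	for _, tok := range strings.Fields(text) {
		if tok == w {
			return true
		}
	}
	return false
}

// weightByC sums the weights of all rows in class c.
func weightByC(data []TD, c int) int {
	sum := 0
	for _, d := range data {
		if d.Class == c {
			sum += d.Weight
		}
	}
	return sum
}

// weightByFC sums the weights of rows in class c whose text contains w.
func weightByFC(data []TD, w string, c int) int {
	sum := 0
	for _, d := range data {
		if d.Class == c && hasWord(d.Text, w) {
			sum += d.Weight
		}
	}
	return sum
}

// condProb estimates P(Feature | Class) as weight(F, C) / weight(C).
// It returns 0 for a class with no weight and applies no smoothing.
func condProb(data []TD, w string, c int) float64 {
	wc := weightByC(data, c)
	if wc == 0 {
		return 0
	}
	return float64(weightByFC(data, w, c)) / float64(wc)
}

func main() {
	// Sample data invented for this sketch; weights follow 10 * class.
	data := []TD{
		{Class: 2, Weight: 20, Text: "i like it"},
		{Class: 2, Weight: 20, Text: "like this a lot"},
		{Class: 2, Weight: 20, Text: "simple and easy"},
		{Class: 1, Weight: 10, Text: "i hate it"},
		{Class: 1, Weight: 10, Text: "hate this"},
		{Class: 1, Weight: 10, Text: "not like this"},
	}
	fmt.Println("P(like | 2):", condProb(data, "like", 2)) // 40 of 60 weight units
	fmt.Println("P(like | 1):", condProb(data, "like", 1)) // 10 of 30 weight units
}
```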
func WtByFC ¶
WtByFC returns the total weight of the training data that matches both the class and the feature word. For example, use this to get the weight of "like" in the "positive" class.
Types ¶
type TD ¶
type TD struct {
	// consider every string in lower case
	// if not, convert it to lower case
	// class = positive, negative like sentiment
	// for the purpose of multiple classes
	// we use integer format
	// for example, the preference degree as class
	// will span from 1 to 10; 10 is the most preferred
	// weight values are 10 * class
	Class  int
	Weight int
	Text   string
}
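The conventions in the struct comments (lower-case text, integer classes, weight equal to 10 times the class) can be captured in a small constructor. This is only a sketch: newTD is a hypothetical helper that the package does not provide, and TD is redeclared locally so the example stands alone.

```go
package main

import (
	"fmt"
	"strings"
)

// TD mirrors the documented training-data struct.
type TD struct {
	Class  int
	Weight int
	Text   string
}

// newTD is a hypothetical constructor that follows the struct comments:
// the text is lower-cased and the weight is set to 10 * class.
func newTD(class int, text string) TD {
	return TD{
		Class:  class,
		Weight: 10 * class,
		Text:   strings.ToLower(text),
	}
}

func main() {
	// Preference degrees as classes, from 1 (least) to 10 (most preferred).
	data := []TD{
		newTD(10, "I LIKE it"),
		newTD(1, "I hate it"),
	}
	for _, d := range data {
		fmt.Printf("%+v\n", d)
	}
}
```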