mockingbird

Version: v0.0.0-...-461a5da
Published: Aug 3, 2015 License: MIT Imports: 9 Imported by: 0

README

Mockingbird

Introduction

Linguist's Classifier in Go.

Linguist can be used as a Go package by importing it:

import "github.com/lazywei/linguist"

It also provides a command-line interface (CLI) in cli/:

$ cd cli/
$ ./build.sh
$ ./mockingbird --help

Command Line Interface Usage

Preparing LIBSVM format dataset

Collect Rosetta Code

  1. Clone the RosettaCodeData repository:
git clone git@github.com:acmeism/RosettaCodeData.git
  2. Build the CLI executable:
cd cli/
./build.sh
  3. Run collectRosetta against the cloned RosettaCodeData to collect the sample files into ../samples:
./mockingbird collectRosetta path/to/clones/RosettaCodeData ../samples

Build Bag-of-Words and Convert Samples to LIBSVM

Build from scratch

./mockingbird convertLibsvm ../samples ../

This saves libsvm.samples and bow.gob to ../. The bow.gob file holds the parameters used to construct the bag-of-words representation, so it can be reused in later conversions:

./mockingbird convertLibsvm ../samples ../ --bowPath ../bow.gob
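
To make the conversion step concrete, here is a minimal sketch of how token counts can be turned into LIBSVM lines with a shared vocabulary. It is not the package's implementation: the names Vocabulary, Index, and ToLibsvmLine are hypothetical, and the whitespace tokenization in main is only a stand-in for a real tokenizer.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Vocabulary maps each token to a 1-based feature index, as LIBSVM expects.
// (Hypothetical type, used only for this illustration.)
type Vocabulary map[string]int

// Index returns the feature index for tok, assigning the next free index
// when the token has not been seen before.
func (v Vocabulary) Index(tok string) int {
	if idx, ok := v[tok]; ok {
		return idx
	}
	idx := len(v) + 1
	v[tok] = idx
	return idx
}

// ToLibsvmLine counts the tokens of one sample and renders a LIBSVM line:
// "<label> <index>:<count> ..." with indices in ascending order.
func ToLibsvmLine(label int, tokens []string, vocab Vocabulary) string {
	counts := make(map[int]int)
	for _, tok := range tokens {
		counts[vocab.Index(tok)]++
	}

	indices := make([]int, 0, len(counts))
	for idx := range counts {
		indices = append(indices, idx)
	}
	sort.Ints(indices)

	var b strings.Builder
	fmt.Fprintf(&b, "%d", label)
	for _, idx := range indices {
		fmt.Fprintf(&b, " %d:%d", idx, counts[idx])
	}
	return b.String()
}

func main() {
	// The same vocabulary is shared across samples so that indices agree.
	vocab := Vocabulary{}
	fmt.Println(ToLibsvmLine(1, strings.Fields("func main ( ) { }"), vocab))
	fmt.Println(ToLibsvmLine(2, strings.Fields("def main ( ) :"), vocab))
}
```

Reusing one vocabulary for every sample is what keeps feature indices consistent between training and prediction data, which is presumably why the saved bow.gob can be passed back in with --bowPath.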

Train and Predict

Train

For example, train a logistic regression classifier:

./mockingbird train --sample=./test_fixture/test_samples.libsvm --solver 1

This saves a model file to $PWD/model/lr.model, which can be used for later prediction.

Full usage:

usage: mockingbird train [<flags>]

Train Classifier

Flags:
  --help            Show help (also see --help-long and --help-man).
  --sample="samples.libsvm"
                    Path for samples (in libsvm format)
  --output="model"  Path for saving trained model
  --solver=0        0 = NaiveBayes, 1 = LogisticRegression

Predict

For example, make predictions with a previously trained logistic regression classifier:

./mockingbird predict --model=./model/lr.model --data=./test_fixture/test_samples.libsvm --solver=1

Full usage:

usage: mockingbird predict --data=DATA [<flags>]

Predict via trained Classifier

Flags:
  --help       Show help (also see --help-long and --help-man).
  --model="./model/naive_bayes.gob"
               Path for loading saved model
  --data=DATA  Path for testing data (in libsvm format)
  --solver=0   0 = NaiveBayes, 1 = LogisticRegression
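
Both train and predict read their data in LIBSVM format, where each line is a label followed by sparse index:value pairs. The sketch below shows one way to parse such a line. ParseLibsvmLine is a hypothetical helper, not part of this package, and it assumes integer class labels.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ParseLibsvmLine splits "<label> <index>:<value> ..." into its label and a
// sparse feature map. Integer labels are an assumption of this sketch.
func ParseLibsvmLine(line string) (int, map[int]float64, error) {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return 0, nil, fmt.Errorf("empty line")
	}

	label, err := strconv.Atoi(fields[0])
	if err != nil {
		return 0, nil, fmt.Errorf("bad label %q: %v", fields[0], err)
	}

	features := make(map[int]float64, len(fields)-1)
	for _, f := range fields[1:] {
		parts := strings.SplitN(f, ":", 2)
		if len(parts) != 2 {
			return 0, nil, fmt.Errorf("bad feature %q", f)
		}
		idx, err := strconv.Atoi(parts[0])
		if err != nil {
			return 0, nil, fmt.Errorf("bad index in %q: %v", f, err)
		}
		val, err := strconv.ParseFloat(parts[1], 64)
		if err != nil {
			return 0, nil, fmt.Errorf("bad value in %q: %v", f, err)
		}
		features[idx] = val
	}
	return label, features, nil
}

func main() {
	label, features, err := ParseLibsvmLine("3 1:2 5:0.5")
	if err != nil {
		panic(err)
	}
	fmt.Println(label, len(features), features[1], features[5])
}
```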

Documentation

Functions

func ExtractTokens

func ExtractTokens(data string) []string

Types

type Classifier

type Classifier interface {
	Fit()
	Predict()
}

type LogisticRegression

type LogisticRegression struct {
	// contains filtered or unexported fields
}

func NewLogisticRegression

func NewLogisticRegression() *LogisticRegression

func NewLogisticRegressionFromModel

func NewLogisticRegressionFromModel(filepath string) *LogisticRegression

func (*LogisticRegression) Fit

func (lr *LogisticRegression) Fit(X, y *mat64.Dense)

func (*LogisticRegression) Predict

func (lr *LogisticRegression) Predict(X *mat64.Dense) []Prediction

func (*LogisticRegression) SaveModel

func (lr *LogisticRegression) SaveModel(filepath string)

type NaiveBayes

type NaiveBayes struct {
	// contains filtered or unexported fields
}

func NewNaiveBayes

func NewNaiveBayes() *NaiveBayes

func NewNaiveBayesFromGob

func NewNaiveBayesFromGob(gobStr string) *NaiveBayes

func (*NaiveBayes) Fit

func (nb *NaiveBayes) Fit(X, y *mat64.Dense)

func (*NaiveBayes) GetParams

func (nb *NaiveBayes) GetParams() (
	tokensTotal int,
	langsTotal int,
	langsCount map[int]int,
	tokensTotalPerLang map[int]int,
	tokenCountPerLang map[int](map[int]int))
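
The counts returned by GetParams are the kind of statistics a multinomial naive Bayes classifier needs. As a rough sketch of how such counts could be combined into a score, the function below computes an unnormalized log-posterior with add-one smoothing. This is an illustration under stated assumptions, not the package's actual Predict logic: logPosterior and vocabSize are hypothetical, and langsTotal is assumed to be the number of training samples.

```go
package main

import (
	"fmt"
	"math"
)

// logPosterior returns the unnormalized log-probability of language lang for
// a sample given as token index -> count, using add-one (Laplace) smoothing.
// Assumption: langsTotal is the total number of training samples.
func logPosterior(
	sample map[int]int,
	lang int,
	vocabSize int,
	langsTotal int,
	langsCount map[int]int,
	tokensTotalPerLang map[int]int,
	tokenCountPerLang map[int]map[int]int,
) float64 {
	// Prior: fraction of training samples written in this language.
	score := math.Log(float64(langsCount[lang]) / float64(langsTotal))

	// Smoothed likelihood of each token within this language.
	denom := float64(tokensTotalPerLang[lang] + vocabSize)
	for tok, n := range sample {
		p := float64(tokenCountPerLang[lang][tok]+1) / denom
		score += float64(n) * math.Log(p)
	}
	return score
}

func main() {
	langsCount := map[int]int{0: 1, 1: 1}
	tokensTotalPerLang := map[int]int{0: 2, 1: 2}
	tokenCountPerLang := map[int]map[int]int{
		0: {1: 2}, // language 0 only ever used token 1
		1: {2: 2}, // language 1 only ever used token 2
	}
	sample := map[int]int{1: 1}

	for lang := 0; lang < 2; lang++ {
		s := logPosterior(sample, lang, 2, 2, langsCount, tokensTotalPerLang, tokenCountPerLang)
		fmt.Printf("lang %d: %.4f\n", lang, s)
	}
}
```

Working in log space avoids underflow when many token probabilities are multiplied, and the smoothing term keeps unseen tokens from driving a language's score to negative infinity.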

func (*NaiveBayes) Predict

func (nb *NaiveBayes) Predict(X *mat64.Dense) []Prediction

func (*NaiveBayes) ToGob

func (nb *NaiveBayes) ToGob() string
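
ToGob and NewNaiveBayesFromGob suggest a round trip through Go's encoding/gob format carried in a string. The following sketch shows what such a round trip can look like with the standard library. The params struct and the toGob and fromGob helpers are hypothetical stand-ins, and whether the package encodes its parameters in exactly this way is an assumption.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// params is a hypothetical stand-in for a model's learned counts.
// Fields must be exported for encoding/gob to serialize them.
type params struct {
	TokensTotal int
	LangsCount  map[int]int
}

// toGob serializes p with encoding/gob and returns the bytes as a string.
func toGob(p params) (string, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(p); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// fromGob decodes a string previously produced by toGob.
func fromGob(s string) (params, error) {
	var p params
	err := gob.NewDecoder(bytes.NewBufferString(s)).Decode(&p)
	return p, err
}

func main() {
	s, err := toGob(params{TokensTotal: 42, LangsCount: map[int]int{0: 3, 1: 5}})
	if err != nil {
		panic(err)
	}
	p, err := fromGob(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.TokensTotal, p.LangsCount[1])
}
```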

type Prediction

type Prediction struct {
	Label    int
	Language string
	Score    float64
}

Directories

Path Synopsis
This package provides Ruby's StringScanner-like functions.
