still

v0.1.0
Published: Sep 2, 2015 License: MIT Imports: 8 Imported by: 0

README

Still


A command-line tool that filters out unwanted text using a statistical classifier.

Installation

For installation, execute the following command:

$ go get github.com/mitsuse/still/cmd/still

Dependencies

Still includes the following libraries as vendored packages:

Usage

Build a model

Still requires a model file to filter text. The model file contains the weights of a binary linear classifier.

To build the model, use still build:

$ still build -m model.still -e examples.json -i 3

-m specifies the output path of the built model, and -e specifies the path of the training data. The training data must be a JSON file containing a single array of objects, each with a "text" field and a "class" field, as follows:

[
  {
    "text": "Go 1.5 is released https://blog.golang.org/go1.5 #go_blog",
    "class": 1
  },
  {
    "text": "OnHub – Google https://on.google.com/hub/",
    "class": 0
  }
]

The "text" field holds the text to be classified. The "class" field holds the correct label for that text.
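The training data format can be checked from Go before a file is passed to still build. The following is a minimal sketch that decodes such an array using only the standard library. The Example struct is declared locally so the sketch is self-contained; it mirrors the Example type listed in the documentation of this package, and parseExamples is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Example mirrors the documented JSON shape: a text and its class label.
type Example struct {
	Text  string `json:"text"`
	Class int    `json:"class"`
}

// parseExamples (hypothetical helper) decodes a JSON array of examples.
func parseExamples(data string) ([]Example, error) {
	var examples []Example
	dec := json.NewDecoder(strings.NewReader(data))
	if err := dec.Decode(&examples); err != nil {
		return nil, err
	}
	return examples, nil
}

func main() {
	const data = `[
  {"text": "Go 1.5 is released https://blog.golang.org/go1.5 #go_blog", "class": 1},
  {"text": "OnHub – Google https://on.google.com/hub/", "class": 0}
]`
	examples, err := parseExamples(data)
	if err != nil {
		fmt.Println("invalid training data:", err)
		return
	}
	for _, e := range examples {
		fmt.Printf("class=%d text=%q\n", e.Class, e.Text)
	}
}
```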

Use -i to set the number of iterations. With -i N, the training data is read N times.
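To illustrate what repeated passes over the training data mean for a binary linear classifier, here is a rough sketch of a perceptron trained for a fixed number of iterations. This README does not say which learning algorithm or features Still uses, so the perceptron update rule, the whitespace tokenization, and the names example, score, predict, and train are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// example is a hypothetical training pair; class is 0 or 1.
type example struct {
	text  string
	class int
}

// score sums the weights of the whitespace-separated tokens in text.
// Tokenizing on whitespace is an assumption, not Still's feature extractor.
func score(weights map[string]float64, text string) float64 {
	sum := 0.0
	for _, tok := range strings.Fields(text) {
		sum += weights[tok]
	}
	return sum
}

// predict returns 1 when the score is positive and 0 otherwise.
func predict(weights map[string]float64, text string) int {
	if score(weights, text) > 0 {
		return 1
	}
	return 0
}

// train makes `iterations` full passes over data and adjusts the weights
// only on misclassified examples (the perceptron rule, assumed here).
func train(iterations int, data []example) map[string]float64 {
	weights := make(map[string]float64)
	for i := 0; i < iterations; i++ {
		for _, ex := range data {
			if predict(weights, ex.text) == ex.class {
				continue
			}
			delta := 1.0
			if ex.class == 0 {
				delta = -1.0
			}
			for _, tok := range strings.Fields(ex.text) {
				weights[tok] += delta
			}
		}
	}
	return weights
}

func main() {
	data := []example{
		{"go release notes", 1},
		{"buy cheap watches", 0},
	}
	weights := train(3, data)
	fmt.Println(predict(weights, "go release notes"), predict(weights, "buy cheap watches"))
}
```

More iterations give the weights more chances to correct mistakes on hard examples, at the cost of longer training.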

Test a model

Still can test the trained model on test data with the following command:

$ still test -m model.still -e examples.json

-m specifies the path of the trained model, and -e specifies the path of the test data. The test data has the same format as the training data.

The test command reports precision and recall.
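For reference, precision and recall can be computed from gold labels and predicted labels as in the sketch below. It uses the standard definitions and assumes that class 1 is the positive class and that both slices have the same length; this README does not state which class Still treats as positive, and precisionRecall is a hypothetical function name.

```go
package main

import "fmt"

// precisionRecall compares predicted labels with gold labels.
// Class 1 is treated as the positive class (an assumption).
// Both slices are assumed to have the same length.
func precisionRecall(gold, predicted []int) (precision, recall float64) {
	var tp, fp, fn int
	for i := range gold {
		switch {
		case predicted[i] == 1 && gold[i] == 1:
			tp++ // true positive
		case predicted[i] == 1 && gold[i] == 0:
			fp++ // false positive
		case predicted[i] == 0 && gold[i] == 1:
			fn++ // false negative
		}
	}
	// Precision: share of predicted positives that are correct.
	if tp+fp > 0 {
		precision = float64(tp) / float64(tp+fp)
	}
	// Recall: share of actual positives that were found.
	if tp+fn > 0 {
		recall = float64(tp) / float64(tp+fn)
	}
	return precision, recall
}

func main() {
	gold := []int{1, 1, 0, 0, 1}
	predicted := []int{1, 0, 1, 0, 1}
	p, r := precisionRecall(gold, predicted)
	fmt.Printf("precision=%.2f recall=%.2f\n", p, r)
}
```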

Filter out text

Still works as a filter on standard input and output, like grep:

$ cat input.txt | still filter -m model.still

In the command above, each line of input.txt is classified separately. Use the -f option to print the filtered-out lines instead.

License

Please read LICENSE.txt.

Documentation

Index

Constants

const (
	AlreadyInitializedError  = "AlreadyInitializedError"
	IncompatibleVersionError = "IncompatibleVersionError"
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Example

type Example struct {
	Text  string `json:"text"`
	Class int    `json:"class"`
}

type Extractor

type Extractor struct {
	// contains filtered or unexported fields
}

func (*Extractor) Dimensions

func (e *Extractor) Dimensions() int

func (*Extractor) Extract

func (e *Extractor) Extract(text string) matrix.Matrix

func (*Extractor) MarshalJSON

func (e *Extractor) MarshalJSON() ([]byte, error)

func (*Extractor) UnmarshalJSON

func (e *Extractor) UnmarshalJSON(b []byte) error

type Still

type Still struct {
	// contains filtered or unexported fields
}

func Deserialize

func Deserialize(reader io.Reader) (*Still, error)

func Learn

func Learn(iterations int, exampleSeq []*Example) *Still

func (*Still) Filter

func (s *Still) Filter(text string) bool

func (*Still) FilterAll

func (s *Still) FilterAll(inputSeq []string) []string

func (*Still) MarshalJSON

func (s *Still) MarshalJSON() ([]byte, error)

func (*Still) Serialize

func (s *Still) Serialize(writer io.Writer) error

func (*Still) UnmarshalJSON

func (s *Still) UnmarshalJSON(b []byte) error

Directories

Path Synopsis
cmd
