golinear

package module
v1.0.0
Published: Aug 29, 2018 License: BSD-3-Clause Imports: 6 Imported by: 0

README

Introduction

golinear is a package for training and using linear classifiers in the Go programming language (golang).

Installation

To use this package, you need the liblinear library. On Mac OS X, you can install this library with Homebrew:

brew install liblinear

Ubuntu and Debian provide packages for liblinear. However, at the time of writing (July 2, 2014), these packages were severely outdated. This package requires version 1.9 or later.

The latest API-stable version (v1) can be installed with the go command:

go get gopkg.in/danieldk/golinear.v1

or included in your source code:

import "gopkg.in/danieldk/golinear.v1"

The package documentation is available at: http://godoc.org/gopkg.in/danieldk/golinear.v1

Plans

  1. Port classification to Go.
  2. Port training to Go.

We will take a pragmatic approach to porting code to Go: if the performance penalty is minor, ported code will be merged into the main branch. Otherwise, we will keep it around until the performance is good enough.

Examples

Examples for using golinear can be found at:

https://github.com/danieldk/golinear-examples

Documentation

Overview

Package golinear trains and applies linear classifiers.

The package is a binding against liblinear with a Go-ish interface. Trained models can be saved to and loaded from disk, to avoid the (potentially) costly training process.

A model is trained using a problem. A problem consists of training instances, where each training instance has a class label and a feature vector. The training procedure attempts to find one or more functions that separate the instances of two classes. This model can then predict the class of unseen instances.

Consider for instance that we would like to do sentiment analysis, using the following, humble, training corpus:

Positive: A beautiful album.
Negative: A crappy ugly album.

To represent this as a problem, we have to convert the classes (positive/negative) to integral class labels and extract features. In this case, we can simply label the classes as positive: 0, negative: 1. We will use the words as our features (a: 1, beautiful: 2, album: 3, crappy: 4, ugly: 5) and use booleans as our feature values. In other words, the sentences have the following feature vectors:

            1   2   3   4   5
          +---+---+---+---+---+
Positive: | 1 | 1 | 1 | 0 | 0 |
          +---+---+---+---+---+

          +---+---+---+---+---+
Negative: | 1 | 0 | 1 | 1 | 1 |
          +---+---+---+---+---+

We can now construct the problem using this representation:

problem := golinear.NewProblem()
problem.Add(golinear.TrainingInstance{0, golinear.FromDenseVector([]float64{1, 1, 1, 0, 0})})
problem.Add(golinear.TrainingInstance{1, golinear.FromDenseVector([]float64{1, 0, 1, 1, 1})})

The problem is used to train a linear classifier using a set of parameters to choose the type of solver, constraint violation cost, etc. We will use the default parameters, which train a L2-regularized L2-loss support vector classifier.

param := golinear.DefaultParameters()
model, err := golinear.TrainModel(param, problem)
if err != nil {
	log.Fatal(err)
}

Of course, now we would like to use this model to classify other sentences. For instance:

This is a beautiful book.

We map this sentence to the feature vector that we used during training, simply ignoring words that we did not encounter while training the model:

          +---+---+---+---+---+
????????: | 1 | 1 | 0 | 0 | 0 |
          +---+---+---+---+---+

The Predict method of the model is used to predict the label of this feature vector.

label := model.Predict(golinear.FromDenseVector([]float64{1, 1, 0, 0, 0}))

As expected, the model will predict the sentence to be positive (0).

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CrossValidation

func CrossValidation(problem *Problem, param Parameters, nFolds uint) ([]float64, error)

Perform cross validation. The instances in the problem are separated into the given number of folds. Each fold is sequentially evaluated using the model trained with the remaining folds. The slice that is returned contains the predicted instance classes.
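Since CrossValidation returns one predicted label per instance, in problem order, accuracy can be computed by comparing the returned slice against the gold labels. A minimal sketch of that comparison (the golinear call itself is elided; the `accuracy` helper is not part of the package):

```go
package main

import "fmt"

// accuracy computes the fraction of predictions that match the
// gold labels; the two slices must have equal length.
func accuracy(gold, predicted []float64) float64 {
	if len(gold) == 0 {
		return 0
	}
	correct := 0
	for i, g := range gold {
		if predicted[i] == g {
			correct++
		}
	}
	return float64(correct) / float64(len(gold))
}

func main() {
	// In real use, predicted would come from:
	//   predicted, err := golinear.CrossValidation(problem, param, 10)
	gold := []float64{0, 1, 1, 0}
	predicted := []float64{0, 1, 0, 0}
	fmt.Println(accuracy(gold, predicted)) // 0.75
}
```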

Types

type ClassWeight

type ClassWeight struct {
	Label int
	Value float64
}

type FeatureValue

type FeatureValue struct {
	Index int
	Value float64
}

Represents a feature and its value. The Index of a feature is used to uniquely identify the feature, and should start at 1.

type FeatureVector

type FeatureVector []FeatureValue

Sparse feature vector, represented as the list (slice) of non-zero features.

func FromDenseVector

func FromDenseVector(denseVector []float64) FeatureVector

Convert a dense feature vector, represented as a slice of feature values, to the sparse representation used by this package. The features will be numbered 1..len(denseVector). The following vectors will be equal:

golinear.FromDenseVector([]float64{0.2, 0.1, 0.3, 0.6})
golinear.FeatureVector{{1, 0.2}, {2, 0.1}, {3, 0.3}, {4, 0.6}}

type Model

type Model struct {
	// contains filtered or unexported fields
}

A Model holds a trained classifier and can be used to predict the class of a seen or unseen instance.

func LoadModel

func LoadModel(filename string) (*Model, error)

Load a previously saved model.

func TrainModel

func TrainModel(param Parameters, problem *Problem) (*Model, error)

Train an SVM using the given parameters and problem.

func (*Model) Bias

func (model *Model) Bias() float64

Extracts the bias of a two-class problem.

func (*Model) Labels

func (model *Model) Labels() []int

Get a slice with the class labels.

func (*Model) Predict

func (model *Model) Predict(nodes []FeatureValue) float64

Predict the label of an instance using the given model.

func (*Model) PredictDecisionValues

func (model *Model) PredictDecisionValues(nodes []FeatureValue) (float64, map[int]float64, error)

Predict the label of an instance. In contrast to Predict, it also returns the per-label decision values.

func (*Model) PredictDecisionValuesSlice

func (model *Model) PredictDecisionValuesSlice(nodes []FeatureValue) (float64, []float64, error)

Predict the label of an instance. In contrast to Predict, it also returns the per-label decision values. The PredictDecisionValues function is more user-friendly, but has the overhead of constructing a map. If you are only interested in the classes with the highest decision values, it may be better to use this function in conjunction with Labels().
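The pattern suggested here, pairing the decision-value slice with Labels() to find the strongest class, might look like the following sketch (`bestLabel` is a hypothetical helper, not part of golinear; values[i] is assumed to correspond to labels[i]):

```go
package main

import "fmt"

// bestLabel returns the label whose decision value is highest.
// It assumes labels and values are parallel slices, as returned
// by Labels() and PredictDecisionValuesSlice.
func bestLabel(labels []int, values []float64) int {
	best := 0
	for i := 1; i < len(values); i++ {
		if values[i] > values[best] {
			best = i
		}
	}
	return labels[best]
}

func main() {
	labels := []int{0, 1}          // e.g. model.Labels()
	values := []float64{0.8, -0.8} // e.g. from PredictDecisionValuesSlice
	fmt.Println(bestLabel(labels, values)) // 0
}
```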

func (*Model) PredictProbability

func (model *Model) PredictProbability(nodes []FeatureValue) (float64, map[int]float64, error)

Predict the label of an instance, given a model with probability information. This method returns the label of the predicted class and a map of class probabilities. Probability estimates are currently given for logistic regression only. If another solver is used, the probability of each class is zero.

func (*Model) PredictProbabilitySlice

func (model *Model) PredictProbabilitySlice(nodes []FeatureValue) (float64, []float64, error)

Predict the label of an instance, given a model with probability information. This method returns the label of the predicted class and a slice of class probabilities. Probability estimates are currently given for logistic regression only. If another solver is used, the probability of each class is zero.

The PredictProbability function is more user-friendly, but has the overhead of constructing a map. If you are only interested in the classes with the highest probabilities, it may be better to use this function in conjunction with Labels().

func (*Model) Save

func (model *Model) Save(filename string) error

Save the model to a file.

func (*Model) Weights

func (model *Model) Weights() []float64

Extracts the weight vector of a two-class problem.

func (*Model) WeightsMulti

func (model *Model) WeightsMulti() [][]float64

Extracts the weight vectors of a multi-class problem.

NOT IMPLEMENTED.

type Parameters

type Parameters struct {
	// The type of solver
	SolverType SolverType

	// The cost of constraints violation.
	Cost float64
	// The relative penalty for each class.
	RelCosts []ClassWeight
}

Parameters for training a linear model.

func DefaultParameters

func DefaultParameters() Parameters

type Problem

type Problem struct {
	// contains filtered or unexported fields
}

A problem is a set of instances and corresponding labels.

func NewProblem

func NewProblem() *Problem

func (*Problem) Add

func (problem *Problem) Add(trainInst TrainingInstance) error

func (*Problem) Bias

func (problem *Problem) Bias() float64

func (*Problem) Iterate

func (problem *Problem) Iterate(fun ProblemIterFunc)

Iterate over the training instances in a problem.

func (*Problem) SetBias

func (problem *Problem) SetBias(bias float64)

type ProblemIterFunc

type ProblemIterFunc func(instance *TrainingInstance) bool

Function prototype for iteration over problems. The function should return 'true' if the iteration should continue or 'false' otherwise.
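The "return false to stop" convention can be illustrated without liblinear. In this sketch, `iterate` is a stand-in for Problem.Iterate, and the simplified TrainingInstance type is local to the example:

```go
package main

import "fmt"

type TrainingInstance struct {
	Label    float64
	Features []float64
}

type ProblemIterFunc func(instance *TrainingInstance) bool

// iterate calls fun for each instance, stopping early when fun
// returns false, mirroring Problem.Iterate's contract.
func iterate(instances []TrainingInstance, fun ProblemIterFunc) {
	for i := range instances {
		if !fun(&instances[i]) {
			return
		}
	}
}

func main() {
	instances := []TrainingInstance{{Label: 0}, {Label: 1}, {Label: 1}}
	seen := 0
	iterate(instances, func(inst *TrainingInstance) bool {
		seen++
		return inst.Label == 0 // stop after the first non-zero label
	})
	fmt.Println(seen) // 2
}
```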

type SolverType

type SolverType struct {
	// contains filtered or unexported fields
}

func NewL1RL2LossSvc

func NewL1RL2LossSvc(epsilon float64) SolverType

L1-regularized L2-loss support vector classification.

func NewL1RL2LossSvcDefault

func NewL1RL2LossSvcDefault() SolverType

L1-regularized L2-loss support vector classification, epsilon = 0.01.

func NewL1RLogisticRegression

func NewL1RLogisticRegression(epsilon float64) SolverType

L1-regularized logistic regression.

func NewL1RLogisticRegressionDefault

func NewL1RLogisticRegressionDefault() SolverType

L1-regularized logistic regression, epsilon = 0.01.

func NewL2RL1LossSvRegressionDual

func NewL2RL1LossSvRegressionDual(epsilon float64) SolverType

L2-regularized L1-loss support vector regression (dual).

func NewL2RL1LossSvRegressionDualDefault

func NewL2RL1LossSvRegressionDualDefault(epsilon float64) SolverType

L2-regularized L1-loss support vector regression (dual), epsilon = 0.1.

func NewL2RL1LossSvcDual

func NewL2RL1LossSvcDual(epsilon float64) SolverType

L2-regularized L1-loss support vector classification (dual).

func NewL2RL1LossSvcDualDefault

func NewL2RL1LossSvcDualDefault() SolverType

L2-regularized L1-loss support vector classification (dual), epsilon = 0.1.

func NewL2RL2LossSvRegression

func NewL2RL2LossSvRegression(epsilon float64) SolverType

L2-regularized L2-loss support vector regression (primal).

func NewL2RL2LossSvRegressionDefault

func NewL2RL2LossSvRegressionDefault(epsilon float64) SolverType

L2-regularized L2-loss support vector regression (primal), epsilon = 0.001.

func NewL2RL2LossSvRegressionDual

func NewL2RL2LossSvRegressionDual(epsilon float64) SolverType

L2-regularized L2-loss support vector regression (dual).

func NewL2RL2LossSvRegressionDualDefault

func NewL2RL2LossSvRegressionDualDefault(epsilon float64) SolverType

L2-regularized L2-loss support vector regression (dual), epsilon = 0.1.

func NewL2RL2LossSvcDual

func NewL2RL2LossSvcDual(epsilon float64) SolverType

L2-regularized L2-loss support vector classification (dual).

func NewL2RL2LossSvcDualDefault

func NewL2RL2LossSvcDualDefault() SolverType

L2-regularized L2-loss support vector classification (dual), epsilon = 0.1.

func NewL2RL2LossSvcPrimal

func NewL2RL2LossSvcPrimal(epsilon float64) SolverType

L2-regularized L2-loss support vector classification (primal).

func NewL2RL2LossSvcPrimalDefault

func NewL2RL2LossSvcPrimalDefault() SolverType

L2-regularized L2-loss support vector classification (primal), epsilon = 0.01.

func NewL2RLogisticRegression

func NewL2RLogisticRegression(epsilon float64) SolverType

L2-regularized logistic regression (primal).

func NewL2RLogisticRegressionDefault

func NewL2RLogisticRegressionDefault() SolverType

L2-regularized logistic regression (primal), epsilon = 0.01.

func NewL2RLogisticRegressionDual

func NewL2RLogisticRegressionDual(epsilon float64) SolverType

L2-regularized logistic regression (dual).

func NewL2RLogisticRegressionDualDefault

func NewL2RLogisticRegressionDualDefault() SolverType

L2-regularized logistic regression (dual), epsilon = 0.1.

func NewMCSVMCS

func NewMCSVMCS(epsilon float64) SolverType

Support vector classification by Crammer and Singer.

func NewMCSVMCSDefault

func NewMCSVMCSDefault() SolverType

Support vector classification by Crammer and Singer, epsilon = 0.1.

type TrainingInstance

type TrainingInstance struct {
	Label    float64
	Features FeatureVector
}

Training instance, consisting of the label of the instance and its feature vector. In classification, the label is an integer indicating the class label. In regression, the label is the target value, which can be any real number. The label is not used for one-class SVMs.
