sbr

package module
v0.3.2
Published: Jun 15, 2018 License: MIT Imports: 16 Imported by: 0

README

sbr-go

A recommender system package for Go.

Sbr implements state-of-the-art sequence-based models, using the history of what a user has liked to suggest new items. As a result, it makes accurate predictions that can be updated in real-time in response to user actions without model re-training.

Usage

You can fit a model on the Movielens 100K dataset in about 10 seconds using the following code:

	// Load the data.
	data, err := sbr.GetMovielens()
	if err != nil {
		panic(err)
	}
	fmt.Printf("Loaded movielens data: %v users and %v items for a total of %v interactions\n",
		data.NumUsers(), data.NumItems(), data.Len())

	// Split into test and train.
	rng := rand.New(rand.NewSource(42))
	train, test := sbr.TrainTestSplit(data, 0.2, rng)
	fmt.Printf("Train len %v, test len %v\n", train.Len(), test.Len())

	// Instantiate the model. It holds native memory, so release it
	// with Free once it is no longer needed.
	model := sbr.NewImplicitLSTMModel(train.NumItems())
	defer model.Free()

	// Set the hyperparameters.
	model.ItemEmbeddingDim = 32
	model.LearningRate = 0.16
	model.L2Penalty = 0.0004
	model.NumEpochs = 10
	model.NumThreads = 1

	// Set random seed
	var randomSeed [16]byte
	for idx := range randomSeed {
		randomSeed[idx] = 42
	}
	model.RandomSeed = randomSeed

	// Fit the model.
	fmt.Printf("Fitting the model...\n")
	loss, err := model.Fit(&train)
	if err != nil {
		panic(err)
	}

	// And evaluate.
	fmt.Printf("Evaluating the model...\n")
	mrr, err := model.MRRScore(&test)
	if err != nil {
		panic(err)
	}
	fmt.Printf("Loss %v, MRR: %v\n", loss, mrr)

Installation

Run

go get github.com/maciejkula/sbr-go

followed by

make

in the installation directory. This will download the package's native dependencies. On both OSX and Linux, the resulting binaries are fully statically linked, and you can deploy them like any other Go binary.

If you prefer to build the dependencies from source, run make source instead.

Documentation

Overview

A recommender system package for Go.

Sbr implements cutting-edge sequence-based recommenders: for every user, we examine what they have interacted with up to now to predict what they are going to consume next.

Usage

You can fit a model on the Movielens 100K dataset in about 10 seconds using the following (taken from https://github.com/maciejkula/sbr-go/blob/master/examples/movielens/main.go):

	package main

	import (
		"fmt"
		"math/rand"

		sbr "github.com/maciejkula/sbr-go"
	)

	func main() {
		// Load the data.
		data, err := sbr.GetMovielens()
		if err != nil {
			panic(err)
		}
		fmt.Printf("Loaded movielens data: %v users and %v items for a total of %v interactions\n",
			data.NumUsers(), data.NumItems(), data.Len())

		// Split into test and train.
		rng := rand.New(rand.NewSource(42))
		train, test := sbr.TrainTestSplit(data, 0.2, rng)
		fmt.Printf("Train len %v, test len %v\n", train.Len(), test.Len())

		// Instantiate the model and free its native memory on exit.
		model := sbr.NewImplicitLSTMModel(train.NumItems())
		defer model.Free()

		// Set the hyperparameters.
		model.ItemEmbeddingDim = 32
		model.LearningRate = 0.16
		model.L2Penalty = 0.0004
		model.NumEpochs = 10
		model.NumThreads = 1

		// Set random seed
		var randomSeed [16]byte
		for idx := range randomSeed {
			randomSeed[idx] = 42
		}
		model.RandomSeed = randomSeed

		// Fit the model.
		fmt.Printf("Fitting the model...\n")
		loss, err := model.Fit(&train)
		if err != nil {
			panic(err)
		}

		// And evaluate.
		fmt.Printf("Evaluating the model...\n")
		mrr, err := model.MRRScore(&test)
		if err != nil {
			panic(err)
		}
		fmt.Printf("Loss %v, MRR: %v\n", loss, mrr)
	}

Installation

Run

go get github.com/maciejkula/sbr-go

followed by

make

in the installation directory. This will download the package's native dependencies. On both OSX and Linux, the resulting binaries are fully statically linked, and you can deploy them like any other Go binary.

Index

Constants

const (
	// Bayesian personalised ranking loss.
	BPR Loss = 0
	// Pairwise hinge loss.
	Hinge Loss = 1
	// WARP loss. More accurate in most cases than
	// the other loss functions at the expense of
	// fitting speed.
	WARP Loss = 2
	// ADAM optimizer.
	Adam Optimizer = 0
	// Adagrad optimizer.
	Adagrad Optimizer = 1
)
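
As a rough illustration of what the two pairwise losses measure, the sketch below evaluates the textbook BPR and hinge formulas for a single pair of scores: one for an item the user interacted with and one for a sampled negative item. The function names are hypothetical, and the package's native implementation may differ in detail (WARP is omitted because it also depends on how negatives are sampled).

```go
package main

import (
	"fmt"
	"math"
)

// bprLoss is the textbook Bayesian personalised ranking loss for one
// (positive, negative) score pair: -log(sigmoid(pos - neg)).
// This is an illustrative formula, not code taken from the package.
func bprLoss(pos, neg float64) float64 {
	// log(1 + exp(-d)) is the same quantity as -log(sigmoid(d)).
	return math.Log1p(math.Exp(-(pos - neg)))
}

// hingeLoss is the textbook pairwise hinge loss with a margin of 1:
// max(0, 1 - (pos - neg)).
func hingeLoss(pos, neg float64) float64 {
	return math.Max(0, 1-(pos-neg))
}

func main() {
	// Positive item ranked above the negative one: both losses are small.
	fmt.Println(bprLoss(2, 0), hingeLoss(2, 0))
	// Positive item ranked below the negative one: both losses grow.
	fmt.Println(bprLoss(0, 2), hingeLoss(0, 2))
}
```

Both losses shrink as the positive item's score rises above the negative item's score; the hinge loss reaches exactly zero once the gap is at least 1.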

Variables

This section is empty.

Functions

func TrainTestSplit

func TrainTestSplit(data *Interactions, testFraction float64, rng *rand.Rand) (Interactions, Interactions)

Split the interaction data into training and test sets. The data is split by user, so no user appears in both the training and test sets, and performance evaluation reflects the model's performance on entirely new users.

Returns the training and test data, in that order.
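
To make the "no overlap between users" property concrete, here is a minimal pure-Go sketch of a user-level split. The interaction type and the splitByUser and demoSplit functions are hypothetical stand-ins, not the package's implementation; the only idea carried over from the description above is that each user goes entirely to one side.

```go
package main

import (
	"fmt"
	"math/rand"
)

// interaction is a hypothetical (user, item, timestamp) record.
type interaction struct{ user, item, timestamp int }

// splitByUser assigns each user to the test set with probability
// testFraction and sends all of that user's interactions to the same side.
func splitByUser(data []interaction, testFraction float64, rng *rand.Rand) (train, test []interaction) {
	inTest := make(map[int]bool)
	for _, x := range data {
		assigned, seen := inTest[x.user]
		if !seen {
			// Decide once per user, on first sight.
			assigned = rng.Float64() < testFraction
			inTest[x.user] = assigned
		}
		if assigned {
			test = append(test, x)
		} else {
			train = append(train, x)
		}
	}
	return train, test
}

// demoSplit builds a small synthetic dataset (50 users, 3 interactions
// each) and splits it with a fixed seed.
func demoSplit() (data, train, test []interaction) {
	for user := 0; user < 50; user++ {
		for step := 0; step < 3; step++ {
			data = append(data, interaction{user: user, item: step, timestamp: step})
		}
	}
	rng := rand.New(rand.NewSource(42))
	train, test = splitByUser(data, 0.2, rng)
	return data, train, test
}

func main() {
	data, train, test := demoSplit()
	fmt.Printf("total %d, train %d, test %d\n", len(data), len(train), len(test))
}
```

Because the assignment is made per user rather than per interaction, the realised test share is only approximately testFraction on small datasets.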

Types

type ImplicitLSTMModel

type ImplicitLSTMModel struct {
	// Number of items in the model.
	NumItems int
	// Maximum sequence length to consider. Setting
	// this to lower values will yield models that
	// are faster to train and evaluate, but have
	// a shorter memory.
	MaxSequenceLength int
	// Dimension of item embeddings. Setting this to
	// higher values will yield models that are slower
	// to fit but are potentially more expressive (at
	// the risk of overfitting).
	ItemEmbeddingDim int
	// Initial learning rate.
	LearningRate float32
	// L2 penalty.
	L2Penalty float32
	// Whether the LSTM should use coupled forget and update
	// gates, yielding a model that's faster to train.
	Coupled bool
	// Number of threads to use for training.
	NumThreads int
	// Number of epochs to use for training. To run more epochs,
	// call the fit method multiple times.
	NumEpochs int
	// Type of loss function to use.
	Loss Loss
	// Optimizer to use.
	Optimizer Optimizer
	// Seed for the model's random number generator. Set it
	// to a fixed value for reproducible runs.
	RandomSeed [16]byte
	// contains filtered or unexported fields
}

An implicit-feedback LSTM-based sequence model.

func NewImplicitLSTMModel

func NewImplicitLSTMModel(numItems int) *ImplicitLSTMModel

Build a new model with the capacity to represent a given number of items. To avoid leaking memory, the model must be freed using its Free method once it is no longer in use.

func (*ImplicitLSTMModel) Fit

func (self *ImplicitLSTMModel) Fit(data *Interactions) (float32, error)

Fit the model on the supplied data, returning the loss value after fitting. Calling this multiple times will resume training.
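
Since repeated calls resume training, one possible pattern is to call Fit in rounds and stop once a held-out score stops improving. The sketch below is an assumption-laden illustration: the fitter interface only mirrors the documented Fit and MRRScore signatures, fitUntilNoGain is a hypothetical helper, and fakeModel replays fixed scores so the example runs without the native library.

```go
package main

import "fmt"

// fitter mirrors the documented Fit and MRRScore signatures, with D
// standing in for the interactions type. It is a hypothetical interface.
type fitter[D any] interface {
	Fit(data *D) (float32, error)
	MRRScore(data *D) (float32, error)
}

// fitUntilNoGain calls Fit up to maxRounds times and stops as soon as the
// validation MRR fails to improve. It returns the best MRR seen and the
// number of Fit calls made. The last, non-improving round is not undone.
func fitUntilNoGain[D any](m fitter[D], train, valid *D, maxRounds int) (float32, int, error) {
	var best float32
	for round := 1; round <= maxRounds; round++ {
		if _, err := m.Fit(train); err != nil {
			return best, round, err
		}
		mrr, err := m.MRRScore(valid)
		if err != nil {
			return best, round, err
		}
		if mrr <= best {
			return best, round, nil
		}
		best = mrr
	}
	return best, maxRounds, nil
}

// fakeModel replays a fixed sequence of validation scores, one per Fit call.
type fakeModel struct {
	scores []float32
	calls  int
}

func (f *fakeModel) Fit(_ *[]int) (float32, error) { f.calls++; return 0, nil }

func (f *fakeModel) MRRScore(_ *[]int) (float32, error) { return f.scores[f.calls-1], nil }

func main() {
	m := &fakeModel{scores: []float32{0.1, 0.2, 0.15, 0.3}}
	train, valid := []int{}, []int{}
	best, rounds, err := fitUntilNoGain[[]int](m, &train, &valid, 4)
	fmt.Println(best, rounds, err)
}
```

Each round here corresponds to one Fit call, which in the real model would run NumEpochs epochs.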

func (*ImplicitLSTMModel) Free

func (self *ImplicitLSTMModel) Free()

Free the memory associated with the underlying model.

Unlike other methods of the model, calling Free is _not_ thread safe. Use an external synchronisation method when freeing a model used from multiple goroutines.
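
One way to provide that external synchronisation is a read-write mutex: predictions take the read lock and may run concurrently, while Free takes the write lock and waits for them to finish. This is a sketch only. The predictor interface copies the documented Predict and Free signatures, and guardedModel, errFreed, and fakeModel are hypothetical names introduced so the example runs without the native library.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// predictor captures the two methods this sketch needs.
type predictor interface {
	Predict(interactionHistory []int, itemsToScore []int) ([]float32, error)
	Free()
}

// errFreed is returned when Predict is called after Free.
var errFreed = errors.New("model already freed")

// guardedModel is a hypothetical wrapper that serialises Free against
// in-flight predictions.
type guardedModel struct {
	mu    sync.RWMutex
	model predictor
	freed bool
}

// Predict holds the read lock, so many predictions can run at once.
func (g *guardedModel) Predict(history, items []int) ([]float32, error) {
	g.mu.RLock()
	defer g.mu.RUnlock()
	if g.freed {
		return nil, errFreed
	}
	return g.model.Predict(history, items)
}

// Free holds the write lock, so it waits for running predictions and
// releases the underlying model at most once.
func (g *guardedModel) Free() {
	g.mu.Lock()
	defer g.mu.Unlock()
	if !g.freed {
		g.model.Free()
		g.freed = true
	}
}

// fakeModel stands in for a real model and counts Free calls.
type fakeModel struct{ freeCalls int }

func (f *fakeModel) Predict(_ []int, items []int) ([]float32, error) {
	return make([]float32, len(items)), nil
}

func (f *fakeModel) Free() { f.freeCalls++ }

func main() {
	g := &guardedModel{model: &fakeModel{}}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = g.Predict([]int{1, 2}, []int{3, 4})
		}()
	}
	wg.Wait()
	g.Free()
	_, err := g.Predict([]int{1}, []int{2})
	fmt.Println(err)
}
```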

func (*ImplicitLSTMModel) MRRScore

func (self *ImplicitLSTMModel) MRRScore(data *Interactions) (float32, error)

Compute the mean reciprocal rank score of the model on supplied interaction data.

Higher MRR values reflect better predictive performance of the model. The score is calculated by taking all but the last interaction of each user as their history, then predicting the last item that user interacted with.
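
For readers unfamiliar with the metric, the following pure-Go sketch computes a mean reciprocal rank from per-user score slices. It assumes, for illustration only, that each user's slice holds one score per candidate item and that targets[u] is the index of the item the user actually saw last; how the package breaks ties or selects candidates is not specified here.

```go
package main

import "fmt"

// reciprocalRank returns 1/rank for the item at index target, where the
// rank is 1 plus the number of items with a strictly higher score.
func reciprocalRank(scores []float32, target int) float64 {
	rank := 1
	for _, s := range scores {
		if s > scores[target] {
			rank++
		}
	}
	return 1 / float64(rank)
}

// meanReciprocalRank averages the reciprocal rank over all users.
// allScores[u] holds user u's scores and targets[u] the held-out item.
func meanReciprocalRank(allScores [][]float32, targets []int) float64 {
	if len(allScores) == 0 {
		return 0
	}
	var sum float64
	for u, scores := range allScores {
		sum += reciprocalRank(scores, targets[u])
	}
	return sum / float64(len(allScores))
}

func main() {
	allScores := [][]float32{
		{0.1, 0.9, 0.5}, // held-out item 1 is ranked first
		{0.1, 0.9, 0.5}, // held-out item 2 is ranked second
	}
	fmt.Println(meanReciprocalRank(allScores, []int{1, 2}))
}
```

A score of 1 means the held-out item was always ranked first; a user whose held-out item is ranked tenth contributes 0.1.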

func (*ImplicitLSTMModel) MarshalBinary

func (self *ImplicitLSTMModel) MarshalBinary() ([]byte, error)

Serialize the model into a byte array. Satisfies the encoding.BinaryMarshaler interface.

func (*ImplicitLSTMModel) Predict

func (self *ImplicitLSTMModel) Predict(interactionHistory []int, itemsToScore []int) ([]float32, error)

Make predictions. Provides scores for itemsToScore for a user who has seen interactionHistory items. Items in the history argument should be arranged chronologically, from the earliest seen item to the latest seen item.

Returns a slice of scores for the supplied items, where a higher score indicates a better recommendation.
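
Because the scores come back in the same order as itemsToScore, turning them into a ranked recommendation list is a matter of sorting. The helper below is a hypothetical sketch (topK is not part of the package) that assumes items and scores have equal length, as they would if scores were the result of Predict called with items.

```go
package main

import (
	"fmt"
	"sort"
)

// topK returns up to k item ids ordered by descending score.
// It assumes len(items) == len(scores).
func topK(items []int, scores []float32, k int) []int {
	// Sort positions rather than the inputs, so the caller's slices
	// are left untouched.
	order := make([]int, len(items))
	for i := range order {
		order[i] = i
	}
	sort.SliceStable(order, func(a, b int) bool {
		return scores[order[a]] > scores[order[b]]
	})
	if k < 0 {
		k = 0
	}
	if k > len(order) {
		k = len(order)
	}
	top := make([]int, 0, k)
	for _, idx := range order[:k] {
		top = append(top, items[idx])
	}
	return top
}

func main() {
	items := []int{10, 20, 30}
	scores := []float32{0.2, 0.9, 0.5} // e.g. the result of Predict
	fmt.Println(topK(items, scores, 2))
}
```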

func (*ImplicitLSTMModel) UnmarshalBinary

func (self *ImplicitLSTMModel) UnmarshalBinary(data []byte) error

Deserialize the model from a byte array. Satisfies the encoding.BinaryUnmarshaler interface.
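
Since the model satisfies the standard encoding interfaces, persisting it can be written against those interfaces rather than the concrete type. The sketch below shows one way to do that; saveModel, loadModel, and roundTrip are hypothetical helpers, and toyModel is a stand-in used so the example runs without the native library.

```go
package main

import (
	"encoding"
	"fmt"
	"os"
	"path/filepath"
)

// saveModel writes any binary-marshalable value to path.
func saveModel(path string, m encoding.BinaryMarshaler) error {
	data, err := m.MarshalBinary()
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o600)
}

// loadModel reads path and restores the value in place.
func loadModel(path string, m encoding.BinaryUnmarshaler) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	return m.UnmarshalBinary(data)
}

// toyModel is a stand-in that implements both interfaces.
type toyModel struct{ weights []byte }

func (t *toyModel) MarshalBinary() ([]byte, error) {
	return append([]byte(nil), t.weights...), nil
}

func (t *toyModel) UnmarshalBinary(data []byte) error {
	t.weights = append([]byte(nil), data...)
	return nil
}

// roundTrip saves a toy model to a temporary file and loads it back.
func roundTrip(weights []byte) ([]byte, error) {
	dir, err := os.MkdirTemp("", "model")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "model.bin")
	if err := saveModel(path, &toyModel{weights: weights}); err != nil {
		return nil, err
	}
	restored := &toyModel{}
	if err := loadModel(path, restored); err != nil {
		return nil, err
	}
	return restored.weights, nil
}

func main() {
	restored, err := roundTrip([]byte{1, 2, 3})
	fmt.Println(restored, err)
}
```

With the real package, a fitted *ImplicitLSTMModel would presumably be passed where toyModel is used here, since it is documented to satisfy both interfaces.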

type Indexer

type Indexer struct {
	// contains filtered or unexported fields
}

Helper for translating user and item ids into contiguous indices.

func NewIndexer

func NewIndexer() Indexer

Build a new indexer.

func (*Indexer) Add

func (self *Indexer) Add(id string) int

Add a new id to the indexer, returning its model index.

func (*Indexer) GetId

func (self *Indexer) GetId(idx int) (string, bool)

Get the id from a model index.
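
The idea behind such an indexer can be sketched with a map and a slice. The idIndexer type below is a hypothetical pure-Go illustration, not the package's implementation; in particular, returning the existing index when an id is added twice is an assumption of this sketch.

```go
package main

import "fmt"

// idIndexer maps string ids to contiguous indices 0, 1, 2, ... and back.
type idIndexer struct {
	toIndex map[string]int
	toID    []string
}

func newIDIndexer() *idIndexer {
	return &idIndexer{toIndex: make(map[string]int)}
}

// add returns the index for id, allocating the next free index the
// first time the id is seen.
func (ix *idIndexer) add(id string) int {
	if idx, ok := ix.toIndex[id]; ok {
		return idx
	}
	idx := len(ix.toID)
	ix.toIndex[id] = idx
	ix.toID = append(ix.toID, id)
	return idx
}

// getID returns the id stored at idx; the boolean is false when the
// index has never been allocated.
func (ix *idIndexer) getID(idx int) (string, bool) {
	if idx < 0 || idx >= len(ix.toID) {
		return "", false
	}
	return ix.toID[idx], true
}

func main() {
	ix := newIDIndexer()
	for _, id := range []string{"movie-42", "movie-7", "movie-42"} {
		fmt.Println(id, "->", ix.add(id))
	}
	id, ok := ix.getID(1)
	fmt.Println(id, ok)
}
```

Contiguous indices matter because the model is built for a fixed number of items and scores items by integer index.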

type Interactions

type Interactions struct {
	// contains filtered or unexported fields
}

Contains interactions for training the model.

func GetMovielens

func GetMovielens() (*Interactions, error)

Download and return the Movielens 100K dataset.

func NewInteractions

func NewInteractions(numUsers int, numItems int) Interactions

Construct new empty interactions.

func (*Interactions) Append

func (self *Interactions) Append(userId int, itemId int, timestamp int)

Add a (user, item, timestamp) triple to the dataset.

func (*Interactions) Len

func (self *Interactions) Len() int

Return the number of interactions.

func (*Interactions) NumItems

func (self *Interactions) NumItems() int

Get the total number of distinct items in the data.

func (*Interactions) NumUsers

func (self *Interactions) NumUsers() int

Get the total number of distinct users in the data.

type Loss

type Loss int

type Optimizer

type Optimizer int

Directories

Path Synopsis
examples
