disco

Version: v0.1.1
Published: Sep 30, 2023 | License: MIT | Imports: 15 | Imported by: 0

README

Disco Go

🔥 Recommendations for Go using collaborative filtering

  • Supports user-based and item-based recommendations
  • Works with explicit and implicit feedback
  • Uses high-performance matrix factorization

Installation

Run:

go get github.com/ankane/disco-go

Getting Started

Import the package

import "github.com/ankane/disco-go"

Prep your data in the format userId, itemId, value

data := disco.NewDataset[string, string]()
data.Push("user_a", "item_a", 5.0)
data.Push("user_a", "item_b", 3.5)
data.Push("user_b", "item_a", 4.0)

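IDs can be integers or strings. The ID types are fixed by the dataset's type parameters, so integer user IDs need a dataset declared with an integer type

data := disco.NewDataset[int, string]()
data.Push(1, "item_a", 5.0)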

If users rate items directly, this is known as explicit feedback. Fit the recommender with:

recommender, err := disco.FitExplicit(data)

If users don’t rate items directly (for instance, they’re purchasing items or reading posts), this is known as implicit feedback. Use 1.0 or a value like number of purchases or page views for the dataset, and fit the recommender with:

recommender, err := disco.FitImplicit(data)
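
As a rough sketch of the "number of purchases" idea above, raw events can be collapsed into one count per user/item pair before they are pushed into a dataset. The Event type, the pair type, and CountInteractions are hypothetical names made up for this illustration and are not part of disco.

```go
package main

import "fmt"

// Event is a hypothetical raw interaction record (for example, one purchase).
type Event struct {
	UserID string
	ItemID string
}

// pair identifies one user/item combination.
type pair struct {
	UserID string
	ItemID string
}

// CountInteractions collapses raw events into one count per user/item pair.
// Each count can serve as the value in a (userId, itemId, value) triple.
func CountInteractions(events []Event) map[pair]float32 {
	counts := make(map[pair]float32)
	for _, e := range events {
		counts[pair{e.UserID, e.ItemID}]++
	}
	return counts
}

func main() {
	events := []Event{
		{"user_a", "item_a"},
		{"user_a", "item_a"},
		{"user_a", "item_b"},
		{"user_b", "item_a"},
	}
	for p, n := range CountInteractions(events) {
		// With disco, each triple would go to data.Push(p.UserID, p.ItemID, n).
		fmt.Println(p.UserID, p.ItemID, n)
	}
}
```

Map iteration order is not fixed in Go, so the triples print in no particular order.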

Get user-based recommendations - “users like you also liked”

recommender.UserRecs(userId, 5)

Get item-based recommendations - “users who liked this item also liked”

recommender.ItemRecs(itemId, 5)

Get predicted ratings for a specific user and item

recommender.Predict(userId, itemId)

Get similar users

recommender.SimilarUsers(userId, 5)
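
The README does not say which similarity measure backs SimilarUsers or ItemRecs. One common choice for comparing factor vectors is cosine similarity, so the sketch below assumes that measure purely for illustration. CosineSimilarity is a hypothetical helper, not a disco function.

```go
package main

import (
	"fmt"
	"math"
)

// CosineSimilarity returns the cosine of the angle between a and b.
// It returns 0 if the lengths differ or either vector has zero norm.
func CosineSimilarity(a, b []float32) float32 {
	if len(a) != len(b) {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		normA += float64(a[i]) * float64(a[i])
		normB += float64(b[i]) * float64(b[i])
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return float32(dot / (math.Sqrt(normA) * math.Sqrt(normB)))
}

func main() {
	// Hypothetical factor vectors for two users.
	userA := []float32{0.9, 0.1, 0.4}
	userB := []float32{0.8, 0.2, 0.5}
	fmt.Printf("similarity: %.3f\n", CosineSimilarity(userA, userB))
}
```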

Examples

MovieLens

Load the data

data, err := disco.LoadMovieLens()

Create a recommender

recommender, err := disco.FitExplicit(data, disco.Factors(20))

Get similar movies

recommender.ItemRecs("Star Wars (1977)", 5)

Storing Recommendations

Save recommendations to your database.

Alternatively, you can store only the factors and use a library like pgvector-go. See an example.
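
If you store factors in a pgvector column, one way to hand them to SQL is the bracketed text form that pgvector accepts for vector values, such as [1,2,3]. The sketch below only builds that string with the standard library. VectorLiteral is a hypothetical helper, and pgvector-go offers its own types that you may prefer in real code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// VectorLiteral formats factors as a bracketed, comma-separated list,
// the text form pgvector accepts for a vector value.
func VectorLiteral(factors []float32) string {
	parts := make([]string, len(factors))
	for i, f := range factors {
		// bitSize 32 prints the shortest text that round-trips as a float32.
		parts[i] = strconv.FormatFloat(float64(f), 'f', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

func main() {
	// Hypothetical factors, as might be returned by recommender.ItemFactors.
	fmt.Println(VectorLiteral([]float32{0.5, 1, -0.25})) // prints [0.5,1,-0.25]
}
```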

Algorithms

Disco uses high-performance matrix factorization.

Specify the number of factors and iterations

recommender, err := disco.FitExplicit(data, disco.Factors(8), disco.Iterations(20))
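
To give a feel for what "factors" and "iterations" control, here is a minimal matrix factorization sketch trained with plain stochastic gradient descent. It is not disco's implementation, which the README does not describe in detail. The Rating, Model, and Fit names, the dense integer indexes, and the hyperparameter values are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// Rating is one observed (user, item, value) triple using dense indexes.
type Rating struct {
	User, Item int
	Value      float32
}

// Model holds one factor vector per user and per item.
type Model struct {
	UserFactors [][]float32
	ItemFactors [][]float32
}

// randomFactors returns a rows x factors matrix of small random values.
func randomFactors(rng *rand.Rand, rows, factors int) [][]float32 {
	m := make([][]float32, rows)
	for i := range m {
		m[i] = make([]float32, factors)
		for f := range m[i] {
			m[i][f] = rng.Float32() * 0.1
		}
	}
	return m
}

// Predict returns the dot product of a user's and an item's factors.
func (m *Model) Predict(user, item int) float32 {
	var sum float32
	for f := range m.UserFactors[user] {
		sum += m.UserFactors[user][f] * m.ItemFactors[item][f]
	}
	return sum
}

// RMSE returns the root mean squared error over the given ratings.
func (m *Model) RMSE(ratings []Rating) float64 {
	var sum float64
	for _, r := range ratings {
		d := float64(r.Value - m.Predict(r.User, r.Item))
		sum += d * d
	}
	return math.Sqrt(sum / float64(len(ratings)))
}

// Fit runs SGD over all ratings for the given number of iterations.
func Fit(ratings []Rating, users, items, factors, iterations int, lr, reg float32, seed int64) *Model {
	rng := rand.New(rand.NewSource(seed))
	m := &Model{
		UserFactors: randomFactors(rng, users, factors),
		ItemFactors: randomFactors(rng, items, factors),
	}
	for it := 0; it < iterations; it++ {
		for _, r := range ratings {
			residual := r.Value - m.Predict(r.User, r.Item)
			u, v := m.UserFactors[r.User], m.ItemFactors[r.Item]
			for f := range u {
				uf, vf := u[f], v[f]
				// Step along the error gradient, with L2 regularization.
				u[f] += lr * (residual*vf - reg*uf)
				v[f] += lr * (residual*uf - reg*vf)
			}
		}
	}
	return m
}

func main() {
	ratings := []Rating{{0, 0, 5}, {0, 1, 3.5}, {1, 0, 4}}
	untrained := Fit(ratings, 2, 2, 2, 0, 0.05, 0.01, 1)
	trained := Fit(ratings, 2, 2, 2, 200, 0.05, 0.01, 1)
	fmt.Printf("RMSE before: %.3f, after: %.3f\n", untrained.RMSE(ratings), trained.RMSE(ratings))
}
```

More factors give the model more capacity per user and item, and more iterations give SGD more passes over the data. Both raise training cost.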

Progress

Pass a callback to show progress

callback := func(info disco.FitInfo) { fmt.Printf("%+v\n", info) }
recommender, err := disco.FitExplicit(data, disco.Callback(callback))

Note: TrainLoss and ValidLoss are not available for implicit feedback
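
Instead of printing, a callback can also collect each report for later inspection. The sketch below shows that pattern with a local FitInfo type that mirrors the fields documented for disco.FitInfo, so it runs without the package. The History type and the sample loss values are hypothetical.

```go
package main

import "fmt"

// FitInfo mirrors the documented fields of disco.FitInfo for this sketch.
type FitInfo struct {
	Iteration int
	TrainLoss float32
	ValidLoss float32
}

// History collects every FitInfo it is given, in order.
type History struct {
	Infos []FitInfo
}

// Record has the func(FitInfo) shape that a progress callback takes.
func (h *History) Record(info FitInfo) {
	h.Infos = append(h.Infos, info)
}

// BestTrainLoss returns the recorded entry with the lowest TrainLoss.
// The second result is false when nothing has been recorded.
func (h *History) BestTrainLoss() (FitInfo, bool) {
	if len(h.Infos) == 0 {
		return FitInfo{}, false
	}
	best := h.Infos[0]
	for _, info := range h.Infos[1:] {
		if info.TrainLoss < best.TrainLoss {
			best = info
		}
	}
	return best, true
}

func main() {
	h := &History{}
	// Made-up reports standing in for what a fit might pass to the callback.
	for _, info := range []FitInfo{
		{Iteration: 1, TrainLoss: 1.2},
		{Iteration: 2, TrainLoss: 0.9},
		{Iteration: 3, TrainLoss: 0.8},
	} {
		h.Record(info)
	}
	if best, ok := h.BestTrainLoss(); ok {
		fmt.Printf("lowest train loss at iteration %d\n", best.Iteration)
	}
}
```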

Validation

Pass a validation set with explicit feedback

recommender, err := disco.FitEvalExplicit(trainSet, validSet)

The loss function is RMSE
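
For reference, RMSE (root mean squared error) is the square root of the average squared difference between predicted and actual values. The sketch below computes it with the standard library. The RMSE function here is a standalone illustration with made-up inputs, not the package's Rmse method.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// RMSE returns the root mean squared error between predicted and actual.
func RMSE(predicted, actual []float32) (float64, error) {
	if len(predicted) != len(actual) || len(actual) == 0 {
		return 0, errors.New("rmse: inputs must be non-empty and of equal length")
	}
	var sum float64
	for i := range actual {
		d := float64(predicted[i]) - float64(actual[i])
		sum += d * d
	}
	return math.Sqrt(sum / float64(len(actual))), nil
}

func main() {
	// Hypothetical predictions and held-out ratings.
	loss, err := RMSE([]float32{4.5, 3.0, 4.0}, []float32{5.0, 3.5, 4.0})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("RMSE: %.4f\n", loss)
}
```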

Cold Start

Collaborative filtering suffers from the cold start problem. It’s unable to make good recommendations without data on a user or item, which is problematic for new users and items.

recommender.UserRecs(newUserId, 5) // returns an empty slice

There are a number of ways to deal with this, but here are some common ones:

  • For user-based recommendations, show new users the most popular items
  • For item-based recommendations, make content-based recommendations

Reference

Get IDs

recommender.UserIds()
recommender.ItemIds()

Get the global mean

recommender.GlobalMean()

Get factors

recommender.UserFactors(userId)
recommender.ItemFactors(itemId)
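
The README does not state how Predict combines these values. A common formulation for explicit feedback is the global mean plus the dot product of the user and item factors, and the sketch below assumes that formulation only for illustration. PredictFromFactors and the sample numbers are hypothetical.

```go
package main

import "fmt"

// Dot returns the dot product of a and b over their shared length.
func Dot(a, b []float32) float32 {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	var sum float32
	for i := 0; i < n; i++ {
		sum += a[i] * b[i]
	}
	return sum
}

// PredictFromFactors combines a global mean with a user/item dot product.
// This is an assumed formulation, not necessarily what disco computes.
func PredictFromFactors(globalMean float32, userFactors, itemFactors []float32) float32 {
	return globalMean + Dot(userFactors, itemFactors)
}

func main() {
	// Made-up values standing in for GlobalMean, UserFactors, and ItemFactors.
	fmt.Println(PredictFromFactors(3, []float32{1, 2}, []float32{0.5, 0.25}))
}
```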

History

View the changelog

Contributing

Everyone is encouraged to help improve this project.

To get started with development:

git clone https://github.com/ankane/disco-go.git
cd disco-go
go mod tidy
go test -v

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Dataset

type Dataset[T Id, U Id] struct {
	// contains filtered or unexported fields
}

func LoadMovieLens

func LoadMovieLens() (*Dataset[int, string], error)

func NewDataset

func NewDataset[T Id, U Id]() *Dataset[T, U]

func (*Dataset[T, U]) Len

func (d *Dataset[T, U]) Len() int

func (*Dataset[T, U]) Push

func (d *Dataset[T, U]) Push(userId T, itemId U, value float32)

func (*Dataset[T, U]) SplitRandom

func (d *Dataset[T, U]) SplitRandom(p float32) (*Dataset[T, U], *Dataset[T, U])

type FitInfo

type FitInfo struct {
	Iteration int
	TrainLoss float32
	ValidLoss float32
}

type Id

type Id interface {
	string | int | uint | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
}
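
As a small illustration of how a union constraint like this behaves, the sketch below declares a local copy of the constraint and a generic function over it. Distinct is a hypothetical helper. The point is that one function body serves both string and integer IDs, while the type is still fixed at each call site.

```go
package main

import "fmt"

// Id is a local copy of the documented constraint so the sketch runs standalone.
type Id interface {
	string | int | uint | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
}

// Distinct returns the unique IDs in first-seen order.
func Distinct[T Id](ids []T) []T {
	seen := make(map[T]struct{}, len(ids))
	var out []T
	for _, id := range ids {
		if _, ok := seen[id]; !ok {
			seen[id] = struct{}{}
			out = append(out, id)
		}
	}
	return out
}

func main() {
	fmt.Println(Distinct([]string{"user_a", "user_b", "user_a"}))
	fmt.Println(Distinct([]int{1, 1, 2}))
}
```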

type Option

type Option func(*config)

func Alpha added in v0.1.1

func Alpha(alpha float32) Option

func Callback

func Callback(callback func(info FitInfo)) Option

func Factors

func Factors(factors int) Option

func Iterations

func Iterations(iterations int) Option

func LearningRate

func LearningRate(learningRate float32) Option

func Regularization

func Regularization(regularization float32) Option

func Seed

func Seed(seed int64) Option

type Rec

type Rec[T Id] struct {
	Id    T
	Score float32
}

type Recommender

type Recommender[T Id, U Id] struct {
	// contains filtered or unexported fields
}

func FitEvalExplicit

func FitEvalExplicit[T Id, U Id](trainSet *Dataset[T, U], validSet *Dataset[T, U], options ...Option) (*Recommender[T, U], error)

func FitExplicit

func FitExplicit[T Id, U Id](trainSet *Dataset[T, U], options ...Option) (*Recommender[T, U], error)

func FitImplicit

func FitImplicit[T Id, U Id](trainSet *Dataset[T, U], options ...Option) (*Recommender[T, U], error)

func (*Recommender[T, U]) GlobalMean

func (r *Recommender[T, U]) GlobalMean() float32

func (*Recommender[T, U]) ItemFactors

func (r *Recommender[T, U]) ItemFactors(itemId U) []float32

func (*Recommender[T, U]) ItemIds

func (r *Recommender[T, U]) ItemIds() []U

func (*Recommender[T, U]) ItemRecs

func (r *Recommender[T, U]) ItemRecs(itemId U, count int) []Rec[U]

func (*Recommender[T, U]) Predict

func (r *Recommender[T, U]) Predict(userId T, itemId U) float32

func (*Recommender[T, U]) Rmse

func (r *Recommender[T, U]) Rmse(data *Dataset[T, U]) float32

func (*Recommender[T, U]) SimilarUsers

func (r *Recommender[T, U]) SimilarUsers(userId T, count int) []Rec[T]

func (*Recommender[T, U]) UserFactors

func (r *Recommender[T, U]) UserFactors(userId T) []float32

func (*Recommender[T, U]) UserIds

func (r *Recommender[T, U]) UserIds() []T

func (*Recommender[T, U]) UserRecs

func (r *Recommender[T, U]) UserRecs(userId T, count int) []Rec[U]
