fuzzy

package module
v1.0.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jul 31, 2020 License: MIT Imports: 6 Imported by: 20

README

fuzzy

Build Status Go Report Card Codacy coverage GitHub licence GoDoc

Package fuzzy implements fuzzy matching/sorting of string slices and custom types (via fuzzy.Sortable interface).

Import path is go.deanishe.net/fuzzy

The fuzzy matching algorithm is based on Forrest Smith's reverse engineering of Sublime Text's search.

Features

  • Sublime Text-like fuzzy matching
  • Simple sorting of string slices via fuzzy.SortStrings()
  • Sorting of custom types via fuzzy.Sortable interface
  • Intelligent handling of diacritics

Usage

See Godoc for full documentation.

Slice of strings
actors := []string{"Tommy Lee Jones", "James Earl Jones", "Keanu Reeves"}
fuzzy.SortStrings(actors, "jej")
fmt.Println(actors[0])
// -> James Earl Jones
Custom types

To sort a custom type, it must implement fuzzy.Sortable:

type Sortable interface {
    sort.Interface
    // Keywords returns the string to compare to the sort query
    Keywords(i int) string
}

This is a superset of sort.Interface (i.e. your type must also implement sort.Interface). The string returned by Keywords() is compared to the search query using the fuzzy algorithm. The Less() function of sort.Interface is used as a tie-breaker if multiple items have the same fuzzy matching score.

// Player is a very simple data model.
type Player struct {
    Firstname string
    Lastname  string
}

// Name returns the full name of the Player.
func (p *Player) Name() string {
    return strings.TrimSpace(p.Firstname + " " + p.Lastname)
}

// Team is a collection of Player items. This is where fuzzy.Sortable
// must be implemented to enable fuzzy sorting.
type Team []*Player

// Default sort.Interface methods
func (t Team) Len() int      { return len(t) }
func (t Team) Swap(i, j int) { t[i], t[j] = t[j], t[i] }

// Less is used as a tie-breaker when fuzzy match score is the same.
func (t Team) Less(i, j int) bool { return t[i].Name() < t[j].Name() }

// Keywords implements Sortable.
// Comparisons are based on the the full name of the player.
func (t Team) Keywords(i int) string { return t[i].Name() }

// Fuzzy sort players by name.
var t = Team{
    &Player{"Alisson", "Becker"},
    &Player{Firstname: "Adrián"},
    &Player{"Andy", "Lonergan"},
    &Player{"Caoimhín", "Kelleher"},
    &Player{"Virgil", "van Dijk"},
    &Player{"Joe", "Gomez"},
    &Player{"Andy", "Robertson"},
    &Player{"Joel", "Matip"},
    &Player{"Ki-Jana", "Hoever"},
    &Player{"Trent", "Alexander-Arnold"},
    &Player{"Sepp", "van den Berg"},
    &Player{"Neco", "Williams"},
    &Player{Firstname: "Fabinho"},
    &Player{"Georginio", "Wijnaldum"},
    &Player{"James", "Milner"},
    &Player{"Naby", "Keita"},
    &Player{"Jordan", "Henderson"},
    &Player{"Alex", "Oxlade-Chamberlain"},
    &Player{"Xherdan", "Shaqiri"},
    &Player{"Curtis", "Jones"},
    &Player{"Harvel", "Elliott"},
    &Player{"Roberto", "Firmino"},
    &Player{"Sadio", "Mané"},
    &Player{"Mohamed", "Salah"},
    &Player{"Takumi", "Minamino"},
    &Player{"Divock", "Origi"},
}
// Unsorted
fmt.Println(t[0].Name()) // --> Alisson Becker

// Initials
Sort(t, "taa")
fmt.Println(t[0].Name()) // --> Trent Alexander-Arnold

// Initials beat start of string
Sort(t, "al")
fmt.Println(t[0].Name()) // --> Andy Lonergan

// Start of word
Sort(t, "ox")
fmt.Println(t[0].Name()) // --> Alex Oxlade-Chamberlain

// Earlier in string = better match
Sort(t, "x")
fmt.Println(t[0].Name()) // --> Xherdan Shaqiri

// Diacritics are considered
Sort(t, "ne")
fmt.Println(t[0].Name()) // --> Neco Williams
Sort(t, "né")
fmt.Println(t[0].Name()) // --> Sadio Mané

// But ignored if query is ASCII
Sort(t, "mane")
fmt.Println(t[0].Name()) // --> Sadio Mané

Licence

fuzzy is released under the MIT licence.

Documentation

Overview

Package fuzzy implements fuzzy matching and sorting.

Sort() and Match() implement fuzzy search, e.g. "of" will match "OmniFocus" and "got" will match "Game of Thrones".

Match() compares a query and a string, while Sort() sorts a type that implements fuzzy.Sortable. Both return Result structs for each compared string.

The algorithm is based on Forrest Smith's reverse engineering of Sublime Text's search: https://blog.forrestthewoods.com/reverse-engineering-sublime-text-s-fuzzy-match-4cffeed33fdb

It additionally strips diacritics from sort keys if the query is ASCII.

Index

Examples

Constants

View Source
const (
	DefaultAdjacencyBonus          = 5.0  // Bonus for adjacent matches
	DefaultSeparatorBonus          = 10.0 // Bonus if the match is after a separator
	DefaultCamelBonus              = 10.0 // Bonus if match is uppercase and previous is lower
	DefaultLeadingLetterPenalty    = -3.0 // Penalty applied for every letter in string before first match
	DefaultMaxLeadingLetterPenalty = -9.0 // Maximum penalty for leading letters
	DefaultUnmatchedLetterPenalty  = -1.0 // Penalty for every letter that doesn't match
	DefaultStripDiacritics         = true // Strip diacritics from sort keys if query is plain ASCII
)

Default bonuses and penalties for fuzzy sorting. To customise sorting behaviour, pass corresponding Options to New() or Sorter.Configure().

Variables

This section is empty.

Functions

This section is empty.

Types

type Option

type Option func(s *Sorter) Option

Option configures a Sorter. Pass one or more Options to New() or Sorter.Configure(). An Option returns another Option to revert the configuration to the previous state.

func AdjacencyBonus

func AdjacencyBonus(bonus float64) Option

AdjacencyBonus sets the bonus for adjacent matches.

func CamelBonus

func CamelBonus(bonus float64) Option

CamelBonus sets the bonus applied if match is uppercase and previous character is lowercase.

func LeadingLetterPenalty

func LeadingLetterPenalty(penalty float64) Option

LeadingLetterPenalty sets the penalty applied for every character before the first match.

func MaxLeadingLetterPenalty

func MaxLeadingLetterPenalty(penalty float64) Option

MaxLeadingLetterPenalty sets the maximum penalty for character preceding the first match.

func SeparatorBonus

func SeparatorBonus(bonus float64) Option

SeparatorBonus sets the bonus for matches directly after a separator.

func StripDiacritics

func StripDiacritics(on bool) Option

StripDiacritics sets whether diacritics should be stripped.

func UnmatchedLetterPenalty

func UnmatchedLetterPenalty(penalty float64) Option

UnmatchedLetterPenalty sets the penalty for characters that do not match.

type Result

type Result struct {
	// Match is whether or not the string matched the query,
	// i.e. if all characters in the query are present,
	// in order, in the string.
	Match bool
	// Query is the query that was matched against.
	Query string
	// Score is how well the string matched the query.
	// Higher is better.
	Score float64
	// SortKey is the string Query was compared to.
	SortKey string
}

Result stores the result of a single fuzzy ranking.

func Match

func Match(str, query string, opts ...Option) *Result

Match scores str against query using the specified sort options.

WARNING: Match creates a new Sorter for every call. Don't use this on large datasets.

func Sort

func Sort(data Sortable, query string) []*Result

Sort sorts data against query using a new default Sorter.

Example

Fuzzy sort players by name.

package main

import (
	"fmt"
	"strings"
)

// Player is a very simple data model.
type Player struct {
	Firstname string
	Lastname  string
}

// Name returns the full name of the Player.
func (p *Player) Name() string {
	return strings.TrimSpace(p.Firstname + " " + p.Lastname)
}

// Team is a collection of Player items. This is where fuzzy.Sortable
// must be implemented to enable fuzzy sorting.
type Team []*Player

// Default sort.Interface methods
func (t Team) Len() int      { return len(t) }
func (t Team) Swap(i, j int) { t[i], t[j] = t[j], t[i] }

// Less is used as a tie-breaker when fuzzy match score is the same.
func (t Team) Less(i, j int) bool { return t[i].Name() < t[j].Name() }

// Keywords implements Sortable.
// Comparisons are based on the the full name of the player.
func (t Team) Keywords(i int) string { return t[i].Name() }

// Fuzzy sort players by name.
func main() {
	var t = Team{
		&Player{"Alisson", "Becker"},
		&Player{Firstname: "Adrián"},
		&Player{"Andy", "Lonergan"},
		&Player{"Caoimhín", "Kelleher"},
		&Player{"Virgil", "van Dijk"},
		&Player{"Joe", "Gomez"},
		&Player{"Andy", "Robertson"},
		&Player{"Joel", "Matip"},
		&Player{"Ki-Jana", "Hoever"},
		&Player{"Trent", "Alexander-Arnold"},
		&Player{"Sepp", "van den Berg"},
		&Player{"Neco", "Williams"},
		&Player{Firstname: "Fabinho"},
		&Player{"Georginio", "Wijnaldum"},
		&Player{"James", "Milner"},
		&Player{"Naby", "Keita"},
		&Player{"Jordan", "Henderson"},
		&Player{"Alex", "Oxlade-Chamberlain"},
		&Player{"Xherdan", "Shaqiri"},
		&Player{"Curtis", "Jones"},
		&Player{"Harvel", "Elliott"},
		&Player{"Roberto", "Firmino"},
		&Player{"Sadio", "Mané"},
		&Player{"Mohamed", "Salah"},
		&Player{"Takumi", "Minamino"},
		&Player{"Divock", "Origi"},
	}
	// Unsorted
	fmt.Println(t[0].Name())

	// Initials
	Sort(t, "taa")
	fmt.Println(t[0].Name())

	// Initials beat start of string
	Sort(t, "al")
	fmt.Println(t[0].Name())

	// Start of word
	Sort(t, "ox")
	fmt.Println(t[0].Name())

	// Earlier in string = better match
	Sort(t, "x")
	fmt.Println(t[0].Name())

	// Diacritics ignored if query is ASCII
	Sort(t, "mane")
	fmt.Println(t[0].Name())

	// But not if query isn't
	Sort(t, "né")
	fmt.Println(t[0].Name())
	Sort(t, "ne")
	fmt.Println(t[0].Name())

}
Output:

Alisson Becker
Trent Alexander-Arnold
Andy Lonergan
Alex Oxlade-Chamberlain
Xherdan Shaqiri
Sadio Mané
Sadio Mané
Neco Williams

func SortStrings

func SortStrings(data []string, query string) []*Result

SortStrings fuzzy-sorts a slice of strings.

Example

Sort a slice of strings by fuzzy match.

squad := []string{
	"Alisson Becker",
	"Adrián",
	"Andy Lonergan",
	"Caoimhín Kelleher",
	"Virgil van Dijk",
	"Joe Gomez",
	"Andy Robertson",
	"Joel Matip",
	"Ki-Jana Hoever",
	"Trent Alexander-Arnold",
	"Sepp van den Berg",
	"Neco Williams",
	"Fabinho",
	"Georginio Wijnaldum",
	"James Milner",
	"Naby Keita",
	"Jordan Henderson",
	"Alex Oxlade-Chamberlain",
	"Xherdan Shaqiri",
	"Curtis Jones",
	"Harvel Elliott",
	"Roberto Firmino",
	"Sadio Mané",
	"Mohamed Salah",
	"Takumi Minamino",
	"Divock Origi",
}

// Unsorted
fmt.Println(squad[0])

// Initials
SortStrings(squad, "taa")
fmt.Println(squad[0])

// Initials beat start of string
SortStrings(squad, "al")
fmt.Println(squad[0])

// Start of word
SortStrings(squad, "ox")
fmt.Println(squad[0])

// Earlier in string = better match
SortStrings(squad, "x")
fmt.Println(squad[0])

// Diacritics ignored when query is ASCII
SortStrings(squad, "mane")
fmt.Println(squad[0])

// But not if query isn't
SortStrings(squad, "né")
fmt.Println(squad[0])
SortStrings(squad, "ne")
fmt.Println(squad[0])
Output:

Alisson Becker
Trent Alexander-Arnold
Andy Lonergan
Alex Oxlade-Chamberlain
Xherdan Shaqiri
Sadio Mané
Sadio Mané
Neco Williams

type Sortable

type Sortable interface {
	sort.Interface
	// Keywords returns the string to compare to the sort query
	Keywords(i int) string
}

Sortable makes the implementer fuzzy-sortable. It is a superset of sort.Interface (i.e. your struct must also implement sort.Interface).

type Sorter

type Sorter struct {
	Data                    Sortable // Data to sort
	AdjacencyBonus          float64  // Bonus for adjacent matches
	SeparatorBonus          float64  // Bonus if the match is after a separator
	CamelBonus              float64  // Bonus if match is uppercase and previous is lower
	LeadingLetterPenalty    float64  // Penalty applied for every letter in string before first match
	MaxLeadingLetterPenalty float64  // Maximum penalty for leading letters
	UnmatchedLetterPenalty  float64  // Penalty for every letter that doesn't match
	StripDiacritics         bool     // Strip diacritics from sort keys if query is plain ASCII
	// contains filtered or unexported fields
}

Sorter sorts Data based on the query passsed to Sorter.Sort().

func New

func New(data Sortable, opts ...Option) *Sorter

New creates a new Sorter for the given data.

func (*Sorter) Configure

func (s *Sorter) Configure(opts ...Option) Option

Configure applies one or more Options to Sorter.

func (*Sorter) Len

func (s *Sorter) Len() int

Len implements sort.Interface.

func (*Sorter) Less

func (s *Sorter) Less(i, j int) bool

Less implements sort.Interface.

func (*Sorter) Match

func (s *Sorter) Match(str string) *Result

Match scores str against Sorter's query using fuzzy matching.

func (*Sorter) Sort

func (s *Sorter) Sort(query string) []*Result

Sort sorts data against query.

func (*Sorter) Swap

func (s *Sorter) Swap(i, j int)

Swap implements sort.Interface.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL