Documentation ¶
Overview ¶
Package levenshtein provides functions for calculating the Levenshtein distance (a type of edit distance) between two strings, and for generating a minimal list of edit operations required to convert the source string into the target string. It does this by building an edit matrix according to the Wagner-Fischer Algorithm. Alternative insertion/removal/swap costs can be provided as options. The list of edit operations is retrieved via a recursive algorithm which reads off a backtrace of edit operations from the matrix.
More information about the Wagner-Fischer algorithm can be found here: https://en.wikipedia.org/wiki/Wagner%E2%80%93Fischer_algorithm
Example ¶
package main import ( "fmt" "github.com/nathanjcochran/levenshtein" ) func main() { matrix := levenshtein.Build("horse", "arose") fmt.Printf("Matrix:\n%s\n\n", matrix) fmt.Printf("Edit distance: %d\n", matrix.Distance()) fmt.Printf("Operations:\n") for _, op := range matrix.Operations() { fmt.Printf(" %s\n", op) } }
Output: Matrix: a r o s e 0 1 2 3 4 5 h 1 1 2 3 4 5 o 2 2 2 2 3 4 r 3 3 2 3 3 4 s 4 4 3 3 3 4 e 5 5 4 4 4 3 Edit distance: 3 Operations: swap a at index 0: aorse remove o at index 1: arse keep r at index 1: arse insert o at index 2: arose keep s at index 3: arose keep e at index 4: arose
Index ¶
Examples ¶
Constants ¶
const ( DefaultInsertCost = 1 DefaultRemoveCost = 1 DefaultSwapCost = 1 )
Default costs for inserting, removing, and swapping characters.
Variables ¶
This section is empty.
Functions ¶
func Distance ¶
Distance builds a matrix and returns the edit distance between the two strings - i.e. the minimum number of edits required to transform the source string into the target string. This method is a short-cut, useful in cases where you do not need to use the edit matrix for any other purpose. It is equivalent to: Build(source, target, options...).Distance()
Example ¶
package main import ( "fmt" "github.com/nathanjcochran/levenshtein" ) func main() { fmt.Println(levenshtein.Distance("horse", "arose")) }
Output: 3
Types ¶
type Matrix ¶
type Matrix struct {
// contains filtered or unexported fields
}
Matrix contains a two-dimensional matrix used for calculating edit distances between two strings, and for retrieving a minimal list of edit operations for converting the source string into the target string.
func Build ¶
Builds and fills a matrix which can be used to calculate the edit distance between the two strings, or to retrieve a list of edit operations required to transform the source string into the target string.
Example ¶
package main import ( "fmt" "github.com/nathanjcochran/levenshtein" ) func main() { fmt.Println(levenshtein.Build("horse", "arose")) }
Output: a r o s e 0 1 2 3 4 5 h 1 1 2 3 4 5 o 2 2 2 2 3 4 r 3 3 2 3 3 4 s 4 4 3 3 3 4 e 5 5 4 4 4 3
func (*Matrix) Distance ¶
Distance returns the edit distance between the two strings - i.e. the minimum number of edits required to transform the source string into the target string.
Example ¶
package main import ( "fmt" "github.com/nathanjcochran/levenshtein" ) func main() { matrix := levenshtein.Build("horse", "arose") fmt.Println(matrix.Distance()) }
Output: 3
func (*Matrix) Operations ¶
Operations returns a minimal list of edit operations required to transform the source string into the target string.
Example ¶
package main import ( "fmt" "github.com/nathanjcochran/levenshtein" ) func main() { matrix := levenshtein.Build("horse", "arose") for _, op := range matrix.Operations() { fmt.Println(op) } }
Output: swap a at index 0: aorse remove o at index 1: arse keep r at index 1: arse insert o at index 2: arose keep s at index 3: arose keep e at index 4: arose
type Operation ¶
Operation represents one of the operations performed on a source string during the process of converting it into a target string. Contains information about the type of operation, the character affected, the index at which the operation occured, and the intermediate result of performing this operation.
func Operations ¶
Operations builds a matrix and returns a minimal list of edit operations required to transform the source string into the target string. This method is a short-cut, useful in cases where you do not need to use the edit matrix for any other purpose. It is equivalent to: Build(source, target, options...).Distance()
Example ¶
package main import ( "fmt" "github.com/nathanjcochran/levenshtein" ) func main() { for _, op := range levenshtein.Operations("horse", "arose") { fmt.Println(op) } }
Output: swap a at index 0: aorse remove o at index 1: arse keep r at index 1: arse insert o at index 2: arose keep s at index 3: arose keep e at index 4: arose
type Option ¶
type Option func(m *Matrix)
An Option which can be applied when generating an edit matrix or calculating the edit distance between two strings - e.g. setting a non-default insert/removal/swap cost.
func SetInsertCost ¶
SetInsertCost is an option which allows you to set a custom insertion cost to use when calculating edit distances. If this option is not provided, DefaultInsertCost is used instead.
func SetRemoveCost ¶
SetRemoveCost is an option which allows you to set a custom removal cost to use when calculating edit distances. If this option is not provided, DefaultRemoveCost is used instead.
func SetSwapCost ¶
SetSwapCost is an option which allows you to set a custom swap cost to use when calculating edit distances. If this option is not provided, DefaultSwapCost is used instead.