strmet

Version: v0.0.0-...-2653f80
Published: Jan 26, 2020 License: MIT Imports: 0 Imported by: 3

README

strmet

Fast and memory efficient string metric algorithms.

Available algorithms: Levenshtein and Damerau–Levenshtein.

Example

package main

import (
    "fmt"
    "github.com/eskriett/strmet"
)

func main() {
    s1 := "baseball"
    s2 := "football"

    fmt.Printf("The Levenshtein distance between %s and %s is %d\n",
        s1, s2, strmet.Levenshtein(s1, s2, 10))
    // -> The Levenshtein distance between baseball and football is 4

    s1 = "salt"
    s2 = "slat"
    fmt.Printf("The Damerau–Levenshtein distance between %s and %s is %d\n",
        s1, s2, strmet.DamerauLevenshtein(s1, s2, 10))
    // -> The Damerau–Levenshtein distance between salt and slat is 1
}

Documentation

Functions

func DamerauLevenshtein

func DamerauLevenshtein(str1, str2 string, maxDist int) int

DamerauLevenshtein distance is a string metric for measuring the edit distance between two sequences: https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance

This implementation has been designed using the observations of Steve Hatchett: http://blog.softwx.net/2015/01/optimizing-damerau-levenshtein_15.html

Takes two strings and a maximum edit distance, and returns the number of edits needed to transform one string into the other, or -1 if that distance is greater than maxDist.
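As a reference point for the contract described above, the following is a minimal, unoptimized sketch of a bounded edit distance that also counts adjacent transpositions. It assumes the restricted (optimal string alignment) variant and uses a full matrix for readability. It is not strmet's implementation, and `boundedOSA` is a hypothetical name used only for illustration.

```go
package main

import "fmt"

// boundedOSA is a hypothetical helper, not part of strmet. It computes the
// restricted Damerau-Levenshtein (optimal string alignment) distance and
// returns -1 when the result is greater than maxDist.
func boundedOSA(s1, s2 string, maxDist int) int {
	r1, r2 := []rune(s1), []rune(s2)

	// d[i][j] is the distance between r1[:i] and r2[:j].
	d := make([][]int, len(r1)+1)
	for i := range d {
		d[i] = make([]int, len(r2)+1)
		d[i][0] = i
	}
	for j := range d[0] {
		d[0][j] = j
	}

	for i := 1; i <= len(r1); i++ {
		for j := 1; j <= len(r2); j++ {
			cost := 1
			if r1[i-1] == r2[j-1] {
				cost = 0
			}
			best := d[i-1][j] + 1 // deletion
			if v := d[i][j-1] + 1; v < best { // insertion
				best = v
			}
			if v := d[i-1][j-1] + cost; v < best { // substitution or match
				best = v
			}
			// Transposition of two adjacent runes.
			if i > 1 && j > 1 && r1[i-1] == r2[j-2] && r1[i-2] == r2[j-1] {
				if v := d[i-2][j-2] + 1; v < best {
					best = v
				}
			}
			d[i][j] = best
		}
	}

	if dist := d[len(r1)][len(r2)]; dist <= maxDist {
		return dist
	}
	return -1
}

func main() {
	fmt.Println(boundedOSA("salt", "slat", 10)) // 1
	fmt.Println(boundedOSA("salt", "slat", 0))  // -1
}
```

The second call shows the cutoff: the distance is 1, which is greater than a maximum of 0, so the sketch reports -1.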

func DamerauLevenshteinRunes

func DamerauLevenshteinRunes(r1, r2 []rune, maxDist int) int

DamerauLevenshteinRunes is the same as DamerauLevenshtein but accepts rune slices instead of strings.
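Producing the rune slices that this variant expects is a plain Go conversion. The short sketch below only illustrates that conversion and the difference between byte length and rune length; `toRunes` is a hypothetical helper and nothing here calls into strmet.

```go
package main

import "fmt"

// toRunes is a hypothetical helper that converts a string to a rune slice,
// the input type taken by the *Runes functions.
func toRunes(s string) []rune {
	return []rune(s)
}

func main() {
	s := "na\u00efve" // "naïve" with a precomposed ï (U+00EF)

	fmt.Println(len(s))          // 6: bytes, since ï takes two bytes in UTF-8
	fmt.Println(len(toRunes(s))) // 5: runes, one per code point
}
```

When one string is compared against many candidates, converting it once and reusing the slice avoids repeating the conversion on every call.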

func DamerauLevenshteinRunesBuffer

func DamerauLevenshteinRunesBuffer(r1, r2 []rune, maxDist int, x, y []int) int

DamerauLevenshteinRunesBuffer is the same as DamerauLevenshteinRunes but also accepts memory buffers x and y, which should each be of length max(len(r1), len(r2)).

func Levenshtein

func Levenshtein(str1, str2 string, maxDist int) int

Levenshtein distance between two strings is defined as the minimum number of edits needed to transform one string into the other, with the allowable edit operations being insertion, deletion, or substitution of a single character: https://en.wikipedia.org/wiki/Levenshtein_distance

This implementation has been designed using the observations of Steve Hatchett: https://blog.softwx.net/2014/12/optimizing-levenshtein-algorithm-in-c.html

Takes two strings and a maximum edit distance, and returns the number of edits needed to transform one string into the other, or -1 if that distance is greater than maxDist.
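To illustrate why a maximum distance is useful, here is a minimal sketch of a bounded Levenshtein distance that stops as soon as every entry in the current row is greater than maxDist. Whether strmet uses this particular early exit is an assumption; `boundedLevenshtein` and `min3` are hypothetical names, and the code is not taken from the package.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// boundedLevenshtein is a hypothetical helper, not part of strmet. It returns
// the Levenshtein distance between s1 and s2, or -1 if it is greater than
// maxDist.
func boundedLevenshtein(s1, s2 string, maxDist int) int {
	r1, r2 := []rune(s1), []rune(s2)

	// row[j] holds the distance between the processed prefix of r1 and r2[:j].
	row := make([]int, len(r2)+1)
	for j := range row {
		row[j] = j
	}

	for i := 1; i <= len(r1); i++ {
		diag := row[0] // distance for (i-1, j-1)
		row[0] = i
		rowMin := row[0]
		for j := 1; j <= len(r2); j++ {
			cost := 1
			if r1[i-1] == r2[j-1] {
				cost = 0
			}
			next := min3(row[j]+1, row[j-1]+1, diag+cost)
			diag = row[j]
			row[j] = next
			if next < rowMin {
				rowMin = next
			}
		}
		// The final distance can never be smaller than the minimum of any
		// row, so once a whole row is over the limit the result must be too.
		if rowMin > maxDist {
			return -1
		}
	}

	if d := row[len(r2)]; d <= maxDist {
		return d
	}
	return -1
}

func main() {
	fmt.Println(boundedLevenshtein("baseball", "football", 10)) // 4
	fmt.Println(boundedLevenshtein("baseball", "football", 3))  // -1
}
```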

func LevenshteinRunes

func LevenshteinRunes(r1, r2 []rune, maxDist int) int

LevenshteinRunes is the same as Levenshtein but accepts rune slices instead of strings.

func LevenshteinRunesBuffer

func LevenshteinRunesBuffer(r1, r2 []rune, maxDist int, x []int) int

LevenshteinRunesBuffer is the same as LevenshteinRunes but accepts a memory buffer x, which should be of length max(len(r1), len(r2)).
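The idea behind a caller-supplied buffer can be sketched as follows: the dynamic-programming row lives in a slice that the caller allocates once and reuses across many comparisons, so no allocation happens per call. This is an illustration under stated assumptions, not strmet's code; `levenshteinWithBuffer` is a hypothetical function that omits the maxDist cutoff for brevity.

```go
package main

import "fmt"

// levenshteinWithBuffer is a hypothetical helper, not part of strmet. It
// computes the Levenshtein distance using x as scratch space. x must have
// length of at least max(len(r1), len(r2)); only the first len(r2) entries
// are used here.
func levenshteinWithBuffer(r1, r2 []rune, x []int) int {
	if len(r2) == 0 {
		return len(r1)
	}
	// x[j] holds the distance between the processed prefix of r1 and r2[:j+1].
	for j := range r2 {
		x[j] = j + 1
	}
	for i := 1; i <= len(r1); i++ {
		diag := i - 1 // distance between r1[:i-1] and the empty prefix of r2
		left := i     // distance between r1[:i] and the empty prefix of r2
		for j := range r2 {
			cost := 1
			if r1[i-1] == r2[j] {
				cost = 0
			}
			above := x[j]
			best := diag + cost // substitution or match
			if above+1 < best { // deletion
				best = above + 1
			}
			if left+1 < best { // insertion
				best = left + 1
			}
			diag = above
			left = best
			x[j] = best
		}
	}
	return x[len(r2)-1]
}

func main() {
	query := []rune("kitten")
	candidates := []string{"sitting", "mitten"}

	// Allocate the buffer once, sized for the longest input.
	size := len(query)
	for _, c := range candidates {
		if n := len([]rune(c)); n > size {
			size = n
		}
	}
	buf := make([]int, size)

	for _, c := range candidates {
		fmt.Println(c, levenshteinWithBuffer(query, []rune(c), buf))
	}
	// sitting 3
	// mitten 1
}
```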
