textdistance

v0.0.0-...-738b0ed Latest
Published: Oct 5, 2019 License: MIT Imports: 3 Imported by: 0

README

go-textdistance

Calculate various text distances with Go.

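Implemented

Levenshtein distance
Damerau-Levenshtein distance
Jaro distance
Jaro-Winkler distance
Jaccard similarity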

How to Use

$ go get github.com/masatana/go-textdistance

package main

import (
	"fmt"

	"github.com/masatana/go-textdistance"
)

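func main() {
	s1 := "this is a test"
	s2 := "that is a test"
	// Edit distances are ints and are not normalized; identical strings give 0.
	fmt.Println(textdistance.LevenshteinDistance(s1, s2))
	fmt.Println(textdistance.DamerauLevenshteinDistance(s1, s2))
	// JaroDistance returns two values (distance and prefix); Println prints both.
	fmt.Println(textdistance.JaroDistance(s1, s2))
	// The Jaro-Winkler result is normalized; identical strings give 1.0.
	fmt.Println(textdistance.JaroWinklerDistance(s1, s2))
}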

How to test

$ go test
PASS
ok      github.com/masatana/go-textdistance     0.002s

License

This software is released under the MIT License, see LICENSE.txt.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func DamerauLevenshteinDistance

func DamerauLevenshteinDistance(s1, s2 string) int

DamerauLevenshteinDistance calculates the Damerau-Levenshtein distance between s1 and s2. Reference: [Damerau-Levenshtein Distance](http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance). Note that the result is not normalized (it is not scaled to lie between 0 and 1); if s1 and s2 are exactly the same, the result is 0.
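To make the idea concrete, here is a minimal sketch of the optimal string alignment variant of the Damerau-Levenshtein distance, which adds adjacent transpositions to the usual insert, delete, and substitute operations. The function name osaDistance is hypothetical, the choice of comparing runes rather than bytes is an assumption, and this package may implement a different variant; the sketch is not taken from its source.

```go
package main

import "fmt"

// osaDistance is an illustrative helper, not this package's code.
// It computes the optimal string alignment distance: insertions,
// deletions, substitutions, and swaps of two adjacent runes each cost 1.
func osaDistance(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	// d[i][j] is the distance between the first i runes of ra
	// and the first j runes of rb.
	d := make([][]int, len(ra)+1)
	for i := range d {
		d[i] = make([]int, len(rb)+1)
		d[i][0] = i
	}
	for j := 0; j <= len(rb); j++ {
		d[0][j] = j
	}
	for i := 1; i <= len(ra); i++ {
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			best := d[i-1][j] + 1 // deletion
			if v := d[i][j-1] + 1; v < best {
				best = v // insertion
			}
			if v := d[i-1][j-1] + cost; v < best {
				best = v // substitution or match
			}
			// Adjacent transposition, for example "ab" -> "ba".
			if i > 1 && j > 1 && ra[i-1] == rb[j-2] && ra[i-2] == rb[j-1] {
				if v := d[i-2][j-2] + 1; v < best {
					best = v
				}
			}
			d[i][j] = best
		}
	}
	return d[len(ra)][len(rb)]
}

func main() {
	fmt.Println(osaDistance("ab", "ba"))   // 1: one transposition
	fmt.Println(osaDistance("abc", "abc")) // 0: identical strings
}
```

Plain Levenshtein distance would score "ab" against "ba" as 2, which is the practical difference the transposition step makes.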

func JaccardSimilarity

func JaccardSimilarity(s1, s2 string, f func(string) mapset.Set) float64

JaccardSimilarity, also known as the Jaccard index, compares the similarity of sample sets. It does not measure similarity between texts directly, but it can be applied by treating each text as a bag of words.
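As an illustration of the bag-of-words reading, the following sketch splits each text on whitespace and computes the size of the intersection divided by the size of the union. It uses a built-in map as the set instead of mapset.Set, and the helper names wordSet and jaccard, as well as the convention of returning 1 for two empty sets, are assumptions made for this example rather than this package's behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// wordSet is a hypothetical tokenizer: it splits s on whitespace
// and returns the distinct words as a set.
func wordSet(s string) map[string]struct{} {
	set := make(map[string]struct{})
	for _, w := range strings.Fields(s) {
		set[w] = struct{}{}
	}
	return set
}

// jaccard returns |a ∩ b| / |a ∪ b| for two word sets.
func jaccard(a, b map[string]struct{}) float64 {
	if len(a) == 0 && len(b) == 0 {
		return 1.0 // assumption: two empty sets count as identical
	}
	inter := 0
	for w := range a {
		if _, ok := b[w]; ok {
			inter++
		}
	}
	union := len(a) + len(b) - inter
	return float64(inter) / float64(union)
}

func main() {
	a := wordSet("this is a test")
	b := wordSet("that is a test")
	// Three shared words out of five distinct words.
	fmt.Println(jaccard(a, b)) // 0.6
}
```

The tokenizer decides what counts as similar: swapping wordSet for a function that returns character n-grams would compare the same texts at a finer grain.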

func JaroDistance

func JaroDistance(s1, s2 string) (float64, int)

JaroDistance calculates the Jaro distance between s1 and s2. This implementation is influenced by the [Lucene](http://lucene.apache.org/) implementation. Note that the result is normalized (it lies between 0 and 1); if s1 and s2 are exactly the same, the result is 1.0. The function returns the distance and the prefix (used for the Jaro-Winkler distance).
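For readers unfamiliar with the measure, here is a minimal sketch of the textbook Jaro computation: count runes that match within a sliding window, count how many of those matches are out of order, and average three ratios. The function name jaro is hypothetical, it returns only the score (not the prefix that JaroDistance also returns), and details such as the window size and empty-string handling are assumptions that may differ from this package.

```go
package main

import "fmt"

// jaro is an illustrative helper, not this package's code.
// It returns a score in [0, 1], where 1 means identical strings.
func jaro(a, b string) float64 {
	ra, rb := []rune(a), []rune(b)
	if len(ra) == 0 && len(rb) == 0 {
		return 1.0 // assumption: two empty strings are identical
	}
	if len(ra) == 0 || len(rb) == 0 {
		return 0.0
	}
	// Two equal runes match only if their positions differ by at most window.
	longer := len(ra)
	if len(rb) > longer {
		longer = len(rb)
	}
	window := longer/2 - 1
	if window < 0 {
		window = 0
	}
	matchedA := make([]bool, len(ra))
	matchedB := make([]bool, len(rb))
	matches := 0
	for i := range ra {
		lo := i - window
		if lo < 0 {
			lo = 0
		}
		hi := i + window + 1
		if hi > len(rb) {
			hi = len(rb)
		}
		for j := lo; j < hi; j++ {
			if !matchedB[j] && ra[i] == rb[j] {
				matchedA[i], matchedB[j] = true, true
				matches++
				break
			}
		}
	}
	if matches == 0 {
		return 0.0
	}
	// Walk both match sequences in order and count positions that disagree.
	outOfOrder, j := 0, 0
	for i := range ra {
		if !matchedA[i] {
			continue
		}
		for !matchedB[j] {
			j++
		}
		if ra[i] != rb[j] {
			outOfOrder++
		}
		j++
	}
	m := float64(matches)
	t := float64(outOfOrder) / 2 // each transposition involves two runes
	return (m/float64(len(ra)) + m/float64(len(rb)) + (m-t)/m) / 3
}

func main() {
	fmt.Printf("%.4f\n", jaro("MARTHA", "MARHTA")) // 0.9444
	fmt.Printf("%.4f\n", jaro("abc", "abc"))       // 1.0000
}
```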

func JaroWinklerDistance

func JaroWinklerDistance(s1, s2 string) float64

JaroWinklerDistance calculates the Jaro-Winkler distance between s1 and s2. This implementation is influenced by the [Lucene](http://lucene.apache.org/) implementation. Note that the result is normalized (it lies between 0 and 1); if s1 and s2 are exactly the same, the result is 1.0.
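The Winkler adjustment raises a Jaro score for strings that share a common prefix. The sketch below shows only that adjustment step, applied to a Jaro score supplied by the caller. The helper names, the prefix cap of 4, and the scaling factor of 0.1 are the commonly cited textbook values and are assumptions here; this package, following Lucene, may use different constants or a threshold.

```go
package main

import "fmt"

// commonPrefixLen counts the leading runes shared by a and b, capped at limit.
// It is a hypothetical helper for this example.
func commonPrefixLen(a, b string, limit int) int {
	ra, rb := []rune(a), []rune(b)
	n := 0
	for n < len(ra) && n < len(rb) && n < limit && ra[n] == rb[n] {
		n++
	}
	return n
}

// winklerBoost applies the Winkler adjustment to a Jaro score in [0, 1]:
// the longer the shared prefix, the closer the result moves toward 1.
func winklerBoost(jaroScore float64, prefix int) float64 {
	const scaling = 0.1 // assumption: textbook scaling factor
	return jaroScore + float64(prefix)*scaling*(1-jaroScore)
}

func main() {
	jaroScore := 17.0 / 18.0 // textbook Jaro score for "MARTHA" vs "MARHTA"
	prefix := commonPrefixLen("MARTHA", "MARHTA", 4)
	fmt.Println(prefix)                                   // 3
	fmt.Printf("%.4f\n", winklerBoost(jaroScore, prefix)) // 0.9611
}
```

Because the boost is proportional to 1 minus the Jaro score, a score of 1.0 for identical strings stays at 1.0.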

func LevenshteinDistance

func LevenshteinDistance(s1, s2 string) int

LevenshteinDistance calculates the Levenshtein distance between s1 and s2. Reference: [Levenshtein Distance](http://en.wikipedia.org/wiki/Levenshtein_distance). Note that the result is not normalized (it is not scaled to lie between 0 and 1); if s1 and s2 are exactly the same, the result is 0.
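The following is a minimal sketch of the standard dynamic-programming approach to Levenshtein distance, keeping only two rows of the table. The names levenshtein and min3 are hypothetical, and comparing runes rather than bytes is an assumption; this is not this package's source and its handling of multi-byte characters may differ.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// levenshtein is an illustrative helper, not this package's code.
// It returns the minimum number of single-rune insertions, deletions,
// and substitutions needed to turn a into b.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	// prev[j] is the distance between the runes of ra processed so far
	// (excluding the current one) and the first j runes of rb.
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

func main() {
	fmt.Println(levenshtein("kitten", "sitting"))                // 3
	fmt.Println(levenshtein("this is a test", "that is a test")) // 2
}
```

Since the result is a raw edit count, a caller who needs a value between 0 and 1 could divide by the length of the longer string.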

func Max

func Max(is ...int) int

Max returns the largest of the ints passed.

func Min

func Min(is ...int) int

Min returns the smallest of the ints passed.
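Variadic helpers like these are typically used to pick the cheapest edit operation inside a distance loop. A minimal sketch of how such functions could look is shown below; the names maxOf and minOf are hypothetical, and returning 0 when no arguments are passed is an assumption, since the behavior of Max and Min on empty input is not documented here.

```go
package main

import "fmt"

// maxOf returns the largest of its arguments.
func maxOf(is ...int) int {
	if len(is) == 0 {
		return 0 // assumption: no documented behavior for empty input
	}
	m := is[0]
	for _, v := range is[1:] {
		if v > m {
			m = v
		}
	}
	return m
}

// minOf returns the smallest of its arguments.
func minOf(is ...int) int {
	if len(is) == 0 {
		return 0 // assumption: no documented behavior for empty input
	}
	m := is[0]
	for _, v := range is[1:] {
		if v < m {
			m = v
		}
	}
	return m
}

func main() {
	fmt.Println(maxOf(3, 7, 5)) // 7
	fmt.Println(minOf(3, 7, 5)) // 3

	// An existing slice can be spread into the variadic parameter.
	costs := []int{4, 2, 9}
	fmt.Println(minOf(costs...)) // 2
}
```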

Types

This section is empty.
