fuzmatch

v0.0.0-...-32679cd
Published: Oct 19, 2017 License: MIT Imports: 4 Imported by: 0

README

FuzMatch, an approximate string matching library in golang

About

fuzmatch is a library inspired by the fuzzywuzzy Python library. I've used the same function names, so if you want to know more about how the functions work, see the SeatGeek blog post about fuzzywuzzy.

WARNING

This library uses the Levenshtein distance, but the fuzzywuzzy package relies on python-Levenshtein, whose ratio function charges 2 for a replace operation (in my algorithm it costs 1). So if you compare the results of fuzzywuzzy and fuzmatch, there may be a few differences.
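To make the difference concrete, here is a minimal sketch of an edit distance with a configurable replace cost. The names weightedLevenshtein and percentRatio are hypothetical and not part of fuzmatch, and the ratio formula is an assumption chosen because it reproduces the 75 shown for "book" and "back" in the Usage section.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// weightedLevenshtein is a hypothetical helper (not part of fuzmatch).
// Insertions and deletions cost 1, a replace costs subCost.
// It compares bytes, so it is only meant for ASCII input.
func weightedLevenshtein(a, b string, subCost int) int {
	d := make([][]int, len(a)+1)
	for i := range d {
		d[i] = make([]int, len(b)+1)
		d[i][0] = i // delete i characters of a
	}
	for j := range d[0] {
		d[0][j] = j // insert j characters of b
	}
	for i := 1; i <= len(a); i++ {
		for j := 1; j <= len(b); j++ {
			cost := subCost
			if a[i-1] == b[j-1] {
				cost = 0
			}
			d[i][j] = min3(d[i-1][j]+1, d[i][j-1]+1, d[i-1][j-1]+cost)
		}
	}
	return d[len(a)][len(b)]
}

// percentRatio turns a distance into a 0-100 score.
// Assumption: the score is the share of len(a)+len(b) left untouched.
func percentRatio(a, b string, dist int) int {
	total := len(a) + len(b)
	if total == 0 {
		return 100
	}
	return (total - dist) * 100 / total
}

func main() {
	d1 := weightedLevenshtein("book", "back", 1)
	d2 := weightedLevenshtein("book", "back", 2)
	fmt.Println(d1, percentRatio("book", "back", d1)) // 2 75
	fmt.Println(d2, percentRatio("book", "back", d2)) // 4 50
}
```

Under these assumptions the same pair of strings scores 75 with a replace cost of 1 and 50 with a replace cost of 2, which is the kind of gap to expect between the two libraries.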

Installation
go get github.com/charlesvdv/fuzmatch

You need to set your GOPATH first, otherwise you will get an error.

Usage

A simple ratio.

fuzmatch.Ratio("book", "back")
"75"

A partial ratio.

fuzmatch.PartialRatio("hello world!","hello")
"100"

A token sort ratio.

fuzmatch.TokenSortRatio("Rust vs Golang", "Golang vs Rust")
"100"

A token set ratio.

fuzmatch.TokenSetRatio("Rust from Mozilla vs Go from Google's employees", "Rust vs Go")
"100"

If you want more information on how the functions work, see the SeatGeek blog.

To-Do
  • Benchmarks
  • More unit tests

Contribute

Pull requests and commits are welcome!

Documentation

Functions

func LevenshteinDistance

func LevenshteinDistance(s1, s2 string) int

LevenshteinDistance calculates the Levenshtein distance between two strings. I use an algorithm from Sten Hjelmqvist: http://www.codeproject.com/Articles/13525/Fast-memory-efficient-Levenshtein-algorithm You can also find the algorithm on Wikipedia: https://en.wikipedia.org/wiki/Levenshtein_distance (it is the last code listing).
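The memory-efficient variant referenced above keeps only two rows of the distance matrix instead of the whole table. The following is a minimal sketch of that idea, not the package's actual source; the name twoRowLevenshtein is hypothetical, and comparing runes rather than bytes is an assumption.

```go
package main

import "fmt"

// twoRowLevenshtein is a hypothetical sketch of the two-row technique.
// Insert, delete and replace all cost 1.
func twoRowLevenshtein(s1, s2 string) int {
	r1, r2 := []rune(s1), []rune(s2)
	if len(r1) == 0 {
		return len(r2)
	}
	if len(r2) == 0 {
		return len(r1)
	}

	// prev is the previous matrix row, curr is the row being filled.
	prev := make([]int, len(r2)+1)
	curr := make([]int, len(r2)+1)
	for j := range prev {
		prev[j] = j
	}

	for i, c1 := range r1 {
		curr[0] = i + 1
		for j, c2 := range r2 {
			cost := 1
			if c1 == c2 {
				cost = 0
			}
			best := prev[j+1] + 1 // deletion
			if v := curr[j] + 1; v < best {
				best = v // insertion
			}
			if v := prev[j] + cost; v < best {
				best = v // replace or match
			}
			curr[j+1] = best
		}
		// The filled row becomes the previous row for the next pass.
		prev, curr = curr, prev
	}
	return prev[len(r2)]
}

func main() {
	fmt.Println(twoRowLevenshtein("kitten", "sitting")) // 3
	fmt.Println(twoRowLevenshtein("book", "back"))      // 2
}
```

Memory use grows with the length of the second string only, which matters when one input is much longer than the other.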

func PartialRatio

func PartialRatio(s1, s2 string) int

PartialRatio calculates the "best partial" ratio. It takes the shorter string and compares it with substrings of the longer one. This can be useful when you have to compare two strings of very different lengths.
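One way to picture the "best partial" idea is a sliding window: every substring of the longer string that has the length of the shorter one is scored, and the best score wins. This is a brute-force sketch under that assumption and may differ from how fuzmatch selects candidate substrings. The helpers distance, percentRatio and bestPartialRatio are hypothetical, and byte slicing limits the sketch to ASCII input.

```go
package main

import "fmt"

// distance is a compact two-row Levenshtein distance on bytes (ASCII only).
func distance(a, b string) int {
	prev := make([]int, len(b)+1)
	curr := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(a); i++ {
		curr[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			best := prev[j] + 1
			if v := curr[j-1] + 1; v < best {
				best = v
			}
			if v := prev[j-1] + cost; v < best {
				best = v
			}
			curr[j] = best
		}
		prev, curr = curr, prev
	}
	return prev[len(b)]
}

// percentRatio scores two strings from 0 to 100.
// Assumption: the score is the share of len(a)+len(b) left untouched.
func percentRatio(a, b string) int {
	total := len(a) + len(b)
	if total == 0 {
		return 100
	}
	return (total - distance(a, b)) * 100 / total
}

// bestPartialRatio slides the shorter string over the longer one
// and keeps the highest window score.
func bestPartialRatio(s1, s2 string) int {
	shorter, longer := s1, s2
	if len(shorter) > len(longer) {
		shorter, longer = longer, shorter
	}
	if len(shorter) == 0 {
		return 0
	}
	best := 0
	for start := 0; start+len(shorter) <= len(longer); start++ {
		window := longer[start : start+len(shorter)]
		if r := percentRatio(shorter, window); r > best {
			best = r
		}
	}
	return best
}

func main() {
	fmt.Println(bestPartialRatio("hello world!", "hello")) // 100
}
```

The window "hello" at the start of "hello world!" matches the shorter string exactly, which is why the example from the Usage section scores 100.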

func Ratio

func Ratio(s1, s2 string) int

Ratio calculates the similarity between two strings as a percentage. If the two strings are equal, the function returns 100.

func TokenSetRatio

func TokenSetRatio(s1, s2 string) int

TokenSetRatio splits the tokens of the two strings into two groups, the intersection and the remainder, and then compares the groups with each other.

func TokenSortRatio

func TokenSortRatio(s1, s2 string) int

TokenSortRatio compares two strings after sorting their tokens alphabetically. This can be useful when the two strings contain the same words in a different order.
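The normalization step can be sketched in a few lines. The helper sortTokens is hypothetical, and splitting on whitespace is an assumption about how tokens are defined.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sortTokens is a hypothetical helper (not part of fuzmatch).
// It splits s on whitespace, sorts the tokens and joins them again.
func sortTokens(s string) string {
	tokens := strings.Fields(s)
	sort.Strings(tokens)
	return strings.Join(tokens, " ")
}

func main() {
	a := sortTokens("Rust vs Golang")
	b := sortTokens("Golang vs Rust")
	fmt.Println(a)      // Golang Rust vs
	fmt.Println(a == b) // true
}
```

Once both inputs are normalized to the same string, a plain ratio on the results gives a perfect score, as in the Usage example.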
