searchvietnamese

v1.0.0
Published: Apr 28, 2022 License: MIT Imports: 2 Imported by: 0

README

Search Vietnamese

Advanced search tool for the Vietnamese language. It can match a plain letter such as e against its accented forms: éèẻẽẹêếềểễệ and ÉÈẺẼẸÊẾỀỂỄỆ.

Installation

go get github.com/Nguyen-Hoang-Nam/search-vietnamese

Usage

package main

import (
    "fmt"

    searchvietnamese "github.com/Nguyen-Hoang-Nam/search-vietnamese"
)

func main() {
    // Contains reports whether the keyword occurs in the text.
    fmt.Println(searchvietnamese.Contains("Nguyễn Hoàng Nam", "nguyen")) // true
    fmt.Println(searchvietnamese.Contains("Nguyễn Hoàng Nam", "nguyên")) // true
    fmt.Println(searchvietnamese.Contains("Nguyễn Hoàng Nam", "ngyên"))  // false

    // Index returns the position of the keyword, or -1 if it is absent.
    fmt.Println(searchvietnamese.Index("Nguyễn Hoàng Nam", "hoang")) // 7
    fmt.Println(searchvietnamese.Index("Nguyễn Hoàng Nam", "hang"))  // -1
}
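
To make the matching idea concrete, here is a minimal, self-contained sketch of accent-insensitive search using only the standard library. It is not the package's implementation: the names baseOf, fold and containsFolded are hypothetical, and the approach (fold both strings to unaccented lowercase, then do a plain substring search) is an assumption about one way to get the behaviour shown above.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// baseOf maps lowercase accented Vietnamese letters to their base letter.
// Hypothetical helper for this sketch; it is filled in by init below.
var baseOf = map[rune]rune{}

func init() {
	groups := map[rune]string{
		'a': "áàảãạăắằẳẵặâấầẩẫậ",
		'e': "éèẻẽẹêếềểễệ",
		'i': "íìỉĩị",
		'o': "óòỏõọôốồổỗộơớờởỡợ",
		'u': "úùủũụưứừửữự",
		'y': "ýỳỷỹỵ",
		'd': "đ",
	}
	for base, accented := range groups {
		for _, r := range accented {
			baseOf[r] = base
		}
	}
}

// fold lowercases s and replaces accented letters with their base letter.
// Combining marks are dropped so decomposed input is handled as well.
func fold(s string) string {
	var b strings.Builder
	for _, r := range s {
		if unicode.Is(unicode.Mn, r) {
			continue
		}
		r = unicode.ToLower(r)
		if base, ok := baseOf[r]; ok {
			r = base
		}
		b.WriteRune(r)
	}
	return b.String()
}

// containsFolded reports whether keyword occurs in text after folding both.
func containsFolded(text, keyword string) bool {
	return strings.Contains(fold(text), fold(keyword))
}

func main() {
	fmt.Println(containsFolded("Nguyễn Hoàng Nam", "nguyen")) // true
	fmt.Println(containsFolded("Nguyễn Hoàng Nam", "nguyên")) // true
	fmt.Println(containsFolded("Nguyễn Hoàng Nam", "ngyên"))  // false
}
```

Folding the keyword as well as the text is what lets an accented keyword such as "nguyên" match "Nguyễn" in this sketch.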

Benchmark

goos: linux
goarch: amd64
pkg: github.com/Nguyen-Hoang-Nam/search-vietnamese
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkContain-8      	2194798	      530.8 ns/op
BenchmarkToAlphabet-8   	4899396	      239.9 ns/op

Case sensitive

By default, matching is case-insensitive. For case-sensitive matching, use the function variants with the Sensitive suffix.

searchvietnamese.ContainsSensitive("Nguyễn Hoàng Nam", "nguyen") // false
searchvietnamese.ContainsSensitive("Nguyễn Hoàng Nam", "Nguyên") // true
searchvietnamese.ContainsSensitive("Nguyễn Hoàng Nam", "ngyên") // false

index := searchvietnamese.IndexSensitive("Nguyễn Hoàng Nam", "hoang") // -1
index = searchvietnamese.IndexSensitive("Nguyễn Hoàng Nam", "Hoang") // 7
index = searchvietnamese.IndexSensitive("Nguyễn Hoàng Nam", "hang") // -1
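
The difference between the two modes can be sketched as follows: diacritics are removed in both, and only the case-insensitive mode also lowercases. This is an illustrative assumption, not the package's code; stripAccents, contains and the partial accents table (only a, e and o) are hypothetical names introduced here.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// accents is a partial, hypothetical table: base letter -> accented lowercase forms.
var accents = map[rune]string{
	'a': "áàảãạăắằẳẵặâấầẩẫậ",
	'e': "éèẻẽẹêếềểễệ",
	'o': "óòỏõọôốồổỗộơớờởỡợ",
}

// baseOf is the reverse lookup built from accents.
var baseOf = map[rune]rune{}

func init() {
	for base, forms := range accents {
		for _, r := range forms {
			baseOf[r] = base
		}
	}
}

// stripAccents removes diacritics but preserves the case of each letter.
func stripAccents(s string) string {
	var b strings.Builder
	for _, r := range s {
		if unicode.Is(unicode.Mn, r) {
			continue // drop combining marks from decomposed input
		}
		lower := unicode.ToLower(r)
		if base, ok := baseOf[lower]; ok {
			if r != lower {
				base = unicode.ToUpper(base) // the input letter was uppercase
			}
			r = base
		}
		b.WriteRune(r)
	}
	return b.String()
}

// contains searches for keyword in text, ignoring diacritics.
// Case is ignored only when caseSensitive is false.
func contains(text, keyword string, caseSensitive bool) bool {
	t, k := stripAccents(text), stripAccents(keyword)
	if !caseSensitive {
		t, k = strings.ToLower(t), strings.ToLower(k)
	}
	return strings.Contains(t, k)
}

func main() {
	fmt.Println(contains("Nguyễn Hoàng Nam", "nguyen", true))  // false
	fmt.Println(contains("Nguyễn Hoàng Nam", "Nguyên", true))  // true
	fmt.Println(contains("Nguyễn Hoàng Nam", "nguyen", false)) // true
}
```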

Strict mode

If you need more performance, strict mode may help. It assumes that your text is valid Vietnamese text, meaning it contains only alphabet characters and Vietnamese characters.

This may cause incorrect results, such as false negatives, if your text contains any UTF-8 characters other than those mentioned above.

searchvietnamese.StrictContains("Nguyễn Hoàng Nam", "nguyen") // true
searchvietnamese.StrictContains("Nguyễn Hoàng Nam", "nguyên") // true
searchvietnamese.StrictContains("Nguyễn Hoàng Nam", "ngyên") // false

index := searchvietnamese.StrictIndex("Nguyễn Hoàng Nam", "hoang") // 7
index = searchvietnamese.StrictIndex("Nguyễn Hoàng Nam", "hang") // -1

Compare normal mode and strict mode

goos: linux
goarch: amd64
pkg: github.com/Nguyen-Hoang-Nam/search-vietnamese
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkContain-8            	2276416	      519.3 ns/op
BenchmarkToAlphabet-8         	4838238	      246.9 ns/op
BenchmarkStrictContain-8      	2785992	      426.2 ns/op
BenchmarkStrictToAlphabet-8   	7058474	      167.2 ns/op

TODO

  • Convert text to regex to search Vietnamese
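
As a rough idea of what this TODO item could look like, the sketch below turns an unaccented keyword into a regular expression in which each vowel (and d) becomes a character class of its accented forms. Everything here is hypothetical: variants, keywordPattern and matches are names invented for illustration, and the sketch assumes the keyword itself carries no diacritics.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// variants lists, per base letter, the forms the pattern should accept.
// Hypothetical table for this sketch.
var variants = map[rune]string{
	'a': "aáàảãạăắằẳẵặâấầẩẫậ",
	'e': "eéèẻẽẹêếềểễệ",
	'i': "iíìỉĩị",
	'o': "oóòỏõọôốồổỗộơớờởỡợ",
	'u': "uúùủũụưứừửữự",
	'y': "yýỳỷỹỵ",
	'd': "dđ",
}

// keywordPattern builds a case-insensitive regex source from an unaccented keyword.
func keywordPattern(keyword string) string {
	var b strings.Builder
	b.WriteString("(?i)")
	for _, r := range strings.ToLower(keyword) {
		if v, ok := variants[r]; ok {
			b.WriteString("[" + v + "]")
		} else {
			b.WriteString(regexp.QuoteMeta(string(r)))
		}
	}
	return b.String()
}

// matches reports whether text contains keyword under the generated pattern.
func matches(text, keyword string) bool {
	return regexp.MustCompile(keywordPattern(keyword)).MatchString(text)
}

func main() {
	fmt.Println(matches("Nguyễn Hoàng Nam", "nguyen")) // true
	fmt.Println(matches("Nguyễn Hoàng Nam", "hoang"))  // true
	fmt.Println(matches("Nguyễn Hoàng Nam", "hang"))   // false
}
```

One possible appeal of this design is that the original text is searched as-is, so match positions refer to the unmodified string.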

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT

Documentation

Overview

Package searchvietnamese provides tools that make searching Vietnamese text more pleasant.

Functions

func Contains

func Contains(text, keyword string) bool

Contains checks whether keyword occurs in Vietnamese text.

func ContainsSensitive

func ContainsSensitive(text, keyword string) bool

ContainsSensitive checks whether keyword occurs in Vietnamese text, matching case-sensitively.

func Equal

func Equal(target, source string) bool

Equal checks whether two Vietnamese strings are equal.

func EqualSensitive

func EqualSensitive(target, source string) bool

EqualSensitive checks whether two Vietnamese strings are equal, comparing case-sensitively.

func FuzzyMatch

func FuzzyMatch(text, keyword string) bool

FuzzyMatch performs a fuzzy check for keyword in Vietnamese text. Credit: https://github.com/lithammer/fuzzysearch/blob/master/fuzzy/fuzzy.go
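
The credited fuzzysearch library matches when every character of the keyword appears in the text in the same order, not necessarily adjacent. The sketch below shows that subsequence check on text that is assumed to be already converted to unaccented lowercase. Whether FuzzyMatch in this package behaves exactly this way is an assumption based on the credit link; isSubsequence is a hypothetical name.

```go
package main

import "fmt"

// isSubsequence reports whether the runes of keyword appear in text
// in order, allowing other characters in between.
func isSubsequence(text, keyword string) bool {
	k := []rune(keyword)
	if len(k) == 0 {
		return true // an empty keyword trivially matches
	}
	i := 0
	for _, r := range text {
		if r == k[i] {
			i++
			if i == len(k) {
				return true
			}
		}
	}
	return false
}

func main() {
	text := "nguyen hoang nam" // assumed already unaccented and lowercased

	fmt.Println(isSubsequence(text, "nhn"))   // true
	fmt.Println(isSubsequence(text, "ngyen")) // true
	fmt.Println(isSubsequence(text, "mnh"))   // false
}
```

Note that "ngyen" matches here even though a plain substring search for it fails, which is the practical difference between fuzzy and exact matching.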

func FuzzyMatchSensitive

func FuzzyMatchSensitive(text, keyword string) bool

FuzzyMatchSensitive performs a case-sensitive fuzzy check for keyword in Vietnamese text. Credit: https://github.com/lithammer/fuzzysearch/blob/master/fuzzy/fuzzy.go

func Index

func Index(text, keyword string) int

Index finds the position of keyword in Vietnamese text.
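
The README example reports 7 for "hoang" in "Nguyễn Hoàng Nam". That value is consistent with a position counted in characters (runes) rather than bytes, although this is an inference from the example and not a documented guarantee. The standard-library sketch below shows how the two counts differ for the same string; runeIndex is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// runeIndex converts a byte offset into s to a rune (character) offset.
// A negative offset, meaning "not found", is returned as -1.
func runeIndex(s string, byteOffset int) int {
	if byteOffset < 0 {
		return -1
	}
	return utf8.RuneCountInString(s[:byteOffset])
}

func main() {
	// "Nguyễn Hoàng Nam" written with escapes so the precomposed form is explicit.
	text := "Nguy\u1ec5n Ho\u00e0ng Nam"

	byteOffset := strings.Index(text, "Ho")
	fmt.Println(byteOffset)                  // 9
	fmt.Println(runeIndex(text, byteOffset)) // 7
}
```

If you slice the original string with a returned index, it may be worth checking which of the two conventions applies first.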

func IndexSensitive

func IndexSensitive(text, keyword string) int

IndexSensitive finds the position of keyword in Vietnamese text, matching case-sensitively.

func StrictContains

func StrictContains(text, keyword string) bool

StrictContains checks whether keyword occurs in Vietnamese text. This function assumes that your text is a valid Vietnamese paragraph, meaning it contains no UTF-8 characters other than Vietnamese characters.

func StrictContainsSensitive

func StrictContainsSensitive(text, keyword string) bool

StrictContainsSensitive checks whether keyword occurs in Vietnamese text, matching case-sensitively. This function assumes that your text is a valid Vietnamese paragraph, meaning it contains no UTF-8 characters other than Vietnamese characters.

func StrictIndex

func StrictIndex(text, keyword string) int

StrictIndex finds the position of keyword in Vietnamese text. This function assumes that your text is a valid Vietnamese paragraph, meaning it contains no UTF-8 characters other than Vietnamese characters.

func StrictIndexSensitive

func StrictIndexSensitive(text, keyword string) int

StrictIndexSensitive finds the position of keyword in Vietnamese text, matching case-sensitively. This function assumes that your text is a valid Vietnamese paragraph, meaning it contains no UTF-8 characters other than Vietnamese characters.

func StrictToAlphabetSensitive

func StrictToAlphabetSensitive(text []rune) string

StrictToAlphabetSensitive converts a slice of runes to a string in which each Vietnamese character is replaced with its corresponding alphabet character. This function assumes that your text is a valid Vietnamese paragraph, meaning it contains no UTF-8 characters other than Vietnamese characters.

func ToAlphabetSensitive

func ToAlphabetSensitive(text []rune) string

ToAlphabetSensitive converts a slice of runes to a string in which each Vietnamese character is replaced with its corresponding alphabet character.
