stringext

package
v0.0.0-...-86e9f11
Published: Jan 7, 2024 License: Apache-2.0 Imports: 7 Imported by: 0

Documentation

Overview

Package stringext defines extra string functions.

Constants

const NoEscape = utf8.RuneError

NoEscape is the default escape for LIKE parameters; it signals that no escape character is in use.

Variables

This section is empty.

Functions

func DeEncodeBCD

func DeEncodeBCD(s string) (min, max [4]byte)

DeEncodeBCD is the inverse of ToBCD.

func EncodeContainsPatternCI

func EncodeContainsPatternCI(pattern *Pattern) string

EncodeContainsPatternCI encodes the provided string for usage with bcContainsPatternCi

func EncodeContainsPatternCS

func EncodeContainsPatternCS(pattern *Pattern) string

EncodeContainsPatternCS encodes the provided string for usage with bcContainsPatternCs

func EncodeContainsPatternUTF8CI

func EncodeContainsPatternUTF8CI(pattern *Pattern) string

EncodeContainsPatternUTF8CI encodes the provided string for usage with bcContainsPatternUTF8Ci

func EncodeContainsPrefixCI

func EncodeContainsPrefixCI(needle Needle) string

EncodeContainsPrefixCI encodes the provided string for usage with bcContainsPrefixCi

func EncodeContainsPrefixCS

func EncodeContainsPrefixCS(needle Needle) string

EncodeContainsPrefixCS encodes the provided string for usage with bcContainsPrefixCs

func EncodeContainsPrefixUTF8CI

func EncodeContainsPrefixUTF8CI(needle Needle) string

EncodeContainsPrefixUTF8CI encodes the provided string for usage with bcContainsPrefixUTF8Ci

func EncodeContainsSubstrCI

func EncodeContainsSubstrCI(needle Needle) string

EncodeContainsSubstrCI encodes the provided string for usage with bcContainsSubstrCi

func EncodeContainsSubstrCS

func EncodeContainsSubstrCS(needle Needle) string

EncodeContainsSubstrCS encodes the provided string for usage with bcContainsSubstrCs

func EncodeContainsSubstrUTF8CI

func EncodeContainsSubstrUTF8CI(needle Needle) string

EncodeContainsSubstrUTF8CI encodes the provided string for usage with bcContainsSubstrUTF8Ci

func EncodeContainsSuffixCI

func EncodeContainsSuffixCI(needle Needle) string

EncodeContainsSuffixCI encodes the provided string for usage with bcContainsSuffixCi

func EncodeContainsSuffixCS

func EncodeContainsSuffixCS(needle Needle) string

EncodeContainsSuffixCS encodes the provided string for usage with bcContainsSuffixCs

func EncodeContainsSuffixUTF8CI

func EncodeContainsSuffixUTF8CI(needle Needle) string

EncodeContainsSuffixUTF8CI encodes the provided string for usage with bcContainsSuffixUTF8Ci

func EncodeEqualStringCI

func EncodeEqualStringCI(needle Needle) string

EncodeEqualStringCI encodes the provided string for usage with bcStrCmpEqCi

func EncodeEqualStringCS

func EncodeEqualStringCS(needle Needle) string

EncodeEqualStringCS encodes the provided string for usage with bcStrCmpEqCs

func EncodeEqualStringUTF8CI

func EncodeEqualStringUTF8CI(needle Needle) string

EncodeEqualStringUTF8CI encodes the provided string for usage with bcStrCmpEqUTF8Ci

func EncodeFuzzyNeedleASCII

func EncodeFuzzyNeedleASCII(needle Needle) string

EncodeFuzzyNeedleASCII encodes a needle (string) for fuzzy ASCII comparisons.

func EncodeFuzzyNeedleUnicode

func EncodeFuzzyNeedleUnicode(needle Needle) string

EncodeFuzzyNeedleUnicode encodes a needle (string) for fuzzy Unicode comparisons.

func EqualRuneFold

func EqualRuneFold(a, b rune) bool

func HasCaseSensitiveChar

func HasCaseSensitiveChar(str Needle) bool

HasCaseSensitiveChar returns true when the provided string contains a case-sensitive char
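One way to read "case-sensitive char" is a rune that has at least one other rune in its case-folding orbit. The following is a minimal sketch under that assumption; hasCaseSensitiveChar is a hypothetical stand-in built on the standard unicode package, not the package's own code.

```go
package main

import (
	"fmt"
	"unicode"
)

// hasCaseSensitiveChar is a hypothetical helper: it reports whether s holds
// at least one rune whose simple case folding differs from the rune itself.
func hasCaseSensitiveChar(s string) bool {
	for _, r := range s {
		if unicode.SimpleFold(r) != r {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasCaseSensitiveChar("abc")) // letters have case variants
	fmt.Println(hasCaseSensitiveChar("123")) // digits do not
}
```

A check like this can decide whether a cheaper case-sensitive comparison is sufficient for a given needle.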

func HasNtnRune

func HasNtnRune(r rune) bool

HasNtnRune returns true when the provided rune has a non-trivial normalization; false otherwise.

func HasNtnString

func HasNtnString(str Needle) bool

HasNtnString returns true when the provided string contains a rune with a non-trivial normalization; false otherwise.

func IndexRuneEscape

func IndexRuneEscape(runes []rune, r, escape rune) int

IndexRuneEscape returns the index of the first instance of the Unicode code point r in runes, or -1 if r is not present in runes; an escaped r is not matched.

func LastIndexRuneEscape

func LastIndexRuneEscape(runes []rune, r, escape rune) int

LastIndexRuneEscape returns the index of the last instance of r in runes, or -1 if r is not present in runes; an escaped r is not matched.
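The escape-aware search described for these two functions can be sketched as follows. This is an illustration only: indexRuneEscape, lastIndexRuneEscape and noEscape are hypothetical local names, and the assumption that an escape rune always protects exactly the rune that follows it is mine rather than something the documentation states.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// noEscape mirrors the documented NoEscape constant (utf8.RuneError).
const noEscape = utf8.RuneError

// indexRuneEscape returns the index of the first unescaped r in runes,
// or -1 if there is none (hypothetical sketch).
func indexRuneEscape(runes []rune, r, escape rune) int {
	for i := 0; i < len(runes); i++ {
		if escape != noEscape && runes[i] == escape {
			i++ // skip the rune protected by the escape
			continue
		}
		if runes[i] == r {
			return i
		}
	}
	return -1
}

// lastIndexRuneEscape returns the index of the last unescaped r in runes,
// or -1 if there is none (hypothetical sketch).
func lastIndexRuneEscape(runes []rune, r, escape rune) int {
	last := -1
	for i := 0; i < len(runes); i++ {
		if escape != noEscape && runes[i] == escape {
			i++ // skip the rune protected by the escape
			continue
		}
		if runes[i] == r {
			last = i
		}
	}
	return last
}

func main() {
	runes := []rune("a@_b_c_@_")
	fmt.Println(indexRuneEscape(runes, '_', '@'))      // prints 4
	fmt.Println(lastIndexRuneEscape(runes, '_', '@'))  // prints 6
	fmt.Println(indexRuneEscape(runes, '_', noEscape)) // prints 2
}
```

The last index is found with a forward scan because escapes bind to the rune on their right, which makes a backward scan ambiguous.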

func LiteralPrefix

func LiteralPrefix(regex string, escape rune) string

LiteralPrefix returns the Unicode substring that can be matched with a faster substring matcher. It is similar to regexp.LiteralPrefix().

TODO: the proper prefix and necessary substring should be deduced from the DFA, which is much harder but would also allow finding the substring "abc" in the regex "(abc|abcde)fgh".
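For comparison, the standard library method that this function is likened to can be tried directly. The sketch below uses only regexp from the Go standard library; the wrapper name stdLiteralPrefix is hypothetical, and nothing here shows how stringext itself computes its prefix.

```go
package main

import (
	"fmt"
	"regexp"
)

// stdLiteralPrefix is a hypothetical wrapper around the standard library's
// (*regexp.Regexp).LiteralPrefix, shown only for comparison.
func stdLiteralPrefix(expr string) (prefix string, complete bool) {
	return regexp.MustCompile(expr).LiteralPrefix()
}

func main() {
	prefix, complete := stdLiteralPrefix("abc.*")
	fmt.Println(prefix, complete) // prints abc false

	// The expression mentioned in the TODO above; output not asserted here.
	fmt.Println(stdLiteralPrefix("(abc|abcde)fgh"))
}
```

A literal prefix lets a matcher reject most inputs with a plain substring search before running the full regular expression.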

func NormalizeRune

func NormalizeRune(r rune) rune

NormalizeRune normalizes the provided rune into the smallest rune that is equal to it under case folding. For ASCII, this normalization is equivalent to upper-casing.

func NormalizeString

func NormalizeString(str string) string

NormalizeString normalizes the provided string into a string in which every rune is the smallest rune equal to the original under case folding. For ASCII, this normalization is equivalent to upper-casing.
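The "smallest equal rune" rule can be sketched with unicode.SimpleFold, which iterates over the case-folding orbit of a rune. The code below is a hypothetical re-implementation for illustration; normalizeRune and normalizeString are local names, and the package's own implementation may differ.

```go
package main

import (
	"fmt"
	"unicode"
)

// normalizeRune walks the simple case-folding orbit of r and returns its
// smallest member (hypothetical sketch).
func normalizeRune(r rune) rune {
	smallest := r
	for f := unicode.SimpleFold(r); f != r; f = unicode.SimpleFold(f) {
		if f < smallest {
			smallest = f
		}
	}
	return smallest
}

// normalizeString applies normalizeRune to every rune of s.
func normalizeString(s string) string {
	out := make([]rune, 0, len(s))
	for _, r := range s {
		out = append(out, normalizeRune(r))
	}
	return string(out)
}

func main() {
	fmt.Println(string(normalizeRune('a')))      // prints A
	fmt.Println(string(normalizeRune('\u212A'))) // Kelvin sign, prints K
	fmt.Println(normalizeString("Hello"))        // prints HELLO
}
```

For ASCII letters the upper-case form has the lower code point, which is why the result coincides with upper-casing there.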

func NormalizeStringASCIIOnly

func NormalizeStringASCIIOnly(bytes []byte) []byte

NormalizeStringASCIIOnly normalizes the provided bytes so that every ASCII character becomes the smallest character equal to it under case folding; non-ASCII values are left unchanged.

func NormalizeStringASCIIOnlyString

func NormalizeStringASCIIOnlyString(str string) string

NormalizeStringASCIIOnlyString normalizes the provided string so that every ASCII character becomes the smallest character equal to it under case folding; non-ASCII values are left unchanged.

func ToBCD

func ToBCD(min, max *[4]byte) string

ToBCD converts two byte arrays to a byte sequence of binary-coded digits, as needed by opIsSubnetOfIP4. It creates a convenient 16-byte encoding of an IPv4 address; e.g., the byte sequence [192,1,2,3] becomes the byte sequence 2,9,1,0, 1,0,0,0, 2,0,0,0, 3,0,0,0.
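The digit layout in the example (each address byte expanded to four decimal digits, least-significant digit first) can be reproduced with a short sketch. bcdDigits is a hypothetical helper that handles a single address; how ToBCD combines min and max into its result is not stated above and is not modeled here.

```go
package main

import "fmt"

// bcdDigits is a hypothetical helper: it expands each byte of an IPv4
// address into four bytes holding its decimal digits, least-significant
// digit first.
func bcdDigits(ip [4]byte) [16]byte {
	var out [16]byte
	for i, b := range ip {
		v := int(b)
		for j := 0; j < 4; j++ {
			out[i*4+j] = byte(v % 10)
			v /= 10
		}
	}
	return out
}

func main() {
	fmt.Println(bcdDigits([4]byte{192, 1, 2, 3}))
	// prints [2 9 1 0 1 0 0 0 2 0 0 0 3 0 0 0]
}
```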

Types

type Data

type Data = string

Data string type to distinguish from the Needle string type

type LikeSegment

type LikeSegment struct {
	SkipMin int
	SkipMax int
	Pattern Pattern
}

LikeSegment is a number of character skips followed by a Pattern. The number of skips is defined by a minimum and a maximum count. E.g., {SkipMin:1, SkipMax:1, Pattern:"abc"} states that the segment matches when one (and only one) character is skipped and the pattern "abc" matches. SkipMax can be -1, which indicates any number of skips. E.g., {SkipMin:1, SkipMax:-1, Pattern:"a"} corresponds to '_%a' in a LIKE expression.
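The skip semantics can be made concrete with a small matcher. This is a simplified sketch: segment is a hypothetical stand-in for LikeSegment that uses a plain string instead of a Pattern (so wildcards inside the pattern are not modeled), and matchAt is an illustrative function that does not exist in the package.

```go
package main

import "fmt"

// segment is a simplified, hypothetical stand-in for LikeSegment.
type segment struct {
	SkipMin int
	SkipMax int // -1 means no upper bound
	Pattern string
}

// matchAt reports whether seg matches at the start of data: between SkipMin
// and SkipMax runes are skipped, and Pattern must follow immediately.
func matchAt(data string, seg segment) bool {
	runes := []rune(data)
	pat := []rune(seg.Pattern)
	hi := seg.SkipMax
	if hi < 0 || hi > len(runes) {
		hi = len(runes)
	}
	for skip := seg.SkipMin; skip <= hi; skip++ {
		if skip+len(pat) > len(runes) {
			break // not enough runes left for the pattern
		}
		if string(runes[skip:skip+len(pat)]) == seg.Pattern {
			return true
		}
	}
	return false
}

func main() {
	one := segment{SkipMin: 1, SkipMax: 1, Pattern: "abc"}
	fmt.Println(matchAt("xabc", one), matchAt("xxabc", one)) // prints true false

	any := segment{SkipMin: 1, SkipMax: -1, Pattern: "a"} // '_%a'
	fmt.Println(matchAt("xyza", any), matchAt("a", any)) // prints true false
}
```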

func SimplifyLikeExpr

func SimplifyLikeExpr(expr string, wc, ks, escape rune) []LikeSegment

SimplifyLikeExpr simplifies a LIKE expression into a minimal sequence of LikeSegment

func (LikeSegment) String

func (ls LikeSegment) String() string

type Needle

type Needle string

Needle string type to distinguish from the Data string type

func DeEncodeContainsPrefixCI

func DeEncodeContainsPrefixCI(needle string) Needle

DeEncodeContainsPrefixCI de-encodes the provided string for usage with bcContainsPrefixCi

func DeEncodeContainsPrefixCS

func DeEncodeContainsPrefixCS(needle string) Needle

DeEncodeContainsPrefixCS de-encodes the provided string for usage with bcContainsPrefixCs

func DeEncodeContainsPrefixUTF8CI

func DeEncodeContainsPrefixUTF8CI(needle string) Needle

DeEncodeContainsPrefixUTF8CI de-encodes the provided string for usage with bcContainsPrefixUTF8Ci

func DeEncodeContainsSubstrCI

func DeEncodeContainsSubstrCI(needle string) Needle

DeEncodeContainsSubstrCI de-encodes the provided string for usage with bcContainsSubstrCi

func DeEncodeContainsSubstrCS

func DeEncodeContainsSubstrCS(needle string) Needle

DeEncodeContainsSubstrCS de-encodes the provided string for usage with bcContainsSubstrCs

func DeEncodeContainsSubstrUTF8CI

func DeEncodeContainsSubstrUTF8CI(needle string) Needle

DeEncodeContainsSubstrUTF8CI de-encodes the provided string for usage with bcContainsSubstrUTF8Ci

func DeEncodeContainsSuffixCI

func DeEncodeContainsSuffixCI(needle string) Needle

DeEncodeContainsSuffixCI de-encodes the provided string for usage with bcContainsSuffixCi

func DeEncodeContainsSuffixCS

func DeEncodeContainsSuffixCS(needle string) Needle

DeEncodeContainsSuffixCS de-encodes the provided string for usage with bcContainsSuffixCs

func DeEncodeContainsSuffixUTF8CI

func DeEncodeContainsSuffixUTF8CI(needle string) Needle

DeEncodeContainsSuffixUTF8CI de-encodes the provided string for usage with bcContainsSuffixUTF8Ci

func DeEncodeEqualStringCI

func DeEncodeEqualStringCI(needle string) Needle

DeEncodeEqualStringCI de-encodes the provided string for usage with bcStrCmpEqCi

func DeEncodeEqualStringCS

func DeEncodeEqualStringCS(needle string) Needle

DeEncodeEqualStringCS de-encodes the provided string for usage with bcStrCmpEqCs

func DeEncodeEqualStringUTF8CI

func DeEncodeEqualStringUTF8CI(needle string) Needle

DeEncodeEqualStringUTF8CI de-encodes the provided string for usage with bcStrCmpEqUTF8Ci

func DeEncodeFuzzyNeedleASCII

func DeEncodeFuzzyNeedleASCII(needle string) Needle

DeEncodeFuzzyNeedleASCII de-encodes a needle (string) for fuzzy ASCII comparisons.

func DeEncodeFuzzyNeedleUnicode

func DeEncodeFuzzyNeedleUnicode(needle string) Needle

DeEncodeFuzzyNeedleUnicode de-encodes a needle (string) for fuzzy Unicode comparisons.

type Pattern

type Pattern struct {
	WC          rune   // wildcard character of this pattern
	Escape      rune   // escape character of this pattern, if any
	Needle      Needle // NOTE: needle does not contain the Escape character
	Wildcard    []bool // one wildcard flag for every rune in Needle
	HasWildcard bool   // whether the Needle has at least one wildcard
}

Pattern is a string literal (of type Needle) with wildcards that can be escaped by Escape; use NoEscape to signal the absence of an escape character.

func DeEncodeContainsPatternCI

func DeEncodeContainsPatternCI(pattern string) Pattern

DeEncodeContainsPatternCI de-encodes the provided string for usage with bcContainsPatternCi

func DeEncodeContainsPatternCS

func DeEncodeContainsPatternCS(pattern string) Pattern

DeEncodeContainsPatternCS de-encodes the provided string for usage with bcContainsPatternCs

func DeEncodeContainsPatternUTF8CI

func DeEncodeContainsPatternUTF8CI(pattern string) Pattern

DeEncodeContainsPatternUTF8CI de-encodes the provided string for usage with bcContainsPatternUTF8Ci

func NewPattern

func NewPattern(str string, wc, escape rune) Pattern

NewPattern creates a new Pattern for the provided string, wildcard and escape character. E.g., for "a@b_c@_d" with wildcard '_' and escape '@', a pattern is created with Pattern.Needle = "ab_c_d" and Pattern.Wildcard = [false, false, true, false, false, false]. Note that the second '_' is escaped and thus corresponds to the value false in the wildcard slice. NOTE: Pattern.WC and Pattern.Escape cannot be the same character.
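The parsing step in this example can be sketched as a single pass over the input. parsePattern is a hypothetical function written for illustration; it returns only the needle and the wildcard flags, and it assumes the escape rune is dropped while the rune after it is kept as a literal.

```go
package main

import "fmt"

// parsePattern is a hypothetical sketch: it removes escape runes from str
// and records, for every remaining rune, whether it is an unescaped wildcard.
func parsePattern(str string, wc, escape rune) (needle string, wildcard []bool) {
	var out []rune
	escaped := false
	for _, r := range str {
		if !escaped && r == escape {
			escaped = true // drop the escape; the next rune is literal
			continue
		}
		out = append(out, r)
		wildcard = append(wildcard, r == wc && !escaped)
		escaped = false
	}
	return string(out), wildcard
}

func main() {
	needle, wildcard := parsePattern("a@b_c@_d", '_', '@')
	fmt.Println(needle, wildcard)
	// prints ab_c_d [false false true false false false]
}
```

Keeping the flags in a separate slice lets the needle hold a literal '_' and a wildcard '_' side by side without ambiguity.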

func (Pattern) SplitWC

func (p Pattern) SplitWC() ([]Needle, [][]bool)

SplitWC splits the needle on wc and groups consecutive wildcards together; e.g., "a__b" (with WC = '_') becomes ["a", "__", "b"], [[false], [true, true], [false]].
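The grouping in this example can be sketched by cutting the needle wherever the wildcard flag changes. splitWC is a hypothetical free function that takes the needle and its flags directly instead of a Pattern receiver, and it assumes one flag per rune.

```go
package main

import "fmt"

// splitWC is a hypothetical sketch: it groups consecutive runes that share
// the same wildcard flag. It assumes len(wildcard) equals the rune count.
func splitWC(needle string, wildcard []bool) ([]string, [][]bool) {
	var parts []string
	var flags [][]bool
	runes := []rune(needle)
	start := 0
	for i := 1; i <= len(runes); i++ {
		// Close the current group at the end or when the flag changes.
		if i == len(runes) || wildcard[i] != wildcard[start] {
			parts = append(parts, string(runes[start:i]))
			flags = append(flags, wildcard[start:i])
			start = i
		}
	}
	return parts, flags
}

func main() {
	parts, flags := splitWC("a__b", []bool{false, true, true, false})
	fmt.Println(parts, flags)
	// prints [a __ b] [[false] [true true] [false]]
}
```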

func (Pattern) String

func (p Pattern) String() string

type StrCmpType

type StrCmpType int
const (
	CS      StrCmpType = iota
	CiASCII            // case-insensitive on ASCII only
	CiUTF8             // case-insensitive on all Unicode code points
)

func (StrCmpType) String

func (t StrCmpType) String() string
