gintersect

package module
v0.2.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 22, 2023 License: Apache-2.0 Imports: 3 Imported by: 12

README

glob-intersection

Go package to check if the set of non-empty strings matched by the intersection of two regexp-style globs is non-empty.

Examples
  • gintersect.NonEmpty("a.a.", ".b.b") is true because both globs match the string abab.
  • gintersect.NonEmpty("[a-z]+", "[0-9]*) is false because there are no non-empty strings that both globs match.
Limitations
  • It is assumed that all input is rooted at the beginning and the end, i.e, starts and ends with the regexp symbols ^ and $ respectively. This is done because any non-rooted expressions will always match a non-empty set of non-empty strings.
  • The only special symbols are:
    • . for any character.
    • + for 1 or more of the preceding expression.
    • * for 0 or more of the preceding expression.
    • [ and ] to define regexp-style character classes.
    • - to specify Unicode ranges inside character class definitions.
    • \ escapes any special symbol, including itself.
Complexity

Complexity is exponential in the number of flags (+ or *) present in the glob with the smaller flag count. Benchmarks (see non_empty_bench_test.go) reveal that inputs where one of the globs has <= 10 flags, and both globs have 100s of characters, will run in less than a nanosecond. This should be ok for most use cases.

Acknowledgements

This StackOverflow discussion for fleshing out the logic.

Documentation

Overview

Package gintersect provides methods to check whether the intersection of several globs matches a non-empty set of strings.

Index

Constants

View Source
const (
	FlagNone = iota
	FlagPlus
	FlagStar
)

Variables

View Source
var (

	// Errors.
	ErrInvalidInput = errors.New("the input provided is invalid")
)

Functions

func Match

func Match(t1 Token, t2 Token) bool

Match implements single-Token matching, ignoring flags. Example: [a-d] and [b-e] match, while [a-z] and [0-9] do not.

func NonEmpty

func NonEmpty(lhs string, rhs string) (bool, error)

NonEmpty is true if the intersection of lhs and rhs matches a non-empty set of non-empty str1ngs.

Types

type Flag

type Flag uint

Flag applies to a token.

func (Flag) String

func (f Flag) String() (s string)

type Glob

type Glob []Token

Glob represents a glob.

func NewGlob

func NewGlob(input string) (Glob, error)

NewGlob constructs a Glob from the given string by tokenizing and then simplifying it, or reports errors if any.

type Modifier

type Modifier uint

Modifier is a special character that affects lexical analysis.

const (
	ModifierBackslash Modifier = iota
)

type Token

type Token interface {
	Type() TokenType
	Flag() Flag
	SetFlag(Flag)
	// Equal describes whether the given Token is exactly equal to this one, barring differences in flags.
	Equal(Token) bool
	String() string
}

Token is the element that makes up a Glob.

func NewCharacter

func NewCharacter(r rune) Token

func NewDot

func NewDot() Token

func NewSet

func NewSet(runes []rune) Token

func Simplify

func Simplify(tokens []Token) []Token

Simplify accepts a Token slice and returns a equivalient Token slice that is shorter/simpler. The only simplification currently applied is removing redundant flagged Tokens. TODO: Remove unflagged Tokens next to equivalen Tokens with FlagPlus. Example: tt+t == t+

func Tokenize

func Tokenize(input []rune) ([]Token, error)

Tokenize converts a rune slice into a Token slice.

type TokenType

type TokenType uint

TokenType is the type of a Token.

const (
	TTCharacter TokenType = iota
	TTDot
	TTSet
)

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL