pcre

v0.0.0-...-74594f6
Published: May 30, 2022 License: MIT Imports: 12 Imported by: 4

README

pcre

This package provides a CGo-free port of the PCRE2 regular expression library. The lib directory contains source code automatically translated from PCRE2's C source. This package wraps that code and provides an interface as close as possible to Go's stdlib regexp package.


IMPORTANT NOTE!

Because it uses PCRE2, this library supports extra features such as lookaheads and lookbehinds. Go's stdlib regexp engine, which follows RE2, deliberately leaves these features out so that matching time stays linear in the size of the input. With this library it is easy to write regular expressions that have exponential runtime, which creates the possibility of a denial of service attack. Only use this library if the extra features are needed and whoever provides the regex is trusted (for example, it comes from a config file). Otherwise, use the standard library regexp package.
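
One way to apply this advice is to try the standard library first and reach for this package only when a pattern is rejected there. The sketch below uses only the stdlib regexp package; needsBacktracking is a hypothetical helper name, and a rejection is only a hint, since a pattern can also fail to compile because it is simply malformed.

```go
package main

import (
	"fmt"
	"regexp"
)

// needsBacktracking reports whether pattern is rejected by the standard
// library engine. A true result is only a hint: the pattern may use a
// PCRE-only feature such as a lookahead or backreference, or it may be
// malformed in both engines.
func needsBacktracking(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err != nil
}

func main() {
	for _, p := range []string{`foo|bar`, `foo(?=bar)`, `(\w)\1`} {
		fmt.Printf("%-14q rejected by stdlib: %v\n", p, needsBacktracking(p))
	}
}
```

Patterns that the standard library accepts keep its linear-time guarantee, so only the remainder needs the trust check described above.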


Supported GOOS/GOARCH:

  • linux/amd64
  • linux/386
  • linux/arm64
  • linux/arm
  • linux/riscv64
  • darwin/amd64
  • darwin/arm64

More OS support is planned.


How to transpile pcre2

To transpile pcre2, a Go toolchain and a C compiler (preferably GCC) are needed.

  • First, install ccgo.

  • Then, download the pcre2 source code. It can be found here: https://github.com/PCRE2Project/pcre2.

  • Once downloaded, cd into the source directory.

  • Run ./configure. If cross-compiling, provide the path to the cross-compiler in the CC variable, and set --target to the target architecture.

  • When it completes, there should be a Makefile in the directory.

  • Run ccgo -compiledb pcre.json make. Do not add -j arguments to the make command.

  • Run the following command (replace items in triangle brackets):

CC=/usr/bin/gcc ccgo -o pcre2_<os>_<arch>.go -pkgname lib -trace-translation-units -export-externs X -export-defines D -export-fields F -export-structs S -export-typedefs T pcre.json .libs/libpcre2-8.a

  • If cross-compiling, set the CCGO_CC variable to the path of the cross-compiler, and the CCGO_AR variable to the path of the cross-compiler's ar binary. Also, set TARGET_GOARCH to the GOARCH you're targeting and TARGET_GOOS to the OS you're targeting.

  • Once the command completes, two Go files will be created. One will start with pcre2, the other with capi. Copy both of these to the lib directory in this repo.
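
When transpiling for several targets, filling in the <os> and <arch> placeholders by hand gets repetitive. The following is a minimal sketch that only assembles the argument list shown in the steps above; it does not run ccgo, and outputName and ccgoArgs are hypothetical helper names, not part of this repo.

```go
package main

import (
	"fmt"
	"strings"
)

// outputName fills in the <os> and <arch> placeholders of the output file.
func outputName(goos, goarch string) string {
	return fmt.Sprintf("pcre2_%s_%s.go", goos, goarch)
}

// ccgoArgs assembles the ccgo arguments from the transpile step for one
// target. The flags are copied from the command in the steps above.
func ccgoArgs(goos, goarch string) []string {
	return []string{
		"-o", outputName(goos, goarch),
		"-pkgname", "lib",
		"-trace-translation-units",
		"-export-externs", "X",
		"-export-defines", "D",
		"-export-fields", "F",
		"-export-structs", "S",
		"-export-typedefs", "T",
		"pcre.json", ".libs/libpcre2-8.a",
	}
}

func main() {
	// Print the command line instead of executing it.
	fmt.Println("ccgo " + strings.Join(ccgoArgs("linux", "arm64"), " "))
}
```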

Documentation

Overview

Package pcre is a library that provides pcre2 regular expressions in pure Go, allowing for features such as cross-compiling.

The lib directory contains source code automatically translated from pcre2's C source code for each supported architecture and/or OS. This package wraps the automatically-translated source to provide a safe interface as close to Go's regexp library as possible.

Index

Constants

const (
	Anchored           = CompileOption(lib.DPCRE2_ANCHORED)
	AllowEmptyClass    = CompileOption(lib.DPCRE2_ALLOW_EMPTY_CLASS)
	AltBsux            = CompileOption(lib.DPCRE2_ALT_BSUX)
	AltCircumflex      = CompileOption(lib.DPCRE2_ALT_CIRCUMFLEX)
	AltVerbnames       = CompileOption(lib.DPCRE2_ALT_VERBNAMES)
	AutoCallout        = CompileOption(lib.DPCRE2_AUTO_CALLOUT)
	Caseless           = CompileOption(lib.DPCRE2_CASELESS)
	DollarEndOnly      = CompileOption(lib.DPCRE2_DOLLAR_ENDONLY)
	DotAll             = CompileOption(lib.DPCRE2_DOTALL)
	DupNames           = CompileOption(lib.DPCRE2_DUPNAMES)
	EndAnchored        = CompileOption(lib.DPCRE2_ENDANCHORED)
	Extended           = CompileOption(lib.DPCRE2_EXTENDED)
	FirstLine          = CompileOption(lib.DPCRE2_FIRSTLINE)
	Literal            = CompileOption(lib.DPCRE2_LITERAL)
	MatchInvalidUTF    = CompileOption(lib.DPCRE2_MATCH_INVALID_UTF)
	MactchUnsetBackref = CompileOption(lib.DPCRE2_MATCH_UNSET_BACKREF)
	Multiline          = CompileOption(lib.DPCRE2_MULTILINE)
	NeverBackslashC    = CompileOption(lib.DPCRE2_NEVER_BACKSLASH_C)
	NeverUCP           = CompileOption(lib.DPCRE2_NEVER_UCP)
	NeverUTF           = CompileOption(lib.DPCRE2_NEVER_UTF)
	NoAutoCapture      = CompileOption(lib.DPCRE2_NO_AUTO_CAPTURE)
	NoAutoPossess      = CompileOption(lib.DPCRE2_NO_AUTO_POSSESS)
	NoDotStarAnchor    = CompileOption(lib.DPCRE2_NO_DOTSTAR_ANCHOR)
	NoStartOptimize    = CompileOption(lib.DPCRE2_NO_START_OPTIMIZE)
	NoUTFCheck         = CompileOption(lib.DPCRE2_NO_UTF_CHECK)
	UCP                = CompileOption(lib.DPCRE2_UCP)
	Ungreedy           = CompileOption(lib.DPCRE2_UNGREEDY)
	UseOffsetLimit     = CompileOption(lib.DPCRE2_USE_OFFSET_LIMIT)
	UTF                = CompileOption(lib.DPCRE2_UTF)
)

Compile option bits
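
CompileOpts takes a single CompileOption value, and PCRE2's C API treats its options as bit flags, so several options are presumably combined with bitwise OR. The sketch below illustrates that idiom with a local CompileOption type and placeholder bit values; the real constants take their values from the translated PCRE2 headers, and has is a hypothetical helper.

```go
package main

import "fmt"

// CompileOption mirrors the documented underlying type (uint32).
type CompileOption uint32

// Placeholder bit values for illustration only. They are NOT the values
// used by the package, which come from the translated PCRE2 headers.
const (
	Caseless CompileOption = 1 << iota
	Multiline
	DotAll
)

// has reports whether every bit of flag is set in opts.
func has(opts, flag CompileOption) bool {
	return opts&flag == flag
}

func main() {
	// Combine two options into one value, as would be passed to CompileOpts.
	opts := Caseless | Multiline
	fmt.Println(has(opts, Caseless), has(opts, Multiline), has(opts, DotAll))
}
```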

Variables

This section is empty.

Functions

func ConvertGlob

func ConvertGlob(glob string) (string, error)

ConvertGlob converts the given glob into a pcre regular expression, and then returns the result.

func Glob

func Glob(glob string) ([]string, error)

Glob returns a list of matches for the given glob pattern. It returns nil if there was no match. If the glob contains "**", it will recurse through the directory, which may be extremely slow depending on which directory is being searched.

func Version

func Version() string

Version returns the version of pcre2 embedded in this library.

Types

type CompileOption

type CompileOption uint32

type PcreError

type PcreError struct {
	// contains filtered or unexported fields
}

PcreError represents errors returned by underlying pcre2 functions.

func (*PcreError) Error

func (pe *PcreError) Error() string

Error returns the string within the error, prepending the offset if it exists.

type Regexp

type Regexp struct {
	// contains filtered or unexported fields
}

Regexp represents a pcre2 regular expression.

func Compile

func Compile(pattern string) (*Regexp, error)

Compile runs CompileOpts with no options.

Close() should be called on the returned expression once it is no longer needed.

func CompileGlob

func CompileGlob(glob string) (*Regexp, error)

CompileGlob is a convenience function that converts a glob to a pcre regular expression and then compiles it.

func CompileOpts

func CompileOpts(pattern string, options CompileOption) (*Regexp, error)

CompileOpts compiles the provided pattern using the given options.

Close() should be called on the returned expression once it is no longer needed.

func MustCompile

func MustCompile(pattern string) *Regexp

MustCompile compiles the given pattern and panics if there was an error.

Close() should be called on the returned expression once it is no longer needed.

func MustCompileOpts

func MustCompileOpts(pattern string, options CompileOption) *Regexp

MustCompileOpts compiles the given pattern with the given options and panics if there was an error.

Close() should be called on the returned expression once it is no longer needed.

func (*Regexp) Close

func (r *Regexp) Close() error

Close frees resources used by the regular expression.
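
Because a compiled expression holds resources until Close is called, a deferred Close keeps every return path covered. The sketch below is written against a small interface containing only the documented MatchString and Close signatures; stdMatcher is a hypothetical stand-in built on the standard library so the example runs without this package, and its Close merely records the call.

```go
package main

import (
	"fmt"
	"regexp"
)

// matcher lists the two documented methods this sketch relies on.
type matcher interface {
	MatchString(s string) bool
	Close() error
}

// stdMatcher is a hypothetical stand-in for a compiled expression.
type stdMatcher struct {
	re     *regexp.Regexp
	closed bool
}

func (m *stdMatcher) MatchString(s string) bool { return m.re.MatchString(s) }

func (m *stdMatcher) Close() error {
	m.closed = true
	return nil
}

// countMatches counts matching inputs and closes m on every return path,
// reporting a Close error only when no earlier error exists.
func countMatches(m matcher, inputs []string) (n int, err error) {
	defer func() {
		if cerr := m.Close(); err == nil {
			err = cerr
		}
	}()
	for _, in := range inputs {
		if m.MatchString(in) {
			n++
		}
	}
	return n, nil
}

// runDemo returns the match count and whether Close was called.
func runDemo() (int, bool) {
	m := &stdMatcher{re: regexp.MustCompile(`^\d+$`)}
	n, _ := countMatches(m, []string{"42", "x1", "7"})
	return n, m.closed
}

func main() {
	n, closed := runDemo()
	fmt.Println(n, closed) // prints "2 true"
}
```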

func (*Regexp) Find

func (r *Regexp) Find(b []byte) []byte

Find returns the leftmost match of the regular expression. A return value of nil indicates no match.

func (*Regexp) FindAll

func (r *Regexp) FindAll(b []byte, n int) [][]byte

FindAll returns all matches of the regular expression. A return value of nil indicates no match.

func (*Regexp) FindAllIndex

func (r *Regexp) FindAllIndex(b []byte, n int) [][]int

FindAllIndex returns indices of all matches of the regular expression. A return value of nil indicates no match.

func (*Regexp) FindAllString

func (r *Regexp) FindAllString(s string, n int) []string

FindAllString is the String version of FindAll.

func (*Regexp) FindAllStringIndex

func (r *Regexp) FindAllStringIndex(s string, n int) [][]int

FindAllStringIndex is the String version of FindAllIndex.

func (*Regexp) FindAllStringSubmatch

func (r *Regexp) FindAllStringSubmatch(s string, n int) [][]string

FindAllStringSubmatch is the String version of FindAllSubmatch.

func (*Regexp) FindAllStringSubmatchIndex

func (r *Regexp) FindAllStringSubmatchIndex(s string, n int) [][]int

FindAllStringSubmatchIndex is the String version of FindAllSubmatchIndex.

func (*Regexp) FindAllSubmatch

func (r *Regexp) FindAllSubmatch(b []byte, n int) [][][]byte

FindAllSubmatch returns a slice of all matches and submatches of the regular expression. It will return no more than n matches. If n < 0, it will return all matches.
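
The effect of n is easiest to see side by side. This sketch uses the standard library's regexp package as a stand-in, on the assumption (not verified here) that this package follows the same n semantics, as the description above suggests; pairs is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// pairs returns up to n key=value matches with their submatches.
// A negative n means no limit.
func pairs(s string, n int) [][]string {
	re := regexp.MustCompile(`(\w+)=(\d+)`)
	return re.FindAllStringSubmatch(s, n)
}

func main() {
	in := "a=1 b=2 c=3"
	fmt.Println(len(pairs(in, 2)), len(pairs(in, -1))) // prints "2 3"
	for _, m := range pairs(in, -1) {
		// m[0] is the whole match; m[1] and m[2] are the submatches.
		fmt.Printf("match=%q key=%q value=%q\n", m[0], m[1], m[2])
	}
}
```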

func (*Regexp) FindAllSubmatchIndex

func (r *Regexp) FindAllSubmatchIndex(b []byte, n int) [][]int

FindAllSubmatchIndex returns a slice of all indices representing the locations of matches and submatches, if any, of the regular expression. It will return no more than n matches. If n < 0, it will return all matches.

func (*Regexp) FindIndex

func (r *Regexp) FindIndex(b []byte) []int

FindIndex returns a two-element slice of integers representing the location of the leftmost match of the regular expression.

func (*Regexp) FindString

func (r *Regexp) FindString(s string) string

FindString is the String version of Find.

func (*Regexp) FindStringIndex

func (r *Regexp) FindStringIndex(s string) []int

FindStringIndex is the String version of FindIndex.

func (*Regexp) FindStringSubmatch

func (r *Regexp) FindStringSubmatch(s string) []string

FindStringSubmatch is the String version of FindSubmatch.

func (*Regexp) FindStringSubmatchIndex

func (r *Regexp) FindStringSubmatchIndex(s string) []int

FindStringSubmatchIndex is the String version of FindSubmatchIndex.

func (*Regexp) FindSubmatch

func (r *Regexp) FindSubmatch(b []byte) [][]byte

FindSubmatch returns a slice containing the match as the first element, and the submatches as the subsequent elements.

func (*Regexp) FindSubmatchIndex

func (r *Regexp) FindSubmatchIndex(b []byte) []int

FindSubmatchIndex returns a slice of index pairs representing the match and submatches, if any.
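
The index pairs can be used to slice the input directly. The sketch below uses the standard library's regexp package as a stand-in and assumes this package lays out the pairs the same way (whole match first, then one pair per submatch); splitAddress is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// splitAddress returns the user and host parts of the first user@host
// token in s, or nil if there is none.
func splitAddress(s string) []string {
	re := regexp.MustCompile(`(\w+)@(\w+)`)
	loc := re.FindStringSubmatchIndex(s)
	if loc == nil {
		return nil
	}
	// loc[0:2] bounds the whole match, loc[2:4] the first submatch,
	// and loc[4:6] the second submatch.
	return []string{s[loc[2]:loc[3]], s[loc[4]:loc[5]]}
}

func main() {
	fmt.Println(splitAddress("mail bob@example now")) // prints "[bob example]"
	fmt.Println(splitAddress("no address here") == nil) // prints "true"
}
```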

func (*Regexp) Match

func (r *Regexp) Match(b []byte) bool

Match reports whether b contains a match of the regular expression.

func (*Regexp) MatchString

func (r *Regexp) MatchString(s string) bool

MatchString is the String version of Match.

func (*Regexp) NumSubexp

func (r *Regexp) NumSubexp() int

NumSubexp returns the number of parenthesized subexpressions in the regular expression.

func (*Regexp) ReplaceAll

func (r *Regexp) ReplaceAll(src, repl []byte) []byte

ReplaceAll returns a copy of src, replacing matches of the regular expression with the replacement text repl. Inside repl, $ signs are interpreted as in Expand, so for instance $1 represents the text of the first submatch and $name would represent the text of the subexpression called "name".
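
A short replacement template makes the $ syntax concrete. This sketch uses the standard library's regexp package as a stand-in, assuming this package expands templates the same way; swap is a hypothetical helper. Writing references as ${name} avoids ambiguity when a reference is followed directly by letters or digits.

```go
package main

import (
	"fmt"
	"regexp"
)

// swap rewrites every key=value pair in s as value=key using named
// submatches in the replacement template.
func swap(s string) string {
	re := regexp.MustCompile(`(?P<key>\w+)=(?P<val>\w+)`)
	return re.ReplaceAllString(s, "${val}=${key}")
}

func main() {
	fmt.Println(swap("a=1 b=2")) // prints "1=a 2=b"
}
```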

func (*Regexp) ReplaceAllFunc

func (r *Regexp) ReplaceAllFunc(src []byte, repl func([]byte) []byte) []byte

ReplaceAllFunc returns a copy of src in which all matches of the regular expression have been replaced by the return value of function repl applied to the matched byte slice. The replacement returned by repl is substituted directly, without using Expand.

func (*Regexp) ReplaceAllLiteral

func (r *Regexp) ReplaceAllLiteral(src, repl []byte) []byte

ReplaceAllLiteral returns a copy of src, replacing matches of the regular expression with the replacement bytes repl. The replacement is substituted directly, without using Expand.

func (*Regexp) ReplaceAllLiteralString

func (r *Regexp) ReplaceAllLiteralString(src, repl string) string

ReplaceAllLiteralString is the String version of ReplaceAllLiteral.

func (*Regexp) ReplaceAllString

func (r *Regexp) ReplaceAllString(src, repl string) string

ReplaceAllString is the String version of ReplaceAll.

func (*Regexp) ReplaceAllStringFunc

func (r *Regexp) ReplaceAllStringFunc(src string, repl func(string) string) string

ReplaceAllStringFunc is the String version of ReplaceAllFunc.

func (*Regexp) Split

func (r *Regexp) Split(s string, n int) []string

Split slices s into substrings separated by the expression and returns a slice of the substrings between those expression matches.

Example:

s := pcre.MustCompile("a*").Split("abaabaccadaaae", 5)
// s: ["", "b", "b", "c", "cadaaae"]

The count determines the number of substrings to return:

n > 0: at most n substrings; the last substring will be the unsplit remainder.
n == 0: the result is nil (zero substrings)
n < 0: all substrings
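
The three cases for n can be checked with a simple separator. This sketch runs against the standard library's regexp package as a stand-in, assuming this package's Split follows the same rules as listed above; parts is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// parts splits s on commas followed by optional whitespace,
// returning at most n substrings according to the rules for n.
func parts(s string, n int) []string {
	return regexp.MustCompile(`,\s*`).Split(s, n)
}

func main() {
	in := "a, b,c, d"
	fmt.Printf("%q\n", parts(in, 2))  // prints ["a" "b,c, d"]
	fmt.Printf("%q\n", parts(in, -1)) // prints ["a" "b" "c" "d"]
	fmt.Println(parts(in, 0) == nil)  // prints "true"
}
```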

func (*Regexp) String

func (r *Regexp) String() string

String returns the text of the regular expression used for compilation.

func (*Regexp) SubexpIndex

func (r *Regexp) SubexpIndex(name string) int

SubexpIndex returns the index of the subexpression with the given name, or -1 if there is no subexpression with that name.
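
SubexpIndex pairs naturally with FindStringSubmatch to look up a submatch by name rather than by position. The sketch below uses the standard library's regexp package as a stand-in, assuming this package numbers named subexpressions the same way; month is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

var dateRe = regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})`)

// month returns the text of the "month" subexpression for the first
// match in s, or "" if there is no match.
func month(s string) string {
	m := dateRe.FindStringSubmatch(s)
	i := dateRe.SubexpIndex("month")
	if m == nil || i < 0 {
		return ""
	}
	return m[i]
}

func main() {
	fmt.Println(month("released 2022-05"))                         // prints "05"
	fmt.Println(dateRe.SubexpIndex("day"), dateRe.NumSubexp()) // prints "-1 2"
}
```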
