Documentation ¶
Index ¶
- Variables
- func ASCIIWhitespace(s *State)
- func DisableLogging()
- func DumpDebugStats()
- func EnableLogging(w io.Writer)
- func IsValidRegexpDelimiter(r rune) (bool, rune)
- func NoWhitespace(s *State)
- func Run(parser Parserish, input string, ws ...VoidParser) (result interface{}, err error)
- func UnicodeWhitespace(s *State)
- type Error
- type Parser
- func Any(parsers ...Parserish) Parser
- func Bind(parser Parserish, val interface{}) Parser
- func Chars(matcher string, repetition ...int) Parser
- func CustomRegexpMatchLiteral(isValid func(rune) (bool, rune), escapes map[rune]rune) Parser
- func CustomRegexpReplaceLiteral(isValid func(rune) (bool, rune), escapes map[rune]rune) Parser
- func CustomStringLiteral(isValid func(rune) (bool, rune), escapes map[rune]rune) Parser
- func Cut() Parser
- func Exact(match string) Parser
- func Many(parser Parserish, separator ...Parserish) Parser
- func Map(parser Parserish, f func(n *Result)) Parser
- func Maybe(parser Parserish) Parser
- func Merge(parser Parserish) Parser
- func NewParser(description string, p Parser) Parser
- func NoAutoWS(parser Parserish) Parser
- func NotChars(matcher string, repetition ...int) Parser
- func NumberLit() Parser
- func Parsify(p Parserish) Parser
- func ParsifyAll(parsers ...Parserish) []Parser
- func Regex(pattern string) Parser
- func Seq(parsers ...Parserish) Parser
- func Some(parser Parserish, separator ...Parserish) Parser
- func StringLit(allowedQuotes string) Parser
- func UnicodeRegexpMatchLiteral() Parser
- func UnicodeRegexpReplaceLiteral() Parser
- func UnicodeStringLiteral() Parser
- func Until(terminators ...string) Parser
- type Parserish
- type Result
- type State
- type UnparsedInputError
- type VoidParser
Constants ¶
This section is empty.
Variables ¶
var TrashResult = &Result{}
TrashResult is used in places where the result isn't wanted, but something needs to be passed in to satisfy the interface.
Functions ¶
func ASCIIWhitespace ¶
func ASCIIWhitespace(s *State)
ASCIIWhitespace matches any of the standard whitespace characters. It is faster than the UnicodeWhitespace parser as it does not need to decode unicode runes.
func DumpDebugStats ¶
func DumpDebugStats()
DumpDebugStats will print out the current timings for each parser if built with -tags debug.
func EnableLogging ¶
EnableLogging will write logs to the given writer while the next parse runs.
func IsValidRegexpDelimiter ¶
IsValidRegexpDelimiter accepts quote characters taken from the set of Unicode punctuation characters, plus angle brackets (which are actually math symbols). It ensures that the closing quote character is symmetrically paired with the opening character where possible.
func Run ¶
func Run(parser Parserish, input string, ws ...VoidParser) (result interface{}, err error)
Run applies some input to a parser and returns the result, failing if the input isn't fully consumed. It is a convenience function for the most common way to invoke a parser.
func UnicodeWhitespace ¶
func UnicodeWhitespace(s *State)
UnicodeWhitespace matches any Unicode space character. It's a little slower than the ASCII parser because it matches a rune at a time.
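The difference between the two whitespace parsers can be illustrated with a minimal standalone sketch. It does not use the package itself: the `state` type is a hypothetical stand-in for State with only the two fields needed here, and the exact set of bytes the library treats as ASCII whitespace is an assumption.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// state is a hypothetical stand-in for the package's State; only the
// fields this sketch needs are modeled.
type state struct {
	Input string
	Pos   int
}

// asciiWhitespace advances Pos one byte at a time. The byte set below is an
// assumption about what counts as "standard" whitespace.
func asciiWhitespace(s *state) {
	for s.Pos < len(s.Input) {
		switch s.Input[s.Pos] {
		case ' ', '\t', '\n', '\r':
			s.Pos++
		default:
			return
		}
	}
}

// unicodeWhitespace decodes one rune at a time and advances past every rune
// that unicode.IsSpace reports as whitespace.
func unicodeWhitespace(s *state) {
	for s.Pos < len(s.Input) {
		r, size := utf8.DecodeRuneInString(s.Input[s.Pos:])
		if !unicode.IsSpace(r) {
			return
		}
		s.Pos += size
	}
}

func main() {
	input := " \t\u00a0x" // space, tab, no-break space (2 bytes), then a letter

	a := &state{Input: input}
	asciiWhitespace(a)

	u := &state{Input: input}
	unicodeWhitespace(u)

	fmt.Println(a.Pos, u.Pos) // 2 4
}
```

The byte-wise version stops at the no-break space because it never decodes multi-byte sequences, which is also why it can be cheaper.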
Types ¶
type Error ¶
type Error struct {
// contains filtered or unexported fields
}
Error represents a parse error. Errors will often be set and then discarded: the parser backs up a little and finds another viable path. In general, when combining errors, the longest error should be returned.
func (*Error) LocateError ¶
LocateError locates the error position in the input string s and returns the error description along with a cursor to the input.
type Parser ¶
Parser is the workhorse of parsify. A parser takes a State and returns a result, consuming some of the State in the process. Because the state is shared, there are a few rules that should be followed:
- A parser that errors must set state.Error
- A parser that errors must not change state.Pos
- A parser that consumed some input should advance state.Pos
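The three rules above can be sketched with a small standalone parser. This is an illustration only: `state` and `result` are hypothetical stand-ins for State and Result, the `func(*state, *result)` shape is an assumption about what a parser looks like, and the error is modeled as a plain string rather than the package's Error type.

```go
package main

import (
	"fmt"
	"strings"
)

// state and result are hypothetical, simplified stand-ins for the
// package's State and Result types.
type state struct {
	Input string
	Pos   int
	Err   string // empty means "no error" in this sketch
}

type result struct {
	Token string
}

// exact returns a parser that matches the literal string match.
func exact(match string) func(*state, *result) {
	return func(s *state, r *result) {
		if !strings.HasPrefix(s.Input[s.Pos:], match) {
			// Rule 1: a failing parser records the error.
			// Rule 2: it leaves Pos exactly where it found it.
			s.Err = fmt.Sprintf("offset %d: expected %s", s.Pos, match)
			return
		}
		// Rule 3: a parser that consumed input advances Pos.
		r.Token = match
		s.Pos += len(match)
	}
}

func main() {
	s := &state{Input: "foobar"}
	var r result

	exact("foo")(s, &r)
	fmt.Println(r.Token, s.Pos, s.Err == "") // foo 3 true

	exact("baz")(s, &r)
	fmt.Println(s.Pos, s.Err) // 3 offset 3: expected baz
}
```

Because a failed parser leaves Pos untouched, a combinator can try the next alternative from the same position without saving and restoring anything itself.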
func Bind ¶
Bind will set the node .Result when the given parser matches. This is useful for giving a value to keywords and constant literals like true and false. See the json parser for an example.
func Chars ¶
Chars is the swiss army knife of character matches. It can match:
- ranges: Chars("a-z") will match one or more lowercase letters
- alphabets: Chars("abcd") will match one or more of the letters abcd in any order
- min and max: Chars("a-z0-9", 4, 6) will match 4-6 lowercase alphanumeric characters
The above can be combined in any order.
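To make the matcher syntax concrete, here is a minimal standalone sketch of one way such a matcher string could be interpreted. It is not the library's implementation: `inClass` and `matchChars` are hypothetical helpers, and using a negative maximum to mean "unbounded" is an assumption of this sketch.

```go
package main

import "fmt"

// inClass reports whether r is covered by a matcher such as "a-z0-9_",
// treating "x-y" as an inclusive range and anything else as a literal.
func inClass(matcher string, r rune) bool {
	rs := []rune(matcher)
	for i := 0; i < len(rs); i++ {
		if i+2 < len(rs) && rs[i+1] == '-' {
			if r >= rs[i] && r <= rs[i+2] {
				return true
			}
			i += 2 // skip the "-y" part of the range
			continue
		}
		if r == rs[i] {
			return true
		}
	}
	return false
}

// matchChars returns the longest prefix of input made of at most maxN
// matching characters. It reports false when fewer than minN matched.
// A negative maxN means "no upper bound" (an assumption of this sketch).
func matchChars(matcher string, minN, maxN int, input string) (string, bool) {
	n, end := 0, len(input)
	for i, r := range input {
		if n == maxN || !inClass(matcher, r) {
			end = i
			break
		}
		n++
	}
	if n < minN {
		return "", false
	}
	return input[:end], true
}

func main() {
	fmt.Println(matchChars("a-z0-9", 4, 6, "abc123xyz")) // abc123 true
	fmt.Println(matchChars("a-z0-9", 4, 6, "ab!"))       // no match: fewer than 4 characters
	fmt.Println(matchChars("abcd", 1, -1, "dcbaX"))      // dcba true
}
```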
func CustomStringLiteral ¶
CustomStringLiteral matches a quoted string and returns it in .Token. It may contain:
- unicode
- escaped characters, eg \", \n, \t
- unicode sequences, eg \uBEEF
The opening and closing quotes are validated by the isValid function you pass in. This function should return true if its argument is a valid opening quote character, plus the correct closing quote character that ends the string. See IsValidRegexpDelimiter.
The only valid escape characters are those defined in the escapes argument, plus one for the closer returned by isValid.
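As an illustration of the callback shape described above, the following standalone sketch defines a hypothetical isValid function with the documented `func(rune) (bool, rune)` signature. The name `bracketOrQuote` and the particular set of accepted quote characters are assumptions chosen for the example, not part of the package.

```go
package main

import "fmt"

// bracketOrQuote is a hypothetical isValid callback. For a valid opening
// quote it returns true plus the character that must close the string.
func bracketOrQuote(open rune) (bool, rune) {
	switch open {
	case '"', '\'':
		return true, open // plain quotes close with the same character
	case '(':
		return true, ')'
	case '[':
		return true, ']'
	case '«':
		return true, '»'
	}
	return false, 0 // not an opening quote
}

func main() {
	for _, open := range []rune{'"', '(', '«', 'x'} {
		ok, closer := bracketOrQuote(open)
		fmt.Printf("%q valid=%v closer=%q\n", open, ok, closer)
	}
}
```

A function like this would be passed as the first argument to CustomStringLiteral, alongside the escapes map.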
func Cut ¶
func Cut() Parser
Cut prevents backtracking beyond this point. Usually used after keywords when you are sure this is the correct path. Improves performance and error reporting.
Example ¶
// Without a cut, if the close tag is left out the parser will backtrack
// and ignore the rest of the string.
alpha := Chars("a-z")
nocut := Many(Any(Seq("<", alpha, ">"), alpha))
_, err := Run(nocut, "asdf <foo")
fmt.Println(err.Error())

// With a cut, once we see the open tag we know there must be a close tag
// that matches it, so the parser will error.
cut := Many(Any(Seq("<", Cut(), alpha, ">"), alpha))
_, err = Run(cut, "asdf <foo")
fmt.Println(err.Error())
Output:
left unparsed: <foo
offset 9: expected >
func Exact ¶
Exact will fully match the exact string supplied, or error. The match will be stored in .Token.
func Many ¶
Many matches the given parser one or more times and returns the matches as .Child[n]. An optional separator can be provided; its value will be consumed but not returned. Only one separator can be provided.
func Map ¶
Map applies the callback if the parser matches. This is used to set the Result based on the matched result.
func NewParser ¶
NewParser should be called around the creation of every Parser. It does nothing normally and should incur no runtime overhead, but when building with -tags debug it will instrument every parser to collect valuable timing information displayable with DumpDebugStats.
func NoAutoWS ¶
NoAutoWS disables the automatic skipping of whitespace between tokens for all parsers underneath.
func NotChars ¶
NotChars accepts the full range of input from Chars, but it will stop when any character matches. If you need to match until you see a sequence, use Until instead.
func NumberLit ¶
func NumberLit() Parser
NumberLit matches a floating point or integer number and returns it as an int64 or float64 in .Result.
func Parsify ¶
Parsify takes a Parserish and makes a Parser out of it. It should be called by any Parser that accepts a Parser as an argument. It should never be called during a parse; instead, call it during parser creation so there is no runtime cost.
See Parserish for details.
func ParsifyAll ¶
ParsifyAll calls Parsify on all parsers.
func Some ¶
Some matches the given parser zero or more times and returns the matches as .Child[n]. An optional separator can be provided; its value will be consumed but not returned. Only one separator can be provided.
func StringLit ¶
StringLit matches a quoted string and returns it in .Token. It may contain:
- unicode
- escaped characters, eg \", \n, \t
- unicode sequences, eg \uBEEF
allowedQuotes is the list of allowed quote characters; both the opening and closing quotes will be the same character from this string.
func UnicodeRegexpMatchLiteral ¶
func UnicodeRegexpMatchLiteral() Parser
func UnicodeRegexpReplaceLiteral ¶
func UnicodeRegexpReplaceLiteral() Parser
func UnicodeStringLiteral ¶
func UnicodeStringLiteral() Parser
UnicodeStringLiteral matches a quoted string and returns it in .Token. It may contain:
- unicode
- escaped characters, eg \", \n, \t
- unicode sequences, eg \uBEEF
The opening and closing quote characters may be any matched pair of Unicode characters from the Pi/Pf categories or from the Ps/Pe categories, plus angle brackets. They may also be any other punctuation character, as long as the opening and closing characters are the same.
type Parserish ¶
type Parserish interface{}
Parserish types are any type that can be turned into a Parser by Parsify. These currently include *Parser and string literals.
This makes recursive grammars cleaner and allows string literals to be used directly in most contexts, e.g. matching balanced parens:
var group Parser
group = Seq("(", Maybe(&group), ")")
vs
var group ParserPtr{}
group.P = Seq(Exact("("), Maybe(group.Parse), Exact(")"))
type Result ¶
type Result struct {
	Token  string
	Child  []Result
	Result interface{}
	Input  string
	Start  int
	End    int
}
Result is the output of a parser. Usually only one of its fields will be set, so it should be thought of more as a union type. Having it avoids interface{} littered all through the parsing code and makes it easy to do the two most common operations: getting a token and finding a child.
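The two common operations mentioned above, reading a token and walking children, can be sketched with a callback of the `func(n *Result)` shape that Map accepts. The sketch is standalone: it redeclares the documented Result struct locally so that it compiles without the package, `sumChildren` is a hypothetical callback, and the node is built by hand rather than produced by a real parse.

```go
package main

import (
	"fmt"
	"strconv"
)

// Result mirrors the documented struct so this sketch compiles on its own.
type Result struct {
	Token  string
	Child  []Result
	Result interface{}
	Input  string
	Start  int
	End    int
}

// sumChildren is a hypothetical Map-style callback: it reads each child's
// Token and stores a computed value in the parent's Result field.
func sumChildren(n *Result) {
	total := 0
	for _, c := range n.Child {
		v, err := strconv.Atoi(c.Token)
		if err != nil {
			continue // this sketch simply skips non-integer children
		}
		total += v
	}
	n.Result = total
}

func main() {
	// A hand-built node, shaped like what a repetition of number tokens
	// might produce.
	n := Result{Child: []Result{{Token: "1"}, {Token: "2"}, {Token: "39"}}}
	sumChildren(&n)
	fmt.Println(n.Result) // 42
}
```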
type State ¶
type State struct {
	// The full input string
	Input string
	// An offset into the string, pointing to the current tip
	Pos int
	// Do not backtrack past this point
	Cut int
	// Error is a secondary return channel from parsers, but used so heavily
	// in backtracking that it has been inlined to avoid allocations.
	Error Error
	// Called to determine what to ignore when WS is called, or when WS fires
	WS VoidParser
}
State is the current parse state. It is entirely public because parsers are expected to mutate it during the parse.
type UnparsedInputError ¶
type UnparsedInputError struct {
Remaining string
}
UnparsedInputError is returned by Run when not all of the input was consumed. There may still be a valid result.
func (UnparsedInputError) Error ¶
func (e UnparsedInputError) Error() string
Error satisfies the Go error interface.
type VoidParser ¶
type VoidParser func(*State)
VoidParser is a special type of parser that never returns anything but can still consume input.