peggysue

package module
v0.0.0-...-f257357 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 28, 2023 License: BSD-2-Clause Imports: 11 Imported by: 0

README

Peggysue GoDoc Report card Sourcegraph

A helpful, neighborhood PEG framework.

Usage

Peggysue implements a PEG framework as a package rather than being an external grammar. This makes it super easy to embed in existing applications to provide easy parsing of whatever you need.

Differences from other PEGs

Peggysue provides memoization of values (a critical function of Packrat parsing) but only memoizes Ref rules. This gives the programmer an easy way to trade off memory for time. Most larger grammars will contain the major rules of the the grammar with another set of helper rules. In most grammars, all rules regardless of use are memoized. This can create huge memory footprints for rules that would be better off just run again than saved.

Values

Each rule when executed against the input stream will produce a value when the rule matches successfully. The documentaiton for each rule creationing function documentments the value of the rule. Action, Transform, and Apply are rules that exist explicitly to return a value. The other rules have a heuristic about what value they return, with many returning nil.

Example

Here is the classic calculator example:

package main

import (
	"fmt"
	"strconv"

	p "github.com/lab47/peggysue"
)

func main() {
	var (
		expr = p.R("expr")
		term = p.R("term")
		num  = p.R("num")
	)

	expr.Set(
		p.Or(
			p.Action(p.Seq(p.Named("i", expr), p.S("+"), p.Named("j", term)), func(v p.Values) interface{} {
				i := v.Get("i").(int)
				j := v.Get("j").(int)
				return i + j
			}),
			p.Action(p.Seq(p.Named("i", expr), p.S("-"), p.Named("j", term)), func(v p.Values) interface{} {
				i := v.Get("x").(int)
				j := v.Get("y").(int)
				return i - j
			}),
			term,
		),
	)

	num.Set(
		p.Transform(p.Plus(p.Range('0', '9')), func(s string) interface{} {
			i, _ := strconv.Atoi(s)
			return i
		}),
	)

	term.Set(
		p.Or(
			p.Action(p.Seq(p.Named("i", term), p.S("*"), p.Named("j", num)), func(v p.Values) interface{} {
				i := v.Get("ii").(int)
				j := v.Get("jj").(int)
				return i * j
			}),
			p.Action(p.Seq(p.Named("i", term), p.S("/"), p.Named("j", num)), func(v p.Values) interface{} {
				i := v.Get("a").(int)
				j := v.Get("b").(int)
				return i / j
			}),
			num,
		),
	)

	val, ok, _ := p.New().Parse(expr, "3*1+2*2")
	if !ok {
		panic("failed")
	}

	fmt.Printf("=> %d\n", val)
}

License

BSD-2-Clause

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Print

func Print(n Rule) string

Print outputs either the rules name (if it has one) or a description of it's operations.

func Repr

func Repr(n Rule) string

Repr outputs a description of the rules operations.

Types

type BranchesBuilder

type BranchesBuilder interface {
	// Add another branch
	Add(name string, branch Rule) Rule
}

type ErrInputNotConsumed

type ErrInputNotConsumed struct {
	MaxPos  int
	MaxRule Rule
}

func (*ErrInputNotConsumed) Error

func (*ErrInputNotConsumed) Error() string

type Labels

type Labels interface {
	// Ref creates or returns a Ref of the given name.
	Ref(name string) Rule

	// Set assigns the given rule to a Ref of the given name.
	Set(name string, rule Rule) Ref
}

Labels provides a simple database for named refs. This is used to cleanup rule set creation.

func Refs

func Refs() Labels

Refs returns a Labels value.

type Option

type Option func(p *Parser)

func WithDebug

func WithDebug(on bool) Option

func WithLogger

func WithLogger(log hclog.Logger) Option

func WithPartial

func WithPartial(on bool) Option

type Parser

type Parser struct {
	// contains filtered or unexported fields
}

Parser is the interface for running a rule against some input

func New

func New(opts ...Option) *Parser

New creates a new Parser value

func (*Parser) Parse

func (p *Parser) Parse(r Rule, input string) (val interface{}, matched bool, err error)

Parse attempts to match the given rule against the input string. If the rule matches, the value of the rule is returned. If the rule matches a portion of input, the ErrInputNotConsumed error is returned.

func (*Parser) ParseFile

func (p *Parser) ParseFile(r Rule, path string) (val interface{}, matched bool, err error)

ParseFile reads the data from the file at the path and parses it using the given Rule

type Ref

type Ref interface {
	Rule

	// Set assigns the given rule to the ref. When the ref is matched as a rule,
	// it will delegate the matching to this rule. The rule will not be invoked
	// the multiple times at the same input position as the Ref caches the result
	// of previous attempts. Thusly it's critical that the rule not depend on state
	// when calculate it's value.
	Set(r Rule)

	// Indicates if this reference has left recursive properties.
	LeftRecursive() bool
}

Ref is a type that provides a reference to a rule. This allows for creating recursive rule sets. Ref rules are memoized, meaning the value of the ref's rule and it's position in the stream are saved and returned to prevent constantly re-running the same rule on the same input. This is a key feature of Packrat parsing as it tames the time complexity of infinite backtracking to linear time.

func R

func R(name string) Ref

R returns a Ref type. These are used to create recursive rule sets where a rule is used before it's definition is created. Ref rules are memoized, meaning the value of the ref's rule and it's position in the stream are saved and returned to prevent constantly re-running the same rule on the same input. This is a key feature of Packrat parsing as it tames the time complexity of infinite backtracking to linear time.

The value of the match is the value of the sub-rule.

func SetRef

func SetRef(name string, rule Rule) Ref

type Rule

type Rule interface {
	Name() string
	SetName(name string)
	// contains filtered or unexported methods
}

Rule is the currency of peggysue. The parser provides the ability to match a Rule against an input stream. Rules are created with the functions on peggysue, like peggysue.S() to match a literal string.

func Action

func Action(r Rule, fn func(Values) interface{}) Rule

Action returns a rule that when it's given rule is matched, the given function is called. The return value of the function becomes the rule's value. The Values argument contains all rule values observed in the curent rule scope (toplevel or from invoking a Ref).

The value of the match is the return value of the given function.

func Any

func Any() Rule

Any returns a rule that will match one rune from the input stream of any value. In other words, it only fails if there is no more input.

The value of the match is nil.

func Apply

func Apply(rule Rule, v interface{}) Rule

Apply returns a rule that when it's given rule matches, it will make a new struct and attempt to assign rule values in scope to the members of the struct. The struct type is computed from the argument v and then saved to be used at match time. The value names should exactly match the struct fields OR if the struct fields use the `ast:` tag, it will be used to look for values.

For example, given:

type Node struct {
   Age int `ast:"age"`
}

Apply(Named("age", numberRule), Node{})

When the above Apply matches, the value from numberRule will be assigned to a new value of Node.

The value of the match is the newly created and populated struct.

func Branches

func Branches(name string, f func(b BranchesBuilder, r Rule)) Rule

Or returns a Rule that will try each of the given rules, completing when the first one successfully matches. This corresponds with a PEG's "ordered choice" operation.

The value of the match is the value of the sub-rule that matched correctly.

func Call

func Call(r Rule, fn func(Values) map[string]interface{}) Rule

Call returns a rule that when invoke, calls the given function which returns a set of values which are provided to the given rule as it's arguments. Arguments are available like Named'd values in Actions. This provides an interface for propegating values between Actions, allow them to parse based on some state.

The value of the match is the value of the given rule.

func Capture

func Capture(r Rule) Rule

Capture returns a Rule that attempts to match it's given rule. If it matches, the value of the rule is set to the section of the input stream that was matched. Said another way, Capture pulls the matched text up as a value.

The value of the match is portion of the input stream that matched the sub-rule.

func Check

func Check(rule Rule) Rule

Check returns a rule that will attempt to match it's given rule and returns it's given rules match result, but it does not consume an input on the stream. This corresponds with the and-predicate ("&e") in most PEGs.

The value of the match is the value of the sub-rule.

func CheckAction

func CheckAction(fn func(Values) bool) Rule

CheckAction returns a rule that when a match is attempted, calls the given function. If the function returns true, the match succeeds. This corresponds with check functions ("&{ a > 2}") style constructions in other PEGs. The Values argument provides access to the current scope.

The value of the match is nil.

func Count

func Count(rule Rule, num int) Rule

Count returns a rule that will attempt to match it's given rule exactly +num+ times. It corresponds to the count rule ("e{2}") in most PEGs.

The value of the match is the value of the last iteration of applying the sub rule.

func Debug

func Debug() Rule

EOS produces a rule that only matches when the input stream has been exhausted.

The value of the match is nil.

func EOS

func EOS() Rule

EOS produces a rule that only matches when the input stream has been exhausted.

The value of the match is nil.

func Many

func Many(rule Rule, min, max int, fn func(values []interface{}) interface{}) Rule

Many returns a rule that will attempt to match it's given rule at least `min` times, and at most `max` times. If max is -1, there is no maximum. Upon matching results, if `fn` is not nil, it wil be called and passed the results of each sub-rule value. If `fn` is nil, the result is the nil.

Important note: the results slice is reused to reduce allocations. The results should be copied out of the results slice if needed.

This is a general purpose form of Star and Plus that adds the ability to process the resulting values as a slice.

The value of the return value of `fn` or, if `fn` is nil, the slice the sub-rule values.

func Maybe

func Maybe(rule Rule) Rule

Maybe returns a rule that will allow it's rule to match, but will always return that it's successfully matched, regardless of what it's rule does. This corresponds with the question mark rule ("e?") in most PEGs.

The value of the match is the value of the sub-rule.

func Memo

func Memo(rule Rule) Rule

Memo creates a rule that perform memoization as part of matching. Memoization is used to speed up matching.

The value of the match is the value of the sub-rule.

func N

func N(name string, r Rule) Rule

N sets the name of the given rule and returns it.

func Named

func Named(name string, rule Rule) Rule

Named will assign the value of the given rule to the current rule scope if it matches.

The value of the match is the value of the sub-rule.

func Not

func Not(rule Rule) Rule

Not returns a rule that will attempt to match it's given rule. If the rule matches, it returns that it did not match (inverting the match result). Regardless of matching, it does not consume any input on the stream. This corresponds with the not-predicate ("!e") in most PEGs.

The value of the match is the value of the sub-rule.

func Or

func Or(rules ...Rule) Rule

Or returns a Rule that will try each of the given rules, completing when the first one successfully matches. This corresponds with a PEG's "ordered choice" operation.

The value of the match is the value of the sub-rule that matched correctly.

func Plus

func Plus(rule Rule) Rule

Plus returns a rule that attempts to match it's given rule as many times as possible. The rule requires that the given rule match at least once. It corresponds to the plus rule ("e+") in most PEGs.

The value of the match is the value of the last successful match of the sub-rule

func PlusCapture

func PlusCapture(rule Rule) Rule

func PrefixTable

func PrefixTable(entries ...interface{}) Rule

PrefixTable is a manual optimization rule. It is used to create a non-ordered choice to a set of a rules based on the next byte in the input sequence. The detection of the byte does not consume any input, which makes PrefixTable the equivalent of using Or(Seq(Check(rune), ...), ...). The arguments are alternating pairs of (byte|string, rule). If a string is passed, the first byte of the string is used only. This rule is used to optimize large a large Or() where each rule has a fixed byte prefix that determines if it should rune. An example is in toolkit/toolkit.go, in the escaped rule.

The value of the match is the value of the sub-rule that matched correctly.

func Range

func Range(start, end rune) Rule

Range returns a rule that will match the next rune in the input stream as being at least 'start', and at most 'end'. This corresponds with the regexp pattern `[A-Z]` but is much faster as it does not require any regexp tracking.

The value of the match is nil.

func Re

func Re(re string) Rule

Re returns a Rule that will match a regexp at the current input position. This regexp can only match at the beginning of the input, it does not search the input for a match.

The value of the match is nil.

func Rune

func Rune(fn func(rune) bool) Rule

Rune returns a rule that attempts to match the next rune in the input stream againnt a Go function. These are useful for be able utilizing existing logic to match rules (such as logic in the unicode package)

func S

func S(str string) Rule

S returns a Rule that will match a literal string exactly.

The value of the match is nil.

func Scan

func Scan(fn func(str string) int) Rule

Scan is a manual optimization rule. It returns a Rule that calls the given function, passing the current input. The return value is how much of the input sequence to consume. If -1 is returned, the rule fails, otherwise the requested about of input is consumed and the rule passes.

The value of the match is nil.

func Scope

func Scope(rule Rule) Rule

Scope introduces a new rule scope. This is generally not needed as most users will use the automatically created scope from an Action rule.

The value of the match is the value of the sub-rule.

func Seq

func Seq(rules ...Rule) Rule

Seq returns a rule that will attempt to match each of the given rules in order. It only matches successfully if each of it's rules match.

The value of the match is the value of the right most sub-rule that produced a non-nil value.

func Set

func Set(runes ...rune) Rule

Set returns a rule that will match the next rune in the input stream as one of the given runes. This corresponds with the regexp pattern `[abc]` but is much faster as it does not require any regexp tracking.

The value of the match is nil.

func Star

func Star(rule Rule) Rule

Star returns a rule that will attempt to match it's given rule as many times as possible. This rule always matches because it allows for zero matches. It corresponds to the star rule ("e*") in most PEGs.

The value of the match is the value of the last iteration of applying the sub rule.

func Transform

func Transform(r Rule, fn func(string) interface{}) Rule

Transform returns a Rule that invokes it's given rule and if it matches calls the given function, passing the section of the input stream that was matched. The return value becomes the value of the rule.

The value of the match is the return value of the given function.

type SetPositioner

type SetPositioner interface {
	SetPosition(start, end, line int, filename string)
}

SetPositioner is an optional interface. When values implement it, peggysue will call it with the information about the position of the value in the inputs tream.

type Values

type Values interface {
	Get(name string) interface{}
	// contains filtered or unexported methods
}

Values provides the same of rule values gathered. The names correspond to Named rules that were observed in the current scope.

Directories

Path Synopsis
examples

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL