parse

package
v1.0.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 5, 2023 License: MIT Imports: 17 Imported by: 2

Documentation

Overview

Package parse provides language parsers for the ictiobus parser generator. These parsers implement Parser and are invoked by calling their Parse() method with a lex.TokenStream to get tokens from. The parser will read tokens from the stream and apply syntactic analysis to try and produce a parse tree, represented as a Tree.

This package currently provides an LL(1) parser, a Simple LR(1) parser, a Canonical LR(1) parser, and an LALR(1) parser, as well as the means to generate each from a context-free grammar describing the accepted language. The exact type of parser needed depends on the grammar.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func EncodeBytes added in v1.0.0

func EncodeBytes(p Parser) []byte

EncodeBytes takes a Parser and encodes it as a binary value (using an internal binary format called 'REZI').

func IsLL1 added in v0.8.0

func IsLL1(g grammar.CFG) bool

IsLL1 returns whether the grammar is LL(1).

func WriteFile added in v1.0.0

func WriteFile(p Parser, filename string) error

WriteFile stores the parser in a binary file (encoded using an internal format called 'REZI'). The Parser can later be retrieved from the file by calling ReadFile on it.

Types

type Algorithm added in v0.8.0

type Algorithm string

Algorithm is a classification of parsers in ictiobus.

const (
	LL1   Algorithm = "LL(1)"
	SLR1  Algorithm = "SLR(1)"
	CLR1  Algorithm = "CLR(1)"
	LALR1 Algorithm = "LALR(1)"
)

func ParseAlgorithm added in v0.8.0

func ParseAlgorithm(s string) (Algorithm, error)

ParseAlgorithm parses a string containing the name of an Algorithm.

func (Algorithm) String added in v0.8.0

func (pt Algorithm) String() string

String returns the string representation of a ParserType.

type Parser added in v0.8.0

type Parser interface {
	encoding.BinaryMarshaler
	encoding.BinaryUnmarshaler

	// Parse parses input text and returns the parse tree built from it, or a
	// SyntaxError with the description of the problem.
	Parse(stream lex.TokenStream) (Tree, error)

	// Type returns a string indicating what kind of parser was generated. This
	// will be "LL(1)", "SLR(1)", "CLR(1)", or "LALR(1)"
	Type() Algorithm

	// TableString returns the parsing table as a string.
	TableString() string

	// RegisterTraceListener sets up a function to call when an event occurs.
	// The events are determined by the individual parsers but involve
	// examination of the parser stack or other critical moments that may aid in
	// debugging.
	RegisterTraceListener(func(s string))

	// DFAString returns a string representation of the DFA for this parser, if one
	// so exists. Will return the empty string if the parser is not of the type
	// to have a DFA.
	DFAString() string

	// Grammar returns the grammar that this parser can parse.
	Grammar() grammar.CFG
}

A Parser represents an in-progress or ready-built parsing engine ready for use. It can be stored as a byte representation and retrieved from bytes as well.

func DecodeBytes added in v1.0.0

func DecodeBytes(data []byte) (p Parser, err error)

DecodeBytes takes a slice of bytes containing a Parser encoded as a binary value (using an internal binary format called 'REZI') and decodes it into the Parser it represents.

func EmptyCLR1Parser added in v0.3.0

func EmptyCLR1Parser() Parser

EmptyCLR1Parser returns a completely empty CLR1Parser, unsuitable for use. Generally this should not be used directly except for internal purposes; use GenerateCanonicalLR1Parser to generate one ready for use

func EmptyLALR1Parser added in v0.3.0

func EmptyLALR1Parser() Parser

EmptyLALR1Parser returns a completely empty LALR1Parser, unsuitable for use. Generally this should not be used directly except for internal purposes; use GenerateLALR1Parser to generate one ready for use

func EmptyLL1Parser added in v0.3.0

func EmptyLL1Parser() Parser

EmptyLL1Parser returns a completely empty LL1 parser, unsuitable for use. Generally this should not be used directly except for internal purposes; use GenerateLL1Parser to generate one ready for use.

func EmptySLR1Parser added in v0.3.0

func EmptySLR1Parser() Parser

EmptySLR1Parser returns a completely empty SLR1Parser, unsuitable for use. Generally this should not be used directly except for internal purposes; use GenerateSimpleLRParser to generate one ready for use

func GenerateCLR1Parser added in v0.8.0

func GenerateCLR1Parser(g grammar.CFG, allowAmbig bool) (Parser, []string, error)

GenerateCLR1Parser returns a parser that uses the set of canonical LR(1) items from g to parse input in language g. The provided language must be in LR(1) or else the a non-nil error is returned.

allowAmbig allows the use of ambiguous grammars; in cases where there is a shift-reduce conflict, shift will be preferred. If the grammar is detected as ambiguous, the 2nd arg 'ambiguity warnings' will be filled with each ambiguous case detected.

func GenerateLALR1Parser

func GenerateLALR1Parser(g grammar.CFG, allowAmbig bool) (Parser, []string, error)

GenerateLALR1Parser returns a parser that uses the set of canonical LR(1) items from g to parse input in language g. The provided language must be in LR(1) or else the a non-nil error is returned.

allowAmbig allows the use of ambiguous grammars; in cases where there is a shift-reduce conflict, shift will be preferred. If the grammar is detected as ambiguous, the 2nd arg 'ambiguity warnings' will be filled with each ambiguous case detected.

func GenerateLL1Parser

func GenerateLL1Parser(g grammar.CFG) (Parser, error)

GenerateLL1Parser generates a parser for LL1 grammar g. The grammar must already be LL1 or convertible to an LL1 grammar.

The returned parser parses the input using LL(k) parsing rules on the context-free Grammar g (k=1). The grammar must already be LL(1); it will not be forced to it.

func GenerateSLR1Parser added in v0.8.0

func GenerateSLR1Parser(g grammar.CFG, allowAmbig bool) (Parser, []string, error)

GenerateSLR1Parser returns a parser that uses SLR bottom-up parsing to parse languages in g. It will return an error if g is not an SLR(1) grammar.

allowAmbig allows the use of ambiguous grammars; in cases where there is a shift-reduce conflict, shift will be preferred. If the grammar is detected as ambiguous, the 2nd arg 'ambiguity warnings' will be filled with each ambiguous case detected.

func ReadFile added in v1.0.0

func ReadFile(filename string) (Parser, error)

ReadFile retrieves a Parser by reading a file containing one encoded as binary bytes. This will read files created with WriteFile.

type Tree added in v1.0.0

type Tree struct {
	// Terminal is whether thie node is for a terminal symbol.
	Terminal bool

	// Value is the symbol at this node.
	Value string

	// Source is only available when Terminal is true.
	Source lex.Token

	// Children is all children of the parse tree.
	Children []*Tree
}

Tree is a parse tree returned by a parser performing analysis on input source code.

func DeriveFullTree added in v0.8.0

func DeriveFullTree(g grammar.CFG, fakeValProducer ...map[string]func() string) ([]Tree, error)

DeriveFullTree derives a parse tree based on the grammar that is guaranteed to contain every rule at least once. It is *not* garunteed to have each rule only once, although it will make a best effort to minimize duplications.

The fakeValProducer map, if provided, will be used to assign a lexed value to each synthesized terminal node that has a class whose ID matches a key in it. If the map is not provided or does not contain a key for a token class matching a terminal being generated, a default string will be automatically generated for it.

func Leaf added in v1.0.0

func Leaf(term string, t ...lex.Token) *Tree

Leaf is a convenience function for creating a new Tree that represents a terminal symbol. The Source token may or may not be set as desired. Note that t's type being ...Token is simply to make it optional; only the first such provided t is examined.

func MustParseTreeFromDiagram added in v0.8.0

func MustParseTreeFromDiagram(s string) *Tree

MustParseTreeFromDiagram is the same as ParseTreeFromDiagram but panics if any error occurs.

func Node added in v1.0.0

func Node(nt string, children ...*Tree) *Tree

Node is a convenience function for creating a new Tree that represents a non-terminal symbol with minimal properties.

func ParseTreeFromDiagram added in v0.8.0

func ParseTreeFromDiagram(s string) (*Tree, error)

ParseTreeFromDiagram reads a diagram of a parse tree and returns a Tree that represents it. In the diagram string s, terminal nodes are enclosed in parenthesis brackets, while non-terminal nodes are enclosed in square brackets. The diagram is read from left to right, and all whitespace is ignored. If a literal parenthesis or square bracket is desired, it must be escaped with a backslash. literal backslashes must be escaped with another backslash.

For example, the following diagram:

 [S

			  [NUM
			    (-)
			    (2)
			  ]
			  (+)
			  [NUM
				(3)
			  ]

	    ]

func (Tree) Copy added in v1.0.0

func (pt Tree) Copy() Tree

Copy returns a duplicate, deeply-copied parse tree.

func (Tree) Equal added in v1.0.0

func (pt Tree) Equal(o any) bool

Equal returns whether the parseTree is equal to the given object. If the given object is not a parseTree, returns false, else returns whether the two parse trees have the exact same structure.

Does not consider the Source field, ergo only the structures of the trees are compared, not their contents.

Runs in O(n) time with respect to the number of nodes in the trees.

func (Tree) Follow added in v1.0.0

func (pt Tree) Follow(path []int) *Tree

Follow takes a path, denoted as a slice of indexes of children to follow, starting from the Tree it is called on, and returns the descendant tree it leads to.

func (Tree) IsSubTreeOf added in v1.0.0

func (pt Tree) IsSubTreeOf(t Tree) (contains bool, path []int)

IsSubTreeOf checks if this Tree is a sub-tree of the given parse tree t. Does not consider Source for its comparisons, ergo only the structure is examined.

This performs a depth-first traversal of t, checking if there is any sub-tree in t s.t. pt is exactly equal to that node. Runs in O(n^2) time with respect to the number of nodes in the trees.

Returns whether pt is a sub-tree of t, and if so, the path to the first node in t where this is the case. The path is represented as a slice of ints where each is the child index of the node to traverse to. If it is empty, then the root node is the first node where sub is a sub-tree; this is not necessarily the same as equality.

func (Tree) PathToDiff added in v1.0.0

func (pt Tree) PathToDiff(t Tree, ignoreShortCircuit bool) (path []int, diverges bool)

PathToDiff returns the point at which the two parse trees diverge, as well as whether they diverge at all. If they do not diverge, the returned path should not be used.

Finds the path to the point at which the two trees diverge. If set to ignore short-circuit, does not include any branches that were inserted via shortest-circuit, as detected with "__ICTIO__:SHORTCIRC:" in front of it.

If there are multiple nodes that satisfy the above definition, then the point of divergence is the first common ancestor of all such nodes.

This does not consider the Source field, ergo only the structures of the trees are compared, not their contents.

Runs in O(n) time with respect to the number of nodes in the trees.

func (Tree) String added in v1.0.0

func (pt Tree) String() string

String returns a prettified representation of the entire parse tree suitable for use in line-by-line comparisons of tree structure. Two parse trees are considered semantcally identical if they produce identical String() output.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL