ebnf

Version: v0.0.0-...-b10e257
Published: Feb 21, 2015 License: BSD-3-Clause Imports: 9 Imported by: 2

README

Package ebnf is a library for EBNF grammars. It was salvaged from the pre-Go 1
exp/ebnf package because other packages at github.com/cznic used to depend on it.

This fork adds literal regular expression support in the EBNF grammar, so that
a complete grammar with terminal definitions is possible.

Install: $ go get github.com/PuerkitoBio/ebnf
Godocs: http://godoc.org/github.com/PuerkitoBio/ebnf

Documentation

Overview

Package ebnf is a library for EBNF grammars. The input is UTF-8 text satisfying the following grammar (itself written in EBNF):

	Production  = name "=" [ Expression ] "." .
	Expression  = Alternative { "|" Alternative } .
	Alternative = Term { Term } .
	Term        = name | regexp_lit | str_lit | char_lit [ "…" char_lit ] | Group | Option | Repetition .
	Group       = "(" Expression ")" .
	Option      = "[" Expression "]" .
	Repetition  = "{" Expression "}" .

	name       = /[\pL_][\pL\pNd_]*/ . // unicode letter or underscore, then add unicode digits
	regexp_lit = /\/[^\n\/]+\// .      // TODO, missing escaped slash
	str_lit    = .                     // TODO, same as Go
	char_lit   = .                     // TODO, same as Go

A name is a Go identifier, a token is a Go string, and comments and white space follow the same rules as in the Go language. Regular expression literals are written between forward slashes; they are checked against Go's regexp syntax when the Verify function is called on a parsed grammar. Production names starting with an uppercase Unicode letter denote non-terminal productions (i.e., productions which allow white space and comments between tokens); all other production names denote lexical productions.
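The naming rule above can be sketched with the standard library alone. The helper isLexical is hypothetical and is not part of this package's documented API; it only illustrates the "uppercase first letter means non-terminal" convention.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// isLexical reports whether a production name denotes a lexical production,
// that is, whether it does NOT start with an uppercase Unicode letter.
// Hypothetical helper for illustration; not taken from package ebnf.
func isLexical(name string) bool {
	r, _ := utf8.DecodeRuneInString(name)
	return !unicode.IsUpper(r)
}

func main() {
	for _, name := range []string{"Expression", "name", "_tmp", "Éclair"} {
		kind := "non-terminal"
		if isLexical(name) {
			kind = "lexical"
		}
		fmt.Printf("%s: %s\n", name, kind)
	}
}
```

Decoding the first rune instead of indexing the first byte keeps the check correct for names that begin with a non-ASCII letter.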

Functions

func Verify

func Verify(grammar Grammar, start string) error

Verify checks that:

  • all productions used are defined
  • all productions defined are used when beginning at start
  • lexical productions refer only to other lexical productions
  • regular expression literals are valid

The signature takes no file set argument; position information is carried by the grammar nodes themselves as scanner.Position values (see Expression.Pos).
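The four checks listed above can be illustrated on a simplified model of a grammar. This is a sketch under stated assumptions, not the package's implementation: the production type, check, and sampleGrammar are hypothetical, each production is reduced to the names and regular expressions it mentions, and regexp.Compile is assumed as the validity test.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"unicode"
	"unicode/utf8"
)

// production is a hypothetical, simplified stand-in for one grammar rule.
type production struct {
	refs    []string // production names the rule refers to
	regexps []string // regular expression literals in the rule
}

// isLexical: names not starting with an uppercase letter are lexical.
func isLexical(name string) bool {
	r, _ := utf8.DecodeRuneInString(name)
	return !unicode.IsUpper(r)
}

// check returns one message per violated rule, sorted for stable output.
func check(g map[string]production, start string) []string {
	var errs []string
	if _, ok := g[start]; !ok {
		errs = append(errs, "no start production "+start)
	}
	// Mark every production reachable from start.
	reached := map[string]bool{}
	var visit func(name string)
	visit = func(name string) {
		p, ok := g[name]
		if !ok || reached[name] {
			return
		}
		reached[name] = true
		for _, ref := range p.refs {
			visit(ref)
		}
	}
	visit(start)
	for name, p := range g {
		for _, ref := range p.refs {
			if _, ok := g[ref]; !ok {
				errs = append(errs, fmt.Sprintf("%s: missing production %s", name, ref))
			} else if isLexical(name) && !isLexical(ref) {
				errs = append(errs, fmt.Sprintf("%s: lexical production refers to non-terminal %s", name, ref))
			}
		}
		for _, re := range p.regexps {
			if _, err := regexp.Compile(re); err != nil {
				errs = append(errs, fmt.Sprintf("%s: invalid regexp %q", name, re))
			}
		}
		if !reached[name] {
			errs = append(errs, name+" is unreachable from "+start)
		}
	}
	sort.Strings(errs)
	return errs
}

// sampleGrammar deliberately violates each of the four rules once.
func sampleGrammar() map[string]production {
	return map[string]production{
		"Expr":   {refs: []string{"Term", "Missing"}},
		"Term":   {refs: []string{"digit"}},
		"digit":  {refs: []string{"Term"}, regexps: []string{"[0-9"}},
		"unused": {},
	}
}

func main() {
	for _, msg := range check(sampleGrammar(), "Expr") {
		fmt.Println(msg)
	}
}
```

Collecting all violations instead of stopping at the first one is a design choice of this sketch; whether Verify itself reports one error or several is not stated in this documentation.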

Types

type Alternative

type Alternative []Expression // x | y | z

An Alternative node represents a non-empty list of alternative expressions.

func (Alternative) Pos

func (x Alternative) Pos() scanner.Position

type Bad

type Bad struct {
	TokPos scanner.Position
	Error  string // parser error message
}

A Bad node stands for pieces of source code that lead to a parse error.

func (*Bad) Pos

func (x *Bad) Pos() scanner.Position

type Expression

type Expression interface {
	// Pos is the position of the first character of the syntactic construct
	Pos() scanner.Position
}

An Expression node represents a production expression.

type Grammar

type Grammar map[string]*Production

A Grammar is a set of EBNF productions. The map is indexed by production name.

func Parse

func Parse(filename string, src io.Reader) (Grammar, error)

Parse parses a set of EBNF productions from source src. It returns a set of productions. Errors are reported for incorrect syntax and if a production is declared more than once; the filename is used only for error positions.
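The duplicate-declaration rule, and the role of filename as a label for error positions only, can be sketched with a toy reader. This is not the package's parser: declare and declareText are hypothetical, and the sketch assumes one production per line in the form name = body, which the real grammar does not require.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// declare is a toy stand-in for a parser. It records the line on which each
// production name is first declared and fails on a second declaration.
// The filename is never opened; it only prefixes the error position.
func declare(filename string, src io.Reader) (map[string]int, error) {
	seen := map[string]int{} // production name -> line of first declaration
	sc := bufio.NewScanner(src)
	for line := 1; sc.Scan(); line++ {
		name, _, ok := strings.Cut(sc.Text(), "=")
		if !ok {
			continue // no "=" on this line: not a production in this toy format
		}
		name = strings.TrimSpace(name)
		if first, dup := seen[name]; dup {
			return nil, fmt.Errorf("%s:%d: %s already declared on line %d",
				filename, line, name, first)
		}
		seen[name] = line
	}
	return seen, sc.Err()
}

// declareText is a convenience wrapper for in-memory input.
func declareText(filename, text string) (map[string]int, error) {
	return declare(filename, strings.NewReader(text))
}

func main() {
	_, err := declareText("g.ebnf", "a = \"x\" .\nb = a .\na = b .\n")
	fmt.Println(err)
}
```

Taking an io.Reader, as Parse does, lets the same code handle files, network streams, and in-memory strings.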

type Group

type Group struct {
	Lparen scanner.Position
	Body   Expression // (body)
}

A Group node represents a grouped expression.

func (*Group) Pos

func (x *Group) Pos() scanner.Position

type Name

type Name struct {
	StringPos scanner.Position
	String    string
}

A Name node represents a production name.

func (*Name) Pos

func (x *Name) Pos() scanner.Position

type Option

type Option struct {
	Lbrack scanner.Position
	Body   Expression // [body]
}

An Option node represents an optional expression.

func (*Option) Pos

func (x *Option) Pos() scanner.Position

type Production

type Production struct {
	Name *Name
	Expr Expression
}

A Production node represents an EBNF production.

func (*Production) Pos

func (x *Production) Pos() scanner.Position

type Range

type Range struct {
	Begin, End *Token // begin ... end
}

A Range node represents a range of characters.

func (*Range) Pos

func (x *Range) Pos() scanner.Position

type Repetition

type Repetition struct {
	Lbrace scanner.Position
	Body   Expression // {body}
}

A Repetition node represents a repeated expression.

func (*Repetition) Pos

func (x *Repetition) Pos() scanner.Position

type Sequence

type Sequence []Expression // x y z

A Sequence node represents a non-empty list of sequential expressions.

func (Sequence) Pos

func (x Sequence) Pos() scanner.Position

type Token

type Token struct {
	StringPos scanner.Position
	String    string
	Regexp    bool
}

A Token node represents a literal.

func (*Token) Pos

func (x *Token) Pos() scanner.Position
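The node types above compose into a tree that is typically traversed with a type switch. The sketch below redeclares local mirrors of the documented types so that it compiles on its own; it assumes text/scanner's Position as a stand-in for this module's scanner.Position, and the names and sample functions are hypothetical.

```go
package main

import (
	"fmt"
	"text/scanner"
)

// Local mirrors of the documented node types (assumption: same shapes).
type Expression interface{ Pos() scanner.Position }

type (
	Alternative []Expression // x | y | z
	Sequence    []Expression // x y z
	Name        struct {
		StringPos scanner.Position
		String    string
	}
	Token struct {
		StringPos scanner.Position
		String    string
		Regexp    bool
	}
	Range struct{ Begin, End *Token }
	Group struct {
		Lparen scanner.Position
		Body   Expression
	}
	Option struct {
		Lbrack scanner.Position
		Body   Expression
	}
	Repetition struct {
		Lbrace scanner.Position
		Body   Expression
	}
)

func (x Alternative) Pos() scanner.Position { return x[0].Pos() }
func (x Sequence) Pos() scanner.Position    { return x[0].Pos() }
func (x *Name) Pos() scanner.Position       { return x.StringPos }
func (x *Token) Pos() scanner.Position      { return x.StringPos }
func (x *Range) Pos() scanner.Position      { return x.Begin.Pos() }
func (x *Group) Pos() scanner.Position      { return x.Lparen }
func (x *Option) Pos() scanner.Position     { return x.Lbrack }
func (x *Repetition) Pos() scanner.Position { return x.Lbrace }

// names collects, in source order, the production names an expression refers to.
func names(e Expression) []string {
	var out []string
	switch x := e.(type) {
	case Alternative:
		for _, y := range x {
			out = append(out, names(y)...)
		}
	case Sequence:
		for _, y := range x {
			out = append(out, names(y)...)
		}
	case *Group:
		out = names(x.Body)
	case *Option:
		out = names(x.Body)
	case *Repetition:
		out = names(x.Body)
	case *Name:
		out = []string{x.String}
	}
	return out // *Token, *Range and nil refer to no productions
}

// sample builds the tree for: number | "(" Expr ")"
func sample() Expression {
	return Alternative{
		&Name{String: "number"},
		Sequence{&Token{String: "("}, &Name{String: "Expr"}, &Token{String: ")"}},
	}
}

func main() {
	fmt.Println(names(sample()))
}
```

Alternative and Sequence are value types while the struct nodes are used through pointers, matching the receivers listed in the documentation above.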

Directories

Path Synopsis
cmd
Package scanner provides a scanner and tokenizer for UTF-8-encoded text.
