Documentation ¶
Overview ¶
Package parse is an easy-to-use PEG implementation for Go.
This package contains a PEG (Parsing Expression Grammar) implementation for Go. It differs from other libraries in that the grammar is mapped to Go types, so you do not need external grammar files or expression combinators to specify it, as you would with pyparsing or Boost.Spirit.
For example, you can parse "Hello, World!" using this structure:
	type HelloWorld struct {
		Hello string `regexp:"[hH]ello"`
		_     string `literal:","`
		World string `regexp:"[a-zA-Z]+"`
		_     string `regexp:"!?"`
	}
Then the only thing you need to do is call the Parse function:
	var hello HelloWorld
	newLocation, err := parse.Parse(&hello, []byte("Hello, World!"), nil)
You can also specify a whitespace-skipping function (the default skips all spaces, tabs, newlines and carriage returns), enable packrat parsing, turn on grammar debugging, and so on.
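A custom skipping function only has to match the signature of the SkipWhite field shown under Options, func(str []byte, loc int) int. The following is a minimal, standalone sketch of such a function. It does not import the package, and the names skipBlanks, skipHashComment and skipBlanksAndComments are hypothetical helpers written for this illustration, not part of the library.

```go
package main

import "fmt"

// skipBlanks advances loc past spaces, tabs, newlines and carriage returns.
func skipBlanks(str []byte, loc int) int {
	for loc < len(str) {
		switch str[loc] {
		case ' ', '\t', '\n', '\r':
			loc++
		default:
			return loc
		}
	}
	return loc
}

// skipHashComment advances loc past a "# ..." comment, stopping at the
// newline (or at the end of input). It returns loc unchanged if no
// comment starts at loc.
func skipHashComment(str []byte, loc int) int {
	if loc >= len(str) || str[loc] != '#' {
		return loc
	}
	for loc < len(str) && str[loc] != '\n' {
		loc++
	}
	return loc
}

// skipBlanksAndComments alternates both skippers until neither makes
// progress. Its signature matches func(str []byte, loc int) int.
func skipBlanksAndComments(str []byte, loc int) int {
	for {
		next := skipHashComment(str, skipBlanks(str, loc))
		if next == loc {
			return loc
		}
		loc = next
	}
}

func main() {
	src := []byte("  # note\n\t42")
	loc := skipBlanksAndComments(src, 0)
	fmt.Printf("%d %q\n", loc, src[loc:]) // prints: 10 "42"
}
```

A function of this shape could be assigned to the SkipWhite field of Options. The library's own SkipSpaces and SkipShellComment functions, listed in the index, have the same signature.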
One of the interesting features of this library is the ability to parse Go's basic data types using Go syntax. For example, you can parse an int64 directly with Parse:
	var i int64
	newLocation, err := parse.Parse(&i, []byte("123"), nil)
If you need to parse variant types, insert FirstOf as the first field of your structure:
	type StringOrInt struct {
		parse.FirstOf
		Str string
		Int int
	}

	newLocation, err := parse.Parse(new(StringOrInt), []byte(`"I can parse Go string!"`), nil)
Optional fields must be of pointer type and carry the `optional:"true"` tag. Slices are parsed as ELEMENT* or as ELEMENT+ (if `repeat:"+"` is set in the tag). Other supported tags and types are listed below.
+-------------+-------------+----------------------------------------------------+
| Type        | Tag         | Description                                        |
+-------------+-------------+----------------------------------------------------+
| string      |             | Parse Go string. `string` and "string" are both    |
|             |             | supported.                                         |
+-------------+-------------+----------------------------------------------------+
| string      | regexp      | Parse regular expression in regexp package syntax. |
+-------------+-------------+----------------------------------------------------+
| string      | literal     | Parse the literal given in the tag. If both regexp |
|             |             | and literal are specified, regexp is used.         |
+-------------+-------------+----------------------------------------------------+
| int*        |             | Parse integer constant. Hexadecimal, octal and     |
|             |             | decimal constants are supported. int32 and rune    |
|             |             | are the same type in Go, so int32 parses characters|
|             |             | in Go syntax.                                      |
+-------------+-------------+----------------------------------------------------+
| int*        | parse       | If tag parse:"#" is set, the parser saves the      |
|             |             | current location in this field without advancing.  |
+-------------+-------------+----------------------------------------------------+
| uint*       |             | Same as int*, but for unsigned constants.          |
+-------------+-------------+----------------------------------------------------+
| float*      |             | Parse floating point number.                       |
+-------------+-------------+----------------------------------------------------+
| bool        |             | Parse boolean constant (true or false).            |
+-------------+-------------+----------------------------------------------------+
| []type      | parse       | Parse sequence of type. If parse is not specified  |
|             |             | or is '*', there can be zero or more elements.     |
|             |             | If parse is '+', there must be one or more         |
|             |             | elements.                                          |
+-------------+-------------+----------------------------------------------------+
| []type      | delimiter   | Parse list with delimiter literal. Lists like      |
|             |             | a DELIMITER b DELIMITER ... are very common, so I  |
|             |             | think it is a good idea to support them out of     |
|             |             | the box.                                           |
+-------------+-------------+----------------------------------------------------+
| *type       | parse       | Parse type. Element will be allocated, or set to   |
|             |             | nil for optional elements that are not present.    |
|             |             | If parse is set to '?', the element is optional:   |
|             |             | if it is not present in the input, the field       |
|             |             | will be nil.                                       |
+-------------+-------------+----------------------------------------------------+
| any         | parse       | If parse == "skip", the field is skipped while     |
|             |             | parsing or encoding. If parse == "&", it is an     |
|             |             | and-predicate (followed by): the element is parsed |
|             |             | but the position is not advanced. If parse == "!", |
|             |             | it is a not-predicate: the element must not be     |
|             |             | present at this position.                          |
+-------------+-------------+----------------------------------------------------+
| any         | set         | If present, this tag contains the name of the      |
|             |             | method to call after parsing the element. The      |
|             |             | method must have the signature                     |
|             |             | func (x element-type) error.                       |
+-------------+-------------+----------------------------------------------------+
The parser supports left recursion out of the box, so you can parse expressions without a problem. For example, you can parse this grammar:
	X <- E
	E <- X '-' Number / Number
with
	type X struct {
		Expr E
	}

	type E struct {
		parse.FirstOf
		Expr struct {
			Expr *X
			_    string `regexp:"-"`
			N    uint64
		}
		N uint64
	}
Index ¶
- func Append(array []byte, value interface{}) ([]byte, error)
- func Parse(result interface{}, str []byte, params *Options) (newLocation int, err error)
- func SkipAdaComment(str []byte, loc int) int
- func SkipAll(str []byte, loc int, funcs ...func([]byte, int) int) int
- func SkipCComment(str []byte, loc int) int
- func SkipCPPComment(str []byte, loc int) int
- func SkipHTMLComment(str []byte, loc int) int
- func SkipLispComment(str []byte, loc int) int
- func SkipMultilineComment(str []byte, loc int, begin, end string, recursive bool) int
- func SkipOneLineComment(str []byte, loc int, begin string) int
- func SkipPascalComment(str []byte, loc int) int
- func SkipShellComment(str []byte, loc int) int
- func SkipSpaces(str []byte, loc int) int
- func SkipTeXComment(str []byte, loc int) int
- func Write(out io.Writer, value interface{}) error
- type Error
- type FirstOf
- type Options
- type Parser
Functions ¶
func Parse ¶
Parse parses a value from str using the PEG parser and returns the position after the parsed text and an error. Here result is a pointer to the value to fill, str is the input to parse, and params holds the parsing parameters. The function returns newLocation, the location just after the parsed input. On failure err != nil.
func SkipAdaComment ¶
SkipAdaComment skips Ada style comment: "-- .... \n"
func SkipCComment ¶
SkipCComment skips C style comment: "/* ..... */"
func SkipCPPComment ¶
SkipCPPComment skips C++ style comment: "// ..... \n"
func SkipHTMLComment ¶
SkipHTMLComment skips HTML style comment: "<!-- ... -->"
func SkipLispComment ¶
SkipLispComment skips Lisp style comment: "; .... \n"
func SkipMultilineComment ¶
SkipMultilineComment skips a multiline comment that starts with begin and ends with end. To allow nested comments, set recursive to true.
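The effect of the recursive flag is easiest to see on a nested comment. The sketch below is a standalone illustration of depth counting and is not the library's implementation. The name skipNested is hypothetical, and returning loc unchanged for a missing or unterminated comment is an assumption made for this example.

```go
package main

import (
	"bytes"
	"fmt"
)

// skipNested skips a comment delimited by begin and end that starts at loc.
// With recursive set, inner begin markers increase the nesting depth.
// Assumption for this sketch: loc is returned unchanged when no comment
// starts at loc or when the comment is never closed.
func skipNested(str []byte, loc int, begin, end string, recursive bool) int {
	if loc < 0 || loc >= len(str) || !bytes.HasPrefix(str[loc:], []byte(begin)) {
		return loc
	}
	depth := 1
	i := loc + len(begin)
	for i < len(str) {
		switch {
		case bytes.HasPrefix(str[i:], []byte(end)):
			depth--
			i += len(end)
			if depth == 0 {
				return i
			}
		case recursive && bytes.HasPrefix(str[i:], []byte(begin)):
			depth++
			i += len(begin)
		default:
			i++
		}
	}
	return loc // unterminated comment
}

func main() {
	src := []byte("(* a (* b *) c *) x")
	fmt.Println(skipNested(src, 0, "(*", "*)", true))  // prints: 17
	fmt.Println(skipNested(src, 0, "(*", "*)", false)) // prints: 12
}
```

Without nesting, the first end marker closes the comment, so the remainder " c *) x" is left for the grammar to deal with.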
func SkipOneLineComment ¶
SkipOneLineComment skips a one-line comment that starts with begin and ends at a newline or at the end of the string.
func SkipPascalComment ¶
SkipPascalComment skips Pascal style comment: "(* ... *)"
func SkipShellComment ¶
SkipShellComment skips shell style comment: "# .... \n"
func SkipSpaces ¶
SkipSpaces skips spaces, tabs and newlines.
func SkipTeXComment ¶
SkipTeXComment skips TeX style comment: "% .... \n"
Types ¶
type Error ¶
	type Error struct {
		// Original string
		Str []byte
		// Location of this error in the original string
		Location int
		// Error message
		Message string
	}
Error is the representation of a parse error. It implements the error interface. The error message contains the message, position information and the marked error line.
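To show how a message with position information and a marked line can be derived from the three fields above, here is a standalone sketch. The type parseError is a hypothetical stand-in that mirrors the field names. The exact output format of the library's Error type is not documented here, so the "line:column" layout and the caret marker are assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// parseError mirrors the fields of Error for illustration only.
type parseError struct {
	Str      []byte // original input
	Location int    // byte offset of the error in Str
	Message  string // error message
}

// Error reports "line:column: message", then the offending line and a
// caret under the error position. Columns are counted in bytes.
func (e parseError) Error() string {
	loc := e.Location
	if loc < 0 {
		loc = 0
	}
	if loc > len(e.Str) {
		loc = len(e.Str)
	}
	line := 1 + bytes.Count(e.Str[:loc], []byte("\n"))
	start := bytes.LastIndexByte(e.Str[:loc], '\n') + 1
	end := bytes.IndexByte(e.Str[loc:], '\n')
	if end < 0 {
		end = len(e.Str)
	} else {
		end += loc
	}
	col := loc - start + 1
	return fmt.Sprintf("%d:%d: %s\n%s\n%s^",
		line, col, e.Message, e.Str[start:end], strings.Repeat(" ", col-1))
}

// Compile-time check that parseError satisfies the error interface.
var _ error = parseError{}

func main() {
	err := parseError{
		Str:      []byte("a = 1\nb = ?\n"),
		Location: 10,
		Message:  "expected number",
	}
	fmt.Println(err)
}
```

Because columns are byte offsets, the caret can drift on lines that contain tabs or multi-byte characters.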
type FirstOf ¶
	type FirstOf struct {
		// Name of parsed field
		Field string
	}
FirstOf is a structure indicating that the fields of the enclosing structure are alternatives and that the first one that matches is parsed. After parsing, Field contains the name of the parsed field.
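The ordered-choice behaviour described above can be sketched without the library: try each alternative in declaration order and record the name of the first one that succeeds. Everything in this example (alternative, firstOf, parseQuoted, parseDigits, strOrInt) is hypothetical and only illustrates the idea. It is not how the package is implemented, and parseQuoted does not handle escape sequences.

```go
package main

import (
	"errors"
	"fmt"
)

// alternative pairs a field name with a parsing function.
type alternative struct {
	name  string
	parse func(buf []byte, loc int) (int, error)
}

// firstOf tries the alternatives in order and returns the name of the
// first one that matches, together with the location after the match.
func firstOf(buf []byte, loc int, alts []alternative) (string, int, error) {
	for _, a := range alts {
		if next, err := a.parse(buf, loc); err == nil {
			return a.name, next, nil
		}
	}
	return "", loc, errors.New("no alternative matched")
}

// parseQuoted matches a double-quoted string without escape handling.
func parseQuoted(buf []byte, loc int) (int, error) {
	if loc >= len(buf) || buf[loc] != '"' {
		return loc, errors.New("expected opening quote")
	}
	for i := loc + 1; i < len(buf); i++ {
		if buf[i] == '"' {
			return i + 1, nil
		}
	}
	return loc, errors.New("unterminated string")
}

// parseDigits matches one or more decimal digits.
func parseDigits(buf []byte, loc int) (int, error) {
	i := loc
	for i < len(buf) && buf[i] >= '0' && buf[i] <= '9' {
		i++
	}
	if i == loc {
		return loc, errors.New("expected digits")
	}
	return i, nil
}

// strOrInt lists the alternatives in the same order as StringOrInt.
var strOrInt = []alternative{
	{"Str", parseQuoted},
	{"Int", parseDigits},
}

func main() {
	field, next, err := firstOf([]byte("123"), 0, strOrInt)
	fmt.Println(field, next, err) // prints: Int 3 <nil>
}
```

Order matters in a PEG choice: once an earlier alternative matches, later ones are never tried at that position.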
type Options ¶
	type Options struct {
		// Function to skip whitespaces. If nil will not skip anything.
		SkipWhite func(str []byte, loc int) int
		// Flag to enable packrat parsing. If not set packrat table is used
		// only for left recursion detection and processing.
		PackratEnabled bool
		// Enable grammar debugging messages. It is useful if you have some
		// problems with grammar but produces a lot of output.
		Debug bool
	}
Options is a structure containing the parameters of the parsing process.
type Parser ¶
	type Parser interface {
		// This function must parse value from buffer and return length or error
		ParseValue(buf []byte, loc int) (newLocation int, err error)
		// This function must write value into the output stream.
		WriteValue(out io.Writer) error
	}
Parser is the interface for types that provide their own parsing. The parser calls the ParseValue method to parse values of types that implement it.
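As an illustration of what an implementation might look like, the sketch below defines a small type with both methods. It is standalone: valueParser is a local copy of the method set so the example compiles without the package, and hexColor and formatColor are hypothetical names. The sketch assumes that ParseValue returns the location after the parsed value, as the result name newLocation suggests.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// valueParser repeats the method set of Parser for this standalone sketch.
type valueParser interface {
	ParseValue(buf []byte, loc int) (newLocation int, err error)
	WriteValue(out io.Writer) error
}

// hexColor is a hypothetical value written as #RRGGBB.
type hexColor struct{ R, G, B uint8 }

// ParseValue reads "#RRGGBB" at loc and returns the location after it
// (assumed meaning of newLocation).
func (c *hexColor) ParseValue(buf []byte, loc int) (int, error) {
	if loc < 0 || loc+7 > len(buf) || buf[loc] != '#' {
		return loc, errors.New("expected #RRGGBB")
	}
	v, err := strconv.ParseUint(string(buf[loc+1:loc+7]), 16, 24)
	if err != nil {
		return loc, errors.New("expected six hexadecimal digits")
	}
	c.R, c.G, c.B = uint8(v>>16), uint8(v>>8), uint8(v)
	return loc + 7, nil
}

// WriteValue writes the color back in the same notation.
func (c *hexColor) WriteValue(out io.Writer) error {
	_, err := fmt.Fprintf(out, "#%02x%02x%02x", c.R, c.G, c.B)
	return err
}

// Compile-time check that *hexColor has both methods.
var _ valueParser = (*hexColor)(nil)

// formatColor renders a color to a string through WriteValue.
func formatColor(c *hexColor) string {
	var sb strings.Builder
	_ = c.WriteValue(&sb) // writes to a strings.Builder do not fail
	return sb.String()
}

func main() {
	var c hexColor
	next, err := c.ParseValue([]byte("bg #ff8000;"), 3)
	fmt.Println(next, err, c)   // prints: 10 <nil> {255 128 0}
	fmt.Println(formatColor(&c)) // prints: #ff8000
}
```

ParseValue uses a pointer receiver so that it can store the parsed value in place.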