parser

package
v1.6.1 Latest
Published: Nov 1, 2023 License: AGPL-3.0 Imports: 12 Imported by: 0

Documentation

Overview

The `parser` package provides primitives that help with writing recursive descent parsers. This version is a Go port of the original Python implementation from https://tinyurl.com/rdescent.

The `Parser` struct is meant to be embedded in a new struct to support parsing a new language; see `lama2parser.go` for an example. Parsing begins from the embedding struct's `Start()` method.

Constants

This section is empty.

Variables

var DataInputType string

Functions

func CustomPairMerge

func CustomPairMerge(destination, source interface{}) interface{}

CustomPairMerge uses a gabs feature to deal with merge conflicts. More here: https://github.com/HexmosTech/gabs/blob/master/gabs.go#L511

Types

type Lama2Parser

type Lama2Parser struct {
	*Parser
	Context   map[string]bool
	MarkRange map[string]int
}

func NewLama2Parser

func NewLama2Parser() *Lama2Parser

NewLama2Parser creates a new Lama2Parser and initializes it properly

func (*Lama2Parser) AnyType

func (p *Lama2Parser) AnyType() (*gabs.Container, error)

AnyType is the top-most element of a JSON structure. It consists of Complex and Primitive Types.

func (*Lama2Parser) Boolean

func (p *Lama2Parser) Boolean() (*gabs.Container, error)

func (*Lama2Parser) ComplexType

func (p *Lama2Parser) ComplexType() (*gabs.Container, error)

func (*Lama2Parser) DataHeader

func (p *Lama2Parser) DataHeader() (*gabs.Container, error)

func (*Lama2Parser) DataInput

func (p *Lama2Parser) DataInput() (*gabs.Container, error)

func (*Lama2Parser) Details

func (p *Lama2Parser) Details() (*gabs.Container, error)

func (*Lama2Parser) Digit

func (p *Lama2Parser) Digit() (*gabs.Container, error)

func (*Lama2Parser) Digits

func (p *Lama2Parser) Digits() (*gabs.Container, error)

func (*Lama2Parser) Exponent

func (p *Lama2Parser) Exponent() (*gabs.Container, error)

An Exponent consists of mandatory 'e' or 'E', optional Sign, followed by Digits

func (*Lama2Parser) FilesPair

func (p *Lama2Parser) FilesPair() (*gabs.Container, error)

FilesPair tries to match key and value separated by `@`. The key and value can either be a quoted string, or an unquoted Files Unquoted String. If there is no match for either, a ParseError is returned.

func (*Lama2Parser) FilesUnquoted

func (p *Lama2Parser) FilesUnquoted() (*gabs.Container, error)

FilesUnquoted matches a string of characters other than `@` and returns them as a String

func (*Lama2Parser) Form added in v1.6.0

func (p *Lama2Parser) Form() (*gabs.Container, error)

func (*Lama2Parser) Fraction

func (p *Lama2Parser) Fraction() (*gabs.Container, error)

func (*Lama2Parser) FractionRule1

func (p *Lama2Parser) FractionRule1() (*gabs.Container, error)

A Fraction consists of mandatory "." (dot), followed by Digits.

func (*Lama2Parser) HTTPVerb

func (p *Lama2Parser) HTTPVerb() (*gabs.Container, error)

func (*Lama2Parser) HeaderData

func (p *Lama2Parser) HeaderData() (*gabs.Container, error)

func (*Lama2Parser) HeaderPair

func (p *Lama2Parser) HeaderPair() (*gabs.Container, error)

func (*Lama2Parser) Headers

func (p *Lama2Parser) Headers() (*gabs.Container, error)

Headers detects HTTP headers; essentially strings separated by ":" character

func (*Lama2Parser) Integer

func (p *Lama2Parser) Integer() (*gabs.Container, error)

func (*Lama2Parser) IntegerRule1

func (p *Lama2Parser) IntegerRule1() (*gabs.Container, error)

IntegerRule1 matches a Digit

func (*Lama2Parser) IntegerRule2

func (p *Lama2Parser) IntegerRule2() (*gabs.Container, error)

IntegerRule2 matches 1-9 mandatorily, and then tries to follow it with Digits

func (*Lama2Parser) IntegerRule3

func (p *Lama2Parser) IntegerRule3() (*gabs.Container, error)

IntegerRule3 starts with a mandatory Sign, and follows with IntegerRule1 (Digit)

func (*Lama2Parser) IntegerRule4

func (p *Lama2Parser) IntegerRule4() (*gabs.Container, error)

IntegerRule4 starts with a mandatory Sign, and follows with IntegerRule2

func (*Lama2Parser) L2Variable added in v1.6.0

func (p *Lama2Parser) L2Variable() (*gabs.Container, error)

func (*Lama2Parser) Lama2File added in v1.0.5

func (p *Lama2Parser) Lama2File() (*gabs.Container, error)

func (*Lama2Parser) List

func (p *Lama2Parser) List() (*gabs.Container, error)

List is a slightly lenient version of standard JSON list. In Lama2 List, it is OK to have a trailing comma after the last element (whereas in strict JSON, it is not OK to have trailing comma)
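The trailing-comma leniency can be illustrated with a minimal sketch. This is not the package's implementation: `parseLenientIntList` and `strictRejects` are hypothetical helpers, the sketch handles only flat lists of integers, and it uses string splitting rather than recursive descent. It only contrasts the lenient behaviour with strict JSON as implemented by `encoding/json`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// parseLenientIntList parses a flat list such as "[1, 2, 3,]".
// Unlike strict JSON, one trailing comma after the last element is accepted.
func parseLenientIntList(s string) ([]int, error) {
	s = strings.TrimSpace(s)
	if len(s) < 2 || s[0] != '[' || s[len(s)-1] != ']' {
		return nil, fmt.Errorf("list must be wrapped in [ ]")
	}
	body := strings.TrimSpace(s[1 : len(s)-1])
	if body == "" {
		return []int{}, nil
	}
	if strings.HasPrefix(body, ",") {
		return nil, fmt.Errorf("list cannot start with a comma")
	}
	body = strings.TrimSuffix(body, ",") // lenient: drop exactly one trailing comma
	var out []int
	for _, item := range strings.Split(body, ",") {
		n, err := strconv.Atoi(strings.TrimSpace(item))
		if err != nil {
			return nil, fmt.Errorf("bad element %q: %w", item, err)
		}
		out = append(out, n)
	}
	return out, nil
}

// strictRejects reports whether encoding/json refuses the same input.
func strictRejects(s string) bool {
	var v []int
	return json.Unmarshal([]byte(s), &v) != nil
}

func main() {
	input := "[1, 2, 3,]"
	got, err := parseLenientIntList(input)
	fmt.Println(got, err)
	fmt.Println("strict JSON rejects it:", strictRejects(input))
}
```

The same idea applies to Map: the comma after the last pair is tolerated, while a doubled comma or a leading comma is still an error.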

func (*Lama2Parser) Map

func (p *Lama2Parser) Map() (*gabs.Container, error)

Map is a slightly lenient version of standard JSON map. In Lama2 Map, it is OK to have a trailing comma after the last element (whereas in strict JSON, it is not OK to have trailing comma)

func (*Lama2Parser) Multipart

func (p *Lama2Parser) Multipart() (*gabs.Container, error)

func (*Lama2Parser) Null

func (p *Lama2Parser) Null() (*gabs.Container, error)

func (*Lama2Parser) Number

func (p *Lama2Parser) Number() (*gabs.Container, error)

A Number consists of a mandatory integer part, and optional Fraction and Exponent parts. The Number method "collects" these three elements, converts them into a json.Number value, and finally returns the Number wrapped in a gabs Container.
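As a rough illustration of the "collect and convert" step, the sketch below joins the three textual parts into a `json.Number` from the standard library. `assembleNumber` is a hypothetical helper, not part of this package, and the gabs wrapping is left out so the example depends only on the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// assembleNumber joins the already-matched parts of a number.
// integer is mandatory; fraction (e.g. ".5") and exponent (e.g. "e3") may be empty.
func assembleNumber(integer, fraction, exponent string) (json.Number, error) {
	if integer == "" {
		return "", fmt.Errorf("integer part is mandatory")
	}
	n := json.Number(integer + fraction + exponent)
	// Float64 parses the text, so it doubles as a sanity check on the parts.
	if _, err := n.Float64(); err != nil {
		return "", fmt.Errorf("invalid number %q: %w", n.String(), err)
	}
	return n, nil
}

func main() {
	n, err := assembleNumber("-12", ".5", "e3")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	f, _ := n.Float64()
	fmt.Println(n.String(), f)
}
```

Keeping the value as `json.Number` preserves the original text of the number, so the caller decides later whether to read it as an integer or a float.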

func (*Lama2Parser) OneNine

func (p *Lama2Parser) OneNine() (*gabs.Container, error)

func (*Lama2Parser) Pair

func (p *Lama2Parser) Pair() (*gabs.Container, error)

func (*Lama2Parser) PrimitiveType

func (p *Lama2Parser) PrimitiveType() (*gabs.Container, error)

func (*Lama2Parser) Processor added in v1.3.0

func (p *Lama2Parser) Processor() (*gabs.Container, error)

func (*Lama2Parser) QuotedString

func (p *Lama2Parser) QuotedString() (*gabs.Container, error)

QuotedString accepts both single-quoted and double-quoted strings. It also handles unicode escape sequences and control characters. The result is a string wrapped in a gabs container.

func (*Lama2Parser) Requester added in v1.3.0

func (p *Lama2Parser) Requester() (*gabs.Container, error)

Requester applies the rule: HTTPVerb Multipart? TheURL Details?

func (*Lama2Parser) Separator added in v1.3.0

func (p *Lama2Parser) Separator() (*gabs.Container, error)

func (*Lama2Parser) Sign

func (p *Lama2Parser) Sign() (*gabs.Container, error)

func (*Lama2Parser) Start

func (p *Lama2Parser) Start() (*gabs.Container, error)

Start primarily calls the Lama2File method

func (*Lama2Parser) TheURL

func (p *Lama2Parser) TheURL() (*gabs.Container, error)

func (*Lama2Parser) Unquoted

func (p *Lama2Parser) Unquoted() (*gabs.Container, error)

func (*Lama2Parser) VarJSON

func (p *Lama2Parser) VarJSON() (*gabs.Container, error)

Method VarJSON behaves in one of two ways, depending on whether `multipart` or `form` is set. If there is no multipart or form, VarJSON tries to match one or more VarJSONPairs. If there is multipart or form, it tries to match zero or more VarJSON, followed by zero or more file fields (separated by `@`). If there is no match at all, a ParseError is returned; otherwise the parsed data is returned.

func (*Lama2Parser) VarJSONPair

func (p *Lama2Parser) VarJSONPair() (*gabs.Container, error)

VarJSONPair tries to match key and value separated by `=`. The key and value can either be a quoted string, or an unquoted VarJSON unquoted string. If there is no match for either, a ParseError is returned.
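The key/value matching described here can be sketched with a small standalone scanner. `readToken` and `parsePair` are hypothetical names, the sketch returns plain strings instead of gabs containers, and it ignores escape sequences and surrounding whitespace. The separator is a parameter, so the same sketch also covers the `@`-separated form used by FilesPair.

```go
package main

import "fmt"

// readToken reads one token starting at index i. A token is either a quoted
// string ('...' or "...") or a run of runes up to the separator.
// It returns the token text and the index just past it.
func readToken(rs []rune, i int, sep rune) (string, int, error) {
	if i < len(rs) && (rs[i] == '"' || rs[i] == '\'') {
		quote := rs[i]
		for j := i + 1; j < len(rs); j++ {
			if rs[j] == quote {
				return string(rs[i+1 : j]), j + 1, nil
			}
		}
		return "", i, fmt.Errorf("unterminated quoted string at %d", i)
	}
	j := i
	for j < len(rs) && rs[j] != sep {
		j++
	}
	if j == i {
		return "", i, fmt.Errorf("empty token at %d", i)
	}
	return string(rs[i:j]), j, nil
}

// parsePair matches key, separator, value and nothing else.
func parsePair(s string, sep rune) (key, value string, err error) {
	rs := []rune(s)
	key, i, err := readToken(rs, 0, sep)
	if err != nil {
		return "", "", err
	}
	if i >= len(rs) || rs[i] != sep {
		return "", "", fmt.Errorf("expected %q at %d", sep, i)
	}
	value, j, err := readToken(rs, i+1, sep)
	if err != nil {
		return "", "", err
	}
	if j != len(rs) {
		return "", "", fmt.Errorf("unexpected trailing input at %d", j)
	}
	return key, value, nil
}

func main() {
	fmt.Println(parsePair("name=lama", '='))
	fmt.Println(parsePair(`"a=b"='c d'`, '='))
	fmt.Println(parsePair("file@./a.png", '@'))
}
```

Quoting is what lets a key or value contain the separator itself, which an unquoted token cannot do.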

func (*Lama2Parser) VarJSONUnquoted

func (p *Lama2Parser) VarJSONUnquoted() (*gabs.Container, error)

VarJSONUnquoted matches a string of characters other than `=` and returns them as a String

type MinimalParser

type MinimalParser interface {
	Start() (*gabs.Container, error)
}

MinimalParser requires concrete types to have a Start() method, from which the parsing process begins. In the present case, `Lama2Parser` adds dozens of methods to implement the `.l2` syntax.
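The relationship between the base parser, the interface, and a concrete language parser can be sketched as follows. All names here (`starter`, `baseParser`, `greetingParser`) are hypothetical stand-ins, and results are plain strings rather than gabs containers; the sketch only shows the embedding-plus-interface shape, not this package's actual wiring.

```go
package main

import "fmt"

// starter plays the role of MinimalParser: anything with a Start method.
type starter interface {
	Start() (string, error)
}

// baseParser plays the role of Parser: it owns the input and a reference
// to the concrete parser built on top of it.
type baseParser struct {
	text []rune
	pm   starter
}

// parse stores the input and delegates to the concrete parser's Start.
func (b *baseParser) parse(text string) (string, error) {
	b.text = []rune(text)
	return b.pm.Start()
}

// greetingParser embeds *baseParser, the way Lama2Parser embeds *Parser.
type greetingParser struct {
	*baseParser
}

func newGreetingParser() *greetingParser {
	g := &greetingParser{baseParser: &baseParser{}}
	g.pm = g // wire the base parser back to the concrete type
	return g
}

// Start is the entry point of the toy language: the whole input must be "hello".
func (g *greetingParser) Start() (string, error) {
	if string(g.text) != "hello" {
		return "", fmt.Errorf("expected %q, got %q", "hello", string(g.text))
	}
	return "greeting", nil
}

func main() {
	p := newGreetingParser()
	fmt.Println(p.parse("hello"))
	fmt.Println(p.parse("bye"))
}
```

Because the base type holds only the interface, it can drive any language parser that provides `Start`, without knowing its other rule methods.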

type Parser

type Parser struct {
	Text     []rune
	Pos      int
	TotalLen int

	Pm MinimalParser

	LineNum int
	// contains filtered or unexported fields
}

Struct Parser stores information about the parsing process throughout.

1. Text: Incoming text is stored as an array of runes, to correctly handle unicode characters
2. Pos: Indicates the index position in Text which has already been scanned; starts with -1
3. TotalLen: Number of runes in the input
4. Pm: Composing an external MinimalParser (such as Lama2Parser) which builds upon Parser to provide the new language recognition capabilities
5. ruleMethodMap: Scans through Pm, and creates a mapping from method name to method value through reflection
6. LineNum: Number of normalized newlines found till now. Used in providing useful context in error messages

func (*Parser) Char

func (p *Parser) Char() (rune, error)

func (*Parser) CharClass

func (p *Parser) CharClass(charClass string) (rune, error)

CharClass implements the familiar regex syntax for specifying ranges of acceptable characters. For a good description, read the section "Processing Character Ranges" at https://www.booleanworld.com/building-recursive-descent-parsers-definitive-guide/
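A minimal sketch of the character-class idea follows. `splitRanges` and `inClass` are hypothetical helpers that only approximate what CharClass and SplitCharRanges might do; unlike the real methods they neither return errors nor consume input from a parser.

```go
package main

import "fmt"

// splitRanges breaks a class such as "a-z0-9_" into "a-z", "0-9", "_".
func splitRanges(class string) []string {
	rs := []rune(class)
	var out []string
	for i := 0; i < len(rs); {
		if i+2 < len(rs) && rs[i+1] == '-' {
			out = append(out, string(rs[i:i+3])) // a range like "a-z"
			i += 3
		} else {
			out = append(out, string(rs[i])) // a single literal rune
			i++
		}
	}
	return out
}

// inClass reports whether r lies in any range or equals any literal rune.
func inClass(r rune, class string) bool {
	for _, part := range splitRanges(class) {
		pr := []rune(part)
		if len(pr) == 3 {
			if pr[0] <= r && r <= pr[2] {
				return true
			}
		} else if pr[0] == r {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(splitRanges("a-z0-9_"))
	fmt.Println(inClass('q', "a-z0-9_"), inClass('Q', "a-z0-9_"))
}
```

A rule such as Digit can then be expressed as a single class ("0-9") instead of ten separate keyword alternatives.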

func (*Parser) Init

func (p *Parser) Init()

Method Init creates the most important data structure for parsing: ruleMethodMap. We use reflection to create a mapping of each Pm.<method_name> to <method_value>.
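The reflection step can be sketched with the standard `reflect` package. `toyRules`, `buildRuleMap`, and `callRule` are hypothetical names, and the rules return strings here rather than `(*gabs.Container, error)`; the sketch only shows how a name-to-method map can be built and used.

```go
package main

import (
	"fmt"
	"reflect"
)

// toyRules stands in for a language parser whose methods are rules.
type toyRules struct{}

func (toyRules) Digit() string { return "digit" }
func (toyRules) Sign() string  { return "sign" }

// buildRuleMap maps each exported method name of v to its bound method value.
func buildRuleMap(v interface{}) map[string]reflect.Value {
	rv := reflect.ValueOf(v)
	rt := rv.Type()
	m := make(map[string]reflect.Value, rt.NumMethod())
	for i := 0; i < rt.NumMethod(); i++ {
		m[rt.Method(i).Name] = rv.Method(i)
	}
	return m
}

// callRule looks a rule up by name and invokes it with no arguments.
func callRule(m map[string]reflect.Value, name string) (string, bool) {
	fn, ok := m[name]
	if !ok {
		return "", false
	}
	out := fn.Call(nil)
	return out[0].String(), true
}

func main() {
	rules := buildRuleMap(toyRules{})
	fmt.Println(callRule(rules, "Digit"))
	fmt.Println(callRule(rules, "Missing"))
}
```

Building the map once up front means rules can later be referred to by name, as in a slice of strings passed to a matching function.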

func (*Parser) Keyword

func (p *Parser) Keyword(kw string, eatWsStart bool, eatWsEnd bool, caseInsensitive bool) ([]rune, error)

Method Keyword is versatile: it can eat whitespace before and after the expected string, and it can optionally match the keyword case-insensitively.
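The three flags can be illustrated with a standalone sketch. The `scanner` type and its `keyword` method are hypothetical; the parameter order mirrors the signature above, but details such as restoring the position on failure are assumptions rather than documented behaviour.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// scanner is a hypothetical stand-in for Parser.
type scanner struct {
	text []rune
	pos  int // index of the last consumed rune; -1 before scanning starts
}

// eatWhitespace advances past any whitespace that follows pos.
func (s *scanner) eatWhitespace() {
	for s.pos+1 < len(s.text) && unicode.IsSpace(s.text[s.pos+1]) {
		s.pos++
	}
}

// keyword matches kw at the current position, optionally skipping whitespace
// on either side and optionally ignoring case.
func (s *scanner) keyword(kw string, eatWsStart, eatWsEnd, caseInsensitive bool) ([]rune, error) {
	start := s.pos
	if eatWsStart {
		s.eatWhitespace()
	}
	lo := s.pos + 1
	hi := lo + len([]rune(kw))
	if hi > len(s.text) {
		s.pos = start // assumption: a failed match consumes nothing
		return nil, fmt.Errorf("expected %q at position %d", kw, lo)
	}
	got := string(s.text[lo:hi])
	matched := got == kw
	if caseInsensitive {
		matched = strings.EqualFold(got, kw)
	}
	if !matched {
		s.pos = start
		return nil, fmt.Errorf("expected %q at position %d", kw, lo)
	}
	s.pos = hi - 1
	if eatWsEnd {
		s.eatWhitespace()
	}
	return s.text[lo:hi], nil
}

func main() {
	s := &scanner{text: []rune("  GET  http://example.com"), pos: -1}
	got, err := s.keyword("get", true, true, true)
	fmt.Println(string(got), s.pos, err)
}
```

Returning the runes as they appear in the input, rather than the keyword as requested, lets the caller see the original casing.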

func (*Parser) LookAhead added in v1.3.0

func (p *Parser) LookAhead(rules []string) bool

func (*Parser) Match

func (p *Parser) Match(rules []string) (*gabs.Container, error)

Method Match is the most important of all in the parser package. Match takes in a slice of rules (essentially method names), and then executes them one by one. On a successful match, it returns a gabs Container with `error` set to `nil`. When a rule fails to match, the scan position is reset to the initial position; moreover, Match keeps continuous track of the farthest (longest) match so far. The farthest match error is potentially the most useful error message to the user; thus, for error reporting, Match returns the farthest matching error.
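The try, backtrack, and remember-the-farthest-error loop can be sketched as below. `toyParser`, `parseError`, and the two registered rules are hypothetical, rules are looked up in a plain map instead of through reflection, and results are strings instead of gabs containers.

```go
package main

import (
	"errors"
	"fmt"
)

// parseError records where a rule failed (hypothetical stand-in for ParseError).
type parseError struct {
	pos int
	msg string
}

func (e *parseError) Error() string { return fmt.Sprintf("position %d: %s", e.pos, e.msg) }

// toyParser is a hypothetical, string-returning stand-in for Parser.
type toyParser struct {
	text  []rune
	pos   int // index of the last consumed rune; -1 before scanning starts
	rules map[string]func() (string, error)
}

func newToyParser(text string) *toyParser {
	p := &toyParser{text: []rune(text), pos: -1}
	p.rules = map[string]func() (string, error){
		"Get":   func() (string, error) { return p.keyword("get") },
		"Delta": func() (string, error) { return p.keyword("delta") },
	}
	return p
}

// keyword consumes kw rune by rune and fails at the first mismatch.
func (p *toyParser) keyword(kw string) (string, error) {
	for _, want := range kw {
		next := p.pos + 1
		if next >= len(p.text) || p.text[next] != want {
			return "", &parseError{pos: next, msg: fmt.Sprintf("expected %q", kw)}
		}
		p.pos = next
	}
	return kw, nil
}

// match tries each named rule in order. A failed rule resets the position;
// if every rule fails, the error that got farthest into the input is returned.
func (p *toyParser) match(names []string) (string, error) {
	start := p.pos
	var farthest *parseError
	for _, name := range names {
		rule, ok := p.rules[name]
		if !ok {
			return "", fmt.Errorf("unknown rule %q", name)
		}
		res, err := rule()
		if err == nil {
			return res, nil
		}
		p.pos = start // backtrack before trying the next alternative
		var pe *parseError
		if errors.As(err, &pe) && (farthest == nil || pe.pos > farthest.pos) {
			farthest = pe
		}
	}
	if farthest == nil {
		return "", &parseError{pos: start + 1, msg: "no rules given"}
	}
	return "", farthest
}

func main() {
	p := newToyParser("delete")
	_, err := p.match([]string{"Get", "Delta"})
	fmt.Println(err)
}
```

For the input "delete", the Get rule fails immediately while the Delta rule fails three runes later, so the Delta error is the one reported; it points closest to what the user most likely meant.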

func (*Parser) MatchUntil added in v1.3.0

func (p *Parser) MatchUntil(end string) (*gabs.Container, error)

func (*Parser) Parse

func (p *Parser) Parse(text string) (*gabs.Container, error)

Method Parse normalizes newlines and then creates a rune version of the input data. The Start() method proceeds to process the rune version of data
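A minimal sketch of the preparation step is shown below. `normalize` is a hypothetical helper, and it assumes that "normalizing newlines" means rewriting `\r\n` and lone `\r` as `\n`; the source does not spell out the exact rule.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize rewrites Windows and old-Mac line endings to "\n" and returns
// the text as runes, so that indexing works per character rather than per byte.
func normalize(text string) []rune {
	text = strings.ReplaceAll(text, "\r\n", "\n")
	text = strings.ReplaceAll(text, "\r", "\n")
	return []rune(text)
}

func main() {
	input := "GET\r\nhttps://example.com/café\r"
	rs := normalize(input)
	fmt.Println(len(input), len(rs), strings.Count(string(rs), "\n"))
}
```

Working on runes keeps positions and line counts meaningful for non-ASCII input, where one character may occupy several bytes.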

func (*Parser) SetText

func (p *Parser) SetText(text string)

Method SetText is a utility used primarily in testing, when we don't want to call Start() automatically as in Parse

func (*Parser) SplitCharRanges

func (p *Parser) SplitCharRanges(charClass string) ([]string, error)

func (*Parser) Start

func (p *Parser) Start() *gabs.Container

Start() in Parser provides a dummy default implementation; the expectation is that the higher level Struct (Pm) will implement its own version
