fsplParser

package
v0.5.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 31, 2024 License: GPL-3.0 Imports: 10 Imported by: 0

README

parser

Responsibilities

  • Define syntax tree type that contains entities
  • Turn streams of tokens into abstract syntax tree entities

Organization

The entry point for all logic defined in this package is the Tree type. On this type, the Parse() method is defined. This method creates a new parser, and uses it to parse a stream of tokens into the tree. The details of the parser are hidden from other packages, and instances of it only last for the duration of Tree.Parse().

Operation

The parser holds a pointer to the Tree that created it, as well as the lexer that was passed to it. Its parse() method attempts to consume all tokens produced by the lexer, parsing them into syntax entities which it places into the tree.

parser.parse() parses top level entities, which include functions, methods, and typedefs. For each top-level entity, the parser will call a specialized parsing routine to parse that entity depending on the current token's kind and value. These routines in turn call other routines, which call other routines, etc.

All parsing routines follow this general flow:

  • Start with the token already present in Parser.token. Do not get the token after it.
  • Use Parser.expect(), Parser.expectValue(), etc. to test whether the token is a valid start for the entity
  • If starting by calling another parsing method, trust that method to do this instead.
  • When getting new tokens, use Parser.expectNext(), Parser.expectNextDesc(), etc. Only use Parser.next() when getting a token right before calling another parsing method, or at the very end of the current method.
  • To terminate the method, get the next token and do nothing with it.
  • If terminating by calling another parsing method, trust that method to do this instead.

Remember that parsing routines always start with the current token, and end by getting a trailing token for the next method to start with. This makes it possible to reliably switch between parsing methods depending on the type or value of a token.

The parser must never backtrack or look ahead, but it may revise previous data it has output upon receiving a new token that comes directly after the last token of said previous data. For example:

  • X in XYZ may not be converted to A once the parser has seen Z, but
  • X in XYZ may be converted to A once the parser has seen Y.

This disallows complex and ambiguous syntax, but should allow things such as the very occasional infix operator (like . and =)

Expression Parsing

Expression notation is the subset of FSPL that is used to describe computations and data/control flow. The expression parsing routine is the most important part of the FSPL parser, and also the most complex. For each expression, the parser follows this decision tree to determine what to parse:

| +Ident =Variable
|        | 'true'       =LiteralBoolean
|        | 'false'      =LiteralBoolean
|        | 'nil'        =LiteralNil
|        | 'if'         =IfElse
|        | 'match'      =Match
|        | 'switch'     =Switch
|        | 'loop'       =Loop
|        | +Colon       =Declaration
|        | +DoubleColon =Call
|
| +LParen X =LiteralArray
|         | +Dot =LiteralStruct
|
| +LBracket | +Ident =Call
|           |        | 'break'  =Break
|           |        | 'return' =Return
|           | 
|           | +Dot +Expression =Dereference
|           |                  | +Expression =Subscript
|           | +Star =Operation
|           |
|           | +Symbol X
|                     | '\'      =Slice
|                     | '#'      =Length
|                     | '@'      =Reference
|                     | '~'      =ValueCast
|                     | '~~'     =BitCast
|                     | OPERATOR =Operation
|
| +LBrace =Block
| +Int    =LiteralInt
| +Float  =LiteralFloat
| +String =LiteralString

Each branch of the decision tree is implemented as a routine with one or more switch statements which call other routines, and each leaf is implemented as a normal entity parsing routine.

Expressions that are only detected after more than one token has been consumed have parsing routines with "-Core" appended to them. This means that they do not begin at the first token in the expression, but instead at the point where there is no longer any ambiguity as to what they are.

Type Parsing

Type notation is the subset of FSPL that is used to describe data types. Though it is not as complex as expression notation, it still needs a decision tree to determine what type to parse:

| +Ident     =TypeNamed
| +TypeIdent =TypeNamed
| +Int       =TypeArray
|
| +LParen X
|         | +Dot =TypeStruct
|         |
|         | +Symbol X
|                   | '&' =TypeInterface
|                   | '|' =TypeSum
|
| +Star =TypePointer
        | +Colon     =TypeSlice

Documentation

Overview

Package fsplParser implements the parsing stage of the FSPL compiler. The parser takes in a stream of tokens from the lexer and converts them into an abstract syntax tree (AST), which is assembled out of structures from the entity package. The parser only fills out fields on the structures that pertain to position and syntax.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Tree

type Tree struct {
	Declarations []entity.TopLevel
}

Tree represents a parsed abstract syntax tree. It has no constructor and its zero value can be used safely.

func (*Tree) AddDeclaration

func (this *Tree) AddDeclaration(topLevel ...entity.TopLevel)

AddDeclaration adds a top-level entity to the tree.

func (*Tree) Parse

func (this *Tree) Parse(lx lexer.Lexer) error

Parse parses the output of the given lexer into the tree.

func (*Tree) Skim

func (this *Tree) Skim(lx lexer.Lexer) error

Skim is like Parse, but does not keep code within functions and methods, instead marking them as external.

func (*Tree) String

func (this *Tree) String() string

String returns a string representation of the tree. The result of this will be syntactically valid.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL