lexer

package lexer v0.3.0

Published: Nov 12, 2023 License: Apache-2.0 Imports: 5 Imported by: 0

Documentation

Overview

Package lexer implements spok's semantic lexer.

Spok uses a concurrent, state-function based lexer similar to the one described by Rob Pike in his talk "Lexical Scanning in Go", and modelled on the implementation of text/template in the Go standard library.

The lexer proceeds one UTF-8 rune at a time until a particular lexical token is recognised; the token is then "emitted" over a channel, where it may be consumed by a client such as the parser. The state of the lexer is maintained between token emits, unlike a more conventional switch-based lexer that must determine its current state from scratch on every loop.

This lexer uses "lexFunctions" to pass the state from one loop to another. For example, if we're currently lexing a global variable ident, the next token must be a ':=', so we can go straight there without traversing the entire lexical state space first to determine "are we in a global variable definition?".

The lexer's 'run' method consumes these "lexFunctions", each of which returns the next state, in a continual loop until nil is returned, marking either "there is nothing more to lex" or "we've hit an error". At that point the lexer closes the tokens channel, which the parser picks up as a signal that the input stream has ended.

In lexing/parsing, the error checking complexity is always kept somewhere. Spok has made the choice that the lexer should do much of the syntax error handling, as it has the most direct access to the raw input as well as the positions, characters, etc. The approach of stateful "lexFunctions" helps enable this, as every lexing function "knows where it is" in the language, improving the quality of the error messages. Having the lexer handle most of the error complexity has helped keep the parser very simple, which I think is a good trade-off, and the test cases for the parser already far outweigh those of the lexer.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Lexer

type Lexer struct {
	// contains filtered or unexported fields
}

Lexer is spok's semantic Lexer.

func New

func New(input string) *Lexer

New creates a new lexer for the input string and sets it off in a goroutine.

func (*Lexer) NextToken

func (l *Lexer) NextToken() token.Token

NextToken returns the next token from the input; it is generally called by the parser, not the lexing goroutine.

type Tokeniser

type Tokeniser interface {
	// NextToken yields a single token from the input stream.
	NextToken() token.Token
}

Tokeniser represents anything capable of producing a token.Token when asked via its NextToken method. This includes the actual Lexer defined in this package, but it can also be readily stubbed out for testing, e.g. in the parser's tests.
