lex

package
v0.0.0-...-a28e2a2 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 30, 2017 License: MIT, BSD-3-Clause Imports: 6 Imported by: 2

Documentation

Overview

Package lex tokenizes UTF-8-encoded Prolog text. It takes an io.Reader providing the source, which can then be tokenized with the Scan function. For compatibility with existing tools, the NUL character is not allowed. If the first character in the source is a UTF-8 encoded byte order mark (BOM), it is discarded.

Basic usage pattern:

lexemes := lex.Scan(file)
for lexeme := range lexemes {
    // do something with lexeme
}

Constants

const (
	EOF      = -(iota + 1) // reached end of source
	Atom                   // a Prolog atom, possibly quoted
	Comment                // a comment
	Float                  // a floating point number
	Functor                // an atom used as a predicate functor
	FullStop               // "." ending a term
	Int                    // an integer
	String                 // a double-quoted string
	Variable               // a Prolog variable
	Void                   // the special "_" variable
)

The result of Scan is one of these tokens or a Unicode character.
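
Because the constants are declared as -(iota + 1), every token kind is negative, while Unicode code points are non-negative, so a single rune can carry either one. The following is a minimal sketch of that encoding. It redeclares the constants locally so that it runs without the package, and isTokenKind is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// Local copy of the documented constants, so the sketch is self-contained.
const (
	EOF = -(iota + 1)
	Atom
	Comment
	Float
	Functor
	FullStop
	Int
	String
	Variable
	Void
)

// isTokenKind is a hypothetical helper (not part of the package).
// Token kinds are negative; plain Unicode characters are not.
func isTokenKind(r rune) bool { return r < 0 }

func main() {
	fmt.Println(EOF, Atom, Void)                     // -1 -2 -10
	fmt.Println(isTokenKind(Atom), isTokenKind('(')) // true false
}
```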

Variables

This section is empty.

Functions

func IsGraphic

func IsGraphic(ch rune) bool

IsGraphic reports whether ch is a graphic token character as defined in ISO §6.4.2.

func Scan

func Scan(src io.Reader) <-chan *Eme

Scan tokenizes src in a separate goroutine, sending lexemes down a channel as they become available. The channel is closed on EOF.
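
The producer pattern described here (a goroutine that sends on a channel and closes it at end of input) can be sketched with the standard library alone. The function scanWords below is a hypothetical stand-in, not the package's implementation: it splits on whitespace instead of applying Prolog token rules, and its unbuffered channel is an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// scanWords is a hypothetical stand-in for lex.Scan. It sends each
// whitespace-separated word on the returned channel and closes the
// channel once the reader is exhausted.
func scanWords(src io.Reader) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out) // closing ends the consumer's range loop
		sc := bufio.NewScanner(src)
		sc.Split(bufio.ScanWords)
		for sc.Scan() {
			out <- sc.Text()
		}
	}()
	return out
}

// wordsOf drains the channel into a slice.
func wordsOf(text string) []string {
	var words []string
	for w := range scanWords(strings.NewReader(text)) {
		words = append(words, w)
	}
	return words
}

func main() {
	fmt.Println(wordsOf("foo :- bar."))
}
```

In this sketch a consumer that stops reading early leaves the goroutine blocked on its next send. Whether lex.Scan behaves the same way is not stated in the documentation.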

func TokenString

func TokenString(tok rune) string

TokenString returns a printable string for a token or Unicode character.

Types

type Eme

type Eme struct {
	Type    rune // EOF, Atom, Comment, etc.
	Content string
	Pos     *Position
}

Eme is a lexeme (read together with the package name: lex.Eme). It holds the token type, the token content, and the source position.
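
A consumer typically branches on the Type field. The sketch below mirrors the documented declarations locally so that it runs on its own. The helper dropComments is hypothetical, and the sketch assumes that comments arrive as lexemes of type Comment.

```go
package main

import "fmt"

// Local mirrors of the documented declarations.
const (
	EOF = -(iota + 1)
	Atom
	Comment
	Float
	Functor
	FullStop
	Int
	String
	Variable
	Void
)

type Position struct {
	Filename string
	Offset   int
	Line     int
	Column   int
}

type Eme struct {
	Type    rune
	Content string
	Pos     *Position
}

// dropComments is a hypothetical helper: it drains the channel and
// keeps every lexeme whose Type is not Comment.
func dropComments(in <-chan *Eme) []*Eme {
	var kept []*Eme
	for l := range in {
		if l.Type == Comment {
			continue
		}
		kept = append(kept, l)
	}
	return kept
}

func main() {
	ch := make(chan *Eme, 3)
	ch <- &Eme{Type: Atom, Content: "foo"}
	ch <- &Eme{Type: Comment, Content: "% a note"}
	ch <- &Eme{Type: FullStop, Content: "."}
	close(ch)
	for _, l := range dropComments(ch) {
		fmt.Println(l.Content)
	}
}
```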

type List

type List struct {
	Value *Eme
	// contains filtered or unexported fields
}

List is an immutable list of lexemes that populates its tail by reading lexemes from a channel, such as the one returned by Scan.

func NewList

func NewList(src <-chan *Eme) *List

NewList returns a new lexeme list which pulls lexemes from the given source channel. Creating a new list consumes one lexeme from the source channel.

func (*List) Next

func (self *List) Next() *List

Next returns the next element in the lexeme list, pulling a lexeme from the source channel if necessary.
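
A channel can be read only once, whereas a list whose tail is filled on demand and then cached can be walked repeatedly from any node. The following sketch illustrates that idea with a hypothetical lazyList type over strings. It is not the package's implementation. In particular, returning nil at the end of the channel and the lack of locking are assumptions of the sketch.

```go
package main

import "fmt"

// lazyList is a hypothetical stand-in for lex.List.
type lazyList struct {
	Value string
	src   <-chan string
	next  *lazyList // cached tail; nil until first requested
}

// newLazyList consumes one value from src to build the head node.
// It returns nil if the channel is already closed and empty.
func newLazyList(src <-chan string) *lazyList {
	v, ok := <-src
	if !ok {
		return nil
	}
	return &lazyList{Value: v, src: src}
}

// Next returns the cached tail, reading from the channel only the
// first time it is called on a given node. Not safe for concurrent use.
func (l *lazyList) Next() *lazyList {
	if l.next == nil {
		l.next = newLazyList(l.src)
	}
	return l.next
}

// fromSlice returns a closed, pre-filled channel for demonstration.
func fromSlice(items ...string) <-chan string {
	ch := make(chan string, len(items))
	for _, it := range items {
		ch <- it
	}
	close(ch)
	return ch
}

func main() {
	head := newLazyList(fromSlice("a", "b", "c"))
	// Walking from head twice visits the same nodes both times.
	for pass := 0; pass < 2; pass++ {
		for n := head; n != nil; n = n.Next() {
			fmt.Print(n.Value, " ")
		}
		fmt.Println()
	}
}
```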

type Position

type Position struct {
	Filename string // filename, if any
	Offset   int    // byte offset, starting at 0
	Line     int    // line number, starting at 1
	Column   int    // column number, starting at 1 (character count per line)
}

A source position is represented by a Position value. A position is valid if Line > 0.

func (*Position) IsValid

func (pos *Position) IsValid() bool

IsValid returns true if the position is valid.
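
The validity rule (Line > 0) means the zero Position is invalid. A minimal sketch using a local copy of the documented struct follows. The helper describe is hypothetical and does not claim to match the output format of the String method.

```go
package main

import "fmt"

// Local copy of the documented struct, so the sketch is self-contained.
type Position struct {
	Filename string // filename, if any
	Offset   int    // byte offset, starting at 0
	Line     int    // line number, starting at 1
	Column   int    // column number, starting at 1
}

// IsValid follows the documented rule: a position is valid if Line > 0.
func (pos *Position) IsValid() bool { return pos.Line > 0 }

// describe is a hypothetical helper that guards on IsValid before
// using the line and column.
func describe(pos Position) string {
	if !pos.IsValid() {
		return "(no position)"
	}
	return fmt.Sprintf("line %d, column %d", pos.Line, pos.Column)
}

func main() {
	fmt.Println(describe(Position{}))                   // (no position)
	fmt.Println(describe(Position{Line: 3, Column: 7})) // line 3, column 7
}
```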

func (Position) String

func (pos Position) String() string

type Scanner

type Scanner struct {

	// Error is called for each error encountered. If no Error
	// function is set, the error is reported to os.Stderr.
	Error func(s *Scanner, msg string)

	// ErrorCount is incremented by one for each error encountered.
	ErrorCount int

	// Start position of most recently scanned token; set by Scan.
	// Calling Init or Next invalidates the position (Line == 0).
	// The Filename field is always left untouched by the Scanner.
	// If an error is reported (via Error) and Position is invalid,
	// the scanner is not inside a token. Call Pos to obtain an error
	// position in that case.
	Position
	// contains filtered or unexported fields
}

A Scanner implements reading of Unicode characters and tokens from an io.Reader.

func (*Scanner) Init

func (s *Scanner) Init(src io.Reader) *Scanner

Init initializes a Scanner with a new source and returns s. Error is set to nil and ErrorCount is set to 0.

func (*Scanner) Next

func (s *Scanner) Next() rune

Next reads and returns the next Unicode character. It returns EOF at the end of the source. It reports a read error by calling s.Error, if not nil; otherwise it prints an error message to os.Stderr. Next does not update the Scanner's Position field; use Pos() to get the current position.

func (*Scanner) Peek

func (s *Scanner) Peek() rune

Peek returns the next Unicode character in the source without advancing the scanner. It returns EOF if the scanner's position is at the last character of the source.

func (*Scanner) Pos

func (s *Scanner) Pos() (pos Position)

Pos returns the position of the character immediately after the character or token returned by the last call to Next or Scan.

func (*Scanner) Scan

func (s *Scanner) Scan() rune

Scan reads the next token or Unicode character from source and returns it. It returns EOF at the end of the source. It reports scanner errors (read and token errors) by calling s.Error, if not nil; otherwise it prints an error message to os.Stderr.

func (*Scanner) TokenText

func (s *Scanner) TokenText() string

TokenText returns the string corresponding to the most recently scanned token. Valid after calling Scan().
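
The Scanner methods documented above (Init, Scan, TokenText, the Error field, and EOF) have the same names and signatures as those of Go's standard text/scanner package, so the usual scan loop can be sketched with the standard library as a stand-in. This is only an illustration of the call sequence: text/scanner applies Go token rules, not Prolog ones, and the helper tokenTexts is hypothetical. Note that the Error handler is installed after Init, because Init resets Error to nil.

```go
package main

import (
	"fmt"
	"strings"
	"text/scanner"
)

// tokenTexts is a hypothetical helper. It scans src with the standard
// text/scanner (a stand-in for lex.Scanner) and returns the text of each
// token together with any error messages reported through Error.
func tokenTexts(src string) (toks []string, errs []string) {
	var s scanner.Scanner
	s.Init(strings.NewReader(src))
	// Install the handler after Init, which sets Error to nil.
	s.Error = func(_ *scanner.Scanner, msg string) {
		errs = append(errs, msg)
	}
	for tok := s.Scan(); tok != scanner.EOF; tok = s.Scan() {
		toks = append(toks, s.TokenText())
	}
	return toks, errs
}

func main() {
	toks, errs := tokenTexts("foo(Bar, 42).")
	fmt.Println(toks, len(errs))
}
```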
