lex

package module
v0.0.0-...-ce0fb5e
Published: Nov 22, 2018 License: BSD-3-Clause Imports: 13 Imported by: 1

README

github.com/cznic/lex has moved to modernc.org/lex (vcs).

Please update your import paths to modernc.org/lex.

This repo is now archived.

Documentation

Overview

Package lex provides support for a *nix (f)lex-like tool operating on .l sources. The syntax is similar to a subset of (f)lex; see also: http://flex.sourceforge.net/manual/Format.html#Format

Changelog

2014-11-18: Add option for marking an accepting state. Required to support POSIX longest match.

Some feature examples:

/* Unindented multiline Go comments in the definitions section */

	Any indented text in the definitions section

%{
Any text in the definitions section within %{ and %}
%}

D [0-9]

%s non-exclusive-start-condition s2 s3

%x exclusive-start-condition e2

%yyt getTopState() // not required when only INITIAL start condition exists
%yyb last == '\n' || last == '\0'
%yyc getCurrentChar()
%yyn move() // get next character
%yym mark() // now in accepting state

%%
	Indented text before the first rule is presumably treated specially (renderer specific)

{D}+	return(INT)

{D}+\.{D}+
	return(FLOAT)

[a-z][a-z0-9]+
	/* identifier found */
	return(IDENT)

A"[foo]\"bar"Z println(`A[foo]"barZ`)

^bol|eol$

<non-exclusive-start-condition>foo
%{
	println("foo found")
%}

<s2,s3>bar

<INITIAL,e2>abc

<*>"always" println("active in all start conditions")

%%
The optional user code section. Possibly the place where a lexeme recognition failure will
be handled (renderer specific).

Missing/differing functionality of the .l parser/FSM generator (compared to flex):

  • Trailing context (re1/re2).
  • No requirement of an action to start on the same line as the pattern.
  • Processing of actions enclosed in braces. This package mostly treats any non-blank text following a pattern, up to the next pattern, as the action's source code.
  • All flex % prefixed options except %s and %x.
  • Flex incompatible %yy* options.
  • No cclasses ([[:digit:]]).
  • Anything special after '(?'.
  • Matching <<EOF>>. Still \0 is OK in a pattern.
  • And probably more.
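
The %yyc, %yyn and %yym options in the example above supply code for peeking at the current character, advancing, and marking an accepting state. A minimal hand-written sketch of how such hooks can cooperate to produce a longest match for the rules {D}+ and {D}+\.{D}+ follows. The scanner type, its methods, and the token constants are hypothetical stand-ins; this is not the code any particular renderer emits.

```go
package main

import "fmt"

// Token kinds used by this sketch (hypothetical names).
const (
	none = iota
	tokInt
	tokFloat
)

// scanner is a hand-written stand-in for renderer-generated code.
type scanner struct {
	src    string
	pos    int // index of the current (peek) character
	mark   int // input position recorded at the last accepting state
	accept int // token kind recorded at the last accepting state
}

// current plays the role of %yyc: it peeks at the current character
// and yields 0 at end of input.
func (s *scanner) current() byte {
	if s.pos < len(s.src) {
		return s.src[s.pos]
	}
	return 0
}

// move plays the role of %yyn: consume the current character.
func (s *scanner) move() { s.pos++ }

// markAccept plays the role of %yym: remember the accepting position.
func (s *scanner) markAccept(kind int) { s.mark, s.accept = s.pos, kind }

func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// next scans one token for the rules {D}+ and {D}+\.{D}+.
// It returns the token kind and the matched text.
func (s *scanner) next() (int, string) {
	start := s.pos
	s.mark, s.accept = start, none
	for isDigit(s.current()) {
		s.move()
		s.markAccept(tokInt)
	}
	if s.accept == tokInt && s.current() == '.' {
		s.move() // speculative: may overrun the longest match
		for isDigit(s.current()) {
			s.move()
			s.markAccept(tokFloat)
		}
	}
	// Back up to the last accepting position so that any overrun
	// input is scanned again by the next call.
	s.pos = s.mark
	return s.accept, s.src[start:s.pos]
}

func main() {
	for _, src := range []string{"3.14", "12.x"} {
		s := &scanner{src: src}
		kind, text := s.next()
		fmt.Println(kind, text)
	}
}
```

For the input 12.x the scanner reads past the dot, finds no digit, and falls back to the marked position, so the dot remains available to the next call.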

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type L

type L struct {
	// Source code lines for rendering from the definitions section
	DefCode []string
	// Names of declared start conditions with their respective numeric
	// identifiers
	StartConditions map[string]int
	// Numeric identifiers of start conditions with their respective DFA
	// start state
	StartConditionsStates map[int]*lexer.NfaState
	// Numeric identifiers of beginning-of-line start conditions with their
	// respective DFA start state
	StartConditionsBolStates map[int]*lexer.NfaState
	// Rules[0] is a pseudo rule. Its action contains the source code for
	// rendering from the rules section before the first rule
	Rules []Rule
	// The generated FSM
	Dfa lexer.Nfa
	// Accept states with their respective rule index
	Accepts map[*lexer.NfaState]int
	// Source code for rendering from the user code section
	UserCode string
	// Source code for rendering of get_current_start_condition. Set by
	// %yyt.
	YYT string
	// Source code for rendering of get_bol, i.e. if we are at the
	// beginning of line right now. Set by %yyb.
	YYB string
	// Source code for rendering of get_peek_char, i.e. the char the lexer
	// will now consider in making of a decision. Set by %yyc.
	YYC string
	// Source code for rendering of move_to_next_char, i.e. "consume" the
	// current peek char and go to the next one. Set by %yyn.
	YYN string
	// Source code for rendering of mark_accepting, which supports accepting
	// the longest match while reusing the "overflowed" input. Set by %yym.
	YYM string
}

L represents selected data structures found in / generated from a .l source. A [command line] tool using this package may then render L to some programming language source code and/or data table(s).
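
As a rough illustration of what "rendering" could mean, the sketch below turns a list of rules into a Go-like switch over the index of the accepted rule. The rule type is a local stand-in that mirrors only the Pattern and Action fields of the Rule type documented below, and the render function and the variable name accepted are hypothetical; a real tool would also walk the DFA and the Accepts map.

```go
package main

import (
	"fmt"
	"strings"
)

// rule is a local stand-in mirroring two fields of the documented Rule type.
type rule struct {
	Pattern string
	Action  string
}

// render emits the pseudo rule's code followed by a switch whose cases
// carry the action source of each real rule. The output format is an
// assumption made for this sketch only.
func render(rules []rule) string {
	if len(rules) == 0 {
		return ""
	}
	var b strings.Builder
	// rules[0] is the pseudo rule: its action holds the source code that
	// appeared in the rules section before the first rule.
	if pre := strings.TrimSpace(rules[0].Action); pre != "" {
		fmt.Fprintf(&b, "%s\n", pre)
	}
	b.WriteString("switch accepted {\n")
	for i, r := range rules[1:] {
		// i+1 is the rule's index in the original slice.
		fmt.Fprintf(&b, "case %d: // %s\n\t%s\n", i+1, r.Pattern, strings.TrimSpace(r.Action))
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	rules := []rule{
		{Action: "var n int"},
		{Pattern: "{D}+", Action: "return(INT)"},
		{Pattern: `{D}+\.{D}+`, Action: "return(FLOAT)"},
	}
	fmt.Print(render(rules))
}
```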

func NewL

func NewL(fname string, src io.RuneReader, unoptdfa, mode32 bool) (l *L, err error)

NewL parses the .l source named fname, read from src, and returns an L or an error, if any. It is currently not reentrant and must not be invoked more than once in an application (which is assumed tolerable for a "lex" tool). Setting the unoptdfa argument disables optimization of the produced DFA. The mode32 parameter is not yet supported and must be false.

func (*L) DfaString

func (l *L) DfaString() string

DfaString returns the textual representation of the Dfa field.

func (*L) String

func (l *L) String() string

type Rule

type Rule struct {
	Conds   []string // Start conditions of the rule
	Pattern string   // Original rule's pattern
	BOL     bool     // Pattern starts with beginning of line assertion (^)
	EOL     bool     // Pattern ends with end of line assertion ($)
	RE      string   // Pattern translated to a regular expression
	Action  string   // Rule's associated action source code
}

Rule represents data for a pattern/action pair.
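
The Conds field lists the start conditions that qualify a rule. The sketch below shows how a renderer or scanner might decide whether a rule is active, assuming the package follows flex semantics: an unqualified rule is active in INITIAL and in every non-exclusive (%s) condition, and a rule qualified with <*> is active everywhere. The rule type is a local stand-in, and both the active helper and the representation of <*> as the string "*" are assumptions, not part of this package.

```go
package main

import "fmt"

// rule is a local stand-in mirroring two fields of the documented Rule type.
type rule struct {
	Conds   []string
	Pattern string
}

// active reports whether r may match while the scanner is in start
// condition cur. exclusive names the conditions declared with %x.
func active(r rule, cur string, exclusive map[string]bool) bool {
	if len(r.Conds) == 0 {
		// Unqualified rules apply in INITIAL and in all %s conditions,
		// but never in an exclusive (%x) condition.
		return !exclusive[cur]
	}
	for _, c := range r.Conds {
		if c == "*" || c == cur { // "*" for <*> is assumed here
			return true
		}
	}
	return false
}

func main() {
	exclusive := map[string]bool{"e2": true}
	rules := []rule{
		{Pattern: "[a-z][a-z0-9]+"},
		{Conds: []string{"INITIAL", "e2"}, Pattern: "abc"},
		{Conds: []string{"*"}, Pattern: `"always"`},
	}
	for _, cur := range []string{"INITIAL", "s2", "e2"} {
		for _, r := range rules {
			fmt.Println(cur, r.Pattern, active(r, cur, exclusive))
		}
	}
}
```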
