yy

command module
v0.0.0-...-2a689d0 Latest
Published: Nov 22, 2018 License: BSD-3-Clause Imports: 19 Imported by: 0

README

github.com/cznic/yy has moved to modernc.org/yy (vcs).

Please update your import paths to modernc.org/yy.

This repo is now archived.

Documentation

Overview

Command yy processes yacc source code and produces three output files:

- A Go file containing definitions of AST nodes.

- A Go file containing documentation examples[0] of productions defined by the yacc grammar.

- A new yacc file with automatic actions instantiating the AST nodes.

Installation

To install yy

$ go get [-u] github.com/cznic/yy

Online documentation

http://godoc.org/github.com/cznic/yy

Usage

Invocation:

$ yy [options] <input.y>

Options

Flags handled by the yy command:

-ast string
      Output AST nodes definitions. (default "ast.go")
-astExamples string
      Output AST examples. (default "ast_test.go")
-astImport string
      Optional AST file imports.
-exampleAST string
      Function to call to produce example ASTs. (default "exampleAST")
-kind string
      Default node kind (rule case) field name. (default "kind")
-namedCases
      Generate typed and named case numbers.
-node string
      Default non terminal yacc type. (default "node")
-o string
      Output yacc file. (default "parser.y")
-pkg string
      Package name of generated Go files. Extract from input when blank.
-prettyString string
      Function to stringify things nicely. (default "prettyString")
-token string
      Default terminal yacc type. (default "Token")
-tokenSep string
      AST examples token separator string. (default " ")
-v string
      Create grammar report. (default "y.output")
-yylex string
      Type of yacc's yylex. (default "*lexer")

Changelog

2017-10-23: Added the case directive.

Examples

A partial example: see the testdata directory and files

input:	in.y
output:	ast.go
output:	ast_test.go
output:	out.y

The three output files were generated by

yy -o testdata/out.y -ast testdata/ast.go -astExamples testdata/ast_test.go testdata/in.y

A more complete, working project using yy can be found at http://godoc.org/github.com/cznic/pl0

Concepts

Every rule is turned into a definition of a struct type in ast.go (adjust using the -ast flag). The fields of the type are a sum of all productions (cases) of the rule.

Rule:
        Foo Bar // Case 0
|       Foo Baz // Case 1

The generated type will be something like

type Rule struct {
        Case int // In [0, 1].
        Bar  *Bar
        Baz  *Baz
        Foo  *Foo
}

In the above, the Foo and Bar fields will be non-nil when Case is 0, and the Foo and Baz fields will be non-nil when Case is 1.
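
A minimal sketch of how client code might consume such a node, assuming the Rule type shown above. The Foo, Bar and Baz types with a Name field and the describe helper are hypothetical stand-ins for illustration; yy does not generate them in this form.

```go
package main

import "fmt"

// Hypothetical stand-ins for node types yy would generate.
type Foo struct{ Name string }
type Bar struct{ Name string }
type Baz struct{ Name string }

// Rule mirrors the shape of the generated type shown above.
type Rule struct {
	Case int // In [0, 1].
	Bar  *Bar
	Baz  *Baz
	Foo  *Foo
}

// describe touches only the fields that are set for the given Case.
func describe(r *Rule) string {
	switch r.Case {
	case 0: // Foo Bar
		return fmt.Sprintf("case 0: %s %s", r.Foo.Name, r.Bar.Name)
	case 1: // Foo Baz
		return fmt.Sprintf("case 1: %s %s", r.Foo.Name, r.Baz.Name)
	default:
		return "unknown case"
	}
}

func main() {
	r := &Rule{Case: 1, Foo: &Foo{"foo"}, Baz: &Baz{"baz"}}
	fmt.Println(describe(r))
}
```

Switching on Case before touching any field avoids dereferencing the fields that are nil for that production.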

The above holds when Foo, Bar and Baz are all non-terminal symbols. If the productions also contain terminal symbols, those symbols are turned into fields named Token, with a numeric suffix added when more than one terminal appears in any of the productions.

Rule:
        Foo '+' Bar
|       Foo '[' NUMBER ']' Bar

The generated type will be like

type Rule struct {
        Case   int	// In [0, 1].
        Bar    *Bar
        Foo    *Foo
        Token  MyTokenType
        Token2 MyTokenType
        Token3 MyTokenType
}

In the above, Token will capture '+' when Case is 0. For Case 1, Token will capture '[', Token2 NUMBER and Token3 ']'.

MyTokenType is the type defined in the yacc %union as in

%union {
        node    MyNodeType
        Token   MyTokenType
}

It is assumed that the lexer passed as an argument to yyParse instantiates the lval.Token field with additional token information, such as the lexeme value, the starting position in the file, etc.
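
The following is a rough sketch of a lexer that fills lval.Token, not the lexer of any real project. MyTokenType's fields, the local yySymType stand-in (goyacc normally generates it from the %union), the NUMBER token code and the toy scanning rules are all assumptions made to keep the example self-contained.

```go
package main

import "fmt"

// MyTokenType is a hypothetical token type carrying lexeme and position.
type MyTokenType struct {
	Val string // lexeme
	Pos int    // byte offset of the lexeme in the input
}

// yySymType stands in for the type goyacc generates from the %union.
type yySymType struct {
	node  interface{}
	Token MyTokenType
}

// NUMBER is a hypothetical token code; goyacc assigns the real ones.
const NUMBER = 57346

// lexer is a toy scanner: digit runs form NUMBER, any other
// non-space byte is returned as its own token code.
type lexer struct {
	src string
	pos int
}

// Lex stores token details in lval.Token and returns the token code.
func (l *lexer) Lex(lval *yySymType) int {
	for l.pos < len(l.src) && l.src[l.pos] == ' ' {
		l.pos++
	}
	if l.pos >= len(l.src) {
		return 0 // end of input
	}
	start := l.pos
	c := l.src[l.pos]
	l.pos++
	if c < '0' || c > '9' {
		lval.Token = MyTokenType{Val: string(c), Pos: start}
		return int(c)
	}
	for l.pos < len(l.src) && l.src[l.pos] >= '0' && l.src[l.pos] <= '9' {
		l.pos++
	}
	lval.Token = MyTokenType{Val: l.src[start:l.pos], Pos: start}
	return NUMBER
}

// Error completes the interface yyParse expects from its lexer.
func (l *lexer) Error(s string) { fmt.Println("error:", s) }

func main() {
	l := &lexer{src: "[ 42 ]"}
	var lval yySymType
	for tok := l.Lex(&lval); tok != 0; tok = l.Lex(&lval) {
		fmt.Printf("%d %q @%d\n", tok, lval.Token.Val, lval.Token.Pos)
	}
}
```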

Generated actions

There's a direct mapping, though not in the same order, between the yacc pseudo variables $1, $2, ... and the fields of the generated node types. For every production not disabled by the yy:ignore directive, yy injects code that instantiates the AST node when the production is reduced. For example, this rule from input.y

File:
        Prologue TopLevelDeclList

having no semantic action is turned into

File:
        Prologue TopLevelDeclList
        {
                $$ = &File{
                        Prologue:          $1.(*Prologue),
                        TopLevelDeclList:  $2.(*TopLevelDeclList).reverse(),
                }
        }

in output.y. The default yacc type of AST nodes is 'node' and can be changed using the -node flag.

Conventions

Option-like rules, for example as in

BlockOpt:
|       Block

are converted into

        BlockOpt:
                /* empty */
                {
                        $$ = (*BlockOpt)(nil)
                }
        |       Block
                {
                        $$ = &BlockOpt{
                                Block:  $1.(*Block),
                        }
                }

in output.y, i.e. the empty case does not produce a &RuleOpt{} but nil instead, to conserve space.

Generated examples depend on a user-supplied function, by default named exampleAST, with the signature

exampleAST(rule int, src string) interface{}

This function is called with the production number, as assigned by goyacc, and an example string generated by yy. exampleAST should parse the example string and return the AST created when the production rule is reduced.

When the project's parser is not yet working, a dummy exampleAST function that always returns nil is a workaround.
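
Such a placeholder could look like the sketch below. Only the name and signature come from the text above; the main function is added solely to make the snippet runnable on its own.

```go
package main

import "fmt"

// exampleAST is a placeholder for use while the parser is not yet working.
// It ignores both the production number and the example source and
// always returns nil.
func exampleAST(rule int, src string) interface{} {
	return nil
}

func main() {
	fmt.Println(exampleAST(1, "foo + bar") == nil)
}
```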

Magic names

yy inspects rule actions found in the input file. If the action code mentions the identifier lx, yy assumes it refers to the yyLexer passed to yyParse. In that case code like

lx := yylex.(*lexer)

is injected near the beginning of the semantic action. The specific type to which the yylex parameter is type asserted is adjustable using the -yylex flag. Similarly, when the identifier lhs is mentioned, a short variable definition of variable lhs, like

lhs := &Foo{...}
$$ = lhs

is injected into the output.y action, replacing the default generated action (see "Concepts").

For example, an action in input.y

|       IdentifierList Type '=' ExpressionList
        {
                lhs.declare(lx.scope)
        }

produces

{
        lx := yylex.(*lexer)
        lhs := &VarSpec{
                Case:            2,
                IdentifierList:  $1.(*IdentifierList).reverse(),
                Type:            $2.(*Type),
                Token:           $3,
                ExpressionList:  $4.(*ExpressionList).reverse(),
        }
        $$ = lhs
        lhs.declare(lx.scope)
}

in output.y.

The AST examples generator depends on the presence of the yy:token directive for all non-constant terminal symbols, or on the presence of a constant token value, as in this example

%token /*yy:token "%c" */       IDENTIFIER      "identifier"
%token                          BREAK           "break"

Using fe

The AST examples yy generates must be post-processed using the fe command (http://godoc.org/github.com/cznic/fe), for example

$ go test -run ^Example[^_] | fe

One of the reasons why this is not done automatically by yy is that the above command will succeed only after your project has a _working_ scanner/parser combination. That's not the case in the early stages.

Directives

yy recognizes specially formatted comments within the input as directives. All directives have the format

   //yy:command argument

or

   /*yy:command argument */

Note that the directive must immediately follow the comment opening. There must be no empty lines between the directive and the production it applies to.

Directive example

For example

//yy:example "foo * bar"
Rule:
        Foo '*' Bar
//yy:example "foo / bar"
|       Foo '/' Bar

The argument of the example directive is a double-quoted Go string. The string is used instead of an automatically generated example.

Directive field

For example

//yy:field      count   int
//yy:field      flag    bool
Rule: Foo Bar

The argument of the field directive is the text up to the end of the comment. The argument is added to the automatically generated fields of the node type of Rule.

Directive ignore

For example

//yy:ignore
Rule: Foo Bar

The ignore directive has no arguments. The directive disables generation of the node type of Rule as well as of the code instantiating such a node.

Directive list

For example

//yy:list
Rule:
        Item
|       Rule ',' Item

The list directive has no arguments. By default, yy detects all left-recursive rules. When the name of such a rule has the suffix 'List', yy automatically generates proper reversing of the rule items. The list directive enables the same behavior for a left-recursive rule whose name does not have the suffix 'List'.

Directive token

For example

/*yy:token "%c"*/ IDENT
/*yy:token "%d"*/ NUMBER

The argument of the token directive is a double-quoted Go string. The string is passed to a fmt.Sprintf call with a numeric argument chosen by yy that falls within the lowercase ASCII letters. The resulting string is used to generate textual token values in examples.
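
As a small illustration of the formatting step only, the sketch below applies a format string to a numeric argument taken from the lowercase ASCII letters. The tokenText helper is hypothetical, and how yy picks the number is not shown.

```go
package main

import "fmt"

// tokenText formats n with the directive's format string,
// the way a fmt.Sprintf call with one numeric argument would.
func tokenText(format string, n int) string {
	return fmt.Sprintf(format, n)
}

func main() {
	fmt.Println(tokenText("%c", 'a')) // identifier-like text: a
	fmt.Println(tokenText("%d", 'b')) // number-like text: 98
}
```

With "%c" the number is rendered as the letter itself, which suits identifier-like tokens, while "%d" renders its decimal value, which suits numeric tokens.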

Directive case

For example

//yy:case Foo
/*yy:case Bar */ NUMBER

The argument of the case directive is an identifier, which is appended to the rule name to produce a symbolic and typed case number value. The type name is <RuleName>Case.
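
A sketch of what typed, named case numbers could look like for a rule named Rule with the case names Foo and Bar. Only the <RuleName>Case type name comes from the text above; the constant names and their numbering are assumptions, and the code yy actually generates may differ.

```go
package main

import "fmt"

// RuleCase is the typed case number for rule Rule (<RuleName>Case).
type RuleCase int

// Assumed constant names: the rule name with the directive argument appended.
const (
	RuleFoo RuleCase = iota
	RuleBar
)

// Rule is a reduced stand-in for the generated node type.
type Rule struct {
	Case RuleCase
}

// caseName maps a node's case to a readable label.
func caseName(r *Rule) string {
	switch r.Case {
	case RuleFoo:
		return "Foo"
	case RuleBar:
		return "Bar"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(caseName(&Rule{Case: RuleBar}))
}
```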
