srcanlzr: github.com/DevMine/srcanlzr/src

package src

import "github.com/DevMine/srcanlzr/src"

Package src provides a set of structures for representing a project and its related source code independently of the language. In other words, it provides a generic representation (abstraction) of source code.

Goal

The goal of this package is to provide a generic representation of a project that can be analyzed by the anlzr package as well as an API for encoding/decoding it to/from JSON.

A presentation video is available on the DevMine website:

http://devmine.ch/news/2015/06/08/srcanlzr-presentation/

Usage

Two kinds of programs interact with a src.Project: language parsers and VCS support tools. The former visit all source files inside the project folder and parse each of them in order to fill the src.Project.Packages field (and a few others). The latter read the folder that contains the VCS data and fill the src.Project.Repo structure. The next two sections describe them in more detail.

Language parsers

Language parsers must output the same structure as defined by the src.Project type. They first parse a project to obtain its language-specific AST, then map that AST onto the generic AST defined in the ast package:

http://godoc.org/github.com/DevMine/srcanlzr/src/ast

For more details about how to write a language parser for srcanlzr, refer to this tutorial:

http://devmine.ch/news/2015/05/31/how-to-write-a-parser/

VCS support tools

Language parsers must not provide any information related to Version Control Systems (VCS). VCS metadata is the job of repotool:

http://devmine.ch/news/2015/06/01/repotool-presentation/
http://devmine.ch/doc/repotool/

Example

For the following Go source file (greet/main.go):

package main

import (
        "fmt"
)

func greet(name string) {
        fmt.Printf("Hello, %s!\n", name)
}

func main() {
        name := "World"
        greet(name)
}

The language parser must produce the following JSON output:

{
   "name": "greet",
   "loc": 5,
   "languages": [
      {
         "language": "go",
         "paradigms": [
            "compiled",
            "concurrent",
            "imperative",
            "structured"
         ]
      }
   ],
   "packages": [
      {
         "loc": 5,
         "name": "greet",
         "path": "/home/revan/go/src/foo/greet",
         "source_files": [
            {
               "functions": [
                  {
                     "body": [
                        {
                           "expression": {
                              "arguments": [
                                 {
                                    "expression_name": "BASIC_LIT",
                                    "kind": "STRING",
                                    "value": "Hello, %s!\\n"
                                 },
                                 {
                                    "expression_name": "IDENT",
                                    "name": "name"
                                 }
                              ],
                              "expression_name": "CALL",
                              "function": {
                                 "function_name": "Printf",
                                 "namespace": "fmt"
                              },
                              "line": 0
                           },
                           "statement_name": "EXPR"
                        }
                     ],
                     "loc": 0,
                     "name": "greet",
                     "type": {
                        "parameters": [
                           {
                              "name": "name",
                              "type": "string"
                           }
                        ]
                     },
                     "visibility": ""
                  },
                  {
                     "body": [
                        {
                           "left_hand_side": [
                              {
                                 "expression_name": "IDENT",
                                 "name": "name"
                              }
                           ],
                           "line": 1,
                           "right_hand_side": [
                              {
                                 "expression_name": "BASIC_LIT",
                                 "kind": "STRING",
                                 "value": "World"
                              }
                           ],
                           "statement_name": "ASSIGN"
                        },
                        {
                           "expression": {
                              "arguments": [
                                 {
                                    "expression_name": "IDENT",
                                    "name": "name"
                                 }
                              ],
                              "expression_name": "CALL",
                              "function": {
                                 "function_name": "greet",
                                 "namespace": ""
                              },
                              "line": 0
                           },
                           "statement_name": "EXPR"
                        }
                     ],
                     "loc": 0,
                     "name": "main",
                     "type": null,
                     "visibility": ""
                  }
               ],
               "imports": [
                  "fmt"
               ],
               "language": {
                  "language": "go",
                  "paradigms": [
                     "compiled",
                     "concurrent",
                     "imperative",
                     "structured"
                  ]
               },
               "loc": 5,
               "path": "/home/revan/go/src/foo/greet/main.go"
            }
         ]
      }
   ]
}

Lines of Code counting

The number of real lines of code must be precomputed by the language parsers. This is the only "feature" that must be precomputed, because it has multiple uses:

1. Eliminate empty projects

2. Evaluate project size

3. Verify that the decoding is correct

4. Normalize various counts

5. ...

Therefore, this count must be accurate and strictly follow the rules below.

We only count statements and declarations as lines of code. Comments, package declarations, imports, expressions, etc. must not be taken into account. Since an example is worth more than a thousand words, let's consider the following snippet:

// Package doc (does not count as a line of code)
package main // does not count as a line of code

import "fmt" // does not count as a line of code

func main() { // counts as 1 line of code
  fmt.Println(
     "Hello, World!",
  ) // counts as 1 line of code
}

The expected number of lines of code is 2: the declaration of the main function and the call to the fmt.Println function.
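As an illustration of the counting rules above, here is a minimal sketch for Go sources built on the standard go/parser and go/ast packages. The countLoC helper is hypothetical and not part of this package. It assumes that function declarations, non-import general declarations and every statement other than a bare block count as one line each, which is only one possible reading of the rules.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// countLoC is a hypothetical helper: it counts declarations and statements
// in a Go source, ignoring comments, the package clause and imports.
func countLoC(src string) (int64, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "snippet.go", src, 0)
	if err != nil {
		return 0, err
	}

	var n int64
	ast.Inspect(f, func(node ast.Node) bool {
		switch d := node.(type) {
		case *ast.FuncDecl:
			n++ // a function declaration counts as one line
		case *ast.GenDecl:
			if d.Tok != token.IMPORT {
				n++ // const, var and type declarations count; imports do not
			}
		case *ast.BlockStmt, *ast.DeclStmt:
			// Assumption: a block is only a container, and a DeclStmt is
			// already counted through the GenDecl it wraps.
		case ast.Stmt:
			n++ // any other statement counts as one line
		}
		return true
	})
	return n, nil
}

func main() {
	src := `// Package doc
package main

import "fmt"

func main() {
	fmt.Println(
		"Hello, World!",
	)
}
`
	n, err := countLoC(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(n)
}
```

Under these assumptions the snippet above yields 2, and the greet/main.go file from the Example section yields 5, which matches the "loc" values shown there.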

Performance

The DevMine project deals with terabytes of source code, so JSON decoding must be efficient. That is why we implemented our own JSON decoder that focuses on performance. To make decoding as fast as possible, we had to make some choices and impose some constraints on language parsers.

JSON is usually unpredictable, which forces JSON parsers to be generic in order to deal with every possible kind of input. In DevMine, we have a well-defined structure, so instead of writing a generic JSON decoder we wrote one that decodes only src.Project objects. This improves performance since we don't need reflection, generic types (interface{}) or type assertions. The drawback of this choice is that we have to update the decoder every time we modify our structures.

Most JSON parsers assume that the JSON input is potentially invalid (i.e. malformed). We don't. Unlike json.Unmarshal, we don't check for well-formedness.

We also force the language parsers to put the "expression_name" and "statement_name" fields at the beginning of the JSON object. We use that convention to decode generic ast.Expr and ast.Stmt without reading the whole JSON object.

Besides, we restrict the supported JSON types to:

string
int64
float64
bool
object
array

All objects used (even inside an array) must be pointers. This is required by the decoder generator.

The only officially supported encoding is UTF-8.

Index

Package Files

decode.go decode_ast.gen.go doc.go interface.go merge.go scanner.go src.go

Constants

const (
    Git = "git"
    Hg  = "mercurial"
    SVN = "subversion"
    Bzr = "bazaar"
    CVS = "cvs"
)

Supported VCS (Version Control System)

const (
    Go     = "go"
    Ruby   = "ruby"
    Python = "python"
    C      = "c"
    Java   = "java"
    Scala  = "scala"
)

Supported programming languages

const (
    Structured     = "structured"
    Imperative     = "imperative"
    Procedural     = "procedural"
    Compiled       = "compiled"
    Concurrent     = "concurrent"
    Functional     = "functional"
    ObjectOriented = "object oriented"
    Generic        = "generic"
    Reflective     = "reflective"
)

Supported paradigms

type Language

type Language struct {
    // The programming language name (e.g. go, ruby, java, etc.)
    //
    // The name must match one of the supported programming languages defined in
    // the constants.
    Lang string `json:"language"` // TODO rename into name

    // The paradigms of the programming language (e.g. structured, imperative,
    // object oriented, etc.)
    //
    // The name must match one of the supported paradigms defined in the
    // constants.
    Paradigms []string `json:"paradigms"`
}

A Language represents a programming language.
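Since both fields must match the constants listed above, a language parser may want to check its output before emitting it. The following is a minimal sketch of such a check. The validateLanguage function and the two lookup tables are hypothetical and not part of this package; the Language type is a local copy so that the example is self-contained.

```go
package main

import "fmt"

// Local copies of the names listed in the constants of this package.
var supportedLangs = map[string]bool{
	"go": true, "ruby": true, "python": true,
	"c": true, "java": true, "scala": true,
}

var supportedParadigms = map[string]bool{
	"structured": true, "imperative": true, "procedural": true,
	"compiled": true, "concurrent": true, "functional": true,
	"object oriented": true, "generic": true, "reflective": true,
}

// Language is a local copy of src.Language.
type Language struct {
	Lang      string   `json:"language"`
	Paradigms []string `json:"paradigms"`
}

// validateLanguage is a hypothetical helper: it reports the first name that
// does not match one of the supported constants.
func validateLanguage(l *Language) error {
	if !supportedLangs[l.Lang] {
		return fmt.Errorf("unsupported language: %q", l.Lang)
	}
	for _, p := range l.Paradigms {
		if !supportedParadigms[p] {
			return fmt.Errorf("unsupported paradigm: %q", p)
		}
	}
	return nil
}

func main() {
	l := &Language{
		Lang:      "go",
		Paradigms: []string{"compiled", "concurrent", "imperative", "structured"},
	}
	if err := validateLanguage(l); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	fmt.Println("valid")
}
```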

type Package

type Package struct {
    // The package documentation, or nil.
    // TODO support documentation for multiple languages.
    Doc []string `json:"doc,omitempty"`

    // The package name. This should be the name of the parent folder.
    Name string `json:"name"`

    // The full path of the package. The path must be relative to the root of
    // the project and never be an absolute path.
    Path string `json:"path"`

    // The list of all source files contained in the package.
    SrcFiles []*SrcFile `json:"source_files"`

    // The total number of lines of code of the package.
    LoC int64 `json:"loc"`
}

Package holds information about a package, which is, basically, just a folder that contains at least one source file.

type Project

type Project struct {
    // The name of the project. Since it may be something really difficult to
    // guess, it should generally be the name of the folder containing the
    // project.
    Name string `json:"name"`

    // The repository in which the project is hosted, or nil. This field is not
    // meant to be filled by one of the language parsers. Only repotool should
    // take care of it. For more details, see:
    //    https://github.com/DevMine/repotool
    //
    // Since this field uses an external type, it is not unmarshalled by
    // src.Unmarshal itself but by the standard json.Unmarshal function.
    // To do so, its unmarshalling is deferred using json.RawMessage.
    // See the RepoRaw field.
    Repo *model.Repository `json:"repository,omitempty"`

    // The list of all programming languages used by the project. Each language
    // must be added by the corresponding language parsers if and only if the
    // project contains at least one line of code written in this language.
    Langs []*Language `json:"languages"`

    // List of all packages of the project. We call "package" every folder that
    // contains at least one source file.
    Packages []*Package `json:"packages"`

    // The total number of lines of code in the whole project, independently of
    // the language.
    LoC int64 `json:"loc"`
}

Project is the root of the src API and therefore it must be at the root of the JSON.

It contains the metadata of a project and the list of all packages.

func Decode

func Decode(r io.Reader) (*Project, error)

Decode decodes a JSON encoded src.Project read from r.

func DecodeFile

func DecodeFile(path string) (*Project, error)

DecodeFile decodes a JSON encoded src.Project read from a given file.

func Merge

func Merge(p1, p2 *Project) *Project

Merge merges two projects. See MergeAll for more details.

func MergeAll

func MergeAll(ps ...*Project) (*Project, error)

MergeAll merges a list of projects.

There must be at least one project. If only one project is given, MergeAll just returns a copy of it. Moreover, the projects must be distinct.

The merge only performs shallow copies, which means that if a field value is a pointer, the memory address is copied and not the value pointed to.
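The practical consequence of shallow copies can be sketched as follows. The types are trimmed local copies and mergeShallow is a hypothetical function: the actual merge rules of this package are not described here, so concatenating the packages and summing the line counts are assumptions made for the sake of the example. Only the pointer-sharing behaviour is the point.

```go
package main

import "fmt"

// Trimmed local copies of the src types.
type Package struct {
	Name string
	LoC  int64
}

type Project struct {
	Name     string
	Packages []*Package
	LoC      int64
}

// mergeShallow is a hypothetical merge: it copies the package pointers of
// both projects into a new project (assumption: packages are concatenated
// and line counts are summed).
func mergeShallow(p1, p2 *Project) *Project {
	merged := &Project{Name: p1.Name, LoC: p1.LoC + p2.LoC}
	merged.Packages = append(merged.Packages, p1.Packages...)
	merged.Packages = append(merged.Packages, p2.Packages...)
	return merged
}

func main() {
	p1 := &Project{Name: "foo", LoC: 5, Packages: []*Package{{Name: "greet", LoC: 5}}}
	p2 := &Project{Name: "foo", LoC: 3, Packages: []*Package{{Name: "util", LoC: 3}}}

	merged := mergeShallow(p1, p2)

	// The merged project shares its packages with the inputs: a change made
	// through one of them is visible through the other.
	merged.Packages[0].Name = "renamed"
	fmt.Println(p1.Packages[0].Name == merged.Packages[0].Name)
}
```

Callers that need independent projects after a merge would have to deep-copy the pointed values themselves.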

func (*Project) Encode

func (p *Project) Encode(w io.Writer) error

Encode writes the JSON representation of the project into w.

For now, encoding still makes use of the json package of the standard library.

func (*Project) EncodeToFile

func (p *Project) EncodeToFile(path string) error

EncodeToFile writes the JSON representation of the project into a file located at path.

type SrcFile

type SrcFile struct {
    // The path of the source file, relative to the root of the project.
    Path string `json:"path"`

    // Programming language used.
    Lang *Language `json:"language"`

    // List of the imports used by the source file.
    Imports []string `json:"imports,omitempty"`

    // Types definition
    TypeSpecs []*ast.TypeSpec `json:"type_specifiers,omitempty"`

    // Structures definition
    // TODO rename JSON key into structures
    Structs []*ast.StructType `json:"structs,omitempty"`

    // List of constants defined at the file level (e.g. global constants)
    Constants []*ast.GlobalDecl `json:"constants,omitempty"`

    // List of variables defined at the file level (e.g. global variables)
    Vars []*ast.GlobalDecl `json:"variables,omitempty"`

    // List of functions
    Funcs []*ast.FuncDecl `json:"functions,omitempty"`

    // List of interfaces
    Interfaces []*ast.Interface `json:"interfaces,omitempty"`

    // List of classes
    Classes []*ast.ClassDecl `json:"classes,omitempty"`

    // List of enums
    Enums []*ast.EnumDecl `json:"enums,omitempty"`

    // List of traits
    // See http://en.wikipedia.org/wiki/Trait_%28computer_programming%29
    Traits []*ast.Trait `json:"traits,omitempty"`

    // The total number of lines of code.
    LoC int64 `json:"loc"`
}

SrcFile holds information about a source file.

Directories

Path     Synopsis
ast      Package ast represents a language agnostic Abstract Syntax Tree (AST).
gen
token

Package src imports 9 packages and is imported by 3 packages. Updated 2016-07-22.