glocc

Published: Oct 23, 2017 License: BSD-3-Clause Imports: 8 Imported by: 0

README

glocc

glocc is a package implementing a relatively fast, parallel counter of lines of code in files and directories.

It also includes a command line tool, glocc, which is handy for performing such counting and pretty (brief or extensive) printing of the results.

glocc is an aggressively parallel solution to an embarrassingly parallel problem. The count of every file and every subdirectory is assigned to a separate goroutine. All spawned goroutines are properly synchronized, and their independent results are merged afterwards, one level up, on a per-subdirectory basis.
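
The fan-out and merge described above can be pictured with a minimal sketch. This is not the package's actual code: the node type and the countDir function are hypothetical, and an in-memory tree stands in for the file system so that the example is self-contained.

```go
package main

import (
	"fmt"
	"sync"
)

// node is a hypothetical in-memory stand-in for a directory;
// files maps a file name to its (already known) number of lines of code.
type node struct {
	files   map[string]int
	subdirs []*node
}

// countDir spawns one goroutine per file and one per subdirectory,
// waits for all of them, and returns the merged total for this level.
func countDir(n *node) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	add := func(v int) {
		mu.Lock()
		total += v
		mu.Unlock()
	}
	for _, loc := range n.files {
		wg.Add(1)
		go func(loc int) {
			defer wg.Done()
			add(loc) // a real counter would read and parse the file here
		}(loc)
	}
	for _, sub := range n.subdirs {
		wg.Add(1)
		go func(sub *node) {
			defer wg.Done()
			add(countDir(sub)) // each subdirectory merges its own results first
		}(sub)
	}
	wg.Wait()
	return total
}

func main() {
	tree := &node{
		files: map[string]int{"main.go": 40},
		subdirs: []*node{
			{files: map[string]int{"a.go": 10, "b.go": 5}},
		},
	}
	fmt.Println(countDir(tree)) // 55
}
```

Every level waits only for its own children, so results flow upwards one subdirectory at a time.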

It was originally written for use with personal projects and small codebases, and also as a way to get acquainted with the Go programming language. Performance-wise, it can be further improved (and hopefully will be, when I have more time).

Command line tool

Simply run it with any number of files or directories as command line arguments:

$ glocc ~/foo src/bar

By default, only a summary of all counted lines is printed to the standard output. To print the full results in a tree-like format, run it with the -a flag:

$ glocc -a baz.go ~/src/foo

The results can be printed in YAML (default) or JSON format, using the -o flag:

$ glocc -o json ~/bar

Running it with the -h flag shows all options available.

Installation

Assuming Go is properly installed, installing both the package and the command line tool should be as easy as:

$ go get -u github.com/ckatsak/glocc/...

Platforms

So far, it has been tested only with go1.9.1 on linux/amd64.

Supported Languages

  • Ada
  • Assembly
  • AWK
  • C
  • C++
  • C#
  • D (not the ddoc comments)
  • Delphi
  • Dockerfile
  • Eiffel
  • Elixir
  • Erlang
  • Go
  • Haskell
  • HTML
  • Java
  • JavaScript
  • JSON
  • Kotlin
  • Lisp
  • Makefile
  • MATLAB
  • OCaml
  • Perl (not __END__ comments)
  • PHP
  • PowerShell
  • Python
  • R
  • Ruby (not __END__ comments)
  • Rust
  • Scala
  • Scheme
  • shell scripts
  • SQL
  • Standard ML
  • TeX
  • Tcl
  • YAML

Using the glocc package

For use as a package, glocc exports func CountLoc(root string) DirResult, which, given a root directory, returns a struct of type DirResult, a custom (recursive) type that contains the results of counting all lines of code under this root directory.

It also exports EnableLogging() and DisableLogging() functions, to enable and disable verbose logging to standard error, respectively, using a package-level logger. Note that verbose logging includes details about every line of every file visited, which might be quite ...verbose, and not that useful.

Known Issue

For now, really huge source trees, like the Linux kernel source tree, may occasionally cause glocc to crash, due to the large number of blocked OS threads trying to handle the huge number of goroutines spawned. To be more precise, the exact problem is reported as:

$ glocc ./linux
runtime: program exceeds 10000-thread limit
fatal error: thread exhaustion

It should not occur in small and medium-sized codebases, and it is unlikely to occur in bigger ones. Just be warned. I plan to work around this problem once I have the time, maybe by using some kind of pool or by spawning the goroutines in some clever way. As long as this note is here, though, the bug is probably still around. Theoretically, a quick and dirty solution would be to raise the number of operating system threads that a Go program can use, with the SetMaxThreads() function in package runtime/debug; the default limit is 10000 threads. However, mind that (quoted from the official documentation):

SetMaxThreads is useful mainly for limiting the damage done by programs that create an unbounded number of threads. The idea is to take down the program before it takes down the operating system.
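
One way the "pool" idea could look is a counting semaphore that caps how many counts run at once. The sketch below is only an illustration under that assumption and is not how glocc currently works; countAll and its parameters are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// countAll runs count for every path, allowing at most limit calls
// to be in flight at once, and returns the sum of the results.
func countAll(paths []string, limit int, count func(string) int) int {
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for _, p := range paths {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit counts are already running
		go func(p string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when this count finishes
			n := count(p)
			mu.Lock()
			total += n
			mu.Unlock()
		}(p)
	}
	wg.Wait()
	return total
}

func main() {
	paths := []string{"a.go", "b.go", "c.go", "d.go"}
	// Hypothetical counter: pretend every file has len(name) lines.
	total := countAll(paths, 2, func(p string) int { return len(p) })
	fmt.Println(total) // 16
}
```

Because a slot is acquired before each goroutine is started, the number of goroutines blocked in file I/O (and therefore the number of OS threads they can pin) never exceeds limit.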

Documentation

Overview

Package glocc implements a relatively fast, parallel counter of lines of code in files and directories.

It also includes a command line tool, glocc, which is handy for performing such counting and pretty printing (brief or extensive) of the results.

glocc is an aggressively parallel solution to an embarrassingly parallel problem. The count for every file and every subdirectory is assigned to a separate goroutine. All spawned goroutines are properly synchronized, and their independent results are merged afterwards, one level up, on a per-subdirectory basis.

It was originally written for use with personal projects and small codebases, and also as a way to get acquainted with the Go programming language. Performance-wise, it can be further improved (and hopefully will be, when I have more time).

Command line tool

Simply run it with any number of files or directories as command line arguments:

$ glocc ~/foo src/bar

By default, only a summary of all counted lines is printed to the standard output. To print the full results in a tree-like format, run it with the -a flag:

$ glocc -a baz.go ~/src/foo

The results can be printed in YAML (default) or JSON format, using the -o flag:

$ glocc -o json ~/bar

Running it with the -h flag shows all options available.

Using the glocc package

For use as a package, glocc exports `func CountLoc(root string) DirResult`, which, given a root directory, returns a struct of type DirResult, a custom (recursive) type that contains the results of counting all lines of code under this root directory.

It also exports EnableLogging() and DisableLogging() functions, to enable and disable verbose logging to standard error, respectively, using a package-level logger. Note that verbose logging includes details about every line of every file visited, which might be quite ...verbose, and not that useful.

Known Issue

For now, really huge source trees, like the Linux kernel source tree, may occasionally cause glocc to crash, due to the large number of blocked OS threads trying to handle the huge number of goroutines spawned. To be more precise, the exact problem is reported as:

$ glocc ./linux
runtime: program exceeds 10000-thread limit
fatal error: thread exhaustion

It should not occur in small and medium-sized codebases, and it is unlikely to occur in bigger ones. Just be warned. I plan to work around this problem once I have the time, maybe by using some kind of pool or by spawning the goroutines in some clever way. As long as this note is here, though, the bug is probably still around. Theoretically, a quick and dirty solution would be to raise the number of operating system threads that a Go program can use, with the SetMaxThreads() function in runtime/debug; the default limit is 10000 threads. However, mind that (quoted from https://golang.org/pkg/runtime/debug/#SetMaxThreads):

SetMaxThreads is useful mainly for limiting the damage done by programs
that create an unbounded number of threads. The idea is to take down
the program before it takes down the operating system.
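
For completeness, the quick and dirty workaround mentioned above would look roughly like the sketch below in a program that uses the package. The raiseThreadLimit helper and the value 20000 are illustrative assumptions; debug.SetMaxThreads itself is the standard library function and returns the previous setting.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// raiseThreadLimit sets the maximum number of OS threads the program
// may use and returns the previous limit.
func raiseThreadLimit(n int) int {
	return debug.SetMaxThreads(n)
}

func main() {
	// 20000 is an arbitrary illustrative value, not a recommendation.
	prev := raiseThreadLimit(20000)
	fmt.Println("previous thread limit:", prev)
}
```

As the quoted documentation warns, raising the limit only postpones the crash and shifts the risk to the operating system, so bounding concurrency is the safer direction.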

Supported Languages

Ada, Assembly, AWK, C, C++, C#, D (not the ddoc comments), Delphi, Dockerfile, Eiffel, Elixir, Erlang, Go, Haskell, HTML, Java, JavaScript, JSON, Kotlin, Lisp, Makefile, MATLAB, OCaml, Perl (not __END__ comments), PHP, PowerShell, Python, R, Ruby (not __END__ comments), Rust, Scala, Scheme, shell scripts, SQL, Standard ML, TeX, Tcl, YAML.

Functions

func DisableLogging

func DisableLogging()

DisableLogging disables verbose logging to the standard error stream using the package-level logger.

func EnableLogging

func EnableLogging()

EnableLogging enables verbose logging to the standard error stream using a package-level logger. This might be useful for debugging.

Types

type DirResult

type DirResult struct {
	Name    string         `json:"name" yaml:"Name"`
	Subdirs DirResults     `json:"subdirs,omitempty" yaml:"subdirs,omitempty"`
	Files   []FileResult   `json:"files,omitempty" yaml:"files,omitempty"`
	Summary map[string]int `json:"summary" yaml:"Summary"`
}

DirResult is a tree-like (thus recursive) data structure used to store the results of the count for all files and subdirectories that live under the directory it is associated with.

A DirResult contains the following fields:

- Name is the full name of the subdirectory it represents, as a string.

- Subdirs is a slice of DirResult. Each element in the slice represents the results of counting lines of code in a subdirectory under the directory associated with this DirResult.

- Files is a slice of FileResult. Each element in the slice represents the results of counting lines of code in a file living under the directory associated with this DirResult.

- Summary provides a summary of the results of the counting.
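
Since DirResult is recursive, consuming it usually means walking the tree. The sketch below redeclares the documented types locally (without the struct tags) so that it compiles on its own; walk and totalLoc are hypothetical helpers, and treating the Loc map keys as language names is an assumption, since the documentation does not say what the keys are.

```go
package main

import "fmt"

// Local copies of the documented types, so the sketch compiles
// without importing the package.
type FileResult struct {
	Name string
	Loc  map[string]int
}

type DirResult struct {
	Name    string
	Subdirs DirResults
	Files   []FileResult
	Summary map[string]int
}

type DirResults []DirResult

// walk calls visit for every file in d and, recursively, in its subdirectories.
func walk(d DirResult, visit func(dir string, f FileResult)) {
	for _, f := range d.Files {
		visit(d.Name, f)
	}
	for _, sub := range d.Subdirs {
		walk(sub, visit)
	}
}

// totalLoc sums every per-file count found under d.
func totalLoc(d DirResult) int {
	total := 0
	walk(d, func(_ string, f FileResult) {
		for _, n := range f.Loc {
			total += n
		}
	})
	return total
}

func main() {
	// Hand-built sample data; keys are assumed to be language names.
	root := DirResult{
		Name:  "src",
		Files: []FileResult{{Name: "main.go", Loc: map[string]int{"Go": 40}}},
		Subdirs: DirResults{{
			Name: "src/lib",
			Files: []FileResult{
				{Name: "a.go", Loc: map[string]int{"Go": 10}},
				{Name: "run.sh", Loc: map[string]int{"shell": 5}},
			},
		}},
	}
	fmt.Println(totalLoc(root)) // 55
}
```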

func CountLoc

func CountLoc(root string) DirResult

CountLoc is the main exported interface of the glocc package, meant to be called once for each top-level directory in which counting lines of code is needed. It returns a DirResult that contains the results of the counting.
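
Because CountLoc is called once per top-level directory, a caller with several roots has to combine the results itself. The following sketch shows one way to do that by adding up the Summary maps. To stay self-contained it uses a trimmed local copy of DirResult and takes the counting function as a parameter; with the real package one would drop the local type and pass glocc.CountLoc instead. mergeSummaries is a hypothetical helper, and the stub's numbers and map keys are made up.

```go
package main

import "fmt"

// DirResult is a trimmed local copy of the documented type,
// keeping only the fields this sketch needs.
type DirResult struct {
	Name    string
	Summary map[string]int
}

// mergeSummaries calls countLoc once per root and adds up the
// per-key totals of the returned summaries.
func mergeSummaries(roots []string, countLoc func(root string) DirResult) map[string]int {
	merged := make(map[string]int)
	for _, r := range roots {
		for k, n := range countLoc(r).Summary {
			merged[k] += n
		}
	}
	return merged
}

func main() {
	// Stub standing in for the package's CountLoc; the numbers are made up.
	stub := func(root string) DirResult {
		return DirResult{Name: root, Summary: map[string]int{"Go": 25, "Python": 7}}
	}
	fmt.Println(mergeSummaries([]string{"foo", "bar"}, stub)) // map[Go:50 Python:14]
}
```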

type DirResults

type DirResults []DirResult

DirResults is a slice of DirResult.

type FileResult

type FileResult struct {
	Name string         `json:"name" yaml:"Name"`
	Loc  map[string]int `json:"loc" yaml:"loc"`
}

FileResult is a simple data structure used to store the results of a single file's count. FileResult structs typically live inside DirResult structs.
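
The struct tags shown above determine how results are serialized. As a small illustration, the sketch below redeclares both result types locally with the documented tags and encodes a hand-built DirResult using the standard encoding/json package; the encode helper and the sample data are assumptions, and this is not necessarily how the command line tool produces its JSON output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented types, including their struct tags.
type FileResult struct {
	Name string         `json:"name" yaml:"Name"`
	Loc  map[string]int `json:"loc" yaml:"loc"`
}

type DirResult struct {
	Name    string         `json:"name" yaml:"Name"`
	Subdirs DirResults     `json:"subdirs,omitempty" yaml:"subdirs,omitempty"`
	Files   []FileResult   `json:"files,omitempty" yaml:"files,omitempty"`
	Summary map[string]int `json:"summary" yaml:"Summary"`
}

type DirResults []DirResult

// encode returns the compact JSON form of d.
func encode(d DirResult) (string, error) {
	b, err := json.Marshal(d)
	return string(b), err
}

func main() {
	d := DirResult{
		Name:    "src",
		Files:   []FileResult{{Name: "main.go", Loc: map[string]int{"Go": 40}}},
		Summary: map[string]int{"Go": 40},
	}
	s, err := encode(d)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	// Subdirs is empty, so the omitempty tag drops it from the output.
	fmt.Println(s) // {"name":"src","files":[{"name":"main.go","loc":{"Go":40}}],"summary":{"Go":40}}
}
```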

type LocCounter

type LocCounter struct {
	// contains filtered or unexported fields
}

LocCounter is the core entity of the package, which initiates and later holds the state of the counting for a single file. It is associated with the counting of a single file, and is created in the goroutine that is assigned to count that file.

func NewLocCounter

func NewLocCounter(file *os.File, ext string) (lc *LocCounter, err error)

NewLocCounter returns a new LocCounter, properly initialized to count the lines of code in a specific file of a specific language. Returns an error if a supported language cannot be detected.

func (*LocCounter) Count

func (lc *LocCounter) Count() (int, error)

Count is the only exported method of LocCounter. It basically reads (line by line) the content of the file associated with the LocCounter, and performs the counting. It is implemented using the State design pattern.
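
To give a feel for what a State-pattern line counter can look like, here is a deliberately simplified sketch for C-style comments only. It is not the package's implementation: the state type and the inCode, inComment and count functions are hypothetical, the input is a string rather than an *os.File, and comment markers inside string literals or after a // marker are not handled.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// state processes one line and reports whether it holds code,
// plus the state to use for the next line.
type state func(line string) (isCode bool, next state)

// inCode handles text that is outside of any block comment.
func inCode(line string) (bool, state) {
	s := strings.TrimSpace(line)
	switch {
	case s == "" || strings.HasPrefix(s, "//"):
		return false, inCode
	case strings.HasPrefix(s, "/*"):
		if i := strings.Index(s, "*/"); i >= 0 {
			// The comment closes on this line; code may follow it.
			return inCode(s[i+2:])
		}
		return false, inComment
	default:
		// Code first; a block comment may still open later on the line.
		if i := strings.LastIndex(s, "/*"); i >= 0 && !strings.Contains(s[i+2:], "*/") {
			return true, inComment
		}
		return true, inCode
	}
}

// inComment handles text inside an open block comment.
func inComment(line string) (bool, state) {
	if i := strings.Index(line, "*/"); i >= 0 {
		return inCode(line[i+2:])
	}
	return false, inComment
}

// count returns the number of lines in src that contain code.
func count(src string) int {
	n := 0
	var cur state = inCode
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		var code bool
		code, cur = cur(sc.Text())
		if code {
			n++
		}
	}
	return n
}

func main() {
	src := strings.Join([]string{
		"int main(void) {",
		"    // set up",
		"    int x = 1; /* start of a",
		"    block comment */",
		"    return x;",
		"}",
	}, "\n")
	fmt.Println(count(src)) // 4
}
```

Each state is a plain function that returns its successor, so supporting another comment style means adding states rather than growing one large conditional.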

Directories

Path        Synopsis
cmd/glocc   The glocc command line tool.
