gospell

command module
v0.0.0-...-b2127f6 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 12, 2016 License: MIT Imports: 13 Imported by: 0

README

******************************** WORK IN PROGRESS ****************************
******************************************************************************

This project attempts to automate spell checking for comments in Go code. Most
modern languages rely on comments for documentation, and I was curious to see
to what degree this can be automated. It's split into a few packages:

1) Package "check" contains generic logic for spell checking given an alphabet
   and dictionary. It works as a standalone spell checking package.
2) Package "lang" store dictionary data and currently only supports
   English(US).
3) Package "scrape" has logic to scrape data from Merriam-Webster's online
   dictionary.
4) Package "main" builds a binary to run predefined or tunable spell checkers
   against specific files or recursively on a directory. Java, C, C++, and
   Scala files are also supported, with the default being Go.

So what's considered a misspelling?

Since comments are often code expressions and not valid grammar, it's not
rational to simply check each space-delimited string against a dictionary.
The default behavior classifies a misspelled word if it:

    - has at least 5 characters
        and
            - differs by 1 character insertion
            or
            - differs by 1 character deletion
            or
            - differs by a single consecutive character swap

To run the classifier on a Go project:

    $ ./gospell .

There's minimal tuning support. To classify against words that:

    - are at least 4 characters long
    and
    - differ by at most 2 insertions:

Try:

    $ ./gospell -ml=4 -mi=2 .

This is a decent start. Results often need pruning by a human eye. It may be
worth exploring the following features:

      -- the frequency of a misspelled word wherein some threshold
         declassifies the misspelling
      -- adapt the insertion, deletion, and swap restrictions
         based on the size of the word,
         so longer words can differ by more changes.

Documentation

The Go Gopher

There is no documentation for this package.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL