spellingcorrector

Version: v1.0.1
Published: Mar 23, 2022 License: GPL-3.0 Imports: 6 Imported by: 0

README

Spelling Corrector

A spelling corrector for the Spanish language that can also be trained on a language of your own.

The solution is based on the approach described by Peter Norvig at http://norvig.com/spell-correct.html, with some ideas from https://cxwangyi.wordpress.com/2012/02/15/peter-norvigs-spelling-corrector-in-go/ as well.

The built-in dictionary was trained on Spanish text.

Try it

Try it right away with Docker. The commands below expose the service on port 8080 of the host.

docker pull jorelosorio/spellingcorrector:latest

docker run --name spellingcorrector -d -p 8080:80 -t jorelosorio/spellingcorrector:latest

Try it using the following example:

http://localhost:8080/spelling?word=espanol
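
The same request can also be sent from Go. The following is a minimal sketch, assuming the container started above is listening on localhost:8080. The response format is not described here, so the body is printed as received. spellingURL is a hypothetical helper, not part of this package.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"time"
)

// spellingURL is a hypothetical helper that builds the request URL and
// query-escapes the word, so accented input such as "español" is sent safely.
func spellingURL(base, word string) string {
	q := url.Values{}
	q.Set("word", word)
	return base + "/spelling?" + q.Encode()
}

func main() {
	// Assumes the Docker container is running and mapped to port 8080.
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(spellingURL("http://localhost:8080", "espanol"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read failed:", err)
		return
	}
	// Print the status line and the raw body returned by the service.
	fmt.Println(resp.Status, string(body))
}
```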

Tools

Development

This project contains a Dockerfile with all the dependencies required to run it using Visual Studio Code and the Remote - Containers extension. To run it locally on your development machine instead, follow the instructions below.

Install Go

Install it from https://go.dev/dl/

Build the Example/Service

Make sure port 80 is free. The port can optionally be changed in the code.

go build -o ./bin/ ./examples/service.go

Then run the service

./bin/service ./dictionaries/es.dic

Example of correction

A simple usage example of the Correction method.

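package main

import (
	"fmt"
	"log"

	sc "github.com/jorelosorio/spellingcorrector"
)

func main() {
	// Load the dictionary; NewSpelling returns an error if it cannot be read.
	spelling, err := sc.NewSpelling("{YOUR_PATH_TO_DICTIONARY}")
	if err != nil {
		log.Fatal(err)
	}
	// Print the most probable correction for the misspelled word.
	correctedWord := spelling.Correction("espanol")
	fmt.Println(correctedWord)
}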

NewSpelling returns (*Spelling, error); make sure to handle the error when creating a new object.

Training

Most of the training was done using free editions of books in Spanish. To train a dictionary for a new language, use the following functions:

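package main

import (
	"log"

	sc "github.com/jorelosorio/spellingcorrector"
)

func main() {
	// Create a new dictionary file for the chosen alphabet (or sc.ENAlphabet).
	dic, err := sc.NewDictionary("{YOUR_PATH_TO_DICTIONARY}", sc.ESAlphabet)
	if err != nil {
		log.Fatal(err)
	}
	// Train the dictionary with the words found in a text file.
	if err := dic.TrainFromTextFile("{YOUR_INPUT_TEXT}"); err != nil {
		log.Fatal(err)
	}
}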

Call TrainFromTextFile as many times as you wish with different inputs.

NewDictionary returns (*Dictionary, error); make sure to handle the error when creating a new dictionary.

Build Docker

To build the Docker image, use .dockers/Dockerfile.deploy and the following command:

docker build -f Dockerfile.deploy -t jorelosorio/spellingcorrector:latest .

To run the Docker image:

docker run --name spellingcorrector -d -p 8080:80 -t jorelosorio/spellingcorrector:latest

Test the spelling corrector running in the Docker container:

http://localhost:8080/spelling?word=espanol

Documentation

Constants

const (
	// ESAlphabet represents the alphabet of the Spanish vocabulary.
	ESAlphabet = "abcdefghijklmnopqrstuvwxyzñáéíóúü"
	// ENAlphabet represents the alphabet of the English vocabulary.
	ENAlphabet = "abcdefghijklmnopqrstuvwxyz"
)
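
In Norvig's approach, which this project cites as its basis, the alphabet drives candidate generation: every string one edit away from the input is produced by deleting, transposing, replacing, or inserting a letter. The sketch below illustrates that idea and is not taken from this package's source. edits1 and the local esAlphabet copy are hypothetical names. It works on runes because letters such as "ñ" take more than one byte in UTF-8.

```go
package main

import "fmt"

// esAlphabet is a local copy of the package's ESAlphabet constant.
const esAlphabet = "abcdefghijklmnopqrstuvwxyzñáéíóúü"

// edits1 returns the set of strings that are one edit away from word.
// Hypothetical helper following Norvig's description, not this package's code.
func edits1(word, alphabet string) map[string]struct{} {
	w := []rune(word)
	out := make(map[string]struct{})
	for i := 0; i <= len(w); i++ {
		left, right := string(w[:i]), w[i:]
		if len(right) > 0 {
			// Deletion: drop the first rune of the right part.
			out[left+string(right[1:])] = struct{}{}
		}
		if len(right) > 1 {
			// Transposition: swap the first two runes of the right part.
			out[left+string(right[1])+string(right[0])+string(right[2:])] = struct{}{}
		}
		for _, c := range alphabet {
			if len(right) > 0 {
				// Replacement: substitute the first rune of the right part.
				out[left+string(c)+string(right[1:])] = struct{}{}
			}
			// Insertion: add a letter between the left and right parts.
			out[left+string(c)+string(right)] = struct{}{}
		}
	}
	return out
}

func main() {
	candidates := edits1("espanol", esAlphabet)
	_, ok := candidates["español"]
	fmt.Println(ok) // true: replacing "n" with "ñ" is a single edit
}
```

With ENAlphabet the same call would not produce "español", since "ñ" is not among its letters.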

Variables

This section is empty.

Functions

This section is empty.

Types

type Dictionary

type Dictionary struct {
	// Alphabet of the dictionary.
	Alphabet string
	// Words is a map that contains the word and the frequency number on texts,
	// it will help to calculate the most probable correction.
	Words map[string]int
	// contains filtered or unexported fields
}

Dictionary is the main structure of the algorithm; it contains the alphabet and the words.
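
The frequency counts in Words are what make one candidate more probable than another. The sketch below shows one way such a choice could be made over a plain map[string]int. mostFrequent is a hypothetical helper, the counts are made up, and the package's actual selection logic may differ.

```go
package main

import "fmt"

// mostFrequent returns the candidate with the highest positive count in words.
// Candidates missing from the map are skipped; ok is false if none is known.
// On a tie, the candidate that appears first in the slice wins.
func mostFrequent(candidates []string, words map[string]int) (best string, ok bool) {
	bestCount := 0
	for _, c := range candidates {
		if n, found := words[c]; found && n > bestCount {
			best, bestCount, ok = c, n, true
		}
	}
	return best, ok
}

func main() {
	// Made-up frequencies, standing in for Dictionary.Words.
	words := map[string]int{"español": 120, "espanto": 7}

	best, ok := mostFrequent([]string{"espanto", "español", "espanxl"}, words)
	fmt.Println(best, ok) // español true
}
```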

func LoadDictionary

func LoadDictionary(filePath string) (*Dictionary, error)

LoadDictionary loads a dictionary from the specified file path. It returns a new Dictionary structure and any read error encountered.

func NewDictionary

func NewDictionary(filePath string, alphabet string) (*Dictionary, error)

NewDictionary creates a new dictionary file at the specified path with the given alphabet. It returns a new Dictionary structure and any write error encountered.

func (*Dictionary) TrainFromTextFile

func (d *Dictionary) TrainFromTextFile(textFilePath string) error

TrainFromTextFile reads all the words found in the text file at the specified path and uses them to train the dictionary. It returns any read/write errors encountered.

type Spelling

type Spelling struct {
	// contains filtered or unexported fields
}

Spelling contains a dictionary. The main purpose of this struct is to provide features that require processing the dictionary data.

func NewSpelling

func NewSpelling(dicFilePath string) (*Spelling, error)

NewSpelling creates a new Spelling structure that wraps a dictionary; its parameter is the file path of the required dictionary. It returns the new Spelling structure and any read error encountered.

func (*Spelling) Correction

func (s *Spelling) Correction(word string) string

Correction selects the best possible correction for the specified word. It returns the correction if there was one.
