gobergamot

v0.1.0
Published: Feb 15, 2024 | License: Apache-2.0 | Imports: 15 | Imported by: 0

README

gobergamot

A local text translator for Go (for i18n), built on the Bergamot Translator project compiled to WebAssembly with Emscripten and run via Wazero.

Usage example

Using a single Translator to translate Spanish text into English (ctx is a context.Context and handleError stands for your own error handling):

modelFile, err := os.Open("model.esen.intgemm.alphas.bin")
handleError(err)
defer modelFile.Close()
shortlistFile, err := os.Open("lex.50.50.esen.s2t.bin")
handleError(err)
defer shortlistFile.Close()
vocabularyFile, err := os.Open("vocab.esen.spm")
handleError(err)
defer vocabularyFile.Close()

cfg := gobergamot.Config{FilesBundle: gobergamot.FilesBundle{
	Model:            modelFile,
	LexicalShortlist: shortlistFile,
	Vocabulary:       vocabularyFile,
}}
translator, err := gobergamot.New(ctx, cfg)
handleError(err)

englishText, err := translator.Translate(ctx, gobergamot.TranslationRequest{Text: "¡Hola, Mundo!"})
handleError(err)

// Hello, World
fmt.Println(englishText)

// Release resources associated with translator and WASM module
handleError(translator.Close(ctx))

Using a pool of Translators to translate concurrently:

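// FilesBundle is a promoted field of the embedded Config, so it has to be
// set through Config in a composite literal.
cfg := gobergamot.PoolConfig{
  Config:   gobergamot.Config{FilesBundle: filesBundle},
  PoolSize: 5,
}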

pool, err := gobergamot.NewPool(ctx, cfg)
handleError(err)

translatedText, err := pool.Translate(ctx, gobergamot.TranslationRequest{Text: originalText})
handleError(err)

// releasing pool resources
handleError(pool.Close(ctx))

Where do I find files for models, shortlists and vocabularies?

Files for many languages are available at Firefox translation models.

How do I recompile WebAssembly Bergamot module?

There is a Makefile target for this: make recompile-bergamot.

Acknowledgements

Thanks to the Bergamot Project for the awesome idea of local translation.

Thanks to Danlock for the brilliant Gogosseract project, which served as an example of binding together C++ projects, WebAssembly and Go.

Thanks to Jerbob for the great wazero-emscripten-embind tool, which connects Wazero and Embind.

Thanks to Tetrate Labs for Wazero, a zero-dependency WebAssembly implementation in Go.

And, finally, thanks to Eliah for giving me the opportunity to make this project!

Documentation

Variables

var (
	ErrModelMissing            = errors.New("model is required")
	ErrVocabularyMissing       = errors.New("vocabulary is required")
	ErrLexicalShortlistMissing = errors.New("lexical shortlist is required")
)
var ErrClosed = errors.New("pool closed")

Functions

func DefaultBergamotOptions

func DefaultBergamotOptions() map[string]any

DefaultBergamotOptions returns the default options for the WASM Bergamot translator worker, matching those used in https://github.com/browsermt/bergamot-translator/blob/v0.4.5/wasm/node-test.js#L66
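
Because the defaults come back as a plain map[string]any, individual keys can be overridden before the map is assigned to Config.BergamotOptions. The sketch below is one way to do that without mutating the original map. withOverrides is a hypothetical helper, and the keys beam-size and quiet are used only as illustrative marian-decoder option names; the actual default set is not reproduced here:

```go
package main

import "fmt"

// withOverrides returns a copy of base with every key in overrides applied.
// The base map is left unmodified. Hypothetical helper, not part of gobergamot.
func withOverrides(base, overrides map[string]any) map[string]any {
	merged := make(map[string]any, len(base)+len(overrides))
	for k, v := range base {
		merged[k] = v
	}
	for k, v := range overrides {
		merged[k] = v
	}
	return merged
}

func main() {
	// Stand-in for the map returned by DefaultBergamotOptions(); these
	// values are assumptions chosen for illustration.
	defaults := map[string]any{"beam-size": 1, "quiet": true}

	opts := withOverrides(defaults, map[string]any{"beam-size": 2})
	fmt.Println(opts["beam-size"], defaults["beam-size"])
}
```

With the real package, the result would be assigned to the BergamotOptions field of Config.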

Types

type Config

type Config struct {
	wasm.CompileConfig

	// From Bergamot sources:
	//
	// Size in History items to be stored in the cache. A value of 0 means no caching. Loosely corresponds to sentences
	// to cache in the real world. Note that cache has a random-eviction policy. The peak storage at full occupancy is
	// controlled by this parameter. However, whether we attain full occupancy or not is controlled by random factors -
	// specifically how uniformly the hash distributes.
	CacheSize uint

	// Data to load into translator
	FilesBundle

	// From Bergamot sources:
	//
	// Equivalent to options based constructor, where `options` is parsed from string configuration. Configuration can be
	// JSON or YAML. Keys expected correspond to those of `marian-decoder`, available at
	// https://marian-nmt.github.io/docs/cmd/marian-decoder/
	BergamotOptions map[string]any

	WASMCache wazero.CompilationCache

	// WASMUseContext defines if WASM functions execution must be canceled upon context.Context cancellation.
	// Equivalent to wazero.RuntimeConfig WithCloseOnContextDone method parameter.
	WASMUseContext bool
}

func (Config) Validate

func (cfg Config) Validate() error

type FilesBundle

type FilesBundle struct {
	// Byte array of model. Required
	Model io.Reader
	// Byte array of shortlist. Required
	LexicalShortlist io.Reader
	// Byte array of vocabulary to translate between source and target languages. Required
	Vocabulary io.Reader
}

type Pool

type Pool struct {
	// contains filtered or unexported fields
}

func NewPool

func NewPool(ctx context.Context, cfg PoolConfig) (*Pool, error)

NewPool compiles Translator instances and runs them as workers.

func (*Pool) Close

func (p *Pool) Close(ctx context.Context) error

Close closes the existing Translator instances and waits for them to finish.

func (*Pool) Translate

func (p *Pool) Translate(ctx context.Context, request TranslationRequest) (string, error)

Translate is similar to Translator.Translate except the request is asynchronously given to any free worker in the pool.

func (*Pool) TranslateMultiple

func (p *Pool) TranslateMultiple(ctx context.Context, requests ...TranslationRequest) ([]string, error)

TranslateMultiple is similar to Translator.TranslateMultiple except the requests are asynchronously given to any free worker in the pool.

type PoolConfig

type PoolConfig struct {
	Config
	PoolSize uint
}

func (PoolConfig) Validate

func (cfg PoolConfig) Validate() error

type TranslationOptions

type TranslationOptions struct {
	// HTML defines if the Translator should remove HTML tags from text and insert them in output.
	HTML bool
}

TranslationOptions are equivalent to ResponseOptions in Bergamot. From sources:

ResponseOptions dictate how to construct a Response for an input string of text to be translated.

type TranslationRequest

type TranslationRequest struct {
	// Text to be translated
	Text string

	// Options for translation
	Options TranslationOptions
}

type Translator

type Translator struct {
	// contains filtered or unexported fields
}

Translator represents a Bergamot translator worker in Go.

func New

func New(ctx context.Context, cfg Config) (*Translator, error)

New compiles the Bergamot module and creates the TranslationModel and BlockingService instances used by the Translator.

func (*Translator) Close

func (t *Translator) Close(ctx context.Context) error

Close deletes the created objects and stops the WASM runtime.

func (*Translator) Translate

func (t *Translator) Translate(ctx context.Context, request TranslationRequest) (string, error)

Translate translates the text provided in the request into the model's target language.

func (*Translator) TranslateMultiple

func (t *Translator) TranslateMultiple(ctx context.Context, requests ...TranslationRequest) ([]string, error)

TranslateMultiple translates a batch of texts provided in the requests into the model's target language.
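
Since TranslateMultiple is variadic over TranslationRequest, a batch is usually built from a slice of strings first. The sketch below shows one way to do that with local stand-in types mirroring the documented structs. newRequests is a hypothetical helper, not part of the package:

```go
package main

import "fmt"

// TranslationOptions and TranslationRequest are local stand-ins mirroring
// the documented structs.
type TranslationOptions struct {
	HTML bool
}

type TranslationRequest struct {
	Text    string
	Options TranslationOptions
}

// newRequests wraps each text in a TranslationRequest that shares the same
// HTML setting. Hypothetical helper for illustration.
func newRequests(html bool, texts ...string) []TranslationRequest {
	reqs := make([]TranslationRequest, 0, len(texts))
	for _, t := range texts {
		reqs = append(reqs, TranslationRequest{
			Text:    t,
			Options: TranslationOptions{HTML: html},
		})
	}
	return reqs
}

func main() {
	reqs := newRequests(true, "<p>Hola</p>", "<b>Mundo</b>")
	// With the real package this slice would be passed as
	// translator.TranslateMultiple(ctx, reqs...).
	fmt.Println(len(reqs), reqs[0].Options.HTML)
}
```

Whether the returned strings follow the order of the requests is not stated in the documentation, so it is worth verifying before pairing results with inputs by index.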

Directories

Path          Synopsis
internal/gen  Code generated by wazero-emscripten-embind, DO NOT EDIT.
