converter

package
v0.0.0-...-262ae63 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 21, 2023 License: MIT Imports: 10 Imported by: 0

Documentation

Overview

Package converter is used to convert between symbol sets from different languages.

Each converter is defined in a .cnv file that contains the symbol set names and the conversion rules. A rule is either (1) a simple symbol mapping or (2) a regular expression rule (using the https://github.com/dlclark/regexp2 implementation).

Tests can also be added to the file to verify that the conversion works as intended. Fields on each line are separated by tabs.

Sample file for US English SAMPA to Swedish SAMPA (defined in test_data/enusampa_svsampa.cnv):

FROM	en-us_ws-sampa
TO	sv-se_ws-sampa

RE	^T	t

SYMBOL	dZ	d j
SYMBOL	tS	t rs
SYMBOL	i	I
SYMBOL	D	d
SYMBOL	T	t
SYMBOL	S	rs
SYMBOL	z	s
SYMBOL	Z	s
SYMBOL	w	v
SYMBOL	A	a
SYMBOL	u	U
SYMBOL	V	a
SYMBOL	r=	@ r
SYMBOL	aU	au
SYMBOL	OI	O j
SYMBOL	@U	u:
SYMBOL	EI	e j
SYMBOL	AI	a j
SYMBOL	'	"

TEST	T i s	t I s
TEST	D i s	d I s

For real world examples (used for unit tests), see the test_data folder: https://github.com/stts-se/symbolset/tree/master/test_data

To test a single .cnv file from the command line, use symbolset/converter/cmd/converter.
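
The tab-separated layout of the sample file above can be read with a few lines of standard-library code. The following is a minimal sketch, not the package's own loader: the cnvFile type and the parseCnv function are hypothetical names introduced for illustration, and the sketch assumes that every non-blank line starts with one of the keywords FROM, TO, SYMBOL, RE or TEST.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// cnvFile is a hypothetical container for the parsed contents of a .cnv file.
type cnvFile struct {
	From, To string
	Rules    [][3]string // rule type (SYMBOL or RE), input field, output field
	Tests    [][2]string // input transcription, expected output
}

// parseCnv reads tab-separated .cnv lines and skips blank lines.
// Assumption: only the five keywords shown in the sample file occur.
func parseCnv(text string) (cnvFile, error) {
	var res cnvFile
	sc := bufio.NewScanner(strings.NewReader(text))
	for n := 1; sc.Scan(); n++ {
		line := sc.Text()
		if strings.TrimSpace(line) == "" {
			continue
		}
		fs := strings.Split(line, "\t")
		switch {
		case fs[0] == "FROM" && len(fs) == 2:
			res.From = fs[1]
		case fs[0] == "TO" && len(fs) == 2:
			res.To = fs[1]
		case (fs[0] == "SYMBOL" || fs[0] == "RE") && len(fs) == 3:
			res.Rules = append(res.Rules, [3]string{fs[0], fs[1], fs[2]})
		case fs[0] == "TEST" && len(fs) == 3:
			res.Tests = append(res.Tests, [2]string{fs[1], fs[2]})
		default:
			return res, fmt.Errorf("line %d: cannot parse %q", n, line)
		}
	}
	return res, sc.Err()
}

func main() {
	src := "FROM\ten-us_ws-sampa\nTO\tsv-se_ws-sampa\n\n" +
		"RE\t^T\tt\nSYMBOL\tdZ\td j\nTEST\tT i s\tt I s\n"
	f, err := parseCnv(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(f.From, f.To, len(f.Rules), len(f.Tests))
}
```

Rules are kept in a slice rather than a map so that the order in which they appear in the file is preserved.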

Constants

This section is empty.

Variables

var Suffix = ".cnv"

Suffix defines the suffix string for converter files (.cnv)

Functions

func LoadFile

func LoadFile(symbolSets map[string]symbolset.SymbolSet, fName string) (Converter, TestResult, error)

LoadFile loads a converter file and runs the specified tests

func LoadFromDir

func LoadFromDir(symbolSets map[string]symbolset.SymbolSet, dirName string) (map[string]Converter, map[string]TestResult, error)

LoadFromDir loads all converters from the specified folder (all files with the .cnv extension)

Types

type Converter

type Converter struct {
	Name  string
	From  symbolset.SymbolSet
	To    symbolset.SymbolSet
	Rules []Rule
}

Converter is used to convert between symbol sets from different languages.

func (Converter) Convert

func (c Converter) Convert(trans string) (string, error)

Convert converts the input transcription string

func (Converter) Test

func (c Converter) Test(tests []test) (TestResult, error)

Test runs the input tests and returns a test result

type RegexpRule

type RegexpRule struct {
	From *regexp2.Regexp
	To   string
}

RegexpRule is used to convert from one symbol set to another using regular expressions

func (RegexpRule) Convert

func (r RegexpRule) Convert(trans string, symbolset symbolset.SymbolSet) (string, error)

Convert is used to execute the conversion for this rule

func (RegexpRule) FromString

func (r RegexpRule) FromString() string

FromString returns a string representation of the rule's input field

func (RegexpRule) String

func (r RegexpRule) String() string

String returns a tab separated string representation of the rule

func (RegexpRule) ToString

func (r RegexpRule) ToString() string

ToString returns a string representation of the rule's output field

func (RegexpRule) Type

func (r RegexpRule) Type() string

Type returns the rule type (SYMBOL or RE)

type Rule

type Rule interface {

	// FromString returns a string representation of the rule's input field
	FromString() string

	// ToString returns a string representation of the rule's output field
	ToString() string

	// Type returns the rule type (SYMBOL or RE)
	Type() string

	// Convert is used to execute the conversion for this rule
	Convert(trans string, symbolset symbolset.SymbolSet) (string, error)

	// String returns a tab separated string representation of the rule
	String() string
}

Rule is a rule interface for transcription converters

type SymbolRule

type SymbolRule struct {
	From string
	To   string
}

SymbolRule is a simple rule that maps from one phoneme symbol to another
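
The effect of a set of symbol rules can be sketched as a table lookup per symbol. The example below assumes that symbols in a transcription are separated by whitespace, as in the TEST lines of the sample file, and that symbols without a rule are passed through unchanged; the package itself takes a symbolset.SymbolSet argument, which this sketch does not model. The convertSymbols function is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// convertSymbols maps each whitespace-separated symbol in trans through
// table. Symbols that have no entry are kept as they are (an assumption).
func convertSymbols(trans string, table map[string]string) string {
	in := strings.Fields(trans)
	out := make([]string, 0, len(in))
	for _, s := range in {
		if to, ok := table[s]; ok {
			s = to
		}
		out = append(out, s)
	}
	return strings.Join(out, " ")
}

func main() {
	// A few of the SYMBOL rules from the sample file.
	table := map[string]string{"T": "t", "D": "d", "i": "I", "dZ": "d j"}
	fmt.Println(convertSymbols("D i s", table)) // prints "d I s"
}
```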

func (SymbolRule) Convert

func (r SymbolRule) Convert(trans string, symbolset symbolset.SymbolSet) (string, error)

Convert is used to execute the conversion for this rule

func (SymbolRule) FromString

func (r SymbolRule) FromString() string

FromString returns a string representation of the rule's input field

func (SymbolRule) String

func (r SymbolRule) String() string

String returns a tab separated string representation of the rule

func (SymbolRule) ToString

func (r SymbolRule) ToString() string

ToString returns a string representation of the rule's output field

func (SymbolRule) Type

func (r SymbolRule) Type() string

Type returns the rule type (SYMBOL or RE)

type TestResult

type TestResult struct {
	OK     bool
	Errors []string
}

TestResult is a container for test results
