Documentation ¶
Overview ¶
Package converter is used to convert between symbol sets from different languages.
Each converter is defined in a .cnv file that contains the symbol set names and the conversion rules. A rule is either (1) a simple symbol mapping or (2) a regular expression rule (using the https://github.com/dlclark/regexp2 implementation).
Tests can also be added to the file to verify how the conversion works. Fields are separated by tabs.
Sample file for US English Sampa to Swedish Sampa (defined in test_data/enusampa_svsampa.cnv):
FROM	en-us_ws-sampa
TO	sv-se_ws-sampa
RE	^T	t
SYMBOL	dZ	d j
SYMBOL	tS	t rs
SYMBOL	i	I
SYMBOL	D	d
SYMBOL	T	t
SYMBOL	S	rs
SYMBOL	z	s
SYMBOL	Z	s
SYMBOL	w	v
SYMBOL	A	a
SYMBOL	u	U
SYMBOL	V	a
SYMBOL	r=	@ r
SYMBOL	aU	au
SYMBOL	OI	O j
SYMBOL	@U	u:
SYMBOL	EI	e j
SYMBOL	AI	a j
SYMBOL	'	"
TEST	T i s	t I s
TEST	D i s	d I s
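To make the file layout concrete, here is a minimal sketch of reading such content line by line and splitting each line on tabs. It is not the package's own loader: the names cnvLine and parseCnv are hypothetical, and the handling of blank lines and unknown keywords is an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// cnvLine is a hypothetical holder for one parsed line of a .cnv file.
type cnvLine struct {
	Kind   string   // FROM, TO, RE, SYMBOL or TEST
	Fields []string // the remaining tab-separated fields
}

// parseCnv splits .cnv content into lines and tab-separated fields.
// Assumption: blank lines are skipped and unknown keywords are errors.
func parseCnv(content string) ([]cnvLine, error) {
	var res []cnvLine
	sc := bufio.NewScanner(strings.NewReader(content))
	for n := 1; sc.Scan(); n++ {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		fs := strings.Split(line, "\t")
		switch fs[0] {
		case "FROM", "TO", "RE", "SYMBOL", "TEST":
			res = append(res, cnvLine{Kind: fs[0], Fields: fs[1:]})
		default:
			return nil, fmt.Errorf("line %d: unknown keyword %q", n, fs[0])
		}
	}
	return res, sc.Err()
}

func main() {
	content := "FROM\ten-us_ws-sampa\nTO\tsv-se_ws-sampa\n" +
		"RE\t^T\tt\nSYMBOL\tdZ\td j\nTEST\tT i s\tt I s\n"
	lines, err := parseCnv(content)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range lines {
		fmt.Printf("%s %q\n", l.Kind, l.Fields)
	}
}
```

Splitting on tabs rather than on any whitespace matters here, because a single field such as "d j" or "T i s" contains spaces between symbols.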
For real world examples (used for unit tests), see the test_data folder: https://github.com/stts-se/symbolset/tree/master/test_data
To test a single .cnv file from the command line, use symbolset/converter/cmd/converter.
Index ¶
- Variables
- func LoadFile(symbolSets map[string]symbolset.SymbolSet, fName string) (Converter, TestResult, error)
- func LoadFromDir(symbolSets map[string]symbolset.SymbolSet, dirName string) (map[string]Converter, map[string]TestResult, error)
- type Converter
- type RegexpRule
- type Rule
- type SymbolRule
- type TestResult
Constants ¶
This section is empty.
Variables ¶
var Suffix = ".cnv"
Suffix defines the suffix string for converter files (.cnv)
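As an illustration of how a suffix like this can be used, the sketch below picks converter files out of a list of file names. The constant suffix mirrors the documented value, while converterFiles is a hypothetical helper and not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// suffix mirrors the documented value of the package's Suffix variable.
const suffix = ".cnv"

// converterFiles is a hypothetical helper that keeps only the names
// ending in the converter suffix.
func converterFiles(names []string) []string {
	var res []string
	for _, n := range names {
		if strings.HasSuffix(n, suffix) {
			res = append(res, n)
		}
	}
	return res
}

func main() {
	names := []string{"enusampa_svsampa.cnv", "README.md", "notes.txt"}
	fmt.Println(converterFiles(names))
}
```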
Functions ¶
Types ¶
type Converter ¶
Converter is used to convert between symbol sets from different languages.
func (Converter) Test ¶
func (c Converter) Test(tests []test) (TestResult, error)
Test runs the input tests and returns a test result
type RegexpRule ¶
RegexpRule is used to convert from one symbol set to another using regular expressions
func (RegexpRule) FromString ¶
func (r RegexpRule) FromString() string
FromString returns a string representation of the rule's input field
func (RegexpRule) String ¶
func (r RegexpRule) String() string
String returns a tab separated string representation of the rule
func (RegexpRule) ToString ¶
func (r RegexpRule) ToString() string
ToString returns a string representation of the rule's output field
func (RegexpRule) Type ¶
func (r RegexpRule) Type() string
Type returns the rule type (SYMBOL or RE)
type Rule ¶
type Rule interface {
	// FromString returns a string representation of the rule's input field
	FromString() string

	// ToString returns a string representation of the rule's output field
	ToString() string

	// Type returns the rule type (SYMBOL or RE)
	Type() string

	// Convert is used to execute the conversion for this rule
	Convert(trans string, symbolset symbolset.SymbolSet) (string, error)

	// String returns a tab separated string representation of the rule
	String() string
}
Rule is a rule interface for transcription converters
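To show what satisfying such an interface looks like, the sketch below defines a trimmed copy of it and one implementing type. Both miniRule and mapRule are hypothetical. Convert is left out because it depends on symbolset.SymbolSet, and the field order produced by String (type, input, output) is an assumption based on the .cnv line layout.

```go
package main

import "fmt"

// miniRule is a trimmed, hypothetical copy of the Rule interface.
// Convert is omitted because it requires symbolset.SymbolSet.
type miniRule interface {
	FromString() string
	ToString() string
	Type() string
	String() string
}

// mapRule is a hypothetical rule mapping one symbol to an output string.
type mapRule struct{ from, to string }

func (r mapRule) FromString() string { return r.from }
func (r mapRule) ToString() string   { return r.to }
func (r mapRule) Type() string       { return "SYMBOL" }

// String joins type, input and output with tabs (assumed field order).
func (r mapRule) String() string {
	return r.Type() + "\t" + r.FromString() + "\t" + r.ToString()
}

func main() {
	var r miniRule = mapRule{from: "dZ", to: "d j"}
	fmt.Printf("%q\n", r.String())
}
```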
type SymbolRule ¶
SymbolRule is a simple rule that maps from one phoneme symbol to another
func (SymbolRule) FromString ¶
func (r SymbolRule) FromString() string
FromString returns a string representation of the rule's input field
func (SymbolRule) String ¶
func (r SymbolRule) String() string
String returns a tab separated string representation of the rule
func (SymbolRule) ToString ¶
func (r SymbolRule) ToString() string
ToString returns a string representation of the rule's output field
func (SymbolRule) Type ¶
func (r SymbolRule) Type() string
Type returns the rule type (SYMBOL or RE)
type TestResult ¶
TestResult is a container for test results.