Documentation ¶
Overview ¶
Package espeak is a wrapper around espeak-ng that works both natively and in GopherJS with the same API. espeak-ng is an open source text-to-speech library that has over one hundred voices and languages and supports Speech Synthesis Markup Language (SSML).
Example (Ssml) ¶
const ssml = `<?xml version="1.0"?>
<speak version="1.1"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis11/synthesis.xsd"
       xml:lang="en-US">
	<!--
		Dialogue from Confessor's Stronghold, part of Guild Wars 2 Living World Season 3 Episode 1.
		Used for demonstration purposes only. Copyright 2016 ArenaNet LLC.
	-->
	<voice gender="male" languages="en:en-GB">
		<s>Nice work<sub alias="">,</sub> Commander. Now if I could persuade you to take care of the others...</s>
	</voice>
	<voice gender="female"><prosody pitch="-30%">
		<s>Interesting. <break/> Countering the magic of these bloodstones returns whatever magical properties it absorbed.</s>
		<s>I know a certain <prosody rate="85%">big-eared asura</prosody> who'd <emphasis level="strong">love</emphasis> to be here to study this...</s>
	</prosody></voice>
	<voice gender="female" variant="2">
		<s>I <prosody pitch="+65%">heard</prosody> <prosody pitch="+50%" range="-90%" rate="+50%">that!</prosody></s>
	</voice>
</speak>`

var ctx espeak.Context
ctx.SynthesizeText(ssml)
f, _ := os.Create("example-ssml.wav")
defer f.Close()
ctx.WriteTo(f)
Index ¶
- func SampleRate() int
- type Context
- func (ctx *Context) Pitch() int
- func (ctx *Context) Range() int
- func (ctx *Context) Rate() int
- func (ctx *Context) SetPitch(pitch int)
- func (ctx *Context) SetRange(tone int)
- func (ctx *Context) SetRate(wpm int)
- func (ctx *Context) SetVoice(name string) error
- func (ctx *Context) SetVoiceProperties(name, language string, gender Gender, age, variant uint8) error
- func (ctx *Context) SetVolume(percentage int)
- func (ctx *Context) SynthesizeText(text string) error
- func (ctx *Context) Volume() int
- func (ctx *Context) WriteTo(w io.Writer) (int64, error)
- type Error
- type Gender
- type Language
- type SynthEvent
- type SynthEventType
- type Voice
- func ListVoices() []*Voice
Functions ¶
func SampleRate ¶
func SampleRate() int
SampleRate returns the number of samples per second in audio generated by this package.
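Because the sample rate is expressed in samples per second, the playing time of a buffer follows from its length. The sketch below assumes mono audio (one sample per frame); the helper name audioDuration and the rate 22050 are placeholders for illustration, not values taken from this package.

```go
package main

import (
	"fmt"
	"time"
)

// audioDuration returns how long n mono samples play at the given
// sample rate (samples per second). It returns 0 for a non-positive rate.
func audioDuration(n, sampleRate int) time.Duration {
	if sampleRate <= 0 {
		return 0
	}
	return time.Duration(n) * time.Second / time.Duration(sampleRate)
}

func main() {
	// 22050 is a placeholder; in real code pass the value reported by SampleRate.
	fmt.Println(audioDuration(33075, 22050))
}
```

In practice the two arguments would be the length of Context.Samples and the result of SampleRate.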
Types ¶
type Context ¶
type Context struct {
	// Samples is a slice of audio samples in PCM format. Use the WriteTo method on the context to
	// encode Samples as a wav file.
	Samples []int16

	// Events are generated along with Samples and contain information about placement of words and
	// sentences, which may be useful, for example, when generating real time subtitles.
	Events []*SynthEvent
	// contains filtered or unexported fields
}
Context contains the current state of text-to-speech data. Multiple Contexts may exist simultaneously, but each Context should only be accessed from one goroutine at a time. The zero value of a Context is empty with default values for rate, volume, pitch, and range.
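To make the relationship between PCM samples and a wav file concrete, here is a minimal, independent sketch of wrapping 16-bit samples in a canonical 44-byte RIFF/WAVE header. It is not this package's WriteTo implementation; the function name encodeWAV, the mono channel count, and the sample rate used in main are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodeWAV wraps mono 16-bit PCM samples in a 44-byte RIFF/WAVE header.
// Writes to a bytes.Buffer cannot fail, so their errors are ignored.
func encodeWAV(samples []int16, sampleRate int) []byte {
	const (
		channels      = 1 // assumption: mono audio
		bitsPerSample = 16
		blockAlign    = channels * bitsPerSample / 8
	)
	dataLen := uint32(len(samples) * blockAlign)

	var buf bytes.Buffer
	le := binary.LittleEndian
	buf.WriteString("RIFF")
	binary.Write(&buf, le, uint32(36)+dataLen) // size of everything after this field
	buf.WriteString("WAVE")
	buf.WriteString("fmt ")
	binary.Write(&buf, le, uint32(16)) // size of the fmt chunk
	binary.Write(&buf, le, uint16(1))  // audio format 1 = linear PCM
	binary.Write(&buf, le, uint16(channels))
	binary.Write(&buf, le, uint32(sampleRate))
	binary.Write(&buf, le, uint32(sampleRate*blockAlign)) // bytes per second
	binary.Write(&buf, le, uint16(blockAlign))
	binary.Write(&buf, le, uint16(bitsPerSample))
	buf.WriteString("data")
	binary.Write(&buf, le, dataLen)
	binary.Write(&buf, le, samples) // little-endian sample data
	return buf.Bytes()
}

func main() {
	// The samples and the rate 22050 are placeholder values.
	wav := encodeWAV([]int16{0, 1000, -1000}, 22050)
	fmt.Println(len(wav), string(wav[:4]))
}
```

When working with a real Context, calling WriteTo is the simpler route; the sketch only shows what such an encoding involves.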
func (*Context) Pitch ¶
func (ctx *Context) Pitch() int
Pitch returns the highness or lowness of the voice.
The default pitch for the voice is represented by 50. Higher numbers mean a higher pitch.
func (*Context) Range ¶
func (ctx *Context) Range() int
Range returns the pitch range of speech.
The default range is 50. A range of 0 produces a monotone voice.
func (*Context) Rate ¶
func (ctx *Context) Rate() int
Rate returns the current speed of speech in words per minute.
The default rate is 175 words per minute.
func (*Context) SetPitch ¶
func (ctx *Context) SetPitch(pitch int)
SetPitch changes the highness or lowness of the voice for future SynthesizeText calls.
Allowed values range from 0 (very low) to 100 (very high), with the original pitch for the voice being 50.
func (*Context) SetRange ¶
func (ctx *Context) SetRange(tone int)
SetRange changes the pitch range of the voice for future SynthesizeText calls.
Allowed values range from 0 (monotone) to 100 (sing-songy), with the original range for the voice being 50.
func (*Context) SetRate ¶
func (ctx *Context) SetRate(wpm int)
SetRate changes the speed of speech for future SynthesizeText calls to the given number of words per minute.
The number of words per minute must be between 80 and 450, inclusive.
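The source does not say what happens when a value outside this range is passed, so a caller handling untrusted input may want to clamp it first. The following is a small sketch of that idea; clampRate, minRate, and maxRate are hypothetical names, with the bounds copied from the sentence above.

```go
package main

import "fmt"

// Bounds taken from the SetRate documentation (inclusive).
const (
	minRate = 80
	maxRate = 450
)

// clampRate forces wpm into the documented [minRate, maxRate] interval.
func clampRate(wpm int) int {
	if wpm < minRate {
		return minRate
	}
	if wpm > maxRate {
		return maxRate
	}
	return wpm
}

func main() {
	for _, wpm := range []int{10, 175, 1000} {
		fmt.Println(wpm, "->", clampRate(wpm))
	}
}
```

The clamped value could then be handed to SetRate. The same pattern applies to the 0 to 100 interval documented for SetPitch and SetRange.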
func (*Context) SetVoiceProperties ¶
func (ctx *Context) SetVoiceProperties(name, language string, gender Gender, age, variant uint8) error
SetVoiceProperties sets the voice for future calls to SynthesizeText. Any or all of the arguments can be set to their zero values, in which case they will be ignored. Variant differentiates between multiple voices if more than one voice is matched by the other arguments.
func (*Context) SetVolume ¶
func (ctx *Context) SetVolume(percentage int)
SetVolume changes the loudness of the voice for future SynthesizeText calls to a percentage of the default.
The percentage must not be negative. Percentages over 100 may cause distortion or clipping.
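The clipping warning follows from the sample format: a 16-bit sample cannot exceed 32767 or fall below -32768, so amplifying an already loud sample has to saturate. The sketch below illustrates the effect in general terms; it is not a description of how espeak-ng applies volume internally, and scaleSample is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math"
)

// scaleSample multiplies a 16-bit sample by percent/100 and saturates
// the result at the limits of int16, which is what clipping means.
func scaleSample(s int16, percent int) int16 {
	v := int(s) * percent / 100
	if v > math.MaxInt16 {
		return math.MaxInt16
	}
	if v < math.MinInt16 {
		return math.MinInt16
	}
	return int16(v)
}

func main() {
	// A quiet sample scales cleanly; a loud one hits the ceiling.
	fmt.Println(scaleSample(1000, 150), scaleSample(30000, 150))
}
```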
func (*Context) SynthesizeText ¶
func (ctx *Context) SynthesizeText(text string) error
SynthesizeText converts the given text to speech.
Some SSML tags are accepted. All other XML tags are ignored.
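Since markup in the input is interpreted, plain text containing characters such as < or & may not be spoken as written. One cautious approach, sketched here under that assumption, is to escape user-supplied text with the standard library before embedding it in an SSML document. The helper names escapeForSSML and wrapInSpeak are hypothetical; the speak attributes are copied from the package example.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// escapeForSSML replaces XML special characters with entities so the
// text is treated as character data rather than markup.
func escapeForSSML(text string) string {
	var buf bytes.Buffer
	// EscapeText only fails if the writer fails; bytes.Buffer does not.
	_ = xml.EscapeText(&buf, []byte(text))
	return buf.String()
}

// wrapInSpeak embeds escaped text in a minimal SSML document.
func wrapInSpeak(text string) string {
	return `<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">` +
		escapeForSSML(text) + `</speak>`
}

func main() {
	fmt.Println(wrapInSpeak("1 < 2 & 3"))
}
```

The resulting string could then be passed to SynthesizeText.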
type Error ¶
type Error struct {
	Code    uint32 // Code associated with this error type in the espeak-ng C API.
	Message string // Message intended to be read by humans.
}
Error is the error type from espeak-ng.
type Language ¶
type Language struct {
	// Priority of the voice for this language. A low number indicates a more preferred voice, and
	// a higher number indicates a less preferred voice.
	Priority uint8
	// The name of the language, which may be in BCP47 format, but is not required to be.
	Name string
}
Language supported by a voice.
type SynthEvent ¶
type SynthEvent struct {
	// Type of the event.
	Type SynthEventType
	// TextPosition in characters from the start of the string. Unlike Go indexes, this starts at 1.
	TextPosition int
	// Length of the word, in characters. (for EventWord)
	Length int
	// AudioPosition is the time within the generated speech output data.
	AudioPosition time.Duration

	Number  int    // Number is used for EventWord and EventSentence
	Name    string // Name is used for EventMark and EventPlay
	Phoneme string // Phoneme is used for EventPhoneme
}
SynthEvent gives additional information about the generated speech.
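Because TextPosition starts at 1 and is measured in characters, recovering the word for a word event takes a small conversion. The sketch below assumes that "characters" means Unicode code points rather than bytes, which the source does not confirm; wordAt is a hypothetical helper that takes the TextPosition and Length values of an event.

```go
package main

import "fmt"

// wordAt returns the substring of text that starts at the 1-based
// character position textPosition and spans length characters.
// It reports false if the span falls outside the text.
func wordAt(text string, textPosition, length int) (string, bool) {
	runes := []rune(text) // assumption: positions count code points
	start := textPosition - 1
	end := start + length
	if start < 0 || length < 0 || end > len(runes) {
		return "", false
	}
	return string(runes[start:end]), true
}

func main() {
	word, ok := wordAt("héllo wörld", 7, 5)
	fmt.Println(word, ok)
}
```

If the positions turn out to be byte offsets, slicing the string directly with the same 1-based adjustment would be the equivalent.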
type SynthEventType ¶
type SynthEventType uint8
SynthEventType is the type of a SynthEvent.
const (
	// EventWord is the start of a word.
	EventWord SynthEventType = 1
	// EventSentence is the start of a sentence.
	EventSentence SynthEventType = 2
	// EventMark is a <mark/> element in SSML.
	EventMark SynthEventType = 3
	// EventPlay is an <audio/> element in SSML.
	EventPlay SynthEventType = 4
	// EventEnd is the end of a sentence or clause.
	EventEnd SynthEventType = 5
	// EventMsgTerminated is the end of the synthesized message.
	EventMsgTerminated SynthEventType = 6
	// EventPhoneme is emitted for each phoneme if enabled.
	EventPhoneme SynthEventType = 7
)
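When logging or debugging events, readable names are more useful than raw numbers. The sketch below mirrors the constants above in a local type so that it runs without the package; whether the package itself provides a String method is not stated in the source, so synthEventType and its method are stand-ins for illustration only.

```go
package main

import "fmt"

// synthEventType is a local stand-in with the same values as the
// SynthEventType constants documented above.
type synthEventType uint8

const (
	eventWord          synthEventType = 1
	eventSentence      synthEventType = 2
	eventMark          synthEventType = 3
	eventPlay          synthEventType = 4
	eventEnd           synthEventType = 5
	eventMsgTerminated synthEventType = 6
	eventPhoneme       synthEventType = 7
)

// String returns the documented constant name for known values and a
// numeric fallback for anything else.
func (t synthEventType) String() string {
	switch t {
	case eventWord:
		return "EventWord"
	case eventSentence:
		return "EventSentence"
	case eventMark:
		return "EventMark"
	case eventPlay:
		return "EventPlay"
	case eventEnd:
		return "EventEnd"
	case eventMsgTerminated:
		return "EventMsgTerminated"
	case eventPhoneme:
		return "EventPhoneme"
	}
	return fmt.Sprintf("SynthEventType(%d)", uint8(t))
}

func main() {
	fmt.Println(eventWord, eventMark, synthEventType(42))
}
```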
type Voice ¶
type Voice struct {
	// Name for this voice (unique)
	Name string
	// Languages and priorities. Lower numbers mean this voice is more likely to be used for the language.
	Languages []Language
	// Identifier is the filename for this voice within espeak-ng-data/voices.
	Identifier string
	// Gender of voice.
	Gender Gender
	// Age in years, or 0 if not specified.
	Age uint8
}
Voice is a voice supported by espeak.
func ListVoices ¶
func ListVoices() []*Voice
ListVoices returns the complete list of voices supported by espeak. The returned slice is not shared, and callers may modify it without any side effects.
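The priority rule described for Language can be used to pick a voice from such a list. The sketch below uses local stand-in types with only the relevant fields so that it runs on its own; bestVoice is a hypothetical helper, the exact-match comparison on the language name is an assumption, and the voice names in main are made-up sample data.

```go
package main

import "fmt"

// language and voice mirror the relevant fields of Language and Voice.
type language struct {
	Priority uint8
	Name     string
}

type voice struct {
	Name      string
	Languages []language
}

// bestVoice returns the voice whose entry for lang has the lowest
// priority number, or nil if no voice lists that language.
func bestVoice(voices []*voice, lang string) *voice {
	var best *voice
	bestPriority := 256 // larger than any uint8 priority
	for _, v := range voices {
		for _, l := range v.Languages {
			if l.Name == lang && int(l.Priority) < bestPriority {
				best, bestPriority = v, int(l.Priority)
			}
		}
	}
	return best
}

func main() {
	// Hypothetical sample data, not real espeak-ng voices.
	voices := []*voice{
		{Name: "sample-a", Languages: []language{{Priority: 5, Name: "en"}}},
		{Name: "sample-b", Languages: []language{{Priority: 2, Name: "en"}}},
	}
	if v := bestVoice(voices, "en"); v != nil {
		fmt.Println(v.Name)
	}
}
```

With the real package, the slice would come from ListVoices and the chosen name could be passed to SetVoice.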