espeak

Version: v2.0.0-...-97aeeb5 | Published: Nov 24, 2018 | License: GPL-3.0 | Imports: 8 | Imported by: 0

Repository

github.com/BenLubar/espeak

README ¶

espeak

Package espeak is a wrapper around espeak-ng that works both natively and in gopherjs with the same API. espeak-ng is an open source text-to-speech library that has over one hundred voices and languages and supports Speech Synthesis Markup Language (SSML).

To download this package:

go get -u gopkg.in/BenLubar/espeak.v2

Looking for an older version?

The original implementation of this package from 2015 is still available at gopkg.in/BenLubar/espeak.v1.

Special thanks

espeak-ng (text to speech)
emscripten (C to JavaScript)
gopherjs (Go to JavaScript)

Want to repurpose my code?

You may reuse any code in this repository for any purpose, with the exception of libespeak-ng.inc.js, which is a compiled version of GPLv3-licensed code from espeak-ng.

Compiled versions of this package use GPLv3 code and therefore must be used under a GPLv3-compatible license.

Documentation ¶

Overview ¶

Package espeak is a wrapper around espeak-ng that works both natively and in gopherjs with the same API. espeak-ng is an open source text-to-speech library that has over one hundred voices and languages and supports Speech Synthesis Markup Language (SSML).

Example (Ssml) ¶

const ssml = `<?xml version="1.0"?>
<speak version="1.1"
	xmlns="http://www.w3.org/2001/10/synthesis"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
		http://www.w3.org/TR/speech-synthesis11/synthesis.xsd"
	xml:lang="en-US">
	<!--
	Dialogue from Confessor's Stronghold, part of Guild Wars 2 Living World Season 3 Episode 1.
	Used for demonstration purposes only. Copyright 2016 ArenaNet LLC.
	-->
	<voice gender="male" languages="en:en-GB">
		<s>Nice work<sub alias="">,</sub> Commander. Now if I could persuade you to take care of the others...</s>
	</voice>
	<voice gender="female"><prosody pitch="-30%">
		<s>Interesting. <break/> Countering the magic of these bloodstones returns whatever magical properties it absorbed.</s>
		<s>I know a certain <prosody rate="85%">big-eared asura</prosody> who'd <emphasis level="strong">love</emphasis> to be here to study this...</s>
	</prosody></voice>
	<voice gender="female" variant="2">
		<s>I <prosody pitch="+65%">heard</prosody> <prosody pitch="+50%" range="-90%" rate="+50%">that!</prosody></s>
	</voice>
</speak>`

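// The zero value of a Context is ready to use.
var ctx espeak.Context
if err := ctx.SynthesizeText(ssml); err != nil {
	panic(err)
}

// Encode the synthesized samples as a WAV file.
f, err := os.Create("example-ssml.wav")
if err != nil {
	panic(err)
}
defer f.Close()

if _, err := ctx.WriteTo(f); err != nil {
	panic(err)
}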

Functions ¶

func SampleRate ¶

func SampleRate() int

SampleRate returns the number of samples per second in audio generated by this package.
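The sample rate is what connects a number of samples to a length of time. The sketch below shows the arithmetic only. It assumes single-channel (mono) audio, and audioDuration is a hypothetical helper that is not part of this package. The rate 22050 is an illustrative value; real code would pass the result of SampleRate() and len(ctx.Samples).

```go
package main

import (
	"fmt"
	"time"
)

// audioDuration converts a sample count into a playback duration for
// single-channel audio at the given sample rate (samples per second).
// Hypothetical helper, not part of the espeak package.
func audioDuration(numSamples, sampleRate int) time.Duration {
	return time.Duration(numSamples) * time.Second / time.Duration(sampleRate)
}

func main() {
	// 22050 is an assumed rate used only for illustration.
	fmt.Println(audioDuration(33075, 22050)) // prints "1.5s"
}
```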

Types ¶

type Context ¶

type Context struct {
	// Samples is a slice of audio samples in PCM format. Use the WriteTo method on the context to
	// encode Samples as a wav file.
	Samples []int16
	// Events are generated along with Samples and contain information about placement of words and
	// sentences, which may be useful, for example, when generating real time subtitles.
	Events []*SynthEvent
	// contains filtered or unexported fields
}

Context contains the current state of text to speech data. Multiple Contexts may exist simultaneously, but each Context should only be accessed from one goroutine at a time. The zero value of a Context is empty with default values for rate, volume, pitch, and pitch range.

func (*Context) Pitch ¶

func (ctx *Context) Pitch() int

Pitch returns the highness or lowness of the voice.

The default pitch for the voice is represented by 50. Higher numbers are higher pitch.

func (*Context) Range ¶

func (ctx *Context) Range() int

Range returns the pitch range of speech.

The default range is 50. A range of 0 produces a monotone voice.

func (*Context) Rate ¶

func (ctx *Context) Rate() int

Rate returns the current speed of speech in words per minute.

The default rate is 175 words per minute.

func (*Context) SetPitch ¶

func (ctx *Context) SetPitch(pitch int)

SetPitch changes the highness or lowness of the voice for future SynthesizeText calls.

Allowed values range from 0 (very low) to 100 (very high), with the original pitch for the voice being 50.

func (*Context) SetRange ¶

func (ctx *Context) SetRange(tone int)

SetRange changes the pitch range of the voice for future SynthesizeText calls.

Allowed values range from 0 (monotone) to 100 (sing-songy), with the original range for the voice being 50.

func (*Context) SetRate ¶

func (ctx *Context) SetRate(wpm int)

SetRate changes the speed of speech for future SynthesizeText calls to the given number of words per minute.

The number of words per minute must be between 80 and 450, inclusive.
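The documentation does not say what happens when a value outside this range is passed, so a caller taking the rate from user input may want to clamp it first. The following is a minimal sketch of that idea. clampRate and the two constants are hypothetical names, not part of this package; the result would then be passed to SetRate.

```go
package main

import "fmt"

// Bounds documented for SetRate, in words per minute.
const (
	minRate = 80
	maxRate = 450
)

// clampRate forces wpm into the documented range.
// Hypothetical helper, not part of the espeak package.
func clampRate(wpm int) int {
	if wpm < minRate {
		return minRate
	}
	if wpm > maxRate {
		return maxRate
	}
	return wpm
}

func main() {
	for _, wpm := range []int{40, 175, 900} {
		fmt.Println(wpm, "->", clampRate(wpm))
	}
}
```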

func (*Context) SetVoice ¶

func (ctx *Context) SetVoice(name string) error

SetVoice sets a voice by name.

func (*Context) SetVoiceProperties ¶

func (ctx *Context) SetVoiceProperties(name, language string, gender Gender, age, variant uint8) error

SetVoiceProperties sets the voice for future calls to SynthesizeText. Any or all of the arguments can be set to their zero values, in which case they will be ignored. Variant differentiates between multiple voices if more than one voice is matched by the other arguments.

func (*Context) SetVolume ¶

func (ctx *Context) SetVolume(percentage int)

SetVolume changes the loudness of the voice for future SynthesizeText calls to a percentage of the default.

The percentage must not be negative. Percentages over 100 may cause distortion or clipping.
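To see why large percentages can clip, consider what happens when a 16-bit sample is scaled past the int16 limits. The sketch below is an illustration of that effect only; it is not how espeak-ng applies volume internally, and scaleSample is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math"
)

// scaleSample multiplies a 16-bit PCM sample by percentage/100 and saturates
// the result at the int16 limits. Saturation is what is heard as clipping.
// Hypothetical helper for illustration, not part of the espeak package.
func scaleSample(s int16, percentage int) int16 {
	v := int(s) * percentage / 100
	if v > math.MaxInt16 {
		return math.MaxInt16
	}
	if v < math.MinInt16 {
		return math.MinInt16
	}
	return int16(v)
}

func main() {
	fmt.Println(scaleSample(1000, 50))   // quieter, no clipping
	fmt.Println(scaleSample(20000, 200)) // exceeds the int16 range, so it is clipped
}
```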

func (*Context) SynthesizeText ¶

func (ctx *Context) SynthesizeText(text string) error

SynthesizeText converts the given text to speech.

Some SSML tags are accepted. All other XML tags are ignored.

func (*Context) Volume ¶

func (ctx *Context) Volume() int

Volume returns the current loudness of speech as a percent of the default volume.

func (*Context) WriteTo ¶

func (ctx *Context) WriteTo(w io.Writer) (int64, error)

WriteTo writes the Samples in this Context to an io.Writer in WAV format.
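For readers unfamiliar with the format, the sketch below shows what a WAV encoding of 16-bit PCM samples generally looks like: a 44-byte RIFF header followed by the little-endian sample data. This is not the package's implementation. It assumes single-channel audio, and wavHeader and encodeWAV are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// wavHeader is the canonical 44-byte header for uncompressed PCM audio.
type wavHeader struct {
	RIFF          [4]byte // "RIFF"
	Size          uint32  // bytes remaining after this field
	WAVE          [4]byte // "WAVE"
	Fmt           [4]byte // "fmt "
	FmtSize       uint32  // 16 for PCM
	Format        uint16  // 1 = PCM
	Channels      uint16
	SampleRate    uint32
	ByteRate      uint32 // SampleRate * BlockAlign
	BlockAlign    uint16 // bytes per sample frame
	BitsPerSample uint16
	Data          [4]byte // "data"
	DataSize      uint32
}

// encodeWAV returns samples encoded as a mono 16-bit WAV stream.
// Hypothetical helper, not part of the espeak package.
func encodeWAV(samples []int16, sampleRate uint32) ([]byte, error) {
	const bytesPerSample = 2
	dataSize := uint32(len(samples)) * bytesPerSample
	hdr := wavHeader{
		RIFF:          [4]byte{'R', 'I', 'F', 'F'},
		Size:          36 + dataSize,
		WAVE:          [4]byte{'W', 'A', 'V', 'E'},
		Fmt:           [4]byte{'f', 'm', 't', ' '},
		FmtSize:       16,
		Format:        1,
		Channels:      1,
		SampleRate:    sampleRate,
		ByteRate:      sampleRate * bytesPerSample,
		BlockAlign:    bytesPerSample,
		BitsPerSample: 16,
		Data:          [4]byte{'d', 'a', 't', 'a'},
		DataSize:      dataSize,
	}

	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, hdr); err != nil {
		return nil, err
	}
	if err := binary.Write(&buf, binary.LittleEndian, samples); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// 22050 is an assumed rate used only for illustration.
	b, err := encodeWAV([]int16{0, 1, -1}, 22050)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(b)) // 44-byte header plus 3 samples of 2 bytes each
}
```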

type Error ¶

type Error struct {
	Code    uint32 // Code associated with this error type in the espeak-ng C API.
	Message string // Message intended to be read by humans.
}

Error is the error type from espeak-ng.

func (*Error) Error ¶

func (err *Error) Error() string

Error implements the error interface.
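Because Error is returned as an ordinary error value, callers that need the numeric code can unwrap it with errors.As from the standard library. The sketch below uses a local stand-in type, espeakError, that mirrors the documented fields so it runs without espeak-ng installed; failingCall, codeOf, and the code and message values are invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// espeakError is a stand-in with the same documented fields as espeak.Error.
type espeakError struct {
	Code    uint32
	Message string
}

func (e *espeakError) Error() string { return e.Message }

// failingCall is a hypothetical operation that returns a wrapped error.
// The code and message are made up for this example.
func failingCall() error {
	return fmt.Errorf("synthesize: %w", &espeakError{Code: 1, Message: "example failure"})
}

// codeOf extracts the numeric code if err wraps an *espeakError.
func codeOf(err error) (uint32, bool) {
	var e *espeakError
	if errors.As(err, &e) {
		return e.Code, true
	}
	return 0, false
}

func main() {
	if code, ok := codeOf(failingCall()); ok {
		fmt.Println("error code:", code)
	}
}
```

With the real package, the same pattern would target *espeak.Error instead of the stand-in type.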

type Gender ¶

type Gender uint8

Gender of a voice.

const (
	Unknown Gender = 0
	Male    Gender = 1
	Female  Gender = 2
	Neutral Gender = 3
)

Voice genders

type Language ¶

type Language struct {
	// Priority of the voice for this language. A low number indicates a more preferred voice, and
	// a higher number indicates a less preferred voice.
	Priority uint8

	// The name of the language, which may be in BCP47 format, but is not required to be.
	Name string
}

Language supported by a voice.
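One way to use Priority is to pick, for a given language name, the voice with the lowest priority number. The sketch below works on local stand-in types (language and voice) that mirror only the documented fields, so it runs without espeak-ng. preferredVoice and the voice names are hypothetical, and exact string matching on the language name is an assumption.

```go
package main

import "fmt"

// language and voice are stand-ins that mirror the documented fields of
// espeak.Language and espeak.Voice.
type language struct {
	Priority uint8
	Name     string
}

type voice struct {
	Name      string
	Languages []language
}

// preferredVoice returns the name of the voice with the lowest priority
// number for lang, and false if no voice lists that language.
func preferredVoice(voices []voice, lang string) (string, bool) {
	best, bestPriority, found := "", 0, false
	for _, v := range voices {
		for _, l := range v.Languages {
			if l.Name != lang {
				continue
			}
			if !found || int(l.Priority) < bestPriority {
				best, bestPriority, found = v.Name, int(l.Priority), true
			}
		}
	}
	return best, found
}

func main() {
	// Invented data for illustration.
	voices := []voice{
		{Name: "voice-a", Languages: []language{{Priority: 5, Name: "en"}}},
		{Name: "voice-b", Languages: []language{{Priority: 2, Name: "en"}, {Priority: 1, Name: "fr"}}},
	}
	name, ok := preferredVoice(voices, "en")
	fmt.Println(name, ok)
}
```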

type SynthEvent ¶

type SynthEvent struct {
	// Type of the event.
	Type SynthEventType

	// TextPosition in characters from the start of the string. Unlike Go indexes, this starts at 1.
	TextPosition int

	// Length of the word, in characters. (for EventWord)
	Length int

	// AudioPosition is the time within the generated speech output data.
	AudioPosition time.Duration

	Number  int    // Number is used for EventWord and EventSentence
	Name    string // Name is used for EventMark and EventPlay
	Phoneme string // Phoneme is used for EventPhoneme
}

SynthEvent gives additional information about the generated speech.
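As an illustration of the subtitle use case, the sketch below turns word events into timestamped cues. It uses a local stand-in type, synthEvent, with the documented fields so that it runs without espeak-ng. It assumes that "characters" means Unicode code points (runes), which the documentation does not state, and it converts the 1-based TextPosition to a 0-based Go index. wordCues is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

type eventType uint8

// eventWord has the same value as the documented EventWord constant.
const eventWord eventType = 1

// synthEvent is a stand-in with a subset of the documented SynthEvent fields.
type synthEvent struct {
	Type          eventType
	TextPosition  int // 1-based, in characters
	Length        int
	AudioPosition time.Duration
}

// wordCues returns one "timestamp word" line per word event.
// Events whose span falls outside the text are skipped.
func wordCues(text string, events []synthEvent) []string {
	runes := []rune(text) // assumption: positions count runes, not bytes
	var cues []string
	for _, ev := range events {
		if ev.Type != eventWord {
			continue
		}
		start := ev.TextPosition - 1 // convert to a 0-based index
		end := start + ev.Length
		if start < 0 || end > len(runes) {
			continue
		}
		cues = append(cues, fmt.Sprintf("%v %s", ev.AudioPosition, string(runes[start:end])))
	}
	return cues
}

func main() {
	// Invented events for illustration.
	events := []synthEvent{
		{Type: eventWord, TextPosition: 1, Length: 5},
		{Type: eventWord, TextPosition: 7, Length: 5, AudioPosition: 400 * time.Millisecond},
	}
	for _, cue := range wordCues("Hello world", events) {
		fmt.Println(cue)
	}
}
```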

type SynthEventType ¶

type SynthEventType uint8

SynthEventType is the type of a SynthEvent.

const (
	// EventWord is the start of a word.
	EventWord SynthEventType = 1

	// EventSentence is the start of a sentence.
	EventSentence SynthEventType = 2

	// EventMark is a <mark/> element in SSML.
	EventMark SynthEventType = 3

	// EventPlay is an <audio/> element in SSML.
	EventPlay SynthEventType = 4

	// EventEnd is the end of a sentence or clause.
	EventEnd SynthEventType = 5

	// EventMsgTerminated is the end of the synthesized message.
	EventMsgTerminated SynthEventType = 6

	// EventPhoneme is emitted for each phoneme if enabled.
	EventPhoneme SynthEventType = 7
)

type Voice ¶

type Voice struct {
	// Name for this voice (unique)
	Name string

	// Languages and priorities. Lower numbers mean this voice is more likely to be used for the language.
	Languages []Language

	// Identifier is the filename for this voice within espeak-ng-data/voices.
	Identifier string

	// Gender of voice.
	Gender Gender

	// Age in years, or 0 if not specified.
	Age uint8
}

Voice is a voice supported by espeak.

func ListVoices ¶

func ListVoices() []*Voice

ListVoices returns the complete list of voices supported by espeak. The returned slice is not shared, and callers may modify it without any side effects.
