Documentation ¶
Overview ¶
Package goEagi of vosk.go provides a simplified interface for calling Vosk Server's speech-to-text service. It gives callers the flexibility to set their desired configuration.
Index ¶
- func ComputeAmplitude(sample []byte) (float64, error)
- func GenerateAudio(sample []byte, audioDirectory string, audioName string) (string, error)
- func StreamAudio(ctx context.Context) <-chan AudioResult
- type AudioResult
- type Eagi
- type GoogleResult
- type GoogleService
- type GoogleTTS
- type Vad
- type VadResult
- type VoskResult
- type VoskService
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ComputeAmplitude ¶
ComputeAmplitude analyzes the amplitude of a sample slice of bytes.
func GenerateAudio ¶
GenerateAudio writes a sample slice of bytes into an audio file. It returns the path of the audio file, derived from the audioDirectory and audioName parameters. Note that only the wav extension is supported.
func StreamAudio ¶
func StreamAudio(ctx context.Context) <-chan AudioResult
StreamAudio launches a new goroutine for audio streaming via file descriptor 3.
Types ¶
type AudioResult ¶
type GoogleResult ¶
type GoogleResult struct {
	Result *speechpb.StreamingRecognitionResponse
	Error error
	Reinitialized bool
	ReinitializedInfo string
}
GoogleResult is a struct that contains the transcription result from the Google Speech to Text service.
type GoogleService ¶
GoogleService is used to stream audio data to the Google Speech to Text service.
func NewGoogleService ¶
func NewGoogleService(privateKeyPath string, languageCode string, speechContext []string) (*GoogleService, error)
NewGoogleService creates a new GoogleService instance. It takes a privateKeyPath, which it sets in the environment under the key GOOGLE_APPLICATION_CREDENTIALS; a languageCode, for example one of ["en-GB", "en-US", "ch", ...] (see https://cloud.google.com/speech-to-text/docs/languages); and a speech context (see https://cloud.google.com/speech-to-text/docs/speech-adaptation).
func (*GoogleService) ReinitializeClient ¶
func (g *GoogleService) ReinitializeClient() error
ReinitializeClient reinitializes the Google client.
func (*GoogleService) SpeechToTextResponse ¶
func (g *GoogleService) SpeechToTextResponse(ctx context.Context) <-chan GoogleResult
SpeechToTextResponse returns a channel that delivers the transcription responses from Google's SpeechToText.
func (*GoogleService) StartStreaming ¶
func (g *GoogleService) StartStreaming(ctx context.Context, stream <-chan []byte) <-chan error
StartStreaming takes a receive-only channel of audio data and sends the data as gRPC requests to the Google service through the initialized client.
type GoogleTTS ¶
func NewGoogleTTS ¶
type Vad ¶
type Vad struct {
	AmplitudeDetectionThreshold float64
}
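The Vad type is not described further here. As a rough illustration of how an amplitude threshold might be used for voice activity detection, the sketch below marks a frame as speech when its amplitude exceeds the threshold. The thresholdVad type, the isSpeech method, and the threshold value of 500 are all hypothetical.

```go
package main

import "fmt"

// thresholdVad is a hypothetical detector that mirrors the single
// documented field of Vad.
type thresholdVad struct {
	AmplitudeDetectionThreshold float64
}

// isSpeech reports whether a frame's amplitude exceeds the threshold.
func (v thresholdVad) isSpeech(amplitude float64) bool {
	return amplitude > v.AmplitudeDetectionThreshold
}

func main() {
	v := thresholdVad{AmplitudeDetectionThreshold: 500}
	for _, amp := range []float64{120, 480, 900} {
		fmt.Println(amp, v.isSpeech(amp))
	}
}
```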
type VoskResult ¶
type VoskResult struct {
	Result []struct {
		Conf float64
		End float64
		Start float64
		Word string
	}
	Text string
	Partial string
}
VoskResult is the response from Vosk Speech Recognizer.
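As a sketch of how a response of this shape could be decoded, the example below unmarshals two sample JSON messages into a local copy of the struct. It assumes the server sends JSON with lowercase keys (conf, end, start, word, text, partial); Go's encoding/json matches keys to exported field names case-insensitively, so no struct tags are needed. The sample messages and the names voskResult and decode are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// voskResult is a local copy of the documented VoskResult shape.
type voskResult struct {
	Result []struct {
		Conf  float64
		End   float64
		Start float64
		Word  string
	}
	Text    string
	Partial string
}

// decode parses one JSON message into a voskResult.
func decode(msg []byte) (voskResult, error) {
	var r voskResult
	err := json.Unmarshal(msg, &r)
	return r, err
}

func main() {
	// Illustrative messages, not captured from a real server.
	partial := []byte(`{"partial":"hel"}`)
	final := []byte(`{"result":[{"conf":1.0,"end":1.02,"start":0.36,"word":"hello"}],"text":"hello"}`)

	for _, msg := range [][]byte{partial, final} {
		r, err := decode(msg)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("partial=%q text=%q words=%d\n", r.Partial, r.Text, len(r.Result))
	}
}
```

Under this assumption, an interim message populates only Partial, while a final message populates Text and, when word-level output is enabled, Result.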
type VoskService ¶
type VoskService struct {
	PhraseList []string `json:"phrase_list"`
	Words bool `json:"words"`
	Client *websocket.Conn `json:"-"`
	// contains filtered or unexported fields
}
VoskService is the client for Vosk Speech Recognizer.
func NewVoskService ¶
func NewVoskService(host string, port string, phraseList []string) (*VoskService, error)
NewVoskService creates a new VoskService.
func (*VoskService) Close ¶
func (v *VoskService) Close() error
Close closes the websocket connection to the Vosk service.
func (*VoskService) SpeechToTextResponse ¶
func (v *VoskService) SpeechToTextResponse(ctx context.Context) <-chan VoskResult
SpeechToTextResponse returns a channel that delivers the transcription responses from Vosk's SpeechToText.
func (*VoskService) StartStreaming ¶
func (v *VoskService) StartStreaming(ctx context.Context, stream <-chan []byte) <-chan error
StartStreaming starts streaming to the Vosk speech-to-text service. It takes a receive-only channel of audio data and sends the data as websocket binary messages to the Vosk service.