go-astideepspeech: github.com/asticode/go-astideepspeech

package astideepspeech

import "github.com/asticode/go-astideepspeech"

Package Files

deepspeech.go

func PrintVersions

func PrintVersions()

PrintVersions prints the version of this library and of the linked TensorFlow library.

type Metadata

type Metadata C.struct_Metadata

Metadata represents a DeepSpeech metadata output

func (*Metadata) Close

func (m *Metadata) Close() error

Close frees the Metadata structure properly

func (*Metadata) Items

func (m *Metadata) Items() []MetadataItem

func (*Metadata) NumItems

func (m *Metadata) NumItems() int32

func (*Metadata) Probability

func (m *Metadata) Probability() float64

type MetadataItem

type MetadataItem C.struct_MetadataItem

func (*MetadataItem) Character

func (mi *MetadataItem) Character() string

func (*MetadataItem) StartTime

func (mi *MetadataItem) StartTime() float32

func (*MetadataItem) Timestep

func (mi *MetadataItem) Timestep() int
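The Character and StartTime accessors above return per-character results, so word-level timings have to be assembled by the caller. The following is a minimal sketch of one way to do that. It assumes the values have already been copied out of each MetadataItem into a plain struct, that a single-space character separates words, and that StartTime is expressed in seconds; `charTiming`, `word`, and `groupWords` are hypothetical names that are not part of this package.

```go
package main

import (
	"fmt"
	"strings"
)

// charTiming holds the values returned by MetadataItem.Character and
// MetadataItem.StartTime, copied out so this sketch needs no cgo types.
// It is a hypothetical type, not part of the package.
type charTiming struct {
	Character string
	StartTime float32
}

// word is a hypothetical result type: a run of characters together with
// the start time of its first character.
type word struct {
	Text      string
	StartTime float32
}

// groupWords splits the character sequence on single-space items
// (assumption: the alphabet uses " " as the word separator).
func groupWords(items []charTiming) []word {
	var words []word
	var b strings.Builder
	var start float32
	for _, it := range items {
		if it.Character == " " {
			if b.Len() > 0 {
				words = append(words, word{b.String(), start})
				b.Reset()
			}
			continue
		}
		if b.Len() == 0 {
			start = it.StartTime // first character of a new word
		}
		b.WriteString(it.Character)
	}
	if b.Len() > 0 {
		words = append(words, word{b.String(), start})
	}
	return words
}

func main() {
	items := []charTiming{
		{"h", 0.5}, {"i", 0.75}, {" ", 1.0}, {"y", 1.25}, {"o", 1.5}, {"u", 1.75},
	}
	for _, w := range groupWords(items) {
		fmt.Printf("%s starts at %.2fs\n", w.Text, w.StartTime)
	}
}
```

With real results, the `charTiming` slice would be filled by iterating over the slice returned by Items and calling Character and StartTime on each element.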

type Model

type Model struct {
    // contains filtered or unexported fields
}

Model represents a DeepSpeech model

func New

func New(modelPath string, nCep, nContext int, alphabetConfigPath string, beamWidth int) *Model

New creates a new Model

modelPath: The path to the frozen model graph.
nCep: The number of cepstrum the model was trained with.
nContext: The context window the model was trained with.
alphabetConfigPath: The path to the configuration file specifying the alphabet used by the network.
beamWidth: The beam width used by the decoder. A larger beam width generates better results at the cost of decoding time.

func (*Model) Close

func (m *Model) Close() error

Close closes the model properly

func (*Model) EnableDecoderWithLM

func (m *Model) EnableDecoderWithLM(alphabetConfigPath, lmPath, triePath string, lmWeight, validWordCountWeight float64)

EnableDecoderWithLM enables decoding using beam scoring with a KenLM language model.

alphabetConfigPath: The path to the configuration file specifying the alphabet used by the network.
lmPath: The path to the language model binary file.
triePath: The path to the trie file built from the same vocabulary as the language model binary.
lmWeight: The weight to give to language model results when scoring.
validWordCountWeight: The weight (bonus) to give to beams when adding a new valid word to the decoding.

func (*Model) SpeechToText

func (m *Model) SpeechToText(buffer []int16, bufferSize, sampleRate uint) string

SpeechToText uses the DeepSpeech model to perform Speech-To-Text.

buffer: A 16-bit, mono raw audio signal at the appropriate sample rate.
bufferSize: The number of samples in the audio signal.
sampleRate: The sample-rate of the audio signal.

func (*Model) SpeechToTextWithMetadata

func (m *Model) SpeechToTextWithMetadata(buffer []int16, bufferSize, sampleRate uint) *Metadata

SpeechToTextWithMetadata uses the DeepSpeech model to perform Speech-To-Text and returns the result as a Metadata value.

buffer: A 16-bit, mono raw audio signal at the appropriate sample rate.
bufferSize: The number of samples in the audio signal.
sampleRate: The sample-rate of the audio signal.
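Because bufferSize counts samples rather than bytes, raw PCM read from a file or socket has to be decoded into int16 values first. The sketch below illustrates that step under a few assumptions: the input is 16-bit little-endian mono PCM, and the 16000 Hz rate is only an example value. The `transcriber` interface is a local stand-in that copies the documented SpeechToText signature so the code compiles without the native library; `samplesFromPCM`, `transcribePCM`, and `fakeModel` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// transcriber mirrors the documented SpeechToText signature. A *Model from
// this package should satisfy it, but that is an assumption of the sketch.
type transcriber interface {
	SpeechToText(buffer []int16, bufferSize, sampleRate uint) string
}

// samplesFromPCM decodes 16-bit little-endian mono PCM bytes into samples.
// A trailing odd byte is ignored.
func samplesFromPCM(pcm []byte) []int16 {
	samples := make([]int16, len(pcm)/2)
	for i := range samples {
		samples[i] = int16(binary.LittleEndian.Uint16(pcm[2*i:]))
	}
	return samples
}

// transcribePCM passes the sample count, not the byte count, as bufferSize.
func transcribePCM(t transcriber, pcm []byte, sampleRate uint) string {
	samples := samplesFromPCM(pcm)
	return t.SpeechToText(samples, uint(len(samples)), sampleRate)
}

// fakeModel is a stand-in used only to make the sketch runnable.
type fakeModel struct{}

func (fakeModel) SpeechToText(buffer []int16, bufferSize, sampleRate uint) string {
	return fmt.Sprintf("%d samples at %d Hz", bufferSize, sampleRate)
}

func main() {
	pcm := []byte{0x01, 0x00, 0xff, 0xff, 0x00, 0x80}
	fmt.Println(samplesFromPCM(pcm))                    // [1 -1 -32768]
	fmt.Println(transcribePCM(fakeModel{}, pcm, 16000)) // 3 samples at 16000 Hz
}
```

In real use, `fakeModel{}` would be replaced by the value returned from New, and the sample rate should match the audio the model expects.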

type Stream

type Stream struct {
    // contains filtered or unexported fields
}

Stream represents a streaming state

func SetupStream

func SetupStream(mw *Model, preAllocFrames uint, sampleRate uint) *Stream

SetupStream creates a new audio stream

mw: The DeepSpeech model to use.
preAllocFrames: Number of timestep frames to reserve. One timestep is equivalent to two window lengths (20ms). If set to 0, enough frames are reserved for 3 seconds of audio (150).
sampleRate: The sample-rate of the audio signal.
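The relationship between audio duration and preAllocFrames follows from the 20ms timestep stated above: 3 seconds divided by 20ms gives the documented default of 150. A minimal sketch of that arithmetic is shown below; `framesFor` and `timestepDuration` are hypothetical names, and rounding up for a partial timestep is a choice made for this example, not something the package documents.

```go
package main

import (
	"fmt"
	"time"
)

// timestepDuration is the length of one timestep frame as stated in the
// SetupStream documentation (two window lengths, 20ms).
const timestepDuration = 20 * time.Millisecond

// framesFor is a hypothetical helper: it returns how many timestep frames
// cover d, rounding up so a partial frame still gets a slot.
func framesFor(d time.Duration) uint {
	if d <= 0 {
		return 0
	}
	return uint((d + timestepDuration - 1) / timestepDuration)
}

func main() {
	fmt.Println(framesFor(3 * time.Second))  // 150, the documented default
	fmt.Println(framesFor(10 * time.Second)) // 500
}
```

The result could be passed as the preAllocFrames argument when the expected length of the audio is known in advance.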

func (*Stream) DiscardStream

func (s *Stream) DiscardStream()

DiscardStream destroys a streaming state without decoding the computed logits. This can be used if you no longer need the result of an ongoing streaming inference and don't want to perform a costly decode operation.

func (*Stream) FeedAudioContent

func (s *Stream) FeedAudioContent(buffer []int16, bufferSize uint)

FeedAudioContent feeds audio samples to an ongoing streaming inference.

buffer: An array of 16-bit, mono raw audio samples at the appropriate sample rate.
bufferSize: The number of samples in buffer.

func (*Stream) FinishStream

func (s *Stream) FinishStream() string

FinishStream signals the end of an audio signal to an ongoing streaming inference and returns the STT result over the whole audio signal.

func (*Stream) FinishStreamWithMetadata

func (s *Stream) FinishStreamWithMetadata() *Metadata

FinishStreamWithMetadata signals the end of an audio signal to an ongoing streaming inference and returns extended metadata.

func (*Stream) IntermediateDecode

func (s *Stream) IntermediateDecode() string

IntermediateDecode computes the intermediate decoding of an ongoing streaming inference. This is an expensive process: the decoder implementation isn't currently capable of streaming, so it always starts from the beginning of the audio.
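Since intermediate decoding restarts from the beginning of the audio each time, a common pattern is to feed all chunks first and decode once with FinishStream. The sketch below illustrates that loop. The `audioStream` interface is a local stand-in that copies the documented FeedAudioContent and FinishStream signatures so the code compiles without the native library; `streamInChunks` and `fakeStream` are hypothetical names, and the chunk size of 4 is arbitrary.

```go
package main

import "fmt"

// audioStream mirrors the documented Stream methods used here. A *Stream
// from this package should satisfy it, but that is an assumption.
type audioStream interface {
	FeedAudioContent(buffer []int16, bufferSize uint)
	FinishStream() string
}

// streamInChunks feeds samples in slices of at most chunkSize and then
// finishes the stream, so decoding happens once at the end.
func streamInChunks(s audioStream, samples []int16, chunkSize int) string {
	if chunkSize <= 0 {
		chunkSize = len(samples) // fall back to a single chunk
	}
	for start := 0; start < len(samples); start += chunkSize {
		end := start + chunkSize
		if end > len(samples) {
			end = len(samples)
		}
		chunk := samples[start:end]
		s.FeedAudioContent(chunk, uint(len(chunk)))
	}
	return s.FinishStream()
}

// fakeStream records chunk sizes; it exists only to make the sketch runnable.
type fakeStream struct{ sizes []uint }

func (f *fakeStream) FeedAudioContent(buffer []int16, bufferSize uint) {
	f.sizes = append(f.sizes, bufferSize)
}

func (f *fakeStream) FinishStream() string {
	return fmt.Sprintf("fed %v", f.sizes)
}

func main() {
	samples := make([]int16, 10)
	fmt.Println(streamInChunks(&fakeStream{}, samples, 4)) // fed [4 4 2]
}
```

In real use, the stream would come from SetupStream, and DiscardStream would replace the FinishStream call when the result is no longer needed.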

Directories

Path	Synopsis
deepspeech

Package astideepspeech imports 2 packages and is imported by 1 package. Updated 2019-09-30.