vad

package
v0.0.0-...-a4649ec Latest
Warning

This package is not in the latest version of its module.

Published: Feb 1, 2024 License: MIT Imports: 7 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func VAD

func VAD(frame []float32, energyThresh, silenceThresh float32) (bool, float32, float32)

VAD performs voice activity detection on a frame of audio data.

NOTE: This is a very rough implementation. We should improve it :D
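The page does not describe how VAD classifies a frame or what its three return values mean. As a rough illustration of the general idea only, the sketch below shows a common energy-based approach: compute the root-mean-square amplitude of a frame and compare it to a threshold. The names `frameEnergy` and `isSpeech` are hypothetical and not part of this package, and the single threshold here is an assumption that does not correspond to `energyThresh` or `silenceThresh`.

```go
package main

import (
	"fmt"
	"math"
)

// frameEnergy returns the root-mean-square amplitude of a frame.
// Hypothetical helper for illustration; not part of the vad package.
func frameEnergy(frame []float32) float32 {
	if len(frame) == 0 {
		return 0
	}
	var sum float64
	for _, s := range frame {
		sum += float64(s) * float64(s)
	}
	return float32(math.Sqrt(sum / float64(len(frame))))
}

// isSpeech reports whether the frame's RMS energy exceeds thresh.
// This is an assumed decision rule, not the package's actual logic.
func isSpeech(frame []float32, thresh float32) bool {
	return frameEnergy(frame) > thresh
}

func main() {
	silence := make([]float32, 160) // all zeros
	loud := make([]float32, 160)
	for i := range loud {
		loud[i] = 0.5
	}
	fmt.Println(isSpeech(silence, 0.01), isSpeech(loud, 0.01))
}
```

A plain energy threshold like this tends to misfire on loud background noise, which may be the kind of roughness the note above has in mind.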

Types

type Agent

type Agent struct {
	// contains filtered or unexported fields
}

func New

func New(config Config) *Agent

func (*Agent) HandleEvent

func (a *Agent) HandleEvent(annot tracks.Event)

func (*Agent) Window

func (a *Agent) Window(name string) *Window

type Config

type Config struct {
	// SampleRate is determined by the hyperparameter configuration that whisper was trained on.
	// See more here: https://github.com/ggerganov/whisper.cpp/issues/909
	SampleRate int // = 16000 // 16kHz
	// sampleRateMs = SampleRate / 1000

	// SampleWindow determines how much audio we will be passing to whisper inference.
	// We will buffer up to (whisperSampleWindowMs - pcmSampleRateMs) of old audio and then add
	// audioSampleRateMs of new audio onto the end of the buffer for inference.
	SampleWindow time.Duration // = 24000 // 24 second sample window
}

type Window

type Window struct {
	// contains filtered or unexported fields
}

func (*Window) Push

func (w *Window) Push(pcm []float32, end tracks.Timestamp) (start tracks.Timestamp, ok bool)

Push pushes audio into the window. The boolean result (named ok in the signature, called edge in the source comment) is true on activity changes.
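Reporting a result only "on activity changes" is usually done by remembering the previous state and comparing it with the new one. The sketch below shows that pattern in isolation; `edgeDetector` and `update` are hypothetical names, and nothing here is taken from the unexported fields of Window.

```go
package main

import "fmt"

// edgeDetector remembers the last activity state it was given.
// Hypothetical type for illustration; not part of the vad package.
type edgeDetector struct {
	active bool
}

// update records the new state and reports whether it differs
// from the previously recorded one.
func (d *edgeDetector) update(active bool) (edge bool) {
	edge = active != d.active
	d.active = active
	return edge
}

func main() {
	var d edgeDetector // starts in the inactive state
	for _, active := range []bool{false, true, true, false} {
		fmt.Println(active, d.update(active))
	}
}
```

In this sketch an edge is reported both when activity starts and when it stops; whether Push does the same, and how the returned start timestamp relates to the edge, is not stated on this page.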
