livecaption

command

Version: v0.0.0-...-8bfd5b2 | Published: Feb 2, 2021 | License: Apache-2.0

Repository

github.com/shinfan/google-dca-test

README

Google Cloud Speech API Go example

Authentication

Create a project with the Google Cloud Console, and enable the Speech API.
From the Cloud Console, create a service account, download its JSON credentials file, and set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of that file:
```
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your-project-credentials.json
```

Run the sample

Before running any example you must first install the Speech API client:

go get -u cloud.google.com/go/speech/apiv1

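To build and run the example with a local file (the commands in this README assume the resulting livecaption binary is on your PATH; otherwise invoke it as ./livecaption):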

go build
cat ../testdata/audio.raw | livecaption

Capturing audio from the mic

Alternatively, gst-launch can be used to capture audio from the mic. For example:

gst-launch-1.0 -v pulsesrc ! audioconvert ! audioresample ! audio/x-raw,channels=1,rate=16000 ! filesink location=/dev/stdout | livecaption

To discover your recording device, use the gst-device-monitor-1.0 command-line tool. For example:

$ gst-device-monitor-1.0
Probing devices...


Device found:

	name  : Built-in Output
	class : Audio/Sink
	caps  : audio/x-raw, format=(string)F32LE, layout=(string)interleaved, rate=(int)44100, channels=(int)2, channel-mask=(bitmask)0x0000000000000003;
	        audio/x-raw, format=(string){ S8, U8, S16LE, S16BE, U16LE, U16BE, S24_32LE, S24_32BE, U24_32LE, U24_32BE, S32LE, S32BE, U32LE, U32BE, S24LE, S24BE, U24LE, U24BE, S20LE, S20BE, U20LE, U20BE, S18LE, S18BE, U18LE, U18BE, F32LE, F32BE, F64LE, F64BE }, layout=(string)interleaved, rate=(int)[ 1, 2147483647 ], channels=(int)2, channel-mask=(bitmask)0x0000000000000003;
	        audio/x-raw, format=(string){ S8, U8, S16LE, S16BE, U16LE, U16BE, S24_32LE, S24_32BE, U24_32LE, U24_32BE, S32LE, S32BE, U32LE, U32BE, S24LE, S24BE, U24LE, U24BE, S20LE, S20BE, U20LE, U20BE, S18LE, S18BE, U18LE, U18BE, F32LE, F32BE, F64LE, F64BE }, layout=(string)interleaved, rate=(int)[ 1, 2147483647 ], channels=(int)1;
	gst-launch-1.0 ... ! osxaudiosink device=46


Device found:

	name  : Built-in Microph
	class : Audio/Source
	caps  : audio/x-raw, format=(string)F32LE, layout=(string)interleaved, rate=(int)44100, channels=(int)2, channel-mask=(bitmask)0x0000000000000003;
	        audio/x-raw, format=(string){ S8, U8, S16LE, S16BE, U16LE, U16BE, S24_32LE, S24_32BE, U24_32LE, U24_32BE, S32LE, S32BE, U32LE, U32BE, S24LE, S24BE, U24LE, U24BE, S20LE, S20BE, U20LE, U20BE, S18LE, S18BE, U18LE, U18BE, F32LE, F32BE, F64LE, F64BE }, layout=(string)interleaved, rate=(int)44100, channels=(int)2, channel-mask=(bitmask)0x0000000000000003;
	        audio/x-raw, format=(string){ S8, U8, S16LE, S16BE, U16LE, U16BE, S24_32LE, S24_32BE, U24_32LE, U24_32BE, S32LE, S32BE, U32LE, U32BE, S24LE, S24BE, U24LE, U24BE, S20LE, S20BE, U20LE, U20BE, S18LE, S18BE, U18LE, U18BE, F32LE, F32BE, F64LE, F64BE }, layout=(string)interleaved, rate=(int)44100, channels=(int)1;
	gst-launch-1.0 osxaudiosrc device=39 ! ...

In the above example, the recording device (Built-in Microphone) is osxaudiosrc device=39, so to run the example you would adapt the command line accordingly:

gst-launch-1.0 -v osxaudiosrc device=39 ! audioconvert ! audioresample ! audio/x-raw,channels=1,rate=16000 ! filesink location=/dev/stdout | livecaption

Content Limits

The Speech API imposes the following limits on the size of content (these limits are subject to change):

| Content Limit | Audio Length |
|---|---|
| Synchronous Requests | ~1 Minute |
| Asynchronous Requests | ~180 Minutes |
| Streaming Requests | ~1 Minute |

Please note that each StreamingRecognize session is considered a single request even though it includes multiple frames of StreamingRecognizeRequest audio within the stream.

For more information, please refer to https://cloud.google.com/speech/limits#content.

Documentation

Overview

Command livecaption pipes audio data from stdin to the Google Speech API and prints the transcript.

As an example, gst-launch can be used to capture the mic input:

$ gst-launch-1.0 -v pulsesrc ! audioconvert ! audioresample ! audio/x-raw,channels=1,rate=16000 ! filesink location=/dev/stdout | livecaption

Source Files

livecaption.go
