audio

package

Version: v0.3.0-beta (latest) | Published: Jan 14, 2024 | License: MIT | Imports: 6 | Imported by: 0

Repository

github.com/Kardbord/gopenai

README ¶

Audio

Bindings for the audio endpoint.

Example

See audio-example.go.

Documentation ¶

Overview ¶

Package audio provides bindings for the audio endpoints, which convert audio into text (transcription and translation) and text into audio (speech).

Index ¶

Constants
func MakeSpeechRequest(request *SpeechRequest, organizationID *string) ([]byte, error)
type Response
- func MakeTranscriptionRequest(request *TranscriptionRequest, organizationID *string) (*Response, error)
- func MakeTranslationRequest(request *TranslationRequest, organizationID *string) (*Response, error)
type ResponseFormat
type SpeechRequest
type TranscriptionRequest
type TranslationRequest

Constants ¶

const (
	BaseEndpoint         = common.BaseURL + "audio/"
	TransciptionEndpoint = BaseEndpoint + "transcriptions"
	TranslationEndpoint  = BaseEndpoint + "translations"
	SpeechEndpoint       = BaseEndpoint + "speech"
)
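
The endpoint constants are built by plain string concatenation, so the base URL has to end in a slash for the paths to join correctly. The sketch below reproduces that composition locally. The value of `baseURL` is an assumption standing in for `common.BaseURL`, which is not shown on this page, and the lowercase names are local stand-ins for the exported constants (the package spells the first one `TransciptionEndpoint`).

```go
package main

import "fmt"

// baseURL stands in for common.BaseURL. Its real value is not shown in this
// document, so the string below is an assumption for illustration only.
const baseURL = "https://api.openai.com/v1/"

// Local mirrors of the exported endpoint constants.
const (
	baseEndpoint          = baseURL + "audio/"
	transcriptionEndpoint = baseEndpoint + "transcriptions"
	translationEndpoint   = baseEndpoint + "translations"
	speechEndpoint        = baseEndpoint + "speech"
)

func main() {
	fmt.Println(transcriptionEndpoint)
	fmt.Println(translationEndpoint)
	fmt.Println(speechEndpoint)
}
```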

const (
	// TODO: Support non-json return formats.
	ResponseFormatJSON = "json"
	// [deprecated]: Use ResponseFormatJSON instead
	JSONResponseFormat = ResponseFormatJSON
)

const (
	VoiceAlloy   = "alloy"
	VoiceEcho    = "echo"
	VoiceFable   = "fable"
	VoiceOnyx    = "onyx"
	VoiceNova    = "nova"
	VoiceShimmer = "shimmer"

	SpeechFormatMp3  = "mp3"
	SpeechFormatOpus = "opus"
	SpeechFormatAac  = "aac"
	SpeechFormatFlac = "flac"
)

Variables ¶

This section is empty.

Functions ¶

func MakeSpeechRequest ¶

func MakeSpeechRequest(request *SpeechRequest, organizationID *string) ([]byte, error)

Types ¶

type Response ¶

type Response struct {
	Text  string                `json:"text"`
	Usage common.ResponseUsage  `json:"usage"`
	Error *common.ResponseError `json:"error,omitempty"`
}

Response structure for both Transcription and Translation requests.
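
Because `Error` is an optional pointer field, a caller would typically check it in addition to the returned `error`. The following is a minimal sketch of decoding such a body with a local mirror of `Response`. The fields of `common.ResponseUsage` and `common.ResponseError` are not shown on this page, so they are kept as raw JSON here, and the sample bodies (including the shape of the error object) are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// response mirrors the documented Response type. Usage and Error are kept as
// raw JSON because the fields of common.ResponseUsage and
// common.ResponseError are not shown in this document.
type response struct {
	Text  string          `json:"text"`
	Usage json.RawMessage `json:"usage"`
	Error json.RawMessage `json:"error,omitempty"`
}

// decode is a hypothetical helper: it parses a response body and reports
// whether a non-null "error" object was present.
func decode(body []byte) (text string, hasError bool, err error) {
	var r response
	if err := json.Unmarshal(body, &r); err != nil {
		return "", false, err
	}
	hasError = len(r.Error) > 0 && string(r.Error) != "null"
	return r.Text, hasError, nil
}

func main() {
	// Both sample bodies are made up for illustration.
	for _, body := range []string{
		`{"text":"Hello there."}`,
		`{"error":{"message":"bad request"}}`,
	} {
		text, hasError, err := decode([]byte(body))
		if err != nil {
			fmt.Println("decode failed:", err)
			continue
		}
		fmt.Printf("text=%q hasError=%v\n", text, hasError)
	}
}
```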

func MakeTranscriptionRequest ¶

func MakeTranscriptionRequest(request *TranscriptionRequest, organizationID *string) (*Response, error)

func MakeTranslationRequest ¶

func MakeTranslationRequest(request *TranslationRequest, organizationID *string) (*Response, error)

type ResponseFormat ¶

type ResponseFormat = string

type SpeechRequest ¶

type SpeechRequest struct {
	// One of the available TTS models.
	Model string `json:"model"`

	// The text to generate audio for. The maximum length is 4096 characters.
	Input string `json:"input"`

	// The voice to use when generating the audio.
	Voice string `json:"voice"`

	// The format of the generated audio, e.g. mp3, opus, aac, or flac.
	ResponseFormat ResponseFormat `json:"response_format,omitempty"`

	// The speed of the generated audio. Select a value from 0.25 to 4.0. 1.0 is the default.
	Speed float64 `json:"speed,omitempty"`
}

Request structure for the create speech endpoint.

type TranscriptionRequest ¶

type TranscriptionRequest struct {
	// The audio file to transcribe, in one of these formats:
	// mp3, mp4, mpeg, mpga, m4a, wav, or webm.
	// This can be a file path or a URL.
	File string `json:"file"`

	// ID of the model to use. You can use the List models API
	// to see all of your available models, or see our Model
	// overview for descriptions of them.
	Model string `json:"model"`

	// An optional text to guide the model's style or continue a
	// previous audio segment. The prompt should match the audio language.
	Prompt string `json:"prompt,omitempty"`

	// The format of the transcript output, in one of these options:
	// json, text, srt, verbose_json, or vtt.
	ResponseFormat ResponseFormat `json:"response_format,omitempty"`

	// The sampling temperature, between 0 and 1. Higher values like 0.8 will
	// make the output more random, while lower values like 0.2 will make it
	// more focused and deterministic. If set to 0, the model will use log
	// probability to automatically increase the temperature until certain
	// thresholds are hit.
	Temperature *float64 `json:"temperature,omitempty"`

	// The language of the input audio. Supplying the input language in
	// ISO-639-1 format will improve accuracy and latency.
	Language string `json:"language,omitempty"`
}

Request structure for the transcription endpoint.
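
`Temperature` is declared as `*float64` rather than `float64`, which lets a caller distinguish "not set" (nil) from an explicit 0, a value the field comment gives special meaning to. The sketch below shows the difference under `encoding/json` using a trimmed local copy of the struct. It only illustrates how the struct tags behave; how the package actually transmits the request is not shown on this page, and the file and model names are example values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// transcriptionRequest is a trimmed local copy of TranscriptionRequest with
// only the fields needed for this illustration.
type transcriptionRequest struct {
	File        string   `json:"file"`
	Model       string   `json:"model"`
	Temperature *float64 `json:"temperature,omitempty"`
}

// encode returns the JSON form of r, or an error string if marshaling fails.
func encode(r transcriptionRequest) string {
	b, err := json.Marshal(r)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	zero := 0.0

	// Nil pointer: the field is omitted entirely.
	unset := transcriptionRequest{File: "speech.mp3", Model: "whisper-1"}

	// Pointer to 0: the field is kept, because omitempty only drops nil pointers.
	explicit := transcriptionRequest{File: "speech.mp3", Model: "whisper-1", Temperature: &zero}

	fmt.Println(encode(unset))
	fmt.Println(encode(explicit))
}
```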

type TranslationRequest ¶

type TranslationRequest struct {
	// The audio file to translate, in one of these formats:
	// mp3, mp4, mpeg, mpga, m4a, wav, or webm.
	// This can be a file path or a URL.
	File string `json:"file"`

	// ID of the model to use. You can use the List models API
	// to see all of your available models, or see our Model
	// overview for descriptions of them.
	Model string `json:"model"`

	// An optional text to guide the model's style or continue a
	// previous audio segment. The prompt should be in English.
	Prompt string `json:"prompt,omitempty"`

	// The format of the transcript output, in one of these options:
	// json, text, srt, verbose_json, or vtt.
	ResponseFormat ResponseFormat `json:"response_format,omitempty"`

	// The sampling temperature, between 0 and 1. Higher values like 0.8 will
	// make the output more random, while lower values like 0.2 will make it
	// more focused and deterministic. If set to 0, the model will use log
	// probability to automatically increase the temperature until certain
	// thresholds are hit.
	Temperature *float64 `json:"temperature,omitempty"`
}

Request structure for the translation endpoint.

Source Files ¶

audio.go
