nanoda

Version: v0.0.2 (latest) | Published: Nov 2, 2023 | License: MIT | Imports: 11 | Imported by: 4

Repository

github.com/aethiopicuschan/nanoda

README ¶

nanoda

nanoda is a library for calling the VOICEVOX CORE dynamic library from Go. It uses ebitengine/purego rather than cgo, so it is easy to set up and use.

About VOICEVOX

The supported VOICEVOX CORE version is 0.15; development is based on 0.15.0-preview.13.

nanoda itself is MIT-licensed, but note that using it also requires complying with the terms of use of VOICEVOX and OpenJTalk.

Usage

go get github.com/aethiopicuschan/nanoda@latest

The simplest example looks like this:

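// Arguments: core library path, OpenJTalk dictionary path, voice model directory.
// Errors are discarded here for brevity; check them in real code.
v, _ := nanoda.NewVoicevox("voicevox_core/libvoicevox_core.dylib", "voicevox_core/open_jtalk_dic_utf_8-1.11", "voicevox_core/model")
s, _ := v.NewSynthesizer()
s.LoadAllModels()
// Synthesize the text with style ID 3; the result is an io.ReadCloser.
wav, _ := s.Tts("ずんだもんなのだ！", 3)
defer wav.Close()
// Write the synthesized audio to output.wav.
f, _ := os.Create("output.wav")
defer f.Close()
io.Copy(f, wav)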

Further sample code is available in the examples directory.
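The minimal example above discards every error. As a sketch of its final step only, the helper below copies any io.ReadCloser (the type Tts is documented to return) into a file while checking errors and closing both ends. The names saveWav and demo and the placeholder bytes are assumptions made for illustration; they are not part of nanoda.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// saveWav copies audio from r into a new file at path and always closes r.
// It accepts any io.ReadCloser, so it does not depend on nanoda itself.
func saveWav(path string, r io.ReadCloser) (n int64, err error) {
	defer r.Close()

	f, err := os.Create(path)
	if err != nil {
		return 0, fmt.Errorf("create %s: %w", path, err)
	}
	// Surface a failed Close when nothing else went wrong, because data
	// may only reach the disk once the file is closed.
	defer func() {
		if cerr := f.Close(); cerr != nil && err == nil {
			err = cerr
		}
	}()

	return io.Copy(f, r)
}

// demo exercises saveWav with a stand-in stream in a temporary directory.
func demo() (int64, error) {
	dir, err := os.MkdirTemp("", "nanoda-example")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	// Stand-in for the stream returned by Tts; these bytes are not real audio.
	fake := io.NopCloser(strings.NewReader("RIFF....WAVE"))
	return saveWav(filepath.Join(dir, "output.wav"), fake)
}

func main() {
	n, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("bytes written:", n)
}
```

With the real library, the value returned by Tts would be passed where the stand-in reader is used here.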

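Required files

The following are required to run nanoda:

Core library
OpenJTalk
Voice models

Prepare them by following the VOICEVOX CORE README. Make sure to use files that match the version stated above.

Development policy

For the following reasons, nanoda aims to handle processing and abstraction on its own side wherever possible when providing functionality:

Improving ease of use
Ensuring memory-safety requirements are met
Avoiding tight coupling between VOICEVOX and the application, so that it is resilient to API changes and the like

Tests

TODO. There are none yet.

Support status

The following is a list of the functions used internally; they are not necessarily exposed in a matching form.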

voicevox_create_supported_devices_json
voicevox_error_result_to_message
voicevox_get_version
voicevox_json_free
voicevox_make_default_initialize_options
voicevox_make_default_synthesis_options
voicevox_make_default_tts_options
voicevox_open_jtalk_rc_delete
voicevox_open_jtalk_rc_new
voicevox_open_jtalk_rc_use_user_dict
voicevox_synthesizer_create_accent_phrases
voicevox_synthesizer_create_accent_phrases_from_kana
voicevox_synthesizer_create_audio_query
voicevox_synthesizer_create_audio_query_from_kana
voicevox_synthesizer_create_metas_json
voicevox_synthesizer_delete
voicevox_synthesizer_is_gpu_mode
voicevox_synthesizer_is_loaded_voice_model
voicevox_synthesizer_load_voice_model
voicevox_synthesizer_new_with_initialize
voicevox_synthesizer_replace_mora_data
voicevox_synthesizer_replace_mora_pitch
voicevox_synthesizer_replace_phoneme_length
voicevox_synthesizer_synthesis
voicevox_synthesizer_tts
voicevox_synthesizer_tts_from_kana
voicevox_synthesizer_unload_voice_model
voicevox_user_dict_add_word
voicevox_user_dict_delete
voicevox_user_dict_import
voicevox_user_dict_load
voicevox_user_dict_new
voicevox_user_dict_remove_word
voicevox_user_dict_save
voicevox_user_dict_to_json
voicevox_user_dict_update_word
voicevox_user_dict_word_make
voicevox_voice_model_delete
voicevox_voice_model_get_metas_json
voicevox_voice_model_id
voicevox_voice_model_new_from_path
voicevox_wav_free

Documentation ¶

Index ¶

func WithAccelerationMode(mode AccelerationMode) func(*SynthesizerOption)
func WithAccentType(at uint64) func(*Word)
func WithCpuNumThreads(num uint16) func(*SynthesizerOption)
func WithEnableInterrogativeUpspeak() func(*TtsOptions)
func WithEnableKana() func(*TtsOptions)
func WithPriority(p uint32) func(*Word)
func WithWordType(wt WordType) func(*Word)
type AccelerationMode
type AccentPhrase
type AudioQuery
type Error
- func (e Error) Error() string
type Meta
type Mora
type ResultCode
type SpeakerId
type Style
type StyleId
type SupportedDevices
type Synthesizer
- func (s *Synthesizer) Close()
- func (s *Synthesizer) CreateAccentPhrases(text string, styleID StyleId) (a []AccentPhrase, err error)
- func (s *Synthesizer) CreateAccentPhrasesFromKana(text string, styleID StyleId) (a []AccentPhrase, err error)
- func (s *Synthesizer) CreateAudioQuery(text string, styleID StyleId) (a AudioQuery, err error)
- func (s *Synthesizer) CreateAudioQueryFromKana(text string, styleID StyleId) (a AudioQuery, err error)
- func (s *Synthesizer) GetMetas() (metas []Meta, err error)
- func (s *Synthesizer) IsGpuMode() bool
- func (s *Synthesizer) LoadAllModels() (err error)
- func (s *Synthesizer) LoadModelsFromSpeakerId(speakerId SpeakerId) (err error)
- func (s *Synthesizer) LoadModelsFromStyleId(styleId StyleId) (err error)
- func (s *Synthesizer) Replace(ap []AccentPhrase, styleID StyleId) (a []AccentPhrase, err error)
- func (s *Synthesizer) ReplaceOnlyMoraPitch(ap []AccentPhrase, styleID StyleId) (a []AccentPhrase, err error)
- func (s *Synthesizer) ReplaceOnlyPhonemeLength(ap []AccentPhrase, styleID StyleId) (a []AccentPhrase, err error)
- func (s *Synthesizer) Synthesis(aq AudioQuery, styleId StyleId) (io.ReadCloser, error)
- func (s *Synthesizer) SynthesisWithoutInterrogativeUpspeak(aq AudioQuery, styleId StyleId) (io.ReadCloser, error)
- func (s *Synthesizer) Tts(text string, styleID StyleId, options ...func(*TtsOptions)) (io.ReadCloser, error)
- func (s *Synthesizer) UnloadAllModels() (err error)
- func (s *Synthesizer) UnloadModelsFromSpeakerId(speakerId SpeakerId) (err error)
- func (s *Synthesizer) UnloadModelsFromStyleId(styleId StyleId) (err error)
type SynthesizerOption
type TtsOptions
type UserDict
- func (ud *UserDict) AddWord(word Word) (id string, err error)
- func (ud *UserDict) Close()
- func (ud *UserDict) Import(other *UserDict) (err error)
- func (ud *UserDict) Load(path string) (err error)
- func (ud *UserDict) RemoveWord(id string) (err error)
- func (ud *UserDict) Save(path string) (err error)
- func (ud *UserDict) ToJson() (j string, err error)
- func (ud *UserDict) UpdateWord(id string, word Word) (err error)
- func (ud *UserDict) Use() (err error)
type Voicevox
- func NewVoicevox(corePath string, openJtalkPath string, modelPath string) (v *Voicevox, err error)
- func (v *Voicevox) GetMessageFromResult(code ResultCode) string
- func (v *Voicevox) GetMetas() []Meta
- func (v *Voicevox) GetStyles() []Style
- func (v *Voicevox) GetVersion() string
- func (v *Voicevox) NewSynthesizer(options ...func(*SynthesizerOption)) (s Synthesizer, err error)
- func (v *Voicevox) NewUserDict() (ud *UserDict)
- func (v *Voicevox) SupportedDevices() (sd SupportedDevices, err error)
type Wav
- func (w *Wav) Close() error
- func (w *Wav) Read(p []byte) (n int, err error)
type Word
- func NewWord(surface, pronunciation string, options ...func(*Word)) (w Word)
type WordType

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func WithAccelerationMode ¶

func WithAccelerationMode(mode AccelerationMode) func(*SynthesizerOption)

Sets the hardware acceleration mode.

func WithAccentType ¶

func WithAccentType(at uint64) func(*Word)

Specifies the accent type when calling NewWord.

func WithCpuNumThreads ¶

func WithCpuNumThreads(num uint16) func(*SynthesizerOption)

Sets the number of CPU threads to use. If 0, the number is chosen to suit the environment.
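WithAccelerationMode and WithCpuNumThreads both return a func(*SynthesizerOption), which is the functional options pattern. The sketch below shows how such options could be applied. SynthesizerOption here is a local stand-in: the real struct has unexported fields, so the field names, the buildOptions helper, and the default values are all assumptions.

```go
package main

import "fmt"

// AccelerationMode mirrors the documented type and constants.
type AccelerationMode int32

const (
	ACCELERATION_MODE_AUTO AccelerationMode = iota
	ACCELERATION_MODE_CPU
	ACCELERATION_MODE_GPU
)

// SynthesizerOption is a stand-in for the real struct. Its field names
// are assumptions, since the real fields are unexported.
type SynthesizerOption struct {
	accelerationMode AccelerationMode
	cpuNumThreads    uint16
}

// WithAccelerationMode returns an option that sets the acceleration mode.
func WithAccelerationMode(mode AccelerationMode) func(*SynthesizerOption) {
	return func(o *SynthesizerOption) { o.accelerationMode = mode }
}

// WithCpuNumThreads returns an option that sets the CPU thread count.
func WithCpuNumThreads(num uint16) func(*SynthesizerOption) {
	return func(o *SynthesizerOption) { o.cpuNumThreads = num }
}

// buildOptions is a hypothetical helper: it starts from assumed defaults
// (AUTO mode, 0 threads) and applies each option in order.
func buildOptions(options ...func(*SynthesizerOption)) SynthesizerOption {
	o := SynthesizerOption{accelerationMode: ACCELERATION_MODE_AUTO, cpuNumThreads: 0}
	for _, apply := range options {
		apply(&o)
	}
	return o
}

func main() {
	o := buildOptions(WithAccelerationMode(ACCELERATION_MODE_CPU), WithCpuNumThreads(4))
	fmt.Println(o.accelerationMode, o.cpuNumThreads)
}
```

Because every option is just a function, callers pass only the settings they care about, and later options override earlier ones.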

func WithEnableInterrogativeUpspeak ¶

func WithEnableInterrogativeUpspeak() func(*TtsOptions)

Enables the adjustment for interrogative sentences.

func WithEnableKana ¶

func WithEnableKana() func(*TtsOptions)

Enables AquesTalk-style notation.

func WithPriority ¶

func WithPriority(p uint32) func(*Word)

Specifies the priority when calling NewWord.

func WithWordType ¶

func WithWordType(wt WordType) func(*Word)

Specifies the word type when calling NewWord.

Types ¶

type AccelerationMode ¶

type AccelerationMode int32

Setting value that selects the hardware acceleration mode.

const (
	ACCELERATION_MODE_AUTO AccelerationMode = iota // Select the hardware acceleration mode best suited to the runtime environment
	ACCELERATION_MODE_CPU                          // Set the hardware acceleration mode to "CPU"
	ACCELERATION_MODE_GPU                          // Set the hardware acceleration mode to "GPU"
)

type AccentPhrase ¶

type AccentPhrase struct {
	Moras           []Mora `json:"moras"`
	Accent          int    `json:"accent"`
	PauseMora       *Mora  `json:"pause_mora"`
	IsInterrogative bool   `json:"is_interrogative"`
}

An accent phrase.

type AudioQuery ¶

type AudioQuery struct {
	AccentPhrases      []AccentPhrase `json:"accent_phrases"`
	SpeedScale         float64        `json:"speed_scale"`
	PitchScale         float64        `json:"pitch_scale"`
	IntonationScale    float64        `json:"intonation_scale"`
	VolumeScale        float64        `json:"volume_scale"`
	PrePhonemeLength   float64        `json:"pre_phoneme_length"`
	PostPhonemeLength  float64        `json:"post_phoneme_length"`
	OutputSamplingRate int            `json:"output_sampling_rate"`
	OutputStereo       bool           `json:"output_stereo"`
	Kana               string         `json:"kana"`
}

A query for speech synthesis.
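The JSON tags on AudioQuery, AccentPhrase, and Mora suggest the query travels as JSON. As a sketch under that assumption, the program below decodes a hand-written sample (not output captured from VOICEVOX), changes SpeedScale, and re-encodes it. The types are local copies of the documented structs; sampleQuery and withSpeed are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Mora struct {
	Text            string   `json:"text"`
	Consonant       *string  `json:"consonant"`
	ConsonantLength *float64 `json:"consonant_length"`
	Vowel           string   `json:"vowel"`
	VowelLength     float64  `json:"vowel_length"`
	Pitch           float64  `json:"pitch"`
}

type AccentPhrase struct {
	Moras           []Mora `json:"moras"`
	Accent          int    `json:"accent"`
	PauseMora       *Mora  `json:"pause_mora"`
	IsInterrogative bool   `json:"is_interrogative"`
}

type AudioQuery struct {
	AccentPhrases      []AccentPhrase `json:"accent_phrases"`
	SpeedScale         float64        `json:"speed_scale"`
	PitchScale         float64        `json:"pitch_scale"`
	IntonationScale    float64        `json:"intonation_scale"`
	VolumeScale        float64        `json:"volume_scale"`
	PrePhonemeLength   float64        `json:"pre_phoneme_length"`
	PostPhonemeLength  float64        `json:"post_phoneme_length"`
	OutputSamplingRate int            `json:"output_sampling_rate"`
	OutputStereo       bool           `json:"output_stereo"`
	Kana               string         `json:"kana"`
}

// sampleQuery is illustrative JSON written by hand for this sketch.
const sampleQuery = `{
  "accent_phrases": [
    {
      "moras": [
        {"text": "ア", "consonant": null, "consonant_length": null, "vowel": "a", "vowel_length": 0.1, "pitch": 5.5},
        {"text": "メ", "consonant": "m", "consonant_length": 0.05, "vowel": "e", "vowel_length": 0.12, "pitch": 5.7}
      ],
      "accent": 1,
      "pause_mora": null,
      "is_interrogative": false
    }
  ],
  "speed_scale": 1.0,
  "pitch_scale": 0.0,
  "intonation_scale": 1.0,
  "volume_scale": 1.0,
  "pre_phoneme_length": 0.1,
  "post_phoneme_length": 0.1,
  "output_sampling_rate": 24000,
  "output_stereo": false,
  "kana": "ア'メ"
}`

// withSpeed decodes raw, overrides SpeedScale, and re-encodes the query.
func withSpeed(raw string, scale float64) (AudioQuery, string, error) {
	var aq AudioQuery
	if err := json.Unmarshal([]byte(raw), &aq); err != nil {
		return aq, "", err
	}
	aq.SpeedScale = scale
	out, err := json.Marshal(aq)
	return aq, string(out), err
}

func main() {
	aq, out, err := withSpeed(sampleQuery, 1.5)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	first := aq.AccentPhrases[0].Moras[0]
	fmt.Println("first mora has a consonant:", first.Consonant != nil)
	fmt.Println(out)
}
```

The pointer fields (Consonant, ConsonantLength, PauseMora) decode JSON null as nil, so code that reads them should check for nil first.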

type Error ¶

type Error struct {
	Code ResultCode
	Msg  string
}

func (Error) Error ¶

func (e Error) Error() string
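Because Error carries a ResultCode, callers can branch on the code instead of matching message text. The sketch below uses errors.As for that. It assumes the library returns Error by value (with a pointer, the errors.As target would need to be a *Error variable), and the message format, codeOf, ignoreAlreadyLoaded, and sampleErr are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// ResultCode and the two constants mirror the documented values.
type ResultCode int32

const (
	VOICEVOX_RESULT_OK                         ResultCode = 0
	VOICEVOX_RESULT_MODEL_ALREADY_LOADED_ERROR ResultCode = 18
)

// Error mirrors the documented struct.
type Error struct {
	Code ResultCode
	Msg  string
}

// Error implements the error interface. The message format is an
// assumption; the real method may format it differently.
func (e Error) Error() string {
	return fmt.Sprintf("voicevox error %d: %s", e.Code, e.Msg)
}

// codeOf extracts the ResultCode from err, even when err is wrapped.
func codeOf(err error) (ResultCode, bool) {
	var e Error
	if errors.As(err, &e) {
		return e.Code, true
	}
	return 0, false
}

// ignoreAlreadyLoaded treats "model already loaded" as success and
// passes every other error through unchanged.
func ignoreAlreadyLoaded(err error) error {
	if code, ok := codeOf(err); ok && code == VOICEVOX_RESULT_MODEL_ALREADY_LOADED_ERROR {
		return nil
	}
	return err
}

// sampleErr builds a wrapped error for the demonstration.
func sampleErr() error {
	base := Error{Code: VOICEVOX_RESULT_MODEL_ALREADY_LOADED_ERROR, Msg: "model already loaded"}
	return fmt.Errorf("load models: %w", base)
}

func main() {
	err := sampleErr()
	code, ok := codeOf(err)
	fmt.Println(code, ok)
	fmt.Println(ignoreAlreadyLoaded(err) == nil)
}
```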

type Meta ¶

type Meta struct {
	Name      string
	Styles    []Style
	SpeakerId SpeakerId
}

Per-character metadata.

type Mora ¶

type Mora struct {
	Text            string   `json:"text"`
	Consonant       *string  `json:"consonant"`
	ConsonantLength *float64 `json:"consonant_length"`
	Vowel           string   `json:"vowel"`
	VowelLength     float64  `json:"vowel_length"`
	Pitch           float64  `json:"pitch"`
}

A mora (consonant + vowel).

type ResultCode ¶

type ResultCode int32

Result code indicating the outcome of an operation.

const (
	VOICEVOX_RESULT_OK                               ResultCode = 0  // Success
	VOICEVOX_RESULT_NOT_LOADED_OPENJTALK_DICT_ERROR  ResultCode = 1  // The open_jtalk dictionary file has not been loaded
	VOICEVOX_RESULT_GET_SUPPORTED_DEVICES_ERROR      ResultCode = 3  // Failed to get supported device information
	VOICEVOX_RESULT_GPU_SUPPORT_ERROR                ResultCode = 4  // GPU mode is not supported
	VOICEVOX_RESULT_STYLE_NOT_FOUND_ERROR            ResultCode = 6  // No style was found for the style ID
	VOICEVOX_RESULT_MODEL_NOT_FOUND_ERROR            ResultCode = 7  // No voice model was found for the voice model ID
	VOICEVOX_RESULT_INFERENCE_ERROR                  ResultCode = 8  // Inference failed
	VOICEVOX_RESULT_EXTRACT_FULL_CONTEXT_LABEL_ERROR ResultCode = 11 // Failed to output context labels
	VOICEVOX_RESULT_INVALID_UTF8_INPUT_ERROR         ResultCode = 12 // An invalid UTF-8 string was input
	VOICEVOX_RESULT_PARSE_KANA_ERROR                 ResultCode = 13 // Failed to parse text in AquesTalk-style notation
	VOICEVOX_RESULT_INVALID_AUDIO_QUERY_ERROR        ResultCode = 14 // Invalid AudioQuery
	VOICEVOX_RESULT_INVALID_ACCENT_PHRASE_ERROR      ResultCode = 15 // Invalid AccentPhrase
	VOICEVOX_RESULT_OPEN_ZIP_FILE_ERROR              ResultCode = 16 // Failed to open the ZIP file
	VOICEVOX_RESULT_READ_ZIP_ENTRY_ERROR             ResultCode = 17 // Could not read a file inside the ZIP
	VOICEVOX_RESULT_MODEL_ALREADY_LOADED_ERROR       ResultCode = 18 // Attempted to load a voice model that is already loaded
	VOICEVOX_RESULT_STYLE_ALREADY_LOADED_ERROR       ResultCode = 26 // Attempted to load a style that is already loaded
	VOICEVOX_RESULT_INVALID_MODEL_DATA_ERROR         ResultCode = 27 // Invalid model data
	VOICEVOX_RESULT_LOAD_USER_DICT_ERROR             ResultCode = 20 // Could not load the user dictionary
	VOICEVOX_RESULT_SAVE_USER_DICT_ERROR             ResultCode = 21 // Could not write the user dictionary
	VOICEVOX_RESULT_USER_DICT_WORD_NOT_FOUND_ERROR   ResultCode = 22 // The word was not found in the user dictionary
	VOICEVOX_RESULT_USE_USER_DICT_ERROR              ResultCode = 23 // Failed to configure the OpenJTalk user dictionary
	VOICEVOX_RESULT_INVALID_USER_DICT_WORD_ERROR     ResultCode = 24 // A user dictionary word failed validation
	VOICEVOX_RESULT_INVALID_UUID_ERROR               ResultCode = 25 // Failed to convert the UUID
)

type SpeakerId ¶

type SpeakerId string

Speaker ID.

type Style ¶

type Style struct {
	Name string  `json:"name"`
	Id   StyleId `json:"id"`
}

Style information.

type StyleId ¶

type StyleId uint32

Style ID.

type SupportedDevices ¶

type SupportedDevices struct {
	Cpu  bool `json:"cpu"`
	Cuda bool `json:"cuda"`
	Dml  bool `json:"dml"`
}

Information about available devices. Note that this only indicates whether the VOICEVOX CORE library supports each device.
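The struct tags imply a JSON shape with cpu, cuda, and dml keys. The sketch below decodes a hand-written sample of that shape. parseSupportedDevices and libraryHasGPUBackend are hypothetical helpers, and, in line with the note above, a true value describes the library build rather than the hardware actually present.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SupportedDevices mirrors the documented struct.
type SupportedDevices struct {
	Cpu  bool `json:"cpu"`
	Cuda bool `json:"cuda"`
	Dml  bool `json:"dml"`
}

// parseSupportedDevices decodes the JSON form implied by the struct tags.
func parseSupportedDevices(raw string) (SupportedDevices, error) {
	var sd SupportedDevices
	err := json.Unmarshal([]byte(raw), &sd)
	return sd, err
}

// libraryHasGPUBackend reports whether the library build includes a CUDA
// or DirectML backend. It does not prove that a usable GPU is present.
func libraryHasGPUBackend(sd SupportedDevices) bool {
	return sd.Cuda || sd.Dml
}

func main() {
	// Hand-written sample, not output captured from VOICEVOX CORE.
	sd, err := parseSupportedDevices(`{"cpu": true, "cuda": false, "dml": false}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sd.Cpu, libraryHasGPUBackend(sd))
}
```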

type Synthesizer ¶

type Synthesizer struct {
	// contains filtered or unexported fields
}

Speech synthesizer. Create one with Voicevox.NewSynthesizer.

func (*Synthesizer) Close ¶

func (s *Synthesizer) Close()

Closes this Synthesizer.

func (*Synthesizer) CreateAccentPhrases ¶

func (s *Synthesizer) CreateAccentPhrases(text string, styleID StyleId) (a []AccentPhrase, err error)

Generates an array of accent phrases.

func (*Synthesizer) CreateAccentPhrasesFromKana ¶

func (s *Synthesizer) CreateAccentPhrasesFromKana(text string, styleID StyleId) (a []AccentPhrase, err error)

Generates an array of accent phrases (AquesTalk-style notation).

func (*Synthesizer) CreateAudioQuery ¶

func (s *Synthesizer) CreateAudioQuery(text string, styleID StyleId) (a AudioQuery, err error)

Generates a query for speech synthesis.

func (*Synthesizer) CreateAudioQueryFromKana ¶

func (s *Synthesizer) CreateAudioQueryFromKana(text string, styleID StyleId) (a AudioQuery, err error)

Generates a query for speech synthesis (AquesTalk-style notation).

func (*Synthesizer) GetMetas ¶

func (s *Synthesizer) GetMetas() (metas []Meta, err error)

Returns the metadata of the currently loaded voice models.

func (*Synthesizer) IsGpuMode ¶

func (s *Synthesizer) IsGpuMode() bool

Reports whether the synthesizer is in GPU mode.

func (*Synthesizer) LoadAllModels ¶

func (s *Synthesizer) LoadAllModels() (err error)

Loads all voice models.

func (*Synthesizer) LoadModelsFromSpeakerId ¶

func (s *Synthesizer) LoadModelsFromSpeakerId(speakerId SpeakerId) (err error)

Loads voice models based on a speaker ID.

func (*Synthesizer) LoadModelsFromStyleId ¶

func (s *Synthesizer) LoadModelsFromStyleId(styleId StyleId) (err error)

Loads voice models based on a style ID.

func (*Synthesizer) Replace ¶

func (s *Synthesizer) Replace(ap []AccentPhrase, styleID StyleId) (a []AccentPhrase, err error)

Regenerates the array of accent phrases for the specified style.

func (*Synthesizer) ReplaceOnlyMoraPitch ¶

func (s *Synthesizer) ReplaceOnlyMoraPitch(ap []AccentPhrase, styleID StyleId) (a []AccentPhrase, err error)

Regenerates the array of accent phrases for the specified style (pitch only).

func (*Synthesizer) ReplaceOnlyPhonemeLength ¶

func (s *Synthesizer) ReplaceOnlyPhonemeLength(ap []AccentPhrase, styleID StyleId) (a []AccentPhrase, err error)

Regenerates the array of accent phrases for the specified style (phoneme length only).

func (*Synthesizer) Synthesis ¶

func (s *Synthesizer) Synthesis(aq AudioQuery, styleId StyleId) (io.ReadCloser, error)

Performs speech synthesis from an AudioQuery.

func (*Synthesizer) SynthesisWithoutInterrogativeUpspeak ¶

func (s *Synthesizer) SynthesisWithoutInterrogativeUpspeak(aq AudioQuery, styleId StyleId) (io.ReadCloser, error)

Performs speech synthesis from an AudioQuery (without the adjustment for interrogative sentences).

func (*Synthesizer) Tts ¶

func (s *Synthesizer) Tts(text string, styleID StyleId, options ...func(*TtsOptions)) (io.ReadCloser, error)

Performs speech synthesis.

func (*Synthesizer) UnloadAllModels ¶

func (s *Synthesizer) UnloadAllModels() (err error)

Unloads all voice models.

func (*Synthesizer) UnloadModelsFromSpeakerId ¶

func (s *Synthesizer) UnloadModelsFromSpeakerId(speakerId SpeakerId) (err error)

Unloads voice models based on a speaker ID.

func (*Synthesizer) UnloadModelsFromStyleId ¶

func (s *Synthesizer) UnloadModelsFromStyleId(styleId StyleId) (err error)

Unloads voice models based on a style ID.

type SynthesizerOption ¶

type SynthesizerOption struct {
	// contains filtered or unexported fields
}

Options specified when creating a synthesizer.

type TtsOptions ¶

type TtsOptions struct {
	// contains filtered or unexported fields
}

Options specified when synthesizing speech with Tts.

type UserDict ¶

type UserDict struct {
	// contains filtered or unexported fields
}

User dictionary. Create one with Voicevox.NewUserDict.

func (*UserDict) AddWord ¶

func (ud *UserDict) AddWord(word Word) (id string, err error)

Adds a word.

func (*UserDict) Close ¶

func (ud *UserDict) Close()

Closes the user dictionary.

func (*UserDict) Import ¶

func (ud *UserDict) Import(other *UserDict) (err error)

Imports another user dictionary.

func (*UserDict) Load ¶

func (ud *UserDict) Load(path string) (err error)

Loads a user dictionary from the specified path.

func (*UserDict) RemoveWord ¶

func (ud *UserDict) RemoveWord(id string) (err error)

Removes a word.

func (*UserDict) Save ¶

func (ud *UserDict) Save(path string) (err error)

Saves the user dictionary to the specified path. The format is JSON; use ToJson to obtain it as a string instead.

func (*UserDict) ToJson ¶

func (ud *UserDict) ToJson() (j string, err error)

Outputs the user dictionary as JSON.

func (*UserDict) UpdateWord ¶

func (ud *UserDict) UpdateWord(id string, word Word) (err error)

Updates a word.

func (*UserDict) Use ¶

func (ud *UserDict) Use() (err error)

Puts this user dictionary into use.

type Voicevox ¶

type Voicevox struct {
	// contains filtered or unexported fields
}

Struct that holds the various functions, pointers, and so on.

func NewVoicevox ¶

func NewVoicevox(corePath string, openJtalkPath string, modelPath string) (v *Voicevox, err error)

Takes the required paths as arguments and creates a Voicevox instance.

func (*Voicevox) GetMessageFromResult ¶

func (v *Voicevox) GetMessageFromResult(code ResultCode) string

Returns the message for a result code.

func (*Voicevox) GetMetas ¶

func (v *Voicevox) GetMetas() []Meta

Returns the list of metadata.

func (*Voicevox) GetStyles ¶

func (v *Voicevox) GetStyles() []Style

Returns the list of styles. The Name of each style returned here is formatted like "ずんだもん(あまあま)".
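To illustrate that name format, the sketch below flattens per-character Meta values into a style list whose names combine the character name and the style name. flattenStyles and sampleMetas are hypothetical, the types are local copies of the documented ones, and the sample names and IDs are illustrative; this is not necessarily how GetStyles is implemented.

```go
package main

import "fmt"

type StyleId uint32

type SpeakerId string

type Style struct {
	Name string  `json:"name"`
	Id   StyleId `json:"id"`
}

type Meta struct {
	Name      string
	Styles    []Style
	SpeakerId SpeakerId
}

// flattenStyles returns one Style per character/style pair, naming each
// entry "character(style)" to match the documented format.
func flattenStyles(metas []Meta) []Style {
	var out []Style
	for _, m := range metas {
		for _, s := range m.Styles {
			out = append(out, Style{
				Name: fmt.Sprintf("%s(%s)", m.Name, s.Name),
				Id:   s.Id,
			})
		}
	}
	return out
}

// sampleMetas returns illustrative metadata; the values are assumptions.
func sampleMetas() []Meta {
	return []Meta{{
		Name:      "ずんだもん",
		SpeakerId: "example-speaker-id",
		Styles: []Style{
			{Name: "ノーマル", Id: 3},
			{Name: "あまあま", Id: 1},
		},
	}}
}

func main() {
	for _, s := range flattenStyles(sampleMetas()) {
		fmt.Println(s.Id, s.Name)
	}
}
```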

func (*Voicevox) GetVersion ¶

func (v *Voicevox) GetVersion() string

Returns the VOICEVOX version.

func (*Voicevox) NewSynthesizer ¶

func (v *Voicevox) NewSynthesizer(options ...func(*SynthesizerOption)) (s Synthesizer, err error)

Creates a synthesizer.

func (*Voicevox) NewUserDict ¶

func (v *Voicevox) NewUserDict() (ud *UserDict)

Creates a user dictionary.

func (*Voicevox) SupportedDevices ¶

func (v *Voicevox) SupportedDevices() (sd SupportedDevices, err error)

Returns information about available devices.

type Wav ¶

type Wav struct {
	// contains filtered or unexported fields
}

Struct representing the output WAV file. Implements io.ReadCloser.

func (*Wav) Close ¶

func (w *Wav) Close() error

func (*Wav) Read ¶

func (w *Wav) Read(p []byte) (n int, err error)
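Since Wav is an io.Reader, its contents can be inspected with ordinary stream code. As a sketch, the program below reads a canonical 44-byte PCM WAV header from any io.Reader. It assumes the stream starts with that exact layout, which the source does not state; real WAV data may carry extra chunks, and reading the header consumes those bytes from the stream. All names in the block are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// wavHeader is the canonical 44-byte PCM WAV header (little-endian).
type wavHeader struct {
	RiffID        [4]byte // "RIFF"
	RiffSize      uint32  // file size minus 8
	WaveID        [4]byte // "WAVE"
	FmtID         [4]byte // "fmt "
	FmtSize       uint32  // 16 for PCM
	AudioFormat   uint16  // 1 = PCM
	NumChannels   uint16
	SampleRate    uint32
	ByteRate      uint32 // SampleRate * BlockAlign
	BlockAlign    uint16 // NumChannels * BitsPerSample / 8
	BitsPerSample uint16
	DataID        [4]byte // "data"
	DataSize      uint32
}

// readWavHeader reads and minimally validates the header from r.
func readWavHeader(r io.Reader) (wavHeader, error) {
	var h wavHeader
	if err := binary.Read(r, binary.LittleEndian, &h); err != nil {
		return h, err
	}
	if string(h.RiffID[:]) != "RIFF" || string(h.WaveID[:]) != "WAVE" {
		return h, errors.New("not a RIFF/WAVE stream")
	}
	return h, nil
}

// sampleHeader builds a header with illustrative values and no audio data.
func sampleHeader() []byte {
	h := wavHeader{
		RiffID:        [4]byte{'R', 'I', 'F', 'F'},
		RiffSize:      36,
		WaveID:        [4]byte{'W', 'A', 'V', 'E'},
		FmtID:         [4]byte{'f', 'm', 't', ' '},
		FmtSize:       16,
		AudioFormat:   1,
		NumChannels:   1,
		SampleRate:    24000,
		ByteRate:      48000,
		BlockAlign:    2,
		BitsPerSample: 16,
		DataID:        [4]byte{'d', 'a', 't', 'a'},
		DataSize:      0,
	}
	var buf bytes.Buffer
	// Writing a fixed-size struct to a bytes.Buffer does not fail.
	_ = binary.Write(&buf, binary.LittleEndian, h)
	return buf.Bytes()
}

// parseSample round-trips the sample header through readWavHeader.
func parseSample() (wavHeader, error) {
	return readWavHeader(bytes.NewReader(sampleHeader()))
}

func main() {
	h, err := parseSample()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(h.SampleRate, h.NumChannels, h.BitsPerSample)
}
```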

type Word ¶

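type Word struct {
	Surface       string   // Surface form (how the word is written)
	Pronunciation string   // Reading
	AccentType    uint64   // Accent type (the position where the pitch drops)
	WordType      WordType // Word type
	Priority      uint32   // Priority (an integer from 0 to 10)
}

A word.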

func NewWord ¶

func NewWord(surface, pronunciation string, options ...func(*Word)) (w Word)

Creates a word. To register it in a dictionary, AddWord must be called separately.

type WordType ¶

type WordType int32

Word type.

const (
	VOICEVOX_USER_DICT_WORD_TYPE_PROPER_NOUN WordType = iota // Proper noun
	VOICEVOX_USER_DICT_WORD_TYPE_COMMON_NOUN                 // Common noun
	VOICEVOX_USER_DICT_WORD_TYPE_VERB                        // Verb
	VOICEVOX_USER_DICT_WORD_TYPE_ADJECTIVE                   // Adjective
	VOICEVOX_USER_DICT_WORD_TYPE_SUFFIX                      // Suffix
)

Directories ¶

Path
examples/audioquery
examples/tts
examples/userdict
internal/strings