ffmpegsplit

package module
v0.1.2
Published: Apr 23, 2022 License: Apache-2.0 Imports: 10 Imported by: 0

README

audiobook-split-ffmpeg-go

Split audiobook file into per-chapter files using chapter metadata and ffmpeg.

Useful in situations where your preferred audio player does not support chapter metadata.

NOTE: Works only if the input file actually has chapter metadata (see example below)

NOTE: Feature-wise this program/library is identical to https://github.com/MawKKe/audiobook-split-ffmpeg except this one is written in Go instead of Python. "Why???" you might ask? Well, I was learning Go and needed a project...

NOTE: this is a quick rewrite and test coverage is not that great; there might be bugs not present in the Python version.

Example

Let's say you have an audio file mybook.m4b, for which ffprobe -i mybook.m4b shows the following:

Chapter #0:0: start 0.000000, end 1079.000000
Metadata:
  title           : Chapter Zero
Chapter #0:1: start 1079.000000, end 2040.000000
Metadata:
  title           : Chapter One
Chapter #0:2: start 2040.000000, end 2878.000000
Metadata:
  title           : Chapter Two
Chapter #0:3: start 2878.000000, end 3506.000000

Then, running:

$ audiobook-split-ffmpeg-go --infile mybook.m4b --outdir /tmp/foo

...produces the following files:

  • /tmp/foo/001 - Chapter Zero.m4b
  • /tmp/foo/002 - Chapter One.m4b
  • /tmp/foo/003 - Chapter Two.m4b

You may then play these files with your preferred application.

Install

To install the main executable:

$ go install github.com/MawKKe/audiobook-split-ffmpeg-go/cmd/audiobook-split-ffmpeg-go@latest

This should place the executable into your user's $GOPATH/bin/. If that path is in your $PATH, you are good to go. Next, see Usage below.

However, if you want to use the library in your projects, run:

$ go get github.com/MawKKe/audiobook-split-ffmpeg-go

See the file cmd/audiobook-split-ffmpeg-go/main.go for hints on how to use the library.

Usage

See the help:

$ audiobook-split-ffmpeg-go -h

In the simplest case you can just call

$ audiobook-split-ffmpeg-go --infile /path/to/audio.m4b --outdir foo

Note that this script will never overwrite files in foo/, so you must delete conflicting files manually (or specify some other empty or nonexistent directory).

The chapter titles will be included in the filenames if they are available in the chapter metadata. You may prevent this behaviour with the flag --no-use-title-as-filename, in which case the filenames will include the input file basename instead (this is useful if your metadata is crappy or malformed, for example).

You may specify how many parallel ffmpeg jobs to run with the command-line parameter --concurrency. The default concurrency equals the number of available cores. Note that beyond some point, increasing the concurrency will not increase throughput: ffmpeg is specifically instructed NOT to re-encode, so most of the processing work consists of copying the existing encoded audio data from the input file to the output file(s), which is more I/O-bound than CPU-bound.

Dependencies

The project was developed with Go version 1.18. It may also compile with earlier releases if you adjust the Go version declared in the file go.mod.

This application has no 3rd party library dependencies, as everything is implemented using the Go standard library. However, the script assumes that the following system executables are available somewhere in your $PATH:

  • ffmpeg
  • ffprobe

For Ubuntu, these can be installed with apt install ffmpeg.

Development and Testing

The Go tooling handles dependencies via 'go get', although this project requires no external Go libraries at the moment.

To start working on the code, clone the repo:

$ git clone https://github.com/MawKKe/audiobook-split-ffmpeg-go && cd audiobook-split-ffmpeg-go

To build the main binary:

$ make

or manually:

$ go build ./cmd/audiobook-split-ffmpeg-go

To run tests:

$ go test

(TODO: test coverage needs some improvement)

The provided Makefile has some useful targets defined for easier development and testing. You should check it out.

Features

  • The script does not transcode/re-encode the audio data. This speeds up the processing, but has the possibility of creating mangled audio in some rare cases (let me know if this happens).

  • This script will instruct ffmpeg to write metadata in the resulting chapter files, including:

    • track: the chapter number; in the format X/Y, where X = chapter number, Y = total num of chapters.
    • title: the chapter title, as-is (if available)
  • The chapter numbers are included in the output file names, padded with zeroes so that all numbers are of equal length. This makes the files much easier to sort by name.

  • The work is parallelized to speed up the processing.
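The zero-padded file naming described above can be sketched with fmt's variable-width verb. This is a minimal illustration, not the library's code: outName, its parameters, and the use of the input basename as a fallback label are assumptions made for the example.

```go
package main

import "fmt"

// outName builds "<zero-padded number> - <label><ext>".
// Hypothetical helper: when title is empty, fallback (e.g. the input
// file basename) is used as the label instead.
func outName(num, width int, title, fallback, ext string) string {
	label := title
	if label == "" {
		label = fallback
	}
	// %0*d takes the padding width from the argument list.
	return fmt.Sprintf("%0*d - %s%s", width, num, label, ext)
}

func main() {
	titles := []string{"Chapter Zero", "Chapter One", ""}
	for i, t := range titles {
		fmt.Println(outName(i+1, 3, t, "mybook", ".m4b"))
	}
}
```

Because every number has the same width, a plain lexicographic sort of the file names matches chapter order.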

License

Copyright 2022 Markus Holmström (MawKKe)

The works under this repository are licenced under Apache License 2.0. See file LICENSE for more information.

Contributing

This project is hosted at https://github.com/MawKKe/audiobook-split-ffmpeg-go

You are welcome to leave bug reports, fixes and feature requests. Thanks!

Documentation

Overview

Package ffmpegsplit is for parsing chapter information from a multimedia file using FFProbe

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func GetReadChaptersCommandline

func GetReadChaptersCommandline(infile string) []string

GetReadChaptersCommandline function builds the list of arguments used for reading chapter information via 'ffprobe' from file 'infile'. Note: this function is called by ReadFile() - as such it is only useful for debug purposes.

Types

type Chapter

type Chapter struct {
	ID        int               `json:"id"`
	TimeBase  string            `json:"time_base"` // float or fixnum? Not needed anyways
	Start     int               `json:"start"`
	StartTime string            `json:"start_time"` // float or fixnum? Not needed anyways
	End       int               `json:"end"`
	EndTime   string            `json:"end_time"` // float or fixnum? Not needed anyways
	Tags      map[string]string `json:"tags"`
}

Chapter represents a single chapter in ffprobe output JSON

type ChapterFilter added in v0.1.1

type ChapterFilter struct {
	Description string
	Filter      ChapterFilterFunction
}

ChapterFilter is a wrapper structure for the filtering functions; besides the filter function itself, it holds a description field for clarity. The description is not necessary for the filter function, but may help in debugging.

type ChapterFilterFunction added in v0.1.1

type ChapterFilterFunction func(Chapter) bool

ChapterFilterFunction is a function that determines whether a chapter is to be processed or not. The function follows 'filter' semantics, i.e. the chapter is to be skipped if the ChapterFilter function returns true.

type FFProbeOutput

type FFProbeOutput struct {
	Chapters []Chapter `json:"chapters"`
	// contains filtered or unexported fields
}

FFProbeOutput represents the JSON structure returned by ffprobe command

func ReadChapters

func ReadChapters(infile string) (FFProbeOutput, error)

ReadChapters is an alias for ReadChaptersWithContext(context.Background(), infile)

func ReadChaptersFromJSON added in v0.1.1

func ReadChaptersFromJSON(encoded []byte) (FFProbeOutput, error)

ReadChaptersFromJSON parses the given byte sequence into a struct FFProbeOutput.
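To illustrate the kind of input this function handles, here is a standalone sketch that decodes ffprobe-style chapter JSON with encoding/json. The local types chapter and probeOutput mirror only part of the structs documented on this page, parseChapters is a hypothetical stand-in, and sampleJSON is hand-written sample data rather than captured output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chapter mirrors a subset of the Chapter struct documented above.
type chapter struct {
	ID        int               `json:"id"`
	Start     int               `json:"start"`
	StartTime string            `json:"start_time"`
	End       int               `json:"end"`
	EndTime   string            `json:"end_time"`
	Tags      map[string]string `json:"tags"`
}

// probeOutput mirrors the top-level "chapters" array.
type probeOutput struct {
	Chapters []chapter `json:"chapters"`
}

// parseChapters is a hypothetical stand-in that decodes the JSON bytes.
func parseChapters(encoded []byte) (probeOutput, error) {
	var out probeOutput
	err := json.Unmarshal(encoded, &out)
	return out, err
}

// sampleJSON is hand-written sample data shaped like ffprobe chapter output.
const sampleJSON = `{"chapters": [
  {"id": 0, "time_base": "1/1000", "start": 0, "start_time": "0.000000",
   "end": 1079000, "end_time": "1079.000000", "tags": {"title": "Chapter Zero"}},
  {"id": 1, "time_base": "1/1000", "start": 1079000, "start_time": "1079.000000",
   "end": 2040000, "end_time": "2040.000000", "tags": {"title": "Chapter One"}}
]}`

func main() {
	out, err := parseChapters([]byte(sampleJSON))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, ch := range out.Chapters {
		fmt.Printf("%d: %s (%s to %s)\n", ch.ID, ch.Tags["title"], ch.StartTime, ch.EndTime)
	}
}
```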

func ReadChaptersWithContext added in v0.1.2

func ReadChaptersWithContext(ctx context.Context, infile string) (FFProbeOutput, error)

ReadChaptersWithContext collects chapter information from the given file 'infile' using ffprobe. Blocks until subprocess returns. On success, parses the output (JSON) and returns the information in struct FFProbeOutput. Otherwise returns the error produced by either exec.Cmd.Run or json.Decoder.Unmarshal.

Expects the program 'ffprobe' to be somewhere in user's $PATH.
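For orientation, the sketch below shows one way a context can be attached to an ffprobe subprocess with os/exec. The helper probeArgs and the exact argument list are assumptions for illustration (the arguments the library really uses are returned by GetReadChaptersCommandline); the flags themselves are standard ffprobe options. The command is only constructed here, not run.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// probeArgs is a hypothetical helper returning ffprobe arguments that
// print the chapter table of infile as JSON.
func probeArgs(infile string) []string {
	return []string{
		"-v", "error", // only report errors on stderr
		"-print_format", "json", // machine-readable output
		"-show_chapters", // include the chapter list
		"-i", infile,
	}
}

func main() {
	// Cancelling ctx (here: after a timeout) would kill a started subprocess.
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	cmd := exec.CommandContext(ctx, "ffprobe", probeArgs("mybook.m4b")...)
	// The command is built but not started; calling cmd.Output() would run it.
	fmt.Println(cmd.Args)
}
```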

type InputFileMetadata

type InputFileMetadata struct {
	Path          string
	BaseNoExt     string
	Extension     string
	FFProbeOutput FFProbeOutput
}

InputFileMetadata represents all important details of the input file. Produced by ReadFile().
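The BaseNoExt and Extension fields suggest the input path is split into a base name and an extension. A minimal sketch of such a split using path/filepath follows; splitPath is a hypothetical helper, and whether the library keeps the leading dot in Extension is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// splitPath is a hypothetical helper returning the file name without its
// extension, and the extension including the leading dot.
func splitPath(path string) (baseNoExt, ext string) {
	ext = filepath.Ext(path)
	baseNoExt = strings.TrimSuffix(filepath.Base(path), ext)
	return baseNoExt, ext
}

func main() {
	base, ext := splitPath("/tmp/audio/mybook.m4b")
	fmt.Printf("base=%q ext=%q\n", base, ext)
}
```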

func ReadFile

func ReadFile(infile string) (InputFileMetadata, error)

ReadFile is an alias for ReadFileWithContext(context.Background(), infile)

func ReadFileWithContext added in v0.1.2

func ReadFileWithContext(ctx context.Context, infile string) (InputFileMetadata, error)

ReadFileWithContext reads file metadata of file at path 'infile'. The associated context is used for controlling the launched subprocesses

func (InputFileMetadata) ComputeWorkItems

func (imeta InputFileMetadata) ComputeWorkItems(outdir string, opts OutFileOpts) ([]WorkItem, error)

ComputeWorkItems produces a WorkItem for each chapter. Each WorkItem contains all the information necessary to extract the chapter using ffmpeg. Once the sequence of WorkItems has been produced, the final processing step can be performed by calling WorkItem.Process().

func (InputFileMetadata) NumChapters

func (imeta InputFileMetadata) NumChapters() int

NumChapters returns the number of chapters found in the input file.

type OutFileOpts

type OutFileOpts struct {
	// Place chapter title in output file name? (NOTE: Only if title is available)
	UseTitleInName bool

	// Place chapter title in output file metadata? (NOTE: Only if title is available)
	UseTitleInMeta bool

	// Place chapter number in output file metadata?
	UseChapterNumberInMeta bool

	// Adjusts the starting value of filename enumeration. Sometimes it
	// might make more sense to start enumeration from 1 instead of 0, for example.
	// Negative value tells the library to choose automatically.
	EnumOffset int

	// When chapter number is used in the filename, the number may be
	// left-padded with zeros in order to produce constant-width "column" of chapter numbers.
	// This has the advantage that files can now be sorted more easily by various *nix tools.
	//
	// This flag specifies how many leading zeros should be used in the filename enumeration, if at all.
	// Set value to <0 to let the library automatically compute the appropriate padding.
	// Set value to  0 to disable padding
	// Otherwise, the value will determine the number of leading zeros.
	EnumPaddedWidth int

	// Use this output file extension instead of the input file extension.
	//
	// WARNING: if the default file container type associated with this new extension
	// is incompatible with the input codec, ffmpeg most likely will re-encode
	// the audio stream to something that IS compatible; all the parameters for
	// the conversion are chosen by ffmpeg (currently this library provides no support
	// for specifying the output codec parameters, this may change in the
	// future...).
	UseAlternateExtension string

	// Filters is a list of user-definable functions for filtering chapters.
	// To add a filter, use method AddFilter().
	Filters []ChapterFilter
}

OutFileOpts contains user-defined options specifying how the output files will be named and what kind of metadata they shall contain (if metadata even is available in the original input file).

func DefaultOutFileOpts

func DefaultOutFileOpts() OutFileOpts

DefaultOutFileOpts returns some sensible set of default values for OutFileOpts.

func (*OutFileOpts) AddFilter added in v0.1.1

func (opts *OutFileOpts) AddFilter(flt ChapterFilter)

AddFilter appends a filter to the list of filters in the OutFileOpts struct

func (OutFileOpts) IsFiltered added in v0.1.1

func (opts OutFileOpts) IsFiltered(ch Chapter) bool

IsFiltered invokes all configured filters for the given Chapter. If any of the filters return true, the function returns true. In other words, IsFiltered() returns false iff all the filters return false for the chapter.
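The "any filter returns true means skip" semantics can be sketched in isolation as follows. The lowercase types and the isFiltered function are local stand-ins written for this example, not the library's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// chapter and chapterFilter are simplified local stand-ins for the
// Chapter and ChapterFilter types documented on this page.
type chapter struct {
	ID   int
	Tags map[string]string
}

type chapterFilter struct {
	Description string
	Filter      func(chapter) bool
}

// isFiltered reports whether ch should be skipped: true as soon as any
// filter returns true, false only if every filter returns false.
func isFiltered(filters []chapterFilter, ch chapter) bool {
	for _, f := range filters {
		if f.Filter(ch) {
			return true
		}
	}
	return false
}

func main() {
	filters := []chapterFilter{{
		Description: "skip chapters without a title",
		Filter: func(ch chapter) bool {
			return strings.TrimSpace(ch.Tags["title"]) == ""
		},
	}}
	chapters := []chapter{
		{ID: 0, Tags: map[string]string{"title": "Chapter Zero"}},
		{ID: 1}, // no tags at all
	}
	for _, ch := range chapters {
		fmt.Printf("chapter %d skipped=%t\n", ch.ID, isFiltered(filters, ch))
	}
}
```

Note that with an empty filter list nothing is skipped, since no filter can return true.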

type Status

type Status struct {
	Successful int
	Failed     int
	Submitted  int
}

Status describes how many chapter extractions succeeded and how many failed. Note that successful + failed should equal submitted, otherwise an error happened somewhere.

func Process

func Process(workItems []WorkItem, maxConcurrent int) Status

Process all workItems, i.e. do the actual extraction process. The workItems contain all the necessary information for the extractions to be performed. The processing happens in parallel, using at most 'maxConcurrent' ffmpeg worker processes.

Note: the extraction process does not re-encode the audio stream, thus the processing performance is not likely CPU-bound. However, using too many workers extracting the same file may saturate I/O, decreasing overall performance. In summary: increasing 'maxConcurrent' value may improve performance, but only up to a point.
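A common way to cap the number of simultaneous workers in Go is a buffered channel used as a semaphore. The sketch below shows that general technique under stated assumptions: runBounded, status, and errSimulated are hypothetical names, the jobs are plain functions instead of ffmpeg invocations, and nothing here is taken from the library's source.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// status is a local stand-in for the Status struct documented above.
type status struct {
	Successful int
	Failed     int
	Submitted  int
}

// errSimulated stands in for a failed ffmpeg run.
var errSimulated = errors.New("simulated ffmpeg failure")

// runBounded runs all jobs with at most maxConcurrent executing at once.
func runBounded(jobs []func() error, maxConcurrent int) status {
	if maxConcurrent < 1 {
		maxConcurrent = 1
	}
	sem := make(chan struct{}, maxConcurrent) // counting semaphore
	var (
		wg sync.WaitGroup
		mu sync.Mutex // protects st
		st = status{Submitted: len(jobs)}
	)
	for _, job := range jobs {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxConcurrent jobs are running
		go func(job func() error) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			err := job()
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				st.Failed++
			} else {
				st.Successful++
			}
		}(job)
	}
	wg.Wait()
	return st
}

func main() {
	jobs := []func() error{
		func() error { return nil },
		func() error { return errSimulated },
		func() error { return nil },
	}
	fmt.Printf("%+v\n", runBounded(jobs, 2))
}
```

After all jobs finish, Successful plus Failed equals Submitted, which is the invariant the Status documentation describes.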

TODO: add similar processing interface with support for context.Context (use exec.CommandContext?)

func (Status) String

func (s Status) String() string

Produce a printable string from Status

type WorkItem added in v0.1.1

type WorkItem struct {
	Infile       string
	Outfile      string
	OutDirectory string
	Chapter      Chapter
	// contains filtered or unexported fields
}

WorkItem represents all the required information for processing the input file into a chapter specific file. To do the actual processing, run WorkItem.Process()

func (WorkItem) FFmpegArgs added in v0.1.1

func (wi WorkItem) FFmpegArgs() []string

FFmpegArgs converts a WorkItem to a list of arguments that are going to be passed to ffmpeg for actual processing step.

func (WorkItem) GetCommand added in v0.1.1

func (wi WorkItem) GetCommand() []string

GetCommand produces a list of command line arguments that would produce the chapter file specific to this workItem

func (WorkItem) Process added in v0.1.1

func (wi WorkItem) Process() error

func (WorkItem) ProcessWithContext added in v0.1.2

func (wi WorkItem) ProcessWithContext(ctx context.Context) error

ProcessWithContext performs the actual processing step via ffmpeg. Expects 'ffmpeg' to be somewhere in user's $PATH.

Directories

Path Synopsis
cmd
