walg

v0.1.2
Published: Aug 18, 2017 License: Apache-2.0 Imports: 29 Imported by: 0

README

WAL-G

WAL-G is an archival and restoration tool for Postgres.

WAL-G is the successor of WAL-E with a number of key differences. WAL-G uses LZ4 compression, multiple processors and non-exclusive base backups for Postgres. More information on the design and implementation of WAL-G can be found on the Citus Data blog post "Introducing WAL-G by Citus: Faster Disaster Recovery for Postgres".

Installation

A precompiled Linux AMD64 binary of the latest version of WAL-G can be obtained under the Releases tab.

To decompress the binary, use:

tar -zxvf wal-g.linux-amd64.tar.gz

For other systems, please consult the Development section for more information.

Configuration

Required

To connect to Amazon S3, WAL-G requires that these variables be set:

  • WALE_S3_PREFIX (e.g. s3://bucket/path/to/folder)

WAL-G determines AWS credentials like other AWS tools. You can set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (optionally with AWS_SECURITY_TOKEN), or ~/.aws/credentials (optionally with AWS_PROFILE), or you can set nothing to automatically fetch credentials from the EC2 metadata service.

WAL-G uses the usual PostgreSQL environment variables to configure its connection, especially including PGHOST, PGPORT, PGUSER, and PGPASSWORD/PGPASSFILE/~/.pgpass.
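
To illustrate the shape of the WALE_S3_PREFIX value, here is a minimal sketch of how such a prefix could be split into a bucket and a folder path. It is not WAL-G's own code: the helper name splitS3Prefix and its error messages are assumptions made for this example.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitS3Prefix is a hypothetical helper (not part of WAL-G). It splits a
// prefix such as s3://bucket/path/to/folder into bucket and folder path.
func splitS3Prefix(prefix string) (bucket, folder string, err error) {
	if prefix == "" {
		return "", "", fmt.Errorf("WALE_S3_PREFIX is not set")
	}
	u, err := url.Parse(prefix)
	if err != nil {
		return "", "", err
	}
	if u.Scheme != "s3" || u.Host == "" {
		return "", "", fmt.Errorf("expected s3://bucket/path, got %q", prefix)
	}
	// The URL host is the bucket; the URL path (without slashes at the
	// ends) is the folder inside the bucket.
	return u.Host, strings.Trim(u.Path, "/"), nil
}

func main() {
	// A real program would read the value with os.Getenv("WALE_S3_PREFIX").
	bucket, folder, err := splitS3Prefix("s3://bucket/path/to/folder")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("bucket:", bucket, "folder:", folder)
}
```

An empty or non-s3 value is rejected early, which mirrors the idea that the prefix is the one setting that must be present.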

Optional

WAL-G can automatically determine the S3 bucket's region using s3:GetBucketLocation, but if you wish to avoid this API call or forbid it from the applicable IAM policy, specify:

  • AWS_REGION (e.g. us-west-2)

Concurrency values can be configured using:

  • WALG_DOWNLOAD_CONCURRENCY

To configure how many goroutines to use during extraction, use WALG_DOWNLOAD_CONCURRENCY. By default, WAL-G uses the minimum of the number of files to extract and 10.

  • WALG_UPLOAD_CONCURRENCY

To configure how many concurrent streams to use during backup uploading, use WALG_UPLOAD_CONCURRENCY. By default, WAL-G uses 10 streams.
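
The download default described above (the minimum of the number of files and 10) can be sketched as follows. This is an illustration only: the function name downloadConcurrency is hypothetical, and the handling of an explicitly set or invalid value is an assumption rather than documented WAL-G behavior.

```go
package main

import (
	"fmt"
	"strconv"
)

// defaultConcurrency is the default of 10 stated in the README.
const defaultConcurrency = 10

// downloadConcurrency is a hypothetical helper. envValue stands for the
// content of WALG_DOWNLOAD_CONCURRENCY (empty when unset).
// Assumption: a positive integer is used as given; anything else falls
// back to min(numFiles, 10).
func downloadConcurrency(envValue string, numFiles int) int {
	if n, err := strconv.Atoi(envValue); err == nil && n > 0 {
		return n
	}
	if numFiles < defaultConcurrency {
		return numFiles
	}
	return defaultConcurrency
}

func main() {
	fmt.Println(downloadConcurrency("", 3))   // unset, few files
	fmt.Println(downloadConcurrency("", 50))  // unset, many files
	fmt.Println(downloadConcurrency("4", 50)) // explicitly configured
}
```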

Usage

WAL-G currently supports these commands:

  • backup-fetch

When fetching base backups, the user should pass in a path to a directory to extract to, followed by the name of the backup. If this directory does not exist, WAL-G will create it and any dependent subdirectories.

wal-g backup-fetch ~/extract/to/here example-backup

WAL-G can also fetch the latest backup using:

wal-g backup-fetch ~/extract/to/here LATEST

  • backup-push

When uploading backups to S3, the user should pass in the path containing the backup started by Postgres as in:

wal-g backup-push /backup/directory/path

  • wal-fetch

When fetching WAL archives from S3, the user should pass in the archive name and the name of the file to download to. This file should not exist as WAL-G will create it for you.

wal-g wal-fetch example-archive new-file-name

  • wal-push

When uploading WAL archives to S3, the user should pass in the absolute path to where the archive is located.

wal-g wal-push /path/to/archive

Development

Installing

To compile and build the binary:

go get github.com/wal-g/wal-g
make all

Users can also install WAL-G by using make install. Setting the GOBIN environment variable before installing lets the user choose the installation location. By default, make install puts the compiled binary in go/bin.

export GOBIN=/usr/local/bin
make install

Testing

WAL-G relies heavily on unit tests. These tests do not require S3 configuration as the upload/download parts are tested using mocked objects. For more information on testing, please consult test_tools.

WAL-G will perform a round-trip compression/decompression test that generates a directory for data (e.g. data...), compressed files (e.g. compressed), and extracted files (e.g. extracted). These directories will only get cleaned up if the files in the original data directory match the files in the extracted one.

Test coverage can be obtained using:

go test -v -coverprofile=coverage.out
go tool cover -html=coverage.out

Authors

See also the list of contributors who participated in this project.

License

This project is licensed under the Apache License, Version 2.0, but the lzo support is licensed under GPL 3.0+. Please refer to the LICENSE.md file for more details.

Acknowledgements

WAL-G would not have happened without the support of Citus Data.

WAL-G came into existence as a result of the collaboration between a summer engineering intern at Citus, Katie Li, and Daniel Farina, the original author of WAL-E who currently serves as a principal engineer on the Citus Cloud team. Citus Data also has an open source extension to Postgres that distributes database queries horizontally to deliver scale and performance.

Documentation

Constants

This section is empty.

Variables

var Compressed uint32

Compressed is used to log compression ratio.

var EXCLUDE = make(map[string]Empty)

EXCLUDE is a list of excluded members from the bundled backup.

var MAXBACKOFF = float64(32)

MAXBACKOFF is the maximum backoff time in seconds for upload.

var MAXRETRIES = 7

MAXRETRIES is the maximum number of retries for upload.

var Uncompressed uint32

Uncompressed is used to log compression ratio.

Functions

func CheckType

func CheckType(path string) string

CheckType grabs the file extension from PATH.
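
As a rough illustration of extension-based type detection, the sketch below derives a format string from a path using the standard library. It is not CheckType's implementation: the helper name fileFormat is hypothetical, and returning the extension without its leading dot is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// fileFormat is a hypothetical stand-in for extension detection. It
// returns the final extension of path without the leading dot, or an
// empty string when the path has no extension.
func fileFormat(path string) string {
	return strings.TrimPrefix(filepath.Ext(path), ".")
}

func main() {
	for _, p := range []string{"part_001.tar.lz4", "archive.lzo", "backup_label"} {
		fmt.Printf("%q -> %q\n", p, fileFormat(p))
	}
}
```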

func Configure

func Configure() (*TarUploader, *Prefix, error)

Configure connects to S3 and creates an uploader. It makes sure that a valid session has started; if invalid, returns AWS error and `<nil>` values.

Requires these environment variables to be set: WALE_S3_PREFIX

Able to configure the upload part size in the S3 uploader.

func Connect

func Connect() (*pgx.Conn, error)

Connect establishes a connection to postgres using a UNIX socket. Must export PGHOST and run with `sudo -E -u postgres`. If PGHOST is not set or if the connection fails, an error is returned and the connection is `<nil>`.

func CreateUploader

func CreateUploader(svc s3iface.S3API, partsize, concurrency int) s3manageriface.UploaderAPI

CreateUploader returns an uploader with customizable concurrency and partsize.

func DecompressLz4

func DecompressLz4(d io.Writer, s io.Reader) error

DecompressLz4 decompresses a .lz4 file. Returns an error upon failure.

func DecompressLzo

func DecompressLzo(d io.Writer, s io.Reader) error

DecompressLzo decompresses an .lzo file. Returns the first error encountered.

func ExtractAll

func ExtractAll(ti TarInterpreter, files []ReaderMaker) error

ExtractAll handles all files passed in. Supports `.lzo`, `.lz4`, and `.tar`. File type `.nop` is used for testing purposes. Each file is extracted in its own goroutine and ExtractAll will wait for all goroutines to finish. Returns the first error encountered.
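
The pattern described here (one goroutine per file, wait for all, report the first error) can be sketched with the standard library alone. This is a simplified illustration, not ExtractAll itself: the extractAll function, the string file names, and errCorrupt are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errCorrupt is a made-up error used only by this example.
var errCorrupt = errors.New("corrupt archive")

// extractAll runs extract once per file, each call in its own goroutine,
// waits for all of them, and returns the first error that was recorded
// (nil if every call succeeded).
func extractAll(files []string, extract func(name string) error) error {
	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for _, f := range files {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			if err := extract(name); err != nil {
				// sync.Once keeps only the first error reported.
				once.Do(func() { firstErr = err })
			}
		}(f)
	}
	wg.Wait()
	return firstErr
}

func main() {
	files := []string{"part_1.tar.lz4", "part_2.tar.lz4", "part_3.tar.lz4"}
	err := extractAll(files, func(name string) error {
		if name == "part_2.tar.lz4" {
			return errCorrupt
		}
		return nil
	})
	fmt.Println("result:", err)
}
```

Because the goroutines run concurrently, "first" here means the first error recorded, which is not necessarily the error from the earliest file in the slice.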

func FormatName

func FormatName(s string) (string, error)

FormatName grabs the name of the WAL file and returns it in the form of `base_...`. If no match is found, returns an empty string and a `NoMatchAvailableError`.

func HandleTar

func HandleTar(bundle TarBundle, path string, info os.FileInfo) error

HandleTar creates underlying tar writer and handles one given file. Does not follow symlinks. If file is in EXCLUDE, will not be included in the final tarball. EXCLUDED directories are created but their contents are not written to local disk.

func StartBackup

func StartBackup(conn *pgx.Conn, backup string) (string, error)

StartBackup starts a non-exclusive base backup immediately. When finishing the backup, `backup_label` and `tablespace_map` contents are not immediately written to a file but returned instead. Returns empty string and an error if backup fails.

Types

type Archive

type Archive struct {
	Prefix  *Prefix
	Archive *string
}

Archive contains information associated with a WAL archive.

func (*Archive) CheckExistence

func (a *Archive) CheckExistence() (bool, error)

CheckExistence checks that the specified WAL file exists.

func (*Archive) GetArchive

func (a *Archive) GetArchive() (io.ReadCloser, error)

GetArchive downloads the specified archive from S3.

type Backup

type Backup struct {
	Prefix *Prefix
	Path   *string
	Name   *string
	Js     *string
}

Backup contains information about a valid backup generated and uploaded by WAL-G.

func (*Backup) CheckExistence

func (b *Backup) CheckExistence() (bool, error)

CheckExistence checks that the specified backup exists.

func (*Backup) GetKeys

func (b *Backup) GetKeys() ([]string, error)

GetKeys returns all the keys for the files in the specified backup.

func (*Backup) GetLatest

func (b *Backup) GetLatest() (string, error)

GetLatest sorts the backups by last modified time and returns the latest backup key.

type BackupTime

type BackupTime struct {
	Name string
	Time time.Time
}

BackupTime is used to sort backups by latest modified time.

type Bundle

type Bundle struct {
	MinSize int64
	Sen     *Sentinel
	Tb      TarBall
	Tbm     TarBallMaker
}

A Bundle represents the directory to be walked. Contains at least one TarBall if walk has started. Each TarBall will be at least MinSize bytes. The Sentinel is used to ensure complete uploaded backups; in this case, pg_control is used as the sentinel.

func (*Bundle) GetTarBall

func (b *Bundle) GetTarBall() TarBall

func (*Bundle) HandleLabelFiles

func (bundle *Bundle) HandleLabelFiles(conn *pgx.Conn) error

HandleLabelFiles creates the `backup_label` and `tablespace_map` files and uploads them to S3 by stopping the backup. Returns an error upon failure.

func (*Bundle) HandleSentinel

func (bundle *Bundle) HandleSentinel() error

HandleSentinel uploads the compressed tar file of `pg_control`. Will only be called after the rest of the backup is successfully uploaded to S3. Returns an error upon failure.

func (*Bundle) NewTarBall

func (b *Bundle) NewTarBall()

func (*Bundle) TarWalker

func (bundle *Bundle) TarWalker(path string, info os.FileInfo, err error) error

TarWalker walks files provided by the passed in directory and creates compressed tar members labeled as `part_00i.tar.lzo`.

To see which files and directories are skipped, please consult 'structs.go'. Excluded directories will be created but their contents will not be included in the tar bundle.

type Empty

type Empty struct{}

Empty is used for channel signaling.

type EmptyWriteIgnorer

type EmptyWriteIgnorer struct {
	io.WriteCloser
}

EmptyWriteIgnorer handles 0 byte write in LZ4 package to stop pipe reader/writer from blocking.

func (EmptyWriteIgnorer) Write

func (e EmptyWriteIgnorer) Write(p []byte) (int, error)
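
The idea of swallowing zero-byte writes can be shown with a small wrapper around any io.Writer. The sketch below is an illustration under assumptions, not the package's code: skipEmpty and countingWriter are hypothetical names, and the wrapper here wraps an io.Writer rather than an io.WriteCloser.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// countingWriter records how many times Write reaches it.
type countingWriter struct {
	bytes.Buffer
	calls int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	return c.Buffer.Write(p)
}

// skipEmpty is a hypothetical wrapper that drops zero-length writes so
// they never reach the underlying writer (for example a pipe).
type skipEmpty struct {
	w io.Writer
}

func (s skipEmpty) Write(p []byte) (int, error) {
	if len(p) == 0 {
		return 0, nil // nothing to forward
	}
	return s.w.Write(p)
}

func main() {
	cw := &countingWriter{}
	w := skipEmpty{w: cw}
	w.Write(nil)           // ignored
	w.Write([]byte("wal")) // forwarded
	fmt.Println(cw.calls, cw.String())
}
```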

type ExponentialTicker

type ExponentialTicker struct {
	MaxRetries int

	MaxWait float64
	// contains filtered or unexported fields
}

ExponentialTicker is used for exponential backoff for uploading to S3. If the max wait time is reached, retries will occur after max wait time intervals up to max retries.

func NewExpTicker

func NewExpTicker(retries int, wait float64) *ExponentialTicker

NewExpTicker creates a new ExponentialTicker with configurable max number of retries and max wait time.

func (*ExponentialTicker) Sleep

func (et *ExponentialTicker) Sleep()

Sleep will wait in seconds.

func (*ExponentialTicker) Update

func (et *ExponentialTicker) Update()

Update increases running count of retries by 1 and exponentially increases the wait time until the max wait time is reached.
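
A capped exponential backoff of this kind can be sketched as below. The type and method names (backoff, update, exhausted), the starting wait of 1 second, and the doubling factor are assumptions for illustration; only the limits of 7 retries and 32 seconds come from MAXRETRIES and MAXBACKOFF above.

```go
package main

import (
	"fmt"
	"math"
)

// backoff is a hypothetical stand-in for an exponential ticker.
type backoff struct {
	maxRetries int
	maxWait    float64 // seconds
	retries    int
	wait       float64 // seconds
}

// newBackoff starts with a wait of 1 second (an assumption).
func newBackoff(maxRetries int, maxWait float64) *backoff {
	return &backoff{maxRetries: maxRetries, maxWait: maxWait, wait: 1}
}

// update counts one more retry and doubles the wait, capped at maxWait.
func (b *backoff) update() {
	b.retries++
	b.wait = math.Min(b.wait*2, b.maxWait)
}

// exhausted reports whether the retry budget has been used up.
func (b *backoff) exhausted() bool {
	return b.retries >= b.maxRetries
}

func main() {
	b := newBackoff(7, 32)
	for !b.exhausted() {
		fmt.Printf("retry %d: wait %.0fs\n", b.retries+1, b.wait)
		// A real retry loop would sleep here, for example with
		// time.Sleep(time.Duration(b.wait * float64(time.Second))).
		b.update()
	}
}
```

Once the wait reaches the cap it stays there, so the remaining retries happen at fixed max-wait intervals.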

type FileTarInterpreter

type FileTarInterpreter struct {
	NewDir string
}

FileTarInterpreter extracts input to disk.

func (*FileTarInterpreter) Interpret

func (ti *FileTarInterpreter) Interpret(tr io.Reader, cur *tar.Header) error

Interpret extracts a tar file to disk and creates needed directories. Returns the first error encountered. Calls fsync after each file is written successfully.

type Lz4CascadeClose

type Lz4CascadeClose struct {
	*lz4.Writer
	Underlying io.WriteCloser
}

Lz4CascadeClose bundles multiple closures into one function. Calling Close() will close the lz4 and underlying writer.

func (*Lz4CascadeClose) Close

func (lcc *Lz4CascadeClose) Close() error

Close returns the first encountered error from closing the lz4 writer or the underlying writer.
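
The cascading close can be illustrated without an LZ4 dependency by using two plain io.Closer values. This is a sketch under assumptions, not the package's implementation: cascadeCloser, fakeCloser, and errFlush are hypothetical, and closing the inner writer even when the outer close fails is a design choice of this example.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errFlush is a made-up error used only by this example.
var errFlush = errors.New("flush failed")

// fakeCloser remembers whether it was closed and returns a preset error.
type fakeCloser struct {
	err    error
	closed bool
}

func (f *fakeCloser) Close() error {
	f.closed = true
	return f.err
}

// cascadeCloser closes outer (e.g. a compressing writer) and then inner
// (e.g. the writer underneath it) with one call.
type cascadeCloser struct {
	outer io.Closer
	inner io.Closer
}

// Close always closes both closers and returns the first error seen.
func (c cascadeCloser) Close() error {
	errOuter := c.outer.Close()
	errInner := c.inner.Close()
	if errOuter != nil {
		return errOuter
	}
	return errInner
}

func main() {
	outer := &fakeCloser{err: errFlush}
	inner := &fakeCloser{}
	err := cascadeCloser{outer: outer, inner: inner}.Close()
	fmt.Println(err, outer.closed, inner.closed)
}
```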

type Lz4Error

type Lz4Error struct {
	// contains filtered or unexported fields
}

Lz4Error is used to catch specific errors from Lz4PipeWriter when uploading to S3. Will not retry upload if this error occurs.

func (Lz4Error) Error

func (e Lz4Error) Error() string

type LzPipeWriter

type LzPipeWriter struct {
	Input  io.Reader
	Output io.Reader
}

LzPipeWriter allows for flexibility of using compressed output. Input is read and compressed to a pipe reader.

func (*LzPipeWriter) Compress

func (p *LzPipeWriter) Compress()

Compress compresses input to a pipe reader. Output must be used or pipe will block.
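
The reason the output must be consumed is that an io.Pipe has no internal buffer: each write blocks until a reader takes the data. The sketch below shows the general pattern using gzip from the standard library as a stand-in for LZ4. The function names compressToPipe and roundTrip are hypothetical and this is not the LzPipeWriter code.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
)

// compressToPipe compresses input in a background goroutine and returns
// the read end of a pipe. gzip is used here in place of LZ4.
func compressToPipe(input io.Reader) io.Reader {
	pr, pw := io.Pipe()
	go func() {
		zw := gzip.NewWriter(pw)
		_, err := io.Copy(zw, input)
		if cerr := zw.Close(); err == nil {
			err = cerr
		}
		// A nil error closes the pipe normally, so the reader sees io.EOF.
		pw.CloseWithError(err)
	}()
	return pr
}

// roundTrip compresses s through the pipe and decompresses it again.
func roundTrip(s string) (string, error) {
	zr, err := gzip.NewReader(compressToPipe(strings.NewReader(s)))
	if err != nil {
		return "", err
	}
	var out bytes.Buffer
	if _, err := io.Copy(&out, zr); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	got, err := roundTrip("write-ahead log segment")
	fmt.Println(got, err)
}
```

If nothing reads from the returned reader, the goroutine stays blocked on its first write, which is the blocking behavior the description warns about.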

type NoMatchAvailableError

type NoMatchAvailableError struct {
	// contains filtered or unexported fields
}

NoMatchAvailableError is used to signal no match found in string.

func (NoMatchAvailableError) Error

func (e NoMatchAvailableError) Error() string

type Prefix

type Prefix struct {
	Svc    s3iface.S3API
	Bucket *string
	Server *string
}

Prefix contains the S3 service client, bucket, and server.

type RaskyReader

type RaskyReader struct {
	R io.Reader
}

RaskyReader handles cases when the Rasky lzo package crashes. Occurs if byte size is too small (1-5).

func (*RaskyReader) Read

func (r *RaskyReader) Read(p []byte) (int, error)

Read ensures all bytes are read for the Rasky package.

type ReaderMaker

type ReaderMaker interface {
	Reader() (io.ReadCloser, error)
	Format() string
	Path() string
}

ReaderMaker is the generic interface used by extract. It allows for ease of handling different file formats.

type S3ReaderMaker

type S3ReaderMaker struct {
	Backup     *Backup
	Key        *string
	FileFormat string
}

S3ReaderMaker handles cases where backups need to be read from S3.

func (*S3ReaderMaker) Format

func (s *S3ReaderMaker) Format() string

func (*S3ReaderMaker) Path

func (s *S3ReaderMaker) Path() string

func (*S3ReaderMaker) Reader

func (s *S3ReaderMaker) Reader() (io.ReadCloser, error)

Reader creates a new S3 reader for each S3 object.

type S3TarBall

type S3TarBall struct {
	// contains filtered or unexported fields
}

S3TarBall represents a tar file that is going to be uploaded to S3.

func (*S3TarBall) BaseDir

func (s *S3TarBall) BaseDir() string

func (*S3TarBall) CloseTar

func (s *S3TarBall) CloseTar() error

CloseTar closes the tar writer, flushing any unwritten data to the underlying writer before also closing the underlying writer.

func (*S3TarBall) Finish

func (s *S3TarBall) Finish() error

Finish writes an empty .json file and uploads it with the backup name. Finish will wait until all tar file parts have been uploaded. The json file will only be uploaded if all other parts of the backup are present in S3. Otherwise, an alert is given with the corresponding error.

func (*S3TarBall) Nop

func (s *S3TarBall) Nop() bool

func (*S3TarBall) Number

func (s *S3TarBall) Number() int

func (*S3TarBall) SetSize

func (s *S3TarBall) SetSize(i int64)

func (*S3TarBall) SetUp

func (s *S3TarBall) SetUp(names ...string)

SetUp creates a new tar writer and starts upload to S3. Upload will block until the tar file is finished writing. If a name for the file is not given, default name is of the form `part_....tar.lz4`.

func (*S3TarBall) Size

func (s *S3TarBall) Size() int64

func (*S3TarBall) StartUpload

func (s *S3TarBall) StartUpload(name string) io.WriteCloser

StartUpload creates a lz4 writer and runs upload in the background once a compressed tar member is finished writing.

func (*S3TarBall) Trim

func (s *S3TarBall) Trim() string

func (*S3TarBall) Tw

func (s *S3TarBall) Tw() *tar.Writer

type S3TarBallMaker

type S3TarBallMaker struct {
	BaseDir  string
	Trim     string
	BkupName string
	Tu       *TarUploader
	// contains filtered or unexported fields
}

S3TarBallMaker creates tarballs that are uploaded to S3.

func (*S3TarBallMaker) Make

func (s *S3TarBallMaker) Make() TarBall

Make returns a tarball with required S3 fields.

type Sentinel

type Sentinel struct {
	Info os.FileInfo
	// contains filtered or unexported fields
}

Sentinel is used to signal completion of a walked directory.

type TarBall

type TarBall interface {
	SetUp(args ...string)
	CloseTar() error
	Finish() error
	BaseDir() string
	Trim() string
	Nop() bool
	Number() int
	Size() int64
	SetSize(int64)
	Tw() *tar.Writer
}

A TarBall represents one tar file.

type TarBallMaker

type TarBallMaker interface {
	Make() TarBall
}

TarBallMaker is used to allow for flexible creation of different TarBalls.

type TarBundle

type TarBundle interface {
	NewTarBall()
	GetTarBall() TarBall
}

TarBundle represents one completed directory.

type TarInterpreter

type TarInterpreter interface {
	Interpret(r io.Reader, hdr *tar.Header) error
}

TarInterpreter behaves differently for different file types.

type TarUploader

type TarUploader struct {
	Upl        s3manageriface.UploaderAPI
	MaxRetries int
	MaxWait    float64
	Success    bool
	// contains filtered or unexported fields
}

TarUploader contains fields associated with uploading tarballs. Multiple tarballs can share one uploader. Must call CreateUploader() in 'upload.go'.

func NewTarUploader

func NewTarUploader(svc s3iface.S3API, bucket, server, region string, r int, w float64) *TarUploader

NewTarUploader creates a new tar uploader without the actual S3 uploader. CreateUploader() is used to configure byte size and concurrency streams for the uploader.

func (*TarUploader) Finish

func (tu *TarUploader) Finish()

Finish waits for all waiting parts to be uploaded. If an error occurs, prints alert to stderr.

func (*TarUploader) UploadWal

func (tu *TarUploader) UploadWal(path string) (string, error)

UploadWal compresses a WAL file using LZ4 and uploads to S3. Returns the first error encountered and an empty string upon failure.

type TimeSlice

type TimeSlice []BackupTime

TimeSlice is a slice of BackupTime values, each pairing a backup with its last modified time.

func (TimeSlice) Len

func (p TimeSlice) Len() int

func (TimeSlice) Less

func (p TimeSlice) Less(i, j int) bool

func (TimeSlice) Swap

func (p TimeSlice) Swap(i, j int)
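
Len, Less, and Swap together satisfy sort.Interface, which is how a slice like this can be ordered by modification time. The sketch below mirrors the idea with local types. The lowercase type names, the latest helper, the sample data, and the newest-first ordering in Less are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// backupTime and timeSlice mirror the shape of BackupTime and TimeSlice.
type backupTime struct {
	Name string
	Time time.Time
}

type timeSlice []backupTime

func (p timeSlice) Len() int { return len(p) }

// Less orders newer backups first (an assumption of this sketch).
func (p timeSlice) Less(i, j int) bool { return p[i].Time.After(p[j].Time) }

func (p timeSlice) Swap(i, j int) { p[i], p[j] = p[j], p[i] }

// latest returns the name of the most recently modified backup.
func latest(backups []backupTime) (string, bool) {
	if len(backups) == 0 {
		return "", false
	}
	sorted := append(timeSlice(nil), backups...) // copy, keep input intact
	sort.Sort(sorted)
	return sorted[0].Name, true
}

// sampleBackups returns made-up data for the example.
func sampleBackups() []backupTime {
	base := time.Date(2017, time.August, 18, 0, 0, 0, 0, time.UTC)
	return []backupTime{
		{Name: "backup-a", Time: base},
		{Name: "backup-c", Time: base.Add(48 * time.Hour)},
		{Name: "backup-b", Time: base.Add(24 * time.Hour)},
	}
}

func main() {
	name, ok := latest(sampleBackups())
	fmt.Println(name, ok)
}
```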

type UnsetEnvVarError

type UnsetEnvVarError struct {
	// contains filtered or unexported fields
}

UnsetEnvVarError is used to indicate that required environment variables for WAL-G are not set.

func (UnsetEnvVarError) Error

func (e UnsetEnvVarError) Error() string

type UnsupportedFileTypeError

type UnsupportedFileTypeError struct {
	Path       string
	FileFormat string
}

UnsupportedFileTypeError is used to signal file types that are unsupported by WAL-G.

func (UnsupportedFileTypeError) Error

func (e UnsupportedFileTypeError) Error() string

type WalFiles

type WalFiles interface {
	CheckExistence() (bool, error)
}

WalFiles represent any file generated by WAL-G.

type ZeroReader

type ZeroReader struct{}

ZeroReader generates a slice of zeroes. Used to pad tar in cases where length of file changes.

func (*ZeroReader) Read

func (z *ZeroReader) Read(p []byte) (int, error)
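
A reader of zeroes is handy for padding output to an expected length, for instance when a file turns out shorter than the size already declared for it. The following sketch shows the idea. The names zeroReader and pad are hypothetical, and this is not the package's padding logic.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// zeroReader fills every buffer it is given with zero bytes and never
// reports io.EOF.
type zeroReader struct{}

func (zeroReader) Read(p []byte) (int, error) {
	for i := range p {
		p[i] = 0
	}
	return len(p), nil
}

// pad returns data followed by enough zero bytes to reach size. If data
// is already size bytes or longer it is returned without padding.
func pad(data []byte, size int64) ([]byte, error) {
	var buf bytes.Buffer
	buf.Write(data)
	missing := size - int64(len(data))
	if missing > 0 {
		// io.CopyN stops after exactly 'missing' bytes.
		if _, err := io.CopyN(&buf, zeroReader{}, missing); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

func main() {
	out, err := pad([]byte("abc"), 8)
	fmt.Println(len(out), out, err)
}
```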

Directories

Path Synopsis
cmd
