logdb

v0.0.0-...-106869f
Published: Aug 26, 2016 License: MIT Imports: 16 Imported by: 0

README

logdb

LogDB is a Go library for efficient log-structured databases. A log-structured database is a very simple data store where writes are only ever appended to the database; there are no random-access writes at all. To prevent the database from growing indefinitely, a contiguous chunk of entries can be removed from either the beginning or the end.

This library is efficient and provides consistency guarantees: an entry is either stored or it is not. Even in the event of power loss during execution, the database cannot be left in an inconsistent state.

The godoc is available online.

Project Status

Very early days. The API is unstable, and everything is in flux.

Data Consistency

The guiding principle for consistency and correctness is that, no matter where the program is interrupted, there should never be silent corruption in the database. In particular, this means:

  • If an Append is interrupted, or a Sync following an Append is interrupted, the entry will either be there or not: if the entry is accessible, it will have the expected contents.

  • If an AppendEntries is interrupted, or a Sync following an AppendEntries is interrupted, some of the entries may not be there, but there will be no gaps: if any entries are there, they will be an initial portion of the appended ones, with the expected contents (as with Append).

  • If a Forget, Rollback, or Truncate is interrupted, or a Sync following one of those is interrupted, some of the entries may remain: but there will be no gaps; it won't be possible to access both entry x and entry y, unless all entries between x and y are also accessible.
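The "no gaps" guarantee in the list above amounts to saying that the accessible IDs always form a single contiguous range. As a minimal sketch, a test harness could check this after a simulated interruption with something like the following. The helper name isContiguous is hypothetical and not part of the library.

```go
package main

import "fmt"

// isContiguous reports whether ids, given in ascending order, form a
// gap-free run such as 4, 5, 6. An empty or single-element slice counts
// as contiguous.
func isContiguous(ids []uint64) bool {
	for i := 1; i < len(ids); i++ {
		if ids[i] != ids[i-1]+1 {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isContiguous([]uint64{4, 5, 6})) // true
	fmt.Println(isContiguous([]uint64{4, 6}))    // false
}
```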

As the database is so simple, ensuring this data consistency isn't the great challenge it is in more fully-featured database systems. Care is taken to sync chunk data files before writing out chunk metadata files, and metadata files are implemented as an append-only log. The one non-append-only piece of metadata is the ID of the oldest visible entry in the database, which, due to a Forget, may be newer than the ID of the oldest entry still stored in the database. If that value is corrupted or lost, a sensible default can be recovered.
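The ordering described above (make the data durable first, then record it in append-only metadata) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: the file names chunk.dat and chunk.meta, the helper appendDurably, and the metadata layout (one little-endian int64 ending offset per entry) are all hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
	"path/filepath"
)

// appendDurably appends entry to a data file, syncs it, and only then
// records the entry's ending offset in an append-only metadata file.
// File names and metadata layout are assumptions of this sketch.
func appendDurably(dir string, entry []byte) (int64, error) {
	flags := os.O_CREATE | os.O_WRONLY | os.O_APPEND

	data, err := os.OpenFile(filepath.Join(dir, "chunk.dat"), flags, 0o644)
	if err != nil {
		return 0, err
	}
	defer data.Close()
	if _, err := data.Write(entry); err != nil {
		return 0, err
	}
	// Step 1: make the entry bytes durable before anything refers to them.
	if err := data.Sync(); err != nil {
		return 0, err
	}
	info, err := data.Stat()
	if err != nil {
		return 0, err
	}
	end := info.Size()

	meta, err := os.OpenFile(filepath.Join(dir, "chunk.meta"), flags, 0o644)
	if err != nil {
		return 0, err
	}
	defer meta.Close()
	// Step 2: append the ending offset. A crash before this point leaves
	// the entry unreferenced rather than referenced but missing.
	if err := binary.Write(meta, binary.LittleEndian, end); err != nil {
		return 0, err
	}
	return end, meta.Sync()
}

// demo appends two entries in a temporary directory and returns the
// ending offsets that were recorded for them.
func demo() (int64, int64, error) {
	dir, err := os.MkdirTemp("", "logdb-sketch")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)
	a, err := appendDurably(dir, []byte("abc"))
	if err != nil {
		return 0, 0, err
	}
	b, err := appendDurably(dir, []byte("hello"))
	return a, b, err
}

func main() {
	a, b, err := demo()
	fmt.Println(a, b, err)
}
```

Because the metadata is written second, an interrupted append can at worst leave trailing data bytes that no metadata record points at, which a reader can ignore.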

If it is impossible to unambiguously and safely open a database, an error is returned. Otherwise, automatic recovery is performed. If an error occurs, please file a bug report including the error message and a description of what was being done to the database when the process terminated, as this shouldn't happen without external tampering.

Contributing

Bug reports, pull requests, and comments are very welcome!

Feel free to contact me on GitHub, through IRC (on freenode), or email (mike@barrucadu.co.uk).

Documentation

Overview

Package logdb provides an efficient log-structured database supporting efficient insertion of new entries and removal from either end of the log.

This provides a number of interfaces and types, and a lot of errors.

  • 'LogDB' is the main interface for a log-structured database.
  • 'PersistDB' is an interface for databases which can be persisted in some way.
  • 'BoundedDB' is an interface for databases with a fixed maximum entry size.
  • 'CloseDB' is an interface for databases which can be closed.

The 'LockFreeChunkDB' and 'ChunkDB' types implement all of these interfaces, and are created with 'Open' and 'WrapForConcurrency' respectively. As the names suggest, the difference is the thread-safety. A 'LockFreeChunkDB' is only safe for single-threaded access, whereas a 'ChunkDB' wraps it and adds locking for safe concurrent access. Additionally, the 'InMemDB' type implements the 'LogDB' interface using a purely in-memory store.
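The relationship between an unlocked store and a locking wrapper can be illustrated with a small, self-contained sketch. The types plainLog and lockedLog are hypothetical stand-ins, and a single mutex is an assumption of this example; how 'ChunkDB' actually locks is not stated here.

```go
package main

import (
	"fmt"
	"sync"
)

// plainLog is a hypothetical stand-in for a store with no internal locking.
type plainLog struct{ entries [][]byte }

// Append stores the entry and returns its 1-based position.
func (l *plainLog) Append(entry []byte) uint64 {
	l.entries = append(l.entries, entry)
	return uint64(len(l.entries))
}

// lockedLog serialises every call to the wrapped plainLog with a mutex.
type lockedLog struct {
	mu    sync.Mutex
	inner *plainLog
}

func (l *lockedLog) Append(entry []byte) uint64 {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.inner.Append(entry)
}

// concurrentAppends appends one entry from each of n goroutines through
// the locking wrapper and returns the final number of stored entries.
func concurrentAppends(n int) int {
	db := &lockedLog{inner: &plainLog{}}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			db.Append([]byte("entry"))
		}()
	}
	wg.Wait()
	return len(db.inner.entries)
}

func main() {
	fmt.Println(concurrentAppends(100)) // 100
}
```

Calling plainLog.Append directly from several goroutines would be a data race, which is why the wrapped value should not be touched while the wrapper is in use.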

Index

Constants

This section is empty.

Variables

var (
	// ErrIDOutOfRange means that the requested ID is not present in the log.
	ErrIDOutOfRange = errors.New("log ID out of range")

	// ErrUnknownVersion means that the disk format version of an opened database is unknown.
	ErrUnknownVersion = errors.New("unknown disk format version")

	// ErrNotDirectory means that the path given to 'Open' exists and is not a directory.
	ErrNotDirectory = errors.New("database path not a directory")

	// ErrPathDoesntExist means that the path given to 'Open' does not exist and the 'create' flag was
	// false.
	ErrPathDoesntExist = errors.New("database directory does not exist")

	// ErrTooBig means that an entry could not be appended because it is larger than the chunk size.
	ErrTooBig = errors.New("entry larger than chunksize")

	// ErrClosed means that the database handle is closed.
	ErrClosed = errors.New("database is closed")

	// ErrEmptyNonfinalChunk means that the metadata for a non-final chunk has zero entries.
	ErrEmptyNonfinalChunk = errors.New("metadata of non-final chunk contains no entries")
)
var ErrNotValueSlice = errors.New("AppendValues must be called with a slice argument")

ErrNotValueSlice means that AppendValues was called with a non-slice argument.

Functions

This section is empty.

Types

type AtomicityError

type AtomicityError struct {
	AppendErr   error
	RollbackErr error
}

AtomicityError means that an error occurred while appending an entry in an 'AppendEntries' call, and attempting to roll back also gave an error. It wraps the actual errors.

func (*AtomicityError) Error

func (e *AtomicityError) Error() string

func (*AtomicityError) WrappedErrors

func (e *AtomicityError) WrappedErrors() []error

type BoundedDB

type BoundedDB interface {
	// 'BoundedDB' is an extension of 'LogDB'.
	LogDB

	// The maximum size of an entry. It is an error to try to insert an entry larger than this.
	MaxEntrySize() uint64
}

A BoundedDB has a maximum entry size. In addition to defining methods, a 'BoundedDB' changes the behaviour of 'Append' and 'AppendEntries': they now return 'ErrTooBig' if an appended entry is larger than the maximum size.

type ChunkContinuityError

type ChunkContinuityError struct {
	ChunkFilePath string
	Expected      uint64
	Actual        uint64
}

ChunkContinuityError means that two adjacent chunks do not contain a contiguous sequence of entries.

func (*ChunkContinuityError) Error

func (e *ChunkContinuityError) Error() string

type ChunkDB

type ChunkDB struct {
	// The underlying 'LockFreeChunkDB'. This is not safe for concurrent use with the 'ChunkDB'.
	*LockFreeChunkDB
	// contains filtered or unexported fields
}

ChunkDB is a 'LogDB' implementation using an on-disk format where entries are stored in fixed-size "chunks". It provides data consistency guarantees in the event of program failure: database entries are guaranteed to be contiguous, and if the database thinks an entry is there, it will contain non-corrupt data.

In addition to implementing the behaviour specified in the 'LogDB', 'PersistDB', 'BoundedDB', and 'CloseDB' interfaces, a 'Sync' is always performed if an entire chunk is forgotten, rolled back, or truncated; or if an append creates a new on-disk chunk.

func WrapForConcurrency

func WrapForConcurrency(db *LockFreeChunkDB) *ChunkDB

Wrap a 'LockFreeChunkDB' into a 'ChunkDB', which is safe for concurrent use. The underlying 'LockFreeChunkDB' should not be used while the returned 'ChunkDB' is live.

func (*ChunkDB) AppendEntries

func (db *ChunkDB) AppendEntries(entries [][]byte) (uint64, error)

AppendEntries implements the 'LogDB', 'PersistDB', 'BoundedDB', and 'CloseDB' interfaces.

func (*ChunkDB) Close

func (db *ChunkDB) Close() error

Close implements the 'CloseDB' interface. This also closes the underlying 'LockFreeChunkDB'.

func (*ChunkDB) Forget

func (db *ChunkDB) Forget(newOldestID uint64) error

Forget implements the 'LogDB', 'PersistDB', and 'CloseDB' interfaces.

func (*ChunkDB) Get

func (db *ChunkDB) Get(id uint64) ([]byte, error)

Get implements the 'LogDB' and 'CloseDB' interfaces.

func (*ChunkDB) Rollback

func (db *ChunkDB) Rollback(newNewestID uint64) error

Rollback implements the 'LogDB', 'PersistDB', and 'CloseDB' interfaces.

func (*ChunkDB) SetSync

func (db *ChunkDB) SetSync(every int) error

SetSync implements the 'PersistDB' and 'CloseDB' interfaces.

func (*ChunkDB) Sync

func (db *ChunkDB) Sync() error

Sync implements the 'PersistDB' and 'CloseDB' interfaces.

func (*ChunkDB) Truncate

func (db *ChunkDB) Truncate(newOldestID, newNewestID uint64) error

Truncate implements the 'LogDB', 'PersistDB', and 'CloseDB' interfaces.

type ChunkFileNameError

type ChunkFileNameError struct {
	FilePath string
}

ChunkFileNameError means that a filename is not valid for a chunk file.

func (*ChunkFileNameError) Error

func (e *ChunkFileNameError) Error() string

type ChunkMetaError

type ChunkMetaError struct {
	ChunkFilePath string
	Err           error
}

ChunkMetaError means that the metadata for a chunk could not be read. It wraps the actual error.

func (*ChunkMetaError) Error

func (e *ChunkMetaError) Error() string

func (*ChunkMetaError) WrappedErrors

func (e *ChunkMetaError) WrappedErrors() []error

type ChunkSizeError

type ChunkSizeError struct {
	ChunkFilePath string
	Expected      uint32
	Actual        uint32
}

ChunkSizeError means that a chunk file is not the expected size.

func (*ChunkSizeError) Error

func (e *ChunkSizeError) Error() string

type CloseDB

type CloseDB interface {
	// 'CloseDB' is an extension of 'LogDB'.
	LogDB

	// Close performs some database-specific clean-up. It is an error to try to use a database after
	// closing it.
	//
	// Returns 'ErrClosed' if the database is already closed.
	Close() error
}

A CloseDB can be closed, which may perform some clean-up.

If a 'CloseDB' is also a 'PersistDB', then 'Sync' should be called during 'Close'. In addition, all 'LogDB' and 'PersistDB' methods with an error return will fail with 'ErrClosed' after 'Close' has been called.

type CodingDB

type CodingDB struct {
	LogDB

	Encode func(interface{}) ([]byte, error)
	Decode func([]byte, interface{}) error
}

A CodingDB wraps a 'LogDB' with functions to encode and decode values of some sort, giving a higher-level interface than raw byte slices.

func BinaryCoder

func BinaryCoder(logdb LogDB, byteOrder binary.ByteOrder) *CodingDB

BinaryCoder creates a 'CodingDB' with the binary encoder/decoder. Values must be valid input for the 'binary.Write' function.
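As a rough illustration of what "valid input for 'binary.Write'" means, the following sketch round-trips a fixed-size struct through encoding/binary. The point type and the encode, decode, and roundTrip helpers are hypothetical; they have the same shape as the 'Encode' and 'Decode' fields of 'CodingDB' but are not taken from the library.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// point is a hypothetical value type. binary.Write only accepts fixed-size
// data, so every field must have a fixed size (no int, string, or slices
// of variable-size values).
type point struct {
	X, Y int32
}

// encode serialises v with the given byte order.
func encode(order binary.ByteOrder, v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, order, v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode fills v, which must be a pointer, from b.
func decode(order binary.ByteOrder, b []byte, v interface{}) error {
	return binary.Read(bytes.NewReader(b), order, v)
}

// roundTrip encodes p, decodes it again, and also reports the encoded size.
func roundTrip(p point) (point, int, error) {
	raw, err := encode(binary.LittleEndian, p)
	if err != nil {
		return point{}, 0, err
	}
	var out point
	err = decode(binary.LittleEndian, raw, &out)
	return out, len(raw), err
}

func main() {
	out, n, err := roundTrip(point{X: 3, Y: -7})
	fmt.Println(out, n, err) // {3 -7} 8 <nil>
}
```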

func GobCoder

func GobCoder(logdb LogDB) *CodingDB

GobCoder creates a 'CodingDB' with the gob encoder/decoder. Values must be valid input for the 'Encode' method of a 'gob.Encoder'.
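For comparison, a gob-based encode/decode pair can be sketched as follows. The event type and the helper names are hypothetical, and creating a fresh encoder per value is an assumption of this example rather than a description of what 'GobCoder' does internally.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// event is a hypothetical value type; gob handles variable-size fields
// such as strings, which binary.Write does not.
type event struct {
	Name  string
	Count int
}

// encodeGob serialises v with a fresh gob encoder.
func encodeGob(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeGob fills v, which must be a pointer, from b.
func decodeGob(b []byte, v interface{}) error {
	return gob.NewDecoder(bytes.NewReader(b)).Decode(v)
}

// roundTrip encodes e and decodes the bytes into a new event.
func roundTrip(e event) (event, error) {
	raw, err := encodeGob(e)
	if err != nil {
		return event{}, err
	}
	var out event
	err = decodeGob(raw, &out)
	return out, err
}

func main() {
	out, err := roundTrip(event{Name: "append", Count: 2})
	fmt.Println(out, err) // {append 2} <nil>
}
```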

func IdentityCoder

func IdentityCoder(logdb LogDB) *CodingDB

IdentityCoder creates a 'CodingDB' with the identity encoder/decoder. It is an error to append a value which is not a '[]byte'.

func (*CodingDB) AppendValue

func (db *CodingDB) AppendValue(value interface{}) (uint64, error)

AppendValue encodes a value using the encoder, and stores it in the underlying 'LogDB' if there is no error.

func (*CodingDB) AppendValues

func (db *CodingDB) AppendValues(values interface{}) (uint64, error)

AppendValues encodes a slice of values (represented as an 'interface{}', to make the casting simpler), and stores them in the underlying 'LogDB' if there is no error.

Returns 'ErrNotValueSlice' if called with a non-slice argument.
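Accepting a slice as a plain 'interface{}' means the individual elements have to be recovered with reflection. A minimal sketch of that step, assuming the reflect package is used; the helper toValues and the error errNotSlice are hypothetical names for this example.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// errNotSlice plays the role of a "not a slice" sentinel in this sketch.
var errNotSlice = errors.New("argument is not a slice")

// toValues splits a slice of any element type into its elements, or
// returns errNotSlice if values is not a slice at all.
func toValues(values interface{}) ([]interface{}, error) {
	rv := reflect.ValueOf(values)
	if rv.Kind() != reflect.Slice {
		return nil, errNotSlice
	}
	out := make([]interface{}, rv.Len())
	for i := range out {
		out[i] = rv.Index(i).Interface()
	}
	return out, nil
}

func main() {
	vals, err := toValues([]int{1, 2, 3})
	fmt.Println(vals, err) // [1 2 3] <nil>

	_, err = toValues(42)
	fmt.Println(err) // argument is not a slice
}
```

Each element could then be passed to the encoder in turn, which is why the caller does not need to convert a '[]T' into a '[]interface{}' by hand.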

func (*CodingDB) GetValue

func (db *CodingDB) GetValue(id uint64, data interface{}) error

GetValue retrieves a value from the underlying 'LogDB' and decodes it.

type CompressingDB

type CompressingDB struct {
	LogDB

	Compress   func([]byte) ([]byte, error)
	Decompress func([]byte) ([]byte, error)
}

A CompressingDB wraps a 'LogDB' with functions to compress and decompress entries, applied transparently during 'Append', 'AppendEntries', and 'Get'.

func CompressDEFLATE

func CompressDEFLATE(logdb LogDB, level int) (*CompressingDB, error)

CompressDEFLATE creates a 'CompressingDB' with DEFLATE compression at the given level.

Returns an error if the level is < -2 or > 9.
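A pair of functions with the same shape as the 'Compress' and 'Decompress' fields can be built on compress/flate, whose 'NewWriter' rejects levels outside the range -2 to 9. The following is a sketch under that assumption; newDeflater, inflate, and roundTrip are hypothetical helper names.

```go
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"io"
)

// newDeflater returns a compress function for the given level, or an
// error if flate rejects the level.
func newDeflater(level int) (func([]byte) ([]byte, error), error) {
	// Validate the level once, up front.
	if _, err := flate.NewWriter(io.Discard, level); err != nil {
		return nil, err
	}
	return func(in []byte) ([]byte, error) {
		var buf bytes.Buffer
		w, err := flate.NewWriter(&buf, level)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write(in); err != nil {
			return nil, err
		}
		// Close flushes the remaining compressed data into buf.
		if err := w.Close(); err != nil {
			return nil, err
		}
		return buf.Bytes(), nil
	}, nil
}

// inflate reverses the compression.
func inflate(in []byte) ([]byte, error) {
	r := flate.NewReader(bytes.NewReader(in))
	defer r.Close()
	return io.ReadAll(r)
}

// roundTrip compresses s at the given level and decompresses it again.
func roundTrip(s string, level int) (string, error) {
	compress, err := newDeflater(level)
	if err != nil {
		return "", err
	}
	packed, err := compress([]byte(s))
	if err != nil {
		return "", err
	}
	out, err := inflate(packed)
	return string(out), err
}

func main() {
	out, err := roundTrip("log log log log log", flate.BestCompression)
	fmt.Println(out, err)

	_, err = newDeflater(10)
	fmt.Println(err != nil) // true: 10 is outside the accepted range
}
```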

func CompressIdentity

func CompressIdentity(logdb LogDB) *CompressingDB

CompressIdentity creates a 'CompressingDB' with the identity compressor/decompressor.

func CompressLZW

func CompressLZW(logdb LogDB, order lzw.Order, litWidth int) (*CompressingDB, error)

CompressLZW creates a 'CompressingDB' with LZW compression with the given order and literal width.

Returns an error if the lit width is < 2 or > 8.
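The literal-width restriction mirrors compress/lzw, which works with widths from 2 to 8 bits. A minimal round-trip sketch, assuming LSB order and an explicit up-front range check; lzwRoundTrip and errLitWidth are hypothetical names.

```go
package main

import (
	"bytes"
	"compress/lzw"
	"errors"
	"fmt"
	"io"
)

// errLitWidth is returned by this sketch for widths outside 2..8.
var errLitWidth = errors.New("literal width must be between 2 and 8")

// lzwRoundTrip compresses in with the given literal width and then
// decompresses it again. With a width below 8, every input byte must be
// smaller than 1<<litWidth.
func lzwRoundTrip(in []byte, litWidth int) ([]byte, error) {
	if litWidth < 2 || litWidth > 8 {
		return nil, errLitWidth
	}
	var buf bytes.Buffer
	w := lzw.NewWriter(&buf, lzw.LSB, litWidth)
	if _, err := w.Write(in); err != nil {
		return nil, err
	}
	// Close flushes the final codes into buf.
	if err := w.Close(); err != nil {
		return nil, err
	}
	r := lzw.NewReader(&buf, lzw.LSB, litWidth)
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	out, err := lzwRoundTrip([]byte("abababababab"), 8)
	fmt.Println(string(out), err)

	_, err = lzwRoundTrip([]byte("x"), 9)
	fmt.Println(err) // literal width must be between 2 and 8
}
```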

func (*CompressingDB) Append

func (db *CompressingDB) Append(entry []byte) (uint64, error)

Append implements the 'LogDB' interface. If the underlying 'LogDB' is also a 'PersistDB', 'BoundedDB', or 'CloseDB' then those interfaces are also implemented.

With 'BoundedDB' the bounded entry size is that of the compressed byte array; so it may be possible to insert entries which are larger than the bound.

func (*CompressingDB) AppendEntries

func (db *CompressingDB) AppendEntries(entries [][]byte) (uint64, error)

AppendEntries implements the 'LogDB' interface. If the underlying 'LogDB' is also a 'PersistDB', 'BoundedDB', or 'CloseDB' then those interfaces are also implemented.

With 'BoundedDB' the bounded entry size is that of the compressed byte array; so it may be possible to insert entries which are larger than the bound.

func (*CompressingDB) Get

func (db *CompressingDB) Get(id uint64) ([]byte, error)

Get implements the 'LogDB' interface. If the underlying 'LogDB' is also a 'CloseDB' then that interface is also implemented.

With 'BoundedDB' the bounded entry size is that of the compressed byte array; so it may be possible to retrieve entries which are larger than the bound.

type DeleteError

type DeleteError struct{ Err error }

DeleteError means that a file could not be deleted from disk. It wraps the actual error.

func (*DeleteError) Error

func (e *DeleteError) Error() string

func (*DeleteError) WrappedErrors

func (e *DeleteError) WrappedErrors() []error

type FormatError

type FormatError struct {
	FilePath string
	Err      error
}

FormatError means that there is a problem with the database files. It wraps the actual error.

func (*FormatError) Error

func (e *FormatError) Error() string

func (*FormatError) WrappedErrors

func (e *FormatError) WrappedErrors() []error

type InMemDB

type InMemDB struct {
	// contains filtered or unexported fields
}

InMemDB is an in-memory 'LogDB' implementation. As it does not support persistence, it shouldn't be used in a production system. It is, however, helpful for benchmark comparisons as an absolute best case to compare against.

func (*InMemDB) Append

func (db *InMemDB) Append(entry []byte) (uint64, error)

Append implements the 'LogDB' interface.

func (*InMemDB) AppendEntries

func (db *InMemDB) AppendEntries(entries [][]byte) (uint64, error)

AppendEntries implements the 'LogDB' interface.

func (*InMemDB) Forget

func (db *InMemDB) Forget(newOldestID uint64) error

Forget implements the 'LogDB' interface.

func (*InMemDB) Get

func (db *InMemDB) Get(id uint64) ([]byte, error)

Get implements the 'LogDB' interface.

func (*InMemDB) NewestID

func (db *InMemDB) NewestID() uint64

NewestID implements the 'LogDB' interface.

func (*InMemDB) OldestID

func (db *InMemDB) OldestID() uint64

OldestID implements the 'LogDB' interface.

func (*InMemDB) Rollback

func (db *InMemDB) Rollback(newNewestID uint64) error

Rollback implements the 'LogDB' interface.

func (*InMemDB) Truncate

func (db *InMemDB) Truncate(newOldestID, newNewestID uint64) error

Truncate implements the 'LogDB' interface.

type LockError

type LockError struct{ Err error }

LockError means that the database files could not be locked. It wraps the actual error.

func (*LockError) Error

func (e *LockError) Error() string

func (*LockError) WrappedErrors

func (e *LockError) WrappedErrors() []error

type LockFreeChunkDB

type LockFreeChunkDB struct {
	// contains filtered or unexported fields
}

A LockFreeChunkDB is a 'ChunkDB' with no internal locks. It is NOT safe for concurrent use.

func Open

func Open(path string, chunkSize uint32, create bool) (*LockFreeChunkDB, error)

Open a 'LockFreeChunkDB' database.

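It is not possible to have multiple open references to the same database, as the files are locked. A 'LockFreeChunkDB' handle is not itself safe for concurrent use; to share one open handle between goroutines in a single process, wrap it with 'WrapForConcurrency'.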

The log is stored on disk in fixed-size files, controlled by the 'chunkSize' parameter. Entries are not split over chunks, and so if entries are a fixed size, the chunk size should be a multiple of that to avoid wasting space. Furthermore, no entry can be larger than the chunk size. There is a trade-off to be made: a chunk is only deleted when its entries do not overlap with the live entries at all (this happens through calls to 'Forget' and 'Rollback'), so a larger chunk size means fewer files, but longer persistence.

If the 'create' flag is true and the database doesn't already exist, the database is created using the given chunk size. If the database does exist, the chunk size parameter is ignored and the chunk size is detected automatically from the chunk files.
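The advice about choosing a chunk size that is a multiple of a fixed entry size is simple arithmetic. The sketch below assumes entries are packed back to back with no per-entry overhead inside the chunk file, which is an assumption of this example and not a statement about the on-disk format; chunkLayout and errTooBig are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// errTooBig plays the role of the "entry larger than chunk" error here.
var errTooBig = errors.New("entry larger than chunk size")

// chunkLayout returns how many fixed-size entries fit in one chunk and how
// many bytes of each chunk are left unused, since entries are never split
// across chunks.
func chunkLayout(chunkSize, entrySize uint32) (perChunk, wasted uint32, err error) {
	if entrySize == 0 {
		return 0, 0, errors.New("entry size must be positive")
	}
	if entrySize > chunkSize {
		return 0, 0, errTooBig
	}
	return chunkSize / entrySize, chunkSize % entrySize, nil
}

func main() {
	fmt.Println(chunkLayout(1024, 100)) // 10 24 <nil>
	fmt.Println(chunkLayout(1000, 100)) // 10 0 <nil>
	fmt.Println(chunkLayout(64, 100))   // 0 0 entry larger than chunk size
}
```

Under these assumptions, a 1024-byte chunk holding 100-byte entries leaves 24 bytes unused in every chunk, whereas a 1000-byte chunk leaves none.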

func (*LockFreeChunkDB) Append

func (db *LockFreeChunkDB) Append(entry []byte) (uint64, error)

Append implements the 'LogDB', 'PersistDB', 'BoundedDB', and 'CloseDB' interfaces.

func (*LockFreeChunkDB) AppendEntries

func (db *LockFreeChunkDB) AppendEntries(entries [][]byte) (uint64, error)

AppendEntries implements the 'LogDB', 'PersistDB', 'BoundedDB', and 'CloseDB' interfaces.

func (*LockFreeChunkDB) Close

func (db *LockFreeChunkDB) Close() error

Close implements the 'CloseDB' interface.

func (*LockFreeChunkDB) Forget

func (db *LockFreeChunkDB) Forget(newOldestID uint64) error

Forget implements the 'LogDB', 'PersistDB', and 'CloseDB' interfaces.

func (*LockFreeChunkDB) Get

func (db *LockFreeChunkDB) Get(id uint64) ([]byte, error)

Get implements the 'LogDB' and 'CloseDB' interfaces.

func (*LockFreeChunkDB) MaxEntrySize

func (db *LockFreeChunkDB) MaxEntrySize() uint64

MaxEntrySize implements the 'BoundedDB' interface.

func (*LockFreeChunkDB) NewestID

func (db *LockFreeChunkDB) NewestID() uint64

NewestID implements the 'LogDB' interface.

func (*LockFreeChunkDB) OldestID

func (db *LockFreeChunkDB) OldestID() uint64

OldestID implements the 'LogDB' interface.

func (*LockFreeChunkDB) Rollback

func (db *LockFreeChunkDB) Rollback(newNewestID uint64) error

Rollback implements the 'LogDB', 'PersistDB', and 'CloseDB' interfaces.

func (*LockFreeChunkDB) SetSync

func (db *LockFreeChunkDB) SetSync(every int) error

SetSync implements the 'PersistDB' and 'CloseDB' interfaces.

func (*LockFreeChunkDB) Sync

func (db *LockFreeChunkDB) Sync() error

Sync implements the 'PersistDB' and 'CloseDB' interfaces.

func (*LockFreeChunkDB) Truncate

func (db *LockFreeChunkDB) Truncate(newOldestID, newNewestID uint64) error

Truncate implements the 'LogDB', 'PersistDB', and 'CloseDB' interfaces.

type LogDB

type LogDB interface {
	// Append writes a new entry to the log and returns its ID.
	//
	// Returns a 'WriteError' value if the database files could not be written to.
	Append(entry []byte) (uint64, error)

	// AppendEntries atomically writes a collection of new entries to the log and returns the ID of the
	// first (the IDs are contiguous). If the slice is empty or nil, the returned ID is meaningless.
	//
	// Returns the same errors as 'Append', and an 'AtomicityError' value if any entry fails to
	// append and rolling back the log failed.
	AppendEntries(entries [][]byte) (uint64, error)

	// Get looks up an entry by ID.
	//
	// Returns 'ErrIDOutOfRange' if the requested ID is less than the oldest or greater than the
	// newest.
	Get(id uint64) ([]byte, error)

	// Forget removes the oldest entries from the log, making 'newOldestID' the oldest remaining ID.
	//
	// If the new "oldest" ID is older than the current, this is a no-op.
	//
	// Returns 'ErrIDOutOfRange' if the ID is newer than the "newest" ID.
	Forget(newOldestID uint64) error

	// Rollback removes the newest entries from the log, making 'newNewestID' the newest remaining ID.
	//
	// If the new "newest" ID is newer than the current, this is a no-op.
	//
	// Returns the same errors as 'Forget', with 'ErrIDOutOfRange' being returned if the ID is older
	// than the "oldest" ID.
	Rollback(newNewestID uint64) error

	// Truncate performs a 'Forget' followed by a 'Rollback' atomically. The semantics are that if
	// the 'Forget' fails, the 'Rollback' is not performed; but the 'Forget' is not undone either.
	//
	// Returns the same errors as 'Forget' and 'Rollback', and also an 'ErrIDOutOfRange' if the new
	// newest < the new oldest.
	Truncate(newOldestID, newNewestID uint64) error

	// OldestID gets the ID of the oldest log entry.
	//
	// For an empty database, this will return 0.
	OldestID() uint64

	// NewestID gets the ID of the newest log entry.
	//
	// For an empty database, this will return 0.
	NewestID() uint64
}

A LogDB is a log-structured database.
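To make the documented 'Forget' and 'Rollback' semantics concrete, here is a toy in-memory model. It is a sketch only and is not the library's 'InMemDB'; the memLog type and errIDOutOfRange are local to this example, and starting IDs at 1 is an assumption (the interface only states that an empty database reports 0 for both IDs).

```go
package main

import (
	"errors"
	"fmt"
)

var errIDOutOfRange = errors.New("log ID out of range")

// memLog is a toy model of the interface's semantics.
type memLog struct {
	first   uint64 // ID of entries[0]; this sketch assumes IDs start at 1
	entries [][]byte
}

func newMemLog() *memLog { return &memLog{first: 1} }

func (l *memLog) OldestID() uint64 {
	if len(l.entries) == 0 {
		return 0
	}
	return l.first
}

func (l *memLog) NewestID() uint64 {
	if len(l.entries) == 0 {
		return 0
	}
	return l.first + uint64(len(l.entries)) - 1
}

func (l *memLog) Append(entry []byte) uint64 {
	l.entries = append(l.entries, entry)
	return l.NewestID()
}

func (l *memLog) Get(id uint64) ([]byte, error) {
	if len(l.entries) == 0 || id < l.OldestID() || id > l.NewestID() {
		return nil, errIDOutOfRange
	}
	return l.entries[id-l.first], nil
}

// Forget drops every entry older than newOldestID.
func (l *memLog) Forget(newOldestID uint64) error {
	if newOldestID > l.NewestID() {
		return errIDOutOfRange
	}
	if newOldestID <= l.OldestID() {
		return nil // nothing older to drop: no-op
	}
	l.entries = l.entries[newOldestID-l.first:]
	l.first = newOldestID
	return nil
}

// Rollback drops every entry newer than newNewestID.
func (l *memLog) Rollback(newNewestID uint64) error {
	if newNewestID >= l.NewestID() {
		return nil // nothing newer to drop: no-op
	}
	if newNewestID < l.OldestID() {
		return errIDOutOfRange
	}
	l.entries = l.entries[:newNewestID-l.first+1]
	return nil
}

func main() {
	db := newMemLog()
	for _, s := range []string{"a", "b", "c", "d", "e"} {
		db.Append([]byte(s))
	}
	_ = db.Forget(3)   // IDs 1 and 2 are gone
	_ = db.Rollback(4) // ID 5 is gone
	entry, _ := db.Get(3)
	_, err := db.Get(5)
	fmt.Println(db.OldestID(), db.NewestID(), string(entry), err)
	// 3 4 c log ID out of range
}
```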

type MetaContinuityError

type MetaContinuityError struct {
	Expected int32
	Actual   int32
}

MetaContinuityError means that the metadata for a chunk does not contain a contiguous sequence of entries.

func (*MetaContinuityError) Error

func (e *MetaContinuityError) Error() string

type MetaOffsetError

type MetaOffsetError struct {
	Expected int32
	Actual   int32
}

MetaOffsetError means that the metadata for a chunk does not contain a monotonically increasing sequence of entry ending offsets.

func (*MetaOffsetError) Error

func (e *MetaOffsetError) Error() string

type PathError

type PathError struct{ Err error }

PathError means that a directory could not be created. It wraps the actual error.

func (*PathError) Error

func (e *PathError) Error() string

func (*PathError) WrappedErrors

func (e *PathError) WrappedErrors() []error

type PersistDB

type PersistDB interface {
	// 'PersistDB' is an extension of 'LogDB'.
	LogDB

	// SetSync configures the database to synchronise the data after touching (appending, forgetting,
	// or rolling back) at most this many entries.
	//
	// <0 disables periodic syncing, and 'Sync' must be called instead. The default value is 256.
	// Both 0 and 1 cause a 'Sync' after every write.
	//
	// Returns a 'SyncError' value if this triggered an immediate synchronisation which failed, and
	// 'ErrClosed' if the handle is closed.
	SetSync(every int) error

	// Sync persists the data now.
	//
	// May return a SyncError value, and 'ErrClosed' if the handle is closed.
	Sync() error
}

A PersistDB is a database which can be persisted in some fashion. In addition to defining methods, a 'PersistDB' changes some existing behaviours:

  • 'Append', 'AppendEntries', 'Forget', 'Rollback', and 'Truncate' can now cause a 'Sync', if 'SetSync' has been called.

  • The above may return a 'SyncError' value if a periodic synchronisation failed.
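The 'SetSync' threshold can be pictured as a counter of touched entries. The following is a sketch of one way such a policy could behave under the rules stated above (negative disables periodic syncing, 0 and 1 both sync after every write); the syncPolicy type and countSyncs helper are hypothetical, and the counter simply stands in for a real flush to disk.

```go
package main

import "fmt"

// syncPolicy counts touched entries and records when a sync would happen.
type syncPolicy struct {
	every   int // threshold; a negative value disables periodic syncing
	pending int // entries touched since the last sync
	syncs   int // number of syncs performed so far
}

// touch records that n entries were appended, forgotten, or rolled back,
// and performs a sync once the threshold has been reached.
func (p *syncPolicy) touch(n int) {
	p.pending += n
	if p.every >= 0 && p.pending >= p.every {
		p.syncs++ // a real store would flush its files here
		p.pending = 0
	}
}

// countSyncs performs the given number of single-entry writes and returns
// how many syncs the policy triggered.
func countSyncs(every, writes int) int {
	p := &syncPolicy{every: every}
	for i := 0; i < writes; i++ {
		p.touch(1)
	}
	return p.syncs
}

func main() {
	fmt.Println(countSyncs(3, 7))  // 2: after the 3rd and 6th writes
	fmt.Println(countSyncs(0, 5))  // 5: every write
	fmt.Println(countSyncs(1, 5))  // 5: every write
	fmt.Println(countSyncs(-1, 5)) // 0: only an explicit Sync would persist
}
```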

type ReadError

type ReadError struct{ Err error }

ReadError means that a read failed. It wraps the actual error.

func (*ReadError) Error

func (e *ReadError) Error() string

func (*ReadError) WrappedErrors

func (e *ReadError) WrappedErrors() []error

type SyncError

type SyncError struct{ Err error }

SyncError means that a file could not be synced to disk. It wraps the actual error.

func (*SyncError) Error

func (e *SyncError) Error() string

func (*SyncError) WrappedErrors

func (e *SyncError) WrappedErrors() []error

type WriteError

type WriteError struct{ Err error }

WriteError means that a write failed. It wraps the actual error.

func (*WriteError) Error

func (e *WriteError) Error() string

func (*WriteError) WrappedErrors

func (e *WriteError) WrappedErrors() []error

Directories

Path Synopsis
cmd
Package raft provides a wrapper making a 'LogDB' appropriate for use as a 'LogStore' for the github.com/hashicorp/raft library.
