types

package
v0.4.1
Warning

This package is not in the latest version of its module.

Published: Oct 20, 2023 License: MPL-2.0 Imports: 4 Imported by: 1

Documentation

Constants

This section is empty.

Variables

var (
	// ErrNotFound is our own version of raft's not found error. It's important
	// it's exactly the same because the raft lib checks for equality with its
	// own type as a crucial part of replication processing (detecting end of logs
	// and that a snapshot is needed for a follower).
	ErrNotFound = raft.ErrLogNotFound
	ErrCorrupt  = errors.New("WAL is corrupt")
	ErrSealed   = errors.New("segment is sealed")
	ErrClosed   = errors.New("closed")
)
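
As a rough illustration of how callers might branch on these sentinels, the sketch below uses errors.Is. The two variables are local stand-ins so the example compiles on its own (the real ErrNotFound aliases raft.ErrLogNotFound), and classify is a hypothetical helper, not part of this package. Because raft compares ErrNotFound by equality, an implementation would presumably return it unwrapped, while other errors can carry context.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the package's sentinel errors so this sketch is
// self-contained. Real code would use types.ErrNotFound and types.ErrCorrupt.
var (
	ErrNotFound = errors.New("log not found")
	ErrCorrupt  = errors.New("WAL is corrupt")
)

// classify is a hypothetical helper that maps an error to a short label.
// errors.Is also matches sentinels that were wrapped with %w.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrNotFound):
		return "not found"
	case errors.Is(err, ErrCorrupt):
		return "corrupt"
	default:
		return "other"
	}
}

func main() {
	wrapped := fmt.Errorf("reading segment 3: %w", ErrCorrupt)
	fmt.Println(classify(wrapped))
	fmt.Println(classify(ErrNotFound))
}
```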

Functions

This section is empty.

Types

type LogEntry

type LogEntry struct {
	Index uint64
	Data  []byte
}

LogEntry represents an entry that has already been encoded.

type MetaStore

type MetaStore interface {
	// Load loads the existing persisted state. If there is no existing state,
	// implementations are expected to initialize new storage and return an
	// empty state.
	Load(dir string) (PersistentState, error)

	// CommitState must atomically replace all persisted metadata in the current
	// store with the set provided. It must not return until the data is persisted
	// durably and in a crash-safe way, otherwise the guarantees of the WAL will be
	// compromised. The WAL will only ever call this from a single thread at a
	// time, and it will never be called concurrently with Load. However, it may
	// be called concurrently with Get/SetStable operations.
	CommitState(PersistentState) error

	// GetStable returns a value from stable store or nil if it doesn't exist. May
	// be called concurrently by multiple threads.
	GetStable(key []byte) ([]byte, error)

	// SetStable stores a value in the stable store. May be called concurrently
	// with GetStable.
	SetStable(key, value []byte) error

	io.Closer
}

MetaStore is the interface to the persistent, crash-safe backend the WAL needs for its metadata. We implement it with BoltDB for real usage, but the interface allows alternatives to be used, or tests to mock out FS access.
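
For the test-mocking case mentioned above, a minimal in-memory sketch might look like the following. The memMetaStore type and the trimmed local copies of SegmentInfo and PersistentState are assumptions made so the example stands alone; it provides no durability, so it is only suitable for tests.

```go
package main

import (
	"fmt"
	"sync"
)

// Trimmed local copies of the package types, for a self-contained sketch.
type SegmentInfo struct {
	ID        uint64
	BaseIndex uint64
}

type PersistentState struct {
	NextSegmentID uint64
	Segments      []SegmentInfo
}

// memMetaStore is a hypothetical in-memory MetaStore. Nothing is persisted.
type memMetaStore struct {
	mu     sync.Mutex
	state  PersistentState
	stable map[string][]byte
}

// copyState returns a deep copy so callers cannot mutate stored segments.
// The caller must hold m.mu.
func (m *memMetaStore) copyState() PersistentState {
	return PersistentState{
		NextSegmentID: m.state.NextSegmentID,
		Segments:      append([]SegmentInfo(nil), m.state.Segments...),
	}
}

// Load returns the current state, which is empty for a fresh store.
func (m *memMetaStore) Load(dir string) (PersistentState, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.copyState(), nil
}

// CommitState replaces the whole state in one step under the lock.
func (m *memMetaStore) CommitState(s PersistentState) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.state = PersistentState{
		NextSegmentID: s.NextSegmentID,
		Segments:      append([]SegmentInfo(nil), s.Segments...),
	}
	return nil
}

// GetStable returns nil (and no error) when the key does not exist.
func (m *memMetaStore) GetStable(key []byte) ([]byte, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	v, ok := m.stable[string(key)]
	if !ok {
		return nil, nil
	}
	return append([]byte(nil), v...), nil
}

// SetStable stores a copy of value under key.
func (m *memMetaStore) SetStable(key, value []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.stable == nil {
		m.stable = make(map[string][]byte)
	}
	m.stable[string(key)] = append([]byte(nil), value...)
	return nil
}

func (m *memMetaStore) Close() error { return nil }

func main() {
	m := &memMetaStore{}
	_ = m.CommitState(PersistentState{NextSegmentID: 2, Segments: []SegmentInfo{{ID: 1, BaseIndex: 1}}})
	s, _ := m.Load("unused")
	fmt.Println(s.NextSegmentID, len(s.Segments))
}
```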

type PersistentState

type PersistentState struct {
	NextSegmentID uint64
	Segments      []SegmentInfo
}

PersistentState represents the WAL file metadata we need to store reliably to recover on restart.

type PooledBuffer

type PooledBuffer struct {
	Bs      []byte
	CloseFn func()
}

PooledBuffer is a wrapper that allows WAL to return read buffers to segment implementations when we're done decoding.

func (*PooledBuffer) Close

func (b *PooledBuffer) Close() error

Close implements io.Closer and returns the buffer to the pool. It should be called exactly once for each buffer when it's no longer needed. It's no longer safe to access Bs or any slice taken from it after the call.
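
The ownership rule above can be sketched with a sync.Pool. Everything here is illustrative: the Close body is an assumption (the page shows only its signature), and bufPool, readEntry and copyOut are hypothetical names. The point is that a caller needing the bytes after Close has to copy them first.

```go
package main

import (
	"fmt"
	"sync"
)

// PooledBuffer mirrors the documented struct. The Close body below is an
// assumed implementation that simply invokes CloseFn.
type PooledBuffer struct {
	Bs      []byte
	CloseFn func()
}

func (b *PooledBuffer) Close() error {
	if b.CloseFn != nil {
		b.CloseFn()
	}
	return nil
}

// bufPool is a hypothetical pool of reusable byte slices.
var bufPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 0, 1024)
		return &b
	},
}

// readEntry stands in for a segment read path: it fills a pooled slice and
// arranges for Close to hand the slice back to the pool.
func readEntry(payload string) *PooledBuffer {
	p := bufPool.Get().(*[]byte)
	bs := append((*p)[:0], payload...)
	return &PooledBuffer{
		Bs: bs,
		CloseFn: func() {
			*p = bs[:0]
			bufPool.Put(p)
		},
	}
}

// copyOut copies the bytes before closing, since Bs must not be used after
// Close returns.
func copyOut(payload string) []byte {
	buf := readEntry(payload)
	defer buf.Close()
	out := make([]byte, len(buf.Bs))
	copy(out, buf.Bs)
	return out
}

func main() {
	fmt.Println(string(copyOut("hello")))
}
```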

type ReadableFile

type ReadableFile interface {
	io.ReaderAt
	io.Closer
}

ReadableFile provides random read access to a file.

type SegmentFiler

type SegmentFiler interface {
	// Create adds a new segment with the given info and returns a writer or an
	// error.
	Create(info SegmentInfo) (SegmentWriter, error)

	// RecoverTail is called on an unsealed segment when re-opening the WAL. It
	// will attempt to recover from a possible crash. It will either return an
	// error, or return a valid SegmentWriter that is ready for further appends.
	// If the expected tail segment doesn't exist it must return an error wrapping
	// os.ErrNotExist.
	RecoverTail(info SegmentInfo) (SegmentWriter, error)

	// Open an already sealed segment for reading. Open may validate the file's
	// header and return an error if it doesn't match the expected info.
	Open(info SegmentInfo) (SegmentReader, error)

	// List returns the set of segment IDs currently stored. It's used by the WAL
	// on recovery to find any segment files that need to be deleted following an
	// unclean shutdown. The returned map is a map of ID -> BaseIndex. BaseIndex
	// is returned to allow subsequent Delete calls to be made.
	List() (map[uint64]uint64, error)

	// Delete removes the segment with given baseIndex and id if it exists. Note
	// that baseIndex is technically redundant since ID is unique on its own. But
	// in practice we name files (or keys) with both so that they sort correctly.
	// This interface allows a simpler implementation where we can just delete
	// the file if it exists without having to scan the underlying storage for it.
	Delete(baseIndex, ID uint64) error
}

SegmentFiler is the interface that provides the WAL with access to segments. It encapsulates creating and recovering segments and returning reader or writer interfaces to interact with them. Its main purpose is to abstract the core WAL logic from the actual encoding layer of segment files. You can think of it as a layer of abstraction above the VFS, which abstracts actual file system operations on files but knows nothing about the format. In tests, for example, we can implement a SegmentFiler that is far simpler than the real encoding/decoding layer on top of a VFS, even an in-memory VFS, which makes tests much simpler to write and run.
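
The Delete comment notes that files are named with both baseIndex and ID so that they sort correctly. One way this could work is sketched below. The segmentFileName helper and its exact format string are assumptions for illustration, not the documented on-disk naming. Zero-padding baseIndex to 20 digits (the width of the largest uint64) makes lexicographic order match numeric order, and Delete can compute the name directly from its two arguments.

```go
package main

import (
	"fmt"
	"sort"
)

// segmentFileName is a hypothetical naming scheme: a zero-padded decimal
// baseIndex followed by the hex ID. The format is an assumption.
func segmentFileName(baseIndex, id uint64) string {
	return fmt.Sprintf("%020d-%016x.wal", baseIndex, id)
}

func main() {
	names := []string{
		segmentFileName(1000, 3),
		segmentFileName(1, 1),
		segmentFileName(200, 2),
	}
	// Sorting the names as strings yields baseIndex order.
	sort.Strings(names)
	for _, n := range names {
		fmt.Println(n)
	}
}
```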

type SegmentInfo

type SegmentInfo struct {
	// ID uniquely identifies this segment file
	ID uint64

	// BaseIndex is the raft index of the first entry that will be written to the
	// segment.
	BaseIndex uint64

	// MinIndex is the logical lowest index that still exists in the segment. It
	// may be greater than BaseIndex if a head truncation has "deleted" a prefix
	// of the segment.
	MinIndex uint64

	// MaxIndex is the logical highest index that still exists in the segment. It
	// may be lower than the actual highest index if a tail truncation has
	// "deleted" a suffix of the segment. It is zero for unsealed segments and
	// only set on seal.
	MaxIndex uint64

	// Codec identifies the codec used to encode log entries. Codec values 0 to
	// 16k (i.e. the lower 16 bits) are reserved for internal future usage. Custom
	// codecs must be registered with an identifier higher than this which the
	// caller is responsible for ensuring uniquely identifies the specific version
	// of their codec used in any given log. uint64 provides sufficient space that
	// a randomly generated identifier is almost certainly unique.
	Codec uint64

	// IndexStart is the file offset where the index can be read from. It's 0
	// for tail segments and only set after a segment is sealed.
	IndexStart uint64

	// CreateTime records when the segment was first created.
	CreateTime time.Time

	// SealTime records when the segment was sealed. Zero indicates that it's not
	// sealed yet.
	SealTime time.Time

	// SizeLimit is the soft limit for the segment's size. The segment file may be
	// pre-allocated to this size on filesystems that support it. It is a soft
	// limit in the sense that the final Append usually takes the segment file
	// past this size before it is considered full and sealed.
	SizeLimit uint32
}

SegmentInfo is the metadata describing a single WAL segment.
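
To make the MinIndex and MaxIndex semantics concrete, here is a small sketch of a range check. The mayContain helper is hypothetical, and SegmentInfo is trimmed to the three fields it needs. It assumes MaxIndex == 0 means an unsealed tail whose upper bound cannot be known from metadata alone, so only the lower bound is checked in that case.

```go
package main

import "fmt"

// SegmentInfo is a trimmed local copy with only the fields used here.
type SegmentInfo struct {
	BaseIndex uint64
	MinIndex  uint64
	MaxIndex  uint64
}

// mayContain is a hypothetical helper reporting whether idx falls inside the
// segment's logical range according to its metadata.
func mayContain(s SegmentInfo, idx uint64) bool {
	if idx < s.MinIndex {
		return false // before the segment, or removed by head truncation
	}
	if s.MaxIndex == 0 {
		return true // unsealed tail: the writer must be asked
	}
	return idx <= s.MaxIndex
}

func main() {
	sealed := SegmentInfo{BaseIndex: 1, MinIndex: 5, MaxIndex: 10}
	tail := SegmentInfo{BaseIndex: 11, MinIndex: 11}
	fmt.Println(mayContain(sealed, 4), mayContain(sealed, 10), mayContain(tail, 1000))
}
```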

type SegmentReader

type SegmentReader interface {
	io.Closer

	// GetLog returns the raw log entry bytes associated with idx. If the log
	// doesn't exist in this segment ErrNotFound must be returned.
	GetLog(idx uint64) (*PooledBuffer, error)
}

SegmentReader wraps a ReadableFile to allow lookup of logs in an existing segment file. It's an interface to make testing core WAL simpler. The first call will always be validate which passes in the ReaderAt to be used for subsequent reads.
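
Since the interface exists to make testing simpler, a test double can be very small. The sketch below is an assumption-laden illustration: memSegment is a hypothetical type, ErrNotFound is a local stand-in for the package variable, and the PooledBuffer Close body is assumed. Note that ErrNotFound is returned unwrapped for any index outside the segment.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a local stand-in for types.ErrNotFound.
var ErrNotFound = errors.New("log not found")

// PooledBuffer mirrors the documented struct. The Close body is assumed.
type PooledBuffer struct {
	Bs      []byte
	CloseFn func()
}

func (b *PooledBuffer) Close() error {
	if b.CloseFn != nil {
		b.CloseFn()
	}
	return nil
}

// memSegment is a hypothetical in-memory SegmentReader holding entries for
// consecutive indexes starting at baseIndex.
type memSegment struct {
	baseIndex uint64
	entries   [][]byte
}

// GetLog returns the entry at idx, or ErrNotFound if idx is out of range.
func (s *memSegment) GetLog(idx uint64) (*PooledBuffer, error) {
	if idx < s.baseIndex || idx-s.baseIndex >= uint64(len(s.entries)) {
		return nil, ErrNotFound
	}
	// No pooling here, so CloseFn is left nil and Close is a no-op.
	return &PooledBuffer{Bs: s.entries[idx-s.baseIndex]}, nil
}

func (s *memSegment) Close() error { return nil }

func main() {
	seg := &memSegment{baseIndex: 10, entries: [][]byte{[]byte("a"), []byte("b")}}
	buf, err := seg.GetLog(11)
	if err == nil {
		fmt.Println(string(buf.Bs))
		buf.Close()
	}
	_, err = seg.GetLog(12)
	fmt.Println(err == ErrNotFound)
}
```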

type SegmentWriter

type SegmentWriter interface {
	io.Closer
	SegmentReader

	// Append adds one or more entries. It must not return until the entries are
	// durably stored otherwise raft's guarantees will be compromised. Append must
	// not be called concurrently with any other call to Sealed, Append or
	// ForceSeal.
	Append(entries []LogEntry) error

	// Sealed returns whether the segment is sealed or not. If it is it returns
	// true and the file offset that its index array starts at to be saved in
	// metadata. WAL will call this after every append so it should be relatively
	// cheap in the common case. This design allows the final Append to write out
	// the index or any additional data needed at seal time in the same fsync.
	// Sealed must not be called concurrently with any other call to Sealed,
	// Append or ForceSeal.
	Sealed() (bool, uint64, error)

	// ForceSeal causes the segment to become sealed by writing out an index
	// block. This is not used in the typical flow of append and rotation, but is
	// necessary during truncations where some suffix of the writer needs to be
	// truncated. Rather than manipulate what is on disk in a complex way, the WAL
	// will simply force seal it with whatever state it has already saved and then
	// open a new segment at the right offset for continued writing. ForceSeal may
	// be called on a segment that has already been sealed and should just return
	// the existing index offset in that case. (We don't actually rely on that
	// currently but it's easier not to assume we'll always call it at most once).
	// ForceSeal must not be called concurrently with any other call to Sealed,
	// Append or ForceSeal.
	ForceSeal() (uint64, error)

	// LastIndex returns the most recently persisted index in the log. It must
	// respond without blocking on Append since it's needed frequently by read
	// paths that may call it concurrently. Typically this will be loaded from an
	// atomic int. If the segment is empty, LastIndex should return zero.
	LastIndex() uint64
}

SegmentWriter manages appending logs to the tail segment of the WAL. It's an interface to make testing core WAL simpler. Every SegmentWriter will have either `init` or `recover` called once before any other methods. When either returns it must either return an error or be ready to accept new writes and reads.
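
The call pattern described in the method comments (Append, then Sealed after every append, with LastIndex readable without blocking) can be sketched with an in-memory fake. All names below (memWriter, appendBatch) are hypothetical, the writer covers only part of the interface, and the choices to seal once a size limit is crossed and to reject later appends with ErrSealed are assumptions rather than documented behavior.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// ErrSealed is a local stand-in for types.ErrSealed.
var ErrSealed = errors.New("segment is sealed")

type LogEntry struct {
	Index uint64
	Data  []byte
}

// memWriter is a hypothetical partial SegmentWriter. lastIndex is first so
// it stays 64-bit aligned for atomic access on 32-bit platforms.
type memWriter struct {
	lastIndex uint64
	size      uint64
	sizeLimit uint64
	sealed    bool
}

// Append records the entries and seals once the soft limit is crossed, so
// the final append may take the segment past sizeLimit.
func (w *memWriter) Append(entries []LogEntry) error {
	if w.sealed {
		return ErrSealed
	}
	for _, e := range entries {
		w.size += uint64(len(e.Data))
	}
	if n := len(entries); n > 0 {
		atomic.StoreUint64(&w.lastIndex, entries[n-1].Index)
	}
	if w.size >= w.sizeLimit {
		w.sealed = true
	}
	return nil
}

// Sealed is cheap: it only reports state that Append already computed. The
// returned offset pretends the index block starts right after the data.
func (w *memWriter) Sealed() (bool, uint64, error) {
	if !w.sealed {
		return false, 0, nil
	}
	return true, w.size, nil
}

// LastIndex reads atomically so readers never block on Append.
func (w *memWriter) LastIndex() uint64 { return atomic.LoadUint64(&w.lastIndex) }

// appendBatch shows the caller side: append, then ask whether to rotate.
func appendBatch(w *memWriter, entries []LogEntry) (rotate bool, indexStart uint64, err error) {
	if err := w.Append(entries); err != nil {
		return false, 0, err
	}
	return w.Sealed()
}

func main() {
	w := &memWriter{sizeLimit: 8}
	rotate, _, _ := appendBatch(w, []LogEntry{{Index: 1, Data: []byte("abcd")}})
	fmt.Println(rotate, w.LastIndex())
	rotate, off, _ := appendBatch(w, []LogEntry{{Index: 2, Data: []byte("efghij")}})
	fmt.Println(rotate, off, w.LastIndex())
}
```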

type VFS

type VFS interface {
	// ListDir returns a list of all files in the specified dir in lexicographical
	// order. If the dir doesn't exist, it must return an error. Empty array with
	// nil error is assumed to mean that the directory exists and was readable,
	// but contains no files.
	ListDir(dir string) ([]string, error)

	// Create creates a new file with the given name. If a file with the same name
	// already exists an error is returned. If a non-zero size is given,
	// implementations should make a best effort to pre-allocate the file to be
	// that size. The dir must already exist and be writable to the current
	// process.
	Create(dir, name string, size uint64) (WritableFile, error)

	// Delete indicates the file is no longer required. Typically it should be
	// deleted from the underlying system to free disk space.
	Delete(dir, name string) error

	// OpenReader opens an existing file in read-only mode. If the file doesn't
	// exist or permission is denied, an error is returned. Otherwise no checks
	// are made about the well-formedness of the file: it may be empty, the wrong
	// size or corrupt in arbitrary ways.
	OpenReader(dir, name string) (ReadableFile, error)

	// OpenWriter opens a file in read-write mode. If the file doesn't exist or
	// permission is denied, an error is returned. Otherwise no checks are made
	// about the well-formedness of the file: it may be empty, the wrong size or
	// corrupt in arbitrary ways.
	OpenWriter(dir, name string) (WritableFile, error)
}

VFS is the interface the WAL needs to interact with the file system. In production it would normally be implemented by RealFS, which interacts with the operating system FS using the standard Go os package. The abstraction is useful in testing both to run quicker (by being in memory only) and to make it easy to simulate all kinds of disk errors and failure modes without needing a more elaborate external test harness like ALICE.
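
As a sketch of the in-memory testing approach mentioned above, the following partial implementation covers ListDir, Create and Delete. OpenReader and OpenWriter are omitted for brevity. The memFS and memFile types are hypothetical, WritableFile is redeclared locally so the example stands alone, and the choice of os.ErrNotExist and os.ErrExist as error values is an assumption.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"sort"
	"sync"
)

// WritableFile is a local copy of the documented interface.
type WritableFile interface {
	io.WriterAt
	io.ReaderAt
	io.Closer
	Sync() error
}

// memFile is a hypothetical in-memory file. Offsets are assumed non-negative.
type memFile struct {
	mu   sync.Mutex
	data []byte
}

func (f *memFile) WriteAt(p []byte, off int64) (int, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	if end := int(off) + len(p); end > len(f.data) {
		f.data = append(f.data, make([]byte, end-len(f.data))...)
	}
	return copy(f.data[off:], p), nil
}

func (f *memFile) ReadAt(p []byte, off int64) (int, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	if off >= int64(len(f.data)) {
		return 0, io.EOF
	}
	n := copy(p, f.data[off:])
	if n < len(p) {
		return n, io.EOF // short read, as io.ReaderAt requires
	}
	return n, nil
}

func (f *memFile) Sync() error  { return nil }
func (f *memFile) Close() error { return nil }

// memFS is a hypothetical partial VFS keyed by dir, then file name.
type memFS struct {
	mu    sync.Mutex
	files map[string]map[string]*memFile
}

func newMemFS(dirs ...string) *memFS {
	fs := &memFS{files: make(map[string]map[string]*memFile)}
	for _, d := range dirs {
		fs.files[d] = make(map[string]*memFile)
	}
	return fs
}

// ListDir returns names in lexicographical order, or an error if dir is unknown.
func (fs *memFS) ListDir(dir string) ([]string, error) {
	fs.mu.Lock()
	defer fs.mu.Unlock()
	d, ok := fs.files[dir]
	if !ok {
		return nil, os.ErrNotExist
	}
	names := make([]string, 0, len(d))
	for n := range d {
		names = append(names, n)
	}
	sort.Strings(names)
	return names, nil
}

// Create fails if the file exists and "pre-allocates" size zero bytes.
func (fs *memFS) Create(dir, name string, size uint64) (WritableFile, error) {
	fs.mu.Lock()
	defer fs.mu.Unlock()
	d, ok := fs.files[dir]
	if !ok {
		return nil, os.ErrNotExist
	}
	if _, exists := d[name]; exists {
		return nil, os.ErrExist
	}
	f := &memFile{data: make([]byte, size)}
	d[name] = f
	return f, nil
}

// Delete removes the file. Deleting a missing file is a no-op here.
func (fs *memFS) Delete(dir, name string) error {
	fs.mu.Lock()
	defer fs.mu.Unlock()
	delete(fs.files[dir], name)
	return nil
}

func main() {
	fs := newMemFS("wal")
	_, _ = fs.Create("wal", "b.wal", 0)
	_, _ = fs.Create("wal", "a.wal", 0)
	names, _ := fs.ListDir("wal")
	fmt.Println(names)
}
```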

type WritableFile

type WritableFile interface {
	io.WriterAt
	io.ReaderAt
	io.Closer

	Sync() error
}

WritableFile provides random read-write access to a file as well as the ability to fsync it to disk.
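
The standard library's *os.File already has all four methods, which the compile-time assertion in this sketch checks. The interface is redeclared locally so the example is self-contained, and writeDurably and roundTrip are hypothetical helpers showing a write followed by Sync and a read back.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// WritableFile is a local copy of the documented interface.
type WritableFile interface {
	io.WriterAt
	io.ReaderAt
	io.Closer
	Sync() error
}

// Compile-time check that *os.File satisfies the interface.
var _ WritableFile = (*os.File)(nil)

// writeDurably is a hypothetical helper: write at an offset, then fsync.
func writeDurably(f WritableFile, p []byte, off int64) error {
	if _, err := f.WriteAt(p, off); err != nil {
		return err
	}
	return f.Sync()
}

// roundTrip writes payload to a temporary file and reads it back.
func roundTrip(payload []byte) ([]byte, error) {
	f, err := os.CreateTemp("", "wal-sketch-*")
	if err != nil {
		return nil, err
	}
	defer os.Remove(f.Name()) // runs after Close below
	defer f.Close()

	if err := writeDurably(f, payload, 0); err != nil {
		return nil, err
	}
	out := make([]byte, len(payload))
	if _, err := f.ReadAt(out, 0); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	out, err := roundTrip([]byte("entry"))
	fmt.Println(string(out), err)
}
```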
