barrel

package module v1.0.0

Published: Dec 18, 2022 License: MIT Imports: 16 Imported by: 0

README


barreldb

A disk-based key-value store built on Bitcask.



BarrelDB is a Go implementation of Bitcask, as described in the Riak paper, and aims to closely follow the spec.

Bitcask uses a log-structured hash table to store key-value data on disk. It opens a "datafile" (the term for a Bitcask DB file) in append-only mode, and all writes are written sequentially to this file. It also updates an in-memory hash table that maps each key to the offset of its record in the file. This clever yet simple design makes it possible to retrieve a record from disk with a single disk seek.

Benefits of this approach
  • Low latency: Writes are handled with a single O(1) disk seek, and key lookups happen in memory via a hash table. This keeps latency low even with a large number of keys/values in the database. Bitcask also relies on the filesystem's read-ahead cache for faster reads.
  • High throughput: Since the file is opened in append-only mode, it can handle large volumes of write operations with ease.
  • Predictable performance: The DB performs consistently even as the number of records grows, as the benchmarks below show.
  • Crash friendly: Bitcask commits each record to disk and also generates a "hints" file, which makes it easy to recover from a crash.
  • Elegant design: Bitcask achieves a lot by keeping the architecture simple and relying on filesystem primitives for complex scenarios (e.g. backup/recovery and caching).
  • Ability to handle datasets larger than RAM.
Limitations
  • The main limitation is that all keys must fit in RAM, since they are held in an in-memory hash table. A potential workaround is to shard the keys into multiple buckets: incoming records are hashed into different buckets based on the key, so each bucket's RAM usage stays bounded.

Internals

You can refer to the Writing a disk-based key-value store in Golang blog post to read about the internals of Bitcask; it also explains how BarrelDB works.

Usage

Library
import (
	"time"

	"github.com/mr-karan/barreldb"
)

barrel, _ := barrel.Init(barrel.WithDir("data/"))

// Set a key.
barrel.Put("hello", []byte("world"))

// Fetch the key.
v, _ := barrel.Get("hello")

// Delete a key.
barrel.Delete("hello")

// Set with expiry.
barrel.PutEx("hello", []byte("world"), time.Second * 3)

For a complete example, visit examples.

Redis Client

barreldb also exposes this API through a simple Redis-compatible server (the barreldb binary):

127.0.0.1:6379> set hello world
OK
127.0.0.1:6379> get hello
"world"
127.0.0.1:6379> set goodbye world 10s
OK
127.0.0.1:6379> get goodbye
"world"
127.0.0.1:6379> get goodbye
ERR: invalid key: key is already expired

Benchmarks

Using make bench:

go test -bench=. -benchmem ./...
HELLO
goos: linux
goarch: amd64
pkg: github.com/mr-karan/barreldb
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
BenchmarkPut/DisableSync-8                385432              3712 ns/op        1103.48 MB/s          88 B/op          4 allocs/op
BenchmarkPut/AlwaysSync-8                    222           5510563 ns/op           0.74 MB/s         115 B/op          4 allocs/op
BenchmarkGet-8                            840627              1304 ns/op        3142.20 MB/s        4976 B/op          5 allocs/op
PASS
ok      github.com/mr-karan/barreldb 10.751s

Using redis-benchmark:

$ redis-benchmark -p 6379 -t set -n 10000 -r 100000000
Summary:
  throughput summary: 140845.06 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.196     0.016     0.175     0.255     1.031     2.455

$ redis-benchmark -p 6379 -t set -n 200000 -r 100000000
Summary:
  throughput summary: 143678.17 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.184     0.016     0.183     0.223     0.455     2.183

$ redis-benchmark -p 6379 -t get -n 100000 -r 100000000
Summary:
  throughput summary: 170068.03 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.153     0.040     0.143     0.199     0.367     1.447

References

Documentation

Index

Constants

View Source
const (
	LOCKFILE   = "barrel.lock"
	HINTS_FILE = "barrel.hints"
)
View Source
const (
	MaxKeySize   = 1<<32 - 1
	MaxValueSize = 1<<32 - 1
)

Variables

View Source
var (
	ErrLocked   = errors.New("a lockfile already exists")
	ErrReadOnly = errors.New("operation not allowed in read only mode")

	ErrChecksumMismatch = errors.New("invalid data: checksum does not match")

	ErrEmptyKey   = errors.New("invalid key: key cannot be empty")
	ErrExpiredKey = errors.New("invalid key: key is already expired")
	ErrLargeKey   = errors.New("invalid key: size cannot be more than 4294967296 bytes")
	ErrNoKey      = errors.New("invalid key: key is either deleted or expired or unset")

	ErrLargeValue = errors.New("invalid value: size cannot be more than 4294967296 bytes")
)

Functions

This section is empty.

Types

type Barrel

type Barrel struct {
	sync.Mutex
	// contains filtered or unexported fields
}

func Init

func Init(cfg ...Config) (*Barrel, error)

Init initialises a datastore for storing data.

func (*Barrel) Delete

func (b *Barrel) Delete(k string) error

Delete creates a tombstone record for the given key; the tombstone value is simply an empty byte array. Actual deletes happen in the background when merge is called. Since the file is opened in append-only mode, the key's new value is overwritten both on disk and in memory as a tombstone record.

func (*Barrel) ExamineFileSize

func (b *Barrel) ExamineFileSize(evalInterval time.Duration)

ExamineFileSize checks for file size at a periodic interval. It examines the file size of the active db file and marks it as stale if the file size exceeds the configured size.

func (*Barrel) Fold

func (b *Barrel) Fold(fn func(k string) error) error

Fold iterates over all keys and calls the given function for each key.

func (*Barrel) Get

func (b *Barrel) Get(k string) ([]byte, error)

Get takes a key and finds the metadata in the in-memory hashtable (Keydir). Using the offset present in metadata it finds the record in the datafile with a single disk seek. It further decodes the record and returns the value as a byte array for the given key.

func (*Barrel) Len

func (b *Barrel) Len() int

Len iterates over all keys and returns the total number of keys.

func (*Barrel) List

func (b *Barrel) List() []string

List iterates over all keys and returns the list of keys.

func (*Barrel) Put

func (b *Barrel) Put(k string, val []byte) error

Put takes a key and value, encodes the data as bytes, and writes it to the db file. It also stores the key with some metadata in memory. This metadata speeds up reads: the record's position in the file is recorded, so only a single disk seek is needed to read the value.

func (*Barrel) PutEx

func (b *Barrel) PutEx(k string, val []byte, ex time.Duration) error

PutEx is the same as Put but also takes an expiry duration.

func (*Barrel) RunCompaction

func (b *Barrel) RunCompaction(evalInterval time.Duration)

RunCompaction runs cleanup process to compact the keys and cleanup dead/expired keys at a periodic interval. This helps to save disk space and merge old inactive db files in a single file. It also generates a hints file which helps in caching all the keys during a cold start.

func (*Barrel) Shutdown

func (b *Barrel) Shutdown() error

Shutdown closes all open file descriptors and removes any file locks. If not running in read-only mode, it's essential to call Shutdown so that it removes the file lock on the database directory. Not calling it will prevent future startups until the lock is removed manually.

func (*Barrel) Sync

func (b *Barrel) Sync() error

Sync calls fsync(2) on the active data file.

func (*Barrel) SyncFile

func (b *Barrel) SyncFile(evalInterval time.Duration)

SyncFile calls Sync on the active db file at a periodic interval.

type Config

type Config func(*Options) error

Config is a function on the Options for barreldb. These are used to configure particular options.

func WithAlwaysSync

func WithAlwaysSync() Config

func WithAutoSync

func WithAutoSync() Config

func WithBackgrondSync

func WithBackgrondSync(interval time.Duration) Config

func WithCheckFileSizeInterval

func WithCheckFileSizeInterval(interval time.Duration) Config

func WithCompactInterval

func WithCompactInterval(interval time.Duration) Config

func WithDebug

func WithDebug() Config

func WithDir

func WithDir(dir string) Config

func WithMaxActiveFileSize

func WithMaxActiveFileSize(size int64) Config

func WithReadOnly

func WithReadOnly() Config
type Header struct {
	Checksum  uint32
	Timestamp uint32
	Expiry    uint32
	KeySize   uint32
	ValSize   uint32
}

Header represents the fixed width fields present at the start of every record.

type KeyDir

type KeyDir map[string]Meta

KeyDir represents an in-memory hash map for faster key lookups. Once the key is found in the map, the additional metadata, such as the record offset and the file ID, is used to extract the underlying record from disk. The advantage is that this approach requires only a single disk seek on the db file, since the position offset (in bytes) is already stored.

func (*KeyDir) Decode

func (k *KeyDir) Decode(fPath string) error

Decode decodes the gob data from the file at fPath into the map.

func (*KeyDir) Encode

func (k *KeyDir) Encode(fPath string) error

Encode encodes the map to a gob file. This is typically used to generate a hints file. Callers should ensure they lock/unlock the map around this call.

type Meta

type Meta struct {
	Timestamp  int
	RecordSize int
	RecordPos  int
	FileID     int
}

Meta represents some additional properties for the given key. The actual value of the key is not stored in the in-memory hashtable.

type Options

type Options struct {
	// contains filtered or unexported fields
}

Options represents configuration options for managing a datastore.

func DefaultOptions

func DefaultOptions() *Options

type Record

type Record struct {
	Header Header
	Key    string
	Value  []byte
}

Record is the binary representation of how each record is persisted on disk. Header describes how the record is stored along with some metadata. The CRC checksum, timestamp, and expiry fields each use 4 bytes (uint32 == 32 bits). The next field stores the size of the key, also a uint32, so a key cannot exceed 2^32-1 bytes (~4.3 GB). The next field stores the size of the value, also a uint32, so a value cannot exceed 2^32-1 bytes (~4.3 GB).

Each entry therefore cannot exceed ~8.6 GB as a theoretical limit. In practice, it is also constrained by the memory of the machine where this program runs.

Representation of the record stored on disk:

------------------------------------------------------------------------------
| crc(4) | time(4) | expiry(4) | key_size(4) | val_size(4) | key | val |
------------------------------------------------------------------------------

Directories

Path Synopsis
cmd
internal
