lsmtree

package
v0.0.0-...-b0c7fd6 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 18, 2023 License: MIT Imports: 21 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type CompactionRule

type CompactionRule struct {
	// Level is the level of the segment that will be compacted.
	Level int
	// TargetLevel is the level where the compacted segment will be placed.
	TargetLevel int
	// MaxSegments is the maximum number of segments that can be in the level.
	MaxSegments int
	// MaxLevelSize is the maximum size of the level in bytes.
	MaxLevelSize int64
}

type Config

type Config struct {
	// Logger is the logger used to log the events inside the LSM-tree,
	// such as flushing memtables to disk. Defaults to a no-op logger.
	Logger log.Logger
	// DataRoot is the directory where the lsm-tree will be stored. Has no effect
	// if DataFS is specified. Defaults to the current working directory.
	DataRoot string
	// MaxMemtableSize is the maximum number of entries in the memtable before
	// it is flushed to disk. Defaults to 1000.
	MaxMemtableSize int64
	// BloomFilterProbability is the probability of false positives in the bloom filter.
	// It will be used to dynamically calculate the number of hash functions and the size
	// of the bloom filter. Defaults to 0.01 which means that there is a 1% chance of
	// false positives.
	BloomFilterProbability float64
	// SparseIndexGapBytes is the size of the gap in bytes between the index entries in the
	// sparse index. Larger gaps result in smaller index files, but slower lookups. Defaults
	// to 64KB.
	SparseIndexGapBytes int64
	// UseMmap enables memory mapping of SSTable files. Although it may have a
	// positive impact on performance due to reduced number of syscalls, it is
	// generally advised not to use mmap in databases, so it is disabled by default.
	// Please check out the following paper for more details:
	// https://db.cs.cmu.edu/mmap-cidr2022/
	UseMmap bool
	// CompactionRules is a list of compaction rules that will be used to determine
	// when to compact the segments. Defaults to compacting the level 0 after
	// reaching 10 segments and the level 1 is limited to 1 segment.
	CompactionRules []CompactionRule
}

func DefaultConfig

func DefaultConfig() Config

type LSMTree

type LSMTree struct {
	// contains filtered or unexported fields
}

LSMTree is a persistent key-value store based on the LSM-Tree data structure. It is a write-optimized, which means that it is optimized for writes, but reads may be slow.

func Create

func Create(conf Config) (*LSMTree, error)

Create initializes a new LSM-Tree instance in the directory given in the config. It restores the state of the tree from the previous run if it exists. Otherwise it creates a new tree.

func (*LSMTree) Close

func (lsm *LSMTree) Close() error

Close closes the LSM tree. It will wait for all pending flushes to complete, and then close all the sstables and the state file. One should ensure that no reads or writes are performed after calling this method.

func (*LSMTree) Get

func (lsm *LSMTree) Get(key string) (*proto.DataEntry, bool, error)

Get returns the value for the given key, if it exists. It checks the active memtable first, then the memtables that are waiting to be flushed, and finally the sstables on disk. Note that the returned entry is a pointer to the actual entry in the memtable or sstable, so it should not be modified.

func (*LSMTree) Put

func (lsm *LSMTree) Put(entry *proto.DataEntry) error

Put puts a data entry into the LSM tree. It will first check if the active memtable is full, and if so, it will create a new one and flush the old one to disk. If the memtable is not full, it will add the entry to the active memtable.

type Memtable

type Memtable struct {
	*MemtableInfo
	// contains filtered or unexported fields
}

func (*Memtable) Contains

func (mt *Memtable) Contains(key string) bool

Contains returns true if the memtable contains an entry with the given key.

func (*Memtable) Discard

func (mt *Memtable) Discard() error

Discard removes data files associated with the memtable. It is used when the memtable is no longer needed, e.g. when it is merged into a SSTable.

func (*Memtable) Get

func (mt *Memtable) Get(key string) (*proto.DataEntry, bool)

Get returns an entry with the given key. If the entry does not exist or is a tombstone, the second return value is false.

func (*Memtable) Iter

Iter returns an iterator over the memtable.

func (*Memtable) Len

func (mt *Memtable) Len() int

Len returns the number of entries in the memtable.

func (*Memtable) Put

func (mt *Memtable) Put(entry *proto.DataEntry) error

Put inserts a new entry into the memtable. The entry is first appended to the WAL file and then inserted into the memtable. If the entry already exists in the memtable, it is overwritten. Removing an entry is done by inserting a entry with a tombstone flag set to true.

func (*Memtable) Size

func (mt *Memtable) Size() int64

Size returns the size of the memtable in bytes, represented by the size of the WAL file.

func (*Memtable) Sync

func (mt *Memtable) Sync() error

type MemtableInfo

type MemtableInfo struct {
	ID      int64
	WALFile string
}

type SSTable

type SSTable struct {
	*SSTableInfo
	// contains filtered or unexported fields
}

SSTable is a sorted string table. It is a collection of key/value pairs that are sorted by key. It is immutable, and is used to store data on disk.

func OpenTable

func OpenTable(info *SSTableInfo, prefix string, useMmap bool) (*SSTable, error)

OpenTable opens an SSTable from the given paths. All files must exist, and the parameters of the bloom filter must match the parameters used to create the SSTable.

func (*SSTable) Close

func (sst *SSTable) Close() error

Close closes the SSTable, freeing up any resources it is using. Once closed, any current or subsequent calls to Get will fail. Note that ine index reader is already closed during loadIndex.

func (*SSTable) Get

func (sst *SSTable) Get(key string) (*proto.DataEntry, bool, error)

func (*SSTable) Iterator

func (sst *SSTable) Iterator() *protoio.Iterator[*proto.DataEntry]

Iterator returns an iterator over the SSTable.

func (*SSTable) MayContain

func (sst *SSTable) MayContain(key string) bool

MayContain checks the underlying bloom filter to see if the key is in the SSTable. This is a fast operation, and can be used to avoid accessing the disk if the key is not present. Yet, it may return false positives.

type SSTableInfo

type SSTableInfo struct {
	ID         int64
	Level      int
	NumEntries int64
	Size       int64
	IndexFile  string
	DataFile   string
	BloomFile  string
}

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL