llrb

package
v0.0.0-...-b47ea92 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Feb 10, 2021 License: MIT Imports: 20 Imported by: 2

README

Left Leaning Red Black tree (LLRB)

GoDoc

LLRB can manage an in-memory instance of sorted index using left-leaning-red-black tree. LLRB is self balancing tree that supports all basic write operations, like create, update, delete. Additionally there is an MVCC (Multi-Version-Concurrency-Control) implementation of lef-leaning-read-black-tree.

  • Entry also called as llrb-node has a key, value pair.
  • Key are binary string that can handle comparision operation.
  • Value can be a blob of binary, text or JSON. LLRB don't interpret the shape of Value.

Background routines

LLRB without MVCC is a passive library that serializes all index-read and index-write operations using rw-lock. That is, all write and read operations are serialized, while there can be concurrent readers. Based on the operations, an LLRB instance will incur memory cost. Apart from that it is straight forward and should not throw any surprises.

When using MVCC, all index-write operations are serialsed but there can be any number of concurrent readers on a MVCC snapshots. Unlike LLRB, write operations don't block read operations. For each MVCC instance there will be single go-routine spawned to generate periodic snapshots.

Snapshots

Snapshots matter only with MVCC. For write intensive applications, it is recommended to use LLRB. While Read intensive applications might want to use MVCC and use concurrent readers to scale with number of cores.

Memory fragmentation

Memory fragmentation is when most of the memory is allocated in a large number of non-contiguous blocks, or chunks - leaving a good percentage of total memory unallocated, but unusable by rest of the system. This can manifest itself as out of memory exceptions. Sadly this can happen to any process running on the machine, even though it is not at fault.

LLRB tree is built to handle large number of upsert/delete operations which can quickly lead to memory fragmentation. Especially when entries are deleted in a particular pattern, where each pool allocated from OS contains just few allocations.

A proper solution would be to have an allocator that can directly manage the the CPU/Memory virtual pages. Even then fragmentation issue won't be solved completely, and, having an allocator that is tightly bolted to bare metal will open up a new set of issues.

One idea that can be employed to avoid memory-fragmentation is to use copying-garbage-collection. LLRB does support Copying GC out of the box, but here is a set of limitations and recommentations that applications can use:

  • Pointer re-write is not possible and not recommended.
  • Allocate a new LLRB and start populating it with current set of entries.
  • This means, holding an iterator for a long time.
  • With MVCC, Iterations won't lock the writer and won't interfer with other concurrent readers. But if there are hundreds and thousands of mutations happening in the background, while the iterator is holding the snapshot, it can lead to huge memory pressure.
  • If applications maintain a seqno for all mutations, then it is possible to build a piece-wise Iterator() that can be released for every few milliseconds. Refer #35.

Log-Structured-Merge (LSM)

Log-Structured-Merge (LSM) is supported at api level. Specifically with Delete API.

  • Delete will simply be marked as deleted and seqno is updated to current seqno.
  • For Delete operation, if entry is missing in the index. An entry will be inserted and then marked as deleted with its seqno updated to current-seqno.
  • When a key is marked as deleted and Upserted again, the delete operation will get de-duped.

Package lsm/ provides a set of API that can do merge-get and merge-sort on LSM enabled data-structures.

Compare-And-Set (CAS)

CAS operations help in atomic updates to index entries. It ensures that index entry does not change between a previous read/write operation and next write operation. CAS operation has the following effects:

  • If CAS is > ZERO, Get key from the index, and check whether its seqno matches with the supplied CAS value.
  • If CAS is ZERO, then it is equivalent to CREATE operation, and expects that the key is not present in the index.

If LLRB tree is holding only a subset, called the working-set, of an index, it is application's responsibility to do CAS match with full set of index and convert the CAS operation into plain Upsert operation.

Panic and Recovery

Panics are to be expected when APIs are misused. Programmers might choose to ignore the errors, but not panics. For example:

  • Validate() will panic if there is a fatal error.

Documentation

Overview

Package llrb implement a self-balancing verions of binary-tree, called, LLRB (Left Leaning Red Black).

  • Index key, value (value is optional).
  • Each key shall be unique within the index sample-set.
  • Configurable memory backend.
  • In single-threaded settings, reads and writes are serialized.

MVCC instances support serialized writes and concurrent reads.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Defaultsettings

func Defaultsettings() s.Settings

Defaultsettings for llrb instance.

"memcapacity" (int64, default: available free-ram)

Memory capacity required for keys / values. Default will be ramsize.

"snapshottick" (int64, default: 4)

Used only in MVCC, time period in millisecond, for generating
read-snapshots.

"allocator" (string, default: "flist")

Type of allocator to use.

func LogComponents

func LogComponents(components ...string)

LogComponents enable logging. By default logging is disabled, if applications want log information for llrb components call this function with "self" or "all" or "llrb" or "mvcc" as argument.

Types

type Cursor

type Cursor struct {
	// contains filtered or unexported fields
}

Cursor object maintains an active pointer into the index. Use OpenCursor on Txn object to create a new cursor.

func (*Cursor) Delcursor

func (cur *Cursor) Delcursor(lsm bool)

Delcursor deletes the entry at the cursor.

func (*Cursor) Delete

func (cur *Cursor) Delete(key, oldvalue []byte, lsm bool) []byte

Delete is an alias to txn.Delete call. The current position of the cursor does not affect the delete operation.

func (*Cursor) GetNext

func (cur *Cursor) GetNext() (key, value []byte, deleted bool, err error)

GetNext move cursor to next entry in snapshot and return its key and value. Returned byte slices will be a reference to index entry, hence must not be used after transaction is committed or aborted.

func (*Cursor) Key

func (cur *Cursor) Key() (key []byte, deleted bool)

Key return current key under the cursor. Returned byte slice will be a reference to index-key, hence must not be used after transaction is commited or aborted.

func (*Cursor) Set

func (cur *Cursor) Set(key, value, oldvalue []byte) []byte

Set is an alias to txn.Set call. The current position of the cursor does not affect the set operation.

func (*Cursor) Value

func (cur *Cursor) Value() []byte

Value return current value under the cursor. Returned byte slice will be a reference to value in index, hence must not be used after transaction is commited or aborted.

func (*Cursor) YNext

func (cur *Cursor) YNext(
	fin bool) (key, value []byte, seqno uint64, deleted bool, err error)

YNext implements Iterator api, to iterate over the index. Typically used for lsm-sort.

type LLRB

type LLRB struct {
	// contains filtered or unexported fields
}

LLRB to manage a single instance of in-memory sorted index using left-leaning-red-black tree. LLRB instance shall implement api.Index interface, and compliant with api.Getter and api.Iterator APIs.

func LoadLLRB

func LoadLLRB(name string, setts s.Settings, iter api.Iterator) *LLRB

LoadLLRB creates an LLRB instance and populate it with initial set of data (key, value) from iterator. After loading the data, applications shall use Setseqno() to update the latest sequence number.

func NewLLRB

func NewLLRB(name string, setts s.Settings) *LLRB

NewLLRB a new instance of in-memory sorted index.

func (*LLRB) BeginTxn

func (llrb *LLRB) BeginTxn(id uint64) api.Transactor

BeginTxn starts a read-write transaction. Transactions must satisfy ACID properties. Structure will be locked, no other read or write operation can be performed, until transaction is committed or aborted.

func (*LLRB) Clone

func (llrb *LLRB) Clone(name string) *LLRB

Clone llrb instance and return the clone.

func (*LLRB) Close

func (llrb *LLRB) Close()

Close does nothing.

func (*LLRB) Count

func (llrb *LLRB) Count() int64

Count return the number of items indexed.

func (*LLRB) Delete

func (llrb *LLRB) Delete(key, oldvalue []byte, lsm bool) ([]byte, uint64)

Delete key from index. Key should not be nil, if key found return its value. If lsm is true, then don't delete the node instead mark the node as deleted. Again, if lsm is true but key is not found in index, a new entry will inserted.

func (*LLRB) Destroy

func (llrb *LLRB) Destroy()

Destroy releases all resources held by the tree. No other method call are allowed after Destroy.

func (*LLRB) Dotdump

func (llrb *LLRB) Dotdump(buffer io.Writer)

Dotdump to convert whole tree into dot script that can be visualized using graphviz. Until dotdump exits concurrent write operations will block.

func (*LLRB) Footprint

func (llrb *LLRB) Footprint() int64

Footprint return the heap footprint consumed by llrb instance.

func (*LLRB) Get

func (llrb *LLRB) Get(
	key, value []byte) (v []byte, cas uint64, deleted, ok bool)

Get value for key, if value argument points to valid buffer it will, be used to copy the entry's value. Also returns entry's cas, whether entry is marked as deleted by LSM. If ok is false, then key is not found.

func (*LLRB) Getseqno

func (llrb *LLRB) Getseqno() uint64

Getseqno return current seqno on this tree.

func (*LLRB) ID

func (llrb *LLRB) ID() string

ID is same as the name supplied while creating the LLRB instance.

func (*LLRB) Log

func (llrb *LLRB) Log()

Log vital information.

func (*LLRB) Scan

func (llrb *LLRB) Scan() api.Iterator

Scan return a full table iterator, if iteration is stopped before reaching end of table (io.EOF), application should call iterator with fin as true. EG: iter(true)

func (*LLRB) ScanEntries

func (llrb *LLRB) ScanEntries() api.EntryIterator

ScanEntries return a full table iterator, if iteration is stopped before reaching end of table (io.EOF), application should call iterator with fin as true. EG: iter(true)

func (*LLRB) Set

func (llrb *LLRB) Set(key, value, oldvalue []byte) (ov []byte, cas uint64)

Set a key, value pair in the index, if key is already present, its value will be over-written. Make sure key is not nil. Return old value if oldvalue points to valid buffer.

func (*LLRB) SetCAS

func (llrb *LLRB) SetCAS(
	key, value, oldvalue []byte, cas uint64) ([]byte, uint64, error)

SetCAS a key, value pair in the index, if CAS is ZERO then key should not be present in the index, otherwise existing CAS should match the supplied CAS. Value will be over-written. Make sure key is not nil. Return old value if oldvalue points to valid buffer.

func (*LLRB) Setseqno

func (llrb *LLRB) Setseqno(seqno uint64)

Setseqno can be called immediately after creating the LLRB instance. All futher mutating APIs will start counting seqno from this value.

func (*LLRB) Stats

func (llrb *LLRB) Stats() map[string]interface{}

Stats return a map of data-structure statistics and operational statistics.

func (*LLRB) Validate

func (llrb *LLRB) Validate()

Validate data structure. This is a costly operation, walks through the entire tree and holds a read lock while doing so.

func (*LLRB) View

func (llrb *LLRB) View(id uint64) api.Transactor

View start a read only transaction. Structure will be read-locked, no other write operations can be performed, until transaction is committed or aborted. Concurrent reads are still allowed.

type Llrbnode

type Llrbnode struct {
	// contains filtered or unexported fields
}

Llrbnode defines a node in LLRB tree.

func (*Llrbnode) Value

func (nd *Llrbnode) Value() []byte

Value return the value byte-slice for this entry.

type MVCC

type MVCC struct {
	// contains filtered or unexported fields
}

MVCC manages a single instance of in-memory sorted index using left-leaning-red-black tree. This implementations supports copy on write for all write operations and enables concurrent read operations. MVCC instance implement api.Index interface, and compliant with api.Getter and api.Iterator.

func LoadMVCC

func LoadMVCC(name string, setts s.Settings, iter api.Iterator) *MVCC

LoadMVCC creates an MVCC instance and populate it with initial set of data (key, value) from iterator. After loading the data applications can use Setseqno() to update the latest sequence number.

func NewMVCC

func NewMVCC(name string, setts s.Settings) *MVCC

NewMVCC a new instance of in-memory sorted index.

func (*MVCC) BeginTxn

func (mvcc *MVCC) BeginTxn(id uint64) api.Transactor

BeginTxn starts a read-write transaction. All transactions should either be committed or aborted. Every transaction holds on to a MVCC snapshot. If transactions are not released for long time accumulating too many background mutations, it will increase the memory pressure on the system. Concurrent transactions are allowed, and serialized internally.

func (*MVCC) Clone

func (mvcc *MVCC) Clone(name string) *MVCC

Clone mvcc instance and return the clone. Clone walks the entire tree and concurrent reads and writes will block until call returns.

func (*MVCC) Close

func (mvcc *MVCC) Close()

Close does nothing

func (*MVCC) Count

func (mvcc *MVCC) Count() int64

Count return the number of items indexed.

func (*MVCC) Delete

func (mvcc *MVCC) Delete(key, oldvalue []byte, lsm bool) ([]byte, uint64)

Delete key from index. Key should not be nil, if key found return its value. If lsm is true, then don't delete the node instead mark the node as deleted. Again, if lsm is true but key is not found in index, a new entry will be inserted.

func (*MVCC) Destroy

func (mvcc *MVCC) Destroy()

Destroy releases all resources held by the tree. No other method call are allowed after Destroy.

func (*MVCC) Dotdump

func (mvcc *MVCC) Dotdump(buffer io.Writer)

Dotdump to convert whole tree into dot script that can be visualized using graphviz. Until dotdump exits concurrent write operations will block.

func (*MVCC) Finalize

func (mvcc *MVCC) Finalize()

Finalize will wait for read snapshot to catchup with the tip. To be careful when calling with background mutations, call may never return.

func (*MVCC) Footprint

func (mvcc *MVCC) Footprint() int64

Footprint return the heap footprint consumed by mvcc instance.

func (*MVCC) Get

func (mvcc *MVCC) Get(
	key, value []byte) (v []byte, cas uint64, deleted, ok bool)

Get value for key, if value argument points to valid buffer, it will be used to copy the entry's value. Also returns entry's cas, whether entry is marked as deleted by LSM. If ok is false, then key is not found.

func (*MVCC) Getseqno

func (mvcc *MVCC) Getseqno() uint64

Getseqno return current seqno on this tree.

func (*MVCC) ID

func (mvcc *MVCC) ID() string

ID is same as the name supplied while creating the MVCC instance.

func (*MVCC) Log

func (mvcc *MVCC) Log()

Log vital information.

func (*MVCC) Scan

func (mvcc *MVCC) Scan() api.Iterator

Scan return a full table iterator, if iteration is stopped before reaching end of table (io.EOF), application should call iterator with fin as true. EG: iter(true)

func (*MVCC) ScanEntries

func (mvcc *MVCC) ScanEntries() api.EntryIterator

ScanEntries return a full table iterator, if iteration is stopped before reaching end of table (io.EOF), application should call iterator with fin as true. EG: iter(true)

func (*MVCC) Set

func (mvcc *MVCC) Set(key, value, oldvalue []byte) (ov []byte, cas uint64)

Set a key, value pair in the index, if key is already present, its value will be over-written. Make sure key is not nil. Return old value if oldvalue points to a valid buffer.

func (*MVCC) SetCAS

func (mvcc *MVCC) SetCAS(
	key, value, oldvalue []byte, cas uint64) ([]byte, uint64, error)

SetCAS a key, value pair in the index, if CAS is ZERO then key should not be present in the index, otherwise existing CAS should match the supplied CAS. Value will be over-written. Make sure key is not nil. Return old value if oldvalue points to valid buffer.

func (*MVCC) Setseqno

func (mvcc *MVCC) Setseqno(seqno uint64)

Setseqno can be called immediately after creating the MVCC instance. All futher mutating APIs will start counting seqno from this value.

func (*MVCC) Stats

func (mvcc *MVCC) Stats() map[string]interface{}

Stats return a map of data-structure statistics and operational statistics.

func (*MVCC) Validate

func (mvcc *MVCC) Validate()

Validate data structure. This is a costly operation, walks through the entire tree and holds a read lock while doing so.

func (*MVCC) View

func (mvcc *MVCC) View(id uint64) api.Transactor

View starts a read-only transaction. Other than that it is similar to BeginTxn. All view transactions should be aborted.

type Txn

type Txn struct {
	// contains filtered or unexported fields
}

Txn transaction definition. Transaction gives a gaurantee of isolation and atomicity on the latest snapshot.

func (*Txn) Abort

func (txn *Txn) Abort()

Abort transaction, underlying index won't be touched.

func (*Txn) Commit

func (txn *Txn) Commit() error

Commit transaction, commit will block until all write operations under the transaction are successfully applied. Return ErrorRollback if ACID properties are not met while applying the write operations. Transactions are never partially committed.

func (*Txn) Delete

func (txn *Txn) Delete(key, oldvalue []byte, lsm bool) []byte

Delete key from index. The Delete operation will be remembered as a log entry and applied on the underlying structure during commit.

func (*Txn) Get

func (txn *Txn) Get(
	key, value []byte) (v []byte, cas uint64, deleted, ok bool)

Get value for key from snapshot.

func (*Txn) ID

func (txn *Txn) ID() uint64

ID return transaction id.

func (*Txn) OpenCursor

func (txn *Txn) OpenCursor(key []byte) (api.Cursor, error)

OpenCursor open an active cursor inside the index.

func (*Txn) Set

func (txn *Txn) Set(key, value, oldvalue []byte) []byte

Set an entry of key, value pair. The set operation will be remembered as a log entry and applied on the underlying structure during Commit.

type View

type View struct {
	// contains filtered or unexported fields
}

View transaction definition. Read only version of Txn.

func (*View) Abort

func (view *View) Abort()

Abort view, must be called once done with the view.

func (*View) Commit

func (view *View) Commit() error

Commit not allowed.

func (*View) Delete

func (view *View) Delete(key, oldvalue []byte, lsm bool) []byte

Delete is not allowed.

func (*View) Get

func (view *View) Get(
	key, value []byte) (v []byte, cas uint64, deleted, ok bool)

Get value for key from snapshot.

func (*View) ID

func (view *View) ID() uint64

ID return transaction id.

func (*View) OpenCursor

func (view *View) OpenCursor(key []byte) (api.Cursor, error)

OpenCursor open an active cursor inside the index.

func (*View) Set

func (view *View) Set(key, value, oldvalue []byte) []byte

Set is not allowed

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL