mst

package
v0.0.0-...-efe2ce5 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 25, 2024 License: Apache-2.0, MIT Imports: 15 Imported by: 1

Documentation

Overview

Package mst contains a Merkle Search Tree (MST) implementation for atproto.

This implementation is a port of the Typescript implementation in the `atproto` git repo.

## Notes copied from TS repo

This is an implementation of a Merkle Search Tree (MST) The data structure is described here: https://hal.inria.fr/hal-02303490/document The MST is an ordered, insert-order-independent, deterministic tree. Keys are laid out in alphabetic order. The key insight of an MST is that each key is hashed and starting 0s are counted to determine which layer it falls on (5 zeros for ~32 fanout). This is a merkle tree, so each subtree is referred to by it's hash (CID). When a leaf is changed, ever tree on the path to that leaf is changed as well, thereby updating the root hash.

For atproto, we use SHA-256 as the key hashing algorithm, and ~16 fanout (4-bits of zero per layer).

NOTE: currently keys are strings, not bytes. Because UTF-8 strings can't be safely split at arbitrary byte boundaries (the results are not necessarily valid UTF-8 strings), this means that "wide" characters not really supported in keys, particularly across programming language implementations. We recommend sticking with simple alphanumeric (ASCII) strings.

A couple notes on CBOR encoding:

There are never two neighboring subtrees. Therefore, we can represent a node as an array of leaves & pointers to their right neighbor (possibly null), along with a pointer to the left-most subtree (also possibly null).

Most keys in a subtree will have overlap. We do compression on prefixes by describing keys as: - the length of the prefix that it shares in common with the preceding key - the rest of the string

For example: If the first leaf in a tree is `bsky/posts/abcdefg` and the second is `bsky/posts/abcdehi` Then the first will be described as `prefix: 0, key: 'bsky/posts/abcdefg'`, and the second will be described as `prefix: 16, key: 'hi'.`

Index

Constants

This section is empty.

Variables

View Source
var ErrNotFound = fmt.Errorf("mst: not found")

Functions

func CBORTypes

func CBORTypes() []reflect.Type

CBORTypes returns the types in this package that need to be registered with the CBOR codec.

Types

type DiffOp

type DiffOp struct {
	Depth  int
	Op     string
	Rpath  string
	OldCid cid.Cid
	NewCid cid.Cid
}

func DiffTrees

func DiffTrees(ctx context.Context, bs blockstore.Blockstore, from, to cid.Cid) ([]*DiffOp, error)

TODO: this code isn't great, should be rewritten on top of the baseline datastructures once functional and correct

type MerkleSearchTree

type MerkleSearchTree struct {
	// contains filtered or unexported fields
}

MerkleSearchTree represents an MST tree node (NodeData type). It can be in several levels of hydration: fully hydrated (entries and "pointer" (CID) computed); dirty (entries correct, but pointer (CID) not valid); virtual (pointer is defined, no entries have been pulled from blockstore)

MerkleSearchTree values are immutable. Methods return copies with changes.

func LoadMST

func LoadMST(cst cbor.IpldStore, root cid.Cid) *MerkleSearchTree

This is poorly named in both implementations, because it is lazy Typescript: MST.load(storage, cid, layer=null, fanout) -> MST

func NewEmptyMST

func NewEmptyMST(cst cbor.IpldStore) *MerkleSearchTree

NewEmptyMST reports a new empty MST using cst as its storage.

func (*MerkleSearchTree) Add

func (mst *MerkleSearchTree) Add(ctx context.Context, key string, val cid.Cid, knownZeros int) (*MerkleSearchTree, error)

"Adds a new leaf for the given key/value pair. Throws if a leaf with that key already exists" Typescript: MST.add(key, value, knownZeros?) -> MST

func (*MerkleSearchTree) Delete

func (mst *MerkleSearchTree) Delete(ctx context.Context, k string) (*MerkleSearchTree, error)

"Deletes the value at the given key" Typescript: MST.delete(key) -> MST

func (*MerkleSearchTree) Get

func (mst *MerkleSearchTree) Get(ctx context.Context, k string) (cid.Cid, error)

"Gets the value at the given key" Typescript: MST.get(key) -> (CID|null)

func (*MerkleSearchTree) GetPointer

func (mst *MerkleSearchTree) GetPointer(ctx context.Context) (cid.Cid, error)

"We don't hash the node on every mutation for performance reasons. Instead we keep track of whether the pointer is outdated and only (recursively) calculate when needed" Typescript: MST.getPointer() -> CID

func (*MerkleSearchTree) Update

func (mst *MerkleSearchTree) Update(ctx context.Context, k string, val cid.Cid) (*MerkleSearchTree, error)

"Edits the value at the given key. Throws if the given key does not exist" Typescript: MST.update(key, value) -> MST

func (*MerkleSearchTree) WalkLeavesFrom

func (mst *MerkleSearchTree) WalkLeavesFrom(ctx context.Context, from string, cb func(key string, val cid.Cid) error) error

WalkLeavesFrom walks the leaves of the tree, calling the cb callback on each key that's greater than or equal to the provided from key. If cb returns an error, the walk is aborted and the error is returned.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL