bptree

package
v0.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 29, 2019 License: GPL-2.0 Imports: 11 Imported by: 3

Documentation

Overview

A Memory Mapped B+ Tree

This is a low level file structure. It is a memory mapped B+ Tree. It is not thread safe nor is it safe to access the backing file from multiple processes at once.

Features:

1. Fixed size key. Set at B+ Tree creation. Key resizes mean the tree needs to be recreated.

2. Variable length values. They can very from 0 bytes to 2^32 - 1 bytes.

3. Duplicate key support. Duplicates are kept out of the index and only occur in the leaves.

Creating a new *BpTree

bf, err := fmap.CreateBlockFile("/path/to/file")
if err != nil {
	log.Fatal(err)
}
defer bf.Close()
bpt, err := New(bf, 8)
if err != nil {
	log.Fatal(err)
}
// do stuff with bpt

Opening a *BpTree

bf, err := fmap.OpenBlockFile("/path/to/file")
if err != nil {
	log.Fatal(err)
}
defer bf.Close()
bpt, err := Open(bf)
if err != nil {
	log.Fatal(err)
}
// do stuff with bpt

Add a key/value pair. Note, since this is low level you have to serialize your keys and values. The length of the []byte representing the key must exactly match the key size of the B+ Tree. You can find out what that was set to by called `bpt.KeySize()`

import (
	"encoding/binary"
)

var key uint64 = 12
value := "hello world"
kBytes := make([]byte, 8)
binary.PutUvarint(kBytes, key)
err := bpt.Add(kBytes, []byte(value))
if err != nil {
	log.Fatal(err)
}

As you can see it can be a little verbose to serialize and deserialize your keys and values. So be sure to wrap that up in utility functions or even to wrap the interface of the *BpTree so that client code does not have to think about it.

Since a B+Tree is a "multi-map" meaning there may be more than one value per key. There is no "Get" method. To retrieve the values associated with a key use the `Find` method.

{
	var key, value []byte
	kvi, err := bpt.Find(kBytes)
	if err != nil {
		log.Fatal(err)
	}
	for key, value, err, kvi = kvi(); kvi != nil; key, value, err, kvi = kvi() {
		// do stuff with the keys and values
	}
	if err != nil {
		log.Fatal(err)
	}
}

That interface is easy to misuse if you do not check the error values as show in the example above. An easier interface is provided for all of the iterators (Range, Find, Keys, Values, Iterate) called the Do iterface.

err = bpt.DoFind(kBytes,
	func(key, value []byte) error {
		// do stuff with the keys and values
		return nil
})
if err != nil {
	log.Fatal(err)
}

It is recommended that you always use the Do* interfaces. The other is provided if the cost of extra method calls is too high.

Removal is also slightly more complicated due to the duplicate keys. This example will remove all key/value pairs associated with the given key:

err = bpt.Remove(kBytes, func(value []byte) bool {
	return true
})
if err != nil {
	log.Fatal(err)
}

to remove just the one I added earlier do:

err = bpt.Remove(kBytes, func(v []byte) bool {
	return bytes.Equal(v, []byte(value))
})
if err != nil {
	log.Fatal(err)
}

That wraps up the basic usage. If you want to ensure that the bytes you have written are in fact on disk you have 2 options

1. call bf.Sync() - Note this uses the async mmap interface under the hood. The bytes are not guarateed to hit the disk after this returns but they will go there soon.

2. call bf.Close()

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type BpTree

type BpTree struct {
	// contains filtered or unexported fields
}

The Ubiquitous B+ Tree

func New

func New(bf *fmap.BlockFile, keySize, valSize int) (*BpTree, error)

Create a new B+ Tree in the given BlockFile.

bf *BlockFile. Can be an anonymous map or a file backed map keySize int. If this is negative it will use varchar keys valSize int. If this is negative it will use varchar values

func NewAt

func NewAt(bf *fmap.BlockFile, metaOff uint64, keySize, valSize int) (*BpTree, error)

func Open

func Open(bf *fmap.BlockFile) (*BpTree, error)

Open an existing B+Tree (it knows its key size so you do not have to supply that).

func OpenAt

func OpenAt(bf *fmap.BlockFile, metaOff uint64) (*BpTree, error)

func (*BpTree) Add

func (self *BpTree) Add(key, value []byte) (err error)

Add a key/value pair to the tree. There is a reason this isn't called `Put`, this operation does not replace or modify any data in the tree. It only adds this key. The B+ Tree supports duplicate keys and even duplicate keys with the same value!

func (*BpTree) Backward

func (self *BpTree) Backward() (kvi fs2.Iterator, err error)

Iterate over all of the key/values pairs in reverse. See Iterate() for usage details.

func (*BpTree) Count

func (self *BpTree) Count(key []byte) (count int, err error)

How many key/value pairs are there with the given key.

func (*BpTree) DoBackward

func (self *BpTree) DoBackward(do func(key, value []byte) error) error

Iterate over all of the key/values pairs in reverse. See DoIterate() for usage details.

func (*BpTree) DoFind

func (self *BpTree) DoFind(key []byte, do func(key, value []byte) error) error

Iterate over all of the key/values pairs with the given key. See DoIterate() for usage details.

func (*BpTree) DoIterate

func (self *BpTree) DoIterate(do func(key, value []byte) error) error

Iterate over all of the key/value pairs in the tree

err = bpt.DoIterate(func(key, value []byte) error {
	// do something with each key and value in the tree
})
if err != nil {
	// handle error
}

Note, it is safe for the keys and values to escape the `do` context. They are copied into it so you cannot harm the tree. An unsafe version of this is being considered.

func (*BpTree) DoKeys

func (self *BpTree) DoKeys(do func([]byte) error) error

Iterate over all of the keys in the tree. See DoIterate() for usage details

func (*BpTree) DoRange

func (self *BpTree) DoRange(from, to []byte, do func(key, value []byte) error) error

Iterate over all of the key/values pairs between [from, to] inclusive. See DoIterate() for usage details.

func (*BpTree) DoValues

func (self *BpTree) DoValues(do func([]byte) error) error

Iterate over all of the values in the tree. See DoIterate() for usage details

func (*BpTree) Find

func (self *BpTree) Find(key []byte) (kvi fs2.Iterator, err error)

Iterate over all of the key/values pairs with the given key. See Iterate() for usage details.

func (*BpTree) Has

func (self *BpTree) Has(key []byte) (has bool, err error)

Check for the existence of a given key. An error will be returned if there was some problem reading the underlying file.

func (*BpTree) Iterate

func (self *BpTree) Iterate() (kvi fs2.Iterator, err error)

Iterate over each of the keys and values in the tree. I recommend that you use the `DoIterate` method instead (it is easier to use). If you do use the method always use it as follows:

kvi, err := bpt.Iterate()
if err != nil {
	// handle error
}
var key, value []byte // must be declared here
// do not use a := assign here only a =
for key, value, err, kvi = kvi(); kvi != nil; key, value, err, kvi = kvi() {
	// do something with each key and value
}
// now the iterator could have exited with an error so check the
// error before continuing
if err != nil {
	// handle error
}

Note, it is safe for the keys and values to escape the iterator context. They are copied into it so you cannot harm the tree. An unsafe version of this is being considered.

func (*BpTree) KeySize

func (b *BpTree) KeySize() int

What is the key size of this tree?

func (*BpTree) Keys

func (self *BpTree) Keys() (it fs2.ItemIterator, err error)

Iterate over all of the keys in the tree. See Iterate() for usage details

func (*BpTree) Range

func (self *BpTree) Range(from, to []byte) (kvi fs2.Iterator, err error)

Iterate over all of the key/values pairs between [from, to] inclusive. See Iterate() for usage details.

func (*BpTree) Remove

func (self *BpTree) Remove(key []byte, where func([]byte) bool) (err error)

Remove one or more key/value pairs at the given key. The callback `where` will be called for each pair encountered and the value will be passed into the callback. If `where` returns true the item is removed otherwise it is left unchanged. To remove all items with a particular key simply:

err = bpt.Remove(key, func(value []byte) bool { return true })
if err != nil {
	panic(err)
}

func (*BpTree) Size

func (b *BpTree) Size() int

How many items are in the tree?

func (*BpTree) UnsafeRange

func (self *BpTree) UnsafeRange(from, to []byte) (kvi fs2.Iterator, err error)

func (*BpTree) Values

func (self *BpTree) Values() (it fs2.ItemIterator, err error)

Iterate over all of the values in the tree. See Iterate() for usage details

func (*BpTree) Verify

func (self *BpTree) Verify() (err error)

Verify() error Looks at the structure of the B+Tree and checks it conforms to the B+Tree structural invariants. It can be used to look check for database corruption either from errors in the algorithms or from disk or memory corruption. It should be noted that it cannot check to ensure that no bits have flipped inside of keys and values as currently not error correcting codes are generated for them. This only looks at the structure of the tree itself. It could be corruption has occurred and this will not find it as the tree is still a valid B+Tree.

type Varchar

type Varchar struct {
	// contains filtered or unexported fields
}

func NewVarchar

func NewVarchar(bf *fmap.BlockFile, a uint64) (v *Varchar, err error)

Create a new varchar structure. This takes a blockfile and an offset of an allocated block. The block becomes the control block for the varchar file (storing the free list for the allocator). It is important for the parent structure to track the location of this control block.

func OpenVarchar

func OpenVarchar(bf *fmap.BlockFile, a uint64) (v *Varchar, err error)

Open a varchar structure in the given blockfile with the given offset as the control block. This function will confirm that the control block is indeed a properly formated control block.

func (*Varchar) Alloc

func (v *Varchar) Alloc(length int) (a uint64, err error)

Allocate a varchar of the desired length.

func (*Varchar) Deref

func (v *Varchar) Deref(a uint64) (err error)

Deref decremnents the ref field. If it ever reaches 0 it will automatically be freed (by calling `v.Free(a)`).

func (*Varchar) Do

func (v *Varchar) Do(a uint64, do func([]byte) error) (err error)

Interact with the contents of the varchar. The bytes passed into the callback are UNSAFE. You could cause a segmentation fault if you simply copy the *slice* out of the function. You need to copy the data instead.

The right way:

var myBytes []byte
err = v.Do(a, func(bytes []byte) error {
	myBytes = make([]byte, len(bytes))
	copy(myBytes, bytes)
	return nil
})
if err != nil {
	log.Fatal(err)
}

you can of course interact with the bytes in the callback in any way you want as long as no pointers escape. You can even change the values of the bytes (and these changes will be persisted). However, you cannot change the length of the varchar.

func (*Varchar) Free

func (v *Varchar) Free(a uint64) (err error)

Free the varchar at the address a.

func (*Varchar) Ref

func (v *Varchar) Ref(a uint64) (err error)

Ref increments the ref field of the block. It starts out as one (when allocated). Each call to ref will add 1 to that.

func (*Varchar) UnsafeGet

func (v *Varchar) UnsafeGet(a uint64) (bytes []byte, err error)

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL