Documentation ¶
Overview ¶
A Memory Mapped B+ Tree
This is a low level file structure. It is a memory mapped B+ Tree. It is not thread safe nor is it safe to access the backing file from multiple processes at once.
Features:
1. Fixed size key. Set at B+ Tree creation. Key resizes mean the tree needs to be recreated.
2. Variable length values. They can very from 0 bytes to 2^32 - 1 bytes.
3. Duplicate key support. Duplicates are kept out of the index and only occur in the leaves.
Creating a new *BpTree
bf, err := fmap.CreateBlockFile("/path/to/file") if err != nil { log.Fatal(err) } defer bf.Close() bpt, err := New(bf, 8) if err != nil { log.Fatal(err) } // do stuff with bpt
Opening a *BpTree
bf, err := fmap.OpenBlockFile("/path/to/file") if err != nil { log.Fatal(err) } defer bf.Close() bpt, err := Open(bf) if err != nil { log.Fatal(err) } // do stuff with bpt
Add a key/value pair. Note, since this is low level you have to serialize your keys and values. The length of the []byte representing the key must exactly match the key size of the B+ Tree. You can find out what that was set to by called `bpt.KeySize()`
import ( "encoding/binary" ) var key uint64 = 12 value := "hello world" kBytes := make([]byte, 8) binary.PutUvarint(kBytes, key) err := bpt.Add(kBytes, []byte(value)) if err != nil { log.Fatal(err) }
As you can see it can be a little verbose to serialize and deserialize your keys and values. So be sure to wrap that up in utility functions or even to wrap the interface of the *BpTree so that client code does not have to think about it.
Since a B+Tree is a "multi-map" meaning there may be more than one value per key. There is no "Get" method. To retrieve the values associated with a key use the `Find` method.
{ var key, value []byte kvi, err := bpt.Find(kBytes) if err != nil { log.Fatal(err) } for key, value, err, kvi = kvi(); kvi != nil; key, value, err, kvi = kvi() { // do stuff with the keys and values } if err != nil { log.Fatal(err) } }
That interface is easy to misuse if you do not check the error values as show in the example above. An easier interface is provided for all of the iterators (Range, Find, Keys, Values, Iterate) called the Do iterface.
err = bpt.DoFind(kBytes, func(key, value []byte) error { // do stuff with the keys and values return nil }) if err != nil { log.Fatal(err) }
It is recommended that you always use the Do* interfaces. The other is provided if the cost of extra method calls is too high.
Removal is also slightly more complicated due to the duplicate keys. This example will remove all key/value pairs associated with the given key:
err = bpt.Remove(kBytes, func(value []byte) bool { return true }) if err != nil { log.Fatal(err) }
to remove just the one I added earlier do:
err = bpt.Remove(kBytes, func(v []byte) bool { return bytes.Equal(v, []byte(value)) }) if err != nil { log.Fatal(err) }
That wraps up the basic usage. If you want to ensure that the bytes you have written are in fact on disk you have 2 options
1. call bf.Sync() - Note this uses the async mmap interface under the hood. The bytes are not guarateed to hit the disk after this returns but they will go there soon.
2. call bf.Close()
Index ¶
- type BpTree
- func (self *BpTree) Add(key, value []byte) (err error)
- func (self *BpTree) Backward() (kvi fs2.Iterator, err error)
- func (self *BpTree) Count(key []byte) (count int, err error)
- func (self *BpTree) DoBackward(do func(key, value []byte) error) error
- func (self *BpTree) DoFind(key []byte, do func(key, value []byte) error) error
- func (self *BpTree) DoIterate(do func(key, value []byte) error) error
- func (self *BpTree) DoKeys(do func([]byte) error) error
- func (self *BpTree) DoRange(from, to []byte, do func(key, value []byte) error) error
- func (self *BpTree) DoValues(do func([]byte) error) error
- func (self *BpTree) Find(key []byte) (kvi fs2.Iterator, err error)
- func (self *BpTree) Has(key []byte) (has bool, err error)
- func (self *BpTree) Iterate() (kvi fs2.Iterator, err error)
- func (b *BpTree) KeySize() int
- func (self *BpTree) Keys() (it fs2.ItemIterator, err error)
- func (self *BpTree) Range(from, to []byte) (kvi fs2.Iterator, err error)
- func (self *BpTree) Remove(key []byte, where func([]byte) bool) (err error)
- func (b *BpTree) Size() int
- func (self *BpTree) UnsafeRange(from, to []byte) (kvi fs2.Iterator, err error)
- func (self *BpTree) Values() (it fs2.ItemIterator, err error)
- func (self *BpTree) Verify() (err error)
- type Varchar
- func (v *Varchar) Alloc(length int) (a uint64, err error)
- func (v *Varchar) Deref(a uint64) (err error)
- func (v *Varchar) Do(a uint64, do func([]byte) error) (err error)
- func (v *Varchar) Free(a uint64) (err error)
- func (v *Varchar) Ref(a uint64) (err error)
- func (v *Varchar) UnsafeGet(a uint64) (bytes []byte, err error)
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type BpTree ¶
type BpTree struct {
// contains filtered or unexported fields
}
The Ubiquitous B+ Tree
func New ¶
Create a new B+ Tree in the given BlockFile.
bf *BlockFile. Can be an anonymous map or a file backed map keySize int. If this is negative it will use varchar keys valSize int. If this is negative it will use varchar values
func (*BpTree) Add ¶
Add a key/value pair to the tree. There is a reason this isn't called `Put`, this operation does not replace or modify any data in the tree. It only adds this key. The B+ Tree supports duplicate keys and even duplicate keys with the same value!
func (*BpTree) Backward ¶
Iterate over all of the key/values pairs in reverse. See Iterate() for usage details.
func (*BpTree) DoBackward ¶
Iterate over all of the key/values pairs in reverse. See DoIterate() for usage details.
func (*BpTree) DoFind ¶
Iterate over all of the key/values pairs with the given key. See DoIterate() for usage details.
func (*BpTree) DoIterate ¶
Iterate over all of the key/value pairs in the tree
err = bpt.DoIterate(func(key, value []byte) error { // do something with each key and value in the tree }) if err != nil { // handle error }
Note, it is safe for the keys and values to escape the `do` context. They are copied into it so you cannot harm the tree. An unsafe version of this is being considered.
func (*BpTree) DoRange ¶
Iterate over all of the key/values pairs between [from, to] inclusive. See DoIterate() for usage details.
func (*BpTree) DoValues ¶
Iterate over all of the values in the tree. See DoIterate() for usage details
func (*BpTree) Find ¶
Iterate over all of the key/values pairs with the given key. See Iterate() for usage details.
func (*BpTree) Has ¶
Check for the existence of a given key. An error will be returned if there was some problem reading the underlying file.
func (*BpTree) Iterate ¶
Iterate over each of the keys and values in the tree. I recommend that you use the `DoIterate` method instead (it is easier to use). If you do use the method always use it as follows:
kvi, err := bpt.Iterate() if err != nil { // handle error } var key, value []byte // must be declared here // do not use a := assign here only a = for key, value, err, kvi = kvi(); kvi != nil; key, value, err, kvi = kvi() { // do something with each key and value } // now the iterator could have exited with an error so check the // error before continuing if err != nil { // handle error }
Note, it is safe for the keys and values to escape the iterator context. They are copied into it so you cannot harm the tree. An unsafe version of this is being considered.
func (*BpTree) Keys ¶
func (self *BpTree) Keys() (it fs2.ItemIterator, err error)
Iterate over all of the keys in the tree. See Iterate() for usage details
func (*BpTree) Range ¶
Iterate over all of the key/values pairs between [from, to] inclusive. See Iterate() for usage details.
func (*BpTree) Remove ¶
Remove one or more key/value pairs at the given key. The callback `where` will be called for each pair encountered and the value will be passed into the callback. If `where` returns true the item is removed otherwise it is left unchanged. To remove all items with a particular key simply:
err = bpt.Remove(key, func(value []byte) bool { return true }) if err != nil { panic(err) }
func (*BpTree) UnsafeRange ¶
func (*BpTree) Values ¶
func (self *BpTree) Values() (it fs2.ItemIterator, err error)
Iterate over all of the values in the tree. See Iterate() for usage details
func (*BpTree) Verify ¶
Verify() error Looks at the structure of the B+Tree and checks it conforms to the B+Tree structural invariants. It can be used to look check for database corruption either from errors in the algorithms or from disk or memory corruption. It should be noted that it cannot check to ensure that no bits have flipped inside of keys and values as currently not error correcting codes are generated for them. This only looks at the structure of the tree itself. It could be corruption has occurred and this will not find it as the tree is still a valid B+Tree.
type Varchar ¶
type Varchar struct {
// contains filtered or unexported fields
}
func NewVarchar ¶
Create a new varchar structure. This takes a blockfile and an offset of an allocated block. The block becomes the control block for the varchar file (storing the free list for the allocator). It is important for the parent structure to track the location of this control block.
func OpenVarchar ¶
Open a varchar structure in the given blockfile with the given offset as the control block. This function will confirm that the control block is indeed a properly formated control block.
func (*Varchar) Deref ¶
Deref decremnents the ref field. If it ever reaches 0 it will automatically be freed (by calling `v.Free(a)`).
func (*Varchar) Do ¶
Interact with the contents of the varchar. The bytes passed into the callback are UNSAFE. You could cause a segmentation fault if you simply copy the *slice* out of the function. You need to copy the data instead.
The right way:
var myBytes []byte err = v.Do(a, func(bytes []byte) error { myBytes = make([]byte, len(bytes)) copy(myBytes, bytes) return nil }) if err != nil { log.Fatal(err) }
you can of course interact with the bytes in the callback in any way you want as long as no pointers escape. You can even change the values of the bytes (and these changes will be persisted). However, you cannot change the length of the varchar.