Documentation ¶
Overview ¶
TODO : update
Offheap ¶
An off-heap hash-table for Go (golang). Originally called go-offheap-hashtable, but now shortened to just offheap.
The purpose here is to have a hash table that can work away from Go's Garbage Collector, to avoid long GC pause times.
We accomplish this by writing our own Malloc() and Free() implementation (see malloc.go) which requests memory directly from the OS.
The keys, values, and entire hash table is kept on off-heap storage. This storage can also optionally be backed by memory mapped file for speedy persistence and fast startup times.
Initial HashTable implementation inspired by the public domain C++ code of
https://github.com/preshing/CompareIntegerMaps
See also
http://preshing.com/20130107/this-hash-table-is-faster-than-a-judy-array/
for performance studies of the C++ code.
HashTable ¶
The implementation is mostly in offheap.go, read that to start.
Maps pointer-sized integers to Cell structures, which in turn hold Val_t as well as Key_t structures.
Uses open addressing with linear probing. This makes it very cache friendly and thus very fast.
In the t.Cells array, UnHashedKey = 0 is reserved to indicate an unused cell. Actual value for key 0 (if any) is stored in t.ZeroCell. The hash table automatically doubles in size when it becomes 75% full. The hash table never shrinks in size, even after Clear(), unless you explicitly call Compact().
Basic operations: Lookup(), Insert(), DeleteKey(). These are the equivalent of the builtin map[uint64]interface{}.
As an example of how to specialize for a map[string]*Cell equivalent, see the following functions in the bytekey.go file:
func (t *HashTable) InsertStringKey(strkey string, value interface{}) bool func (t *HashTable) LookupStringKey(strkey string) (Val_t, bool) func (t *HashTable) DeleteStringKey(strkey string) bool
Example use:
h := offheap.NewHashTable(2) // basic three operations are: h.InsertStringKey("My number", 43) val, ok := h.LookupStringKey("My number") h.DeleteStringKey("My number")
Note that this library is only a starting point of source code, and not intended to be used without customization. Users of the HashTable will have to customize it by changing the definitions of Key_t and Val_t to suite their needs. I'm experimenting next with storing objects in Capnproto serialized format, but this branch (branch capnp) isn't quite ready for use.
Related ideas:
https://gist.github.com/mish15/9822474 (using CGO)
https://groups.google.com/forum/#!topic/golang-nuts/kCQP6S6ZGh0
not fully off-heap, but using a slice instead of a map appears to help GC quite alot too:
Index ¶
- Constants
- type CellInt
- type CustomMetadataIntTest
- type EmptyStructInt
- type HashTableCustomMetadataInt
- type HashTableInt
- func (t *HashTableInt) Bytes() []byte
- func (t *HashTableInt) Clear()
- func (t *HashTableInt) Compact()
- func (t *HashTableInt) DeleteCellInt(cell *CellInt)
- func (t *HashTableInt) DeleteKey(key uint64)
- func (t *HashTableInt) DestroyHashTable()
- func (t *HashTableInt) Insert(key uint64) (*CellInt, bool)
- func (h *HashTableInt) InsertInt(k uint64, v int)
- func (t *HashTableInt) Lookup(key uint64) *CellInt
- func (h *HashTableInt) LookupInt(key uint64) (int, bool)
- func (h *HashTableInt) NewIterator() *IntIterator
- func (t *HashTableInt) Repopulate(desiredSize uint64)
- func (t *HashTableInt) Save()
- func (t *HashTableInt) UpdateChecksum()
- type HashTableMetadataInt
- type IntIterator
Constants ¶
const MAGIC_NUMBERInt = 0x123456789ABCDEF
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type CustomMetadataIntTest ¶
type EmptyStructInt ¶
type EmptyStructInt struct{}
type HashTableCustomMetadataInt ¶
type HashTableCustomMetadataInt struct {
CustomMetadataIntTest
}
type HashTableInt ¶
type HashTableInt struct { *HashTableMetadataInt *HashTableCustomMetadataInt // contains filtered or unexported fields }
func NewHashTableInt ¶
func NewHashTableInt(initialSize uint64) *HashTableInt
Create a new hash table, able to hold initialSize count of keys.
func NewHashTableIntFileBacked ¶
func NewHashTableIntFileBacked(initialSize uint64, filepath string) (*HashTableInt, error)
func OpenHashTableIntFileBacked ¶
func OpenHashTableIntFileBacked(filepath string) (*HashTableInt, error)
func (*HashTableInt) Bytes ¶
func (t *HashTableInt) Bytes() []byte
func (*HashTableInt) Clear ¶
func (t *HashTableInt) Clear()
Clear does not resize the table, but zeroes-out all entries and the custom metadata
func (*HashTableInt) Compact ¶
func (t *HashTableInt) Compact()
Compact will compress the hashtable so that it is at most 75% full.
func (*HashTableInt) DeleteCellInt ¶
func (t *HashTableInt) DeleteCellInt(cell *CellInt)
DeleteCellInt deletes the cell pointed to by cell.
func (*HashTableInt) DeleteKey ¶
func (t *HashTableInt) DeleteKey(key uint64)
DeleteKey will delete the contents of the cell associated with key.
func (*HashTableInt) DestroyHashTable ¶
func (t *HashTableInt) DestroyHashTable()
DestroyHashTable frees the memory-mapping, returning the memory containing the hash table and its cells to the OS. By default the save-to-file-on-disk functionality in malloc.go is not used, but that can be easily activated. See malloc.go. Deferencing any cells/pointers into the hash table after destruction will result in crashing your process, almost surely.
func (*HashTableInt) InsertInt ¶
func (h *HashTableInt) InsertInt(k uint64, v int)
func (*HashTableInt) Lookup ¶
func (t *HashTableInt) Lookup(key uint64) *CellInt
Lookup a cell based on a uint64 key value. Returns nil if key not found.
func (*HashTableInt) NewIterator ¶
func (h *HashTableInt) NewIterator() *IntIterator
NewIterator creates a new iterator for HashTable tab.
func (*HashTableInt) Repopulate ¶
func (t *HashTableInt) Repopulate(desiredSize uint64)
Repopulate expands the hashtable to the desiredSize count of cells.
func (*HashTableInt) Save ¶
func (t *HashTableInt) Save()
Save syncs the memory mapped file to disk using MmapMalloc::BlockUntilSync()
func (*HashTableInt) UpdateChecksum ¶
func (t *HashTableInt) UpdateChecksum()
type HashTableMetadataInt ¶
type IntIterator ¶
type IntIterator struct { H *HashTableInt Pos int64 Cur *CellInt }
IntIterator
sample use: given a HashTable h, enumerate h's contents with:
for it := offheap.NewIterator(h); it.Cur != nil; it.Next() { found = append(found, it.Cur.unHashedKey) }
func (*IntIterator) Done ¶
func (it *IntIterator) Done() bool
Done checks to see if we have already iterated through all cells in the table. Equivalent to checking it.Cur == nil.
func (*IntIterator) Next ¶
func (it *IntIterator) Next() *CellInt
Next advances the iterator so that it.Cur points to the next filled cell in the table, and returns that cell. Returns nil once there are no more cells to be visited.