hyperbloom

package module
v0.0.0-...-18ddaa2 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 10, 2023 License: MIT Imports: 9 Imported by: 0

README

HyperBloom

Build Status godoc complete

Bloom

A collection of high performance bloom filter data structures for use in Go. They all use the 64 bit version of Google's XXHASH "extremely fast non-cryptographic" hashing algorithm. Detailed documentation is available via godoc.

BloomFilter

A textbook implementation of a bloom filter. Like StripedBloomFilter, it uses an array of unsigned 64 bit integers. However, it uses centralized locking (via a RWMutex) in place of sharded locking. In addition, it supports non-locking inserts and lookups (InsertAsync) and (LookupAsync). Use if you plan on doing mostly reads and not many writes OR if you plan on using the bloomfilter in a single-threaded scenario (make sure to use InsertAsync and LookupAsync to bypass the mutex in this case)

StripedBloomFilter

A bloom filter implemented using an array of unsigned 64 bit integers which is divided into 'n' shards. Each shard contains its own mutex, allowing for a high degree of concurrent insert and lookup throughput. One restriction is that the filter size must be a multiple of the number of shards.

Assuming 16 cores, a 32 shard StriedBloomFilter provides a three order of magnitude decrease in lock contention over a standard bloom filter.

Use if you plan on a roughly equal balance of writes and reads and are using the bloomfilter in a multi-threaded scenario. This variant is especially good for large (1 billion buckets or more) filters.

NaiveBloomFilter

A bloom filter implemented using a byte array (with each bit in the filter assigned to a byte). This reduces the number of instructions necessary to perform a lookup in the go 1.4 and latest gcc-go compilers. As a result, the NaiveBloomFilter will almost always be faster than the standard variants at the cost of an 8x penalty in memory usage.

This version uses centralized locking (via a RWMutex) and is perfect for a filter that will mainly be used for reads or in a single threaded context (use the LookupAsync and InsertAsync functions to bypass the mutex in this case). For very small (<20 million buckets) bloom filters, the NaiveBloomFilter can yield enormous performance boosts since most of the filter fits in the processor cache.

NaiveStripedBloomFilter

A bloom filter implemented using a byte array (with each bit in the filter assigned to a byte) but with distributed locking over 'n' shards. This provides increased concurrent throughput. This is the perfect choice for filters where read performance over multiple threads needs to be maximized (you get a performance gain from not bit mangling).

Documentation

Overview

A high performance concurrent bloom filter library with striped and naive implementations

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type BloomFilter

type BloomFilter struct {
	// contains filtered or unexported fields
}

BloomFilter is a bloomfilter backed by an array of unsigned 64 bit integers (with bits encoded in each one). It uses central locking via a RWMutex and supports both synchronous and asynchronous inserts and lookups

func NewBloomFilter

func NewBloomFilter(size uint64, hf int) (*BloomFilter, error)

NewBloomfilter allocates a BloomFilter with a given size (in bits) and using a certain number of hashes. Size must be a power of 2 and larger than 64

func (BloomFilter) Insert

func (bf BloomFilter) Insert(entry string) error

Inserts an entry into the NaiveBloomFilter. Locks the filter.

func (BloomFilter) InsertAsync

func (bf BloomFilter) InsertAsync(entry string) error

Inserts an entry into the NaiveBloomFilter. Doesn't lock the filter.

func (BloomFilter) Load

func (bf BloomFilter) Load(filename string) error

Merges current bit vector with one loaded from a file. This locks the filter for each nonzero byte being written.

func (BloomFilter) Lookup

func (bf BloomFilter) Lookup(entry string) (bool, error)

Looks up an entry into the BloomFilter. Returns true if a match is found, false otherwise. If an error occurs, will also return false. This perform a reader lock on the filter (writers must wait until all active readers finish).

func (BloomFilter) LookupAsync

func (bf BloomFilter) LookupAsync(entry string) (bool, error)

Looks up an entry into the BloomFilter. Returns true if a match is found, false otherwise. If an error occurs, will also return false. This won't lock the filter.

func (BloomFilter) Write

func (bf BloomFilter) Write(filename string) error

Writes underlying bit vector to a file.

type NaiveBloomFilter

type NaiveBloomFilter struct {
	// contains filtered or unexported fields
}

NaiveBloomFilter is a bloomfilter backed by a byte vector rather than a bitvector. As a result, lookups are faster although at an 8x space penalty. It uses central locking via a RWMutex

func NewNaiveBloomFilter

func NewNaiveBloomFilter(size uint64, hf int) (*NaiveBloomFilter, error)

NewNaiveBloomfilter allocates a NaiveBloomFilter with a given size (in bytes) and using a certain number of hashes. Size must be a power of 2 and larger than 64

func (NaiveBloomFilter) Insert

func (bf NaiveBloomFilter) Insert(entry string) error

Inserts an entry into the NaiveBloomFilter. Locks the filter.

func (NaiveBloomFilter) InsertAsync

func (bf NaiveBloomFilter) InsertAsync(entry string) error

Inserts an entry into the NaiveBloomFilter. Does not lock the filter.

func (NaiveBloomFilter) Load

func (bf NaiveBloomFilter) Load(filename string) error

Merges current byte vector with one loaded from a file. This locks the filter for each nonzero byte being written.

func (NaiveBloomFilter) Lookup

func (bf NaiveBloomFilter) Lookup(entry string) (bool, error)

Looks up an entry into the NaiveBloomFilter. Returns true if a match is found, false otherwise. If an error occurs, will also return false. This perform a reader lock on the filter (writers must wait until all active readers finish).

func (NaiveBloomFilter) LookupAsync

func (bf NaiveBloomFilter) LookupAsync(entry string) (bool, error)

Looks up an entry into the NaiveBloomFilter. Returns true if a match is found, false otherwise. If an error occurs, will also return false. This won't lock the filter.

func (NaiveBloomFilter) Write

func (bf NaiveBloomFilter) Write(filename string) error

Writes underlying byte vector to a file.

type NaiveStripedBloomFilter

type NaiveStripedBloomFilter struct {
	// contains filtered or unexported fields
}

NaiveStripedBloomFilter is a bloomfilter backed by a byte vector rather than a bit vector. It uses distributed locking via striping and supports both synchronous and asynchronous inserts and lookups.

func NewNaiveStripedBloomFilter

func NewNaiveStripedBloomFilter(size uint64, hf int, shards uint64) (*NaiveStripedBloomFilter, error)

NewNaiveStripedBloomfilter allocates a NaiveStripedBloomFilter with a given size (in bits) and using a certain number of hashes. Size must be a power of 2 and larger than 64. Shards must be a power of 2 (smaller than size) and cannot exceed size/64.

func (NaiveStripedBloomFilter) Insert

func (bf NaiveStripedBloomFilter) Insert(entry string) error

Inserts an entry into the NaiveStripedBloomFilter. Locks one shard of the filter.

func (NaiveStripedBloomFilter) InsertAsync

func (bf NaiveStripedBloomFilter) InsertAsync(entry string) error

Inserts an entry into the NaiveBloomFilter. Doesn't lock the filter.

func (NaiveStripedBloomFilter) Load

func (bf NaiveStripedBloomFilter) Load(filename string) error

Merges current bit vector with one loaded from a file. This locks the filter for each nonzero byte being written.

func (NaiveStripedBloomFilter) Lookup

func (bf NaiveStripedBloomFilter) Lookup(entry string) (bool, error)

Looks up an entry in the NaiveStripedBloomFilter. Returns true if a match is found, false otherwise. If an error occurs, will also return false. This will lock one shard of the filter.

func (NaiveStripedBloomFilter) LookupAsync

func (bf NaiveStripedBloomFilter) LookupAsync(entry string) (bool, error)

Looks up an entry in the NaiveStripedBloomFilter. Returns true if a match is found, false otherwise. If an error occurs, will also return false. This won't lock the filter.

func (NaiveStripedBloomFilter) Write

func (bf NaiveStripedBloomFilter) Write(filename string) error

Writes underlying bit vector to a file.

type StripedBloomFilter

type StripedBloomFilter struct {
	// contains filtered or unexported fields
}

StripedBloomFilter is a bloomfilter backed by an array of unsigned 64 bit integers (with bits encoded in each one). It uses distributed locking via striping and supports both synchronous and asynchronous inserts and lookups.

func NewStripedBloomFilter

func NewStripedBloomFilter(size uint64, hf int, shards uint64) (*StripedBloomFilter, error)

NewBloomfilter allocates a StripedBloomFilter with a given size (in bits) and using a certain number of hashes. Size must be a power of 2 and larger than 64. Shards must be a power of 2 (smaller than size) and cannot exceed size/64.

func (StripedBloomFilter) Insert

func (bf StripedBloomFilter) Insert(entry string) error

Inserts an entry into the StripedBloomFilter. Locks the filter.

func (StripedBloomFilter) InsertAsync

func (bf StripedBloomFilter) InsertAsync(entry string) error

Inserts an entry into the StripedBloomFilter. Doesn't lock the filter.

func (StripedBloomFilter) Load

func (bf StripedBloomFilter) Load(filename string) error

Merges current bit vector with one loaded from a file. This locks the filter for each nonzero byte being written.

func (StripedBloomFilter) Lookup

func (bf StripedBloomFilter) Lookup(entry string) (bool, error)

Looks up an entry in the StripedBloomFilter. Returns true if a match is found, false otherwise. If an error occurs, will also return false. This perform a reader lock on the filter (writers must wait until all active readers finish).

func (StripedBloomFilter) LookupAsync

func (bf StripedBloomFilter) LookupAsync(entry string) (bool, error)

Looks up an entry in the StripedBloomFilter. Returns true if a match is found, false otherwise. If an error occurs, will also return false. This won't lock the filter.

func (StripedBloomFilter) Write

func (bf StripedBloomFilter) Write(filename string) error

Writes underlying bit vector to a file.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL