abloom

package module

v0.0.0-...-3e144bd Latest Latest Go to latest Published: Nov 13, 2022 License: Apache-2.0 Imports: 5 Imported by: 0

Details

Valid go.mod file

The Go module system was introduced in Go 1.11 and is the official dependency management solution for Go.
Redistributable license

Redistributable licenses place minimal restrictions on how software can be used, modified, and redistributed.
Tagged version

Modules with tagged versions give importers more predictable builds.
Stable version

When a project reaches major version v1 it is considered stable.
Learn more about best practices

Repository

github.com/arpitbbhayani/abloom

Links

Open Source Insights

README ¶

abloom

abloom is a pure Go implementation of Bloom Filter. The implementation supports the following Bloom Filters

Simple Bloom Filter
Deletable Bloom Filter

Installation

$ go get github.com/arpitbbhayani/abloom

How to use

A simple example of using bloom filter is as shown below. More examples can be found in the examples directory of the package.

package main

import (
	"log"

	"github.com/arpitbbhayani/abloom"
)

func main() {
	b := abloom.NewSimpleBF(512, nil)
	b.Put([]byte("apple"))
	b.Put([]byte("banana"))
	b.Put([]byte("cat"))

	v, err := b.Check([]byte("apple"))
	if err != nil {
		log.Fatal("error while computing the hash")
	}
	log.Println("is apple present?", v)
}

Running Tests

To run all the tests fire the following command

$ go test ./...

Behaviour

The behaviour of bloom filter on various parameters is evalated on the usecase of profanity detector as documented in the examples directory. Please look at the source code to understand what and how we have benchmarked.

Bloom Filter Size vs False Positive Rate

We can see how the size of bloom filter is related to the observed false positivity rate. Larger the size, smaller the false positivity rate.

size-fp

Number of Hash Functions vs False Positive Rate and Time

Here we see how the number of hash functions used in a bloom filter affects the false positivity rate and also time. The FP rate decreases initially but after a certain stage it saturates and then increases.

The time takes for the operation increases almost linearly with the number of hash functions given how expensive hash computation can get.

hash-fp hash-time

Deletable Bloom Filter Number of Regions vs False Positive Rate

dbf-region

When not to use BF

paradox-benchmark

Benchmark

Preliminary benchmark results on profanity usecase are shown below. Please refer to the perf test code to see what has been tested and benchmarked.

goos: linux
goarch: amd64
pkg: github.com/arpitbbhayani/abloom/examples/perftest
cpu: AMD Ryzen 7 4800U with Radeon Graphics         
BenchmarkBFCheck-16                          381           3172999 ns/op
BenchmarkBFFirstCheck-16                20828637                51.24 ns/op
BenchmarkBFOneRandomCheck-16            11330626               112.2 ns/op
BenchmarkSetCheck-16                         230           4906682 ns/op
BenchmarkSetFirstCheck-16               82293306                14.12 ns/op
BenchmarkSetOneRandomCheck-16            6412029               178.6 ns/op

Citations

Bloom, Burton H. “Space/Time Trade-Offs in Hash Coding with Allowable Errors.” Communications of the ACM 13, no. 7 (July 1970): 422–26. https://doi.org/10.1145/362686.362692.
Rothenberg, Christian Esteve, Carlos A. B. Macapuna, Fabio L. Verdi, and Mauricio F. Magalhaes. “The Deletable Bloom Filter: A New Member of the Bloom Family.” arXiv, May 3, 2010. http://arxiv.org/abs/1005.0352.

Contributors

License

Abloom is open-sourced under Apache License, Version 2.0.

Documentation ¶

Index ¶

Constants
type DeletableBloom
- func NewDeletableBF(size int, hashSeeds []int, numRegions int) *DeletableBloom
type SimpleBF
- func NewSimpleBF(size int, hashSeeds []int) *SimpleBF
- func (b *SimpleBF) Check(x []byte) (bool, error)
- func (b *SimpleBF) Put(x []byte) error

Constants ¶

View Source

const DefaultHashFns int = 2

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type DeletableBloom ¶

type DeletableBloom struct {
	// contains filtered or unexported fields
}

func NewDeletableBF ¶

func NewDeletableBF(size int, hashSeeds []int, numRegions int) *DeletableBloom

NewDeletableBF creates a deletable bloom filter of length `size` bytes and `hashSeeds` are the seed values for the murmur hash functions bloom will be initialized with len(hashSeeds) hash functions with provided seeds. if no hashSeeds are provided then 2 hash functions will be used with random seeds. numRegions are the number of regions in which the bloom filter needs to be split so as to make it deletable. For each region, one extra bit of memory will be allocated.

func (*DeletableBloom) Check ¶

func (b *DeletableBloom) Check(x []byte) (bool, error)

Check checks the existence of the element `x` in the bloom filter

func (*DeletableBloom) Delete ¶

func (b *DeletableBloom) Delete(x []byte) (bool, error)

Delete possibly deletes the element `x` from the bloom filter the element `x` returns error if any and if the element `x` was deleted or not

func (*DeletableBloom) Put ¶

func (b *DeletableBloom) Put(x []byte) error

Put puts the element `x` in the deletable bloom filter

type SimpleBF ¶

type SimpleBF struct {
	// contains filtered or unexported fields
}

func NewSimpleBF ¶

func NewSimpleBF(size int, hashSeeds []int) *SimpleBF

NewSimpleBF creates a simple bloom filter of length `size` bytes and `hashSeeds` are the seed values for themurmur hash functions bloom will be initialized with len(hashSeeds) hash functions with provided seeds. if no hashSeeds are provided then 2 hash functions will be used with random seeds.

func (*SimpleBF) Check ¶

func (b *SimpleBF) Check(x []byte) (bool, error)

Check checks the existence of the element `x` in the bloom filter

func (*SimpleBF) Put ¶

func (b *SimpleBF) Put(x []byte) error

Put puts the element `x` in the bloom filter

Source Files ¶

View all Source files

Directories ¶

Path	Synopsis
examples
bloom-paradox
profanity-detector
simple-bf

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL