abloom

package module
v0.0.0-...-3e144bd Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 13, 2022 License: Apache-2.0 Imports: 5 Imported by: 0

README

abloom

abloom is a pure Go implementation of Bloom Filter. The implementation supports the following Bloom Filters

  • Simple Bloom Filter
  • Deletable Bloom Filter

Installation

$ go get github.com/arpitbbhayani/abloom

How to use

A simple example of using bloom filter is as shown below. More examples can be found in the examples directory of the package.

package main

import (
	"log"

	"github.com/arpitbbhayani/abloom"
)

func main() {
	b := abloom.NewSimpleBF(512, nil)
	b.Put([]byte("apple"))
	b.Put([]byte("banana"))
	b.Put([]byte("cat"))

	v, err := b.Check([]byte("apple"))
	if err != nil {
		log.Fatal("error while computing the hash")
	}
	log.Println("is apple present?", v)
}

Running Tests

To run all the tests fire the following command

$ go test ./...

Behaviour

The behaviour of bloom filter on various parameters is evalated on the usecase of profanity detector as documented in the examples directory. Please look at the source code to understand what and how we have benchmarked.

Bloom Filter Size vs False Positive Rate

We can see how the size of bloom filter is related to the observed false positivity rate. Larger the size, smaller the false positivity rate.

size-fp

Number of Hash Functions vs False Positive Rate and Time

Here we see how the number of hash functions used in a bloom filter affects the false positivity rate and also time. The FP rate decreases initially but after a certain stage it saturates and then increases.

The time takes for the operation increases almost linearly with the number of hash functions given how expensive hash computation can get.

hash-fp hash-time

Deletable Bloom Filter Number of Regions vs False Positive Rate

dbf-region

When not to use BF

paradox-benchmark

Benchmark

Preliminary benchmark results on profanity usecase are shown below. Please refer to the perf test code to see what has been tested and benchmarked.

goos: linux
goarch: amd64
pkg: github.com/arpitbbhayani/abloom/examples/perftest
cpu: AMD Ryzen 7 4800U with Radeon Graphics         
BenchmarkBFCheck-16                          381           3172999 ns/op
BenchmarkBFFirstCheck-16                20828637                51.24 ns/op
BenchmarkBFOneRandomCheck-16            11330626               112.2 ns/op
BenchmarkSetCheck-16                         230           4906682 ns/op
BenchmarkSetFirstCheck-16               82293306                14.12 ns/op
BenchmarkSetOneRandomCheck-16            6412029               178.6 ns/op

Citations

  • Bloom, Burton H. “Space/Time Trade-Offs in Hash Coding with Allowable Errors.” Communications of the ACM 13, no. 7 (July 1970): 422–26. https://doi.org/10.1145/362686.362692.
  • Rothenberg, Christian Esteve, Carlos A. B. Macapuna, Fabio L. Verdi, and Mauricio F. Magalhaes. “The Deletable Bloom Filter: A New Member of the Bloom Family.” arXiv, May 3, 2010. http://arxiv.org/abs/1005.0352.

Contributors

License

Abloom is open-sourced under Apache License, Version 2.0.

Documentation

Index

Constants

View Source
const DefaultHashFns int = 2

Variables

This section is empty.

Functions

This section is empty.

Types

type DeletableBloom

type DeletableBloom struct {
	// contains filtered or unexported fields
}

func NewDeletableBF

func NewDeletableBF(size int, hashSeeds []int, numRegions int) *DeletableBloom

NewDeletableBF creates a deletable bloom filter of length `size` bytes and `hashSeeds` are the seed values for the murmur hash functions bloom will be initialized with len(hashSeeds) hash functions with provided seeds. if no hashSeeds are provided then 2 hash functions will be used with random seeds. numRegions are the number of regions in which the bloom filter needs to be split so as to make it deletable. For each region, one extra bit of memory will be allocated.

func (*DeletableBloom) Check

func (b *DeletableBloom) Check(x []byte) (bool, error)

Check checks the existence of the element `x` in the bloom filter

func (*DeletableBloom) Delete

func (b *DeletableBloom) Delete(x []byte) (bool, error)

Delete possibly deletes the element `x` from the bloom filter the element `x` returns error if any and if the element `x` was deleted or not

func (*DeletableBloom) Put

func (b *DeletableBloom) Put(x []byte) error

Put puts the element `x` in the deletable bloom filter

type SimpleBF

type SimpleBF struct {
	// contains filtered or unexported fields
}

func NewSimpleBF

func NewSimpleBF(size int, hashSeeds []int) *SimpleBF

NewSimpleBF creates a simple bloom filter of length `size` bytes and `hashSeeds` are the seed values for themurmur hash functions bloom will be initialized with len(hashSeeds) hash functions with provided seeds. if no hashSeeds are provided then 2 hash functions will be used with random seeds.

func (*SimpleBF) Check

func (b *SimpleBF) Check(x []byte) (bool, error)

Check checks the existence of the element `x` in the bloom filter

func (*SimpleBF) Put

func (b *SimpleBF) Put(x []byte) error

Put puts the element `x` in the bloom filter

Directories

Path Synopsis
examples

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL