bloomfilter

package module
v0.0.0-...-0e70bfd Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 9, 2018 License: GPL-3.0 Imports: 2 Imported by: 0

README

Bloom Filter

GoDoc Master Build Status Coverage Status Go Report Card

This Go implementation of a Bloom filter uses the non-cryptographic Fowler–Noll–Vo hash function for speed.

A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not, thus a Bloom filter has a 100% recall rate (Bloom, 1970)

Installation

go get -u github.com/Echelon9/bloomfilter-go

Usage

This implementation accepts keys for adding and testing as string

import (
	...
	"github.com/Echelon9/bloomfilter-go"
	...
)

set := bloomfilter.New(1000) // Size
set.Add("Hello, world!")
set.Test("Hello, world!") // => true
set.Test("String Not-appearing-in-this-set") // => false

Optimizations

The Bloom filter implemented in this library is the "Standard" variant.

There are no current plans to implement future modifications for additional variants, but patches are welcome!

However, this Bloom filter implementation does adopt an idea from (Kirsh and Mitzenmachner, 2006) that two hash functions can simulate an arbitrary number of additional hash functions without losing functionality.

Less hashing, same performance: Building a better bloom filter (2006) Adam Kirsch, Michael Mitzenmacher

"A standard technique from the hashing literature is to use two hash functions h1(x) and h2(x) to simulate additional hash functions of the form gi(x) = h1(x) + ih2(x). We demonstrate that this technique can be usefully applied to Bloom filters and related data structures. Specifically, only two hash functions are necessary to effectively implement a Bloom filter without any loss in the asymptotic false positive probability. This leads to less computation and potentially less need for randomness in practice."

Running all tests

Before committing the code, please check if it passes all tests using:

go test -v ./...

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type BloomFilter

type BloomFilter struct {
	// contains filtered or unexported fields
}

BloomFilter probabilistic data structure definition

func New

func New(size uint) *BloomFilter

New returns a new BloomFilter object

func (*BloomFilter) Add

func (f *BloomFilter) Add(item string)

Add item to the Bloom filter

func (*BloomFilter) EstimatedFillRatio

func (f *BloomFilter) EstimatedFillRatio() float64

EstimatedFillRatio returns the estimated fill ratio of the Bloom filter

func (*BloomFilter) Test

func (f *BloomFilter) Test(item string) (exists bool)

Test returns true if the item is in the Bloom filter, false otherwise. If true, the result might be a false positive. If false, the data is definitely not in the set.

type Interface

type Interface interface {
	Add(item string)             // Adds item to the BloomFilter
	Test(item string) bool       // Test if item is maybe in the BloomFilter
	EstimatedFillRatio() float64 // Estimate fill ratio of the BloomFilter
}

Interface provides a template for a Bloom filter implementation

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL