`import "github.com/yourbasic/bloom"`

Package bloom provides a Bloom filter implementation.

A Bloom filter is a fast and space-efficient probabilistic data structure used to test set membership.

A membership test returns either ”likely member” or ”definitely not a member”. Only false positives can occur: an element that has been added to the filter will always be identified as ”likely member”.

The probabilities of different outcomes of a membership test at a false-positives rate of 1/100 are:

Test(s) true false -------------------------------------- s has been added 1 0 s has not been added 0.01 0.99

Elements can be added, but not removed. With more elements in the filter, the probability of false positives increases.

A full filter with a false-positives rate of 1/p uses roughly 0.26ln(p) bytes per element and performs ⌈1.4ln(p)⌉ bit array lookups per test:

p bytes lookups ------------------------- 4 0.4 2 8 0.5 3 16 0.7 4 32 0.9 5 64 1.1 6 128 1.3 7 256 1.5 8 512 1.6 9 1024 1.8 10

Each membership test makes a single call to a 128-bit hash function. This improves speed without increasing the false-positives rate as shown by Kirsch and Mitzenmacher.

This implementation is not intended for cryptographic use.

The internal data representation is different for big-endian and little-endian machines.

The Basics example contains a typcial use case: a blacklist of shady websites.

Build a blacklist of shady websites.

Code:

// Create a Bloom filter with room for 10000 elements // at a false-positives rate less than 0.5 percent. blacklist := bloom.New(10000, 200) // Add an element to the filter. url := "https://rascal.com" blacklist.Add(url) // Test for membership. if blacklist.Test(url) { fmt.Println(url, "seems to be shady.") } else { fmt.Println(url, "has not yet been added to our blacklist.") }

Output:

https://rascal.com seems to be shady.

❖

```
type Filter struct {
// contains filtered or unexported fields
}
```

Filter represents a Bloom filter.

New creates an empty Bloom filter with room for n elements at a false-positives rate less than 1/p.

Add adds s to the filter and tells if s was already a likely member.

AddByte adds b to the filter and tells if b was already a likely member.

Count returns an estimate of the number of elements in the filter.

Test tells if s is a likely member of the filter. If true, s is probably a member; if false, s is definitely not a member.

TestByte tells if b is a likely member of the filter. If true, b is probably a member; if false, b is definitely not a member.

Union returns a new Bloom filter that consists of all elements that belong to either f1 or f2. The two filters must be of the same size n and have the same false-positives rate p.

The resulting filter is the same as the filter created from scratch using the union of the two sets.

Compute the union of two filters.

Code:

// Create two Bloom filters, each with room for 1000 elements // at a false-positives rate less than 1/100. n, p := 1000, 100 f1 := bloom.New(n, p) f2 := bloom.New(n, p) // Add "0", "2", …, "498" to f1 for i := 0; i < n/2; i += 2 { f1.Add(strconv.Itoa(i)) } // Add "1", "3", …, "499" to f2 for i := 1; i < n/2; i += 2 { f2.Add(strconv.Itoa(i)) } // Compute the approximate size of f1 ∪ f2. fmt.Println("f1 ∪ f2:", f1.Union(f2).Count())

Output:

f1 ∪ f2: 505

Package bloom imports 1 packages (graph). Updated 2017-12-20. Refresh now. Tools for package owners.