apbf

package module
v0.0.0-...-171306a Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jul 15, 2022 License: MIT Imports: 7 Imported by: 0

README

Age-Partitioned Bloom Filters

Age-Partitioned Bloom Filters (APBF) is a novel approach for duplicate detection in sliding windows over an unbounded stream of items described in Age-Partitioned Bloom Filters: Ariel Shtul, Carlos Baquero and Paulo Sérgio Almeida, 2020.

The implementation employs the enhanced double hashing technique for fast index computation introduced in Bloom Filters in Probabilistic Verification: Peter C. Dillinger and Panagiotis Manolios, 2004.

Example

// create a filter with k=10, l=7, and g=1000
filter := apbf.New(10, 7, 1000)

item := []byte("test item")
filter.Add(item)

if filter.Query(item) {
    fmt.Println("item was found")
}

Installation

Use go get to add the project to your workspace:

go get -u github.com/CrowdStrike/apbf

Benchmarks

The following results show the performance of main filter operations Add and Query with and without refresh enabled for a small and large filter. Benchmarks were executed on a MacBook Pro 2017 dev laptop.

BenchmarkSmallFilterAdd-8                20000000       103 ns/op       0 B/op       0 allocs/op
BenchmarkSmallFilterAddWithRefresh-8     10000000       177 ns/op       0 B/op       0 allocs/op
BenchmarkSmallFilterQuery-8              10000000       133 ns/op       0 B/op       0 allocs/op
BenchmarkSmallFilterQueryWithRefresh-8   10000000       206 ns/op       0 B/op       0 allocs/op
BenchmarkLargeFilterAdd-8                 5000000       252 ns/op       0 B/op       0 allocs/op
BenchmarkLargeFilterAddWithRefresh-8      5000000       325 ns/op       0 B/op       0 allocs/op
BenchmarkLargeFilterQuery-8               3000000       431 ns/op       0 B/op       0 allocs/op
BenchmarkLargeFilterQueryWithRefresh-8    2000000       543 ns/op       0 B/op       0 allocs/op

Contributors

Bogdan-Ciprian Rusu - Author/Maintainer

License

The project is licensed under the MIT License.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CalculateFalsePositiveRate

func CalculateFalsePositiveRate(k, l uint) float64

CalculateFalsePositiveRate computes the false positive rate for given k and l parameters.

Types

type Filter

type Filter struct {
	// contains filtered or unexported fields
}

Filter represents the Age-Partitioned Bloom Filter (APBF). The implementation is safe for concurrent use.

func New

func New(k, l, g uint) *Filter

New returns the APBF with k + l slices and g generation size.

func NewFromSnapshot

func NewFromSnapshot(s Snapshot) *Filter

NewFromSnapshot recreates the matching filter from the provided snapshot.

func NewWithRefresh

func NewWithRefresh(k, l, g uint, r time.Duration) *Filter

NewWithRefresh returns the APBF with k + l slices, g generation size, and r refresh interval.

func (*Filter) Add

func (f *Filter) Add(item []byte)

Add item to the set.

func (*Filter) MaxCapacity

func (f *Filter) MaxCapacity() int

MaxCapacity returns filter max capacity.

func (*Filter) MaxGenerations

func (f *Filter) MaxGenerations() int

MaxGenerations returns filter max generations count.

func (*Filter) NextGeneration

func (f *Filter) NextGeneration()

NextGeneration transitions to next generation.

func (*Filter) Query

func (f *Filter) Query(item []byte) bool

Query returns true if the item is in the set and false otherwise. A true value might be a false positive whereas false is always correct.

func (*Filter) Snapshot

func (f *Filter) Snapshot() Snapshot

Snapshot returns a consistent snapshot of filter state.

type Snapshot

type Snapshot struct {
	K                    uint64   `protobuf:"varint,1,opt,name=k,proto3" json:"k,omitempty"`
	L                    uint64   `protobuf:"varint,2,opt,name=l,proto3" json:"l,omitempty"`
	G                    uint64   `protobuf:"varint,3,opt,name=g,proto3" json:"g,omitempty"`
	R                    uint64   `protobuf:"varint,4,opt,name=r,proto3" json:"r,omitempty"`
	Base                 uint64   `protobuf:"varint,5,opt,name=base,proto3" json:"base,omitempty"`
	Count                uint64   `protobuf:"varint,6,opt,name=count,proto3" json:"count,omitempty"`
	Buffer               []byte   `protobuf:"bytes,7,opt,name=buffer,proto3" json:"buffer,omitempty"`
	XXX_NoUnkeyedLiteral struct{} `json:"-"`
	XXX_unrecognized     []byte   `json:"-"`
	XXX_sizecache        int32    `json:"-"`
}

func (*Snapshot) Descriptor

func (*Snapshot) Descriptor() ([]byte, []int)

func (*Snapshot) GetBase

func (m *Snapshot) GetBase() uint64

func (*Snapshot) GetBuffer

func (m *Snapshot) GetBuffer() []byte

func (*Snapshot) GetCount

func (m *Snapshot) GetCount() uint64

func (*Snapshot) GetG

func (m *Snapshot) GetG() uint64

func (*Snapshot) GetK

func (m *Snapshot) GetK() uint64

func (*Snapshot) GetL

func (m *Snapshot) GetL() uint64

func (*Snapshot) GetR

func (m *Snapshot) GetR() uint64

func (*Snapshot) ProtoMessage

func (*Snapshot) ProtoMessage()

func (*Snapshot) Reset

func (m *Snapshot) Reset()

func (*Snapshot) String

func (m *Snapshot) String() string

func (*Snapshot) XXX_DiscardUnknown

func (m *Snapshot) XXX_DiscardUnknown()

func (*Snapshot) XXX_Marshal

func (m *Snapshot) XXX_Marshal(b []byte, deterministic bool) ([]byte, error)

func (*Snapshot) XXX_Merge

func (m *Snapshot) XXX_Merge(src proto.Message)

func (*Snapshot) XXX_Size

func (m *Snapshot) XXX_Size() int

func (*Snapshot) XXX_Unmarshal

func (m *Snapshot) XXX_Unmarshal(b []byte) error

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL