package filter

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func OptimalK

func OptimalK(fpRate float64) uint

OptimalK calculates the optimal number of hash functions to use for a Bloom filter based on the desired rate of false positives.

func OptimalM

func OptimalM(n uint, fpRate float64) uint

OptimalM calculates the optimal Bloom filter size, m, based on the number of items and the desired rate of false positives.
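
The bodies of these helpers aren't shown on this page, but the textbook optima are m = ⌈−n·ln(p) / (ln 2)²⌉ and k = ⌈log₂(1/p)⌉. A minimal sketch under the assumption that these are the formulas used (the package's exact rounding may differ):

package main

import (
	"fmt"
	"math"
)

// optimalM mirrors the textbook sizing formula m = ceil(-n*ln(p) / (ln 2)^2).
func optimalM(n uint, fpRate float64) uint {
	return uint(math.Ceil(-float64(n) * math.Log(fpRate) / (math.Ln2 * math.Ln2)))
}

// optimalK mirrors k = ceil(log2(1/p)).
func optimalK(fpRate float64) uint {
	return uint(math.Ceil(math.Log2(1 / fpRate)))
}

func main() {
	fmt.Println(optimalM(10000, 0.01)) // 95851 bits
	fmt.Println(optimalK(0.01))        // 7
}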

Types

type BucketGetter

type BucketGetter interface {
	Get(bucket uint) uint32
}

type Buckets

type Buckets struct {
	// contains filtered or unexported fields
}

Buckets is a fast, space-efficient array of buckets where each bucket can store up to a configured maximum value.

func NewBuckets

func NewBuckets(count uint, bucketSize uint8) *Buckets

NewBuckets creates a new Buckets with the provided number of buckets where each bucket is the specified number of bits.

func (*Buckets) Count

func (b *Buckets) Count() uint

Count returns the number of buckets.

func (*Buckets) DecodeFrom

func (b *Buckets) DecodeFrom(data []byte) (int64, error)

DecodeFrom reads a binary representation of Buckets (such as might have been written by WriteTo()) from a buffer. Whereas ReadFrom() reads the entire data into memory and makes a copy of the data buffer, DecodeFrom keeps a reference to the original data buffer, so the resulting Buckets must be treated as read-only: it can serve Test-style lookups but must not be modified.

func (*Buckets) Get

func (b *Buckets) Get(bucket uint) uint32

Get returns the value in the specified bucket.

func (*Buckets) GobDecode

func (b *Buckets) GobDecode(data []byte) error

GobDecode implements gob.GobDecoder interface.

func (*Buckets) GobEncode

func (b *Buckets) GobEncode() ([]byte, error)

GobEncode implements gob.GobEncoder interface.

func (*Buckets) Increment

func (b *Buckets) Increment(bucket uint, delta int32) *Buckets

Increment will increment the value in the specified bucket by the provided delta. A bucket can be decremented by providing a negative delta. The value is clamped to zero and the maximum bucket value. Returns itself to allow for chaining.
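
For example, a short usage sketch of the clamping described above (the import path is a placeholder for wherever this package lives in your module):

package main

import (
	"fmt"

	filter "example.com/placeholder/filter" // hypothetical import path
)

func main() {
	b := filter.NewBuckets(100, 4) // 100 buckets of 4 bits each; max value is 15
	b.Increment(0, 20)             // delta exceeds the maximum, so the value clamps to 15
	fmt.Println(b.Get(0))          // 15
	b.Increment(0, -100)           // negative delta decrements; clamps to zero
	fmt.Println(b.Get(0))          // 0
}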

func (*Buckets) MaxBucketValue

func (b *Buckets) MaxBucketValue() uint8

MaxBucketValue returns the maximum value that can be stored in a bucket.

func (*Buckets) PopCount

func (b *Buckets) PopCount() (count int)

PopCount returns the number of set bits (the population count, or Hamming weight) of the underlying bit array.

func (*Buckets) ReadFrom

func (b *Buckets) ReadFrom(stream io.Reader) (int64, error)

ReadFrom reads a binary representation of Buckets (such as might have been written by WriteTo()) from an i/o stream. It returns the number of bytes read.

func (*Buckets) Reset

func (b *Buckets) Reset() *Buckets

Reset restores the Buckets to the original state. Returns itself to allow for chaining.

func (*Buckets) Set

func (b *Buckets) Set(bucket uint, value uint8) *Buckets

Set will set the bucket value. The value is clamped to zero and the maximum bucket value. Returns itself to allow for chaining.

func (*Buckets) WriteTo

func (b *Buckets) WriteTo(stream io.Writer) (int64, error)

WriteTo writes a binary representation of Buckets to an i/o stream. It returns the number of bytes written.
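
A round-trip sketch of the WriteTo/DecodeFrom pair (assuming, as is conventional, that decoding into a zero-value Buckets populates it; the import path is again a placeholder):

package main

import (
	"bytes"
	"fmt"
	"log"

	filter "example.com/placeholder/filter" // hypothetical import path
)

func main() {
	b := filter.NewBuckets(100, 4).Set(1, 9)

	var buf bytes.Buffer
	if _, err := b.WriteTo(&buf); err != nil {
		log.Fatal(err)
	}

	// DecodeFrom keeps a reference to buf's bytes rather than copying them,
	// so the decoded Buckets is read-only and the buffer must outlive it.
	var decoded filter.Buckets
	if _, err := decoded.DecodeFrom(buf.Bytes()); err != nil {
		log.Fatal(err)
	}
	fmt.Println(decoded.Get(1)) // 9
}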

type Checker

type Checker interface {
	// Test will test for membership of the data and returns true if it is a
	// member, false if not.
	Test(data []byte) bool
}

type Filter

type Filter interface {
	Checker

	// Add will add the data to the Bloom filter. It returns the filter to
	// allow for chaining.
	Add([]byte) Filter

	// TestAndAdd is equivalent to calling Test followed by Add. It returns
	// true if the data is a member, false if not.
	TestAndAdd([]byte) bool
}

Filter is a probabilistic data structure which is used to test the membership of an element in a set.

type PartitionedBloomFilter

type PartitionedBloomFilter struct {
	// contains filtered or unexported fields
}

PartitionedBloomFilter implements a variation of a classic Bloom filter as described by Almeida, Baquero, Preguica, and Hutchison in Scalable Bloom Filters:

http://gsd.di.uminho.pt/members/cbm/ps/dbloom.pdf

This filter works by partitioning the M-sized bit array into k slices of size m = M/k bits. Each hash function produces an index over m for its respective slice. Thus, each element is described by exactly k bits, meaning the distribution of false positives is uniform across all elements.
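
A hypothetical sketch of that layout (the names here are illustrative, not the package's internals):

// partitionIndexes maps one element's k hash values onto the k slices: with
// slice size m = M/k, the element sets exactly one bit per slice, so every
// element is described by exactly k bits.
func partitionIndexes(hashes []uint64, m uint64) []uint64 {
	idx := make([]uint64, len(hashes))
	for i, h := range hashes {
		idx[i] = h % m // bit index within slice i
	}
	return idx
}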

func NewPartitionedBloomFilter

func NewPartitionedBloomFilter(n uint, fpRate float64) *PartitionedBloomFilter

NewPartitionedBloomFilter creates a new partitioned Bloom filter optimized to store n items with a specified target false-positive rate.

func NewPartitionedBloomFilterWithCapacity

func NewPartitionedBloomFilterWithCapacity(m uint, fpRate float64) *PartitionedBloomFilter

NewPartitionedBloomFilterWithCapacity creates a new partitioned Bloom filter with a specific capacity, m, and the specified target false-positive rate.

func (*PartitionedBloomFilter) Add

func (p *PartitionedBloomFilter) Add(data []byte) Filter

Add will add the data to the Bloom filter. It returns the filter to allow for chaining.

func (*PartitionedBloomFilter) Capacity

func (p *PartitionedBloomFilter) Capacity() uint

Capacity returns the Bloom filter capacity, m.

func (*PartitionedBloomFilter) Count

func (p *PartitionedBloomFilter) Count() uint

Count returns the number of items added to the filter.

func (*PartitionedBloomFilter) DecodeFrom

func (p *PartitionedBloomFilter) DecodeFrom(data []byte) (int64, error)

DecodeFrom reads a binary representation of PartitionedBloomFilter (such as might have been written by WriteTo()) from a buffer. Whereas ReadFrom() calls Buckets.ReadFrom() hence making a copy of the data, DecodeFrom calls Buckets.DecodeFrom which keeps a reference to the original data buffer.

func (*PartitionedBloomFilter) EstimatedFillRatio

func (p *PartitionedBloomFilter) EstimatedFillRatio() float64

EstimatedFillRatio returns the current estimated ratio of set bits.

func (*PartitionedBloomFilter) FillRatio

func (p *PartitionedBloomFilter) FillRatio() float64

FillRatio returns the average ratio of set bits across all partitions.

func (*PartitionedBloomFilter) GobDecode

func (p *PartitionedBloomFilter) GobDecode(data []byte) error

GobDecode implements gob.GobDecoder interface.

func (*PartitionedBloomFilter) GobEncode

func (p *PartitionedBloomFilter) GobEncode() ([]byte, error)

GobEncode implements gob.GobEncoder interface.

func (*PartitionedBloomFilter) K

func (p *PartitionedBloomFilter) K() uint

K returns the number of hash functions.

func (*PartitionedBloomFilter) OptimalCount

func (p *PartitionedBloomFilter) OptimalCount() uint

OptimalCount returns the optimal number of distinct items that can be stored in this filter.

func (*PartitionedBloomFilter) ReadFrom

func (p *PartitionedBloomFilter) ReadFrom(stream io.Reader) (int64, error)

ReadFrom reads a binary representation of PartitionedBloomFilter (such as might have been written by WriteTo()) from an i/o stream. It returns the number of bytes read.

func (*PartitionedBloomFilter) Reset

func (p *PartitionedBloomFilter) Reset() *PartitionedBloomFilter

Reset restores the Bloom filter to its original state. It returns the filter to allow for chaining.

func (*PartitionedBloomFilter) SetHash

func (p *PartitionedBloomFilter) SetHash(h hash.Hash64)

SetHash sets the hashing function used in the filter. For the effect on false positive rates see: https://github.com/tylertreat/BoomFilters/pull/1

func (*PartitionedBloomFilter) Test

func (p *PartitionedBloomFilter) Test(data []byte) bool

Test will test for membership of the data and returns true if it is a member, false if not. This is a probabilistic test, meaning there is a non-zero probability of false positives but a zero probability of false negatives. Due to the way the filter is partitioned, the probability of false positives is uniformly distributed across all elements.

func (*PartitionedBloomFilter) TestAndAdd

func (p *PartitionedBloomFilter) TestAndAdd(data []byte) bool

TestAndAdd is equivalent to calling Test followed by Add. It returns true if the data is a member, false if not.

func (*PartitionedBloomFilter) UpdateCount

func (p *PartitionedBloomFilter) UpdateCount() float64

UpdateCount re-estimates the number of distinct items in the filter. Since duplicates can be added to a Bloom filter, the count is updated with the estimate n ≈ −m ln(1 − p) from https://gsd.di.uminho.pt/members/cbm/ps/dbloom.pdf, where p is the exact fill ratio. This prevents the count from drifting far from the true value when duplicates are added. NOTE: this calls FillRatio, which computes the Hamming weight across all bits and can be relatively expensive. It returns the current exact fill ratio.
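
A sketch of that estimate in isolation (illustrative only, not the package's code):

import "math"

// estimateCount applies n ≈ -m*ln(1-p) for a filter of m bits whose exact
// fill ratio is p.
func estimateCount(m uint, p float64) float64 {
	return -float64(m) * math.Log(1 - p)
}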

func (*PartitionedBloomFilter) WriteTo

func (p *PartitionedBloomFilter) WriteTo(stream io.Writer) (int64, error)

WriteTo writes a binary representation of the PartitionedBloomFilter to an i/o stream. It returns the number of bytes written.

type ScalableBloomFilter

type ScalableBloomFilter struct {
	// contains filtered or unexported fields
}

ScalableBloomFilter implements a Scalable Bloom Filter as described by Almeida, Baquero, Preguica, and Hutchison in Scalable Bloom Filters:

http://gsd.di.uminho.pt/members/cbm/ps/dbloom.pdf

A Scalable Bloom Filter dynamically adapts to the number of elements in the data set while enforcing a tight upper bound on the false-positive rate. This works by adding Bloom filters with geometrically decreasing false-positive rates as filters become full. The tightening ratio, r, controls the filter growth. The compounded probability over the whole series converges to a target value, even accounting for an infinite series.
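
Concretely, if the i-th filter in the series is assigned false-positive rate fp₀·rⁱ with 0 < r < 1, the compounded rate is bounded by Σ fp₀·rⁱ = fp₀/(1 − r), which stays finite even for an infinite series. A small illustrative sketch (not the package's code):

import "math"

// seriesFPRate returns the false-positive rate assigned to the i-th filter in
// the series: fp0 * r^i. Summed over all i this is bounded by fp0 / (1 - r).
func seriesFPRate(fp0, r float64, i int) float64 {
	return fp0 * math.Pow(r, float64(i))
}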

Scalable Bloom Filters are useful for cases where the size of the data set isn't known a priori and memory constraints aren't of particular concern. For situations where memory is bounded, consider using Inverse or Stable Bloom Filters.

func NewDefaultScalableBloomFilter

func NewDefaultScalableBloomFilter(fpRate float64) *ScalableBloomFilter

NewDefaultScalableBloomFilter creates a new Scalable Bloom Filter with the specified target false-positive rate and an optimal tightening ratio.

func NewScalableBloomFilter

func NewScalableBloomFilter(hint uint, fpRate, r float64) *ScalableBloomFilter

NewScalableBloomFilter creates a new Scalable Bloom Filter with the specified target false-positive rate and tightening ratio. Use NewDefaultScalableBloomFilter if you don't want to calculate these parameters.
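
A minimal usage sketch (the import path is a placeholder, and r = 0.8 is shown purely as an example tightening ratio):

package main

import (
	"fmt"

	filter "example.com/placeholder/filter" // hypothetical import path
)

func main() {
	sbf := filter.NewScalableBloomFilter(10000, 0.01, 0.8) // hint, fpRate, tightening ratio r
	sbf.Add([]byte("foo"))
	fmt.Println(sbf.Test([]byte("foo")))       // true
	fmt.Println(sbf.TestAndAdd([]byte("bar"))) // false; "bar" is now added
}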

func (*ScalableBloomFilter) Add

func (s *ScalableBloomFilter) Add(data []byte) Filter

Add will add the data to the Bloom filter. It returns the filter to allow for chaining.

func (*ScalableBloomFilter) Capacity

func (s *ScalableBloomFilter) Capacity() uint

Capacity returns the current Scalable Bloom Filter capacity, which is the sum of the capacities for the contained series of Bloom filters.

func (*ScalableBloomFilter) DecodeFrom

func (s *ScalableBloomFilter) DecodeFrom(data []byte) (int64, error)

DecodeFrom reads a binary representation of ScalableBloomFilter (such as might have been written by WriteTo()) from a buffer. Whereas ReadFrom() calls PartitionedBloomFilter.ReadFrom() hence making a copy of the data, DecodeFrom calls PartitionedBloomFilter.DecodeFrom which keeps a reference to the original data buffer.

func (*ScalableBloomFilter) FillRatio

func (s *ScalableBloomFilter) FillRatio() float64

FillRatio returns the average ratio of set bits across every filter.

func (*ScalableBloomFilter) GobDecode

func (s *ScalableBloomFilter) GobDecode(data []byte) error

GobDecode implements gob.GobDecoder interface.

func (*ScalableBloomFilter) GobEncode

func (s *ScalableBloomFilter) GobEncode() ([]byte, error)

GobEncode implements gob.GobEncoder interface.

func (*ScalableBloomFilter) K

func (s *ScalableBloomFilter) K() uint

K returns the number of hash functions used in each Bloom filter. Because successive filters in the series use more hash functions, this returns the highest value, i.e. that of the last filter.

func (*ScalableBloomFilter) ReadFrom

func (s *ScalableBloomFilter) ReadFrom(stream io.Reader) (int64, error)

ReadFrom reads a binary representation of ScalableBloomFilter (such as might have been written by WriteTo()) from an i/o stream. It returns the number of bytes read.

func (*ScalableBloomFilter) Reset

func (s *ScalableBloomFilter) Reset() *ScalableBloomFilter

Reset restores the Bloom filter to its original state. It returns the filter to allow for chaining.

func (*ScalableBloomFilter) SetHash

func (s *ScalableBloomFilter) SetHash(h hash.Hash64)

SetHash sets the hashing function used in the filter. For the effect on false positive rates see: https://github.com/tylertreat/BoomFilters/pull/1

func (*ScalableBloomFilter) Test

func (s *ScalableBloomFilter) Test(data []byte) bool

Test will test for membership of the data and returns true if it is a member, false if not. This is a probabilistic test, meaning there is a non-zero probability of false positives but a zero probability of false negatives.

func (*ScalableBloomFilter) TestAndAdd

func (s *ScalableBloomFilter) TestAndAdd(data []byte) bool

TestAndAdd is equivalent to calling Test followed by Add. It returns true if the data is a member, false if not.

func (*ScalableBloomFilter) WriteTo

func (s *ScalableBloomFilter) WriteTo(stream io.Writer) (int64, error)

WriteTo writes a binary representation of the ScalableBloomFilter to an i/o stream. It returns the number of bytes written.
