deduplication

package
v0.1.0
Published: Jul 12, 2021 License: Apache-2.0 Imports: 1 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type AtomicKeyCheckSetter

type AtomicKeyCheckSetter interface {
	// CheckAndSetKey takes a key and checks if it exists. If it does, then this
	// method returns true. If the key is not set, it is set and this method
	// returns false.
	CheckAndSetKey(key string) (bool, error)
}

AtomicKeyCheckSetter is a type that can check whether a key has been added before and, if it has not, add it, all as a single atomic operation.
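
As an illustration of the contract, here is a minimal in-memory sketch. The syncMapSet type is hypothetical and not part of this package; it assumes that sync.Map.LoadOrStore is an acceptable way to get the check and the set in one atomic step. The interface is copied from above only so the example compiles on its own.

```go
package main

import (
	"fmt"
	"sync"
)

// AtomicKeyCheckSetter is copied from the documentation above so that the
// sketch compiles on its own.
type AtomicKeyCheckSetter interface {
	CheckAndSetKey(key string) (bool, error)
}

// syncMapSet is a hypothetical in-memory implementation. It is not part of
// this package.
type syncMapSet struct {
	seen sync.Map
}

// CheckAndSetKey relies on sync.Map.LoadOrStore, which looks up the key and
// stores it in a single atomic step. loaded is true when the key was already
// present.
func (s *syncMapSet) CheckAndSetKey(key string) (bool, error) {
	_, loaded := s.seen.LoadOrStore(key, struct{}{})
	return loaded, nil
}

func main() {
	var s AtomicKeyCheckSetter = &syncMapSet{}
	for _, key := range []string{"order:1", "order:2", "order:1"} {
		existed, err := s.CheckAndSetKey(key)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s already seen: %t\n", key, existed)
	}
}
```

The first call for a key reports false and every later call for the same key reports true, which matches the return values described in the interface comment.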

type BloomFilter

type BloomFilter struct {
	// contains filtered or unexported fields
}

BloomFilter is an implementation of KeySetter and KeyChecker that uses a Bloom filter data structure to mark keys as seen and to check whether keys have been seen before. Bloom filters can produce false positives (reporting a key as seen when it was never set), so make sure you can accept that before using this implementation for data deduplication.

This implementation is geared towards simple key-existence checks, where the key is a composite of data points about an object that must be recognized in the future.

See https://en.wikipedia.org/wiki/Bloom_filter
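
To make the false-positive caveat concrete, here is a toy Bloom filter. It is only a sketch of the general data structure: tinyBloom, its bit-slice storage, and the FNV-based double hashing are assumptions chosen for illustration and say nothing about how this package's BloomFilter or its backend work. The composite key format is also made up.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// tinyBloom is a hypothetical in-memory Bloom filter used only for
// illustration. It is not this package's BloomFilter.
type tinyBloom struct {
	bits []bool // m bit positions, all false initially
	k    uint64 // number of positions derived per key
}

func newTinyBloom(m int, k uint64) *tinyBloom {
	return &tinyBloom{bits: make([]bool, m), k: k}
}

// positions derives k bit positions from two FNV hashes of the key using
// double hashing: position i is (h1 + i*h2) mod m.
func (b *tinyBloom) positions(key string) []uint64 {
	h1 := fnv.New64a()
	h1.Write([]byte(key))
	h2 := fnv.New64()
	h2.Write([]byte(key))
	a, c := h1.Sum64(), h2.Sum64()|1 // force an odd step so it is never zero
	out := make([]uint64, b.k)
	for i := uint64(0); i < b.k; i++ {
		out[i] = (a + i*c) % uint64(len(b.bits))
	}
	return out
}

// add sets every bit position derived from the key.
func (b *tinyBloom) add(key string) {
	for _, p := range b.positions(key) {
		b.bits[p] = true
	}
}

// mightContain returns false only if the key was definitely never added.
// A true result can be a false positive caused by other keys' bits.
func (b *tinyBloom) mightContain(key string) bool {
	for _, p := range b.positions(key) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	bf := newTinyBloom(1024, 3)
	bf.add("user:42|event:signup") // composite key, format is made up
	fmt.Println(bf.mightContain("user:42|event:signup"))
	fmt.Println(bf.mightContain("user:43|event:signup"))
}
```

A key that was added is always reported as present. A key that was never added is usually reported as absent, but it can be reported as present once enough other keys have set the same bits, which is the false-positive case to plan for.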

func NewBloomFilter

func NewBloomFilter(delegate bloomFilterBackend, opts ...BloomFilterOption) *BloomFilter

NewBloomFilter takes a bloomFilterBackend and any number of BloomFilterOptions and returns a new BloomFilter.

func (*BloomFilter) CheckAndSetKey

func (bf *BloomFilter) CheckAndSetKey(key string) (bool, error)

CheckAndSetKey checks if the key exists in the set of known keys. If it does not exist, then it is added. This method returns whether or not the key already existed.

func (*BloomFilter) KeyHasBeenSeen

func (bf *BloomFilter) KeyHasBeenSeen(key string) (bool, error)

KeyHasBeenSeen checks if the key exists in the set of known keys. If you intend to set the key with SetKeyAsSeen afterwards, you probably want the atomic CheckAndSetKey instead.

func (*BloomFilter) SetKeyAsSeen

func (bf *BloomFilter) SetKeyAsSeen(key string) error

SetKeyAsSeen adds the key to the set of known keys. If you are doing an exists check before this method, you probably want to use the atomic CheckAndSetKey instead.
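
The reason to prefer the atomic call is that with a separate check and set, two goroutines can both observe "not seen" before either one sets the key. The sketch below shows the property an atomic check-and-set gives you instead: out of many concurrent callers, exactly one is told the key is new. lockedSet and countFirstSeen are hypothetical names, and the mutex-guarded map is only a stand-in for a real backend.

```go
package main

import (
	"fmt"
	"sync"
)

// lockedSet is a hypothetical stand-in for any type with an atomic
// CheckAndSetKey. It is not part of this package.
type lockedSet struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func newLockedSet() *lockedSet {
	return &lockedSet{seen: make(map[string]struct{})}
}

// CheckAndSetKey holds the lock across the lookup and the insert, so no
// other goroutine can run between the two steps.
func (s *lockedSet) CheckAndSetKey(key string) (bool, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, existed := s.seen[key]
	s.seen[key] = struct{}{}
	return existed, nil
}

// countFirstSeen calls CheckAndSetKey for the same key from n goroutines and
// returns how many of those calls reported the key as new.
func countFirstSeen(s *lockedSet, key string, n int) int {
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		firsts int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			existed, err := s.CheckAndSetKey(key)
			if err == nil && !existed {
				mu.Lock()
				firsts++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return firsts
}

func main() {
	// Fifty goroutines race on one key; only one of them sees it as new.
	fmt.Println(countFirstSeen(newLockedSet(), "invoice:7", 50)) // prints 1
}
```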

type BloomFilterOption

type BloomFilterOption interface {
	// contains filtered or unexported methods
}

BloomFilterOption takes a pointer to bloomFilterOptions and applies a value to one or more of its fields.
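
An interface with only unexported methods that mutates an options struct is commonly built as shown in the sketch below. This is an assumption about the general pattern, not a copy of this package's source: settings, Option, optionFunc, WithValidators, and newSettings are all hypothetical names.

```go
package main

import "fmt"

// settings stands in for the package's unexported bloomFilterOptions. Its
// fields are assumptions.
type settings struct {
	validators []func(key string) error
}

// Option has the same shape as BloomFilterOption: its only method is
// unexported, so code outside the defining package cannot implement it.
type Option interface {
	apply(*settings)
}

// optionFunc adapts a plain function to the Option interface.
type optionFunc func(*settings)

func (f optionFunc) apply(s *settings) { f(s) }

// WithValidators is a hypothetical option that appends validators to the
// settings.
func WithValidators(vs ...func(key string) error) Option {
	return optionFunc(func(s *settings) {
		s.validators = append(s.validators, vs...)
	})
}

// newSettings applies each option, in order, to a zero-value settings.
func newSettings(opts ...Option) *settings {
	s := &settings{}
	for _, o := range opts {
		o.apply(s)
	}
	return s
}

func main() {
	s := newSettings(WithValidators(func(key string) error { return nil }))
	fmt.Println(len(s.validators)) // prints 1
}
```

The benefit of this design is that a constructor can grow new settings over time without changing its signature, since callers only pass the options they care about.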

func WithBloomFilterKeyValidators

func WithBloomFilterKeyValidators(vs ...KeyValidatorFunc) BloomFilterOption

WithBloomFilterKeyValidators sets one or more key validators that are used to validate keys passed to BloomFilter methods.

type KeyChecker

type KeyChecker interface {
	// KeyHasBeenSeen checks if the key has been set in the implementation's
	// storage backend.
	KeyHasBeenSeen(key string) (bool, error)
}

KeyChecker is a type that can take a key and report back information about it. This is not a key/value store, but rather a way to acknowledge the existence of a key.

type KeySetter

type KeySetter interface {
	// SetKeyAsSeen marks key as seen in the implementation's storage backend.
	// It is expected that any keys set remain set indefinitely and cannot be
	// removed (unless the entire storage backend is purged).
	SetKeyAsSeen(key string) error
}

KeySetter is a type that can take a key and record its use. Keys here do not have associated values. This is not a key/value store, but rather a way to acknowledge the existence of a key. Keys are largely arbitrary strings, but should be checked against any KeyValidators set by the implementation.

type KeyValidatorFunc

type KeyValidatorFunc func(key string) error

KeyValidatorFunc takes a key and checks it against a simple validation rule.
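
A couple of validators might look like the sketch below. notEmpty, maxLength, and validateKey are hypothetical helpers written for this example, and running validators in order until the first error is an assumption about how an implementation could use them.

```go
package main

import (
	"errors"
	"fmt"
)

// KeyValidatorFunc is copied from the documentation above so that the sketch
// compiles on its own.
type KeyValidatorFunc func(key string) error

// notEmpty is a hypothetical validator that rejects the empty key.
func notEmpty(key string) error {
	if key == "" {
		return errors.New("key must not be empty")
	}
	return nil
}

// maxLength returns a hypothetical validator that rejects keys longer than
// n bytes.
func maxLength(n int) KeyValidatorFunc {
	return func(key string) error {
		if len(key) > n {
			return fmt.Errorf("key is %d bytes, limit is %d", len(key), n)
		}
		return nil
	}
}

// validateKey runs the validators in order and returns the first error, or
// nil if every validator accepts the key.
func validateKey(key string, vs ...KeyValidatorFunc) error {
	for _, v := range vs {
		if err := v(key); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	vs := []KeyValidatorFunc{notEmpty, maxLength(16)}
	fmt.Println(validateKey("user:42", vs...))
	fmt.Println(validateKey("", vs...))
}
```

Validators with this signature could then be handed to WithBloomFilterKeyValidators, which accepts any number of KeyValidatorFunc values.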
