roaring

package
v2.9.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 4, 2021 License: Apache-2.0 Imports: 16 Imported by: 0

README

The Fuzzer

For complete documentation on go-fuzz, please see: https://github.com/dvyukov/go-fuzz

The fuzzer in relation to the roaring package checks the Bitmap.UnmarshalBinary function found in roaring.go. In order to use the fuzzer, you can follow these steps:

cd $GOPATH/src/github.com/pilosa/pilosa/roaring

go-fuzz-build ./

You must now make the workdir/corpus directory. This is achieved by:

mkdir workdir/corpus

The fuzzer needs some input to start the fuzzing with. Copy some sample Pilosa fragments into the workdir/corpus folder. For example:

cp ~/.pilosa/my-index/my-field/views/standard/fragments/0 workdir/corpus

Once you have copied your sample inputs, you are ready to run the fuzzer:

go-fuzz -bin=roaring-fuzz.zip -workdir=workdir -func=FuzzBitmapUnmarshalBinary

Understanding the Fuzzer Output

The fuzzer will output something similar to the follwoing:

2015/04/25 12:39:53 workers: 8, corpus: 124 (12s ago), crashers: 37, restarts: 1/15, execs: 35342 (2941/sec), cover: 403, uptime: 12s

The most important part of the output is the crashers and cover. The crashers records how many combinations were discovered that fail and the cover tells you how much code is being accessed. For a complete explanation of the output, please see: https://github.com/dvyukov/go-fuzz.

The fuzzer will document the crashers in a folder labeled "crashers." It will record the fragment and the error that was produced in two separate files within this folder. This is the final product.

Happy Fuzzing!

Documentation

Overview

Package roaring implements roaring bitmaps with support for incremental changes.

Index

Constants

View Source
const (
	// MagicNumber is an identifier, in bytes 0-1 of the file.
	MagicNumber = uint32(12348)

	MaxContainerVal = 0xffff
)
View Source
const (
	ContainerNil    byte = iota // no container
	ContainerArray              // slice of bit position values
	ContainerBitmap             // slice of 1024 uint64s
	ContainerRun                // container of run-encoded bits
)
View Source
const ArrayMaxSize = 4096

ArrayMaxSize represents the maximum size of array containers.

Variables

View Source
var ContainerArchetypeNames = []string{
	"Empty",
	"Ary1",
	"Ary16",
	"Ary256",
	"Ary512",
	"Ary1024",
	"Ary4096",
	"RunFull",
	"RunSplit",
	"Run16",
	"Run16Small",
	"Run256",
	"Run256Small",
	"Run1024",
	"BM512",
	"BM1024",
	"BM4096",
	"BM4097",
	"BM32768",
	"BM65000",
}

ContainerArchetypeNames is the list of supported container archetypes used in some testing. This is exported for use in a benchmark-analysis tool.

View Source
var NewFileBitmap = NewBTreeBitmap

NewFileBitmap returns a Bitmap with an initial set of values, used for file storage.

Functions

func ApplyFilterToIterator added in v2.7.0

func ApplyFilterToIterator(filter BitmapFilter, iter ContainerIterator) error

ApplyFilterToIterator is a simplistic implementation that applies a bitmap filter to a ContainerIterator, returning an error if it encounters an error.

This mostly exists for testing purposes; a Tx implementation where generating containers is expensive should almost certainly implement a better way to use filters which only generates data if it needs to.

func ArrayCountRange added in v2.2.0

func ArrayCountRange(array []uint16, start, end int32) (n int32)

func AsArray added in v2.2.0

func AsArray(c *Container) []uint16

func AsBitmap added in v2.2.0

func AsBitmap(c *Container) []uint64

func BinSearchRuns added in v2.2.0

func BinSearchRuns(v uint16, a []Interval16) (int32, bool)

BinSearchRuns returns the index of the run containing v, and true, when v is contained; or the index of the next run starting after v, and false, when v is not contained.

func BitmapCountRange added in v2.2.0

func BitmapCountRange(bitmap []uint64, start, end int32) int32

func BitmapsToRoaring

func BitmapsToRoaring(bitmaps []*Bitmap) []byte

BitmapsToRoaring renders a series of non-overlapping bitmaps as a unified roaring file.

func CompareBitmapMap

func CompareBitmapMap(b *Bitmap, vals map[uint64]struct{}) (bool, error)

CompareBitmapMap checks whether a bitmap has the same values in it that a provided map[uint64]struct{} has as keys.

func CompareBitmapSlice

func CompareBitmapSlice(b *Bitmap, vals []uint64) (bool, error)

CompareBitmapSlice checks whether a bitmap has the same values in it that a provided slice does.

func ContainerType added in v2.2.0

func ContainerType(c *Container) byte

func InitContainerArchetypes added in v2.3.0

func InitContainerArchetypes() ([][]*Container, error)

InitContainerArchetypes ensures that createContainerArchetypes has been called, and returns the results of that one call.

func IntersectionAny added in v2.3.0

func IntersectionAny(a, b *Container) bool

IntersectionAny checks whether two containers have any overlap without counting past the first bit found.

func IntersectionCount added in v2.4.0

func IntersectionCount(x, y *Container) int32

func Merge added in v2.2.0

func Merge(a, b []uint16)

func NewSliceContainers added in v2.2.0

func NewSliceContainers() *sliceContainers

func RunCountRange added in v2.2.0

func RunCountRange(runs []Interval16, start, end int32) (n int32)

RunCountRange returns the ranged bit count for RLE pairs.

Types

type AdvisoryError

type AdvisoryError interface {
	error
	AdvisoryOnly()
}

AdvisoryError is used for the special case where we probably want to *report* an error reading a file, but don't want to actually count the file as not being read. For instance, a partial ops-log entry is *probably* harmless; we probably crashed while writing (?) and as such didn't report the write as successful. We hope.

type Bitmap

type Bitmap struct {
	Containers Containers
	Source     Source

	// User-defined flags.
	Flags byte

	// Writer where operations are appended to.
	OpWriter io.Writer
	// contains filtered or unexported fields
}

Bitmap represents a roaring bitmap.

func Add added in v2.4.0

func Add(x, y []*Bitmap) []*Bitmap

Add two BSI bitmaps producing a new BSI bitmap.

func InspectBinary

func InspectBinary(data []byte, mapped bool, info *BitmapInfo) (b *Bitmap, mappedAny bool, err error)

InspectBinary reads a roaring bitmap, plus a possible ops log, and reports back on the contents, including distinguishing between the original ops log and the post-ops-log contents.

func NewBTreeBitmap

func NewBTreeBitmap(a ...uint64) *Bitmap

func NewBitmap

func NewBitmap(a ...uint64) *Bitmap

NewBitmap returns a Bitmap with an initial set of values.

func NewSliceBitmap

func NewSliceBitmap(a ...uint64) *Bitmap

NewSliceBitmap makes a new bitmap, explicitly selecting the slice containers type, which performs better in cases where we expect a contiguous block of containers added in ascending order, such as when extracting a range from another bitmap.

func RoaringToBitmaps

func RoaringToBitmaps(data []byte, shardWidth uint64) ([]*Bitmap, []uint64)

RoaringToBitmaps yields a series of bitmaps with specified shard keys, based on a single roaring file, with splits at multiples of shardWidth, which should be a multiple of container size.

func (*Bitmap) Add

func (b *Bitmap) Add(a ...uint64) (changed bool, err error)

Add adds values to the bitmap. TODO(2.0) deprecate - use the more general AddN (though be aware that it modifies 'a' in place).

func (*Bitmap) AddN

func (b *Bitmap) AddN(a ...uint64) (changed int, err error)

AddN adds values to the bitmap, appending them all to the op log in a batched write. It returns the number of changed bits. The input slice may be reordered, and the set of changed bits will end up in a[:changed].

func (*Bitmap) Any

func (b *Bitmap) Any() bool

Any checks whether there are any set bits within the bitmap.

func (*Bitmap) AsContainerMatrixString added in v2.7.0

func (b *Bitmap) AsContainerMatrixString() (r string)

AsContainerMatrixString returns a string showing the matrix of rows in a shard, showing the count of hot (1) bits in each container.

func (*Bitmap) BitwiseEqual

func (b *Bitmap) BitwiseEqual(c *Bitmap) (bool, error)

BitwiseEqual is used mostly in test cases to confirm that two bitmaps came out the same. It does not expect corresponding opN, or OpWriter, but expects identical bit contents. It does not expect identical representations; a bitmap container can be identical to an array container. It returns a boolean value, and also an explanation for a false value.

func (*Bitmap) Check

func (b *Bitmap) Check() error

Check performs a consistency check on the bitmap. Returns nil if consistent.

func (*Bitmap) Clone

func (b *Bitmap) Clone() *Bitmap

Clone returns a heap allocated copy of the bitmap. Note: The OpWriter IS NOT copied to the new bitmap.

func (*Bitmap) Contains

func (b *Bitmap) Contains(v uint64) bool

Contains returns true if v is in the bitmap.

func (*Bitmap) Count

func (b *Bitmap) Count() (n uint64)

Count returns the number of bits set in the bitmap.

func (*Bitmap) CountRange

func (b *Bitmap) CountRange(start, end uint64) (n uint64)

CountRange returns the number of bits set between [start, end).

func (*Bitmap) Difference

func (b *Bitmap) Difference(other ...*Bitmap) *Bitmap

Difference returns the difference of b and other.

func (*Bitmap) DifferenceInPlace

func (b *Bitmap) DifferenceInPlace(others ...*Bitmap)

DifferenceInPlace returns the bitwise difference of b and others, modifying b in place.

func (*Bitmap) DirectAdd

func (b *Bitmap) DirectAdd(v uint64) bool

DirectAdd adds a value to the bitmap by bypassing the op log. TODO(2.0) deprecate in favor of DirectAddN.

func (*Bitmap) DirectAddN

func (b *Bitmap) DirectAddN(a ...uint64) (changed int)

DirectAddN sets multiple bits in the bitmap, returning how many changed. It modifies the slice 'a' in place such that once it's complete a[:changed] will be list of changed bits. It is more efficient than repeated calls to DirectAdd for semi-dense sorted data because it reuses the container from the previous value if the new value has the same highbits instead of looking it up each time. TODO: if Containers implementations cached the last few Container objects returned from calls like Get and GetOrCreate, this optimization would be less useful.

func (*Bitmap) DirectRemoveN

func (b *Bitmap) DirectRemoveN(a ...uint64) (changed int)

DirectRemoveN behaves analgously to DirectAddN.

func (*Bitmap) Flip

func (b *Bitmap) Flip(start, end uint64) *Bitmap

Flip performs a logical negate of the bits in the range [start,end].

func (*Bitmap) ForEach

func (b *Bitmap) ForEach(fn func(uint64) error) error

ForEach executes fn for each value in the bitmap.

func (*Bitmap) ForEachRange

func (b *Bitmap) ForEachRange(start, end uint64, fn func(uint64) error) error

ForEachRange executes fn for each value in the bitmap between [start, end).

func (*Bitmap) Freeze

func (b *Bitmap) Freeze() *Bitmap

Freeze returns a shallow copy of the bitmap. The new bitmap is a distinct bitmap, with a new Containers object, but the actual containers it holds are the same as the parent's containers, but have been frozen.

func (*Bitmap) ImportRoaringBits

func (b *Bitmap) ImportRoaringBits(data []byte, clear bool, log bool, rowSize uint64) (changed int, rowSet map[uint64]int, err error)

ImportRoaringBits sets-or-clears bits based on a provided Roaring bitmap. This should be equivalent to unmarshalling the bitmap, then executing either `b = Union(b, newB)` or `b = Difference(b, newB)`, but with lower overhead. The log parameter controls whether to write to the op log; the answer should always be yes, except if you're calling using this to apply the op log.

If rowSize is non-zero, we should return a map of rows we altered, where "rows" are sets of rowSize containers. Otherwise the map isn't used. (This allows ImportRoaring to update caches; see fragment.go.)

func (*Bitmap) ImportRoaringRawIterator added in v2.2.0

func (b *Bitmap) ImportRoaringRawIterator(itr RoaringIterator, clear bool, log bool, rowSize uint64) (changed int, rowSet map[uint64]int, err error)

func (*Bitmap) Info

func (b *Bitmap) Info(includeContainers bool) BitmapInfo

Info returns stats for the bitmap.

func (*Bitmap) Intersect

func (b *Bitmap) Intersect(other *Bitmap) *Bitmap

Intersect returns the intersection of b and other.

func (*Bitmap) IntersectInPlace

func (b *Bitmap) IntersectInPlace(others ...*Bitmap)

IntersectInPlace returns the bitwise intersection of b and others, modifying b in place.

func (*Bitmap) IntersectionCount

func (b *Bitmap) IntersectionCount(other *Bitmap) uint64

IntersectionCount returns the number of set bits that would result in an intersection between b and other. It is more efficient than actually intersecting the two and counting the result.

func (*Bitmap) Iterator

func (b *Bitmap) Iterator() *Iterator

Iterator returns a new iterator for the bitmap.

func (*Bitmap) IteratorAt

func (b *Bitmap) IteratorAt(start uint64) *Iterator

func (*Bitmap) Max

func (b *Bitmap) Max() uint64

Max returns the highest value in the bitmap. Returns zero if the bitmap is empty.

func (*Bitmap) MergeRoaringRawIteratorIntoExists added in v2.9.0

func (b *Bitmap) MergeRoaringRawIteratorIntoExists(itr RoaringIterator, rowSize uint64) error

MergeRoaringRawIteratorIntoExists is a special merge iterator that flattens the results onto a single row which is used for determining existence. All row references are removed and only the column is considered.

func (*Bitmap) Min

func (b *Bitmap) Min() (uint64, bool)

Min returns the lowest value in the bitmap. Second return value is true if containers exist in the bitmap.

func (*Bitmap) MinAt

func (b *Bitmap) MinAt(start uint64) (uint64, bool)

MinAt returns the lowest value in the bitmap at least equal to its argument. Second return value is true if containers exist in the bitmap.

func (*Bitmap) OffsetRange

func (b *Bitmap) OffsetRange(offset, start, end uint64) *Bitmap

OffsetRange returns a new bitmap with a containers offset by start. The containers themselves are shared, so they get frozen so it will be safe to interact with them.

func (*Bitmap) Ops

func (b *Bitmap) Ops() (ops int, opN int)

Ops returns the number of write ops the bitmap is aware of in its ops log, and their total bit count.

func (*Bitmap) Optimize

func (b *Bitmap) Optimize()

Optimize converts array and bitmap containers to run containers as necessary.

func (*Bitmap) PreferMapping

func (b *Bitmap) PreferMapping(preferred bool)

func (*Bitmap) Put added in v2.2.0

func (b *Bitmap) Put(key uint64, c *Container)

func (*Bitmap) RemapRoaringStorage

func (b *Bitmap) RemapRoaringStorage(data []byte) (mappedAny bool, returnErr error)

RemapRoaringStorage tries to update all containers to refer to the roaring bitmap in the provided []byte. If any containers are marked as mapped, but do not match the provided storage, they will be unmapped. The boolean return indicates whether or not any containers were mapped to the given storage.

Regardless, after this function runs, no containers have mapped storage which does not refer to data; either they got mapped to the new storage, or storage was allocated for them.

func (*Bitmap) Remove

func (b *Bitmap) Remove(a ...uint64) (changed bool, err error)

Remove removes values from the bitmap (writing to the op log if available). TODO(2.0) deprecate - use the more general RemoveN (though be aware that it modifies 'a' in place).

func (*Bitmap) RemoveN

func (b *Bitmap) RemoveN(a ...uint64) (changed int, err error)

RemoveN behaves analagously to AddN.

func (*Bitmap) SanityCheckMapping

func (b *Bitmap) SanityCheckMapping(from, to uintptr) (mappedIn int64, mappedOut int64, unmappedIn int64, errs int, err error)

SanityCheckMapping is a debugging function which checks whether containers are *correctly* recorded as mapped or unmapped.

func (*Bitmap) SetOps

func (b *Bitmap) SetOps(ops int, opN int)

SetOps lets us reset the operation count in the weird case where we know we've changed an underlying file, without actually refreshing the bitmap.

func (*Bitmap) SetSource

func (b *Bitmap) SetSource(s Source)

SetSource tells the bitmap what source to associate with new things it creates. This is possibly logically incorrect.

func (*Bitmap) Shift

func (b *Bitmap) Shift(n int) (*Bitmap, error)

Shift shifts the contents of b by 1.

NOTE: This method is unsupported. See the `Shift()` method on `Row` in `row.go`.

func (*Bitmap) Size

func (b *Bitmap) Size() int

Size returns the number of bytes required for the bitmap.

func (*Bitmap) Slice

func (b *Bitmap) Slice() []uint64

Slice returns a slice of all integers in the bitmap.

func (*Bitmap) SliceRange

func (b *Bitmap) SliceRange(start, end uint64) []uint64

SliceRange returns a slice of integers between [start, end).

func (*Bitmap) String added in v2.7.0

func (b *Bitmap) String() (r string)

func (*Bitmap) Union

func (b *Bitmap) Union(others ...*Bitmap) *Bitmap

Union returns the bitwise union of b and others as a new bitmap.

func (*Bitmap) UnionInPlace

func (b *Bitmap) UnionInPlace(others ...*Bitmap)

UnionInPlace returns the bitwise union of b and others, modifying b in place.

func (*Bitmap) UnmarshalBinary

func (b *Bitmap) UnmarshalBinary(data []byte) (err error)

UnmarshalBinary reads Pilosa's format, or upstream roaring (mostly; it may not handle some edge cases), and decodes them into the given bitmap, replacing the existing contents.

func (*Bitmap) WriteTo

func (b *Bitmap) WriteTo(w io.Writer) (n int64, err error)

WriteTo writes b to w.

func (*Bitmap) Xor

func (b *Bitmap) Xor(other *Bitmap) *Bitmap

Xor returns the bitwise exclusive or of b and other.

type BitmapBitmapFilter added in v2.7.0

type BitmapBitmapFilter struct {
	// contains filtered or unexported fields
}

BitmapBitmap filter builds a list of positions in the bitmap which match those in a provided bitmap. It is shard-agnostic; no matter what offsets the input bitmap's containers have, it matches them against corresponding keys.

func NewBitmapBitmapFilter added in v2.7.0

func NewBitmapBitmapFilter(filter *Bitmap, callback func(uint64) error) *BitmapBitmapFilter

NewBitmapBitmapFilter creates a filter which can report all the positions within a bitmap which are set, and which have positions corresponding to the specified columns. It calls the provided callback function on each value it finds, terminating early if that returns an error.

The input filter is assumed to represent one "row" of a shard's data, which is to say, a range of up to rowWidth consecutive containers starting at some multiple of rowWidth. We coerce that to the 0..rowWidth range because offset-within-row is what we care about.

func (*BitmapBitmapFilter) ConsiderData added in v2.7.0

func (b *BitmapBitmapFilter) ConsiderData(key FilterKey, data *Container) FilterResult

func (*BitmapBitmapFilter) ConsiderKey added in v2.7.0

func (b *BitmapBitmapFilter) ConsiderKey(key FilterKey, n int32) FilterResult

type BitmapColumnFilter added in v2.7.0

type BitmapColumnFilter struct {
	// contains filtered or unexported fields
}

BitmapColumnFilter is a BitmapFilter which checks for containers matching a given column within a row; thus, only the one container per row which matches the column needs to be evaluated, and it's evaluated as matching if it contains the relevant bit.

func (*BitmapColumnFilter) ConsiderData added in v2.7.0

func (f *BitmapColumnFilter) ConsiderData(key FilterKey, data *Container) FilterResult

func (*BitmapColumnFilter) ConsiderKey added in v2.7.0

func (f *BitmapColumnFilter) ConsiderKey(key FilterKey, n int32) FilterResult

type BitmapFilter added in v2.7.0

type BitmapFilter interface {
	ConsiderKey(key FilterKey, n int32) FilterResult
	ConsiderData(key FilterKey, data *Container) FilterResult
}

A BitmapFilter, given a series of key/data pairs, is considered to "match" some of those containers. Matching may be dependent on key values and cardinalities alone, or on the contents of the container.

The ConsiderData function must not retain the container, or the data from the container; if it needs access to that information later, it needs to make a copy.

Many filters are, by virtue of how they operate, able to predict their results on future keys. To accommodate this, and allow operations to avoid processing keys they don't need to process, the result of a filter operation can indicate not just whether a given key matches, but whether some upcoming keys will, or won't, match. If ConsiderKey yields a non-zero number of matches or non-matches for a given key, ConsiderData will not be called for that key.

If multiple filters are combined, they are only called if their input is needed to determine a value.

func NewBitmapColumnFilter added in v2.7.0

func NewBitmapColumnFilter(col uint64) BitmapFilter

func NewBitmapRowFilter added in v2.7.0

func NewBitmapRowFilter(callback func(uint64) error, filters ...BitmapFilter) BitmapFilter

BitmapRowLister returns a pointer to a slice which it will populate when invoked as a bitmap filter.

func NewBitmapRowFilterMultiFilter added in v2.7.0

func NewBitmapRowFilterMultiFilter(callback func(row uint64) error, filters ...BitmapFilter) BitmapFilter

BitmapRowFilterMultiFilter will call a

func NewBitmapRowsFilter added in v2.7.0

func NewBitmapRowsFilter(rows []uint64) BitmapFilter

type BitmapInfo

type BitmapInfo struct {
	OpN            int
	Ops            int
	OpDetails      []OpInfo `json:"OpDetails,omitempty"`
	BitCount       uint64
	ContainerCount int
	Containers     []ContainerInfo `json:"Containers,omitempty"`   // The containers found in the bitmap originally
	OpContainers   []ContainerInfo `json:"OpContainers,omitempty"` // The containers resulting from ops log changes.
	From, To       uintptr         // if set, indicates the address range used when unpacking
}

BitmapInfo represents a point-in-time snapshot of bitmap stats.

type BitmapIteratorFinder added in v2.2.0

type BitmapIteratorFinder struct {
	// contains filtered or unexported fields
}

func (*BitmapIteratorFinder) Close added in v2.2.0

func (bif *BitmapIteratorFinder) Close()

func (*BitmapIteratorFinder) FindIterator added in v2.2.0

func (bif *BitmapIteratorFinder) FindIterator(seek uint64) (ContainerIterator, bool)

type BitmapRangeFilter added in v2.7.0

type BitmapRangeFilter struct {
	// contains filtered or unexported fields
}

BitmapRangeFilter limits filter operations to a specified range, and performs key or data callbacks.

On seeing a key in its range: If the key callback is present, and returns true, match the key. Otherwise, if a data callback is present, request the data, and in the data handler, call the data callback, then match the single key. If neither is present, match the entire range at once.

func NewBitmapRangeFilter added in v2.7.0

func NewBitmapRangeFilter(min, max FilterKey, keyCallback func(FilterKey, int32) (bool, error), dataCallback func(FilterKey, *Container) error) *BitmapRangeFilter

func (*BitmapRangeFilter) ConsiderData added in v2.7.0

func (b *BitmapRangeFilter) ConsiderData(key FilterKey, data *Container) FilterResult

func (*BitmapRangeFilter) ConsiderKey added in v2.7.0

func (b *BitmapRangeFilter) ConsiderKey(key FilterKey, n int32) FilterResult

type BitmapRowFilterBase added in v2.7.0

type BitmapRowFilterBase struct {
	FilterResult
	// contains filtered or unexported fields
}

BitmapRowFilterBase is a generic form of a row-aware wrapper; it handles making decisions about keys once you tell it a yesKey and noKey that it should be using, and makes callbacks per row.

func NewBitmapRowFilterBase added in v2.7.0

func NewBitmapRowFilterBase(callback func(row uint64) error) *BitmapRowFilterBase

func (*BitmapRowFilterBase) ConsiderData added in v2.7.0

func (b *BitmapRowFilterBase) ConsiderData(key FilterKey, data *Container) FilterResult

This should probably never be reached?

func (*BitmapRowFilterBase) ConsiderKey added in v2.7.0

func (b *BitmapRowFilterBase) ConsiderKey(key FilterKey, n int32) FilterResult

Without a sub-filter, we always-succeed; if we get a key that isn't already answered by our YesKey/NoKey/lastRow, we will match this key, reject the rest of the row, and update our keys accordingly. We'll also hit the callback, and return an error from it if appropriate.

func (*BitmapRowFilterBase) DetermineByKey added in v2.7.0

func (b *BitmapRowFilterBase) DetermineByKey(key FilterKey) (FilterResult, bool)

DetermineByKey decides whether it can produce a meaningful FilterResult for a given key. This encapsulates all the logic for row callbacks and figuring out when to wrap a row.

func (*BitmapRowFilterBase) SetResult added in v2.7.0

func (b *BitmapRowFilterBase) SetResult(key FilterKey, result FilterResult) FilterResult

SetResult is a convenience function so that things embedding this can just call this instead of using a long series of dotted names. It returns the new result of DetermineByKey after this change.

type BitmapRowFilterMultiFilter added in v2.7.0

type BitmapRowFilterMultiFilter struct {
	BitmapRowFilterBase
	// contains filtered or unexported fields
}

BitmapRowFilterMultiFilter is a BitmapFilter which wraps other bitmap filters, calling a callback function once per row whenever it finds a container for which all the filters returned true.

func (*BitmapRowFilterMultiFilter) ConsiderData added in v2.7.0

func (b *BitmapRowFilterMultiFilter) ConsiderData(key FilterKey, data *Container) FilterResult

ConsiderData only gets called in cases where f.toDo had a list of filters for which we needed to get data to make a decision. That means that everything but the indexes in f.toDo must be a "yes" right now.

func (*BitmapRowFilterMultiFilter) ConsiderKey added in v2.7.0

func (b *BitmapRowFilterMultiFilter) ConsiderKey(key FilterKey, n int32) FilterResult

type BitmapRowFilterSingleFilter added in v2.7.0

type BitmapRowFilterSingleFilter struct {
	BitmapRowFilterBase
	// contains filtered or unexported fields
}

BitmapRowFilterSingleFilter is a row iterator with a single filter, which is simpler than one with multiple filters where it coincidentally turns out that N==1.

func NewBitmapRowFilterSingleFilter added in v2.7.0

func NewBitmapRowFilterSingleFilter(callback func(row uint64) error, filter BitmapFilter) *BitmapRowFilterSingleFilter

func (*BitmapRowFilterSingleFilter) ConsiderData added in v2.7.0

func (b *BitmapRowFilterSingleFilter) ConsiderData(key FilterKey, data *Container) FilterResult

func (*BitmapRowFilterSingleFilter) ConsiderKey added in v2.7.0

type BitmapRowLimitFilter added in v2.7.0

type BitmapRowLimitFilter struct {
	BitmapRowFilterBase
	// contains filtered or unexported fields
}

func NewBitmapRowLimitFilter added in v2.7.0

func NewBitmapRowLimitFilter(limit uint64) *BitmapRowLimitFilter

func (*BitmapRowLimitFilter) ConsiderData added in v2.7.0

func (b *BitmapRowLimitFilter) ConsiderData(key FilterKey, data *Container) FilterResult

This should probably never be reached?

func (*BitmapRowLimitFilter) ConsiderKey added in v2.7.0

func (b *BitmapRowLimitFilter) ConsiderKey(key FilterKey, n int32) FilterResult

Without a sub-filter, we always-succeed; if we get a key that isn't already answered by our YesKey/NoKey/lastRow, we will match the whole row.

type BitmapRowsFilter added in v2.7.0

type BitmapRowsFilter struct {
	// contains filtered or unexported fields
}

BitmapRowsFilter is a BitmapFilter which checks for containers that are in any of a provided list of rows. The row list should be sorted.

func (*BitmapRowsFilter) ConsiderData added in v2.7.0

func (f *BitmapRowsFilter) ConsiderData(key FilterKey, data *Container) FilterResult

func (*BitmapRowsFilter) ConsiderKey added in v2.7.0

func (f *BitmapRowsFilter) ConsiderKey(key FilterKey, n int32) FilterResult

type Container

type Container struct {
	// contains filtered or unexported fields
}

Container represents a Container for uint16 integers.

These are used for storing the low bits of numbers in larger sets of uint64. The high bits are stored in a Container's key which is tracked by a separate data structure. Integers in a Container can be encoded in one of three ways - the encoding used is usually whichever is most compact, though any Container type should be able to encode any set of integers safely. For containers with less than 4,096 values, an array is often used. Containers with long runs of integers would use run length encoding, and more random data usually uses bitmap encoding.

The Container type has somewhat magical semantics. Containers can be marked as "frozen" by the Freeze method, after which, nothing should ever modify that specific container object again, no matter what. Because of this, but also sometimes for Even More Esoteric Reasons, *no* container method should ever be assumed to be genuinely modifying the container it was called on, and *every* container method that might modify a container should return the "modified" *Container, which *may point to a different object*. The caller should always use this resulting container, and if you're storing a *Container in a data structure, you need to update the data structure's pointer too.

A nil *Container is a valid empty container.

In general, operations on containers which produce new containers *may* yield new containers, and *may* yield their operands.

The reason for all of this is to allow containers to have copy-on-write semantics, which allow us to reduce memory usage dramatically, and GC load even more dramatically.

func ConvertArrayToBitmap added in v2.2.0

func ConvertArrayToBitmap(c *Container) *Container

func ConvertRunToBitmap added in v2.2.0

func ConvertRunToBitmap(c *Container) *Container

func Difference added in v2.2.0

func Difference(a, b *Container) *Container

func Intersect added in v2.7.0

func Intersect(x, y *Container) *Container

func NewContainer

func NewContainer() *Container

NewContainer returns a new instance of container. This trivial function may later become more interesting.

func NewContainerArray

func NewContainerArray(set []uint16) *Container

NewContainerArray returns an array container using the provided set of values. It's okay if the slice is nil; that's a length of zero.

func NewContainerArrayCopy

func NewContainerArrayCopy(set []uint16) *Container

NewContainerArrayCopy returns an array container using the provided set of values. It's okay if the slice is nil; that's a length of zero. It copies the provided slice to new storage.

func NewContainerArrayN

func NewContainerArrayN(set []uint16, n int32) *Container

NewContainerArrayN returns an array container using the specified set of values, but overriding n. This is deprecated. It never worked in the first place. The provided value of n is ignored and instead derived from the set length.

func NewContainerBitmap

func NewContainerBitmap(n int, bitmap []uint64) *Container

NewContainerBitmap makes a bitmap container using the provided bitmap, or an empty one if provided bitmap is nil. If the provided bitmap is too short, it will be padded. This function's API is wrong; it should have been written as NewContainerBitmapN, and this should not take the n argument, but I did it wrong initially and now that would be a breaking change.

func NewContainerBitmapN

func NewContainerBitmapN(bitmap []uint64, n int32) *Container

NewContainerBitmapN makes a bitmap container using the provided bitmap, or an empty one if provided bitmap is nil. If the provided bitmap is too short, it will be padded. The container's count is specified directly.

func NewContainerRun

func NewContainerRun(set []Interval16) *Container

NewContainerRun creates a new run container using a provided (possibly nil) slice of intervals.

func NewContainerRunCopy

func NewContainerRunCopy(set []Interval16) *Container

NewContainerRunCopy creates a new run container using a provided (possibly nil) slice of intervals. It copies the provided slice to new storage.

func NewContainerRunN

func NewContainerRunN(set []Interval16, n int32) *Container

NewContainerRunN creates a new run array using a provided (possibly nil) slice of intervals. It overrides n using the provided value.

func Optimize added in v2.2.0

func Optimize(c *Container) *Container

Optimize yields a container with the same bits as c, but adjusted to the smallest-storage type by Roaring rules (thus, runs where that's smaller, otherwise arrays for N < 4096 and bitmaps for N >= 4096).

func RemakeContainerArray added in v2.7.0

func RemakeContainerArray(c *Container, array []uint16) *Container

func RemakeContainerBitmap added in v2.7.0

func RemakeContainerBitmap(c *Container, bitmap []uint64) *Container

func RemakeContainerRun added in v2.7.0

func RemakeContainerRun(c *Container, intervals []Interval16) *Container

func Union added in v2.2.0

func Union(a, b *Container) (c *Container)

func (*Container) Add added in v2.2.0

func (c *Container) Add(v uint16) (newC *Container, added bool)

Add yields a container identical to c, but with the given bit set; added is true if the bit wasn't previously set. It is unspecified whether the original container is modified.

func (*Container) AsBitmap

func (c *Container) AsBitmap(target []uint64) (out []uint64)

AsBitmap yields a 65k-bit bitmap, storing it in the target if a target is provided. The target should be zeroed, or this becomes an implicit union.

func (*Container) BitwiseCompare

func (c *Container) BitwiseCompare(c2 *Container) error

BitwiseCompare reports whether two containers are equal. It returns an error describing any difference it finds. This is mostly intended for use in tests that expect equality.

func (*Container) CheckN added in v2.3.0

func (c *Container) CheckN()

CheckN verifies that a container's cached count is correct, but there are two versions; this is the one which doesn't actually do anything, because the check is expensive. Which one you get is controlled by the roaringparanoia build tag.

func (*Container) Clone

func (c *Container) Clone() (out *Container)

Clone returns a copy of c.

func (*Container) Contains

func (c *Container) Contains(v uint16) bool

Contains returns true if v is in the container.

func (*Container) CountRange added in v2.2.0

func (c *Container) CountRange(start, end int32) (n int32)

func (*Container) Difference added in v2.2.0

func (c *Container) Difference(other *Container) *Container

func (*Container) Freeze

func (c *Container) Freeze() *Container

Freeze returns an unmodifiable container identical to c. This might be c, now marked unmodifiable, or might be a new container. If c is currently marked as "mapped", referring to a backing store that's not a conventional Go pointer, the storage may (or may not) be copied. Do not call Freeze on a temporarily-corrupt container, such as one returned from UnionInPlace but on which you haven't since called Repair.

func (*Container) Mapped

func (c *Container) Mapped() bool

Mapped returns the internal mapped field, which indicates whether the slice's backing store is believed to be associated with unwriteable mmapped space.

func (*Container) Max added in v2.2.0

func (c *Container) Max() uint16

func (*Container) N

func (c *Container) N() int32

N returns the 1-count of the container.

func (*Container) Remove added in v2.2.0

func (c *Container) Remove(v uint16) (c2 *Container, removed bool)

Add yields a container identical to c, but with the given bit cleared; removed is true if the bit was previously set. It is unspecified whether the original container is modified.

func (*Container) Repair

func (c *Container) Repair()

Repair repairs the cardinality of c if it has been corrupted by optimized operations.

func (*Container) SafeN added in v2.2.0

func (c *Container) SafeN() (int32, bool)

SafeN returns N, true if it can, otherwise it returns 0, false. For instance, a container subject to in-place operations can not know its current N, and it's not meaningful or safe to query it until a repair, so you can use this to get N "if it's available".

func (*Container) SetMapped added in v2.2.0

func (c *Container) SetMapped(mapped bool)

SetMapped marks a container as "mapped"; do this if you're setting a container's storage to something that it shouldn't write to, like mmapped memory.

func (*Container) Slice added in v2.7.0

func (c *Container) Slice() (r []uint16)

Slice returns an array of the values in the container as uint16. Do NOT modify the result; it could be the container's actual storage.

func (*Container) String

func (c *Container) String() string

func (*Container) Thaw

func (c *Container) Thaw() *Container

Thaw returns a modifiable container identical to c. This may be c, or it may be a new container with distinct backing store.

func (*Container) UnionInPlace added in v2.2.0

func (c *Container) UnionInPlace(other *Container) (r *Container)

UnionInPlace yields a container containing all the bits set in either c or other. It may, or may not, modify c. The resulting container's count, as returned by c.N(), may be incorrect; see (*Container).Repair(). Do not freeze a container produced by this operation before repairing it.

func (*Container) WriteTo

func (c *Container) WriteTo(w io.Writer) (n int64, err error)

WriteTo writes c to w.

type ContainerInfo

type ContainerInfo struct {
	Key     uint64  // container key
	Type    string  // container type (array, bitmap, or run)
	Flags   string  // flag state
	N       int32   // number of bits
	Alloc   int     // memory used
	Pointer uintptr // address
	Mapped  bool    // whether this container thinks it is mmapped
}

ContainerInfo represents a point-in-time snapshot of container stats.

type ContainerIterator

type ContainerIterator interface {
	Next() bool
	Value() (uint64, *Container)
	Close()
}

type Containers

type Containers interface {
	// Get returns nil if the key does not exist.
	Get(key uint64) *Container

	// Put adds the container at key.
	Put(key uint64, c *Container)

	// Remove takes the container at key out.
	Remove(key uint64)

	// GetOrCreate returns the container at key, creating a new empty container if necessary.
	GetOrCreate(key uint64) *Container

	// Clone does a deep copy of Containers, including cloning all containers contained.
	Clone() Containers

	// Freeze creates a shallow copy of Containers, freezing all the containers
	// contained. The new copy is a distinct Containers, but the individual containers
	// are shared (but marked as frozen).
	Freeze() Containers

	// Last returns the highest key and associated container.
	Last() (key uint64, c *Container)

	// Size returns the number of containers stored.
	Size() int

	// Update calls fn (existing-container, existed), and expects
	// (new-container, write). If write is true, the container is used to
	// replace the given container.
	Update(key uint64, fn func(*Container, bool) (*Container, bool))

	// UpdateEvery calls fn (existing-container, existed), and expects
	// (new-container, write). If write is true, the container is used to
	// replace the given container.
	UpdateEvery(fn func(uint64, *Container, bool) (*Container, bool))

	// Iterator returns a ContainterIterator which after a call to Next(), a call to Value() will
	// return the first container at or after key. found will be true if a
	// container is found at key.
	Iterator(key uint64) (citer ContainerIterator, found bool)

	Count() uint64

	// Reset clears the containers collection to allow for recycling during snapshot
	Reset()
	// ResetN clears the collection but hints at a needed size.
	ResetN(int)

	// Repair will repair the cardinality of any containers whose cardinality were corrupted
	// due to optimized operations.
	Repair()
}

type ErrorList

type ErrorList []error

ErrorList represents a list of errors.

func (*ErrorList) Append

func (a *ErrorList) Append(err error)

Append appends an error to the list. If err is an ErrorList then all errors are appended.

func (*ErrorList) AppendWithPrefix

func (a *ErrorList) AppendWithPrefix(err error, prefix string)

AppendWithPrefix appends an error to the list and includes a prefix.

func (ErrorList) Error

func (a ErrorList) Error() string

type FileShouldBeTruncatedError

type FileShouldBeTruncatedError interface {
	AdvisoryError
	SuggestedLength() int64
}

type FilterKey added in v2.7.0

type FilterKey uint64

func (FilterKey) Add added in v2.7.0

func (f FilterKey) Add(x uint64) FilterKey

Add adds an offset to a key.

func (FilterKey) Done added in v2.7.0

func (f FilterKey) Done() FilterResult

Done indicates that nothing can ever match.

func (FilterKey) Fail added in v2.7.0

func (f FilterKey) Fail(err error) FilterResult

Fail() reports a fatal error that should terminate processing.

func (FilterKey) Failf added in v2.7.0

func (f FilterKey) Failf(msg string, args ...interface{}) FilterResult

Failf() is just like Errorf, etc

func (FilterKey) MatchOne added in v2.7.0

func (f FilterKey) MatchOne() FilterResult

func (FilterKey) MatchOneRejectRow added in v2.7.0

func (f FilterKey) MatchOneRejectRow() FilterResult

MatchOneRejectRow indicates that this item matched but no further items in this row can match.

func (FilterKey) MatchOneUntilOffset added in v2.7.0

func (f FilterKey) MatchOneUntilOffset(offset uint64) FilterResult

MatchUntilOffset matches the current container, then skips any other containers until the given offset.

func (FilterKey) MatchOneUntilSameOffset added in v2.7.0

func (f FilterKey) MatchOneUntilSameOffset() FilterResult

Match the current container, then skip any others until the same offset is reached again.

func (FilterKey) MatchReject added in v2.7.0

func (f FilterKey) MatchReject(y, n FilterKey) FilterResult

MatchReject just sets Yes and No appropriately.

func (FilterKey) MatchRow added in v2.7.0

func (f FilterKey) MatchRow() FilterResult

MatchRow indicates that the current row matches the filter.

func (FilterKey) MatchRowAndDone added in v2.7.0

func (f FilterKey) MatchRowAndDone() FilterResult

MatchRowAndDone matches this row and nothing after that.

func (FilterKey) MatchRowUntilRow added in v2.7.0

func (f FilterKey) MatchRowUntilRow(rowID uint64) FilterResult

MatchRowUntilRow matches this row, then rejects everything else until the given row ID.

func (FilterKey) NeedData added in v2.7.0

func (f FilterKey) NeedData() FilterResult

NeedData() is only really meaningful for ConsiderKey, and indicates that a decision can't be made from the key alone.

func (FilterKey) Reject added in v2.7.0

func (f FilterKey) Reject(n uint64) FilterResult

Reject rejects N items.

func (FilterKey) RejectOne added in v2.7.0

func (f FilterKey) RejectOne() FilterResult

Reject rejects this item only.

func (FilterKey) RejectRow added in v2.7.0

func (f FilterKey) RejectRow() FilterResult

RejectRow indicates that this entire row is rejected.

func (FilterKey) RejectUntil added in v2.7.0

func (f FilterKey) RejectUntil(until FilterKey) FilterResult

RejectUntil rejects everything up to the given key.

func (FilterKey) RejectUntilOffset added in v2.7.0

func (f FilterKey) RejectUntilOffset(offset uint64) FilterResult

RejectUntilOffset rejects this container, and any others until the given in-row offset.

func (FilterKey) RejectUntilRow added in v2.7.0

func (f FilterKey) RejectUntilRow(rowID uint64) FilterResult

RejectUntilRow rejects everything until the given row ID.

func (FilterKey) Row added in v2.7.0

func (f FilterKey) Row() uint64

Row() computes the row number of a key.

func (FilterKey) Sub added in v2.7.0

func (f FilterKey) Sub(o FilterKey) uint64

Sub determines the distance from o to f.

type FilterResult added in v2.7.0

type FilterResult struct {
	YesKey FilterKey // The lowest container key this filter is known NOT to match.
	NoKey  FilterKey // The highest container key after YesKey that this filter is known to not match.
	Err    error     // An error which should terminate processing.
}

FilterResult represents the results of a BitmapFilter considering a key, or data. The values are represented as exclusive upper bounds on a series of matches followed by a series of rejections. So for instance, if called on key 23, the result {YesKey: 23, NoKey: 24} indicates that key 23 is a "no". This may seem confusing but it makes the math a lot easier to write. It can also report an error, which indicates that the entire operation should be stopped with that error.

type Interval16 added in v2.2.0

type Interval16 struct {
	Start uint16
	Last  uint16
}

func AsRuns added in v2.2.0

func AsRuns(c *Container) []Interval16

type Iterator

type Iterator struct {
	// contains filtered or unexported fields
}

Iterator represents an iterator over a Bitmap.

func NewIterator added in v2.2.0

func NewIterator(f IteratorFinder) *Iterator

NewIterator requires f as an IteratorFinder, it will crash if f is nil.

func (*Iterator) Close added in v2.2.0

func (itr *Iterator) Close()

func (*Iterator) Next

func (itr *Iterator) Next() (v uint64, eof bool)

Next returns the next value in the bitmap. Returns eof as true if there are no values left in the iterator.

func (*Iterator) Seek

func (itr *Iterator) Seek(seek uint64)

Seek moves to the first value equal to or greater than `seek`.

type IteratorFinder added in v2.2.0

type IteratorFinder interface {
	FindIterator(uint64) (ContainerIterator, bool)
	Close()
}

type OpInfo

type OpInfo struct {
	Type string
	OpN  int
	Size int
}

OpInfo is a description of an op.

type RoaringIterator added in v2.2.0

type RoaringIterator interface {
	// Len reports the number of containers total.
	Len() (count int64)
	// Next yields the information about the next container
	Next() (key uint64, cType byte, n int, length int, pointer *uint16, err error)
	// Remaining yields the bytes left over past the end of the roaring data,
	// which is typically an ops log in our case, and also its offset in case
	// we need to talk about truncation.
	Remaining() ([]byte, int64)

	// NextContainer is a helper that is used in place of Next(). It will
	// allocate a Container from the output of its internal call to Next(),
	// and return the key and container rc. If Next returns an error, then
	// NextContainer will return 0, nil.
	NextContainer() (key uint64, rc *Container)

	// Data returns the underlying data, esp for the Ops log.
	Data() []byte

	// Clone copies the iterator, preserving it at this point in the iteration.
	// It may well share much underlying data.
	Clone() RoaringIterator

	// ContainerKeys provides all the
	// container keys that the iterator will return.
	// The current implementation requires that the underlying header
	// lists the keys in ascending order.
	// If there are no keys, then an empty slice will be returned.
	ContainerKeys() (slc []uint64)

	// Skip will move the iterator forward by 1 without
	// materializing the container.
	Skip()
}

RoaringIterator represents something which can iterate through a roaring bitmap and yield information about containers, including type, size, and the location of their data structures.

func NewRoaringIterator added in v2.2.0

func NewRoaringIterator(data []byte) (RoaringIterator, error)

type Source

type Source interface {
	ID() string
	Dead() bool
}

A Source represents the source a given bitmap gets its data from, such as a memory-mapped file. When combining bitmaps, we might track them together in a single combined-source of some sort.

func MergeSources

func MergeSources(sources ...Source) Source

MergeSources combines sources. If you have two bitmaps, and you're combining them, then the combination's source is a combination of those two sources.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL