roaring

package
v1.4.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 8, 2020 License: Apache-2.0 Imports: 12 Imported by: 46

README

The Fuzzer

For complete documentation on go-fuzz, please see: https://github.com/dvyukov/go-fuzz

The fuzzer in relation to the roaring package checks the Bitmap.UnmarshalBinary function found in roaring.go. In order to use the fuzzer, you can follow these steps:

cd $GOPATH/src/github.com/pilosa/pilosa/roaring

go-fuzz-build ./

You must now make the workdir/corpus directory. This is achieved by:

mkdir workdir/corpus

The fuzzer needs some input to start the fuzzing with. Copy some sample Pilosa fragments into the workdir/corpus folder. For example:

cp ~/.pilosa/my-index/my-field/views/standard/fragments/0 workdir/corpus

Once you have copied your sample inputs, you are ready to run the fuzzer:

go-fuzz -bin=roaring-fuzz.zip -workdir=workdir -func=FuzzBitmapUnmarshalBinary

Understanding the Fuzzer Output

The fuzzer will output something similar to the follwoing:

2015/04/25 12:39:53 workers: 8, corpus: 124 (12s ago), crashers: 37, restarts: 1/15, execs: 35342 (2941/sec), cover: 403, uptime: 12s

The most important part of the output is the crashers and cover. The crashers records how many combinations were discovered that fail and the cover tells you how much code is being accessed. For a complete explanation of the output, please see: https://github.com/dvyukov/go-fuzz.

The fuzzer will document the crashers in a folder labeled "crashers." It will record the fragment and the error that was produced in two separate files within this folder. This is the final product.

Happy Fuzzing!

Documentation

Overview

Package roaring implements roaring bitmaps with support for incremental changes.

Index

Constants

View Source
const ArrayMaxSize = 4096

ArrayMaxSize represents the maximum size of array containers.

View Source
const (
	// MagicNumber is an identifier, in bytes 0-1 of the file.
	MagicNumber = uint32(12348)
)

Variables

View Source
var NewFileBitmap func(a ...uint64) *Bitmap = NewBTreeBitmap

NewFileBitmap returns a Bitmap with an initial set of values, used for file storage. By default, this is a copy of NewBitmap, but is replaced with B+Tree in server/enterprise.go

Functions

This section is empty.

Types

type Bitmap

type Bitmap struct {
	Containers Containers

	// User-defined flags.
	Flags byte

	// Writer where operations are appended to.
	OpWriter io.Writer
	// contains filtered or unexported fields
}

Bitmap represents a roaring bitmap.

func NewBTreeBitmap added in v1.3.0

func NewBTreeBitmap(a ...uint64) *Bitmap

func NewBitmap

func NewBitmap(a ...uint64) *Bitmap

NewBitmap returns a Bitmap with an initial set of values.

func NewSliceBitmap added in v1.4.0

func NewSliceBitmap(a ...uint64) *Bitmap

NewSliceBitmap makes a new bitmap, explicitly selecting the slice containers type, which performs better in cases where we expect a contiguous block of containers added in ascending order, such as when extracting a range from another bitmap.

func (*Bitmap) Add

func (b *Bitmap) Add(a ...uint64) (changed bool, err error)

Add adds values to the bitmap. TODO(2.0) deprecate - use the more general AddN (though be aware that it modifies 'a' in place).

func (*Bitmap) AddN added in v1.3.0

func (b *Bitmap) AddN(a ...uint64) (changed int, err error)

AddN adds values to the bitmap, appending them all to the op log in a batched write. It returns the number of changed bits.

func (*Bitmap) Any added in v1.3.0

func (b *Bitmap) Any() bool

Any returns "b.Count() > 0"... but faster than doing that.

func (*Bitmap) BitwiseEqual added in v1.4.0

func (b *Bitmap) BitwiseEqual(c *Bitmap) (bool, error)

CompareEquality is used mostly in test cases to confirm that two bitmaps came out the same. It does not expect corresponding opN, or OpWriter, but expects identical bit contents. It does not expect identical representations; a bitmap container can be identical to an array container. It returns a boolean value, and also an explanation for a false value.

func (*Bitmap) Check

func (b *Bitmap) Check() error

Check performs a consistency check on the bitmap. Returns nil if consistent.

func (*Bitmap) Clone

func (b *Bitmap) Clone() *Bitmap

Clone returns a heap allocated copy of the bitmap. Note: The OpWriter IS NOT copied to the new bitmap.

func (*Bitmap) Contains

func (b *Bitmap) Contains(v uint64) bool

Contains returns true if v is in the bitmap.

func (*Bitmap) Count

func (b *Bitmap) Count() (n uint64)

Count returns the number of bits set in the bitmap.

func (*Bitmap) CountRange

func (b *Bitmap) CountRange(start, end uint64) (n uint64)

CountRange returns the number of bits set between [start, end).

func (*Bitmap) Difference

func (b *Bitmap) Difference(other *Bitmap) *Bitmap

Difference returns the difference of b and other.

func (*Bitmap) DirectAdd added in v1.2.0

func (b *Bitmap) DirectAdd(v uint64) bool

DirectAdd adds a value to the bitmap by bypassing the op log. TODO(2.0) deprecate in favor of DirectAddN.

func (*Bitmap) DirectAddN added in v1.3.0

func (b *Bitmap) DirectAddN(a ...uint64) (changed int)

DirectAddN sets multiple bits in the bitmap, returning how many changed. It modifies the slice 'a' in place such that once it's complete a[:changed] will be list of changed bits. It is more efficient than repeated calls to DirectAdd for semi-dense sorted data because it reuses the container from the previous value if the new value has the same highbits instead of looking it up each time. TODO: if Containers implementations cached the last few Container objects returned from calls like Get and GetOrCreate, this optimization would be less useful.

func (*Bitmap) DirectRemoveN added in v1.3.0

func (b *Bitmap) DirectRemoveN(a ...uint64) (changed int)

DirectRemoveN behaves analgously to DirectAddN.

func (*Bitmap) Flip added in v0.4.0

func (b *Bitmap) Flip(start, end uint64) *Bitmap

Flip performs a logical negate of the bits in the range [start,end].

func (*Bitmap) ForEach

func (b *Bitmap) ForEach(fn func(uint64))

ForEach executes fn for each value in the bitmap.

func (*Bitmap) ForEachRange

func (b *Bitmap) ForEachRange(start, end uint64, fn func(uint64))

ForEachRange executes fn for each value in the bitmap between [start, end).

func (*Bitmap) Freeze added in v1.4.0

func (b *Bitmap) Freeze() *Bitmap

Freeze returns a shallow copy of the bitmap. The new bitmap is a distinct bitmap, with a new Containers object, but the actual containers it holds are the same as the parent's containers, but have been frozen.

func (*Bitmap) ImportRoaringBits added in v1.4.0

func (b *Bitmap) ImportRoaringBits(data []byte, clear bool, log bool, rowSize uint64) (changed int, rowSet map[uint64]int, err error)

ImportRoaringBits sets-or-clears bits based on a provided Roaring bitmap. This should be equivalent to unmarshalling the bitmap, then executing either `b = Union(b, newB)` or `b = Difference(b, newB)`, but with lower overhead. The log parameter controls whether to write to the op log; the answer should always be yes, except if you're calling using this to apply the op log.

If rowSize is non-zero, we should return a map of rows we altered, where "rows" are sets of rowSize containers. Otherwise the map isn't used. (This allows ImportRoaring to update caches; see fragment.go.)

func (*Bitmap) Info

func (b *Bitmap) Info() bitmapInfo

Info returns stats for the bitmap.

func (*Bitmap) Intersect

func (b *Bitmap) Intersect(other *Bitmap) *Bitmap

Intersect returns the intersection of b and other.

func (*Bitmap) IntersectionCount

func (b *Bitmap) IntersectionCount(other *Bitmap) uint64

IntersectionCount returns the number of set bits that would result in an intersection between b and other. It is more efficient than actually intersecting the two and counting the result.

func (*Bitmap) Iterator

func (b *Bitmap) Iterator() *Iterator

Iterator returns a new iterator for the bitmap.

func (*Bitmap) Max

func (b *Bitmap) Max() uint64

Max returns the highest value in the bitmap. Returns zero if the bitmap is empty.

func (*Bitmap) Min added in v1.4.0

func (b *Bitmap) Min() (uint64, bool)

Min returns the lowest value in the bitmap. Second return value is true if containers exist in the bitmap.

func (*Bitmap) OffsetRange

func (b *Bitmap) OffsetRange(offset, start, end uint64) *Bitmap

OffsetRange returns a new bitmap with a containers offset by start. The containers themselves are shared, so they get frozen so it will be safe to interact with them.

func (*Bitmap) Ops added in v1.4.0

func (b *Bitmap) Ops() (ops int, opN int)

Ops returns the number of write ops the bitmap is aware of in its ops log, and their total bit count.

func (*Bitmap) Optimize added in v0.6.0

func (b *Bitmap) Optimize()

Optimize converts array and bitmap containers to run containers as necessary.

func (*Bitmap) PreferMapping added in v1.4.0

func (b *Bitmap) PreferMapping(preferred bool)

func (*Bitmap) RemapRoaringStorage added in v1.4.0

func (b *Bitmap) RemapRoaringStorage(data []byte) (mappedAny bool, returnErr error)

RemapRoaringStorage tries to update all containers to refer to the roaring bitmap in the provided []byte. If any containers are marked as mapped, but do not match the provided storage, they will be unmapped. The boolean return indicates whether or not any containers were mapped to the given storage.

Regardless, after this function runs, no containers have mapped storage which does not refer to data; either they got mapped to the new storage, or storage was allocated for them.

func (*Bitmap) Remove

func (b *Bitmap) Remove(a ...uint64) (changed bool, err error)

Remove removes values from the bitmap (writing to the op log if available). TODO(2.0) deprecate - use the more general RemoveN (though be aware that it modifies 'a' in place).

func (*Bitmap) RemoveN added in v1.3.0

func (b *Bitmap) RemoveN(a ...uint64) (changed int, err error)

RemoveN behaves analagously to AddN.

func (*Bitmap) SetOps added in v1.4.0

func (b *Bitmap) SetOps(ops int, opN int)

SetOps lets us reset the operation count in the weird case where we know we've changed an underlying file, without actually refreshing the bitmap.

func (*Bitmap) Shift added in v1.3.0

func (b *Bitmap) Shift(n int) (*Bitmap, error)

Shift shifts the contents of b by 1.

func (*Bitmap) Size added in v1.3.0

func (b *Bitmap) Size() int

Size returns the number of bytes required for the bitmap.

func (*Bitmap) Slice

func (b *Bitmap) Slice() []uint64

Slice returns a slice of all integers in the bitmap.

func (*Bitmap) SliceRange

func (b *Bitmap) SliceRange(start, end uint64) []uint64

SliceRange returns a slice of integers between [start, end).

func (*Bitmap) Union

func (b *Bitmap) Union(others ...*Bitmap) *Bitmap

Union returns the bitwise union of b and others as a new bitmap.

func (*Bitmap) UnionInPlace added in v1.2.0

func (b *Bitmap) UnionInPlace(others ...*Bitmap)

UnionInPlace returns the bitwise union of b and others, modifying b in place.

func (*Bitmap) UnmarshalBinary

func (b *Bitmap) UnmarshalBinary(data []byte) error

UnmarshalBinary decodes b from a binary-encoded byte slice. data can be in either official roaring format or Pilosa's roaring format.

func (*Bitmap) WriteTo

func (b *Bitmap) WriteTo(w io.Writer) (n int64, err error)

WriteTo writes b to w.

func (*Bitmap) Xor added in v0.4.0

func (b *Bitmap) Xor(other *Bitmap) *Bitmap

Xor returns the bitwise exclusive or of b and other.

type Container added in v0.10.0

type Container struct {
	// contains filtered or unexported fields
}

Container represents a Container for uint16 integers.

These are used for storing the low bits of numbers in larger sets of uint64. The high bits are stored in a Container's key which is tracked by a separate data structure. Integers in a Container can be encoded in one of three ways - the encoding used is usually whichever is most compact, though any Container type should be able to encode any set of integers safely. For containers with less than 4,096 values, an array is often used. Containers with long runs of integers would use run length encoding, and more random data usually uses bitmap encoding.

func NewContainer added in v0.10.0

func NewContainer() *Container

NewContainer returns a new instance of container. This trivial function may later become more interesting.

func NewContainerArray added in v1.3.0

func NewContainerArray(set []uint16) *Container

NewContainerArray returns an array container using the provided set of values. It's okay if the slice is nil; that's a length of zero.

func NewContainerArrayCopy added in v1.4.0

func NewContainerArrayCopy(set []uint16) *Container

NewContainerArrayCopy returns an array container using the provided set of values. It's okay if the slice is nil; that's a length of zero. It copies the provided slice to new storage.

func NewContainerArrayN added in v1.4.0

func NewContainerArrayN(set []uint16, n int32) *Container

NewContainerArrayN returns an array container using the specified set of values, but overriding n.

func NewContainerBitmap added in v1.3.0

func NewContainerBitmap(n int, bitmap []uint64) *Container

NewContainerBitmap makes a bitmap container using the provided bitmap, or an empty one if provided bitmap is nil. If the provided bitmap is too short, it will be padded. This function's API is wrong; it should have been written as NewContainerBitmapN, and this should not take the n argument, but I did it wrong initially and now that would be a breaking change.

func NewContainerBitmapN added in v1.4.0

func NewContainerBitmapN(bitmap []uint64, n int32) *Container

NewContainerBitmapN makes a bitmap container using the provided bitmap, or an empty one if provided bitmap is nil. If the provided bitmap is too short, it will be padded. The container's count is specified directly.

func NewContainerRun added in v1.3.0

func NewContainerRun(set []interval16) *Container

NewContainerRun creates a new run container using a provided (possibly nil) slice of intervals.

func NewContainerRunCopy added in v1.4.0

func NewContainerRunCopy(set []interval16) *Container

NewContainerRunCopy creates a new run container using a provided (possibly nil) slice of intervals. It copies the provided slice to new storage.

func NewContainerRunN added in v1.4.0

func NewContainerRunN(set []interval16, n int32) *Container

NewContainerRunN creates a new run array using a provided (possibly nil) slice of intervals. It overrides n using the provided value.

func (*Container) Clone added in v0.10.0

func (c *Container) Clone() (out *Container)

Clone returns a copy of c.

func (*Container) Contains added in v1.1.0

func (c *Container) Contains(v uint16) bool

Contains returns true if v is in the container.

func (*Container) Freeze added in v1.4.0

func (c *Container) Freeze() *Container

Freeze returns an unmodifiable container identical to c. This might be c, now marked unmodifiable, or might be a new container. If c is currently marked as "mapped", referring to a backing store that's not a conventional Go pointer, the storage may be copied.

func (*Container) Mapped added in v0.10.0

func (c *Container) Mapped() bool

Mapped returns the internal mapped field, which indicates whether the slice's backing store is believed to be associated with unwriteable mmapped space.

func (*Container) N added in v1.0.0

func (c *Container) N() int32

N returns the 1-count of the container.

func (*Container) Repair added in v1.2.0

func (c *Container) Repair()

Repair repairs the cardinality of c if it has been corrupted by optimized operations.

func (*Container) String added in v1.4.0

func (c *Container) String() string

func (*Container) Thaw added in v1.4.0

func (c *Container) Thaw() *Container

Thaw returns a modifiable container identical to c. This may be c, or it may be a new container with distinct backing store.

func (*Container) Update added in v0.10.0

func (c *Container) Update(typ byte, n int32, mapped bool)

Update updates the container if possible. It is an error to call Update on a frozen container.

func (*Container) UpdateOrMake added in v1.4.0

func (c *Container) UpdateOrMake(typ byte, n int32, mapped bool) *Container

UpdateOrMake updates the container, yielding a new container if necessary.

func (*Container) WriteTo added in v0.10.0

func (c *Container) WriteTo(w io.Writer) (n int64, err error)

WriteTo writes c to w.

type ContainerIterator added in v0.10.0

type ContainerIterator interface {
	Next() bool
	Value() (uint64, *Container)
}

type Containers added in v0.10.0

type Containers interface {
	// Get returns nil if the key does not exist.
	Get(key uint64) *Container

	// Put adds the container at key.
	Put(key uint64, c *Container)

	// PutContainerValues updates an existing container at key.
	// If a container does not exist for key, a new one is allocated.
	// TODO(2.0) make n  int32
	PutContainerValues(key uint64, typ byte, n int, mapped bool)

	// Remove takes the container at key out.
	Remove(key uint64)

	// GetOrCreate returns the container at key, creating a new empty container if necessary.
	GetOrCreate(key uint64) *Container

	// Clone does a deep copy of Containers, including cloning all containers contained.
	Clone() Containers

	// Freeze creates a shallow copy of Containers, freezing all the containers
	// contained. The new copy is a distinct Containers, but the individual containers
	// are shared (but marked as frozen).
	Freeze() Containers

	// Last returns the highest key and associated container.
	Last() (key uint64, c *Container)

	// Size returns the number of containers stored.
	Size() int

	// Update calls fn (existing-container, existed), and expects
	// (new-container, write). If write is true, the container is used to
	// replace the given container.
	Update(key uint64, fn func(*Container, bool) (*Container, bool))

	// UpdateEvery calls fn (existing-container, existed), and expects
	// (new-container, write). If write is true, the container is used to
	// replace the given container.
	UpdateEvery(fn func(uint64, *Container, bool) (*Container, bool))

	// Iterator returns a Contiterator which after a call to Next(), a call to Value() will
	// return the first container at or after key. found will be true if a
	// container is found at key.
	Iterator(key uint64) (citer ContainerIterator, found bool)

	Count() uint64

	// Reset clears the containers collection to allow for recycling during snapshot
	Reset()
	// ResetN clears the collection but hints at a needed size.
	ResetN(int)

	// Repair will repair the cardinality of any containers whose cardinality were corrupted
	// due to optimized operations.
	Repair()
}

type ErrorList

type ErrorList []error

ErrorList represents a list of errors.

func (*ErrorList) Append

func (a *ErrorList) Append(err error)

Append appends an error to the list. If err is an ErrorList then all errors are appended.

func (*ErrorList) AppendWithPrefix

func (a *ErrorList) AppendWithPrefix(err error, prefix string)

AppendWithPrefix appends an error to the list and includes a prefix.

func (ErrorList) Error

func (a ErrorList) Error() string

type Iterator

type Iterator struct {
	// contains filtered or unexported fields
}

Iterator represents an iterator over a Bitmap.

func (*Iterator) Next

func (itr *Iterator) Next() (v uint64, eof bool)

Next returns the next value in the bitmap. Returns eof as true if there are no values left in the iterator.

func (*Iterator) Seek

func (itr *Iterator) Seek(seek uint64)

Seek moves to the first value equal to or greater than `seek`.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL