cockroach: github.com/cockroachdb/cockroach/pkg/col/coldata

package coldata

import "github.com/cockroachdb/cockroach/pkg/col/coldata"

Package Files

batch.go bytes.go datum_vec.go native_types.go nulls.go testutils.go vec.eg.go vec.go

Constants

const BytesInitialAllocationFactor = 64

BytesInitialAllocationFactor is an estimate of how many bytes each []byte slice takes up. It is used during the initialization of Bytes.

const FlatBytesOverhead = unsafe.Sizeof(Bytes{})

FlatBytesOverhead is the overhead of Bytes in bytes.

const MaxBatchSize = 4096

MaxBatchSize is the maximum acceptable size of batches.

Variables

var ZeroBatch = &zeroBatch{
    MemBatch: NewMemBatchWithSize(
        nil, 0, StandardColumnFactory,
    ).(*MemBatch),
}

ZeroBatch is a schema-less Batch of length 0.

func AssertEquivalentBatches

func AssertEquivalentBatches(t testingT, expected, actual Batch)

AssertEquivalentBatches is a testing function that asserts that expected and actual are equivalent.

func BatchSize

func BatchSize() int

BatchSize is the maximum number of tuples that fit in a column batch.

func BytesFromArrowSerializationFormat

func BytesFromArrowSerializationFormat(b *Bytes, data []byte, offsets []int32)

BytesFromArrowSerializationFormat takes an Arrow byte slice and accompanying offsets and populates b.

func GetValueAt

func GetValueAt(v Vec, rowIdx int) interface{}

GetValueAt is an inefficient helper to get the value in a Vec when the type is unknown.

func ResetBatchSizeForTests

func ResetBatchSizeForTests()

ResetBatchSizeForTests resets the batchSize variable to the default batch size. It should only be used in tests.

func SetBatchSizeForTests

func SetBatchSizeForTests(newBatchSize int) error

SetBatchSizeForTests modifies the batchSize variable. It should only be used in tests. Batch sizes greater than MaxBatchSize will return an error.

func SetValueAt

func SetValueAt(v Vec, elem interface{}, rowIdx int)

SetValueAt is an inefficient helper to set the value in a Vec when the type is unknown.

type Batch

type Batch interface {
    // Length returns the number of values in the columns in the batch.
    Length() int
    // SetLength sets the number of values in the columns in the batch.
    SetLength(int)
    // Width returns the number of columns in the batch.
    Width() int
    // ColVec returns the ith Vec in this batch.
    ColVec(i int) Vec
    // ColVecs returns all of the underlying Vecs in this batch.
    ColVecs() []Vec
    // Selection, if not nil, returns the selection vector on this batch: a
    // densely-packed list of the indices in each column that have not been
    // filtered out by a previous step.
    Selection() []int
    // SetSelection sets whether this batch is using its selection vector or not.
    SetSelection(bool)
    // AppendCol appends the given Vec to this batch.
    AppendCol(Vec)
    // ReplaceCol replaces the current Vec at the provided index with the
    // provided Vec. The original and the replacement vectors *must* be of the
    // same type.
    ReplaceCol(Vec, int)
    // Reset modifies the caller in-place to have the given length and columns
    // with the given types. If it's possible, Reset will reuse the existing
    // columns and allocations, invalidating existing references to the Batch or
    // its Vecs. However, Reset does _not_ zero out the column data.
    //
    // NOTE: Reset can allocate a new Batch, so when calling from the vectorized
    // engine consider either allocating a new Batch explicitly via
    // colexec.Allocator or calling ResetInternalBatch.
    Reset(typs []*types.T, length int, factory ColumnFactory)
    // ResetInternalBatch resets a batch and its underlying Vecs for reuse. It's
    // important for callers to call ResetInternalBatch if they own internal
    // batches that they reuse as not doing this could result in correctness
    // or memory blowup issues.
    ResetInternalBatch()
    // String returns a pretty representation of this batch.
    String() string
}

Batch is the type that columnar operators receive and produce. It represents a set of column vectors (partial data columns) as well as metadata about a batch, like the selection vector (which rows in the column batch are selected).
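To illustrate how a selection vector is typically consumed, here is a minimal, self-contained sketch. It does not import coldata: a plain []int64 stands in for a column Vec, and sumColumn and its parameters are hypothetical names chosen for this example.

```go
package main

import "fmt"

// sumColumn adds up the first n logical values of col. When sel is non-nil it
// is treated as a selection vector: sel[:n] lists the physical indices of the
// rows that survived earlier filtering. Hypothetical helper, not part of
// coldata.
func sumColumn(col []int64, sel []int, n int) int64 {
	var sum int64
	if sel != nil {
		for _, idx := range sel[:n] {
			sum += col[idx]
		}
		return sum
	}
	for _, v := range col[:n] {
		sum += v
	}
	return sum
}

func main() {
	col := []int64{10, 20, 30, 40}
	fmt.Println(sumColumn(col, nil, 4))         // all four rows
	fmt.Println(sumColumn(col, []int{1, 3}, 2)) // only rows 1 and 3
}
```

Note that with a selection vector the batch length counts selected rows, not physical rows, which is why n bounds sel rather than col in the first branch.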

func NewMemBatch

func NewMemBatch(typs []*types.T, factory ColumnFactory) Batch

NewMemBatch allocates a new in-memory Batch. An unsupported type will create a placeholder Vec that may not be accessed. TODO(jordan): pool these allocations.

func NewMemBatchNoCols

func NewMemBatchNoCols(typs []*types.T, size int) Batch

NewMemBatchNoCols creates a "skeleton" of a new in-memory Batch. It allocates memory for the selection vector but does *not* allocate any memory for the column vectors - those will have to be added separately.

func NewMemBatchWithSize

func NewMemBatchWithSize(typs []*types.T, size int, factory ColumnFactory) Batch

NewMemBatchWithSize allocates a new in-memory Batch with the given column size. Use for operators that have a precisely-sized output batch.

type Bools

type Bools []bool

Bools is a slice of bool.

func (Bools) Get

func (c Bools) Get(idx int) bool

Get returns the element at index idx of the vector. The element cannot be used anymore once the vector is modified. gcassert:inline

func (Bools) Len

func (c Bools) Len() int

Len returns the length of the vector.

type Bytes

type Bytes struct {
    // contains filtered or unexported fields
}

Bytes is a wrapper type for a two-dimensional byte slice ([][]byte).

func NewBytes

func NewBytes(n int) *Bytes

NewBytes returns a Bytes struct with enough capacity for n zero-length []byte values. It is legal to call Set on the returned Bytes at this point, but Get is undefined until at least one element is Set.

func (*Bytes) AppendSlice

func (b *Bytes) AppendSlice(src *Bytes, destIdx, srcStartIdx, srcEndIdx int)

AppendSlice appends the []byte values from srcStartIdx (inclusive) to srcEndIdx (exclusive) from src into the receiver starting at destIdx.

func (*Bytes) AppendVal

func (b *Bytes) AppendVal(v []byte)

AppendVal appends the given []byte value to the end of the receiver. A nil value will be "converted" into an empty byte slice.

func (*Bytes) AssertOffsetsAreNonDecreasing

func (b *Bytes) AssertOffsetsAreNonDecreasing(n int)

AssertOffsetsAreNonDecreasing asserts that all b.offsets[:n+1] are non-decreasing.

func (*Bytes) CopySlice

func (b *Bytes) CopySlice(src *Bytes, destIdx, srcStartIdx, srcEndIdx int)

CopySlice copies the []byte values from srcStartIdx (inclusive) to srcEndIdx (exclusive) from src into the receiver starting at destIdx. Similar to the copy builtin, min(dest.Len(), src.Len()) values will be copied.

Note that if the length of the receiver is greater than the length of the source, bytes will have to be physically moved. Consider the following example:

    dest Bytes: "helloworld", offsets: []int32{0, 5}, lengths: []int32{5, 5}
    src Bytes: "a", offsets: []int32{0}, lengths: []int32{1}

If we copy src into the beginning of dest, we will have to move "world" so that the result is:

    result Bytes: "aworld", offsets: []int32{0, 1}, lengths: []int32{1, 5}

Similarly, if "a" is instead "alongerstring", "world" would have to be shifted right.
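The memory movement described above can be sketched with a small stand-alone model. This is not the package's implementation: flatBytes, newFlatBytes, get and replace are hypothetical names, and the model assumes a layout with one more offset than there are elements instead of the offsets-plus-lengths layout shown in the example.

```go
package main

import "fmt"

// flatBytes is a hypothetical model of a flattened [][]byte: element i
// occupies data[offsets[i]:offsets[i+1]], so offsets has one more entry than
// there are elements.
type flatBytes struct {
	data    []byte
	offsets []int32
}

func newFlatBytes(vals ...string) *flatBytes {
	b := &flatBytes{offsets: []int32{0}}
	for _, v := range vals {
		b.data = append(b.data, v...)
		b.offsets = append(b.offsets, int32(len(b.data)))
	}
	return b
}

// get returns element i as a view into the shared buffer.
func (b *flatBytes) get(i int) []byte {
	return b.data[b.offsets[i]:b.offsets[i+1]]
}

// replace overwrites element i with v. Every later element is physically
// moved, and its offset is adjusted by the change in length.
func (b *flatBytes) replace(i int, v []byte) {
	start, end := b.offsets[i], b.offsets[i+1]
	tail := append([]byte(nil), b.data[end:]...) // copy before overwriting
	b.data = append(append(b.data[:start], v...), tail...)
	delta := int32(len(v)) - (end - start)
	for j := i + 1; j < len(b.offsets); j++ {
		b.offsets[j] += delta
	}
}

func main() {
	dest := newFlatBytes("hello", "world")
	dest.replace(0, []byte("a"))
	fmt.Printf("%q %v\n", dest.data, dest.offsets) // "aworld" [0 1 6]
	fmt.Printf("%q\n", dest.get(1))                // "world"
}
```

Replacing "hello" with a longer value such as "alongerstring" takes the same path, with a positive delta shifting "world" to the right.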

func (*Bytes) Get

func (b *Bytes) Get(i int) []byte

Get returns the ith []byte in Bytes. Note that the returned byte slice is unsafe for reuse if any write operation happens. NOTE: if the ith element was never set in any way, the behavior of Get is undefined. gcassert:inline

func (*Bytes) Len

func (b *Bytes) Len() int

Len returns how many []byte values the receiver contains.

func (*Bytes) ProportionalSize

func (b *Bytes) ProportionalSize(n int64) uintptr

ProportionalSize returns the size of the receiver in bytes that is attributed to only the first n out of Len() elements.

func (*Bytes) Reset

func (b *Bytes) Reset()

Reset resets the underlying Bytes for reuse. Note that this zeroes out the underlying bytes but doesn't change the length (see #42054 for the discussion on why simply truncating b.data and setting b.maxSetIndex to 0 is not sufficient). TODO(asubiotto): Move towards removing Set in favor of AppendVal. At that point we can reset the length to 0.

func (*Bytes) Set

func (b *Bytes) Set(i int, v []byte)

Set sets the ith []byte in Bytes. Overwriting a value that is not at the end of the Bytes is not allowed since it complicates memory movement to make/take away necessary space in the flat buffer. Note that a nil value will be "converted" into an empty byte slice.

func (*Bytes) SetLength

func (b *Bytes) SetLength(l int)

SetLength sets the length of this Bytes. Note that it will panic if there is not enough capacity.

func (*Bytes) Size

func (b *Bytes) Size() uintptr

Size returns the total size of the receiver in bytes.

func (*Bytes) String

func (b *Bytes) String() string

String is used for debugging purposes.

func (*Bytes) ToArrowSerializationFormat

func (b *Bytes) ToArrowSerializationFormat(n int) ([]byte, []int32)

ToArrowSerializationFormat returns a byte slice and offsets that are Arrow-compatible. n is the number of elements to serialize.
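For readers unfamiliar with the Arrow layout for variable-size binary data, the following stand-alone sketch shows the shape of the two returned slices: one contiguous data buffer plus n+1 int32 offsets. The helpers toArrow and fromArrow are hypothetical and operate on a plain [][]byte, not on Bytes.

```go
package main

import "fmt"

// toArrow flattens vals into one contiguous data buffer plus len(vals)+1
// offsets; value i lives at data[offsets[i]:offsets[i+1]]. Hypothetical
// helper for illustration only.
func toArrow(vals [][]byte) ([]byte, []int32) {
	offsets := make([]int32, 1, len(vals)+1)
	var data []byte
	for _, v := range vals {
		data = append(data, v...)
		offsets = append(offsets, int32(len(data)))
	}
	return data, offsets
}

// fromArrow is the inverse: it slices data back into individual values
// without copying.
func fromArrow(data []byte, offsets []int32) [][]byte {
	var vals [][]byte
	for i := 0; i+1 < len(offsets); i++ {
		vals = append(vals, data[offsets[i]:offsets[i+1]])
	}
	return vals
}

func main() {
	data, offsets := toArrow([][]byte{[]byte("ab"), nil, []byte("cde")})
	fmt.Printf("%q %v\n", data, offsets) // "abcde" [0 2 2 5]
	for _, v := range fromArrow(data, offsets) {
		fmt.Printf("%q\n", v)
	}
}
```

An empty or nil value shows up as two equal consecutive offsets, which is why the offsets only need to be non-decreasing, not strictly increasing.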

func (*Bytes) UpdateOffsetsToBeNonDecreasing

func (b *Bytes) UpdateOffsetsToBeNonDecreasing(n int)

UpdateOffsetsToBeNonDecreasing makes sure that b.offsets[:n+1] are non-decreasing, which is an invariant that we need to maintain. It must be called by the colexec.Operator that is modifying this Bytes before returning it as an output. A convenient place for this is the Batch.SetLength() method - we assume that *always*, before returning a batch, the length is set on it.
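One way such an invariant could be restored is sketched below. This is an illustration only and may differ from what the package actually does: makeNonDecreasing is a hypothetical function working on a bare []int32, under the assumption that element i spans offsets[i] to offsets[i+1].

```go
package main

import "fmt"

// makeNonDecreasing rewrites offsets[:n+1] so that no entry is smaller than
// its predecessor. Entries that were never written (and so still hold an
// older, smaller value) are raised to the previous offset, which turns the
// corresponding elements into zero-length values. Hypothetical helper.
func makeNonDecreasing(offsets []int32, n int) {
	for i := 1; i <= n; i++ {
		if offsets[i] < offsets[i-1] {
			offsets[i] = offsets[i-1]
		}
	}
}

func main() {
	// Elements 0 and 3 were set; elements 1 and 2 were skipped, so
	// offsets[2] still holds its zero value.
	offsets := []int32{0, 3, 0, 3, 7}
	makeNonDecreasing(offsets, 4)
	fmt.Println(offsets) // [0 3 3 3 7]
}
```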

func (*Bytes) Window

func (b *Bytes) Window(start, end int) *Bytes

Window creates a "window" into the receiver. It behaves similarly to Golang's slice, but the returned object is *not* allowed to be modified - it is read-only. Window is a lightweight operation that doesn't involve copying the underlying data.

type Column

type Column interface{}

Column is an interface that represents a raw array of a Go native type.

type ColumnFactory

type ColumnFactory interface {
    MakeColumn(t *types.T, n int) Column
}

ColumnFactory is an interface that can construct columns for Batches.

var StandardColumnFactory ColumnFactory = &defaultColumnFactory{}

StandardColumnFactory is a factory that produces columns of types that are explicitly supported by the vectorized engine (i.e. not datum-backed).

type CopySliceArgs

type CopySliceArgs struct {
    SliceArgs
    // SelOnDest, if true, uses the selection vector as a lens into the
    // destination as well as the source. Normally, when SelOnDest is false, the
    // selection vector is applied to the source vector, but the results are
    // copied densely into the destination vector.
    SelOnDest bool
}

CopySliceArgs represents the extension of SliceArgs that is passed in to Vec.Copy.
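The difference between the two SelOnDest modes can be sketched on plain slices. This is a simplified model, not the Vec.Copy implementation: copySelected is a hypothetical helper that ignores DestIdx and the source index range.

```go
package main

import "fmt"

// copySelected copies the elements of src named by sel into dest. With
// selOnDest false the selected values are packed densely from dest[0]; with
// selOnDest true each value lands at the same index it had in src.
// Hypothetical helper for illustration only.
func copySelected(dest, src []int64, sel []int, selOnDest bool) {
	for i, idx := range sel {
		if selOnDest {
			dest[idx] = src[idx]
		} else {
			dest[i] = src[idx]
		}
	}
}

func main() {
	src := []int64{10, 20, 30, 40}
	sel := []int{1, 3}

	dense := make([]int64, 4)
	copySelected(dense, src, sel, false)
	fmt.Println(dense) // [20 40 0 0]

	sparse := make([]int64, 4)
	copySelected(sparse, src, sel, true)
	fmt.Println(sparse) // [0 20 0 40]
}
```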

type Datum

type Datum interface{}

Datum is an abstract type for elements inside DatumVec. In reality this type should be tree.Datum; however, in order to avoid pulling the 'tree' package into the 'coldata' package, we use a runtime cast instead.

type DatumVec

type DatumVec interface {
    // Get returns the datum at index i in the vector. The datum cannot be used
    // anymore once the vector is modified.
    Get(i int) Datum
    // Set sets the datum at index i in the vector. It must check whether the
    // provided datum is compatible with the type that the DatumVec stores.
    Set(i int, v Datum)
    // Slice creates a "window" into the vector. It behaves similarly to
    // Golang's slice.
    Slice(start, end int) DatumVec
    // CopySlice copies srcStartIdx inclusive and srcEndIdx exclusive
    // tree.Datum values from src into the vector starting at destIdx.
    CopySlice(src DatumVec, destIdx, srcStartIdx, srcEndIdx int)
    // AppendSlice appends srcStartIdx inclusive and srcEndIdx exclusive
    // tree.Datum values from src into the vector starting at destIdx.
    AppendSlice(src DatumVec, destIdx, srcStartIdx, srcEndIdx int)
    // AppendVal appends the given tree.Datum value to the end of the vector.
    AppendVal(v Datum)
    // SetLength sets the length of the vector.
    SetLength(l int)
    // Len returns the length of the vector.
    Len() int
    // Cap returns the underlying capacity of the vector.
    Cap() int
    // MarshalAt returns the marshaled representation of datum at index i.
    MarshalAt(i int) ([]byte, error)
    // UnmarshalTo unmarshals the byte representation of a datum and sets it at
    // index i.
    UnmarshalTo(i int, b []byte) error
}

DatumVec is the interface for a specialized vector that operates on tree.Datums in the vectorized engine. To avoid importing the 'tree' package, the implementation of DatumVec lives in the 'coldataext' package.

type Decimals

type Decimals []apd.Decimal

Decimals is a slice of apd.Decimal.

func (Decimals) Get

func (c Decimals) Get(idx int) apd.Decimal

Get returns the element at index idx of the vector. The element cannot be used anymore once the vector is modified. gcassert:inline

func (Decimals) Len

func (c Decimals) Len() int

Len returns the length of the vector.

type Durations

type Durations []duration.Duration

Durations is a slice of duration.Duration.

func (Durations) Get

func (c Durations) Get(idx int) duration.Duration

Get returns the element at index idx of the vector. The element cannot be used anymore once the vector is modified. gcassert:inline

func (Durations) Len

func (c Durations) Len() int

Len returns the length of the vector.

type Float64s

type Float64s []float64

Float64s is a slice of float64.

func (Float64s) Get

func (c Float64s) Get(idx int) float64

Get returns the element at index idx of the vector. The element cannot be used anymore once the vector is modified. gcassert:inline

func (Float64s) Len

func (c Float64s) Len() int

Len returns the length of the vector.

type Int16s

type Int16s []int16

Int16s is a slice of int16.

func (Int16s) Get

func (c Int16s) Get(idx int) int16

Get returns the element at index idx of the vector. The element cannot be used anymore once the vector is modified. gcassert:inline

func (Int16s) Len

func (c Int16s) Len() int

Len returns the length of the vector.

type Int32s

type Int32s []int32

Int32s is a slice of int32.

func (Int32s) Get

func (c Int32s) Get(idx int) int32

Get returns the element at index idx of the vector. The element cannot be used anymore once the vector is modified. gcassert:inline

func (Int32s) Len

func (c Int32s) Len() int

Len returns the length of the vector.

type Int64s

type Int64s []int64

Int64s is a slice of int64.

func (Int64s) Get

func (c Int64s) Get(idx int) int64

Get returns the element at index idx of the vector. The element cannot be used anymore once the vector is modified. gcassert:inline

func (Int64s) Len

func (c Int64s) Len() int

Len returns the length of the vector.

type MemBatch

type MemBatch struct {
    // contains filtered or unexported fields
}

MemBatch is an in-memory implementation of Batch.

func (*MemBatch) AppendCol

func (m *MemBatch) AppendCol(col Vec)

AppendCol implements the Batch interface.

func (*MemBatch) ColVec

func (m *MemBatch) ColVec(i int) Vec

ColVec implements the Batch interface.

func (*MemBatch) ColVecs

func (m *MemBatch) ColVecs() []Vec

ColVecs implements the Batch interface.

func (*MemBatch) Length

func (m *MemBatch) Length() int

Length implements the Batch interface.

func (*MemBatch) ReplaceCol

func (m *MemBatch) ReplaceCol(col Vec, colIdx int)

ReplaceCol implements the Batch interface.

func (*MemBatch) Reset

func (m *MemBatch) Reset(typs []*types.T, length int, factory ColumnFactory)

Reset implements the Batch interface.

func (*MemBatch) ResetInternalBatch

func (m *MemBatch) ResetInternalBatch()

ResetInternalBatch implements the Batch interface.

func (*MemBatch) Selection

func (m *MemBatch) Selection() []int

Selection implements the Batch interface.

func (*MemBatch) SetLength

func (m *MemBatch) SetLength(n int)

SetLength implements the Batch interface.

func (*MemBatch) SetSelection

func (m *MemBatch) SetSelection(b bool)

SetSelection implements the Batch interface.

func (*MemBatch) String

func (m *MemBatch) String() string

String returns a pretty representation of this batch.

func (*MemBatch) Width

func (m *MemBatch) Width() int

Width implements the Batch interface.

type Nulls

type Nulls struct {
    // contains filtered or unexported fields
}

Nulls represents a list of potentially nullable values using a bitmap. It is intended to be used alongside a slice (e.g. in the Vec interface) -- if the ith bit is off, then the ith element in that slice should be treated as NULL.
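The bitmap convention described above (a cleared bit marks NULL) can be sketched as follows. The nullBitmap type and its methods are hypothetical stand-ins written for this example; they mirror the names of the Nulls methods but are not the package's implementation.

```go
package main

import "fmt"

// nullBitmap is a hypothetical stand-in for a nulls bitmap: bit i set means
// value i is valid, bit i cleared means value i is NULL.
type nullBitmap []byte

// newNullBitmap returns a bitmap covering n values, all initially valid.
func newNullBitmap(n int) nullBitmap {
	bm := make(nullBitmap, (n+7)/8)
	for i := range bm {
		bm[i] = 0xFF
	}
	return bm
}

// setNull clears bit i, marking value i as NULL.
func (bm nullBitmap) setNull(i int) {
	bm[i/8] &^= 1 << (uint(i) % 8)
}

// unsetNull sets bit i, marking value i as valid again.
func (bm nullBitmap) unsetNull(i int) {
	bm[i/8] |= 1 << (uint(i) % 8)
}

// nullAt reports whether value i is NULL.
func (bm nullBitmap) nullAt(i int) bool {
	return bm[i/8]&(1<<(uint(i)%8)) == 0
}

func main() {
	vals := []int64{7, 0, 9}
	nulls := newNullBitmap(len(vals))
	nulls.setNull(1)
	for i, v := range vals {
		if nulls.nullAt(i) {
			fmt.Println(i, "NULL")
		} else {
			fmt.Println(i, v)
		}
	}
}
```

The value stored at a NULL position (0 at index 1 here) is still physically present in the slice; it is the bitmap that tells a reader to ignore it.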

func NewNulls

func NewNulls(len int) Nulls

NewNulls returns a new nulls vector, initialized with a length.

func (*Nulls) Copy

func (n *Nulls) Copy() Nulls

Copy returns a copy of n which can be modified independently.

func (*Nulls) MaybeHasNulls

func (n *Nulls) MaybeHasNulls() bool

MaybeHasNulls returns true if the column possibly has any null values, and returns false if the column definitely has no null values.

func (*Nulls) NullAt

func (n *Nulls) NullAt(i int) bool

NullAt returns true if the ith value of the column is null.

func (*Nulls) NullBitmap

func (n *Nulls) NullBitmap() []byte

NullBitmap returns the null bitmap.

func (*Nulls) Or

func (n *Nulls) Or(n2 *Nulls) *Nulls

Or returns a new Nulls vector where NullAt(i) iff n.NullAt(i) or n2.NullAt(i).
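Under the bit convention documented for Nulls, where a cleared bit marks NULL, a logical "or" of two null sets corresponds to a bitwise AND of their bitmaps. The sketch below shows this on raw byte slices; orNulls is a hypothetical helper that assumes both bitmaps have the same length.

```go
package main

import "fmt"

// orNulls combines two validity bitmaps so that value i is NULL in the result
// if it is NULL in either input. Because a cleared bit means NULL, this is a
// bitwise AND. Hypothetical helper; assumes len(a) == len(b).
func orNulls(a, b []byte) []byte {
	out := make([]byte, len(a))
	for i := range a {
		out[i] = a[i] & b[i]
	}
	return out
}

func main() {
	a := []byte{0b11111101} // value 1 is NULL
	b := []byte{0b11110111} // value 3 is NULL
	fmt.Printf("%08b\n", orNulls(a, b)[0]) // 11110101: values 1 and 3 are NULL
}
```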

func (*Nulls) SetNull

func (n *Nulls) SetNull(i int)

SetNull sets the ith value of the column to null.

func (*Nulls) SetNullBitmap

func (n *Nulls) SetNullBitmap(bm []byte, size int)

SetNullBitmap sets the null bitmap. size corresponds to how many elements this bitmap represents. The bits past the end of this size will be set to valid.

func (*Nulls) SetNullRange

func (n *Nulls) SetNullRange(startIdx int, endIdx int)

SetNullRange sets all the values in [startIdx, endIdx) to null.

func (*Nulls) SetNulls

func (n *Nulls) SetNulls()

SetNulls sets the column to have only null values.

func (*Nulls) Slice

func (n *Nulls) Slice(start int, end int) Nulls

Slice returns a new Nulls representing a slice of the current Nulls from [start, end).

func (*Nulls) Truncate

func (n *Nulls) Truncate(start int)

Truncate sets all values with index greater than or equal to start to null.

func (*Nulls) UnsetNull

func (n *Nulls) UnsetNull(i int)

UnsetNull unsets the ith value of the column.

func (*Nulls) UnsetNullRange

func (n *Nulls) UnsetNullRange(startIdx, endIdx int)

UnsetNullRange unsets all the nulls in the range [startIdx, endIdx). After using UnsetNullRange, n might not contain any null values, but MaybeHasNulls could still return true.

func (*Nulls) UnsetNulls

func (n *Nulls) UnsetNulls()

UnsetNulls sets the column to have no null values.

func (*Nulls) UnsetNullsAfter

func (n *Nulls) UnsetNullsAfter(idx int)

UnsetNullsAfter sets all values with index greater than or equal to idx to non-null.

type SliceArgs

type SliceArgs struct {
    // Src is the data being appended.
    Src Vec
    // Sel is an optional slice specifying indices to append to the destination
    // slice. Note that Src{Start,End}Idx apply to Sel.
    Sel []int
    // DestIdx is the first index that Append will append to.
    DestIdx int
    // SrcStartIdx is the index of the first element in Src that Append will
    // append.
    SrcStartIdx int
    // SrcEndIdx is the exclusive end index of Src. i.e. the element in the index
    // before SrcEndIdx is the last element appended to the destination slice,
    // similar to Src[SrcStartIdx:SrcEndIdx].
    SrcEndIdx int
}

SliceArgs represents the arguments passed in to Vec.Append and Nulls.set.
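How these fields interact in an append can be sketched on plain slices. This is a simplified model, not the Vec.Append implementation: appendInt64s is a hypothetical helper whose parameters mirror Src, Sel, DestIdx, SrcStartIdx and SrcEndIdx. Note how the start and end indices apply to sel when one is given.

```go
package main

import "fmt"

// appendInt64s models append(dest[:destIdx], src[start:end]...). When sel is
// non-nil, start and end index into sel instead, and sel supplies the
// positions to read from src. Hypothetical helper for illustration only.
func appendInt64s(dest, src []int64, sel []int, destIdx, start, end int) []int64 {
	dest = dest[:destIdx]
	if sel == nil {
		return append(dest, src[start:end]...)
	}
	for _, idx := range sel[start:end] {
		dest = append(dest, src[idx])
	}
	return dest
}

func main() {
	src := []int64{10, 20, 30, 40}

	// Without a selection vector: keep dest[:2], then append src[1:3].
	fmt.Println(appendInt64s([]int64{1, 2, 3}, src, nil, 2, 1, 3)) // [1 2 20 30]

	// With a selection vector: keep dest[:1], then append src[3] and src[0].
	fmt.Println(appendInt64s([]int64{1, 2, 3}, src, []int{3, 0, 2}, 1, 0, 2)) // [1 40 10]
}
```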

type Times

type Times []time.Time

Times is a slice of time.Time.

func (Times) Get

func (c Times) Get(idx int) time.Time

Get returns the element at index idx of the vector. The element cannot be used anymore once the vector is modified. gcassert:inline

func (Times) Len

func (c Times) Len() int

Len returns the length of the vector.

type Vec

type Vec interface {
    // Type returns the type of data stored in this Vec. Consider whether
    // CanonicalTypeFamily() should be used instead.
    Type() *types.T
    // CanonicalTypeFamily returns the canonical type family of data stored in
    // this Vec.
    CanonicalTypeFamily() types.Family

    // Bool returns a bool slice.
    Bool() Bools
    // Int16 returns an int16 slice.
    Int16() Int16s
    // Int32 returns an int32 slice.
    Int32() Int32s
    // Int64 returns an int64 slice.
    Int64() Int64s
    // Float64 returns a float64 slice.
    Float64() Float64s
    // Bytes returns a flat Bytes representation.
    Bytes() *Bytes
    // Decimal returns an apd.Decimal slice.
    Decimal() Decimals
    // Timestamp returns a time.Time slice.
    Timestamp() Times
    // Interval returns a duration.Duration slice.
    Interval() Durations
    // Datum returns a vector of Datums.
    Datum() DatumVec

    // Col returns the raw, typeless backing storage for this Vec.
    Col() interface{}

    // SetCol sets the member column (in the case of mutable columns).
    SetCol(interface{})

    // TemplateType returns an []interface{} and is used for operator templates.
    // Do not call this from normal code - it'll always panic.
    TemplateType() []interface{}

    // Append uses SliceArgs to append elements of a source Vec into this Vec.
    // It is logically equivalent to:
    // destVec = append(destVec[:args.DestIdx], args.Src[args.SrcStartIdx:args.SrcEndIdx]...)
    // An optional Sel slice can also be provided to apply a filter on the source
    // Vec.
    // Refer to the SliceArgs comment for specifics and TestAppend for examples.
    Append(SliceArgs)

    // Copy uses CopySliceArgs to copy elements of a source Vec into this Vec. It is
    // logically equivalent to:
    // copy(destVec[args.DestIdx:], args.Src[args.SrcStartIdx:args.SrcEndIdx])
    // An optional Sel slice can also be provided to apply a filter on the source
    // Vec.
    // Refer to the CopySliceArgs comment for specifics and TestCopy for examples.
    Copy(CopySliceArgs)

    // Window returns a "window" into the Vec. A "window" is similar to Golang's
    // slice of the current Vec from [start, end), but the returned object is NOT
    // allowed to be modified (the modification might result in an undefined
    // behavior).
    Window(start int, end int) Vec

    // MaybeHasNulls returns true if the column possibly has any null values, and
    // returns false if the column definitely has no null values.
    MaybeHasNulls() bool

    // Nulls returns the nulls vector for the column.
    Nulls() *Nulls

    // SetNulls sets the nulls vector for this column.
    SetNulls(*Nulls)

    // Length returns the length of the slice that is underlying this Vec.
    Length() int

    // SetLength sets the length of the slice that is underlying this Vec. Note
    // that the length of the batch which this Vec belongs to "takes priority".
    SetLength(int)

    // Capacity returns the capacity of the Go slice that is underlying this
    // Vec. Note that if there is no "slice" (as in the case of flat bytes),
    // the "capacity" of such an object is undefined, and so is the behavior
    // of this method.
    Capacity() int
}

Vec is an interface that represents a column vector that's accessible by Go native types.

func NewMemColumn

func NewMemColumn(t *types.T, n int, factory ColumnFactory) Vec

NewMemColumn returns a new memColumn, initialized with a length using the given column factory.

Package coldata imports 13 packages and is imported by 46 packages. Updated 2020-07-10.