cockroach: github.com/cockroachdb/cockroach/pkg/sql/colcontainer Index | Files

package colcontainer

import "github.com/cockroachdb/cockroach/pkg/sql/colcontainer"

Index

Package Files

diskqueue.go partitionedqueue.go

type DiskQueueCacheMode Uses

type DiskQueueCacheMode int

DiskQueueCacheMode specifies a pattern that a DiskQueue should use regarding its cache.

const (
    // DiskQueueCacheModeDefault is the default mode for DiskQueue cache behavior.
    // The cache (DiskQueueCfg.BufferSizeBytes) will be divided as follows:
    // - 1/3 for buffered writes (before compression)
    // - 1/3 for compressed writes, this is distinct from the previous 1/3 because
    //   it is a requirement of the snappy library that the compressed memory may
    //   not overlap with the uncompressed memory. This memory is reused to read
    //   compressed bytes from disk.
    // - 1/3 for buffered reads after decompression. Kept separate from the write
    //   memory to allow for Enqueues to come in while unread batches are held in
    //   memory.
    // In this mode, Enqueues and Dequeues may happen in any order.
    DiskQueueCacheModeDefault DiskQueueCacheMode = iota
    // DiskQueueCacheModeReuseCache imposes a limitation that all Enqueues happen
    // before all Dequeues to be able to reuse more memory. In this mode the cache
    // will be divided as follows:
    // - 1/2 for buffered writes and buffered reads.
    // - 1/2 for compressed write and reads (given the limitation that this memory
    //   has to be non-overlapping.
    DiskQueueCacheModeReuseCache
    // DiskQueueCacheModeClearAndReuseCache is the same as
    // DiskQueueCacheModeReuseCache with the additional behavior that when a
    // coldata.ZeroBatch is Enqueued, the cache will be released to the GC.
    DiskQueueCacheModeClearAndReuseCache
)

type DiskQueueCfg Uses

type DiskQueueCfg struct {
    // FS is the filesystem interface to use.
    FS  fs.FS
    // Path is where the temporary directory that will contain this DiskQueue's
    // files should be created. The directory name will be a UUID.
    Path string
    // CacheMode defines the way a DiskQueue should use its cache. Refer to the
    // comment of DiskQueueCacheModes for more information.
    CacheMode DiskQueueCacheMode
    // BufferSizeBytes is the number of bytes to buffer before compressing and
    // writing to disk.
    BufferSizeBytes int
    // MaxFileSizeBytes is the maximum size an on-disk file should reach before
    // rolling over to a new one.
    MaxFileSizeBytes int

    // OnNewDiskQueueCb is an optional callback function that will be called when
    // NewDiskQueue is called.
    OnNewDiskQueueCb func()

    // TestingKnobs are used to test the queue implementation.
    TestingKnobs struct {
        // AlwaysCompress, if true, will skip a check that determines whether
        // compression is used for a given write or not given the percentage size
        // improvement. This allows us to test compression.
        AlwaysCompress bool
    }
}

DiskQueueCfg is a struct holding the configuration options for a DiskQueue.

func (*DiskQueueCfg) EnsureDefaults Uses

func (cfg *DiskQueueCfg) EnsureDefaults() error

EnsureDefaults ensures that optional fields are set to reasonable defaults. If any necessary options have been elided, an error is returned.

func (*DiskQueueCfg) SetDefaultBufferSizeBytesForCacheMode Uses

func (cfg *DiskQueueCfg) SetDefaultBufferSizeBytesForCacheMode()

SetDefaultBufferSizeBytesForCacheMode sets the default BufferSizeBytes according to the set CacheMode.

type PartitionedDiskQueue Uses

type PartitionedDiskQueue struct {
    // contains filtered or unexported fields
}

PartitionedDiskQueue is a PartitionedQueue whose partitions are on-disk.

func NewPartitionedDiskQueue Uses

func NewPartitionedDiskQueue(
    typs []*types.T,
    cfg DiskQueueCfg,
    fdSemaphore semaphore.Semaphore,
    partitionerStrategy PartitionerStrategy,
    diskAcc *mon.BoundAccount,
) *PartitionedDiskQueue

NewPartitionedDiskQueue creates a PartitionedDiskQueue whose partitions are all on-disk queues. Note that diskQueues will be lazily created when enqueueing to a new partition. Each new partition will use cfg.BufferSizeBytes, so memory usage may increase in an unbounded fashion if used unmethodically. The file descriptors are acquired through fdSemaphore. If fdSemaphore is nil, the partitioned disk queue will not Acquire or Release file descriptors. Do this if the caller knows that it will use a constant maximum number of file descriptors and wishes to acquire these up front. Note that actual file descriptors open may be less than, but never more than the number acquired through the semaphore.

func (*PartitionedDiskQueue) Close Uses

func (p *PartitionedDiskQueue) Close(ctx context.Context) error

Close closes all the PartitionedDiskQueue's partitions. If an error is encountered, the PartitionedDiskQueue will attempt to close all partitions anyway and return the last error encountered.

func (*PartitionedDiskQueue) CloseAllOpenReadFileDescriptors Uses

func (p *PartitionedDiskQueue) CloseAllOpenReadFileDescriptors() error

CloseAllOpenReadFileDescriptors closes all open read file descriptors belonging to partitions that are being Dequeued from. If Dequeue is called on a closed partition, it will be reopened.

func (*PartitionedDiskQueue) CloseAllOpenWriteFileDescriptors Uses

func (p *PartitionedDiskQueue) CloseAllOpenWriteFileDescriptors(ctx context.Context) error

CloseAllOpenWriteFileDescriptors closes all open write file descriptors belonging to partitions that are being Enqueued to. Once this method is called, existing partitions may not be enqueued to again.

func (*PartitionedDiskQueue) CloseInactiveReadPartitions Uses

func (p *PartitionedDiskQueue) CloseInactiveReadPartitions(ctx context.Context) error

CloseInactiveReadPartitions closes all partitions that were Dequeued from and either Dequeued a coldata.ZeroBatch or were closed through CloseAllOpenReadFileDescriptors. This method call Closes the underlying DiskQueue to remove its files, so a partition may never be used again.

func (*PartitionedDiskQueue) Dequeue Uses

func (p *PartitionedDiskQueue) Dequeue(
    ctx context.Context, partitionIdx int, batch coldata.Batch,
) error

Dequeue dequeues a batch from partition partitionIdx, returns a coldata.ZeroBatch if that partition does not exist. If the partition exists and a coldata.ZeroBatch is returned, that partition is closed.

func (*PartitionedDiskQueue) Enqueue Uses

func (p *PartitionedDiskQueue) Enqueue(
    ctx context.Context, partitionIdx int, batch coldata.Batch,
) error

Enqueue enqueues a batch at partition partitionIdx.

type PartitionedQueue Uses

type PartitionedQueue interface {
    // Enqueue adds the batch to the end of the partitionIdx'th partition. If a
    // partition at that index does not exist, a new one is created. Existing
    // partitions may not be Enqueued to after calling
    // CloseAllOpenWriteFileDescriptors.
    Enqueue(ctx context.Context, partitionIdx int, batch coldata.Batch) error
    // Dequeue removes and returns the batch from the front of the
    // partitionIdx'th partition. If the partition is empty, or no partition at
    // that index was Enqueued to, a zero-length batch is returned. Note that
    // it is illegal to call Enqueue on a partition after Dequeue.
    Dequeue(ctx context.Context, partitionIdx int, batch coldata.Batch) error
    // CloseAllOpenWriteFileDescriptors notifies the PartitionedQueue that it can
    // close all open write file descriptors. After this point, only new
    // partitions may be Enqueued to.
    CloseAllOpenWriteFileDescriptors(ctx context.Context) error
    // CloseAllOpenReadFileDescriptors closes the open read file descriptors
    // belonging to partitions. These partitions may still be Dequeued from,
    // although this will trigger files to be reopened.
    CloseAllOpenReadFileDescriptors() error
    // CloseInactiveReadPartitions closes all partitions that have been Dequeued
    // from and have either been temporarily closed through
    // CloseAllOpenReadFileDescriptors or have returned a coldata.ZeroBatch from
    // Dequeue. This close removes the underlying files.
    CloseInactiveReadPartitions(ctx context.Context) error
    // Close closes all partitions created.
    Close(ctx context.Context) error
}

PartitionedQueue is the abstraction for on-disk storage.

type PartitionerStrategy Uses

type PartitionerStrategy int

PartitionerStrategy describes a strategy used by the PartitionedQueue during its operation.

const (
    // PartitionerStrategyDefault is a partitioner strategy in which the
    // PartitionedQueue will keep all partitions open for writing.
    // Note that this uses up as many file descriptors as partitions.
    PartitionerStrategyDefault PartitionerStrategy = iota
    // PartitionerStrategyCloseOnNewPartition is a partitioner strategy that
    // closes an open partition for writing if a new partition is created. This
    // ensures that the total number of file descriptors remains at 1. However,
    // note that closed partitions may never be written to again, only read.
    PartitionerStrategyCloseOnNewPartition
)

type Queue Uses

type Queue interface {
    // Enqueue enqueues a coldata.Batch to this queue. A zero-length batch should
    // be enqueued when no more elements will be enqueued.
    // WARNING: Selection vectors are ignored.
    Enqueue(context.Context, coldata.Batch) error
    // Dequeue dequeues a coldata.Batch from the queue into the batch that is
    // passed in. The boolean returned specifies whether the queue was not empty
    // (i.e. whether there was a batch to Dequeue). If true is returned and the
    // batch has a length of zero, the Queue is finished and will not be Enqueued
    // to. If an error is returned, the batch and boolean returned are
    // meaningless.
    Dequeue(context.Context, coldata.Batch) (bool, error)
    // CloseRead closes the read file descriptor. If Dequeued, the file may be
    // reopened.
    CloseRead() error
    // Close closes any resources associated with the Queue.
    Close(context.Context) error
}

Queue describes a simple queue interface to which coldata.Batches can be Enqueued and Dequeued.

func NewDiskQueue Uses

func NewDiskQueue(
    ctx context.Context, typs []*types.T, cfg DiskQueueCfg, diskAcc *mon.BoundAccount,
) (Queue, error)

NewDiskQueue creates a Queue that spills to disk.

type RewindableQueue Uses

type RewindableQueue interface {
    Queue
    // Rewind resets the Queue so that it Dequeues all Enqueued batches from the
    // start.
    Rewind() error
}

RewindableQueue is a Queue that can be read from multiple times. Note that in order for this Queue to return the same data after rewinding, all Enqueueing *must* occur before any Dequeueing.

func NewRewindableDiskQueue Uses

func NewRewindableDiskQueue(
    ctx context.Context, typs []*types.T, cfg DiskQueueCfg, diskAcc *mon.BoundAccount,
) (RewindableQueue, error)

NewRewindableDiskQueue creates a RewindableQueue that spills to disk.

Package colcontainer imports 16 packages (graph) and is imported by 7 packages. Updated 2020-08-12. Refresh now. Tools for package owners.