package mergeplan

import ""

Package mergeplan provides a segment merge planning approach that's inspired by Lucene's and descriptions like


const MaxSegmentSizeLimit = 1<<31 - 1

MaxSegmentSizeLimit represents the maximum size of a segment. This limit comes from the hit-1 optimisation and the maximum encoding limit of uint31.


var DefaultMergePlanOptions = MergePlanOptions{
    MaxSegmentsPerTier:   10,
    MaxSegmentSize:       5000000,
    TierGrowth:           10.0,
    SegmentsPerMergeTask: 10,
    FloorSegmentSize:     2000,
    ReclaimDeletesWeight: 2.0,
}

DefaultMergePlanOptions suggests the default options.

var ErrMaxSegmentSizeTooLarge = errors.New("MaxSegmentSize exceeds the size limit")

ErrMaxSegmentSizeTooLarge is returned when MaxSegmentSize exceeds MaxSegmentSizeLimit.

func CalcBudget

func CalcBudget(totalSize int64, firstTierSize int64, o *MergePlanOptions) (
    budgetNumSegments int)

CalcBudget computes the number of segments needed to cover totalSize by climbing a logarithmically growing staircase of segment tiers.
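The staircase idea can be illustrated with a minimal sketch. The function name staircaseBudget and its flattened parameter list are hypothetical, and the loop is only one plausible reading of the description above (fill each tier with at most MaxSegmentsPerTier segments, then multiply the tier's segment size by TierGrowth); it is not presented as the package's actual implementation.

```go
package main

import (
	"fmt"
	"math"
)

// staircaseBudget is a hypothetical stand-in for a CalcBudget-style
// function. Each tier holds up to maxPerTier segments of tierSize;
// the next tier's segment size is tierSize * growth.
func staircaseBudget(totalSize, firstTierSize int64, maxPerTier int, growth float64) int {
	// Clamp inputs so the loop always makes progress (an assumption).
	tierSize := firstTierSize
	if tierSize < 1 {
		tierSize = 1
	}
	if maxPerTier < 1 {
		maxPerTier = 1
	}
	if growth < 1.0 {
		growth = 1.0
	}

	budget := 0
	for totalSize > 0 {
		inTier := float64(totalSize) / float64(tierSize)
		if inTier < float64(maxPerTier) {
			// The remaining size fits in a partially filled tier.
			budget += int(math.Ceil(inTier))
			break
		}
		// Fill this tier completely and climb to the next step.
		budget += maxPerTier
		totalSize -= int64(maxPerTier) * tierSize
		tierSize = int64(float64(tierSize) * growth)
	}
	return budget
}

func main() {
	// 10 segments of size 10, then 9 segments of size 100.
	fmt.Println(staircaseBudget(1000, 10, 10, 10.0)) // prints 19
}
```

Under these assumptions, a larger TierGrowth or MaxSegmentsPerTier changes how quickly the staircase absorbs the total size, which in turn changes the segment budget the planner works against.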

func ScoreSegments

func ScoreSegments(segments []Segment, o *MergePlanOptions) float64

ScoreSegments returns a score for the given segments. A smaller score is better.

func ToBarChart

func ToBarChart(prefix string, barMax int, segments []Segment, plan *MergePlan) string

ToBarChart returns an ASCII rendering of the segments and the plan. The barMax is the max width of the bars in the bar chart.

func ValidateMergePlannerOptions

func ValidateMergePlannerOptions(options *MergePlanOptions) error

ValidateMergePlannerOptions validates the merge planner options.
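A validation step of this kind might look like the following sketch. The lowercase names (options, validate, maxSegmentSizeLimit, errMaxSegmentSizeTooLarge) are local, hypothetical mirrors of the documented identifiers, and the single check shown is an assumption based on the documented limit and error; the real function may check more.

```go
package main

import (
	"errors"
	"fmt"
)

// Local mirror of the documented MaxSegmentSizeLimit constant.
const maxSegmentSizeLimit = 1<<31 - 1

// Local mirror of the documented ErrMaxSegmentSizeTooLarge error.
var errMaxSegmentSizeTooLarge = errors.New("MaxSegmentSize exceeds the size limit")

// options is a hypothetical, cut-down stand-in for MergePlanOptions.
type options struct {
	MaxSegmentSize int64
}

// validate rejects options whose MaxSegmentSize is above the limit.
func validate(o *options) error {
	if o.MaxSegmentSize > maxSegmentSizeLimit {
		return errMaxSegmentSizeTooLarge
	}
	return nil
}

func main() {
	fmt.Println(validate(&options{MaxSegmentSize: 5000000})) // within the limit
	fmt.Println(validate(&options{MaxSegmentSize: 1 << 31})) // one past the limit
}
```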

type MergePlan

type MergePlan struct {
    Tasks []*MergeTask
}

A MergePlan is the result of the Plan() API.

The planner doesn’t know how or whether these tasks are executed -- that’s up to a separate merge execution system, which might execute these tasks concurrently or not, and which might execute all the tasks or not.

func Plan

func Plan(segments []Segment, o *MergePlanOptions) (*MergePlan, error)

Plan() will functionally compute a merge plan. A segment will be assigned to at most a single MergeTask in the output MergePlan. A segment not assigned to any MergeTask means the segment should remain unmerged.
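The guarantee that a segment appears in at most one MergeTask can be checked mechanically. The sketch below uses local, hypothetical mirror types (segment, mergeTask, mergePlan) rather than the package's own, and the helper firstDuplicate is not part of the documented API; it only illustrates the invariant described above.

```go
package main

import "fmt"

// Local mirrors of the documented types, reduced to what the check needs.
type segment interface{ Id() uint64 }
type mergeTask struct{ Segments []segment }
type mergePlan struct{ Tasks []*mergeTask }

// seg is a trivial segment whose value is its own id.
type seg uint64

func (s seg) Id() uint64 { return uint64(s) }

// firstDuplicate reports the first segment id that is assigned to more
// than one task (or twice within one task), if any.
func firstDuplicate(p *mergePlan) (uint64, bool) {
	seen := make(map[uint64]bool)
	for _, task := range p.Tasks {
		for _, s := range task.Segments {
			if seen[s.Id()] {
				return s.Id(), true
			}
			seen[s.Id()] = true
		}
	}
	return 0, false
}

func main() {
	plan := &mergePlan{Tasks: []*mergeTask{
		{Segments: []segment{seg(1), seg(2)}},
		{Segments: []segment{seg(3), seg(4)}},
	}}
	_, dup := firstDuplicate(plan)
	fmt.Println(dup) // prints false
}
```

A merge execution system could run a check like this before scheduling tasks concurrently, since tasks that share no segments do not contend for the same input.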

type MergePlanOptions

type MergePlanOptions struct {
    // Max # segments per logarithmic tier, or max width of any
    // logarithmic “step”.  Smaller values mean more merging but fewer
    // segments.  Should be >= SegmentsPerMergeTask, else you'll have
    // too much merging.
    MaxSegmentsPerTier int

    // Max size of any segment produced after merging.  Actual
    // merging, however, may produce segment sizes different than the
    // planner’s predicted sizes.
    MaxSegmentSize int64

    // The growth factor for each tier in a staircase of idealized
    // segments computed by CalcBudget().
    TierGrowth float64

    // The number of segments in any resulting MergeTask.  e.g.,
    // len(result.Tasks[ * ].Segments) == SegmentsPerMergeTask.
    SegmentsPerMergeTask int

    // Small segments are rounded up to this size, i.e., treated as
    // equal (floor) size for consideration.  This is to prevent lots
    // of tiny segments from resulting in a long tail in the index.
    FloorSegmentSize int64

    // Controls how aggressively merges that reclaim more deletions
    // are favored.  Higher values will more aggressively target
    // merges that reclaim deletions, but be careful not to go so high
    // that way too much merging takes place; a value of 3.0 is
    // probably nearly too high.  A value of 0.0 means deletions don't
    // impact merge selection.
    ReclaimDeletesWeight float64

    // Optional, defaults to mergeplan.CalcBudget().
    CalcBudget func(totalSize int64, firstTierSize int64,
        o *MergePlanOptions) (budgetNumSegments int)

    // Optional, defaults to mergeplan.ScoreSegments().
    ScoreSegments func(segments []Segment, o *MergePlanOptions) float64

    // Optional.
    Logger func(string)
}

The MergePlanOptions is designed to be reusable between planning calls.

func (*MergePlanOptions) RaiseToFloorSegmentSize

func (o *MergePlanOptions) RaiseToFloorSegmentSize(s int64) int64

RaiseToFloorSegmentSize returns the larger of the input s and FloorSegmentSize.
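As a small illustration of the flooring behaviour, the sketch below uses a hypothetical free function raiseToFloor with the floor passed explicitly, instead of the documented method on MergePlanOptions. The floor of 2000 matches the FloorSegmentSize in DefaultMergePlanOptions.

```go
package main

import "fmt"

// raiseToFloor is a hypothetical stand-in for RaiseToFloorSegmentSize:
// it returns the larger of s and floor.
func raiseToFloor(s, floor int64) int64 {
	if s > floor {
		return s
	}
	return floor
}

func main() {
	const floor = 2000 // FloorSegmentSize from DefaultMergePlanOptions
	fmt.Println(raiseToFloor(150, floor))  // prints 2000
	fmt.Println(raiseToFloor(9000, floor)) // prints 9000
}
```

Segments of size 150 and 1900 are both treated as size 2000 here, which is how tiny segments end up being considered equal in size.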

type MergeTask

type MergeTask struct {
    Segments []Segment
}

A MergeTask represents several segments that should be merged together into a single segment.

type Segment

type Segment interface {
    // Unique id of the segment -- used for sorting.
    Id() uint64

    // Full segment size (the size before any logical deletions).
    FullSize() int64

    // Size of the live data of the segment; i.e., FullSize() minus
    // any logical deletions.
    LiveSize() int64
}

A Segment represents the information that the planner needs to calculate segment merging.
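A caller supplies its own Segment implementation. The sketch below declares a local interface named segment with the same three methods as the documented one, so that it compiles on its own; the memSegment type and its fields are hypothetical.

```go
package main

import "fmt"

// segment is a local mirror of the documented Segment interface.
type segment interface {
	Id() uint64
	FullSize() int64
	LiveSize() int64
}

// memSegment is a hypothetical implementation that tracks how much of
// its full size has been logically deleted.
type memSegment struct {
	id          uint64
	fullSize    int64
	deletedSize int64
}

func (s *memSegment) Id() uint64      { return s.id }
func (s *memSegment) FullSize() int64 { return s.fullSize }

// LiveSize is the full size minus any logical deletions.
func (s *memSegment) LiveSize() int64 { return s.fullSize - s.deletedSize }

// Compile-time check that memSegment satisfies the interface.
var _ segment = (*memSegment)(nil)

func main() {
	s := &memSegment{id: 1, fullSize: 1000, deletedSize: 250}
	fmt.Println(s.Id(), s.FullSize(), s.LiveSize()) // prints 1 1000 750
}
```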

