beam: Index | Files

package sdf

import ""

Package contains interfaces used specifically for splittable DoFns.

Warning: Splittable DoFns are still experimental, largely untested, and likely to have bugs.


Package Files


type RTracker Uses

type RTracker interface {
    // TryClaim attempts to claim the block of work located in the given position of the
    // restriction. This method must be called in ProcessElement to claim work before it can be
    // processed. Processing work without claiming it first can lead to incorrect output.
    // The position type is up to individual implementations, and will usually be related to the
    // kind of restriction used. For example, a simple restriction representing a numeric range
    // might use an int64. A more complex restriction, such as one representing a multidimensional
    // space, might use a more complex type.
    // If the claim is successful, the DoFn must process the entire block. If the claim is
    // unsuccessful ProcessElement method of the DoFn must return without performing
    // any additional work or emitting any outputs.
    // If the claim fails due to an error, that error is stored and will be automatically emitted
    // when the RTracker is validated, or can be manually retrieved with GetError.
    // This pseudocode example illustrates the typical usage of TryClaim:
    // 	pos = position of first block within the restriction
    // 	for TryClaim(pos) == true {
    // 		// Do all work in the claimed block and emit outputs.
    // 		pos = position of next block within the restriction
    // 	}
    // 	return
    TryClaim(pos interface{}) (ok bool)

    // GetError returns the error that made this RTracker stop executing, and returns nil if no
    // error occurred. This is the error that is emitted if automated validation fails.
    GetError() error

    // TrySplit splits the current restriction into a primary (currently executing work) and
    // residual (work to be split off) based on a fraction of work remaining. The split is performed
    // at the first valid split point located after the given fraction of remaining work.
    // For example, a fraction of 0.5 means to split at the halfway point of remaining work only. If
    // 50% of work is done and 50% remaining, then a fraction of 0.5 would split after 75% of work.
    // This method modifies the underlying restriction in the RTracker to reflect the primary. It
    // then returns a copy of the newly modified restriction as a primary, and returns a new
    // restriction for the residual. If the split would produce an empty residual (i.e. the only
    // split point is the end of the restriction), then the returned residual is nil.
    // If an error is returned, some catastrophic failure occurred and the entire bundle will fail.
    TrySplit(fraction float64) (primary, residual interface{}, err error)

    // GetProgress returns two abstract scalars representing the amount of done and remaining work.
    // These values have no specific units, but are used to estimate work in relation to each other
    // and should be self-consistent.
    GetProgress() (done float64, remaining float64)

    // IsDone returns a boolean indicating whether all blocks inside the restriction have been
    // claimed. This method is called by the SDK Harness to validate that a splittable DoFn has
    // correctly processed all work in a restriction before finishing. If this method returns false
    // then GetError is expected to return a non-nil error.
    IsDone() bool

RTracker is an interface used to interact with restrictions while processing elements in splittable DoFns (specifically, in the ProcessElement method). Each RTracker tracks the progress of a single restriction.

All RTracker methods should be thread-safe for dynamic splits to function correctly.

Package sdf is imported by 3 packages. Updated 2020-05-29. Refresh now. Tools for package owners.