Documentation ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Children ¶
type Children struct {
	Sequential Sequential
	Parallel   Parallel
}
type Parallel ¶
type Parallel Sequential //map[string]ProcessorNode
type Pipeline ¶
type Pipeline struct {
	Name  string
	Graph TaskGraph `json:"-"`
	Data  interface{}
	Spec  PipelineSpec
	// contains filtered or unexported fields
}
Pipeline represents one Wiz Tasks Framework pipeline.
func NewPipeline ¶
func (*Pipeline) AssignRunIDs ¶
func (Pipeline) CheckValidity ¶
CheckValidity ensures that the pipeline has (1) an input, (2) input data, and (3) an output. It returns an error if any of these is missing and nil otherwise. TODO: do we even need this?
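The three checks can be sketched as follows. This is only an illustration under stated assumptions: the source does not show how CheckValidity finds inputs and outputs, so the sketch assumes they are identified by a processor type of "input" or "output". The names `pipeline`, `node`, `processor`, `hasType`, and `checkValidity` are hypothetical stand-ins, not the package's own types.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified, hypothetical stand-ins for the package types.
type processor struct {
	ID   string
	Type string // assumed values: "input", "output", or "" for a transformation
}

type node struct {
	Processor processor
	Children  []*node
}

type pipeline struct {
	Data  interface{}
	Nodes []*node
}

// hasType reports whether any node in the tree has the given processor type.
func hasType(nodes []*node, typ string) bool {
	for _, n := range nodes {
		if n.Processor.Type == typ || hasType(n.Children, typ) {
			return true
		}
	}
	return false
}

// checkValidity returns an error naming the first missing requirement,
// or nil if the pipeline has an input, input data, and an output.
func checkValidity(p pipeline) error {
	if !hasType(p.Nodes, "input") {
		return errors.New("pipeline has no input processor")
	}
	if p.Data == nil {
		return errors.New("pipeline has no input data")
	}
	if !hasType(p.Nodes, "output") {
		return errors.New("pipeline has no output processor")
	}
	return nil
}

func main() {
	p := pipeline{
		Data: []int{1, 2, 3},
		Nodes: []*node{{
			Processor: processor{ID: "reader", Type: "input"},
			Children:  []*node{{Processor: processor{ID: "writer", Type: "output"}}},
		}},
	}
	fmt.Println(checkValidity(p))
}
```

Returning on the first failed check keeps the error message specific; a real implementation might prefer to collect all problems before returning.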
func (*Pipeline) UpdateFromSpec ¶
func (p *Pipeline) UpdateFromSpec()
UpdateFromSpec recursively builds the internal graph representation of the task graph
func (*Pipeline) UpdateInitialDataFlags ¶
func (p *Pipeline) UpdateInitialDataFlags()
func (Pipeline) Walk ¶
func (p Pipeline) Walk(f func(p ProcessorNode) error) (err error)
Walk performs a breadth-first traversal of the pipeline's graph, starting at the root node and calling f on each node. It interrupts the traversal and returns any error that occurs. TODO: could just replace with IterateChildren(p.Nodes). BFS can have some specific use cases, though not really for us.
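A minimal sketch of a breadth-first walk that stops at the first error is shown below. It assumes a plain tree of nodes held in slices rather than the graph library behind TaskGraph, and the names `node`, `walk`, and `errStop` are hypothetical, so it illustrates the traversal order only, not the package's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for ProcessorNode with both kinds of children.
type node struct {
	Name       string
	Sequential []*node
	Parallel   []*node
}

// errStop is an illustrative error used to interrupt the traversal.
var errStop = errors.New("stop")

// walk visits nodes breadth-first from root, calling f on each one.
// It stops and returns the first non-nil error that f returns.
func walk(root *node, f func(n *node) error) error {
	if root == nil {
		return nil
	}
	queue := []*node{root}
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		if err := f(n); err != nil {
			return err
		}
		// Enqueue children so that a whole level is visited before the next.
		queue = append(queue, n.Sequential...)
		queue = append(queue, n.Parallel...)
	}
	return nil
}

func main() {
	root := &node{
		Name:       "root",
		Sequential: []*node{{Name: "a", Sequential: []*node{{Name: "c"}}}},
		Parallel:   []*node{{Name: "b"}},
	}
	err := walk(root, func(n *node) error {
		fmt.Println(n.Name)
		if n.Name == "b" {
			return errStop // interrupts the walk before "c" is visited
		}
		return nil
	})
	fmt.Println("error:", err)
}
```

With this tree, a callback that never fails sees the nodes in the order root, a, b, c; the callback in `main` stops at b, so c is never visited.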
type PipelineSpec ¶
type PipelineSpec Children
PipelineSpec defines how a pipeline should be structured/serialized
type Processor ¶
type Processor struct {
	ID            string // the unique name for the processor
	Version       string // the semantic version of the processor if required
	Type          string // the category of the processor: either input, output, or transformation - nil means transform
	Configuration interface{}
}
Processor represents a Wiz Tasks node which can process data. This is how it is serialized for input. It will actually be mapped to the processor specified by ID and Version and a generated RunID. TODO: maybe make separate structs for the internal Tasks framework representation and the ones that are serialized from YAML, e.g. ID in the lib and Name in YAML.
type ProcessorNode ¶
type ProcessorNode struct {
	// The public name of the node
	Name string
	// Whether the node should receive the initial data specified in the pipeline.
	// By default, this only gets enabled for the direct children of the root node.
	GetsInitialData bool
	// The assigned runID that this processor instance gets
	RunID string
	// ProcID is in .Processor.ID
	Processor Processor
	Children  Children
	// contains filtered or unexported fields
}
ProcessorNode represents a single ETL processor in the pipeline. TODO: deal with data merging.
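The default described for GetsInitialData (enabled only for the direct children of the root node) could be applied roughly as in the sketch below. This is an assumption-laden illustration: `processorNode`, `children`, and `markInitialData` are hypothetical names, and the source does not show how the package itself sets the flag.

```go
package main

import "fmt"

// processorNode is a hypothetical, simplified stand-in for ProcessorNode.
type processorNode struct {
	Name            string
	GetsInitialData bool
	Sequential      []*processorNode
	Parallel        []*processorNode
}

// children returns the sequential and parallel children in one slice.
func (n *processorNode) children() []*processorNode {
	out := make([]*processorNode, 0, len(n.Sequential)+len(n.Parallel))
	out = append(out, n.Sequential...)
	return append(out, n.Parallel...)
}

// markInitialData clears the flag on the whole tree, then enables it
// only on the direct children of root.
func markInitialData(root *processorNode) {
	var reset func(n *processorNode)
	reset = func(n *processorNode) {
		n.GetsInitialData = false
		for _, c := range n.children() {
			reset(c)
		}
	}
	reset(root)
	for _, c := range root.children() {
		c.GetsInitialData = true
	}
}

func main() {
	leaf := &processorNode{Name: "write"}
	root := &processorNode{
		Name:       "root",
		Sequential: []*processorNode{{Name: "read", Sequential: []*processorNode{leaf}}},
		Parallel:   []*processorNode{{Name: "audit"}},
	}
	markInitialData(root)
	for _, c := range root.children() {
		fmt.Println(c.Name, c.GetsInitialData)
	}
	fmt.Println(leaf.Name, leaf.GetsInitialData)
}
```

Resetting the whole tree first makes the function safe to call again after the tree changes, at the cost of one extra pass.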
func (*ProcessorNode) DOTID ¶
func (p *ProcessorNode) DOTID() string
func (*ProcessorNode) ID ¶
func (p *ProcessorNode) ID() int64
type Sequential ¶
type Sequential []*ProcessorNode
type TaskGraph ¶
type TaskGraph interface { graph.DirectedBuilder }