grapher

package
v1.0.12
Warning

This package is not in the latest version of its module.

Published: Jul 6, 2020 License: Apache-2.0 Imports: 10 Imported by: 0

Documentation

Overview

Package Grapher provides capabilities to construct directed acyclic graphs from configuration yaml files. Specifically, the graphs that Grapher can create are structured with the intent to organize workflows, or 'job chains.'

The implementation will read in a yaml file (structure defined below) and will return a graph struct. Traversing the graph struct can be done via two methods: a linked list structure and an adjacency list (also described below).

Basics

* Graph - The graph is defined as a set of Vertices and a set of directed edges between vertices. The graph has some restrictions on it in addition to those of a traditional DAG. The graph must:

  • be acyclic.
  • have only single edges between vertices.
  • contain exactly one source node and one sink node.
  • have every node reachable from the source node.
  • have the sink node reachable from every other node.

+ The user need not explicitly define a single start/end node in the config file. Grapher will insert "no-op" nodes at the start/end of sequences to enforce the single source/sink rule.
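The source/sink and reachability rules above can be checked with a short, self-contained sketch. This is not Grapher's implementation: the function names and the plain `[]string` / `map[string][]string` representation are assumptions made for illustration, and acyclicity is not checked here.

```go
package main

import "fmt"

// sourcesAndSinks returns the vertices with no in-edges (sources) and the
// vertices with no out-edges (sinks). Hypothetical helper, not part of grapher.
func sourcesAndSinks(vertices []string, edges map[string][]string) (sources, sinks []string) {
	hasIn := make(map[string]bool)
	for _, outs := range edges {
		for _, to := range outs {
			hasIn[to] = true
		}
	}
	for _, v := range vertices {
		if !hasIn[v] {
			sources = append(sources, v)
		}
		if len(edges[v]) == 0 {
			sinks = append(sinks, v)
		}
	}
	return sources, sinks
}

// reachableFrom returns the set of vertices reachable from start.
func reachableFrom(start string, edges map[string][]string) map[string]bool {
	seen := map[string]bool{start: true}
	stack := []string{start}
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		for _, next := range edges[n] {
			if !seen[next] {
				seen[next] = true
				stack = append(stack, next)
			}
		}
	}
	return seen
}

// satisfiesSourceSinkRules checks that there is exactly one source and one
// sink, that every vertex is reachable from the source, and that the sink is
// reachable from every vertex (checked by walking the reversed edges).
func satisfiesSourceSinkRules(vertices []string, edges map[string][]string) bool {
	sources, sinks := sourcesAndSinks(vertices, edges)
	if len(sources) != 1 || len(sinks) != 1 {
		return false
	}
	reversed := make(map[string][]string)
	for from, outs := range edges {
		for _, to := range outs {
			reversed[to] = append(reversed[to], from)
		}
	}
	fromSource := reachableFrom(sources[0], edges)
	toSink := reachableFrom(sinks[0], reversed)
	for _, v := range vertices {
		if !fromSource[v] || !toSink[v] {
			return false
		}
	}
	return true
}

func main() {
	// A diamond: one source (a) and one sink (d).
	diamond := map[string][]string{"a": {"b", "c"}, "b": {"d"}, "c": {"d"}}
	fmt.Println(satisfiesSourceSinkRules([]string{"a", "b", "c", "d"}, diamond))

	// Two sinks (b and c): this is the case a no-op end node would repair.
	fork := map[string][]string{"a": {"b", "c"}}
	fmt.Println(satisfiesSourceSinkRules([]string{"a", "b", "c"}, fork))
}
```

The second case in `main` shows why a single no-op node at the end of a sequence is useful: joining `b` and `c` into one extra node would restore the single-sink rule.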

* Node - A single vertex in the graph.

* Sequence - A set of Nodes that form a subset of the main Graph. Each sequence in a graph will also follow the same properties that a graph has (described above).

Config Structure

The yaml config file that encodes the graphs has "sequences" as its outermost object. "sequences" is a map of sequence name -> Sequence Spec. Each Sequence Spec has two fields: "args" and "nodes". The "args" field is made up of required, optional, and static arguments. The "nodes" field is a map of node structures.

The nodes list is a map of node name -> node spec. Each node spec is defined as:

  • category: "job"|"sequence" - this indicates whether the specified node represents a single vertex, or actually refers to another sequence defined.

  • type: this refers to the actual type of data to assign to the node, and will be stored in the Payload of the node.

  • args: refers to the arguments that must be provided to the node during creation. There is another level of argument mapping here. Each argument consists of an "expected" field, which is the name of the variable that is expected by the node creator, and a "given" field, which is the name of the variable that grapher actually gives to the job creator.

  • sets: this defines the variables that will be set on creation of a Node. This, used in conjunction with the "args" of later nodes can be used to pass data from one node to the other on creation of a graph.

  • deps: This lists the nodes that have out edges leading into the node defined by this spec.

  • each: This signifies to grapher that the node (or sequence) needs to be repeated in parallel a variable number of times. The value takes the form `foos:foo`, which can be read as "repeat for each foo in foos." The variable `foos` must be an argument given to the node.

  • retry: the number of times that a node will be retried if it fails. This field only applies to nodes with category="job" (i.e., does not apply to sequences).

  • retryWait: the time to sleep between retries of a node. This field only applies to nodes with category="job" (i.e., does not apply to sequences).
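To make the `each` field concrete, here is a minimal sketch of how a `foos:foo` value could be split and expanded into one argument set per element. It is an illustration only: `parseEach` and `expandEach` are hypothetical names, and the sketch assumes the list argument is a `[]string`, which the package does not state.

```go
package main

import (
	"fmt"
	"strings"
)

// parseEach splits a value of the form "foos:foo" into the name of the list
// argument ("foos") and the name of the per-element argument ("foo").
func parseEach(each string) (listArg, elemArg string, err error) {
	parts := strings.Split(each, ":")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("invalid each %q: want the form list:element", each)
	}
	return parts[0], parts[1], nil
}

// expandEach returns one copy of args per element of args[listArg], with
// elemArg set to that element. Assumption: the list argument is a []string.
func expandEach(each string, args map[string]interface{}) ([]map[string]interface{}, error) {
	listArg, elemArg, err := parseEach(each)
	if err != nil {
		return nil, err
	}
	items, ok := args[listArg].([]string)
	if !ok {
		return nil, fmt.Errorf("argument %q is missing or is not a []string", listArg)
	}
	expanded := make([]map[string]interface{}, 0, len(items))
	for _, item := range items {
		perItem := make(map[string]interface{}, len(args)+1)
		for k, v := range args {
			perItem[k] = v
		}
		perItem[elemArg] = item
		expanded = append(expanded, perItem)
	}
	return expanded, nil
}

func main() {
	args := map[string]interface{}{"instances": []string{"i-1", "i-2", "i-3"}}
	sets, err := expandEach("instances:instance", args)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, s := range sets {
		fmt.Println(s["instance"])
	}
}
```

Each returned map would correspond to one parallel copy of the node or sequence.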

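Also included in the config file is the user-defined no-op node ("noop-node"), which Grapher uses for the no-op nodes it inserts at the start and end of sequences.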

Example Config

The full example file can be found at grapher/test/example-requests.yaml.

sequences:
	decommission-cluster:
	  args:
	    required:
	      - name: cluster
	      - name: env
	    optional:
	      - name: something
	        default: 100
	    static:
	      - name: somethingelse
	        default: abcd
	  nodes:
	    get-instances:
	      category: job
	      type: get-cluster-instances
	      args:
	        - expected: cluster
	          given: cluster
	      sets: [instances]
	      deps: []
	      retry: 3
	      retryWait: 10
	    delete-job:
	      category: job
	      type: delete-job-1
	      args:
	        - expected: cluster
	          given: cluster
	        - expected: env
	          given: env
	        - expected: instances
	          given: instances
	      sets: []
	      deps: [pre-flight-checks]
	    pre-flight-checks:
	      category: sequence
	      type: check-instance-is-ok
	      each:
	        - instances:instance # repeat for each instance in instances
	      args:
	        - expected: instances
	          given: instances
	      deps: [get-instances]
	check-instance-is-ok:
	  args:
	    required:
	      - name: instance
	    optional:
	    static:
	  nodes:
	    check-ok:
	      category: job
	      type: check-ok-1
	      args:
	        - expected: container
	          given: instance
	      sets: [physicalhost]
	      deps: []
noop-node:
  category: job
  type: no-op    # this is left up to the user to define in their jobs repo

Constants

const DEFAULT = "default"

Types

type ACL

type ACL struct {
	Role  string   `yaml:"role"`  // user-defined role
	Admin bool     `yaml:"admin"` // all ops allowed if true
	Ops   []string `yaml:"ops"`   // proto.REQUEST_OP_*
}

ACL represents one role-based ACL entry. Every auth.Caller (from the user-provided auth plugin Authenticate method) is authorized with a matching ACL, else the request is denied with HTTP 401 unauthorized. Roles are user-defined. If Admin is true, Ops cannot be set.

type ArgSpec

type ArgSpec struct {
	Name    string  `yaml:"name"`
	Desc    string  `yaml:"desc"`
	Default *string `yaml:"default"`
}

ArgSpec defines the structure expected from the config to define sequence args.

type Config

type Config struct {
	Sequences map[string]*SequenceSpec `yaml:"sequences"`
}

All Sequences in the yaml. Also contains the user defined no-op job.

func ReadConfig

func ReadConfig(configFile string) (Config, error)

ReadConfig will read from configFile and return a Config that the user can then use for NewGrapher(). configFile is expected to be in the yaml format specified.

type Graph

type Graph struct {
	Name     string              // Name of the Graph
	First    *Node               // The source node of the graph
	Last     *Node               // The sink node of the graph
	Vertices map[string]*Node    // All vertices in the graph (node id -> node)
	Edges    map[string][]string // All edges (source node id -> sink node id)
}

Graph represents a graph. It represents a graph via Vertices, a map of vertex name -> Node, and Edges, an adjacency list. Also contained in Graph are the First and Last Nodes in the graph.

func (*Graph) AdjacencyListMatchesLL

func (g *Graph) AdjacencyListMatchesLL() bool

Checks that the adjacency list (given by g.Vertices and g.Edges) matches the linked list structure provided through node.Next and node.Prev.
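As an illustration of what such a consistency check involves, the sketch below compares an adjacency list against next/prev links. It uses a small local `node` type rather than the package's Node, and `edgesMatchLinks` is a hypothetical name; the actual method may check more or less than this.

```go
package main

import "fmt"

// node is a stand-in for a graph vertex with linked-list style pointers.
type node struct {
	name string
	next map[string]*node // out edges (node id -> node)
	prev map[string]*node // in edges (node id -> node)
}

func newNode(name string) *node {
	return &node{name: name, next: map[string]*node{}, prev: map[string]*node{}}
}

// link adds the edge from -> to in both directions of the linked structure.
func link(from, to *node) {
	from.next[to.name] = to
	to.prev[from.name] = from
}

// edgesMatchLinks reports whether edges lists exactly the out edges recorded
// in each node's next map, and whether every next link has a mirrored prev
// link (and vice versa).
func edgesMatchLinks(vertices map[string]*node, edges map[string][]string) bool {
	for from := range edges {
		if _, ok := vertices[from]; !ok {
			return false // edge list mentions an unknown vertex
		}
	}
	for id, n := range vertices {
		outs := edges[id]
		if len(outs) != len(n.next) {
			return false
		}
		seen := make(map[string]bool, len(outs))
		for _, to := range outs {
			target, ok := n.next[to]
			if !ok || seen[to] || target.prev[id] != n {
				return false
			}
			seen[to] = true
		}
		for _, p := range n.prev {
			if p.next[id] != n {
				return false // prev link without a matching next link
			}
		}
	}
	return true
}

func main() {
	a, b := newNode("a"), newNode("b")
	link(a, b)
	vertices := map[string]*node{"a": a, "b": b}
	fmt.Println(edgesMatchLinks(vertices, map[string][]string{"a": {"b"}}))
	fmt.Println(edgesMatchLinks(vertices, map[string][]string{"a": {"b"}, "b": {"a"}}))
}
```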

func (*Graph) HasCycles

func (g *Graph) HasCycles() bool

HasCycles returns true iff the graph has at least one cycle in it.
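One common way to detect a cycle in an adjacency list is a depth-first search that tracks which nodes are on the current path. The sketch below shows that technique on a plain `map[string][]string`; it is not taken from the package, and `hasCycle` is a hypothetical name.

```go
package main

import "fmt"

// hasCycle reports whether the directed graph described by edges contains at
// least one cycle, using a three-state depth-first search.
func hasCycle(edges map[string][]string) bool {
	const (
		unvisited  = iota // zero value: node not seen yet
		inProgress        // node is on the current DFS path
		done              // node and all its descendants are fully explored
	)
	state := make(map[string]int)

	var visit func(n string) bool
	visit = func(n string) bool {
		switch state[n] {
		case inProgress:
			return true // reached a node already on the current path
		case done:
			return false
		}
		state[n] = inProgress
		for _, next := range edges[n] {
			if visit(next) {
				return true
			}
		}
		state[n] = done
		return false
	}

	for n := range edges {
		if visit(n) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasCycle(map[string][]string{"a": {"b"}, "b": {"c"}}))
	fmt.Println(hasCycle(map[string][]string{"a": {"b"}, "b": {"c"}, "c": {"a"}}))
}
```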

func (*Graph) IsConnected

func (g *Graph) IsConnected() bool

IsConnected returns true iff every node is reachable from the start node, and every path terminates at the end node.

func (*Graph) IsValidGraph

func (g *Graph) IsValidGraph() bool

Asserts that g is a valid graph (according to Grapher's use case). Ensures that g is acyclic, is connected (not fully connected), and the adjacency list matches its linked list.

func (*Graph) PrintDot

func (g *Graph) PrintDot()

Prints out g in DOT graph format. Copy and paste output into http://www.webgraphviz.com/
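For readers unfamiliar with DOT, the sketch below renders an adjacency list as a DOT digraph. The exact output of PrintDot is not shown in this documentation, so the layout here is an assumption, and `toDot` is a hypothetical helper that returns a string instead of printing.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// toDot renders edges as a DOT digraph. Node ids are quoted so that names
// containing dashes (e.g. "get-instances") remain valid DOT identifiers.
// Output is sorted so that it is deterministic despite Go's map ordering.
func toDot(name string, edges map[string][]string) string {
	froms := make([]string, 0, len(edges))
	for from := range edges {
		froms = append(froms, from)
	}
	sort.Strings(froms)

	var b strings.Builder
	fmt.Fprintf(&b, "digraph %q {\n", name)
	for _, from := range froms {
		tos := append([]string(nil), edges[from]...) // copy before sorting
		sort.Strings(tos)
		for _, to := range tos {
			fmt.Fprintf(&b, "\t%q -> %q;\n", from, to)
		}
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	edges := map[string][]string{
		"get-instances":     {"pre-flight-checks"},
		"pre-flight-checks": {"delete-job"},
	}
	fmt.Print(toDot("decommission-cluster", edges))
}
```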

type Grapher

type Grapher struct {
	AllSequences map[string]*SequenceSpec // All sequences that were read in from the Config
	JobFactory   job.Factory              // factory to create nodes' jobs.
	// contains filtered or unexported fields
}

The Grapher struct contains the sequence specs required to construct graphs. The user must handle the creation of the Sequence Specs.

CreateGraph will create a graph. The user must provide a Sequence Type, to indicate what graph will be created.

func NewGrapher

func NewGrapher(req proto.Request, nf job.Factory, cfg Config, idgen id.Generator) *Grapher

NewGrapher returns a new Grapher struct. The caller of NewGrapher must provide a Job Factory for Grapher to create the jobs that will be stored at each node. An id generator must also be provided (used for generating ids for nodes).

A new Grapher should be made for every request.

func (*Grapher) CreateGraph

func (o *Grapher) CreateGraph(sequenceName string, args map[string]interface{}) (*Graph, error)

CreateGraph will create a graph. The user must provide a Sequence Name, to indicate what graph will be created. The caller must also provide the first set of args.

func (*Grapher) RequestArgs

func (o *Grapher) RequestArgs(requestType string, args map[string]interface{}) ([]proto.RequestArg, error)

func (*Grapher) Sequences

func (o *Grapher) Sequences() map[string]*SequenceSpec

type GrapherFactory

type GrapherFactory interface {
	// Make makes a Grapher. A new grapher should be made for every request.
	Make(proto.Request) *Grapher
}

A GrapherFactory makes Graphers.

func NewGrapherFactory

func NewGrapherFactory(jf job.Factory, cfg Config, idf id.GeneratorFactory) GrapherFactory

NewGrapherFactory creates a GrapherFactory.

type MockGrapherFactory

type MockGrapherFactory struct {
	MakeFunc func(proto.Request) *Grapher
}

Mock grapher factory for testing.

func (*MockGrapherFactory) Make

func (gf *MockGrapherFactory) Make(req proto.Request) *Grapher

type Node

type Node struct {
	Datum             job.Job                // Data stored at this Node
	Next              map[string]*Node       // out edges ( node id -> Node )
	Prev              map[string]*Node       // in edges ( node id -> Node )
	Name              string                 // the name of the node
	Args              map[string]interface{} // the args the node was created with
	Retry             uint                   // the number of times to retry a node
	RetryWait         string                 // the time to sleep between retries
	SequenceId        string                 // ID for first node in sequence
	SequenceRetry     uint                   // Number of times to retry a sequence. Only set for first node in sequence.
	SequenceRetryWait string                 // the time to sleep between sequence retries
}

Node represents a single vertex within a Graph. Each node consists of a payload (Datum, i.e. the data that the user cares about), maps of next and prev Nodes, and other information about the node such as the number of times it should be retried on error. Next defines all the out edges from Node, and Prev defines all the in edges to Node.

type NodeArg

type NodeArg struct {
	Expected string `yaml:"expected"` // the name of the argument that this job expects
	Given    string `yaml:"given"`    // the name of the argument that will be given to this job
}

NodeArg defines the structure expected from the yaml file to define a job's args.
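The expected/given mapping can be read as a rename step: look up the value under the "given" name and hand it to the job under the "expected" name. The sketch below illustrates that reading with a local `nodeArg` type that mirrors NodeArg; `remapArgs` is a hypothetical helper, not a function of this package.

```go
package main

import "fmt"

// nodeArg mirrors the Expected/Given pair of NodeArg for this sketch.
type nodeArg struct {
	Expected string // name the job expects
	Given    string // name under which the value is currently available
}

// remapArgs builds the argument map for a job by looking up each Given name
// in available and storing the value under the corresponding Expected name.
func remapArgs(specs []nodeArg, available map[string]interface{}) (map[string]interface{}, error) {
	jobArgs := make(map[string]interface{}, len(specs))
	for _, s := range specs {
		v, ok := available[s.Given]
		if !ok {
			return nil, fmt.Errorf("given arg %q is not set", s.Given)
		}
		jobArgs[s.Expected] = v
	}
	return jobArgs, nil
}

func main() {
	// Mirrors the check-ok node in the example config: the job expects
	// "container" and is given the value of "instance".
	specs := []nodeArg{{Expected: "container", Given: "instance"}}
	jobArgs, err := remapArgs(specs, map[string]interface{}{"instance": "host-1"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(jobArgs["container"])
}
```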

type NodeSpec

type NodeSpec struct {
	Name         string            `yaml:"name"`      // unique name assigned to this node
	Category     string            `yaml:"category"`  // "job", "sequence", or "conditional"
	NodeType     string            `yaml:"type"`      // the type of job or sequence to create
	Each         []string          `yaml:"each"`      // arguments to repeat over
	Args         []*NodeArg        `yaml:"args"`      // expected arguments
	Parallel     *uint             `yaml:"parallel"`  // max number of sequences to run in parallel
	Sets         []string          `yaml:"sets"`      // expected job args to be set
	Dependencies []string          `yaml:"deps"`      // nodes with out-edges leading to this node
	Retry        uint              `yaml:"retry"`     // the number of times to retry a "job" that fails
	RetryWait    string            `yaml:"retryWait"` // the time to sleep between "job" retries
	If           string            `yaml:"if"`        // the name of the jobArg to check for a conditional value
	Eq           map[string]string `yaml:"eq"`        // conditional values mapping to appropriate sequence names
}

NodeSpec defines the structure expected from the yaml file to define each node.

type SequenceArgs

type SequenceArgs struct {
	Required []*ArgSpec `yaml:"required"`
	Optional []*ArgSpec `yaml:"optional"`
	Static   []*ArgSpec `yaml:"static"`
}

SequenceArgs defines the structure expected from the config file to define a sequence's arguments. A sequence can have required arguments; any arguments on this list that are missing will result in an error from Grapher. A sequence can also have optional arguments; arguments on this list that are missing will not result in an error. Additionally, optional arguments can have default values that will be used if not explicitly given.
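The required/optional rules described above can be sketched as a small resolution function. The local types mirror ArgSpec and SequenceArgs, and `resolveArgs` is a hypothetical name. The handling of static arguments (always taking their configured default) is an assumption; this documentation does not describe their behavior.

```go
package main

import "fmt"

// argSpec and sequenceArgs mirror ArgSpec and SequenceArgs for this sketch.
type argSpec struct {
	Name    string
	Default *string
}

type sequenceArgs struct {
	Required []argSpec
	Optional []argSpec
	Static   []argSpec
}

// resolveArgs returns the arguments a sequence would start with.
func resolveArgs(spec sequenceArgs, given map[string]interface{}) (map[string]interface{}, error) {
	resolved := make(map[string]interface{})
	// Required: a missing value is an error.
	for _, a := range spec.Required {
		v, ok := given[a.Name]
		if !ok {
			return nil, fmt.Errorf("required arg %q is missing", a.Name)
		}
		resolved[a.Name] = v
	}
	// Optional: use the given value, else the default if one is configured.
	for _, a := range spec.Optional {
		if v, ok := given[a.Name]; ok {
			resolved[a.Name] = v
		} else if a.Default != nil {
			resolved[a.Name] = *a.Default
		}
	}
	// Static: ASSUMPTION, always the configured default, never caller input.
	for _, a := range spec.Static {
		if a.Default != nil {
			resolved[a.Name] = *a.Default
		}
	}
	return resolved, nil
}

func main() {
	hundred, abcd := "100", "abcd"
	spec := sequenceArgs{
		Required: []argSpec{{Name: "cluster"}, {Name: "env"}},
		Optional: []argSpec{{Name: "something", Default: &hundred}},
		Static:   []argSpec{{Name: "somethingelse", Default: &abcd}},
	}
	args, err := resolveArgs(spec, map[string]interface{}{"cluster": "c1", "env": "staging"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(args["cluster"], args["env"], args["something"], args["somethingelse"])
}
```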

type SequenceSpec

type SequenceSpec struct {
	/* Read in from yaml. */
	Name    string               `yaml:"name"`    // name of the sequence
	Args    SequenceArgs         `yaml:"args"`    // arguments to the sequence
	Nodes   map[string]*NodeSpec `yaml:"nodes"`   // list of nodes that are a part of the sequence
	Request bool                 `yaml:"request"` // whether or not the sequence spec is a user request
	ACL     []ACL                `yaml:"acl"`     // allowed caller roles (optional)
	/* Information-passing fields. */
	Retry     uint   `yaml:"-"` // the number of times to retry the sequence if it fails
	RetryWait string `yaml:"-"` // the time to sleep between sequence retries
}

SequenceSpec defines the structure expected from the config yaml file to define each sequence. If a field is in the yaml, it appears here, but the reverse is not true; some fields here are only for information-passing purposes and are not read in from the yaml.
