Published: Oct 7, 2015 License: Apache-2.0 Imports: 4 Imported by: 14

README

TaskGraph


TaskGraph is a framework for writing fault-tolerant distributed applications. It assumes that an application consists of a network of tasks that are interconnected according to some topology (hence "graph"). TaskGraph assumes that each task (a logical unit of work) has one primary node and zero or more backup nodes. TaskGraph then helps with two types of node failure: failures of nodes belonging to different tasks, and failures of nodes belonging to the same task. The framework monitors task/node health, takes care of restarting failed tasks, and passes a standard set of events (neighbor fail/restart, primary/backup switch) to the task implementation so that it can perform application-dependent recovery.

TaskGraph supports an event-driven pull model for communication. When one task wants to talk to another, it sets a flag via the framework; the framework then notifies the recipient, which can decide whether and when to fetch the data via the framework. The framework handles communication failures with automatic retries, reconnection to new task nodes, etc. In other words, it provides exactly-once semantics for data requests.

A TaskGraph application usually has three layers. The application implementer needs to configure TaskGraph in the driver layer and also implement Task/TaskBuilder/Topology based on the application logic.

  1. In the driver (main function), the application needs to configure the task graph. This includes setting up the TaskBuilder, which specifies which task runs on each node. One also needs to specify the network topology, which specifies who connects to whom at each iteration. Then Framework.Start is called so that every node enters the event loop.

  2. The TaskGraph framework handles fault tolerance within the framework, using etcd and/or Kubernetes for this purpose. It also handles the communication between logical tasks so that it can hide the communication between a master and its hot standby.

  3. The application needs to implement the Task interface to specify how it should react to parent/child die/restart events and carry out the correct application logic. Note that application developers also need to implement a TaskBuilder and Topology that suit their needs (implementations of these three interfaces are wired together in the driver).

For an example of driver and task implementation, check example/regression.

Note that, for now, the completion of TaskGraph is not defined explicitly. Instead, each application has its own way to exit based on application-dependent logic. As an example, the above application can stop when task 0 stops. There are hooks in Framework so that any node can potentially exit the entire TaskGraph, and there are hooks in Task so that the task implementation gets a chance to save its work.

Documentation

Overview

TaskBuilder is also implemented by the application developer and used by the framework implementation to decide which task implementation to use at any given node. It should be called only once, at node initialization.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type BackedUpFramework

type BackedUpFramework interface {
	// Ask the framework to perform this update on this task, which consists
	// of one primary and some backup copies.
	Update(taskID uint64, log UpdateLog)
}

Note that the framework can decide how the update is done and how to serve the update log.

type Backupable

type Backupable interface {
	// Hooks needed for master/slave switches, etc.
	BecamePrimary()
	BecameBackup()

	// The framework notifies this copy to update. This should be the only way
	// to update the state of a copy.
	Update(log UpdateLog)
}

Backupable is an interface that a task needs to implement if it wants to have a hot standby copy. This is another can of worms.
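To make the update flow concrete, here is a minimal sketch using local stand-ins for the Backupable and UpdateLog interfaces; the counterTask and addLog types are hypothetical and not part of this package:

```go
package main

import "fmt"

// Local stand-ins mirroring the interfaces documented above.
type UpdateLog interface {
	UpdateID()
}

type Backupable interface {
	BecamePrimary()
	BecameBackup()
	Update(log UpdateLog)
}

// addLog is a hypothetical update that increments a counter.
type addLog struct{ delta int }

func (a addLog) UpdateID() {}

// counterTask keeps a replicated counter; a backup applies the same
// UpdateLog stream as the primary, so their states stay in sync.
type counterTask struct {
	primary bool
	total   int
}

func (c *counterTask) BecamePrimary() { c.primary = true }
func (c *counterTask) BecameBackup()  { c.primary = false }

func (c *counterTask) Update(log UpdateLog) {
	if a, ok := log.(addLog); ok {
		c.total += a.delta
	}
}

func main() {
	backup := &counterTask{}
	backup.BecameBackup()
	for _, l := range []UpdateLog{addLog{2}, addLog{3}} {
		backup.Update(l)
	}
	fmt.Println(backup.total) // 5
}
```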

type Bootstrap

type Bootstrap interface {
	// These allow the application developer to set the task configuration so the
	// framework implementation knows which task to invoke at each node.
	SetTaskBuilder(taskBuilder TaskBuilder)

	// This allows the user to add their own link type of Topology into the framework topology.
	AddLinkage(linkType string, topology Topology)

	// After all configuration is done, the driver needs to call Start so that all
	// nodes enter the event loop to run the application.
	Start()
}

This interface is used by the application during the TaskGraph configuration phase.

type Bootup

type Bootup interface {
	// Blocking call to run the task until it finishes.
	Start()
}

type Datum

type Datum interface{}

Datum is the interface for basic loading and transformation.

type DatumIterator

type DatumIterator interface {
	HasNext() bool
	Next() Datum
}

DatumIterator allows one to iterate through all the Datum in a set.
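A minimal in-memory DatumIterator can be sketched as follows; the sliceIterator type is hypothetical, and the Datum and DatumIterator interfaces are redeclared locally so the sketch is self-contained:

```go
package main

import "fmt"

// Local stand-ins for the interfaces documented above.
type Datum interface{}

type DatumIterator interface {
	HasNext() bool
	Next() Datum
}

// sliceIterator is a hypothetical DatumIterator backed by a slice.
type sliceIterator struct {
	data []Datum
	pos  int
}

func (it *sliceIterator) HasNext() bool { return it.pos < len(it.data) }

func (it *sliceIterator) Next() Datum {
	d := it.data[it.pos]
	it.pos++
	return d
}

func main() {
	var it DatumIterator = &sliceIterator{data: []Datum{1, 2, 3}}
	for it.HasNext() {
		fmt.Println(it.Next())
	}
}
```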

type DatumIteratorBuilder

type DatumIteratorBuilder interface {
	Build(path string) DatumIterator
}

This can be used to build a sequence of Datum from a source.

type DatumStore

type DatumStore struct {
	Cache []Datum
}

DatumStore hosts a set of Datum in memory.

type DatumTransformer

type DatumTransformer interface {
	Transform(old Datum) Datum
}

A DatumTransformer transforms a Datum from one format to another.

type Framework

type Framework interface {
	// This allows the task implementation to query its neighbors.
	GetTopology() map[string]Topology

	// Kill the framework itself.
	// As the epoch changes, some nodes aren't needed anymore.
	Kill()

	// A task can inform all participating tasks to shut down.
	// If successful, all tasks will be gracefully shutdown.
	// TODO: @param status
	ShutdownJob()

	GetLogger() *log.Logger

	// This is used to figure out the task ID of the current node.
	GetTaskID() uint64

	// This is useful for a task to inform the framework of its status change.
	// metaData has to be really small, since it might be stored in etcd.
	// Setting the meta flag notifies the meta string to all nodes connected to this node via linkType.
	FlagMeta(ctx context.Context, linkType, meta string)

	// A task can inform all participating tasks to enter a new epoch.
	IncEpoch(ctx context.Context)

	// Request data from task toID with specified linkType and meta.
	DataRequest(ctx context.Context, toID uint64, method string, input proto.Message)
	CheckGRPCContext(ctx context.Context) error
}

Framework hides distributed-system complexity and provides users with convenient high-level features.

type GRPCHandlerInterceptor

type GRPCHandlerInterceptor interface {
	// Currently grpc doesn't support interceptor functionality. We need to rely on the user
	// calling this in the handler implementation.
	// The workflow would be
	//   C:Notify -> S:Intercept -> S:OnNotify
	Intercept(ctx context.Context, method string, input proto.Message) (proto.Message, error)
}

type GRPCHelper

type GRPCHelper interface {
	CreateOutputMessage(methodName string) proto.Message
	CreateServer() *grpc.Server
}

type MasterFrame

type MasterFrame interface {
	// The user can use this interface to simplify sending messages to workers. By keeping
	// track of workers' states, the user can make decisions about a logical worker and
	// communicate with it using proto messages.
	NotifyWorker(ctx context.Context, workerID uint64, method string, input proto.Message) (proto.Message, error)
	GRPCHandlerInterceptor
}

type MasterTask

type MasterTask interface {
	Setup(framework MasterFrame)
	Run(ctx context.Context)

	// Corresponds to NotifyMaster
	OnNotify(ctx context.Context, workerID uint64, method string, input proto.Message) (proto.Message, error)
	GRPCHelper
}

type Node

type Node interface {
	// return the ID of this node
	ID() uint64
	// return the task this node is associated with
	TaskID() uint64
	// return the status of this node
	// possible status: not associated with any task
	//                  master of a task
	//                  slave of a task
	Status() uint64
	// return a connection string of this node
	// scheme://host:port
	Connection() string
}
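For illustration, a minimal Node implementation might look like the following; the status constants and the simpleNode type are hypothetical, since the package does not export them:

```go
package main

import "fmt"

// Hypothetical status values mirroring the comment above.
const (
	StatusUnassigned uint64 = iota // not associated with any task
	StatusMaster                   // master of a task
	StatusSlave                    // slave of a task
)

// Local stand-in for the Node interface documented above.
type Node interface {
	ID() uint64
	TaskID() uint64
	Status() uint64
	Connection() string
}

// simpleNode is a hypothetical Node implementation.
type simpleNode struct {
	id, taskID, status uint64
	addr               string
}

func (n simpleNode) ID() uint64         { return n.id }
func (n simpleNode) TaskID() uint64     { return n.taskID }
func (n simpleNode) Status() uint64     { return n.status }
func (n simpleNode) Connection() string { return n.addr }

func main() {
	var n Node = simpleNode{id: 3, taskID: 1, status: StatusMaster, addr: "tcp://10.0.0.3:7070"}
	fmt.Println(n.TaskID(), n.Status(), n.Connection())
}
```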

type Task

type Task interface {
	// This is useful to bring the task up to speed from scratch or after it recovers.
	Init(taskID uint64, framework Framework)

	// The task is finished and about to exit. Last chance to save task-specific work.
	Exit()

	// The framework tells the user task what the current epoch is.
	// This gives the task an opportunity to clean up and regroup.
	EnterEpoch(ctx context.Context, epoch uint64)

	// The meta/data notifications obey exactly-once semantics. Note that the same
	// meta string will be notified only once even if you flag the meta more than once.
	// TODO: one can also get this from channel.
	MetaReady(ctx context.Context, fromID uint64, linkType, meta string)

	// This is the callback when data from server is ready.
	DataReady(ctx context.Context, fromID uint64, method string, output proto.Message)

	CreateOutputMessage(methodName string) proto.Message
	CreateServer() *grpc.Server
}

All event handler functions should be non-blocking.

type TaskBuilder

type TaskBuilder interface {
	// This method is called once by framework implementation to get the
	// right task implementation for given node/task.
	GetTask(taskID uint64) Task
}
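A TaskBuilder typically switches on the task ID. The sketch below trims the Task interface to a single hypothetical Name() method to stay self-contained; a real builder would return full Task implementations as documented above:

```go
package main

import "fmt"

// Task is trimmed here to one hypothetical method; the real interface
// is documented above.
type Task interface {
	Name() string
}

// Local stand-in for the TaskBuilder interface documented above.
type TaskBuilder interface {
	GetTask(taskID uint64) Task
}

type masterTask struct{}

func (masterTask) Name() string { return "master" }

type workerTask struct{}

func (workerTask) Name() string { return "worker" }

// roleBuilder assigns the master implementation to task 0 and the
// worker implementation to every other task.
type roleBuilder struct{}

func (roleBuilder) GetTask(taskID uint64) Task {
	if taskID == 0 {
		return masterTask{}
	}
	return workerTask{}
}

func main() {
	var b TaskBuilder = roleBuilder{}
	fmt.Println(b.GetTask(0).Name(), b.GetTask(5).Name()) // master worker
}
```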

type Topology

type Topology interface {
	// This method is called once by the framework implementation so that
	// we can get the local topology for each epoch later.
	SetTaskID(taskID uint64)

	// This returns the neighbors of this node on the given link at this epoch.
	GetNeighbors(epoch uint64) []uint64
}

The Topology is implemented by the application. Each Topology may span many epochs, and the topology at each epoch may differ.

type UpdateLog

type UpdateLog interface {
	UpdateID()
}

type WorkerFrame

type WorkerFrame interface {
	// It usually sends state information to the master in order to get further decisions.
	NotifyMaster(ctx context.Context, method string, input proto.Message) (proto.Message, error)
	// Worker-worker data flow
	DataRequest(ctx context.Context, workerID uint64, method string, input proto.Message) (proto.Message, error)
	GRPCHandlerInterceptor
}

type WorkerTask

type WorkerTask interface {
	Setup(framework WorkerFrame, workerID uint64)
	Run(ctx context.Context)

	// Corresponds to NotifyWorker
	OnNotify(ctx context.Context, method string, input proto.Message) (proto.Message, error)
	// Corresponds to DataRequest
	ServeData(ctx context.Context, workerID uint64, method string, input proto.Message) (proto.Message, error)
	GRPCHelper
}

Directories

Path Synopsis
example
regression/proto
Package proto is a generated protocol buffer package.
pkg
