boltzmann

v0.0.0-...-b1a8e8c Latest
Published: Oct 13, 2023 License: Apache-2.0 Imports: 2 Imported by: 0

README

Boltzmann

Boltzmann is an open-source, lightweight, distributed task orchestrator.

Based on the Scheduler Agent Supervisor cloud pattern, Boltzmann is a master-less service that schedules batches of tasks in a parallel and distributed way.

Depending on the configuration, a Boltzmann node can be stateless or stateful, since task state may be stored in an embedded or an external database (e.g. Redis).

Worker pools (i.e. Boltzmann nodes) stay correct even in a distributed environment by using leases (i.e. distributed mutex locks) and a small leader-election consensus algorithm.

Leases are implemented using either the Redlock algorithm or the storage engine's built-in data structures (e.g. etcd leases).
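To illustrate the lease idea, here is a minimal in-memory sketch. It is neither Redlock nor etcd and is not taken from Boltzmann's code; `LeaseStore`, `Acquire` and the `"leader"` key are hypothetical names chosen for the example. It only shows the core rule: a node may take a lease when it is free, expired, or already its own.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// lease records which node holds a named lock and until when.
type lease struct {
	holder  string
	expires time.Time
}

// LeaseStore is a hypothetical in-memory stand-in for the shared storage
// (Redis, etcd, ...) that would back leases in a real deployment.
type LeaseStore struct {
	mu     sync.Mutex
	leases map[string]lease
}

// NewLeaseStore returns an empty store.
func NewLeaseStore() *LeaseStore {
	return &LeaseStore{leases: make(map[string]lease)}
}

// Acquire grants the lease on key to holder if the lease is free, expired,
// or already held by the same holder (a renewal). It reports whether holder
// owns the lease afterwards.
func (s *LeaseStore) Acquire(key, holder string, ttl time.Duration, now time.Time) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	cur, ok := s.leases[key]
	if ok && cur.holder != holder && now.Before(cur.expires) {
		return false // another node still holds a live lease
	}
	s.leases[key] = lease{holder: holder, expires: now.Add(ttl)}
	return true
}

func main() {
	store := NewLeaseStore()
	now := time.Now()
	fmt.Println(store.Acquire("leader", "node-a", 5*time.Second, now))                    // true
	fmt.Println(store.Acquire("leader", "node-b", 5*time.Second, now.Add(time.Second)))   // false
	fmt.Println(store.Acquire("leader", "node-b", 5*time.Second, now.Add(6*time.Second))) // true
}
```

Because a lease expires on its own, a crashed holder cannot block the other nodes forever; a healthy holder keeps its position by renewing before the TTL runs out.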

Architecture

High-Level Architecture Diagram

Task Scheduler

The Scheduler arranges for the steps that make up the task to be executed and orchestrates their operation. These steps can be combined into a pipeline or workflow. The Scheduler is responsible for ensuring that the steps in this workflow are performed in the right order.

As each step is performed, the Scheduler records the state of the workflow, such as "step not yet started," "step running," or "step completed." The state information should also include an upper limit of the time allowed for the step to finish, called the complete-by time.
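The ordering and state-recording behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not Boltzmann's implementation: `Step`, `StepState` and `RunInOrder` are hypothetical names, and state is kept in memory rather than in a database.

```go
package main

import (
	"fmt"
	"time"
)

// StepState mirrors the three states named in the text.
type StepState int

const (
	StepNotStarted StepState = iota
	StepRunning
	StepCompleted
)

// Step is a hypothetical record the Scheduler would persist for each step.
type Step struct {
	Name       string
	State      StepState
	CompleteBy time.Time // upper limit of the time allowed for the step
}

// RunInOrder executes the steps one after another. Before each step it
// records the running state and the complete-by time; after a successful
// run it records completion. It stops at the first error.
func RunInOrder(steps []*Step, allowed time.Duration, run func(*Step) error) error {
	for _, s := range steps {
		s.State = StepRunning
		s.CompleteBy = time.Now().Add(allowed)
		if err := run(s); err != nil {
			return fmt.Errorf("step %q: %w", s.Name, err)
		}
		s.State = StepCompleted
	}
	return nil
}

func main() {
	steps := []*Step{{Name: "fetch"}, {Name: "transform"}, {Name: "store"}}
	err := RunInOrder(steps, 30*time.Second, func(s *Step) error {
		fmt.Println("running", s.Name)
		return nil
	})
	fmt.Println("error:", err)
}
```

A step that fails is left in the running state here, so that a later check against its complete-by time can pick it up.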

If a step requires access to a remote service or resource, the Scheduler invokes the appropriate Agent, passing it the details of the work to be performed. The Scheduler typically communicates with an Agent using asynchronous request/response messaging.

Agent

The Agent contains logic that encapsulates a call to a remote service, or access to a remote resource referenced by a step in a task. Each Agent typically wraps calls to a single service or resource, implementing the appropriate error handling and retry logic (subject to a timeout constraint).
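As a rough sketch of an Agent with retry logic bounded by a timeout, consider the following. The `Agent` interface, `AgentFunc` adapter and `WithRetry` wrapper are hypothetical names invented for this example and may not match the interfaces Boltzmann actually defines.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Agent is a hypothetical interface wrapping calls to one remote service.
type Agent interface {
	Execute(ctx context.Context, payload []byte) ([]byte, error)
}

// AgentFunc adapts a plain function to the Agent interface.
type AgentFunc func(ctx context.Context, payload []byte) ([]byte, error)

// Execute calls the underlying function.
func (f AgentFunc) Execute(ctx context.Context, payload []byte) ([]byte, error) {
	return f(ctx, payload)
}

// WithRetry retries the wrapped agent up to attempts times and gives up
// early once the overall timeout has expired.
func WithRetry(a Agent, attempts int, timeout time.Duration) Agent {
	return AgentFunc(func(ctx context.Context, payload []byte) ([]byte, error) {
		ctx, cancel := context.WithTimeout(ctx, timeout)
		defer cancel()
		var lastErr error
		for i := 0; i < attempts; i++ {
			if err := ctx.Err(); err != nil {
				return nil, fmt.Errorf("agent timed out: %w", err)
			}
			out, err := a.Execute(ctx, payload)
			if err == nil {
				return out, nil
			}
			lastErr = err
		}
		return nil, fmt.Errorf("agent failed after %d attempts: %w", attempts, lastErr)
	})
}

// demo runs a flaky agent that fails twice before succeeding.
func demo() (string, int, error) {
	calls := 0
	flaky := AgentFunc(func(ctx context.Context, payload []byte) ([]byte, error) {
		calls++
		if calls < 3 {
			return nil, errors.New("temporary failure")
		}
		return append([]byte("ok:"), payload...), nil
	})
	out, err := WithRetry(flaky, 5, time.Second).Execute(context.Background(), []byte("ping"))
	return string(out), calls, err
}

func main() {
	out, calls, err := demo()
	fmt.Println(out, calls, err) // ok:ping 3 <nil>
}
```

A production Agent would normally wait between attempts (for example with exponential backoff); that is left out to keep the sketch short.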

Supervisor

The Supervisor monitors the status of the steps in the task being performed by the Scheduler. It runs periodically (the frequency will be system-specific), and examines the status of steps maintained by the Scheduler. If it detects any that have timed out or failed, it arranges for the appropriate Agent to recover the step or execute the appropriate remedial action (this might involve modifying the status of a step).

Note that the recovery or remedial actions are implemented by the Scheduler and Agents. The Supervisor should simply request that these actions be performed.
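A single Supervisor pass might look like the sketch below. The `Step` type, its states and the `Supervise` function are hypothetical and only illustrate the division of labour: the Supervisor detects failed or overdue steps and requests recovery through a callback, without performing the recovery itself.

```go
package main

import (
	"fmt"
	"time"
)

// StepState is a hypothetical set of step states.
type StepState int

const (
	StepRunning StepState = iota
	StepCompleted
	StepFailed
)

// Step is the status record the Supervisor inspects.
type Step struct {
	Name       string
	State      StepState
	CompleteBy time.Time
}

// NeedsRecovery reports whether the step failed or is still running past
// its complete-by time.
func (s Step) NeedsRecovery(now time.Time) bool {
	return s.State == StepFailed || (s.State == StepRunning && now.After(s.CompleteBy))
}

// Supervise inspects the steps once and calls requestRecovery for each step
// that needs attention. The recovery itself is left to the callback.
func Supervise(steps []Step, now time.Time, requestRecovery func(Step)) {
	for _, s := range steps {
		if s.NeedsRecovery(now) {
			requestRecovery(s)
		}
	}
}

func main() {
	now := time.Now()
	steps := []Step{
		{Name: "fetch", State: StepCompleted, CompleteBy: now.Add(-time.Minute)},
		{Name: "transform", State: StepRunning, CompleteBy: now.Add(-time.Second)}, // timed out
		{Name: "store", State: StepRunning, CompleteBy: now.Add(time.Minute)},
		{Name: "notify", State: StepFailed},
	}
	Supervise(steps, now, func(s Step) { fmt.Println("recover:", s.Name) })
	// prints:
	// recover: transform
	// recover: notify
}
```

To run periodically, `Supervise` could be called from a loop driven by a `time.Ticker`, with the interval chosen per system.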

Usage

Currently, there are two ways to use Boltzmann (they are not mutually exclusive):

  • An HTTP REST API (HTTP/1.1).
  • A gRPC streaming API (HTTP/2, multiplexed).

Documentation

Constants

const (
	RootModuleName = "boltzmann"
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Task

type Task struct {
	TaskID            string
	CorrelationID     string
	Driver            string
	ResourceURI       string
	AgentArguments    map[string]string
	Payload           []byte
	Status            TaskStatus
	Response          []byte
	FailureMessage    string
	ScheduleTime      time.Time
	StartTime         time.Time
	EndTime           time.Time
	ExecutionDuration time.Duration
}

type TaskStatus

type TaskStatus uint8

const (
	TaskStatusScheduled TaskStatus
	TaskStatusStarted
	TaskStatusFailed
	TaskStatusSucceed
)

func NewTaskStatus

func NewTaskStatus(status string) TaskStatus

func (TaskStatus) GoString

func (s TaskStatus) GoString() string

func (TaskStatus) String

func (s TaskStatus) String() string
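The page lists only the signatures of `NewTaskStatus`, `String` and `GoString`, so their exact behaviour is not documented here. The sketch below shows one common way such a string-backed status type is written in Go. Everything in it is an assumption: the stand-in type `Status`, the function `ParseStatus`, the extra `StatusUnknown` value, the numeric values and the upper-case names are illustrative and may differ from the package.

```go
package main

import "fmt"

// Status is a stand-in for TaskStatus; values and names are assumptions.
type Status uint8

const (
	StatusUnknown Status = iota
	StatusScheduled
	StatusStarted
	StatusFailed
	StatusSucceed
)

// statusNames maps each known status to its (assumed) string form.
var statusNames = map[Status]string{
	StatusScheduled: "SCHEDULED",
	StatusStarted:   "STARTED",
	StatusFailed:    "FAILED",
	StatusSucceed:   "SUCCEED",
}

// String implements fmt.Stringer.
func (s Status) String() string {
	if name, ok := statusNames[s]; ok {
		return name
	}
	return "UNKNOWN"
}

// GoString implements fmt.GoStringer, which the %#v verb uses.
func (s Status) GoString() string {
	return s.String()
}

// ParseStatus converts a string back to a Status, falling back to
// StatusUnknown for unrecognised input.
func ParseStatus(name string) Status {
	for s, n := range statusNames {
		if n == name {
			return s
		}
	}
	return StatusUnknown
}

func main() {
	s := ParseStatus("STARTED")
	fmt.Printf("%v %#v %d\n", s, s, s) // STARTED STARTED 2
}
```

Implementing both `String` and `GoString` keeps log output readable whether a value is printed with `%v` or `%#v`.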

Directories

Path Synopsis
cmd
test
