server

package
v0.0.1-alpha Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Feb 28, 2022 License: Apache-2.0 Imports: 25 Imported by: 0

README

Scoot Scheduler

Components:

  • Scheduler interface (scheduler.go) - scheduler interface
  • statefulScheduler (stateful_scheduler.go) - scheduler implementation
  • clusterState (cluster_state.go) - keeps track of the available nodes and nodes running tasks
  • jobState (job_state.go) - tracks the state of each job (running, waiting and completed tasks)
  • taskRunner (task_runner.go) - starts tasks running and collects the results
  • LoadBasedAlg (load_based_sched_alg.go) - the algorithm for computing the list of tasks to start/stop with each scheduler loop iteration.

Scheduling Algorithm

(load_based_scheduling_alg.go)

This scheduling algorithm computes the list of tasks to start and stop:

Given

  • a defined set of job classes with load % for each class, such that the load %s define the number of scoot workers targeted for running the class's tasks
  • a mapping of job requestors to job classes (using reg exp to match requestor values)
  • jobs with
    • list of tasks waiting to start (in descending duration order)
  • we have the number of idle workers waiting to run a task

Overview
The algorithm selects the list of tasks that should be started on the idle workers. The objective of the algorithm is to maintain the target load %. When a given job class does not have enough tasks to use all the workers in that class's target load %, the unused workers will be used to run other class tasks (we call this loaning a worker).

It may be the case that long running tasks running on 'loaned' workers make it impossible for the algorithm to bring the task allocations back in line with the original target load %s. To address this, the algorithm has a 'rebalancing' feature that when turned on, will cancel the most recently started tasks for classes using loaned workers.

GetTasksToBeAssigned() (the top level entry to the algorithm):

  • determine if the system needs rebalancing (as per the rebalancing thresholds - see rebalancing below)
  • if the system needs rebalancing,
    • compute the tasks that should be stopped,
  • compute the number tasks to start for each class
  • compute the list of tasks that should be started
  • return the list of tasks that should be started and list of tasks that should be stopped

Get number of tasks to start, and list of tasks to stop:

(rebalance happens first, but it is described below since we've never turned it on)

compute number of tasks to start for each class as per the load % 'entitlements'

(entitlementTasksToStart())

A class's entitlement is the number of workers * the class's load % - it is the number of workers that should be allocated to that tasks in that class. When the algorithm is computing tasks to start, it will try to meet each class' entitlement.

  1. For each class with tasks waiting to start, it's unused entitlement is the class's entitlement - number of that class's tasks currently running (minimum unused entitlement is 0). In addition, if a class does not have tasks waiting to start, it's unused entitlement is 0.
  2. Compute each class's unused entitlement %: the class's _unused entitlement / sum(all classes _unused entitlement_s)
  3. compute the number of tasks to start for each class: min(number tasks waiting to start, _unused entitlement % * number of idle (unallocated) workers)

When a class's number of waiting tasks < the number to start, there will still be available workers after computing the number of tasks to start for each class.When this happens the algorithm repeats the steps listed above computing the additional number of tasks to start for each class. Each iteration will either allocate all available workers, all of at least one class's waiting tasks or a class's full entitlement. When all available workers are allocated or all class's waiting tasks or entitlements have been allocated the iteration stops and the entitlement computation is complete.

It may be the case that some classes are under-utilizing their entitlements, but other classes have more tasks than their entitlement waiting. When this happens the entitlement allocation will complete, but there will still be unallocated workers (the total number of tasks to start is still less than the number of idle workers) and tasks waiting to start. When this happens the algorithm proceeds to compute the number of workers to 'loan' to each class:

loaned workers

(workerLoanAllocation())

The loan part of the algorithm computes the loan distribution %s as follows:

  1. normalize the original load %s for classes with waiting tasks.
  2. compute the total workers that will be loaned (sum of current loaned workers + number of idle workers that would still be unassigned after the entitlement distribution above).
  3. compute what the worker loan distribution would be if all of the currently loaned workers plus newly available workers were distributed as per the normalized loan %s
  4. adjust the worker loan distribution by subtracting the number of currently loaned workers for each class. If classes are exceeding their loan distribution (from step 3) then their adjusted loan distribution is 0
  5. compute class final loan %s as each class's adjusted loan distribution / sum of the adjusted loan distributions

Each class's loan amount is computed as the class final loan % (from step 5) * number of workers still available or number of waiting tasks whichever is smaller

When the loan amount for a class is larger than the number of waiting tasks in that class, there will still be available workers after processing each class. When this happens the algorithm repeats the loan computation for the still unallocated workers. The iteration finishes when all unallocated workers have been assigned to a class or when all the classes waiting tasks have been allocated to a worker.

Note the entitlement and loan computation does not assign a specific worker to a task, it simply computes the number of workers that can be used to start tasks in each class.

rebalance

(rebalanceClassTasks())

Each scheduling iteration naturally brings the running task allocations back to the original class entitlements, but it could be the case that long running tasks holding on to loaned workers slowly cause the number of running tasks for each class to be far from the target load percents. When this occurs, the algorithm re-balances the running workers back toward the original target %s by stopping tasks that have been started on loaned workers. It will select the most recently started tasks till the running task to class allocation meets the original entitlement targets.

The re-balancing is triggered by computing the max difference of the percent under/over class load for each class with waiting tasks. We call this the delta entitlement spread. When the delta entitlement spread has been over a threshold for a set period of time, the algorithm computes the list of tasks to stop and start to bring the classes to their entitled number of workers.

Example

example 1: Test_Class_Task_Start_Cnts(), scenario 1 - takes 2 iterations to allocate all workers based on entitlement

totalWorkers: 1000, 710 running tasks, 290 idle workers

class load % running tasks waiting tasks entitlement
c0 30% 200 290 300
c1 25% 300 230 250
c2 20% 0 150 200
c3 15% 100 150 150
c4 10% 110 90 100
c5 0% 0 328 0
iter 1 iter 2 start
290 idle workers 174 allocated 16 idle workers 16 allocated
class entitled tasks normalized % entitled tasks normalized %
c0 300-200=100 100/350=29% .29*290=84 300-284=16 16/26=62% .62*16=10 94
c1 250-300-> 0 0% 0 0 0% 0 0
c2 200-0=200 200/350=57% .57*290=165 -> 150 tasks waiting 0 0% 0 150
c3 150-100=50 50/350=14% .14*290=40 150-140=10 10/26=38% .38*16=6 46
example 2: Test_Class_Task_Start_Cnts(), scenario 3 - entitlement plus loan

totalWorkers: 1000, 710 running tasks, 290 idle workers

class load % running tasks waiting tasks entitlement loaned
c0 30% 200 10 300 0
c1 25% 300 230 250 50
c2 20% 0 0 200 0
c3 15% 100 50 150 0
c4 10% 110 90 100 10
entitlement loan start
290 idle workers 60 allocated 230 idle+60 loaned 230 allocated
class entitled tasks normalized % normalized %
c0 300-200=100 100/150=67% .67*290=194 -> 10 tasks waiting 0 0 10
c1 250-300-> 0 0% 0 25/35=72% .72*290=207, 206-50=157 157
c2 0 waiting tasks 0% 0 0 0 0
c3 150-100=50 50/150=33% .33*290=98-> 50 tasks waiting 0 0 50
c4 100-110-> 0 0% 0 10/35=28% .29*290=83, 84-10=73 73

selecting tasks to start

Selecting tasks to start in a class uses a round robin approach selecting tasks from jobs with the least number of running tasks. This makes sure that all jobs in a given task are using an equitable number of the workers allocate to that class (one job with many tasks won't use up all the workers allocated to that class).

exposed scheduling parameters

Scheduler api for the algorithm:

  • GetClassLoadPercents - returns the map of className:%
  • SetClassLoadPercents - sets the load %s from a map of className:%
  • GetRequestorMap - returns the map of className:requestor_re, where requestor_re is the regular expression for matching requestor values in the job defs
  • SetRequestorMap - sets the map of className:requestor_re, where requestor_re is the regular expression for matching requestor values in the job defs
  • GetRebalanceMinimumDuration - return minimum number of minutes the system must be over the delta entitlement spread before triggering re-balancing
  • SetRebalanceMinimumDuration - set the minimum number of minutes the system must be over the delta entitlement spread before triggering re-balancing
  • GetRebalanceThreshold - return the threshold the delta entitlement spread must be over before triggering re-balancing
  • SetRebalanceThreshold - set the threshold the delta entitlement spread must be over before triggering re-balancing

Documentation

Overview

Package server provides the main job scheduling interface for Scoot

Package server is a generated GoMock package.

Index

Constants

View Source
const (
	// Clients will check for this string to differentiate between scoot and user initiated actions.
	UserRequestedErrStr = "UserRequested"

	// Nothing should run forever by default, use this timeout as a fallback.
	DefaultDefaultTaskTimeout = 30 * time.Minute

	// Allow extra time when waiting for a task response.
	// This includes network time and the time to upload logs to bundlestore.
	DefaultTaskTimeoutOverhead = 15 * time.Second

	// Number of different requestors that can run jobs at any given time.
	DefaultMaxRequestors = 10

	// Number of jobs any single requestor can have (to prevent spamming, not for scheduler fairness).
	DefaultMaxJobsPerRequestor = 100

	// Set the maximum number of tasks we'd expect to queue to a nonzero value (it'll be overridden later).
	DefaultSoftMaxSchedulableTasks = 1

	// Threshold for jobs considered long running
	LongJobDuration = 4 * time.Hour

	// How often Scheduler step is called in loop
	TickRate = 250 * time.Millisecond

	// The max job priority we respect (higher priority is untested and disabled)
	MaxPriority = domain.P2

	// Max number of requestors to track tag history, and max number of tags per requestor to track
	DefaultMaxRequestorHistories = 1000000
	DefaultMaxHistoryTags        = 100

	// Max number of task IDs to track durations for
	DefaultMaxTaskDurations = 1000000
)
View Source
const DeadLetterTrailer = " -> Error(s) encountered, canceling task."
View Source
const RebalanceRequestedErrStr = "RebalanceRequested"

Clients will check for this string to differentiate between scoot and user initiated actions.

Variables

View Source
var (
	DefaultLoadBasedSchedulerClassPercents = map[string]int32{
		"land":       40,
		"diff":       25,
		"sandbox":    10,
		"regression": 17,
		"ktf":        3,
		"coverage":   2,
		"tryout":     2,
		"unknown":    1,
	}
	DefaultRequestorToClassMap = map[string]string{
		"land.*":       "land",
		"diff.*":       "diff",
		"sandbox.*":    "sandbox",
		"regression.*": "regression",
		"CI.*":         "regression",
		"jenkins.*":    "ktf",
		"ktf.*":        "ktf",
		"coverage.*":   "coverage",
		"tryout.*":     "tryout",
	}
	DefaultMinRebalanceTime = time.Duration(4 * time.Minute)
	MaxTaskDuration         = time.Duration(4 * time.Hour)
)

defaults for the LoadBasedScheduler algorithm: only one class and all jobs map to that class

Functions

func GetRequestorClass

func GetRequestorClass(requestor string, requestorToClassMap map[string]string) string

GetRequestorClass find the requestorToClass entry for requestor keys in requestorToClassEntry are regular expressions if no match is found, return "" for the class name

func NewStatefulScheduler

func NewStatefulScheduler(
	nodesUpdatesCh chan []cc.NodeUpdate,
	sc saga.SagaCoordinator,
	rf RunnerFactory,
	config SchedulerConfiguration,
	stat stats.StatsReceiver,
	persistor Persistor,
	durationKeyExtractorFn func(string) string) *statefulScheduler

Create a New StatefulScheduler that implements the Scheduler interface cc.Cluster - cluster of worker nodes saga.SagaCoordinator - the Saga Coordinator to log to and recover from RunnerFactory - Function which converts a node to a Runner SchedulerConfig - additional configuration settings for the scheduler StatsReceiver - stats receiver to log statistics to specifying debugMode true, starts the scheduler up but does not start the update loop. Instead the loop must be advanced manually by calling step(), intended for debugging and test cases If recoverJobsOnStartup is true Active Sagas in the saga log will be recovered and rescheduled, otherwise no recovery will be done on startup

Types

type LoadBasedAlg

type LoadBasedAlg struct {
	// contains filtered or unexported fields
}

LoadBasedAlg the scheduling algorithm computes the list of tasks to start and stop for the current iteration of the scheduler loop.

The algorithm uses the classLoadPercents, number of tasks currently running for each class, and number of available workers when computing the tasks to start/stop.

the algorithm has 3 main phases: rebalancing, entitlement allocation, loaning allocation.

- the rebalancing phase is only triggered when the number of tasks running for each class is very different than the class load percents. When rebalancing is run, the rebalancing computation will select the set of tasks that need to be stopped to bring classes that are over their target load percents back to the target load percents, and the tasks that can be started to replace the stopped tasks. Note: there may still be idle workers after the rebalance tasks are started/stopped. The next scheduling iteration will find tasks for these workers.

the entitlement part of the computation identifies the number of tasks to start to bring the number of running tasks for each class closer to its entitlement as defined in the classLoadPercents.

the loan part of the computation identifies the number of tasks over a class's entitlement that can be started to use up unused workers due to other classes not using their entitlement.

See README.md for more details

func NewLoadBasedAlg

func NewLoadBasedAlg(config *LoadBasedAlgConfig, tasksByJobClassAndStartTimeSec map[taskClassAndStartKey]taskStateByJobIDTaskID) *LoadBasedAlg

NewLoadBasedAlg allocate a new LoadBaseSchedAlg object.

func (*LoadBasedAlg) GetDataStructureSizeStats

func (lbs *LoadBasedAlg) GetDataStructureSizeStats() map[string]int

func (*LoadBasedAlg) GetTasksToBeAssigned

func (lbs *LoadBasedAlg) GetTasksToBeAssigned(jobsNotUsed []*jobState, stat stats.StatsReceiver, cs *clusterState,
	jobsByRequestor map[string][]*jobState) ([]*taskState, []*taskState)

GetTasksToBeAssigned - the entry point to the load based scheduling algorithm The algorithm uses the classLoadPercents, number of tasks currently running for each class and number of available workers when computing the tasks to start/stop. The algorithm has 3 main phases: rebalancing, entitlement allocation, loaning allocation.

func (*LoadBasedAlg) LocalCopyClassLoadPercents

func (lbs *LoadBasedAlg) LocalCopyClassLoadPercents() map[string]int

LocalCopyClassLoadPercents return a copy of the ClassLoadPercents leaving as int

type LoadBasedAlgConfig

type LoadBasedAlgConfig struct {
	// contains filtered or unexported fields
}

type MockScheduler

type MockScheduler struct {
	// contains filtered or unexported fields
}

MockScheduler is a mock of Scheduler interface

func NewMockScheduler

func NewMockScheduler(ctrl *gomock.Controller) *MockScheduler

NewMockScheduler creates a new mock instance

func (*MockScheduler) EXPECT

EXPECT returns an object that allows the caller to indicate expected use

func (*MockScheduler) GetClassLoadPercents

func (m *MockScheduler) GetClassLoadPercents() (map[string]int32, error)

GetClassLoadPercents mocks base method

func (*MockScheduler) GetRebalanceMinimumDuration

func (m *MockScheduler) GetRebalanceMinimumDuration() (time.Duration, error)

GetRebalanceMinimumDuration mocks base method

func (*MockScheduler) GetRebalanceThreshold

func (m *MockScheduler) GetRebalanceThreshold() (int32, error)

GetRebalanceThreshold mocks base method

func (*MockScheduler) GetRequestorToClassMap

func (m *MockScheduler) GetRequestorToClassMap() (map[string]string, error)

GetRequestorToClassMap mocks base method

func (*MockScheduler) GetSagaCoord

func (m *MockScheduler) GetSagaCoord() saga.SagaCoordinator

GetSagaCoord mocks base method

func (*MockScheduler) GetSchedulerStatus

func (m *MockScheduler) GetSchedulerStatus() (int, int)

GetSchedulerStatus mocks base method

func (*MockScheduler) KillJob

func (m *MockScheduler) KillJob(jobId string) error

KillJob mocks base method

func (*MockScheduler) OfflineWorker

func (m *MockScheduler) OfflineWorker(req domain.OfflineWorkerReq) error

OfflineWorker mocks base method

func (*MockScheduler) ReinstateWorker

func (m *MockScheduler) ReinstateWorker(req domain.ReinstateWorkerReq) error

ReinstateWorker mocks base method

func (*MockScheduler) ScheduleJob

func (m *MockScheduler) ScheduleJob(jobDef domain.JobDefinition) (string, error)

ScheduleJob mocks base method

func (*MockScheduler) SetClassLoadPercents

func (m *MockScheduler) SetClassLoadPercents(classLoads map[string]int32) error

SetClassLoadPercents mocks base method

func (*MockScheduler) SetRebalanceMinimumDuration

func (m *MockScheduler) SetRebalanceMinimumDuration(durationMin time.Duration) error

SetRebalanceMinimumDuration mocks base method

func (*MockScheduler) SetRebalanceThreshold

func (m *MockScheduler) SetRebalanceThreshold(durationMin int32) error

SetRebalanceThreshold mocks base method

func (*MockScheduler) SetRequestorToClassMap

func (m *MockScheduler) SetRequestorToClassMap(requestorToClassMap map[string]string) error

SetRequestorToClassMap mocks base method

func (*MockScheduler) SetSchedulerStatus

func (m *MockScheduler) SetSchedulerStatus(maxTasks int) error

SetSchedulerStatus mocks base method

type MockSchedulerMockRecorder

type MockSchedulerMockRecorder struct {
	// contains filtered or unexported fields
}

MockSchedulerMockRecorder is the mock recorder for MockScheduler

func (*MockSchedulerMockRecorder) GetClassLoadPercents

func (mr *MockSchedulerMockRecorder) GetClassLoadPercents() *gomock.Call

GetClassLoadPercents indicates an expected call of GetClassLoadPercents

func (*MockSchedulerMockRecorder) GetRebalanceMinimumDuration

func (mr *MockSchedulerMockRecorder) GetRebalanceMinimumDuration() *gomock.Call

GetRebalanceMinimumDuration indicates an expected call of GetRebalanceMinimumDuration

func (*MockSchedulerMockRecorder) GetRebalanceThreshold

func (mr *MockSchedulerMockRecorder) GetRebalanceThreshold() *gomock.Call

GetRebalanceThreshold indicates an expected call of GetRebalanceThreshold

func (*MockSchedulerMockRecorder) GetRequestorToClassMap

func (mr *MockSchedulerMockRecorder) GetRequestorToClassMap() *gomock.Call

GetRequestorToClassMap indicates an expected call of GetRequestorToClassMap

func (*MockSchedulerMockRecorder) GetSagaCoord

func (mr *MockSchedulerMockRecorder) GetSagaCoord() *gomock.Call

GetSagaCoord indicates an expected call of GetSagaCoord

func (*MockSchedulerMockRecorder) GetSchedulerStatus

func (mr *MockSchedulerMockRecorder) GetSchedulerStatus() *gomock.Call

GetSchedulerStatus indicates an expected call of GetSchedulerStatus

func (*MockSchedulerMockRecorder) KillJob

func (mr *MockSchedulerMockRecorder) KillJob(jobId interface{}) *gomock.Call

KillJob indicates an expected call of KillJob

func (*MockSchedulerMockRecorder) OfflineWorker

func (mr *MockSchedulerMockRecorder) OfflineWorker(req interface{}) *gomock.Call

OfflineWorker indicates an expected call of OfflineWorker

func (*MockSchedulerMockRecorder) ReinstateWorker

func (mr *MockSchedulerMockRecorder) ReinstateWorker(req interface{}) *gomock.Call

ReinstateWorker indicates an expected call of ReinstateWorker

func (*MockSchedulerMockRecorder) ScheduleJob

func (mr *MockSchedulerMockRecorder) ScheduleJob(jobDef interface{}) *gomock.Call

ScheduleJob indicates an expected call of ScheduleJob

func (*MockSchedulerMockRecorder) SetClassLoadPercents

func (mr *MockSchedulerMockRecorder) SetClassLoadPercents(classLoads interface{}) *gomock.Call

SetClassLoadPercents indicates an expected call of SetClassLoadPercents

func (*MockSchedulerMockRecorder) SetRebalanceMinimumDuration

func (mr *MockSchedulerMockRecorder) SetRebalanceMinimumDuration(durationMin interface{}) *gomock.Call

SetRebalanceMinimumDuration indicates an expected call of SetRebalanceMinimumDuration

func (*MockSchedulerMockRecorder) SetRebalanceThreshold

func (mr *MockSchedulerMockRecorder) SetRebalanceThreshold(durationMin interface{}) *gomock.Call

SetRebalanceThreshold indicates an expected call of SetRebalanceThreshold

func (*MockSchedulerMockRecorder) SetRequestorToClassMap

func (mr *MockSchedulerMockRecorder) SetRequestorToClassMap(requestorToClassMap interface{}) *gomock.Call

SetRequestorToClassMap indicates an expected call of SetRequestorToClassMap

func (*MockSchedulerMockRecorder) SetSchedulerStatus

func (mr *MockSchedulerMockRecorder) SetSchedulerStatus(maxTasks interface{}) *gomock.Call

SetSchedulerStatus indicates an expected call of SetSchedulerStatus

type MockSchedulingAlgorithm

type MockSchedulingAlgorithm struct {
	// contains filtered or unexported fields
}

MockSchedulingAlgorithm is a mock of SchedulingAlgorithm interface

func NewMockSchedulingAlgorithm

func NewMockSchedulingAlgorithm(ctrl *gomock.Controller) *MockSchedulingAlgorithm

NewMockSchedulingAlgorithm creates a new mock instance

func (*MockSchedulingAlgorithm) EXPECT

EXPECT returns an object that allows the caller to indicate expected use

func (*MockSchedulingAlgorithm) GetTasksToBeAssigned

func (m *MockSchedulingAlgorithm) GetTasksToBeAssigned(jobs []*jobState, stat stats.StatsReceiver, cs *clusterState, requestors map[string][]*jobState) ([]*taskState, []*taskState)

GetTasksToBeAssigned mocks base method

type MockSchedulingAlgorithmMockRecorder

type MockSchedulingAlgorithmMockRecorder struct {
	// contains filtered or unexported fields
}

MockSchedulingAlgorithmMockRecorder is the mock recorder for MockSchedulingAlgorithm

func (*MockSchedulingAlgorithmMockRecorder) GetTasksToBeAssigned

func (mr *MockSchedulingAlgorithmMockRecorder) GetTasksToBeAssigned(jobs, stat, cs, requestors interface{}) *gomock.Call

GetTasksToBeAssigned indicates an expected call of GetTasksToBeAssigned

type PersistedSettings

type PersistedSettings struct {
	ClassLoadPercents               map[string]int32  `json:"classLoadPercents"`
	RequestorToClassMap             map[string]string `json:"requestorToClassMap"`
	RebalanceMinimumDurationMinutes int               `json:"rebalanceMinimumDurationMinutes"`
	RebalanceThreshold              int               `json:"rebalanceThreshold"`
	Throttle                        int               `json:"throttle"`
}

PersistedSettings the persisted scheduler settings structure for encoding/decoding as json

type Persistor

type Persistor interface {
	PersistSettings(settings *PersistedSettings) error
	LoadSettings() (*PersistedSettings, error)
}

Persistor interface for persisting scheduler settings and initializing the scheduler from its persisted settings

type ReadyFn

type ReadyFn func(cc.Node) (ready bool, backoffDuration time.Duration)

Cluster will use this function to determine if newly added nodes are ready to be used.

type RunnerFactory

type RunnerFactory func(node cc.Node) runner.Service

type Scheduler

type Scheduler interface {
	ScheduleJob(jobDef domain.JobDefinition) (string, error)

	KillJob(jobId string) error

	GetSagaCoord() saga.SagaCoordinator

	OfflineWorker(req domain.OfflineWorkerReq) error

	ReinstateWorker(req domain.ReinstateWorkerReq) error

	SetSchedulerStatus(maxTasks int) error

	GetSchedulerStatus() (int, int)

	GetClassLoadPercents() (map[string]int32, error)

	SetClassLoadPercents(classLoads map[string]int32) error

	GetRequestorToClassMap() (map[string]string, error)

	SetRequestorToClassMap(requestorToClassMap map[string]string) error

	GetRebalanceMinimumDuration() (time.Duration, error)

	SetRebalanceMinimumDuration(durationMin time.Duration) error

	GetRebalanceThreshold() (int32, error)

	SetRebalanceThreshold(durationMin int32) error
}

type SchedulerConfiguration

type SchedulerConfiguration struct {
	MaxRetriesPerTask    int
	DebugMode            bool
	RecoverJobsOnStartup bool
	DefaultTaskTimeout   time.Duration
	TaskTimeoutOverhead  time.Duration
	RunnerRetryTimeout   time.Duration
	RunnerRetryInterval  time.Duration
	ReadyFnBackoff       time.Duration
	MaxRequestors        int
	MaxJobsPerRequestor  int
	TaskThrottle         int
	Admins               []string

	SchedAlgConfig interface{}
	SchedAlg       SchedulingAlgorithm
}

SchedulerConfiguration variables read at initialization MaxRetriesPerTask - the number of times to retry a failing task before

marking it as completed.

DebugMode - if true, starts the scheduler up but does not start

the update loop.  Instead the loop must be advanced manually
by calling step()

RecoverJobsOnStartup - if true, the scheduler recovers active sagas,

from the sagalog, and restarts them.

DefaultTaskTimeout -

default timeout for tasks.

TaskTimeoutOverhead

How long to wait for a response after the task has timed out.

RunnerRetryTimeout -

how long to keep retrying a runner req.

RunnerRetryInterval -

how long to sleep between runner req retries.

ReadyFnBackoff -

how long to wait between runner status queries to determine [init] status.

TaskThrottle -

	   requestors will try not to schedule jobs that make the scheduler exceed
    the TaskThrottle.  Note: Sickle may exceed it with retries.

func (*SchedulerConfiguration) String

func (sc *SchedulerConfiguration) String() string

type SchedulingAlgorithm

type SchedulingAlgorithm interface {
	GetTasksToBeAssigned(jobs []*jobState, stat stats.StatsReceiver, cs *clusterState,
		requestors map[string][]*jobState) (startTasks []*taskState, stopTasks []*taskState)
}

SchedulingAlgorithm interface for the scheduling algorithm. Implementations will compute the list of tasks to start and stop

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL