dispatchslurm

package
v0.0.0-...-288f078 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 26, 2024 License: AGPL-3.0, Apache-2.0, CC-BY-SA-3.0 Imports: 29 Imported by: 0

Documentation

Overview

Dispatcher service for Crunch that submits containers to the slurm queue.

Constants

This section is empty.

Variables

var Command cmd.Handler = service.Command(arvados.ServiceNameDispatchSLURM, newHandler)

Functions

func NewSlurmCLI

func NewSlurmCLI() *slurmCLI

func SlurmNodeTypeFeatureKludge

func SlurmNodeTypeFeatureKludge(cc *arvados.Cluster)

SlurmNodeTypeFeatureKludge ensures SLURM accepts every instance type name as a valid feature name, even if no instances of that type have appeared yet.

It takes advantage of some SLURM peculiarities:

(1) A feature is valid after it has been offered by a node, even if it is no longer offered by any node. So, to make a feature name valid, we can add it to a dummy node ("compute0"), then remove it.

(2) To test whether a set of feature names is valid without actually submitting a job, we can call srun --test-only with the desired features.
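The two peculiarities above map onto two command invocations. The following is a minimal sketch of how such commands could be built, not the package's actual code: the helper names testFeaturesCmd and setFeaturesCmd, the "instancetype=" feature names, and the trailing "true" command are assumptions made for illustration. It relies only on the standard srun --test-only and --constraint options and on scontrol update NodeName=... Features=....

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// testFeaturesCmd (hypothetical helper) builds, but does not run, an srun
// invocation that asks SLURM whether a job requiring all of the given
// features could be scheduled. "&" joins constraints with AND.
func testFeaturesCmd(features []string) *exec.Cmd {
	return exec.Command("srun", "--test-only",
		"--constraint="+strings.Join(features, "&"), "true")
}

// setFeaturesCmd (hypothetical helper) builds an scontrol invocation that
// replaces the feature list of one node. Passing no features clears the list.
func setFeaturesCmd(node string, features []string) *exec.Cmd {
	return exec.Command("scontrol", "update",
		"NodeName="+node, "Features="+strings.Join(features, ","))
}

func main() {
	// Illustrative feature names only; real names come from the cluster config.
	features := []string{"instancetype=a1.small", "instancetype=a1.large"}

	fmt.Println(testFeaturesCmd(features).Args)            // the validity test
	fmt.Println(setFeaturesCmd("compute0", features).Args) // offer the features
	fmt.Println(setFeaturesCmd("compute0", nil).Args)      // then remove them
}
```

Building the commands separately from running them keeps the sketch usable on a machine without SLURM installed.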

SlurmNodeTypeFeatureKludge does a test-and-fix operation immediately, and then repeats it periodically, in case SLURM restarts and forgets the list of valid features. It never returns (unless there are no node types configured, in which case it returns immediately), so it should generally be invoked with "go".
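The control flow described above (return immediately when nothing is configured, otherwise test and fix now and then again on every tick) can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: ensureFeatures, its valid and fix callbacks, and the stop channel are all hypothetical. The stop channel in particular is an addition that makes the sketch terminate; the documented function has no such parameter.

```go
package main

import (
	"fmt"
	"time"
)

// ensureFeatures is a hypothetical stand-in for the test-and-fix loop.
// valid reports whether every feature is currently accepted; fix makes
// them accepted. stop exists only so this sketch can terminate.
func ensureFeatures(features []string, valid func([]string) bool,
	fix func([]string), period time.Duration, stop <-chan struct{}) {
	if len(features) == 0 {
		return // nothing configured: return immediately
	}
	ticker := time.NewTicker(period)
	defer ticker.Stop()
	for {
		// Test first, and fix only when the test fails.
		if !valid(features) {
			fix(features)
		}
		select {
		case <-ticker.C: // repeat, in case the scheduler forgot the features
		case <-stop:
			return
		}
	}
}

func main() {
	known := map[string]bool{} // simulated scheduler state
	valid := func(fs []string) bool {
		for _, f := range fs {
			if !known[f] {
				return false
			}
		}
		return true
	}
	fix := func(fs []string) {
		for _, f := range fs {
			known[f] = true
		}
		fmt.Println("registered", fs)
	}

	stop := make(chan struct{})
	go func() {
		time.Sleep(50 * time.Millisecond)
		close(stop)
	}()
	ensureFeatures([]string{"a1.small"}, valid, fix, 10*time.Millisecond, stop)
}
```

Because a loop of this shape blocks its caller, it is normally started in its own goroutine, which matches the advice to invoke SlurmNodeTypeFeatureKludge with "go".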

Types

type Dispatcher

type Dispatcher struct {
	*dispatch.Dispatcher

	Client arvados.Client
	// contains filtered or unexported fields
}

func (*Dispatcher) CheckHealth

func (disp *Dispatcher) CheckHealth() error

func (*Dispatcher) Done

func (disp *Dispatcher) Done() <-chan struct{}

func (*Dispatcher) ServeHTTP

func (disp *Dispatcher) ServeHTTP(w http.ResponseWriter, r *http.Request)

type Slurm

type Slurm interface {
	Batch(script io.Reader, args []string) error
	Cancel(name string) error
	QueueCommand(args []string) *exec.Cmd
	Release(name string) error
	Renice(name string, nice int64) error
}

type SqueueChecker

type SqueueChecker struct {
	Logger         logger
	Period         time.Duration
	PrioritySpread int64
	Slurm          Slurm
	// contains filtered or unexported fields
}

SqueueChecker implements an asynchronous polling monitor of the SLURM queue using the 'squeue' command.

func (*SqueueChecker) All

func (sqc *SqueueChecker) All() []string

All waits for the next squeue invocation, and returns all job names reported by squeue.

func (*SqueueChecker) HasUUID

func (sqc *SqueueChecker) HasUUID(uuid string) bool

HasUUID checks whether a given container UUID is in the SLURM queue. It does not run squeue directly, but instead blocks until woken up by the next successful squeue update.
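One common way to get this "block until the next successful update" behavior is a condition variable paired with a generation counter. The sketch below shows that pattern under stated assumptions: queueSnapshot, update, and hasName are hypothetical names, and nothing here is taken from the SqueueChecker source.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// queueSnapshot is a hypothetical holder for the latest polled job names.
type queueSnapshot struct {
	mu    sync.Mutex
	cond  *sync.Cond
	gen   int             // incremented on every successful update
	names map[string]bool // job names seen in the latest update
}

func newQueueSnapshot() *queueSnapshot {
	q := &queueSnapshot{}
	q.cond = sync.NewCond(&q.mu)
	return q
}

// update stores a fresh snapshot and wakes every waiting caller.
func (q *queueSnapshot) update(names []string) {
	m := make(map[string]bool, len(names))
	for _, n := range names {
		m[n] = true
	}
	q.mu.Lock()
	q.names = m
	q.gen++
	q.mu.Unlock()
	q.cond.Broadcast()
}

// hasName blocks until an update arrives after the call started, then
// answers from that snapshot. It never triggers a poll itself.
func (q *queueSnapshot) hasName(name string) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	start := q.gen
	for q.gen == start {
		q.cond.Wait()
	}
	return q.names[name]
}

func main() {
	q := newQueueSnapshot()
	result := make(chan bool)
	go func() { result <- q.hasName("job-a") }()

	// Simulate the polling goroutine delivering periodic updates.
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case found := <-result:
			fmt.Println("job-a queued:", found)
			return
		case <-ticker.C:
			q.update([]string{"job-a", "job-b"})
		}
	}
}
```

The generation counter guards against answering from stale data: a caller that arrives between two polls waits for the later one. It also shows why calling such a method after the poller has stopped would block forever, which is consistent with the warning not to call HasUUID after Stop.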

func (*SqueueChecker) SetPriority

func (sqc *SqueueChecker) SetPriority(uuid string, want int64)

SetPriority sets or updates the desired (Arvados) priority for a container.

func (*SqueueChecker) Stop

func (sqc *SqueueChecker) Stop()

Stop stops the squeue monitoring goroutine. Do not call HasUUID after calling Stop.
