Maintenance Controller

A Kubernetes controller to manage node maintenance. Serves roughly 50 production clusters across SAP Converged Cloud.

Table of Contents

  • Motivation
  • Concept
  • Installation
  • Configuration
    • General
    • Format
    • Check Plugins
    • Notification Plugins
    • Notification Schedules
    • Trigger Plugins
  • Additional integrations
  • Example configuration for flatcar update agents

Motivation

Sometimes nodes of a Kubernetes cluster need to be put into maintenance. There are several reasons for this, such as updating the node's operating system or the kubelet daemon. Putting a node into maintenance requires cordoning and draining that node. Stateful applications might have special constraints regarding their termination, which cannot be handled easily using Kubernetes "PreStopHooks" (e.g. high-availability scenarios). In enterprise contexts, additional processes might influence when a node maintenance is allowed to occur.

The maintenance controller supports enforcing maintenance processes, automating maintenance approvals and customizing termination logic. It is built with flexibility in mind and should be adaptable to different environments and requirements. This is achieved with an extensible plugin system.

Concept

Kubernetes nodes are modelled as finite state machines, which can be in one of the following three states:

  • Operational
  • Maintenance Required
  • In Maintenance

A node's current state is shown in the cloud.sap/maintenance-state node label. Nodes transition to the next state if a chain of configurable "check plugins" decides that the node's state should move on. Such plugin chains can be configured for each state individually via maintenance profiles. Cluster administrators can assign a maintenance profile to a node using the cloud.sap/maintenance-profile label. Before a transition is finished, a chain of "trigger plugins" can be invoked, which can perform any action related to termination or startup logic. While a node is in a certain state, a chain of "notification plugins" regularly informs cluster users and administrators about the node being in that state. Multiple plugins exist, so one can check or alter labels, be notified via Slack and so on.

The maintenance-controller only does the decision making on whether a node can be maintained or not. Currently, most actual maintenance actions, like cordoning, draining and rebooting nodes, are not carried out by the maintenance-controller itself and are instead delegated to built-in or external controllers. Check out the additional integrations further down.

Installation

Docker images are on GitHub's Container registry (older images remain on DockerHub until they are removed). A Helm chart can be found here. Alternatively, execute make deploy IMG=sapcc/maintenance-controller.

Configuration

General

The maintenance-controller contains multiple plugins, which are configurable themselves. hasLabel, for example, needs to know which label should be checked for which value. The combination of a plugin type like hasLabel and its specific configuration is referred to as an instance. Notification instances additionally require a schedule, which describes when and how often to notify about state changes. These instances can be chained together to construct more complex check, trigger and notification actions. In that regard, plugin chains refer to instances being used in conjunction.

Each profile describes a single maintenance workflow by specifying how a node moves through the state machine. For each state a notification chain can be configured. Transitions have to be defined as well. These consist of at least a check chain and the state that should follow next. Optionally, a trigger chain can be configured to perform actions when a node moves from one state into the next.

Debugging

By port-forwarding to the port specified by the metrics-addr flag (default 8080) of the currently active maintenance-controller, one can access a web view on /, which shows details about the controller's internal state. It provides an overview of how many nodes are in a certain state for a certain profile. Individual check chain evaluations can be inspected as well.

Format

There is a global configuration, which defines some general options, plugin instances and maintenance profiles. The global configuration should be named ./config/maintenance.yaml relative to the controller's working directory and is preferably mounted via a Kubernetes secret or a config map. A secret is recommended as some plugins may need authentication data. Environment variables can be interpolated in value positions using ${ENV_VAR}. The basic structure looks like the following:

intervals:
  # defines the minimum duration after which a node should be checked again
  requeue: 200ms
# plugin instances are the combination of a plugin and its configuration
instances:
  # notification plugin instances
  notify:
  - type: slack # the plugin type
    name: somenotificationplugin
    config:
      hook: slack-webhook
      channel: "#the_channel"
      message: the message
    # notification schedule
    schedule:
      type: periodic
      config:
        interval: 24h
  # check plugin instances
  check:
  - type: hasLabel # the plugin type
    # name of the instance, which is used in the plugin chain configurations
    # do not use spaces or other special characters, besides the underscore, which is allowed
    name: transition
    # the configuration for the plugin. That block depends on the plugin type
    config:
      key: transition
      value: "true"
  # trigger plugin instances
  trigger:
  - type: alterLabel
    name: alter
    config:
      key: alter
      value: "true"
      remove: false
profiles:
# define a maintenance profile called someprofile
- name: someprofile
  # define the plugin chains for the operational state
  operational:
    # the notification instances to invoke while in the operational state
    notify: somenotificationplugin
    transitions:
      # the exit condition for the operational state refers to the "transition" plugin instance defined in the instances section
    - check: transition
      # the trigger instances which are invoked when leaving the operational state
      trigger: alter
      # the following state after passing checks and executing triggers
      next: maintenance-required
  # define the plugin chains for the maintenance-required state
  maintenance-required:
    # define chains as shown with the operational state
    notify: null
    transitions: null
  # define plugin chains for the in-maintenance state
  in-maintenance:
    # multiple notification instances can be used
    notify: g && h
    transitions:
      # check chains support boolean operations which evaluate multiple instances
    - check: "transition && !(a || b)"
      # multiple trigger instances can be used also
      trigger: t && u
      # the following state after passing checks and executing triggers
      next: operational

Chains can be undefined or empty. Trigger and notification chains are configured by specifying the desired instance names separated by &&, e.g. alter && othertriggerplugin. Check chains are built using boolean expressions, e.g. transition && !(a || b).

To attach a maintenance profile to a node, the label cloud.sap/maintenance-profile=NAME has to be assigned the desired profile name. If that label is not present on a node, the controller uses the default profile, which does nothing at all. The default profile can be reconfigured by defining it within the config file. Multiple profiles can be assigned to a single node by setting cloud.sap/maintenance-profile=NAME1--NAME2--NAME3--.... These profiles are then executed concurrently, with the only constraint being that only one profile can be in-maintenance at any point in time. That way, specific maintenance workflows for different causes can be implemented. The controller's state is tracked with the cloud.sap/maintenance-state label and the cloud.sap/maintenance-data annotation.
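
For illustration, a node to which two of the profiles from this README are attached could carry labels like the following (the node name is made up; the state label is written by the controller and shown only for completeness):

apiVersion: v1
kind: Node
metadata:
  name: example-node
  labels:
    # run the someprofile and flatcar workflows concurrently
    cloud.sap/maintenance-profile: someprofile--flatcar
    # managed by the controller
    cloud.sap/maintenance-state: operational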

Check Plugins

hasAnnotation: Checks if a node has an annotation with the given key. Optionally asserts the annotation value.

config:
  key: the annotation key, required
  value: the expected annotation value, if empty only the key is checked, optional

hasLabel: Checks if a node has a label with the given key. Optionally asserts the label's value.

config:
  key: the label key, required
  value: the expected label value, if empty only the key is checked, optional

anyLabel: Checks that at least one node in the cluster has a label with the given key. Optionally asserts that the label must match a certain value.

config:
  key: the label key, required
  value: the expected label value, if empty only the key is checked, optional

clusterSemver: Checks if the value of a label containing a semantic version is less than the most up-to-date value in the cluster. Requires the checked node to have the specified label.

config:
  key: a valid label key, required
  profileScoped: do not check against the whole cluster, but only against nodes that match the current profile, optional
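
A hedged example instance, assuming nodes expose their OS version in a custom label (the label key and instance name are made up), placed under instances.check:

check:
- type: clusterSemver
  name: outdated_os
  config:
    # only nodes whose version lags behind the cluster maximum pass
    key: example.com/os-version
    profileScoped: true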

condition: Checks if a node condition has the defined status.

config:
  type: the node condition's type (usually one of Ready, MemoryPressure, DiskPressure, PIDPressure or NetworkUnavailable), required
  status: the expected condition status (usually one of True, False or Unknown), required

kubernikusCount: Checks that the node count on the Kubernetes API is greater than or equal to the node count specified on the Kubernikus API.

config:
  cluster: Kubernikus cluster name, required

maxMaintenance: Checks that less than the specified number of nodes are in the in-maintenance state. Due to the API server's optimistic concurrency control, this check might return a wrong result if more than one node is reconciled at any given time.

config:
  max: the limit of nodes that are in-maintenance, required
  profile: if set, only considers nodes that have the specified profile, optional
  skipAfter: if set, only considers nodes for which the time since the last transition does not exceed the specified duration, optional
  groupLabel: if set, only considers nodes that have the same value for the specified label, optional
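
As a sketch, an instance that allows at most one node per availability zone to be in maintenance for a given profile could look like this (instance name and label value are illustrative):

check:
- type: maxMaintenance
  name: one_per_zone
  config:
    max: 1
    profile: someprofile
    # nodes are grouped by the well-known zone label
    groupLabel: topology.kubernetes.io/zone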

prometheusInstant: Checks that the most recent value of a prometheus query satisfies a given expression.

config:
  url: prometheus url
  query: a prometheus query that yields a vector with exactly one value
  expr: comparison where 'value' is fetched from prometheus, e.g. 'value <= 1'
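
A minimal sketch, assuming a reachable Prometheus instance (URL, query and instance name are placeholders):

check:
- type: prometheusInstant
  name: error_rate_low
  config:
    url: http://prometheus.example.svc:9090
    # must return a vector with exactly one value
    query: sum(rate(http_requests_failed_total[5m]))
    expr: "value <= 1"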

stagger: Checks that a certain duration has passed since a previous node passed this check. This is implemented with coordination.k8s.io/Leases, which need to be removed manually when the maintenance controller is removed from a cluster.

config:
  duration: the duration to await according to the rules of Golang's time.ParseDuration(), required
  leaseName: name prefix of the lease, required
  leaseNamespace: namespace of the lease, required
  parallel: the amount of leases to use, optional (defaults to 1)
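
For example, to let at most two nodes proceed every three hours (instance name, lease name and namespace are illustrative):

check:
- type: stagger
  name: stagger_3h
  config:
    duration: 3h
    leaseName: maintenance-stagger
    leaseNamespace: kube-system
    parallel: 2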

timeWindow: Checks if the current system time is within the specified weekly UTC time window.

config:
  start: the time window's start time in "hh:mm" format, required
  end: the time window's end time in "hh:mm" format, required
  weekdays: weekdays when the time window is valid as array, e.g. [monday, tuesday, wednesday, thursday, friday, saturday, sunday], required
  exclude: month/day combinations as array, when maintenances are not allowed to occur, e.g. ["Dec 24", "Oct 31"], optional
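
For example, to allow maintenances only during office hours on workdays, except on two blocked dates (the values are illustrative):

check:
- type: timeWindow
  name: office_hours
  config:
    start: "08:00"
    end: "16:00"
    weekdays: [monday, tuesday, wednesday, thursday, friday]
    exclude: ["Dec 24", "Dec 31"]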

wait: Checks if a certain duration has passed since the last state transition.

config:
  duration: a duration according to the rules of Golang's time.ParseDuration(), required

waitExclude: Checks if a certain duration has passed since the last state transition, while time does not progress on excluded days. This is likely to have some inaccuracies (e.g. around leap seconds) due to the involved math.

config:
  duration: a duration according to the rules of Golang's time.ParseDuration(), required
  exclude: weekdays when the time does not progress, e.g. [monday, tuesday, wednesday, thursday, friday, saturday, sunday], required

affinity: Pods are rescheduled when a node is drained. While maintaining a whole cluster, it is possible that pods are rescheduled onto nodes which are subject to another drain soon. This effect can be reduced by specifying a preferred node affinity towards nodes in the operational state. The affinity check plugin prefers to send nodes into maintenance which do not have pods matching exactly the node affinity below, so nodes with non-critical pods are maintained first to provide operational nodes for critical workloads. This is not perfect, because nodes enter the maintenance-required state over a certain duration, but better than ignoring such scheduling issues entirely. An instance of this check plugin can only be used for the maintenance-required state.

config:
  minOperational: minimum count of nodes that need to be operational in the same profile to pass, ignoring the logic outlined above, setting it to zero (default) disables this, optional
nodeAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - weight: 1 # the weight is not relevant
    preference: # the preference has to match
      matchExpressions:
      - key: cloud.sap/maintenance-state
        operator: In
        values:
        - operational

nodeCount: Checks if the cluster has at least the specified number of nodes.

config:
  count: the minimum number of nodes that must be present

Notification Plugins

mail: Sends an e-mail

config:
  auth: boolean value, which defines if the plugin should use plain auth or no auth at all, required
  address: address of the smtp server with port, required
  from: e-mail address of the sender, required
  identity: the identity used for authentication against the smtp server, optional
  subject: the subject of the mail
  message: the content of the mail, this supports golang templating e.g. {{ .State }} to get the current state as string or {{ .Node }} to access the node object, required
  password: the password used for authentication against the smtp server, optional
  to: array of recipients, required
  user: the user used for authentication against the smtp server, optional
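
A sketch of a mail instance (server, addresses and wording are placeholders; the credentials use the environment variable interpolation described in the Format section):

notify:
- type: mail
  name: ops_mail
  config:
    auth: true
    address: "smtp.example.com:587"
    from: maintenance@example.com
    to: ["ops@example.com"]
    subject: node maintenance update
    message: "Node {{ .Node.Name }} is now in state {{ .State }}."
    user: ${SMTP_USER}
    password: ${SMTP_PASSWORD}
  schedule:
    type: periodic
    config:
      interval: 24h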

slack: Sends a slack message

config:
  hook: an incoming slack webhook, required
  channel: the channel which the message should be sent to, required
  message: the content of the slack message, this supports golang templating e.g. {{ .State }} to get the current state as string or {{ .Node }} to access the node object, required

slackThread: Sends slack messages and groups them in a thread if the given lease has not expired

config:
  token: slack api token, required
  channel: the channel which the message should be sent to, required
  title: the content of the main slack message, this supports golang templating e.g. {{ .State }} to get the current state as string or {{ .Node }} to access the node object, required
  message: the content of the slack replies, this supports golang templating e.g. {{ .State }} to get the current state as string or {{ .Node }} to access the node object, required
  leaseName: name of the lease, required
  leaseNamespace: namespace of the lease, required
  period: after which period a new thread should be started, required
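
A sketch of a slackThread instance that starts a new thread at most once a day (token, channel and lease details are placeholders):

notify:
- type: slackThread
  name: thread_updates
  config:
    token: ${SLACK_TOKEN}
    channel: "#maintenance"
    title: "Maintenance of node {{ .Node.Name }}"
    message: "{{ .Node.Name }} is now in state {{ .State }}."
    leaseName: slack-thread
    leaseNamespace: kube-system
    period: 24h
  schedule:
    type: periodic
    config:
      interval: 1h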

One can get the current profile in a template using {{ .Profile.Current }}. Be careful about using it in an instance that is invoked during the operational state, as all profiles attached to a node are considered for notification. {{ .Profile.Last }} can be used instead, which refers to the profile that caused the last state transition.
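
For example, a message template could reference the causing profile like this (the wording is made up):

  message: "Maintenance on {{ .Node.Name }} was caused by profile {{ .Profile.Last }}."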

Notification Schedules

oneshot: Notifies a single time after a state change, once the configured delay has passed.

type: oneshot
config:
  delay: a duration according to the rules of Golang's time.ParseDuration(), defaults to 0, optional

periodic: Notifies after a state change and whenever the specified interval has passed since the last notification, as long as the node is currently not in the operational state. This reflects the old implicit notification behavior.

type: periodic
config:
  interval: a duration according to the rules of Golang's time.ParseDuration(), required

scheduled: Notifies at a certain time only on specified weekdays.

type: scheduled
config:
  instant: the point in time, when the notification should be sent, "hh:mm" format, required
  weekdays: weekdays when notification should be sent, e.g. [monday, tuesday, wednesday, thursday, friday, saturday, sunday], required
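
For example, to send a reminder on Monday and Thursday mornings, a slack instance could use a scheduled schedule like this (the values are illustrative):

notify:
- type: slack
  name: weekly_reminder
  config:
    hook: slack-webhook
    channel: "#maintenance"
    message: "Node {{ .Node.Name }} is still in state {{ .State }}."
  schedule:
    type: scheduled
    config:
      instant: "09:00"
      weekdays: [monday, thursday]
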
Trigger Plugins

alterAnnotation: Adds, changes or removes an annotation

config:
  key: the annotation's key, required
  value: the value to set, optional
  remove: boolean value, if true the annotation is removed, if false the annotation is added or changed, optional

alterLabel: Adds, changes or removes a label

config:
  key: the label's key, required
  value: the value to set, optional
  remove: boolean value, if true the label is removed, if false the label is added or changed, optional

Additional integrations

  • Support for VMware ESX maintenances
  • Support for Kubernikus
  • The maintenance controller exports several Prometheus metrics, most notably
    • maintenance_controller_shuffle_count: Counts pods in DaemonSets, Deployments and StatefulSets that were likely shuffled by a node sent into maintenance
    • maintenance_controller_shuffles_per_replica: Count of pods in DaemonSets, Deployments and StatefulSets that were likely shuffled by a node sent into maintenance, divided by the replica count when the event occurred

Example configuration for flatcar update agents

This example requires that the Flatcar-Linux-Update-Agent is present on the nodes.

intervals:
  requeue: 60s
  notify: 5h
instances:
  notify:
  - type: slack
    name: approval_required
    config:
      hook: Your hook
      channel: Your channel
      message: |
        The node {{ .Node.Name }} requires maintenance. Manual approval is required.
        Approve to drain and reboot this node by running:
        `kubectl annotate node {{ .Node.Name }} cloud.sap/maintenance-approved=true`
    schedule:
      type: periodic
      config:
        interval: 24h
  - type: slack
    name: maintenance_started
    config:
      hook: Your hook
      channel: Your channel
      message: |
        Maintenance for node {{ .Node.Name }} has started.
    schedule:
      type: periodic
      config:
        interval: 24h
  check:
  - type: hasAnnotation
    name: reboot_needed
    config:
      key: flatcar-linux-update.v1.flatcar-linux.net/reboot-needed
      value: "true"
  - type: hasAnnotation
    name: check_approval
    config:
      key: cloud.sap/maintenance-approved
      value: "true"
  trigger:
  - type: alterAnnotation
    name: reboot-ok
    config:
      key: flatcar-linux-update.v1.flatcar-linux.net/reboot-ok
      value: "true"
  - type: alterAnnotation
    name: remove_approval
    config:
      key: cloud.sap/maintenance-approved
      remove: true
  - type: alterAnnotation
    name: remove_reboot_ok
    config:
      key: flatcar-linux-update.v1.flatcar-linux.net/reboot-ok
      remove: true
profiles:
- name: flatcar
  operational:
    transitions:
    - check: reboot_needed
      next: maintenance-required
  maintenance-required:
    notify: approval_required
    transitions:
    - check: check_approval
      trigger: remove_approval && reboot-ok
      next: in-maintenance
  in-maintenance:
    notify: maintenance_started
    transitions:
    - check: "!reboot_needed"
      trigger: remove_reboot_ok
      next: operational
