k8s-countermeasures

command module
v0.0.0-...-f8f77fb
Published: Nov 17, 2023 License: Apache-2.0 Imports: 24 Imported by: 0

README

Kubernetes CounterMeasures

Project status: alpha. Not all planned features are completed. The API, spec, status and other user-facing objects may change, but in a backward-compatible way.

Packaging scripts and deployment instructions are still in progress, and contributors are welcome.

TL;DR

A Kubernetes Operator that automates manual actions, normally documented in application runbooks and executed by Ops or SRE staff, in reaction to an application alert. Simple examples include:

  • deleting or restarting a pod on an application error that does not cause the liveness/readiness probes to trigger a restart
  • taking a Java thread dump or enabling a profiler such as async-profiler on a high CPU usage alert

For more detailed examples and use cases see the README in the docs folder.

Overview

This project aims to define an API and controller in Kubernetes to codify project runbooks, allowing for automation of actions that are taken manually when an on-call engineer receives an alert.

For example, imagine a Java application with a runbook stating that, when a high CPU alert is received, the on-call engineer should take a thread dump for analysis. Doing this manually can be difficult, depending on how long the high CPU event lasts, the engineer's availability, and whether the container has the required debug tools.

This project automates the runbook task above with an operator built using the Operator SDK and a few CRDs that define the event to monitor and the actions to take.

The operator allows for deployment of an event source (currently only Prometheus is supported) and a countermeasure that defines one or more actions. The event source publishes events into an internal event bus to be consumed by the countermeasures.
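The publish/consume relationship between an event source and the countermeasures can be pictured with a small in-process bus. This is only a sketch of the general pattern, not the operator's actual implementation: the Event, Bus, Subscribe and Publish names, and the "HighCPU" alert name, are hypothetical and chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical alert notification; the fields are illustrative
// and do not mirror the operator's real types.
type Event struct {
	Name   string
	Labels map[string]string
}

// Bus is a minimal in-process publish/subscribe bus keyed by event name.
type Bus struct {
	mu   sync.RWMutex
	subs map[string][]func(Event)
}

// NewBus returns an empty bus ready for use.
func NewBus() *Bus {
	return &Bus{subs: make(map[string][]func(Event))}
}

// Subscribe registers a handler for events with the given name.
func (b *Bus) Subscribe(name string, h func(Event)) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.subs[name] = append(b.subs[name], h)
}

// Publish delivers the event synchronously to every handler registered
// for its name and returns how many handlers were called.
func (b *Bus) Publish(e Event) int {
	b.mu.RLock()
	handlers := append([]func(Event){}, b.subs[e.Name]...)
	b.mu.RUnlock()
	for _, h := range handlers {
		h(e)
	}
	return len(handlers)
}

func main() {
	bus := NewBus()
	// A countermeasure subscribes to the alert it cares about.
	bus.Subscribe("HighCPU", func(e Event) {
		fmt.Printf("countermeasure triggered for pod %s\n", e.Labels["pod"])
	})
	// An event source publishes the alert when it fires.
	bus.Publish(Event{Name: "HighCPU", Labels: map[string]string{"pod": "app-0"}})
}
```

Delivery here is synchronous for simplicity; a real bus would more likely hand events to handlers asynchronously so that a slow action cannot block the event source.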

Prerequisites

The Kubernetes CounterMeasures Operator uses Ephemeral Containers, which were alpha in Kubernetes 1.22.0, beta in 1.23.0, and stable in >=1.25.0. It is therefore recommended to use version >=1.25.0, although development and testing were done with a Kubernetes cluster of version >=1.23.0.
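As a rough illustration of the version thresholds above, the following sketch classifies a Kubernetes version string by Ephemeral Containers maturity. The function name ephemeralContainerSupport and the returned labels are hypothetical; the operator itself is not known to perform this check.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ephemeralContainerSupport classifies a version string such as "v1.25.3"
// using the thresholds from the README: stable from 1.25, beta from 1.23.
// Anything older is reported as "alpha or unavailable".
func ephemeralContainerSupport(version string) (string, error) {
	parts := strings.SplitN(strings.TrimPrefix(version, "v"), ".", 3)
	if len(parts) < 2 {
		return "", fmt.Errorf("unexpected version %q", version)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return "", fmt.Errorf("bad major version in %q: %w", version, err)
	}
	// Some distributions report the minor version with a trailing "+".
	minor, err := strconv.Atoi(strings.TrimSuffix(parts[1], "+"))
	if err != nil {
		return "", fmt.Errorf("bad minor version in %q: %w", version, err)
	}
	switch {
	case major > 1 || (major == 1 && minor >= 25):
		return "stable", nil
	case major == 1 && minor >= 23:
		return "beta", nil
	default:
		return "alpha or unavailable", nil
	}
}

func main() {
	for _, v := range []string{"v1.22.4", "v1.23.0", "v1.25.3"} {
		level, err := ephemeralContainerSupport(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: %s\n", v, level)
	}
}
```

The version string can be read from the Server Version line of kubectl version.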

CustomResourceDefinitions

A core feature of the Kubernetes CounterMeasures Operator is watching the Kubernetes API server for changes to specific objects, so that your application is monitored for undesirable conditions and, when one is detected, the appropriate actions are taken as a countermeasure. The Operator acts on the following custom resource definitions (CRDs):

  • CounterMeasure, which defines a condition to watch for and actions to take when it occurs.
  • Prometheus, which defines an event source that triggers the countermeasures.

The Kubernetes CounterMeasures Operator automatically detects changes in the Kubernetes API server to any of the above objects and ensures that the monitors are updated.

To learn more about the CRDs introduced by the Kubernetes CounterMeasures Operator, see the documentation.

Dynamic Admission Control

An admission webhook validates custom resources when they are created or updated, including during a dry run.

For more information on this feature, see the user guide.

Quickstart

To quickly try out the Kubernetes CounterMeasures Operator inside a Kind cluster, run the following commands:

./hack/start-cluster.sh
make install
make deploy

To run the Operator outside the cluster, use the following instead of make deploy:

make run

Removal

To remove the operator, first delete any custom resources you created in each namespace.

for n in $(kubectl get namespaces -o jsonpath='{..metadata.name}'); do
  kubectl delete --all --namespace="$n" countermeasure
done

After a couple of minutes you can go ahead and remove the operator itself.

make undeploy
make uninstall

Development

Prerequisites

  • golang environment
  • docker (used for creating container images, etc.)
  • kind (optional)

Testing

Running unit tests

make test

Debugging

To debug the controller locally against a running K8s cluster, add this entry to the /etc/hosts file so that the operator can communicate with Prometheus.

##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting.  Do not change this entry.
##
127.0.0.1 localhost
# Add for k8s-countermeasures debugging
127.0.0.1 prometheus-operated.monitoring.svc 

Then enable port forwarding from the development host to the Prometheus service:

kubectl -n monitoring port-forward service/prometheus-operated 9090:9090
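With the hosts entry and the port forward in place, a quick reachability check can confirm that the service name resolves to the forwarded port. The sketch below assumes Prometheus's standard /-/ready readiness endpoint; the helper names readyURL and checkReady are hypothetical and not part of this project.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// readyURL builds the URL of the Prometheus readiness endpoint (/-/ready)
// for the given host and port.
func readyURL(host string, port int) string {
	return fmt.Sprintf("http://%s:%d/-/ready", host, port)
}

// checkReady returns nil when the endpoint answers with HTTP 200.
func checkReady(client *http.Client, url string) error {
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	return nil
}

func main() {
	client := &http.Client{Timeout: 3 * time.Second}
	// Host name and port match the /etc/hosts entry and port-forward above.
	url := readyURL("prometheus-operated.monitoring.svc", 9090)
	if err := checkReady(client, url); err != nil {
		fmt.Println("Prometheus not reachable:", err)
		return
	}
	fmt.Println("Prometheus reachable at", url)
}
```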

Contributing

Many files (documentation, manifests, ...) in this repository are auto-generated. Before proposing a pull request:

  1. Commit your changes.
  2. Run make generate.
  3. Commit the generated changes.

Security

If you find a security vulnerability related to the Kubernetes CounterMeasures Operator, please do not report it by opening a GitHub issue; instead, send an e-mail to the owner of this project.

Documentation

There is no documentation for this package.

Directories

Path Synopsis
apis
countermeasure/v1alpha1
Package v1alpha1 contains API Schema definitions for the operator v1alpha1 API group +kubebuilder:object:generate=true +groupName=countermeasure.vilaverde.rocks
eventsource/v1alpha1
Package v1alpha1 contains API Schema definitions for the eventsource v1alpha1 API group +kubebuilder:object:generate=true +groupName=eventsource.vilaverde.rocks
controllers
pkg
