operator

command module
v0.0.0-...-2955ef2 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 27, 2024 License: Apache-2.0, BSD-3-Clause Imports: 12 Imported by: 0

README

ElasticJob Operator

ElasticJob can scale resources in and out for distributed deep learning, including the number of nodes and the CPU and memory of each node.

Description

The operator contains two CRDs, elasticjob and scaleplan. Users don't need to set replica resources when they apply an elasticjob to train on a cluster. The elasticjob controller creates an EasyDL master Pod for each elasticjob. The master then generates a scaleplan with PS/worker resources, which notifies the controller to launch Pods for the training.
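To make the flow from scaleplan to Pods concrete, here is a minimal sketch in Go. The ReplicaResource and ScalePlan types, the PodNames helper, and the "<job>-<role>-<index>" naming scheme are all hypothetical stand-ins for illustration; they are not the operator's actual v1alpha1 API types.

```go
package main

import (
	"fmt"
	"sort"
)

// ReplicaResource is a hypothetical stand-in for the per-role resources a
// master might put in a scaleplan. It is not the operator's real API type.
type ReplicaResource struct {
	Replicas int
	CPU      string // assumed format, e.g. "4"
	Memory   string // assumed format, e.g. "8Gi"
}

// ScalePlan is a hypothetical, simplified plan keyed by role ("ps", "worker").
type ScalePlan struct {
	JobName  string
	Replicas map[string]ReplicaResource
}

// PodNames lists the Pods that would need to exist to satisfy the plan,
// using an assumed "<job>-<role>-<index>" naming scheme.
func PodNames(plan ScalePlan) []string {
	roles := make([]string, 0, len(plan.Replicas))
	for role := range plan.Replicas {
		roles = append(roles, role)
	}
	sort.Strings(roles) // map iteration order is random, so sort for stable output

	var names []string
	for _, role := range roles {
		for i := 0; i < plan.Replicas[role].Replicas; i++ {
			names = append(names, fmt.Sprintf("%s-%s-%d", plan.JobName, role, i))
		}
	}
	return names
}

func main() {
	plan := ScalePlan{
		JobName: "demo",
		Replicas: map[string]ReplicaResource{
			"ps":     {Replicas: 1, CPU: "2", Memory: "4Gi"},
			"worker": {Replicas: 2, CPU: "4", Memory: "8Gi"},
		},
	}
	for _, name := range PodNames(plan) {
		fmt.Println(name)
	}
}
```

The point of the sketch is the division of labor described above: the user's job carries no replica counts, and the plan alone determines which Pods the controller has to launch.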

Getting Started

You’ll need a Kubernetes cluster to run against. You can use KIND to get a local cluster for testing, or run against a remote cluster. Note: Your controller will automatically use the current context in your kubeconfig file (i.e. whatever cluster kubectl cluster-info shows).

Running on the cluster
  1. Install the CRDs:
kubectl apply -f config/crd/bases

To deploy the controller with a released image:

make deploy IMG=easydl/elasticjob-controller:master

To deploy your own build instead:
  2. Build and push your image to the location specified by IMG:
make docker-build docker-push IMG=<some-registry>/operator:tag
  3. Deploy the controller to the cluster with the image specified by IMG:
make deploy IMG=<some-registry>/operator:tag
Uninstall CRDs

To delete the CRDs from the cluster:

make uninstall
Undeploy controller

To undeploy the controller from the cluster:

make undeploy

Contributing

Feel free to submit a pull request to add features or fix bugs.

How it works

This project aims to follow the Kubernetes Operator pattern.

It uses Controllers, which provide a reconcile function responsible for synchronizing resources until the desired state is reached on the cluster.
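The reconcile idea can be illustrated without any Kubernetes dependencies. The sketch below is a simplified model, not the operator's controller: State, Reconcile, and RunUntilSettled are hypothetical names, and the "cluster" is just a map of Pod counts per role. A real controller would create or delete Pods through the API server rather than edit a map.

```go
package main

import "fmt"

// State is a hypothetical stand-in for cluster state: Pod count per role.
type State map[string]int

// Reconcile moves the actual state one step toward the desired state and
// reports whether it changed anything. Roles that appear only in actual are
// left alone in this simplified model.
func Reconcile(desired, actual State) bool {
	changed := false
	for role, want := range desired {
		switch have := actual[role]; {
		case have < want:
			actual[role] = have + 1 // stands in for creating one Pod
			changed = true
		case have > want:
			actual[role] = have - 1 // stands in for deleting one Pod
			changed = true
		}
	}
	return changed
}

// RunUntilSettled calls Reconcile until a pass makes no change or maxPasses
// is reached, and returns the number of passes that made changes.
func RunUntilSettled(desired, actual State, maxPasses int) int {
	passes := 0
	for passes < maxPasses && Reconcile(desired, actual) {
		passes++
	}
	return passes
}

func main() {
	desired := State{"ps": 1, "worker": 3}
	actual := State{"worker": 1}

	passes := RunUntilSettled(desired, actual, 10)
	fmt.Println("passes with changes:", passes)
	fmt.Println("final state:", actual)
}
```

Each pass compares desired and actual state and makes only a small correction, so the loop is safe to repeat: once the states match, further calls to Reconcile do nothing.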

Test It Out
  1. Install the CRDs into the cluster:
make install
  2. Run your controller (this will run in the foreground, so switch to a new terminal if you want to leave it running):
make run

NOTE: You can also run this in one step by running: make install run

Modifying the API definitions

If you are editing the API definitions, generate the manifests such as CRs or CRDs using:

make manifests

Documentation

There is no documentation for this package.

Directories

Path Synopsis
api
v1alpha1
Package v1alpha1 contains API Schema definitions for the elastic v1alpha1 API group +kubebuilder:object:generate=true +groupName=elastic.iml.github.io
pkg
