trainingjob-operator

module
v0.0.0-...-148db3b Latest
Published: Sep 4, 2020 License: Apache-2.0

TrainingJob Operator README

TrainingJob Operator is designed for EDL (elastic deep learning) and supports multiple frameworks. It provides automatic fault tolerance and flexible pod completion and termination strategies, and has been tested with PaddlePaddle, TensorFlow, and Python.

Getting started

Deploy trainingjob-operator

Trainingjob-operator can be deployed by compiling the source and running the resulting binary:

git clone https://github.com/elasticdeeplearning/trainingjob-operator.git
cd trainingjob-operator/cmd 
go build -o trainingjob-operator
./trainingjob-operator --master ${master_ip}:${port} --v 4 --thread-num 1000 --logtostderr --leader-elect=true --enable-creating-failed=true
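
When the operator is started from a wrapper program rather than typed by hand, the flag list from the run command above can be assembled in code. The following is a minimal sketch, not part of trainingjob-operator: the helper name operatorArgs and the address values are hypothetical, and only the flags shown in the command above are used.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// operatorArgs is a hypothetical helper (not provided by trainingjob-operator).
// It returns the flags from the run command above for a given API server address.
func operatorArgs(masterIP, port string) []string {
	return []string{
		"--master", net.JoinHostPort(masterIP, port),
		"--v", "4",
		"--thread-num", "1000",
		"--logtostderr",
		"--leader-elect=true",
		"--enable-creating-failed=true",
	}
}

func main() {
	// 10.0.0.1 and 8080 are placeholder values, not defaults of the operator.
	args := operatorArgs("10.0.0.1", "8080")
	fmt.Println("./trainingjob-operator " + strings.Join(args, " "))
}
```

The returned slice could be passed to exec.Command from os/exec to launch the compiled binary; this sketch only prints the command line.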

Run an example trainingjob

Submit the trainingjob:

kubectl apply -f https://raw.githubusercontent.com/elasticdeeplearning/trainingjob-operator/master/example/paddle-mnist.yaml

Monitor the status of the trainingjob:

kubectl get aitj
kubectl describe aitj paddle-mnist

Delete the trainingjob:

kubectl delete -f https://raw.githubusercontent.com/elasticdeeplearning/trainingjob-operator/master/example/paddle-mnist.yaml
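
The submit, monitor, and delete steps above can also be expressed as an ordered list of kubectl invocations, which may be convenient for scripting the example. This is a rough sketch under stated assumptions: the helper name exampleSteps is hypothetical, and the program only prints the commands rather than running them.

```go
package main

import (
	"fmt"
	"strings"
)

// manifestURL is the example manifest referenced in the steps above.
const manifestURL = "https://raw.githubusercontent.com/elasticdeeplearning/trainingjob-operator/master/example/paddle-mnist.yaml"

// exampleSteps is a hypothetical helper (not part of trainingjob-operator).
// It returns the kubectl argument lists for the steps above, in order:
// submit, list, describe, delete.
func exampleSteps(manifest, jobName string) [][]string {
	return [][]string{
		{"apply", "-f", manifest},
		{"get", "aitj"},
		{"describe", "aitj", jobName},
		{"delete", "-f", manifest},
	}
}

func main() {
	for _, step := range exampleSteps(manifestURL, "paddle-mnist") {
		fmt.Println("kubectl " + strings.Join(step, " "))
	}
}
```

Each argument list could be handed to exec.Command("kubectl", step...) on a machine where kubectl is configured for the target cluster.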

Directories

Path                                                      Synopsis
cmd
  app
pkg
  apis/aitrainingjob/v1                                   Package v1 is the v1 version of the API.
  client/clientset/versioned                              This package has the automatically generated clientset.
  client/clientset/versioned/fake                         This package has the automatically generated fake clientset.
  client/clientset/versioned/scheme                       This package contains the scheme of the automatically generated clientset.
  client/clientset/versioned/typed/aitrainingjob/v1       This package has the automatically generated typed clients.
  client/clientset/versioned/typed/aitrainingjob/v1/fake  Package fake has the automatically generated clients.