openmodelz

Version: v0.0.27 (latest) | Published: Sep 27, 2023 | License: Apache-2.0

Repository

github.com/tensorchord/openmodelz

OpenModelZ

What is OpenModelZ?

OpenModelZ (mdz) is a tool to deploy your models to any cluster (GCP, AWS, Lambda Labs, your home lab, or even a single machine).

Getting models into production is hard for data scientists and SREs. You need to configure the monitoring, logging, and scaling infrastructure with the right security and permissions, then set up the domain, SSL, and load balancer. This can take weeks or months of work, even for a single model deployment.

You can now use mdz deploy to effortlessly deploy your models. OpenModelZ handles all the infrastructure setup for you. Each deployment gets a public subdomain, like http://jupyter-9pnxd.2.242.22.143.modelz.live, making it easily accessible.

Benefits

OpenModelZ provides the following features out of the box:

📈 Auto-scaling from 0: The number of inference servers scales with the workload. You can start from 0 and scale up to 10+ replicas easily.
📦 Support for any machine learning framework: You can deploy any machine learning framework (e.g. vLLM, triton-inference-server, mosec) with a single command. You can also deploy your own custom inference server.
🔬 Gradio/Streamlit/Jupyter support: We provide a robust prototyping environment with support for Gradio, Streamlit, Jupyter, and so on. You can visualize your model's performance and debug it easily in a notebook, or deploy a web app for your model with a single command.
🏃 Start from a single machine and grow to a cluster: You can start from a single machine and scale up to a cluster of machines without any hassle, with a single command: mdz server start.
🚀 Publicly accessible subdomain for each deployment (optional): We provision a separate subdomain for each deployment at no extra cost or effort, making each deployment easily accessible from the outside.

OpenModelZ is the foundational component of the ModelZ platform available at modelz.ai.

How it works

Get a server (could be a cloud VM, a home lab, or even a single machine) and run the mdz server start command. OpenModelZ will bootstrap the server for you.

$ mdz server start
🚧 Creating the server...
🚧 Initializing the load balancer...
🚧 Initializing the GPU resource...
🚧 Initializing the server...
🚧 Waiting for the server to be ready...
🐋 Checking if the server is running...
🐳 The server is running at http://146.235.213.84.modelz.live
🎉 You could set the environment variable to get started!

export MDZ_URL=http://146.235.213.84.modelz.live
$ export MDZ_URL=http://146.235.213.84.modelz.live

Then you can deploy your model with a single command, mdz deploy, and get the endpoint:

$ mdz deploy --image modelzai/gradio-stable-diffusion:23.03 --name sdw --port 7860 --gpu 1
Inference sdw is created
$ mdz list
 NAME  ENDPOINT                                                 STATUS  INVOCATIONS  REPLICAS 
 sdw   http://sdw-qh2n0y28ybqc36oc.146.235.213.84.modelz.live   Ready           174  1/1      
       http://146.235.213.84.modelz.live/inference/sdw.default

Quick Start 🚀

Install `mdz`

You can install OpenModelZ using the following command:

pip install openmodelz

You can verify the installation by running the following command:

mdz

Once you've installed mdz, you can start deploying models and experimenting with them.

Bootstrap `mdz`

Bootstrapping the mdz server is straightforward. Find a server (a cloud VM, a home lab, or even a single machine) and run the mdz server start command.

Note: Root permission may be required to bootstrap the mdz server on port 80.

$ mdz server start
🚧 Creating the server...
🚧 Initializing the load balancer...
🚧 Initializing the GPU resource...
🚧 Initializing the server...
🚧 Waiting for the server to be ready...
🐋 Checking if the server is running...
Agent:
 Version:       v0.0.13
 Build Date:    2023-07-19T09:12:55Z
 Git Commit:    84d0171640453e9272f78a63e621392e93ef6bbb
 Git State:     clean
 Go Version:    go1.19.10
 Compiler:      gc
 Platform:      linux/amd64
🐳 The server is running at http://192.168.71.93.modelz.live
🎉 You could set the environment variable to get started!

export MDZ_URL=http://192.168.71.93.modelz.live

The internal IP address is used as the default endpoint of your deployments. To make them accessible from the outside world, pass the public IP address of your server to the mdz server start command.

# Provide the public IP as an argument
$ mdz server start 1.2.3.4

You can also specify a registry mirror to speed up image pulls. Here is an example:

$ mdz server start --mirror-endpoints https://docker.mirrors.sjtug.sjtu.edu.cn

Create your first UI-based deployment

Once you've bootstrapped the mdz server, you can start deploying your first application. This tutorial uses Jupyter Notebook as an example, but you can use any Docker image as your deployment.

$ mdz deploy --image jupyter/minimal-notebook:lab-4.0.3 --name jupyter --port 8888 --command "jupyter notebook --ip='*' --NotebookApp.token='' --NotebookApp.password=''"
Inference jupyter is created
$ mdz list
 NAME     ENDPOINT                                                   STATUS  INVOCATIONS  REPLICAS
 jupyter  http://jupyter-9pnxdkeb6jsfqkmq.192.168.71.93.modelz.live  Ready           488  1/1
          http://192.168.71.93/inference/jupyter.default

You can access the deployment by visiting the endpoint URL. The endpoint is generated automatically for each deployment in the following format: <name>-<random-string>.<ip>.modelz.live.

In this case it is http://jupyter-9pnxdkeb6jsfqkmq.192.168.71.93.modelz.live. The endpoint can also be accessed from the outside world if you provided the public IP address of your server to the mdz server start command.
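
As an illustration of the endpoint format above, the following sketch assembles a subdomain endpoint from its parts. The helpers `random_suffix` and `build_endpoint` are hypothetical and not part of mdz. How mdz actually generates the random string is not documented here, so the 16-character lowercase alphanumeric suffix is only an assumption based on the sample endpoints.

```python
import secrets
import string
from typing import Optional

# Hypothetical helpers that mirror the documented pattern
# <name>-<random-string>.<ip>.modelz.live; they are not part of mdz.
ALPHABET = string.ascii_lowercase + string.digits


def random_suffix(length: int = 16) -> str:
    """Return a random lowercase alphanumeric string (assumed format)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def build_endpoint(name: str, ip: str, suffix: Optional[str] = None,
                   domain: str = "modelz.live") -> str:
    """Assemble http://<name>-<random-string>.<ip>.<domain>."""
    if suffix is None:
        suffix = random_suffix()
    return f"http://{name}-{suffix}.{ip}.{domain}"


if __name__ == "__main__":
    # Reproduces the endpoint shown in the mdz list output above.
    print(build_endpoint("jupyter", "192.168.71.93", "9pnxdkeb6jsfqkmq"))
```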

Create your first OpenAI compatible API server

You can also create API-based deployments. This tutorial uses an OpenAI-compatible API server with Bloomz 560M as an example.

$ mdz deploy --image modelzai/llm-bloomz-560m:23.07.4 --name simple-server
Inference simple-server is created
$ mdz list
 NAME           ENDPOINT                                                         STATUS  INVOCATIONS  REPLICAS 
 jupyter        http://jupyter-9pnxdkeb6jsfqkmq.192.168.71.93.modelz.live        Ready           488  1/1      
                http://192.168.71.93/inference/jupyter.default                                                 
 simple-server  http://simple-server-lagn8m9m8648q6kx.192.168.71.93.modelz.live  Ready             0  1/1      
                http://192.168.71.93/inference/simple-server.default

You can use the OpenAI Python package with the endpoint (http://simple-server-lagn8m9m8648q6kx.192.168.71.93.modelz.live in this case) to interact with the deployment.

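# Requires openai<1.0: openai.ChatCompletion was removed in openai 1.0.
import openai

# Point the client at the deployment endpoint; the key is a placeholder.
openai.api_base = "http://simple-server-lagn8m9m8648q6kx.192.168.71.93.modelz.live"
openai.api_key = "any"

# Create a chat completion from a short conversation history.
chat_completion = openai.ChatCompletion.create(model="bloomz", messages=[
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "I am a student"},
    {"role": "user", "content": "What do you learn?"},
], max_tokens=100)

# Print the full response returned by the server.
print(chat_completion)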
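
For clients that cannot use the openai package, the same call can be expressed as a plain HTTP request. This is only a sketch: it assumes the deployment accepts the `POST <api_base>/chat/completions` request that the legacy openai package sends, and the helper name `build_chat_request` is hypothetical.

```python
import json
import urllib.request

# Assumption: the server exposes POST <base>/chat/completions, which is the
# path the legacy openai package appends to openai.api_base.
API_BASE = "http://simple-server-lagn8m9m8648q6kx.192.168.71.93.modelz.live"


def build_chat_request(api_base, messages, model="bloomz", max_tokens=100):
    """Build (but do not send) a chat completion request."""
    payload = {"model": model, "messages": messages, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{api_base.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer any",  # placeholder key, as above
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_chat_request(
        API_BASE, [{"role": "user", "content": "Who are you?"}]
    )
    # Sending the request needs a running deployment at API_BASE.
    with urllib.request.urlopen(request, timeout=30) as response:
        print(json.load(response))
```

Keeping request construction separate from sending makes the URL and payload easy to inspect before any network traffic happens.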

Scale your deployment

You can scale your deployment with the mdz scale command.

$ mdz scale simple-server --replicas 3

Requests are load balanced across the replicas of your deployment.

You can also tell mdz to autoscale your deployment based on the number of in-flight requests. Please check out the Autoscaling documentation for more details.
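
To make the idea of scaling on in-flight requests concrete, here is a small sketch of one possible rule. It is not the OpenModelZ autoscaler's actual algorithm: the function name `desired_replicas`, the per-replica target, and the replica bounds are hypothetical parameters chosen for illustration.

```python
import math


def desired_replicas(inflight: int, target_per_replica: int,
                     min_replicas: int = 0, max_replicas: int = 10) -> int:
    """Illustrative rule: use enough replicas to keep each one at or below
    target_per_replica in-flight requests, clamped to the allowed range."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(inflight / target_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # With no traffic the rule allows scaling down to zero replicas.
    for load in (0, 1, 25, 500):
        print(load, desired_replicas(load, target_per_replica=10))
```

With a lower bound of 0, an idle deployment needs no replicas at all, which matches the "auto-scaling from 0" behaviour described earlier.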

Debug your deployment

Sometimes you may want to debug your deployment. You can use the mdz logs command to get its logs.

$ mdz logs simple-server
simple-server-6756dd67ff-4bf4g: 10.42.0.1 - - [27/Jul/2023 02:32:16] "GET / HTTP/1.1" 200 -
simple-server-6756dd67ff-4bf4g: 10.42.0.1 - - [27/Jul/2023 02:32:16] "GET / HTTP/1.1" 200 -
simple-server-6756dd67ff-4bf4g: 10.42.0.1 - - [27/Jul/2023 02:32:17] "GET / HTTP/1.1" 200 -
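
When the log volume grows, a short script can summarize it. The sketch below counts HTTP status codes per replica. It assumes every line has the same shape as the sample output above, and `count_statuses` is a hypothetical helper, not an mdz feature.

```python
import re
from collections import Counter

# Matches lines shaped like the sample output above:
# <pod>: <client> - - [<time>] "<request>" <status> -
LINE = re.compile(r'^(?P<pod>\S+): .*"\s(?P<status>\d{3})\s')

SAMPLE = [
    'simple-server-6756dd67ff-4bf4g: 10.42.0.1 - - [27/Jul/2023 02:32:16] "GET / HTTP/1.1" 200 -',
    'simple-server-6756dd67ff-4bf4g: 10.42.0.1 - - [27/Jul/2023 02:32:17] "GET / HTTP/1.1" 200 -',
]


def count_statuses(lines):
    """Count (pod, HTTP status) pairs; lines in other formats are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match:
            counts[(match["pod"], int(match["status"]))] += 1
    return counts


if __name__ == "__main__":
    for (pod, status), total in count_statuses(SAMPLE).items():
        print(pod, status, total)
```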

You can also use the mdz exec command to run a command inside the container of your deployment, without having to SSH into the server.

$ mdz exec simple-server ps
PID   USER     TIME   COMMAND
    1 root       0:00 /usr/bin/dumb-init /bin/sh -c python3 -m http.server 80
    7 root       0:00 /bin/sh -c python3 -m http.server 80
    8 root       0:00 python3 -m http.server 80
    9 root       0:00 ps

$ mdz exec simple-server -ti bash
bash-4.4#

Alternatively, you can port-forward the deployment to your local machine and debug it locally.

$ mdz port-forward simple-server 7860
Forwarding inference simple-server to local port 7860

Add more servers

You can add more servers to your cluster with the mdz server join command. The mdz server is bootstrapped on the new server, which then joins the cluster automatically.

$ mdz server join <internal ip address of the previous server>
$ mdz server list
 NAME   PHASE  ALLOCATABLE      CAPACITY        
 node1  Ready  cpu: 16          cpu: 16         
               mem: 32784748Ki  mem: 32784748Ki 
               gpu: 1           gpu: 1      
 node2  Ready  cpu: 16          cpu: 16         
               mem: 32784748Ki  mem: 32784748Ki 
               gpu: 1           gpu: 1

Label your servers

You can label your servers so that models are deployed only to specific servers. For example, you can label servers with gpu=true and deploy your models only to servers that have GPUs.

$ mdz server label node3 gpu=true type=nvidia-a100
$ mdz deploy ... --node-labels gpu=true,type=nvidia-a100
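
The following sketch shows how a comma-separated label list like the one above could select servers. It assumes that a server qualifies only when every requested key=value pair matches, as with Kubernetes node selectors; whether mdz schedules exactly this way is an assumption. `parse_labels` and `matching_nodes` are hypothetical helpers.

```python
def parse_labels(spec: str) -> dict:
    """Parse 'k1=v1,k2=v2' into a dict of label keys and values."""
    labels = {}
    for item in spec.split(","):
        key, sep, value = item.strip().partition("=")
        if not sep or not key:
            raise ValueError(f"invalid label: {item!r}")
        labels[key] = value
    return labels


def matching_nodes(nodes: dict, required: dict) -> list:
    """Return sorted names of nodes whose labels include every required pair."""
    return sorted(
        name for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in required.items())
    )


if __name__ == "__main__":
    # Hypothetical cluster: only node3 carries both requested labels.
    cluster = {
        "node1": {"gpu": "true"},
        "node2": {},
        "node3": {"gpu": "true", "type": "nvidia-a100"},
    }
    print(matching_nodes(cluster, parse_labels("gpu=true,type=nvidia-a100")))
```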

Architecture

OpenModelZ is inspired by k3s and OpenFaaS, but is designed specifically for machine learning deployment. We keep the core of the system simple and easy to extend.

You do not need to read this section if you just want to deploy your models. If you want to understand how OpenModelZ works, this section is for you.

OpenModelZ is composed of two components:

Data Plane: The data plane is responsible for the servers. You can use mdz server to manage them. The data plane is designed to be stateless and scalable, so you can scale it easily by adding more servers to the cluster. It uses k3s under the hood to support VMs, bare metal, and (in the future) IoT devices. You can also deploy OpenModelZ on an existing Kubernetes cluster.
Control Plane: The control plane is responsible for the deployments. It manages the deployments and the underlying resources.

The load balancer routes each request to the inference servers, and the autoscaler scales the number of inference servers based on the workload. By default we provide the domain *.modelz.live, backed by a wildcard DNS server, to support a publicly accessible subdomain for each deployment. You can also use your own domain.
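
An sslip.io-style wildcard DNS server answers a query for a hostname with the IP address embedded in that hostname. The sketch below imitates only that lookup rule for dotted IPv4 names. It is not the code behind modelz.live, and `embedded_ip` is a hypothetical helper.

```python
import ipaddress
from typing import Optional


def embedded_ip(hostname: str, domain: str = "modelz.live") -> Optional[str]:
    """Return the IPv4 address embedded just before `domain`, or None."""
    suffix = "." + domain
    if not hostname.endswith(suffix):
        return None
    labels = hostname[: -len(suffix)].split(".")
    if len(labels) < 4:
        return None
    # The four labels right before the domain form the dotted IPv4 address.
    candidate = ".".join(labels[-4:])
    try:
        return str(ipaddress.IPv4Address(candidate))
    except ipaddress.AddressValueError:
        return None


if __name__ == "__main__":
    print(embedded_ip("jupyter-9pnxdkeb6jsfqkmq.192.168.71.93.modelz.live"))
    print(embedded_ip("146.235.213.84.modelz.live"))
```

Because the address is carried in the name itself, every deployment subdomain resolves to the right server without any per-deployment DNS records.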

See the architecture documentation for more details.

Roadmap 🗂️

Please check out the ROADMAP.

Contribute 😊

We welcome all kinds of contributions from the open-source community, individuals, and partners.

Join our discord community!

Contributors ✨

Ce Gao 💻 👀 ✅
Jinjing Zhou 💬 🐛 🤔
Keming 💻 🎨 🚇
Nadeshiko Manju 🐛 🎨 🤔
Teddy Xinyuan Chen 📖
Wei Zhang 💻
Xuanwo 🖋 🎨 🤔
cutecutecat 🤔
xieydd 🤔

Acknowledgements 🙏

K3s for the single control-plane binary and process.
OpenFaaS for their work on serverless function services. It laid the foundation for OpenModelZ.
sslip.io for the wildcard DNS service. It makes it possible to access the server from the outside world without any setup.

Directories ¶

Path	Synopsis
agent module
api/types
client
cmd/agent
errdefs	Package errdefs defines a set of error interfaces that packages should use for communicating classes of errors.
pkg/app
pkg/config
pkg/consts
pkg/docs	Package docs GENERATED BY SWAG; DO NOT EDIT This file was generated by swaggo/swag
pkg/event
pkg/k8s
pkg/log
pkg/metrics
pkg/prom
pkg/query
pkg/query/mock	Package mock is a generated GoMock package.
pkg/runtime
pkg/runtime/mock	Package mock is a generated GoMock package.
pkg/scaling
pkg/server
pkg/server/static
pkg/server/validator
pkg/version
autoscaler
cmd/autoscaler
pkg/autoscaler
pkg/autoscalerapp
pkg/prom
pkg/server
pkg/version
ingress-operator
cmd/ingress-operator
pkg/apis/modelzetes
pkg/apis/modelzetes/v1	Package v1 is the OpenFaaS v1 version of the API.
pkg/app
pkg/client/clientset/versioned	This package has the automatically generated clientset.
pkg/client/clientset/versioned/fake	This package has the automatically generated fake clientset.
pkg/client/clientset/versioned/scheme	This package contains the scheme of the automatically generated clientset.
pkg/client/clientset/versioned/typed/modelzetes/v1	This package has the automatically generated typed clients.
pkg/client/clientset/versioned/typed/modelzetes/v1/fake	Package fake has the automatically generated clients.
pkg/client/informers/externalversions
pkg/client/informers/externalversions/internalinterfaces
pkg/client/informers/externalversions/modelzetes
pkg/client/informers/externalversions/modelzetes/v1
pkg/client/listers/modelzetes/v1
pkg/config
pkg/consts
pkg/controller
pkg/controller/v1
pkg/signals
pkg/version
mdz module
cmd/mdz
hack/cli-doc-gen
pkg/agentd/runtime
pkg/agentd/server
pkg/cmd
pkg/cmd/ioutils
pkg/cmd/streams
pkg/server
pkg/telemetry
pkg/term
pkg/version
modelzetes
cmd/modelzetes
pkg/apis/modelzetes
pkg/apis/modelzetes/v2alpha1	Package v2alpha1 is the modelzetes API.
pkg/app
pkg/client/clientset/versioned	This package has the automatically generated clientset.
pkg/client/clientset/versioned/fake	This package has the automatically generated fake clientset.
pkg/client/clientset/versioned/scheme	This package contains the scheme of the automatically generated clientset.
pkg/client/clientset/versioned/typed/modelzetes/v2alpha1	This package has the automatically generated typed clients.
pkg/client/clientset/versioned/typed/modelzetes/v2alpha1/fake	Package fake has the automatically generated clients.
pkg/client/informers/externalversions
pkg/client/informers/externalversions/internalinterfaces
pkg/client/informers/externalversions/modelzetes
pkg/client/informers/externalversions/modelzetes/v2alpha1
pkg/client/listers/modelzetes/v2alpha1
pkg/config
pkg/consts
pkg/controller
pkg/k8s
pkg/pointer
pkg/signals
pkg/version
