mlserver

Published: Nov 13, 2014 License: Apache-2.0

This is a simple application that provides an HTTP/JSON api for machine learning. Currently, only classification is implemented. The server is written in Go, the machine learning algorithms are Python, from the Scikit-Learn library. Each model is run inside a separate child process. The models fitted in fit.py are pickled using joblib and saved to a folder named models in the working directory of mlserver.

Building/Installing

Make sure you have mercurial and zeromq installed:

Mac/Homebrew

brew install zeromq
brew install mercurial

Ubuntu

sudo apt-get install mercurial
sudo apt-get install libtool autoconf automake uuid-dev build-essential

Fetch and Build ZeroMQ 4

curl -O http://download.zeromq.org/zeromq-4.0.5.tar.gz
tar zxvf zeromq-4.0.5.tar.gz && cd zeromq-4.0.5
./configure
make
sudo make install

If you get error while loading shared libraries: libzmq.so.4 when trying to run mlserver on Ubuntu, try updating the library cache.

sudo ldconfig

Building the app is fairly simple (assuming Go is installed and $GOPATH is set):

go get github.com/wlattner/mlserver

This will clone the repo to $GOPATH/src/github.com/wlattner/mlserver and copy the mlserver binary to $GOPATH/bin.

The code in fit.py and predict.py requires Python 3, NumPy, SciPy, and Scikit-Learn; these are sometimes tricky to install, so see each project's installation docs for help.

Ubuntu

sudo apt-get install build-essential python3-dev python3-setuptools python3-numpy python3-scipy libatlas-dev libatlas3gf-base
sudo apt-get install python3-pip
pip3 install scikit-learn
pip3 install pyzmq

If you modify fit.py or predict.py, run make. These two files are embedded in the Go source as raw string values; make rewrites fit_py.go and predict_py.go using the current versions of fit.py and predict.py.

Running

Start the server:

mlserver

By default, the server will listen on port 5000.

TODO

  • error handling, especially with fit/predict input
  • automatically stop unused models
  • store models in S3
  • add regression, detect which based on input data
  • better model selection in fit.py
  • better project name
  • config options
  • csv file upload for fit/predict input
  • docker container for fit.py and predict.py
  • use kubernetes for fit/predict workers
  • tests

API

Get Models

  • GET /models will return all models on the server
[
  {
    "model_id": "0e12bb73-e49a-4dcd-87aa-cb0338b1c758",
    "metadata": {
      "name": "iris model 1",
      "created_at": "2014-11-06T21:52:16.143688Z"
    },
    "performance": {
      "algorithm": "GradientBoostingClassifier",
      "confusion_matrix": {
        "setosa": {
          "setosa": 50,
          "versicolor": 0,
          "virginica": 0
        },
        "versicolor": {
          "setosa": 0,
          "versicolor": 50,
          "virginica": 0
        },
        "virginica": {
          "setosa": 0,
          "versicolor": 0,
          "virginica": 50
        }
      },
      "score": 0.9673202614379085
    },
    "running": false,
    "trained": true
  },
  {
    "model_id": "26f786c1-5e59-432f-a3b0-8b87025043f8",
    "metadata": {
      "name": "ESL 10.2 Generated Data",
      "created_at": "2014-11-07T00:47:14.602932Z"
    },
    "performance": {
      "algorithm": "GradientBoostingClassifier",
      "confusion_matrix": {
        "-1.0": {
          "-1.0": 5931,
          "1.0": 111
        },
        "1.0": {
          "-1.0": 307,
          "1.0": 5651
        }
      },
      "score": 0.9285000000000001
    },
    "running": false,
    "trained": true
  }
]
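A client can fetch and decode this response with any HTTP/JSON library. A minimal Python sketch (the base URL assumes the default port; `list_models` and `model_names` are illustrative helper names, not part of mlserver):

```python
import json
from urllib import request

BASE_URL = "http://localhost:5000"  # mlserver's default port

def list_models(base_url=BASE_URL):
    """Fetch and decode the model list from GET /models."""
    with request.urlopen(base_url + "/models") as resp:
        return json.load(resp)

def model_names(models):
    """Pull the human-readable name out of each model record."""
    return [m["metadata"]["name"] for m in models]
```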

Get Model

  • GET /models/:model_id will return the specified model.
{
  "model_id": "0e12bb73-e49a-4dcd-87aa-cb0338b1c758",
  "metadata": {
    "name": "iris model 1",
    "created_at": "2014-11-06T21:52:16.143688Z"
  },
  "performance": {
    "algorithm": "GradientBoostingClassifier",
    "confusion_matrix": {
      "setosa": {
        "setosa": 50,
        "versicolor": 0,
        "virginica": 0
      },
      "versicolor": {
        "setosa": 0,
        "versicolor": 50,
        "virginica": 0
      },
      "virginica": {
        "setosa": 0,
        "versicolor": 0,
        "virginica": 50
      }
    },
    "score": 0.9673202614379085
  },
  "running": false,
  "trained": true
}

Fit

  • POST /models will create and fit a new model with the supplied training data.

The request body should be JSON with the following fields:

  • name the name of the model
  • data an array of objects, each element represents a single row/observation
  • labels an array of strings representing the target value/label of each training example

To fit a model for predicting the species variable from the Iris data:

sepal_length sepal_width petal_length petal_width species
5.1 3.5 1.4 0.2 setosa
4.9 3.0 1.4 0.2 setosa
4.7 3.2 1.3 0.2 setosa
4.6 3.1 1.5 0.2 setosa
5.0 3.6 1.4 0.2 setosa
... ... ... ... ...
{
  "name": "iris model",
  "data": [
    {
      "sepal_length": 5.1,
      "petal_length": 1.4,
      "sepal_width": 3.5,
      "petal_width": 0.2
    },
    {
      "sepal_length": 4.9,
      "petal_length": 1.4,
      "sepal_width": 3.0,
      "petal_width": 0.2
    },
    {
      "sepal_length": 4.7,
      "petal_length": 1.3,
      "sepal_width": 3.2,
      "petal_width": 0.2
    },
    {
      "sepal_length": 4.6,
      "petal_length": 1.5,
      "sepal_width": 3.1,
      "petal_width": 0.2
    },
    {
      "sepal_length": 5.0,
      "petal_length": 1.4,
      "sepal_width": 3.6,
      "petal_width": 0.2
    }
  ],
  "labels": [
    "setosa",
    "setosa",
    "setosa",
    "setosa",
    "setosa"
  ]
}

This will return 202 Accepted along with the id of the newly created model. The model will be fitted in the background.

{
  "model_id": "07421303-62f9-40f3-bf14-23cf44af05e2"
}
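A fit request like the one above is straightforward to assemble programmatically. A minimal sketch of building the payload (the `build_fit_payload` helper is hypothetical, not part of mlserver):

```python
def build_fit_payload(name, rows, labels):
    """Assemble the JSON body for POST /models.

    rows   -- list of dicts, one per row/observation
    labels -- list of target values, aligned with rows
    """
    if len(rows) != len(labels):
        raise ValueError("each row needs exactly one label")
    return {"name": name, "data": rows, "labels": labels}

payload = build_fit_payload(
    "iris model",
    [{"sepal_length": 5.1, "sepal_width": 3.5,
      "petal_length": 1.4, "petal_width": 0.2}],
    ["setosa"],
)
```

The resulting dict can be serialized with json.dumps and POSTed to /models; the 202 response carries the new model_id.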

Alternatively, the data for fitting a model can be uploaded as a csv file. The file must have a header row and the target variable must be the first column. The table above would be encoded as:

"species","sepal_length","sepal_width","petal_length","petal_width"
"setosa",5.1,3.5,1.4,0.2
"setosa",4.9,3,1.4,0.2
"setosa",4.7,3.2,1.3,0.2
"setosa",4.6,3.1,1.5,0.2
"setosa",5,3.6,1.4,0.2
"setosa",5.4,3.9,1.7,0.4
"setosa",4.6,3.4,1.4,0.3
"setosa",5,3.4,1.5,0.2
"setosa",4.4,2.9,1.4,0.2

The request should be encoded as multipart/form-data with the following fields:

  • name the name to use for the model
  • file the csv file
curl --form name="iris model csv" --form file=@iris.csv http://localhost:5000/models
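A csv file in the required layout (header row first, target variable in the first column) can be produced with Python's csv module. A minimal sketch, with column names following the Iris example above:

```python
import csv

def write_training_csv(path, target, feature_names, rows, labels):
    """Write a csv suitable for the multipart fit endpoint:
    header row first, target variable in the first column."""
    with open(path, "w", newline="") as f:
        w = csv.writer(f, quoting=csv.QUOTE_NONNUMERIC)
        w.writerow([target] + feature_names)
        for label, row in zip(labels, rows):
            w.writerow([label] + [row[name] for name in feature_names])

write_training_csv(
    "iris.csv", "species",
    ["sepal_length", "sepal_width", "petal_length", "petal_width"],
    [{"sepal_length": 5.1, "sepal_width": 3.5,
      "petal_length": 1.4, "petal_width": 0.2}],
    ["setosa"],
)
```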

Predict

  • POST /models/:model_id will return predictions using the model for the supplied data

The request body should have the following fields:

  • data an array of objects, each element represents a single row/observation

To predict labels (species) for the following data:

sepal_length sepal_width petal_length petal_width species
6.7 3.0 5.2 2.3 ?
6.3 2.5 5.0 1.9 ?
6.5 3.0 5.2 2.0 ?
6.2 3.4 5.4 2.3 ?
5.9 3.0 5.1 1.8 ?
{
  "data": [
    {
      "sepal_length": 6.7,
      "petal_length": 5.2,
      "sepal_width": 3.0,
      "petal_width": 2.3
    },
    {
      "sepal_length": 6.3,
      "petal_length": 5.0,
      "sepal_width": 2.5,
      "petal_width": 1.9
    },
    {
      "sepal_length": 6.5,
      "petal_length": 5.2,
      "sepal_width": 3.0,
      "petal_width": 2.0
    },
    {
      "sepal_length": 6.2,
      "petal_length": 5.4,
      "sepal_width": 3.4,
      "petal_width": 2.3
    },
    {
      "sepal_length": 5.9,
      "petal_length": 5.1,
      "sepal_width": 3.0,
      "petal_width": 1.8
    }
  ]
}

The response will contain class probabilities for each example submitted:

{
  "labels": [
    {
      "versicolor": 0.000005590474449815602,
      "virginica": 0.9999925658927716,
      "setosa": 0.000001843632778976535
    },
    {
      "versicolor": 0.00003448150394080962,
      "virginica": 0.9999626991986605,
      "setosa": 0.0000028192973987744193
    },
    {
      "versicolor": 0.00000583767259563357,
      "virginica": 0.9999923186950813,
      "setosa": 0.0000018436323232313824
    },
    {
      "versicolor": 0.000025292685027774954,
      "virginica": 0.9999702563844668,
      "setosa": 0.000004450930505165
    },
    {
      "versicolor": 0.00006891207512697766,
      "virginica": 0.9999281614880159,
      "setosa": 0.000002926436856866432
    }
  ],
  "model_id": "0e12bb73-e49a-4dcd-87aa-cb0338b1c758"
}
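Each element of labels maps class names to probabilities for one submitted row, so the predicted class is simply the highest-probability entry. A minimal sketch (`predicted_classes` is an illustrative helper, not part of mlserver):

```python
def predicted_classes(response):
    """Return the most probable class for each row in a
    POST /models/:model_id response."""
    return [max(probs, key=probs.get) for probs in response["labels"]]

response = {
    "labels": [
        {"versicolor": 5.59e-06, "virginica": 0.9999926, "setosa": 1.84e-06},
    ],
    "model_id": "0e12bb73-e49a-4dcd-87aa-cb0338b1c758",
}
# predicted_classes(response) -> ["virginica"]
```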

Alternatively, the data can be uploaded as a csv file; see the description above for fitting a model with a csv file. When making predictions, the csv file should not include the label/target column.

Start Model

The prediction worker is started with the first prediction request for a model. A model can, however, be started manually.

  • POST /models/running will start a model
{
  "model_id": "0e12bb73-e49a-4dcd-87aa-cb0338b1c758"
}

This will return 201 Created with an empty body. The model will be started in the background.

Stop Model

Once started, models will run until the server process exits. Models can be stopped manually.

  • DELETE /models/running/:model_id will stop a model

This will return 202 Accepted with an empty body. The model will be stopped in the background.
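Starting and stopping models is plain POST and DELETE against the /models/running endpoints. A minimal sketch with urllib (helper names are illustrative, base URL assumes the default port):

```python
import json
from urllib import request

BASE_URL = "http://localhost:5000"  # mlserver's default port

def start_model_request(model_id, base_url=BASE_URL):
    """Build the POST /models/running request that starts a model."""
    body = json.dumps({"model_id": model_id}).encode()
    return request.Request(base_url + "/models/running", data=body,
                           headers={"Content-Type": "application/json"})

def stop_model_request(model_id, base_url=BASE_URL):
    """Build the DELETE /models/running/:model_id request that stops a model."""
    return request.Request(base_url + "/models/running/" + model_id,
                           method="DELETE")
```

Pass either request to urllib.request.urlopen; a 201 confirms the start, a 202 the stop.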
