elasticthought

Published: Aug 7, 2015 License: Apache-2.0 Imports: 32 Imported by: 0

README


Scalable REST API wrapper for the Caffe deep learning framework.

The problem

Caffe is an awesome deep learning framework, but running it on a single laptop or desktop computer isn't nearly as productive as running it in the cloud at scale.

ElasticThought gives you the ability to:

  • Run multiple Caffe training jobs in parallel
  • Queue up training jobs
  • Tune the number of workers that process jobs on the queue
  • Interact with it via a REST API (and later build Web/Mobile apps on top of it)
  • Support multi-tenancy, so that multiple users can interact with it, each with access only to their own data

Components

[Diagram: ElasticThought components]

Deployment Architecture

Here is what a typical cluster might look like:

[Diagram: ElasticThought deployment]

If running on AWS, each CoreOS instance would be running on its own EC2 instance.

Although not shown, all components would be running inside Docker containers.

It would also be possible to start additional nodes that run only Caffe GPU workers.

Roadmap

Current status: everything is under heavy construction and not yet ready for public consumption.

  1. [done] Working end-to-end with the IMAGE_DATA Caffe layer, using a single training set and a single test set, with the ability to query the trained model.
  2. [done] Support LEVELDB / LMDB data formats, to run mnist example.
  3. [in progress] Package everything up to make it easy to deploy locally or in the cloud
  4. Support the majority of Caffe use cases
  5. Ability to auto-scale worker instances up and down based on how many jobs are in the message queue.
  6. Attempt to add support for other deep learning frameworks: pylearn2, cuda-convnet, etc.
  7. Build a Web App on top of the REST API that leverages PouchDB
  8. Build Android and iOS mobile apps on top of the REST API that leverage Couchbase Mobile

Design goals

  • 100% Open Source (Apache 2 / BSD), including all components used.
  • Architected to enable warehouse-scale computing
  • No IaaS lock-in -- easily migrate between AWS, GCE, or your own private data center
  • Ability to scale down as well as up

Documentation

System Requirements

ElasticThought requires CoreOS to run.

If you want to access the GPU, you will need to do extra work to get CoreOS working with the Nvidia CUDA GPU drivers.

Installing elastic-thought on a single CoreOS host (Development mode)

If you are on OSX, you'll first need to install Vagrant, VirtualBox, and CoreOS. See CoreOS on Vagrant for instructions.

Here's what will be created:

           ┌─────────────────────────────────────────────────────────┐
           │                       CoreOS Host                       │
           │  ┌──────────────────────────┐  ┌─────────────────────┐  │
           │  │     Docker Container     │  │  Docker Container   │  │
           │  │   ┌───────────────────┐  │  │    ┌────────────┐   │  │
           │  │   │  Elastic Thought  │  │  │    │Sync Gateway│   │  │
           │  │   │      Server       │  │  │    │  Database  │   │  │
           │  │   │   ┌───────────┐   │  │  │    │            │   │  │
           │  │   │   │In-process │   │◀─┼──┼───▶│            │   │  │
           │  │   │   │   Caffe   │   │  │  │    │            │   │  │
           │  │   │   │  worker   │   │  │  │    │            │   │  │
           │  │   │   └───────────┘   │  │  │    └────────────┘   │  │
           │  │   └───────────────────┘  │  └─────────────────────┘  │
           │  └──────────────────────────┘                           │
           └─────────────────────────────────────────────────────────┘

Run the following commands on your CoreOS box (to get in, you may need to run vagrant ssh core-01).

Start Sync Gateway Database

$ docker run -d --name sync-gateway -P couchbase/sync-gateway:1.1.0-forestdb_bucket sync_gateway https://gist.githubusercontent.com/tleyden/8051567cf62dfa8f89ca/raw/43d4abc9ef64cef7b4bbbdf6cb8ce80c456efd1f/gistfile1.txt

Start ElasticThought REST API server

$ docker run -d --name elastic-thought -p 8080:8080 --link sync-gateway:sync-gateway tleyden5iwx/elastic-thought-cpu-develop bash -c 'refresh-elastic-thought; elastic-thought --sync-gw http://sync-gateway:4984/elastic-thought'

It's also a good idea to check the logs of both containers to look for any errors:

$ docker logs sync-gateway 
$ docker logs -f elastic-thought

At this point you can test the API via curl.

Installing elastic-thought on AWS (Production mode)

It should be possible to install elastic-thought anywhere that CoreOS is supported. Currently, there are instructions for AWS and Vagrant (below).

Launch EC2 instances via CloudFormation script

Note: the instance will launch in us-east-1. If you want to launch in another region, please file an issue.

Verify CoreOS cluster

Run:

$ fleetctl list-machines

This should show all the CoreOS machines in your cluster. (fleetctl uses etcd under the hood, so this also validates that etcd is set up correctly.)

Kick off ElasticThought

SSH into one of the machines (it doesn't matter which): ssh -A core@ec2-54-160-96-153.compute-1.amazonaws.com

$ wget https://raw.githubusercontent.com/tleyden/elastic-thought/master/docker/scripts/elasticthought-cluster-init.sh
$ chmod +x elasticthought-cluster-init.sh
$ ./elasticthought-cluster-init.sh -v 3.0.1 -n 3 -u "user:passw0rd" -p gpu 

Once it launches, verify your cluster by running fleetctl list-units.

It should look like this:

UNIT						MACHINE				ACTIVE	SUB
cbfs_announce@1.service                         2340c553.../10.225.17.229       active	running
cbfs_announce@2.service                         fbd4562e.../10.182.197.145      active	running
cbfs_announce@3.service                         0f5e2e11.../10.168.212.210      active	running
cbfs_node@1.service                             2340c553.../10.225.17.229       active	running
cbfs_node@2.service                             fbd4562e.../10.182.197.145      active	running
cbfs_node@3.service                             0f5e2e11.../10.168.212.210      active	running
couchbase_bootstrap_node.service                0f5e2e11.../10.168.212.210      active	running
couchbase_bootstrap_node_announce.service       0f5e2e11.../10.168.212.210      active	running
couchbase_node.1.service                        2340c553.../10.225.17.229       active	running
couchbase_node.2.service                        fbd4562e.../10.182.197.145      active	running
elastic_thought_gpu@1.service                   2340c553.../10.225.17.229       active	running
elastic_thought_gpu@2.service                   fbd4562e.../10.182.197.145      active	running
elastic_thought_gpu@3.service                   0f5e2e11.../10.168.212.210      active	running
sync_gw_announce@1.service                      2340c553.../10.225.17.229       active	running
sync_gw_announce@2.service                      fbd4562e.../10.182.197.145      active	running
sync_gw_announce@3.service                      0f5e2e11.../10.168.212.210      active	running
sync_gw_node@1.service                          2340c553.../10.225.17.229       active	running
sync_gw_node@2.service                          fbd4562e.../10.182.197.145      active	running
sync_gw_node@3.service                          0f5e2e11.../10.168.212.210      active	running

At this point you should be able to access the REST API on the public IP of any of the three Sync Gateway machines.

Installing elastic-thought on Vagrant (Staging mode)

This mode tries to replicate the Production mode described above, but on Vagrant instead of AWS.

Update Vagrant

Make sure you're running a current version of Vagrant, otherwise the plugin install below may fail.

$ vagrant -v
1.7.1

Install CoreOS on Vagrant

Clone the coreos-vagrant fork that has been customized for running ElasticThought.

$ cd ~/Vagrant 
$ git clone git@github.com:tleyden/coreos-vagrant.git
$ cd coreos-vagrant
$ cp config.rb.sample config.rb
$ cp user-data.sample user-data

By default this will run a two-node cluster. If you want to change this, update the $num_instances variable in config.rb.

Run CoreOS

$ vagrant up

Ssh in:

$ vagrant ssh core-01 -- -A

If you see:

Failed Units: 1
  user-cloudinit@var-lib-coreos\x2dvagrant-vagrantfile\x2duser\x2ddata.service

Jump to Workaround CoreOS + Vagrant issues below.

Verify things started up correctly:

core@core-01 ~ $ fleetctl list-machines

If you get errors like:

2015/03/26 16:58:50 INFO client.go:291: Failed getting response from http://127.0.0.1:4001/: dial tcp 127.0.0.1:4001: connection refused
2015/03/26 16:58:50 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 100ms

Jump to Workaround CoreOS + Vagrant issues below.

Workaround CoreOS + Vagrant issues:

First exit out of CoreOS:

core@core-01 ~ $ exit

On your OSX workstation, try the following workaround:

$ sed -i '' 's/420/0644/' user-data
$ sed -i '' 's/484/0744/' user-data
$ vagrant reload --provision

Ssh back in:

$ vagrant ssh core-01 -- -A

Verify it worked:

core@core-01 ~ $ fleetctl list-machines

You should see:

MACHINE		IP		METADATA
ce0fec18...	172.17.8.102	-
d6402b24...	172.17.8.101	-

I filed CoreOS cloudinit issue 328 to figure out why this error is happening (possibly related issues: CoreOS cloudinit issue 261 or CoreOS cloudinit issue 190).

Continue steps above

Scroll up to the Installing elastic-thought on AWS section and start with Verify CoreOS cluster.

FAQ

  • Is this useful for grid computing / distributed computation? Ans: No, this is not trying to be a grid computing (aka distributed computation) solution. You may want to check out Caffe Issue 876 or ParameterServer instead.

License

Apache 2

Documentation

Overview

ElasticThought is a REST wrapper around the Caffe Deep Learning framework.

Constants

const (
	TRAINING_ARTIFACT = "training.tar.gz"
	TEST_ARTIFACT     = "testing.tar.gz"
	TRAINING_INDEX    = "training.txt"
	TESTING_INDEX     = "testing.txt"
	TRAINING_DIR      = "training-data"
	TESTING_DIR       = "test-data"
	CBFS_URI_PREFIX   = "cbfs://"
)

const (
	MIDDLEWARE_KEY_DB   = "db"
	MIDDLEWARE_KEY_USER = "user"
)

const (
	DOC_TYPE_USER         = "user"
	DOC_TYPE_DATAFILE     = "datafile"
	DOC_TYPE_DATASET      = "dataset"
	DOC_TYPE_SOLVER       = "solver"
	DOC_TYPE_TRAINING_JOB = "training-job"
	DOC_TYPE_CLASSIFIER   = "classifier"
	DOC_TYPE_CLASSIFY_JOB = "classify-job"
)

const (
	PROCESSING_STATE_PENDING               = "pending"
	PROCESSING_STATE_PROCESSING            = "processing"
	PROCESSING_STATE_FINISHED_SUCCESSFULLY = "finished_successfully"
	PROCESSING_STATE_FAILED                = "failed"
)

Variables

var LogKeys []string

The logging keys available in ElasticThought.

Functions

func Asset

func Asset(name string) ([]byte, error)

Asset loads and returns the asset for the given name. It returns an error if the asset could not be found or could not be loaded.

func AssetDir

func AssetDir(name string) ([]string, error)

AssetDir returns the file names below a certain directory embedded in the file by go-bindata. For example if you run go-bindata on data/... and data contains the following hierarchy:

data/
  foo.txt
  img/
    a.png
    b.png

then AssetDir("data") would return []string{"foo.txt", "img"} AssetDir("data/img") would return []string{"a.png", "b.png"} AssetDir("foo.txt") and AssetDir("notexist") would return an error AssetDir("") will return []string{"data"}.

func AssetInfo

func AssetInfo(name string) (os.FileInfo, error)

AssetInfo loads and returns the asset info for the given name. It returns an error if the asset could not be found or could not be loaded.

func AssetNames

func AssetNames() []string

AssetNames returns the names of the assets.

func CbfsReadWriteFile

func CbfsReadWriteFile(config Configuration, destPath, content string) error

func CbfsSanityCheck

func CbfsSanityCheck(config Configuration) error

Perform a sanity check against the cbfs cluster.

func CopyFileContents

func CopyFileContents(src, dst string) (err error)

CopyFileContents copies the contents of the file named src to the file named by dst. The file will be created if it does not already exist. If the destination file exists, all of its contents will be replaced by the contents of the source file.

Source: http://stackoverflow.com/questions/21060945/simple-way-to-copy-a-file-in-golang

func DbAuthRequired

func DbAuthRequired() gin.HandlerFunc

Gin middleware to authenticate the user specified in the Basic Auth Authorization header. It will look up the user in the database (this middleware requires that the DbConnector middleware has run before it), and then add the user to the Gin context.

func DbConnector

func DbConnector(dbUrl string) gin.HandlerFunc

Gin middleware to connect to the Sync Gateway database given in the dbUrl parameter, and set the connection object into the context. This creates a new connection for each request, which is ultra-conservative in case the connection object isn't safe to use among multiple goroutines (and I believe it is). If it becomes a bottleneck, it's easy to create another middleware that re-uses an existing connection.
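
For illustration, here is a minimal sketch of wiring these middleware into a gin router. The import alias et and the route paths ("/users", "/datafiles") are assumptions for the example; the actual routes served by ElasticThought are not listed on this page.

package main

import (
	"github.com/gin-gonic/gin"

	et "github.com/tleyden/elastic-thought"
)

func main() {
	config := et.NewDefaultConfiguration()
	endpointContext := et.EndpointContext{Configuration: *config}

	r := gin.Default()

	// Connect to Sync Gateway on every request and store the db
	// connection in the gin context.
	r.Use(et.DbConnector(config.DbUrl))

	// User creation is left unauthenticated so new users can sign up.
	// The path is hypothetical.
	r.POST("/users", endpointContext.CreateUserEndpoint)

	// Everything else requires Basic Auth, validated against the db.
	authorized := r.Group("/", et.DbAuthRequired())
	authorized.POST("/datafiles", endpointContext.CreateDataFileEndpoint)

	r.Run(":8080")
}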

func EnableAllLogKeys

func EnableAllLogKeys()

Enable logging for all logging keys

func EnvironmentSanityCheck

func EnvironmentSanityCheck(config Configuration) error

func Mkdir

func Mkdir(directory string) error

func NewUuid

func NewUuid() string

func RestoreAsset

func RestoreAsset(dir, name string) error

Restore an asset under the given directory

func RestoreAssets

func RestoreAssets(dir, name string) error

Restore assets under the given directory recursively

func TempDir

func TempDir() string

TempDir returns the default directory to use for temporary files.

Types

type BlobHandle

type BlobHandle interface {

	// The nodes that contain the file corresponding to this blob
	// and the last time it was "scrubbed" (cbfs terminology)
	Nodes() map[string]time.Time
}

Calling OpenFile with a path on a BlobStore returns a BlobHandle, which allows reading the blob and its metadata.

type BlobPutOptions

type BlobPutOptions struct {
	cbfsclient.PutOptions
}

type BlobStore

type BlobStore interface {
	Get(path string) (io.ReadCloser, error)
	Put(srcname, dest string, r io.Reader, opts BlobPutOptions) error
	Rm(fn string) error
	OpenFile(path string) (BlobHandle, error)
}

BlobStore provides a blob store interface.

func NewBlobStore

func NewBlobStore(rawurl string) (BlobStore, error)
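
A short sketch of constructing a blob store from a url and reading a blob. Based on the constructors on this page and the file:///tmp value in the Configuration.Merge example below, the assumption is that a file url selects the filesystem-backed store and a cbfs:// url the cbfs-backed one; the blob path here is hypothetical.

package main

import (
	"io"
	"log"
	"os"

	et "github.com/tleyden/elastic-thought"
)

func main() {
	// Assumption: a file url yields a FileSystemBlobStore,
	// a cbfs:// url a CbfsBlobStore.
	blobStore, err := et.NewBlobStore("file:///tmp")
	if err != nil {
		log.Fatal(err)
	}

	reader, err := blobStore.Get("some-dataset-id/training.tar.gz")
	if err != nil {
		log.Fatal(err)
	}
	defer reader.Close()

	if _, err := io.Copy(os.Stdout, reader); err != nil {
		log.Fatal(err)
	}
}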

type CbfsBlobStore

type CbfsBlobStore struct {
	*cbfsclient.Client
	// contains filtered or unexported fields
}

func NewCbfsBlobStore

func NewCbfsBlobStore(uri string) (*CbfsBlobStore, error)

func (*CbfsBlobStore) OpenFile

func (c *CbfsBlobStore) OpenFile(path string) (BlobHandle, error)

func (*CbfsBlobStore) Put

func (c *CbfsBlobStore) Put(srcname, dest string, r io.Reader, opts BlobPutOptions) error

type ChangesListener

type ChangesListener struct {
	Configuration Configuration
	Database      couch.Database
	JobScheduler  JobScheduler
}

A changes listener listens for changes on the _changes feed and reacts to them. The changes listener currently runs as a goroutine in the httpd process, so the system currently supports only a single httpd process; otherwise there would be multiple changes listeners on the same changes feed, which would cause duplicate jobs to get kicked off. If the system needs to support multiple httpd processes, then the changes listener needs to run in its own process.

func NewChangesListener

func NewChangesListener(c Configuration, jobScheduler JobScheduler) (*ChangesListener, error)

Create a new ChangesListener

func (ChangesListener) FollowChangesFeed

func (c ChangesListener) FollowChangesFeed()

Follow changes feed. This will typically be run in its own goroutine.
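
A minimal sketch of starting the listener, using the constructors documented on this page. The NSQ scheduler here could equally be the in-process one; a real server would run its REST API instead of blocking.

package main

import (
	"log"

	et "github.com/tleyden/elastic-thought"
)

func main() {
	config := *et.NewDefaultConfiguration()
	scheduler := et.NewNsqJobScheduler(config)

	changesListener, err := et.NewChangesListener(config, scheduler)
	if err != nil {
		log.Fatal(err)
	}

	// Follow the _changes feed in its own goroutine, per the note above.
	go changesListener.FollowChangesFeed()

	select {} // block; a real server would serve the REST API here
}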

type Classifier

type Classifier struct {
	ElasticThoughtDoc
	SpecificationUrl string `json:"specification-url" binding:"required"`
	TrainingJobID    string `json:"training-job-id" binding:"required"`
	Scale            string `json:"scale" binding:"required"`
	ImageWidth       string `json:"image-width" binding:"required"`
	ImageHeight      string `json:"image-height" binding:"required"`
	Color            bool   `json:"color"`
	Gpu              bool   `json:"gpu"`

	// had to make exported, due to https://github.com/gin-gonic/gin/pull/123
	// waiting for this to get merged into master branch, since go get
	// pulls from master branch.
	Configuration Configuration
}

A classifier uses a trained model to classify new incoming data points

func NewClassifier

func NewClassifier(c Configuration) *Classifier

Create a new classifier. If you don't use this, you must set the embedded ElasticThoughtDoc Type field.

func (*Classifier) Find

func (c *Classifier) Find(id string) error

Find a classifier in the db with the given id, or return an error if not found

func (*Classifier) Insert

func (c *Classifier) Insert() error

Insert into database (only call this if you know it doesn't already exist, or else you'll end up with unwanted dupes)

func (*Classifier) RefreshFromDB

func (c *Classifier) RefreshFromDB(db couch.Database) error

func (*Classifier) SetSpecificationUrl

func (c *Classifier) SetSpecificationUrl(specUrlCbfs string) error

func (Classifier) Validate

func (c Classifier) Validate() error

type ClassifyJob

type ClassifyJob struct {
	ElasticThoughtDoc
	ProcessingState ProcessingState `json:"processing-state"`
	ProcessingLog   string          `json:"processing-log"`
	StdOutUrl       string          `json:"std-out-url"`
	StdErrUrl       string          `json:"std-err-url"`
	ClassifierID    string          `json:"classifier-id"`

	// Key: image url of image in cbfs
	// Value: the classification result for that image
	Results map[string]string `json:"results"`

	// had to make exported, due to https://github.com/gin-gonic/gin/pull/123
	// waiting for this to get merged into master branch, since go get
	// pulls from master branch.
	Configuration Configuration
}

A classify job tries to classify images given by the user against the given trained model

func NewClassifyJob

func NewClassifyJob(c Configuration) *ClassifyJob

Create a new classify job. If you don't use this, you must set the embedded ElasticThoughtDoc Type field.

func (ClassifyJob) Failed

func (c ClassifyJob) Failed(db couch.Database, processingErr error) error

func (*ClassifyJob) Find

func (c *ClassifyJob) Find(id string) error

Find a classify job in the db with the given id, or return an error if not found. CodeReview: duplication with Find in many places

func (*ClassifyJob) Insert

func (c *ClassifyJob) Insert() error

Insert into database (only call this if you know it doesn't already exist, or else you'll end up with unwanted dupes)

func (*ClassifyJob) RefreshFromDB

func (c *ClassifyJob) RefreshFromDB(db couch.Database) error

CodeReview: duplication with RefreshFromDB in many places

func (*ClassifyJob) Run

func (c *ClassifyJob) Run(wg *sync.WaitGroup)

Run this job

func (*ClassifyJob) SetResults

func (c *ClassifyJob) SetResults(results map[string]string) (bool, error)

func (*ClassifyJob) UpdateProcessingLog

func (c *ClassifyJob) UpdateProcessingLog(val string) (bool, error)

func (*ClassifyJob) UpdateProcessingState

func (c *ClassifyJob) UpdateProcessingState(newState ProcessingState) (bool, error)

Update the processing state to new state.

type Configuration

type Configuration struct {
	DbUrl               string
	CbfsUrl             string
	NsqLookupdUrl       string
	NsqdUrl             string
	NsqdTopic           string
	WorkDirectory       string
	QueueType           QueueType
	NumCbfsClusterNodes int // needed to validate cbfs cluster health
}

Holds configuration values that are used throughout the application

func NewDefaultConfiguration

func NewDefaultConfiguration() *Configuration

func (Configuration) DbConnection

func (c Configuration) DbConnection() couch.Database

Connect to db based on url stored in config, or panic if not able to connect

func (Configuration) Merge

func (c Configuration) Merge(parsedDocOpts map[string]interface{}) (Configuration, error)

Add values from parsedDocOpts into Configuration and return a new instance. Example map:

map[--help:false --blob-store-url:file:///tmp --sync-gw-url:http://blah.com:4985/et]
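
A small sketch of calling Merge with the option keys from the example map above. Exactly which Configuration fields each option maps onto is not documented here, so treat that mapping as an assumption.

package main

import (
	"fmt"
	"log"

	et "github.com/tleyden/elastic-thought"
)

func main() {
	config := et.NewDefaultConfiguration()

	merged, err := config.Merge(map[string]interface{}{
		"--blob-store-url": "file:///tmp",
		"--sync-gw-url":    "http://blah.com:4985/et",
	})
	if err != nil {
		log.Fatal(err)
	}

	// merged is a new Configuration with the overrides applied.
	fmt.Printf("%+v\n", merged)
}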

func (Configuration) NewBlobStoreClient

func (c Configuration) NewBlobStoreClient() (BlobStore, error)

Create a new cbfs client based on url stored in config

type Datafile

type Datafile struct {
	ElasticThoughtDoc
	ProcessingState ProcessingState `json:"processing-state"`
	ProcessingLog   string          `json:"processing-log"`
	UserID          string          `json:"user-id"`
	Url             string          `json:"url" binding:"required"`

	// had to make exported, due to https://github.com/gin-gonic/gin/pull/123
	// waiting for this to get merged into master branch, since go get
	// pulls from master branch.
	Configuration Configuration
}

A Datafile is a raw "bundle" of data, typically a zip or .tar.gz file. It cannot be used by a solver directly; instead it is used to create dataset objects which can be used by the solver. A single datafile can be used to create any number of dataset objects.

func FindDatafile

func FindDatafile(db couch.Database, datafileId string) (*Datafile, error)

Find Datafile by Id from the db

func NewDatafile

func NewDatafile(c Configuration) *Datafile

Create a new datafile

func (Datafile) CopyToBlobStore

func (d Datafile) CopyToBlobStore(db couch.Database, blobStore BlobStore) (string, error)

Copy the contents of Datafile.Url to CBFS and return the cbfs dest path

func (Datafile) Failed

func (d Datafile) Failed(db couch.Database, processingErr error) error

Update the datafile state to record that it failed. Codereview: dataset.go has the same method

func (Datafile) FinishedSuccessfully

func (d Datafile) FinishedSuccessfully(db couch.Database) error

Mark this datafile as having finished processing successfully

func (*Datafile) GetProcessingState

func (d *Datafile) GetProcessingState() ProcessingState

func (Datafile) HasValidId

func (d Datafile) HasValidId() bool

Does this datafile have a valid Id?

func (*Datafile) RefreshFromDB

func (d *Datafile) RefreshFromDB(db couch.Database) error

func (Datafile) Save

func (d Datafile) Save(db couch.Database) (*Datafile, error)

Save a new version of Datafile to the db

func (*Datafile) SetProcessingState

func (d *Datafile) SetProcessingState(newState ProcessingState)

func (*Datafile) UpdateProcessingState

func (d *Datafile) UpdateProcessingState(newState ProcessingState) (bool, error)

Update the processing state to new state.

type DatafileDownloader

type DatafileDownloader struct {
	Configuration Configuration
	Datafile      Datafile
}

Worker job that downloads the contents of a datafile url to cbfs

func (DatafileDownloader) Run

func (d DatafileDownloader) Run(wg *sync.WaitGroup)

Run this job

type Dataset

type Dataset struct {
	ElasticThoughtDoc
	ProcessingState ProcessingState `json:"processing-state"`
	ProcessingLog   string          `json:"processing-log"`
	TrainingDataset TrainingDataset `json:"training" binding:"required"`
	TestDataset     TestDataset     `json:"test" binding:"required"`

	// had to make exported, due to https://github.com/gin-gonic/gin/pull/123
	// waiting for this to get merged into master branch, since go get
	// pulls from master branch.
	Configuration Configuration
}

A dataset is created from a datafile, and represents a partition of the datafile to be used for a particular purpose. The typical example would involve:

  • Datafile with 100 examples
  • Training dataset with 70 examples
  • Test dataset with 30 examples
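
In code, that 70/30 split might be described like this -- a sketch using the TrainingDataset and TestDataset fields documented further down this page; the datafile id is hypothetical.

package main

import (
	"log"

	et "github.com/tleyden/elastic-thought"
)

func main() {
	config := *et.NewDefaultConfiguration()

	// Describe a 70/30 split of a single datafile.
	dataset := et.NewDataset(config)
	dataset.TrainingDataset = et.TrainingDataset{
		DatafileID:      "example-datafile-id", // hypothetical id
		SplitPercentage: 0.7,
	}
	dataset.TestDataset = et.TestDataset{
		DatafileID:      "example-datafile-id",
		SplitPercentage: 0.3,
	}

	if err := dataset.Insert(); err != nil {
		log.Fatal(err)
	}
}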

func NewDataset

func NewDataset(c Configuration) *Dataset

Create a new dataset. If you don't use this, you must set the embedded ElasticThoughtDoc Type field.

func (*Dataset) AddArtifactUrls

func (d *Dataset) AddArtifactUrls() error

Update this dataset with the artifact urls (cbfs://<id>/training.tar.gz, ..) even though these artifacts might not exist yet.

func (Dataset) Failed

func (d Dataset) Failed(db couch.Database, processingErr error) error

Update the dataset state to record that it failed. Codereview: datafile.go has same method

func (Dataset) FinishedSuccessfully

func (d Dataset) FinishedSuccessfully(db couch.Database) error

Update the dataset state to record that it finished successfully. Codereview: de-dupe with datafile FinishedSuccessfully

func (*Dataset) GetProcessingState

func (d *Dataset) GetProcessingState() ProcessingState

func (Dataset) GetSplittableDatafile

func (d Dataset) GetSplittableDatafile(db couch.Database) (*Datafile, error)

Find and return the datafile associated with this dataset

func (Dataset) GetTestingDatafile

func (d Dataset) GetTestingDatafile(db couch.Database) (*Datafile, error)

Get the testing datafile object

func (Dataset) GetTestingDatafileUrl

func (d Dataset) GetTestingDatafileUrl(db couch.Database) string

Get the source url associated with the testing datafile

func (Dataset) GetTrainingDatafile

func (d Dataset) GetTrainingDatafile(db couch.Database) (*Datafile, error)

Get the training datafile object

func (Dataset) GetTrainingDatafileUrl

func (d Dataset) GetTrainingDatafileUrl(db couch.Database) string

Get the source url associated with the training datafile

func (*Dataset) Insert

func (d *Dataset) Insert() error

Insert into database (only call this if you know it doesn't already exist, or else you'll end up with unwanted dupes)

func (*Dataset) RefreshFromDB

func (d *Dataset) RefreshFromDB(db couch.Database) error

func (*Dataset) SetProcessingState

func (d *Dataset) SetProcessingState(newState ProcessingState)

func (Dataset) TestingArtifactPath

func (d Dataset) TestingArtifactPath() string

Path to testing artifact file, eg <id>/testing.tar.gz

func (Dataset) TrainingArtifactPath

func (d Dataset) TrainingArtifactPath() string

Path to training artifact file, eg <id>/training.tar.gz

func (*Dataset) UpdateArtifactUrls

func (d *Dataset) UpdateArtifactUrls(trainingDatasetUrl, testingDatasetUrl string) (bool, error)

func (*Dataset) UpdateProcessingLog

func (d *Dataset) UpdateProcessingLog(val string) (bool, error)

func (*Dataset) UpdateProcessingState

func (d *Dataset) UpdateProcessingState(newState ProcessingState) (bool, error)

Update the processing state to new state.

type DatasetSplitter

type DatasetSplitter struct {
	Configuration Configuration
	Dataset       Dataset
}

Worker job that splits a dataset into training/test set

func (DatasetSplitter) DownloadDatafiles

func (d DatasetSplitter) DownloadDatafiles()

func (DatasetSplitter) Run

func (d DatasetSplitter) Run(wg *sync.WaitGroup)

Run this job

func (DatasetSplitter) SplitDatafile

func (d DatasetSplitter) SplitDatafile()

type ElasticThoughtDoc

type ElasticThoughtDoc struct {
	Revision string `json:"_rev"`
	Id       string `json:"_id"`
	Type     string `json:"type"`
}

All document structs should embed this struct to get access to the Sync Gateway metadata (_id, _rev) and the "type" field, which differentiates the different doc types.
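
For example, a hypothetical document type would embed it like this (compare the Datafile and Dataset definitions on this page); Note is illustrative only and not part of the package.

package main

import (
	"fmt"

	et "github.com/tleyden/elastic-thought"
)

// Note is a hypothetical document type, shown only to illustrate the embedding.
type Note struct {
	et.ElasticThoughtDoc
	Body string `json:"body"`
}

func main() {
	note := Note{Body: "hello"}
	note.Type = "note" // the embedded "type" field differentiates doc types
	fmt.Println(note.Type, note.Body)
}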

type EndpointContext

type EndpointContext struct {
	Configuration Configuration
}

func (EndpointContext) CreateClassificationJobEndpoint

func (e EndpointContext) CreateClassificationJobEndpoint(c *gin.Context)

func (EndpointContext) CreateClassifierEndpoint

func (e EndpointContext) CreateClassifierEndpoint(c *gin.Context)

Creates a classifier

func (EndpointContext) CreateDataFileEndpoint

func (e EndpointContext) CreateDataFileEndpoint(c *gin.Context)

Creates a datafile

func (EndpointContext) CreateDataSetsEndpoint

func (e EndpointContext) CreateDataSetsEndpoint(c *gin.Context)

Creates datasets from a datafile

func (EndpointContext) CreateSolverEndpoint

func (e EndpointContext) CreateSolverEndpoint(c *gin.Context)

Creates a solver

func (EndpointContext) CreateTrainingJob

func (e EndpointContext) CreateTrainingJob(c *gin.Context)

Create Training Job

func (EndpointContext) CreateUserEndpoint

func (e EndpointContext) CreateUserEndpoint(c *gin.Context)

Creates a new user

type FileSystemBlobHandle

type FileSystemBlobHandle struct{}

func (FileSystemBlobHandle) Nodes

func (h FileSystemBlobHandle) Nodes() map[string]time.Time

type FileSystemBlobStore

type FileSystemBlobStore struct {
	// contains filtered or unexported fields
}

func NewFileSystemBlobStore

func NewFileSystemBlobStore(rootPath string) (*FileSystemBlobStore, error)

func (FileSystemBlobStore) Get

func (f FileSystemBlobStore) Get(path string) (io.ReadCloser, error)

func (FileSystemBlobStore) OpenFile

func (f FileSystemBlobStore) OpenFile(path string) (BlobHandle, error)

func (FileSystemBlobStore) Put

func (f FileSystemBlobStore) Put(srcname, dest string, r io.Reader, opts BlobPutOptions) error

func (FileSystemBlobStore) Rm

func (f FileSystemBlobStore) Rm(path string) error

type InProcessJobScheduler

type InProcessJobScheduler struct {
	Configuration   Configuration
	JobsOutstanding *sync.WaitGroup
}

Run worker jobs in a goroutine in the REST server process (as opposed to using NSQ). This makes certain testing easier.

func NewInProcessJobScheduler

func NewInProcessJobScheduler(c Configuration) *InProcessJobScheduler

func (InProcessJobScheduler) ScheduleJob

func (j InProcessJobScheduler) ScheduleJob(jobDescriptor JobDescriptor) error

type JobDescriptor

type JobDescriptor struct {
	DocIdToProcess string `json:"doc-id-to-process"`
}

A job descriptor is meant to fully describe a worker job. It needs to be easily serializable into json so it can be passed on the wire.

func NewJobDescriptor

func NewJobDescriptor(docId string) *JobDescriptor

Create a new JobDescriptor
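
A quick sketch of that serialization, using the json tag shown above; the doc id is hypothetical.

package main

import (
	"encoding/json"
	"fmt"
	"log"

	et "github.com/tleyden/elastic-thought"
)

func main() {
	jobDescriptor := et.NewJobDescriptor("some-doc-id")

	encoded, err := json.Marshal(jobDescriptor)
	if err != nil {
		log.Fatal(err)
	}

	// Prints: {"doc-id-to-process":"some-doc-id"}
	fmt.Println(string(encoded))
}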

type JobScheduler

type JobScheduler interface {
	ScheduleJob(job JobDescriptor) error
}

By swapping out the job scheduler, you can choose between having jobs placed on NSQ or run in a local goroutine inside the REST server process.
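
A sketch of that swap, keyed off the QueueType in the configuration; the doc id is hypothetical.

package main

import (
	"log"

	et "github.com/tleyden/elastic-thought"
)

func main() {
	config := *et.NewDefaultConfiguration()

	// Choose a scheduler implementation based on the configured queue type.
	var scheduler et.JobScheduler
	switch config.QueueType {
	case et.Nsq:
		scheduler = et.NewNsqJobScheduler(config)
	case et.Goroutine:
		scheduler = et.NewInProcessJobScheduler(config)
	}

	if err := scheduler.ScheduleJob(*et.NewJobDescriptor("some-doc-id")); err != nil {
		log.Fatal(err)
	}
}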

type LayerType

type LayerType int32

type MockBlobStore

type MockBlobStore struct {

	// Queued responses for blob Get requests.  The key is a
	// regex that should match the path in the Get request.
	GetResponses map[string]ResponseQueue
}
var (
	DefaultMockBlobStore *MockBlobStore
)

func NewMockBlobStore

func NewMockBlobStore() *MockBlobStore

func (*MockBlobStore) Get

func (m *MockBlobStore) Get(path string) (io.ReadCloser, error)

func (*MockBlobStore) OpenFile

func (m *MockBlobStore) OpenFile(path string) (BlobHandle, error)

func (*MockBlobStore) Put

func (m *MockBlobStore) Put(srcname, dest string, r io.Reader, opts BlobPutOptions) error

func (*MockBlobStore) QueueGetResponse

func (m *MockBlobStore) QueueGetResponse(pathRegex string, response io.Reader)

Queue up a response to a Get request
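
A sketch of using the mock in a test: queue a canned reader, then any Get whose path matches the regex receives it. The paths and content are made up for the example.

package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"strings"

	et "github.com/tleyden/elastic-thought"
)

func main() {
	mock := et.NewMockBlobStore()

	// Any Get whose path matches this regex will receive the queued reader.
	mock.QueueGetResponse(".*training.tar.gz", strings.NewReader("fake archive"))

	reader, err := mock.Get("some-dataset-id/training.tar.gz")
	if err != nil {
		log.Fatal(err)
	}

	content, err := ioutil.ReadAll(reader)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(content)) // prints: fake archive
}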

func (*MockBlobStore) Rm

func (m *MockBlobStore) Rm(fn string) error

type NsqJobScheduler

type NsqJobScheduler struct {
	Configuration Configuration
}

func NewNsqJobScheduler

func NewNsqJobScheduler(c Configuration) *NsqJobScheduler

func (NsqJobScheduler) ScheduleJob

func (j NsqJobScheduler) ScheduleJob(jobDescriptor JobDescriptor) error

type NsqWorker

type NsqWorker struct {
	Configuration Configuration
}

A worker which pulls jobs off NSQ and processes them

func NewNsqWorker

func NewNsqWorker(c Configuration) *NsqWorker

func (NsqWorker) HandleEvents

func (n NsqWorker) HandleEvents()

type Processable

type Processable interface {
	GetProcessingState() ProcessingState
	SetProcessingState(newState ProcessingState)
	RefreshFromDB(db couch.Database) error
}

type ProcessingState

type ProcessingState int

For objects that require processing, like Dataset objects, the ProcessingState helps track the current state of processing.

const (
	Pending ProcessingState = iota
	Processing
	FinishedSuccessfully
	Failed
)

func (ProcessingState) MarshalJSON

func (p ProcessingState) MarshalJSON() ([]byte, error)

func (*ProcessingState) UnmarshalJSON

func (p *ProcessingState) UnmarshalJSON(bytes []byte) error

Custom Unmarshal so that "Pending" is mapped to the numeric ProcessingState
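
A sketch of a round trip through JSON. The exact marshaled string is an assumption -- presumably one of the PROCESSING_STATE_* constants defined above.

package main

import (
	"encoding/json"
	"fmt"
	"log"

	et "github.com/tleyden/elastic-thought"
)

func main() {
	encoded, err := json.Marshal(et.Pending)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(encoded)) // presumably "pending"

	var decoded et.ProcessingState
	if err := json.Unmarshal(encoded, &decoded); err != nil {
		log.Fatal(err)
	}
	fmt.Println(decoded == et.Pending) // true
}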

type QueueType

type QueueType int
const (
	Nsq QueueType = iota
	Goroutine
)

type ResponseQueue

type ResponseQueue []io.Reader

type Runnable

type Runnable interface {
	Run(wg *sync.WaitGroup)
}

func CreateJob

func CreateJob(config Configuration, jobDescriptor JobDescriptor) (Runnable, error)

Create a new job based on the Job Descriptor
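
A sketch of how a worker might use this to turn a descriptor pulled off the queue into a concrete job. That Run itself calls wg.Done on completion is an assumption based on the Run(wg *sync.WaitGroup) signature; the doc id is hypothetical.

package main

import (
	"log"
	"sync"

	et "github.com/tleyden/elastic-thought"
)

func main() {
	config := *et.NewDefaultConfiguration()

	// A descriptor as it might arrive off the queue.
	jobDescriptor := et.NewJobDescriptor("doc-id-from-queue")

	job, err := et.CreateJob(config, *jobDescriptor)
	if err != nil {
		log.Fatal(err)
	}

	var jobsOutstanding sync.WaitGroup
	jobsOutstanding.Add(1)
	go job.Run(&jobsOutstanding) // assumption: Run calls wg.Done when finished
	jobsOutstanding.Wait()
}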

type Solver

type Solver struct {
	ElasticThoughtDoc
	DatasetId           string `json:"dataset-id"`
	SpecificationUrl    string `json:"specification-url" binding:"required"`
	SpecificationNetUrl string `json:"specification-net-url" binding:"required"`

	// had to make exported, due to https://github.com/gin-gonic/gin/pull/123
	// waiting for this to get merged into master branch, since go get
	// pulls from master branch.
	Configuration Configuration

	// distinguish between IMAGE_DATA, LEVELDB, LMDB, etc.
	// this assumes that test and training input layers are
	// of the same layer type.
	LayerType LayerType
}

A solver can generate trained models, which can be used to make predictions

func NewSolver

func NewSolver(config Configuration) *Solver

Create a new solver. If you don't use this, you must set the embedded ElasticThoughtDoc Type field.

func (Solver) DownloadSpecToBlobStore

func (s Solver) DownloadSpecToBlobStore(db couch.Database, blobStore BlobStore) (*Solver, error)

Download the contents of solver-spec-url into cbfs://<solver-id>/spec.prototxt and update the solver object's solver-spec-url with the cbfs url

func (Solver) Insert

func (s Solver) Insert(db couch.Database) (*Solver, error)

Insert into database (only call this if you know it doesn't already exist, or else you'll end up with unwanted dupes)

func (Solver) Save

func (s Solver) Save(db couch.Database) (*Solver, error)

Saves the solver to the db, returns latest rev

func (Solver) SaveTrainTestData

func (s Solver) SaveTrainTestData(config Configuration, destDirectory string) (trainingLabelIndex []string, err error)

Download and untar the training and test .tar.gz files associated with the solver, as well as the index files.

Returns the label index (each label indexed by its numeric label id), and an error or nil

func (Solver) SpecificationNetUrlPath

func (s Solver) SpecificationNetUrlPath() (string, error)

func (Solver) SpecificationUrlPath

func (s Solver) SpecificationUrlPath() (string, error)

If the specification url is "cbfs://foo/bar.txt", return "/foo/bar.txt"

type TestDataset

type TestDataset struct {
	DatafileID      string  `json:"datafile-id" binding:"required"`
	SplitPercentage float64 `json:"split-percentage"`
	Url             string  `json:"url"`
}

type TrainingDataset

type TrainingDataset struct {
	DatafileID      string  `json:"datafile-id" binding:"required"`
	SplitPercentage float64 `json:"split-percentage"`
	Url             string  `json:"url"`
}

type TrainingJob

type TrainingJob struct {
	ElasticThoughtDoc
	ProcessingState ProcessingState `json:"processing-state"`
	ProcessingLog   string          `json:"processing-log"`
	UserID          string          `json:"user-id"`
	SolverId        string          `json:"solver-id" binding:"required"`
	StdOutUrl       string          `json:"std-out-url"`
	StdErrUrl       string          `json:"std-err-url"`
	TrainedModelUrl string          `json:"trained-model-url"`
	Labels          []string        `json:"labels"`

	// had to make exported, due to https://github.com/gin-gonic/gin/pull/123
	// waiting for this to get merged into master branch, since go get
	// pulls from master branch.
	Configuration Configuration
}

A training job represents a "training session" of a solver against training/test data

func NewTrainingJob

func NewTrainingJob(c Configuration) *TrainingJob

Create a new training job. If you don't use this, you must set the embedded ElasticThoughtDoc Type field.

func (TrainingJob) Failed

func (j TrainingJob) Failed(db couch.Database, processingErr error) error

Update the state to record that it failed. Codereview: de-dupe

func (*TrainingJob) Find

func (j *TrainingJob) Find(id string) error

Find a training job in the db with the given id, or return an error if not found

func (TrainingJob) FinishedSuccessfully

func (j TrainingJob) FinishedSuccessfully(db couch.Database, logPath string) error

Update the state to record that it succeeded. Codereview: de-dupe

func (*TrainingJob) GetProcessingState

func (j *TrainingJob) GetProcessingState() ProcessingState

func (TrainingJob) Insert

func (j TrainingJob) Insert(db couch.Database) (*TrainingJob, error)

Insert into database (only call this if you know it doesn't already exist, or else you'll end up with unwanted dupes). TODO: use same approach as Classifier#Insert(). Codereview: de-dupe

func (*TrainingJob) RefreshFromDB

func (j *TrainingJob) RefreshFromDB(db couch.Database) error

func (*TrainingJob) Run

func (j *TrainingJob) Run(wg *sync.WaitGroup)

Run this job

func (*TrainingJob) SetProcessingState

func (j *TrainingJob) SetProcessingState(newState ProcessingState)

func (*TrainingJob) UpdateLabels

func (j *TrainingJob) UpdateLabels(labels []string) (bool, error)

func (*TrainingJob) UpdateProcessingLog

func (j *TrainingJob) UpdateProcessingLog(val string) (bool, error)

func (*TrainingJob) UpdateProcessingState

func (j *TrainingJob) UpdateProcessingState(newState ProcessingState) (bool, error)

Update the processing state to new state.

type User

type User struct {
	ElasticThoughtDoc
	Username string `json:"username"`
	Email    string `json:"email"`
	Password string `json:"password"`
}

An ElasticThought user.

func AuthenticateUser

func AuthenticateUser(db couch.Database, username, password string) (*User, error)

Does this username/password combo exist in the database? If so, return the user. If not, return an error.

func NewUser

func NewUser() *User

Create a new User

func NewUserFromUser

func NewUserFromUser(other User) *User

Create a new User based on values in another user

func (User) DocId

func (u User) DocId() string

The doc id of this user. If the username is "foo", the doc id will be "user:foo"

Directories

Path			Synopsis
caffe			Package caffe is a generated protocol buffer package.
cli
    elastic-thought	Command line utility to launch the ElasticThought REST API server.
    worker		Command line utility to launch an ElasticThought worker.
docker
