distil

Version: v0.0.0-...-0d40728
Published: Dec 12, 2021 License: Apache-2.0 Imports: 30 Imported by: 0


  • AutoML Server: automated machine learning server component that implements the D3M API.
  • Primitives: set of primitives created for use by Distil as steps in a D3M pipeline and included in the base D3M image.
  • Primitives Addendum: set of primitives created for use by Distil as steps in a D3M pipeline and not included in the base D3M image.

Dependencies

  • Git and Git LFS: version control software.
  • Go: programming language binaries, with the GOPATH environment variable set and $GOPATH/bin in your PATH.
  • NodeJS: JavaScript runtime.
  • Docker: container platform.
  • Docker Compose (optional): for managing multi-container dev environments.
  • GDAL: v2.4.2 or later, for geospatial data access. Available as a package for most Linux distributions, and for macOS through Homebrew.
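
As a quick sanity check before building, the dependency list above can be verified with a short script. This is only a sketch: the helper names missing_tools and path_contains are hypothetical, and the binary names (git-lfs, node, gdal-config) are assumptions about how each dependency is exposed on your PATH.

```shell
#!/bin/sh
# Print each named command that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Succeed if directory $1 is one of the entries in PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumed binary names: git-lfs for Git LFS, node for NodeJS, gdal-config for GDAL.
missing=$(missing_tools git git-lfs go node docker gdal-config)
if [ -n "$missing" ]; then
  echo "Missing from PATH:" $missing
else
  echo "All required tools were found."
fi

# GOPATH is read from the environment; nothing is assumed about its value.
if [ -z "${GOPATH:-}" ]; then
  echo "GOPATH is not set."
elif ! path_contains "$GOPATH/bin"; then
  echo "$GOPATH/bin is not in PATH."
fi
```

The script only reports what is missing; it does not install anything or check versions.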

Development

Clone the repository:
mkdir -p $GOPATH/src/github.com/uncharted-distil
cd $GOPATH/src/github.com/uncharted-distil
git clone git@github.com:unchartedsoftware/distil.git
cd distil
Install dependencies:
make install
Install datasets:

Datasets are stored using Git LFS and can be pulled using the datasets.sh script:

./datasets.sh

To add or remove a dataset, modify the datasets variable:

declare -a datasets=("185_baseball" "LL0_acled" "22_handgeometry")
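
Editing the line above by hand is normally all that is needed. For reference, the same change can be expressed with bash array operations, as in the sketch below. The dataset name my_new_dataset is hypothetical, and nothing here reflects what datasets.sh itself does with the list.

```shell
#!/usr/bin/env bash
# Dataset list as shown above (bash array syntax).
declare -a datasets=("185_baseball" "LL0_acled" "22_handgeometry")

# Add an entry. "my_new_dataset" is a hypothetical name, for illustration only.
datasets+=("my_new_dataset")

# Remove an entry by rebuilding the array without it.
remove="LL0_acled"
declare -a kept=()
for d in "${datasets[@]}"; do
  [ "$d" = "$remove" ] || kept+=("$d")
done
datasets=("${kept[@]}")

# List the resulting datasets, one per line.
printf '%s\n' "${datasets[@]}"
```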
Generate code (optional):

If the api/compute/result/complex_field.peg file is changed, regenerate the pandas DataFrame parser by running:

make peg
Docker images:

The application requires:

  • ElasticSearch
  • PostgreSQL
  • TA2 Pipeline Server Stub

Docker images for each are available at the following registry:

docker.uncharted.software
Log in to the Docker registry:
sudo docker login docker.uncharted.software
Update docker-compose.yml
---
distil-auto-ml:
  image: docker.uncharted.software/distil-auto-ml
Pull Images:

Pull docker images via Docker Compose:

./update_services.sh
Running the app:

Using three separate terminals:

Terminal 1 - Launch docker containers via Docker Compose:
./run_services.sh
Terminal 2 - Build and watch webapp:
yarn watch

The app will be accessible at localhost:8080.

Terminal 3 - Build, watch, and run server:
make watch
Advanced Configuration

The location of the dataset directory can be changed by setting the D3MINPUTDIR environment variable, and the location of the temporary data written during model building can be set with the D3MOUTPUTDIR environment variable. If the Docker containers are not running on localhost, their host IP address can be set with DOCKER_HOST (e.g. export DOCKER_HOST=192.168.0.10 && make watch). These variables are used by the other Distil services launched via the run_services.sh script, and are typically set as global environment variables in .bashrc or similar.
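
A minimal sketch of what the corresponding lines in .bashrc might look like follows. The directory paths are hypothetical placeholders, not defaults used by Distil; substitute the locations used on your machine.

```shell
# Hypothetical locations, used only when the variables are not already set.
export D3MINPUTDIR="${D3MINPUTDIR:-$HOME/distil/datasets}"
export D3MOUTPUTDIR="${D3MOUTPUTDIR:-$HOME/distil/outputs}"

# Only needed when the Docker containers are not running on localhost.
# export DOCKER_HOST=192.168.0.10

echo "datasets: $D3MINPUTDIR"
echo "outputs:  $D3MOUTPUTDIR"
```

The `${VAR:-default}` form keeps any value that is already exported, so a one-off override on the command line still takes precedence.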

Linter Setup
VS Code

For the VS Code editor, download and install the ESLint extension. Once it is installed, open the editor settings (hotkey ⌘⇧P, then type "settings") and add the following to your settings file:

  "eslint.lintTask.enable": true, // enable eslint to run
  "eslint.validate": [
    "vue", // tell eslint to read vue files
    "html", // tell eslint to read html files
    "javascript", // tell eslint to read javascript files
    "typescript" // tell eslint to read typescript files
  ],
  "eslint.workingDirectories": [{ "mode": "auto" }], // eslint will try its best to figure out the working directory of the project

Save your settings file and restart VS Code. If the linter is still not working after the restart, check the output panel (^⇧`, then the OUTPUT tab, then select ESLint from the dropdown).

Common Issues:

"../repo/subpackage/file.go:10:2: cannot find package "github.com/company/package/subpackage" in any of":
  • Cause: Dependencies are out of date or have not been installed
  • Solution: Run make install to install the latest dependencies.
"# pkg-config --cflags -- gdal gdal gdal gdal gdal gdal Package gdal was not found in the pkg-config search path."
  • Cause: GDAL has not been installed
  • Solution: Install GDAL using a package for your environment or download and build from source.
Mac
Runtime error while training: "joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker."
  • Cause: Not enough resources allocated to Docker.
  • Solution: Increase the Docker resources to the recommended settings: CPU: 10, RAM: 10 GB, Swap: 2.5 GB, Disk Image Size: 64 GB.
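
For the pkg-config error listed above, the following sketch checks whether pkg-config can see GDAL and whether the installed version meets the v2.4.2 minimum given under Dependencies. The helper name version_at_least is hypothetical, and sort -V is a GNU coreutils extension that may not be available on every system.

```shell
#!/bin/sh
# Succeed if dotted version $1 is greater than or equal to dotted version $2.
# Relies on "sort -V" (version sort), which is assumed to be available.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Ask pkg-config whether a gdal package is registered, then compare versions.
if pkg-config --exists gdal 2>/dev/null; then
  found=$(pkg-config --modversion gdal)
  if version_at_least "$found" 2.4.2; then
    echo "GDAL $found found."
  else
    echo "GDAL $found found, but v2.4.2 or later is required."
  fi
else
  echo "GDAL was not found in the pkg-config search path."
fi
```

If GDAL is installed but still not found, the directory containing its .pc file may need to be added to the PKG_CONFIG_PATH environment variable.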
