archer

command module
v0.1.1
Published: Apr 7, 2021 License: MIT Imports: 1 Imported by: 0

README

Archer

Artic Resource for Classifying, Honing & Exporting Reads




About

This is a basic microservice that is used to pre-process data before running CLIMB workflows. It has a gRPC API that supports start/cancel/watch of sample processing tasks, and includes a command line application for running a server and client implementation (called archer).

If a user wants CLIMB to run the artic pipeline on their data, the data needs to be checked locally and uploaded to an S3 bucket before CLIMB will process it. This is where archer comes in. It will:

  • validate a sample as provided in a ProcessRequest for minimal metadata
  • filter reads linked to that sample against the amplicon primer scheme
  • compress all on-target reads and upload to S3
  • report back
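
The four steps above can be sketched as a small Go pipeline. This is only an illustration under stated assumptions, not Archer's actual implementation: the Sample type, its fields, the Uploader function type and the read-length placeholder filter are all hypothetical, and the upload step is a no-op rather than a real S3 call.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
)

// Sample is a hypothetical stand-in for the sample carried by a ProcessRequest.
type Sample struct {
	ID     string
	Scheme string   // amplicon primer scheme name (assumed field)
	Reads  []string // read sequences, held in memory for brevity
}

// Uploader abstracts the upload step so the sketch needs no S3 client.
type Uploader func(key string, data []byte) error

// validate checks that the minimal metadata is present.
func validate(s Sample) error {
	switch {
	case s.ID == "":
		return errors.New("sample has no ID")
	case s.Scheme == "":
		return errors.New("sample has no primer scheme")
	case len(s.Reads) == 0:
		return errors.New("sample has no reads")
	}
	return nil
}

// filter keeps only the reads that onTarget accepts.
func filter(reads []string, onTarget func(string) bool) []string {
	var kept []string
	for _, r := range reads {
		if onTarget(r) {
			kept = append(kept, r)
		}
	}
	return kept
}

// compress gzips the reads, one per line.
func compress(reads []string) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	for _, r := range reads {
		if _, err := zw.Write([]byte(r + "\n")); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// process runs validate, filter, compress and upload, then returns a report.
func process(s Sample, onTarget func(string) bool, upload Uploader) (string, error) {
	if err := validate(s); err != nil {
		return "", fmt.Errorf("validate: %w", err)
	}
	kept := filter(s.Reads, onTarget)
	data, err := compress(kept)
	if err != nil {
		return "", fmt.Errorf("compress: %w", err)
	}
	if err := upload(s.ID+".gz", data); err != nil {
		return "", fmt.Errorf("upload: %w", err)
	}
	return fmt.Sprintf("%s: kept %d of %d reads", s.ID, len(kept), len(s.Reads)), nil
}

func main() {
	s := Sample{ID: "sample-1", Scheme: "scheme-x", Reads: []string{"ACGTACGT", "AC"}}
	onTarget := func(r string) bool { return len(r) >= 4 }  // placeholder filter
	upload := func(string, []byte) error { return nil }      // no-op upload
	report, err := process(s, onTarget, upload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(report)
}
```

Passing the filter and the uploader in as functions keeps each step testable on its own, which is one reasonable way to structure such a service.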

Dependencies

As well as the external Go packages listed in go.mod, the following tools and packages are required to build the microservice executable and documentation:

  • Make
  • Go toolchain
  • protoc
  • protoc-gen-go
  • protoc-gen-doc

Installing

Easy installation is handled by the Makefile:

make all

This command will:

  • compile the proto files for Go
  • compile the gRPC API docs
  • run fmt, lint and vet tools on the Go code
  • run the unit tests
  • build the Go executable

WIP: There is also a containerised version of the service, which is built via a GitHub Action. It can be obtained from Docker Hub but is currently neither tested nor supported:

docker pull willrowe/archer:latest
docker run -p 9090:9090 willrowe/archer:latest

Testing

Unit tests are available for the service implementation. In addition, several Go tools (lint, vet, fmt) are used to check the codebase. These can all be run separately using:

make test
make lint
make vet

A GitHub Action is used to run continuous integration testing using the above make commands on Linux and macOS.

To test the gRPC code without having to connect to a real server we use the mock package; the mock class was generated using:

mockgen github.com/will-rowe/archer/pkg/api/v1 ArcherClient > pkg/mock/client_mock.go

Running

Client and server implementations of the Archer microservice are available in a single binary called archer, which can be found in ./bin after installation.

To run the server:

archer launch

To run the watch client:

archer watch

To run the process client:

cat sample.json | archer process

Documentation

API documentation can be found here. Implementation documentation can be found here.

Limitations/TODOs

  • the S3 bucket upload is limited at the moment and will need improving before this goes into production
  • might be worth adding an option to daemonise the server
  • there is no index for the amplicon bottom-k sketches, so each read is compared against all ~90 amplicon sketches and the best match is picked, which is inefficient
  • read length filtering and Jaccard filtering are hard coded at the moment
  • there is no containment search etc., just a basic similarity comparison between an amplicon sketch and a read sketch
  • if something goes wrong during sample processing, the sample is just marked as errored and there is no attempt to retry or fix it
    • it might be worth cycling through the db on service start and trying to re-process anything marked as errored; errors definitely need to be reported back to the user
  • integrate with herald
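
The sketch comparison mentioned in the list above can be illustrated with a minimal bottom-k (KMV) MinHash in Go. This is a sketch under stated assumptions, not the code in the minhash package: the FNV-1a hash, the function names, the k-mer size and the sketch size are all chosen for illustration, and the read-length and Jaccard thresholds are passed as parameters rather than hard coded.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// sketch returns the `size` smallest distinct hashes over all k-mers of seq.
func sketch(seq string, kmer, size int) []uint64 {
	seen := make(map[uint64]struct{})
	for i := 0; i+kmer <= len(seq); i++ {
		h := fnv.New64a()
		h.Write([]byte(seq[i : i+kmer]))
		seen[h.Sum64()] = struct{}{}
	}
	hashes := make([]uint64, 0, len(seen))
	for v := range seen {
		hashes = append(hashes, v)
	}
	sort.Slice(hashes, func(i, j int) bool { return hashes[i] < hashes[j] })
	if len(hashes) > size {
		hashes = hashes[:size]
	}
	return hashes
}

// jaccard estimates similarity from two sorted bottom-k sketches: walk the
// `size` smallest values of their union and count those present in both.
func jaccard(a, b []uint64, size int) float64 {
	i, j, union, shared := 0, 0, 0, 0
	for union < size && i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]:
			shared++
			i++
			j++
		case a[i] < b[j]:
			i++
		default:
			j++
		}
		union++
	}
	// One sketch may be exhausted; the rest of the other is union-only.
	for ; union < size && i < len(a); i++ {
		union++
	}
	for ; union < size && j < len(b); j++ {
		union++
	}
	if union == 0 {
		return 0
	}
	return float64(shared) / float64(union)
}

// bestAmplicon compares a read against every amplicon sketch (no index) and
// returns the best index, or -1 if the read is too short or scores too low.
func bestAmplicon(read string, amplicons [][]uint64, kmer, size, minLen int, minJaccard float64) (int, float64) {
	if len(read) < minLen {
		return -1, 0
	}
	rs := sketch(read, kmer, size)
	best, bestScore := -1, 0.0
	for idx, as := range amplicons {
		if s := jaccard(rs, as, size); s > bestScore {
			best, bestScore = idx, s
		}
	}
	if bestScore < minJaccard {
		return -1, bestScore
	}
	return best, bestScore
}

func main() {
	// Made-up sequences standing in for two amplicons.
	a := sketch("ACGTACGGTCAAGCTTAGGC", 5, 8)
	b := sketch("TTTTGGGGCCCCAAAATTGG", 5, 8)
	idx, score := bestAmplicon("ACGTACGGTCAAGCTTAGGC", [][]uint64{b, a}, 5, 8, 10, 0.5)
	fmt.Println(idx, score)
}
```

The linear scan in bestAmplicon is the part the missing index would replace, and a containment search would swap the union-based denominator in jaccard for the size of the smaller set.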

Documentation

Overview

Copyright © 2021 Will Rowe

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Directories

Path: Synopsis

  • Package cmd is the command line interface for the Archer microservice.
  • pkg/amplicons: Package amplicons performs the amplicon scheme validation, handling and filtering for Archer.
  • pkg/bucket: Package bucket manages the AWS S3 bucket uploads.
  • pkg/minhash: Package minhash is a simple KMV MinHash implementation.
  • pkg/mock: Package mock_v1 is a generated GoMock package.
  • pkg/protocol/grpc: Package grpc is the gRPC server implementation which runs the Archer service.
  • pkg/service/v1: Package service implements the Archer service API.
  • pkg/version: Package version is used to access the current CLI version.
