util-exporter-server

module
v0.7.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 20, 2022 License: Apache-2.0

README

= Exporter Script Server
// General Doc Settings
:toc: left
:source-highlighter: highlightjs
:icons: font
// Github specifics
ifdef::env-github[]
:toc: preamble
:tip-caption: :bulb:
:note-caption: :information_source:
:important-caption: :heavy_exclamation_mark:
:caution-caption: :fire:
:warning-caption: :warning:
endif::[]
Elizabeth Paige Harper <epharper@upenn.edu>
v1.0.0

// Custom Config
:repo-url: https://github.com/VEuPathDB/util-user-dataset-handler-server
:site-url: https://veupathdb.github.io/util-user-dataset-handler-server
:repo-file-base: {repo-url}/blob/master

image:https://www.travis-ci.org/VEuPathDB/util-user-dataset-handler-server.svg?branch=master["Build Status", link="https://www.travis-ci.org/VEuPathDB/util-user-dataset-handler-server"]
image:https://goreportcard.com/badge/github.com/VEuPathDB/util-user-dataset-handler-server["Go Report Card", link="https://goreportcard.com/report/github.com/VEuPathDB/util-user-dataset-handler-server"]
image:https://img.shields.io/github/v/release/VEuPathDB/util-user-dataset-handler-server["Latest Release", link="https://github.com/VEuPathDB/util-user-dataset-handler-server/releases/latest"]


Exposes configured exporter / validation scripts over HTTP.

ifdef::env-github[]
{site-url}[Rendered Readme] |
endif::[]
{site-url}/api.html[API Docs]

== Usage

=== In Script Container

==== Standard Use

https://github.com/VEuPathDB/dataset-handler-biom[Working example project]


==== Custom Use

To create a custom setup not based on the demo container, this server can be
configured by performing the following steps.

CAUTION: Currently, this server app only supports linux & mac environments.

. Download the latest release from the {repo-url}/releases/latest[releases page]
  and unpack the archive in your project directory.
. Run the command `./service gen-config` to generate a server configuration
  template file.  This file will be named `config.tpl.yml`.
. Edit the configuration file with your desired service name and command
  configurations.
+
NOTE: Check the {file-config-readme}[config file readme] for more
information about the configuration file.
. Once you have edited the configuration file, rename it and place it in the
  desired relative to the service binary.
. Validate the configuration by running the command
  `./service check-config --config=path/to/your/config.yml`.  This command will
  validate the configuration and report any errors.
+
.Bad Config Example
[source, bash-session]
----
$ ./service --config=config.yml check-config
----

== Script Expectations

This server is language agnostic when it comes to calling scripts.  Anything may
be used as long as it can be executed via a cli call.

=== Output

The script expectations are based on the Galaxy tooling.  This means scripts are
expected to output a `.tgz` file suitable for being directly loaded into iRODS.

The pattern for the tar file output name must match the iRODS trigger's expected
format of `dataset_u\{user-id}_t\{timestamp-int64}_p\{process-id}.tgz`.  For
example: `dataset_u12345_t1647364816_p132.tgz`.

This means the tar file must contain a `meta.json`, `dataset.json`, and a
directory named `datafiles` containing the actual dataset file(s).

This server will automatically pick up that file and return it to the udis
service to be loaded into iRODS.

The reason for this is the HTTP server which unpacks and starts the command does
not know what belongs or doesn't belong in the iRODS tar.

=== Error handling

A script can indicate a fatal error by exiting with a non-zero status code.  An
error message will be generated using the text printed to `stderr` by the
script.

Errors may be plaintext, a json object, or plaintext followed by a json object.

To indicate an error is a user error and not a script failure, a json object
error must be returned.

.Valid `stderr` output examples
[source, shell script]
----
# A plaintext error
Some error message that will be recorded as a script/server error.

# Error output followed by a final fatal error.
Some error text followed by a {"error": "json", "message": "object"}

# A structured error
{"error": "user", "message": "some user error"}
----

When returning both plaintext and a json object, the plaintext will be prepended
to the error message in the json output.

For example:

.Raw `stderr` output
----
Some error messages
Logged while executing the script
{"error": "fatal", "message": "failed to open input file"}
----

.Recorded error
[source, json]
----
{
  "error": "fatal",
  "message": "Some error messages\nLogged while executing the script\nfailed to open input file"
}
----


==== Json Object Error Format

[source,json]
----
{
  "error": "user", <1>
  "message": "some error text" <2>
}
----
<1> The `"error"` field is used to define the type of error.  A value of
    `"user"` indicates that the error was a user error and the user import
    service will record the error as such.  Any other value will be recorded as
    a script error.
<2> The `"message"` field defines the error message to record.

CAUTION: Both fields are required.

== Config File

=== Components

.config.yml
[source, yaml, linenums]
----
service-name: my service <1>
command:
  executable: /some/path/to/your/executable <2>
  args: <3>
    - <<input-files>> <4>
----
<1> The display name of the running service
<2> Path to an executable script or binary to be called by the server to process
    uploaded datasets.
<3> An array of arguments that will be passed to the configured executable.
    Each array element is effectively a single, quote wrapped cli arg.
+
WARNING: Space separated entries in the args array _will_ be treated as a single
         argument.
<4> An <<Variables,Injected Variable>>

=== Command Context

The configured command will be executed in an isolated subshell, but will be
provided the same environment as the server itself, meaning it is possible to
set environment variables for the script simply by setting them on the docker
container.

The output of the script's `stdout` will be piped through the server's logging
mechanism and will appear in the container logs.

The output of the script's `stderr` will be both piped through the server's
logging mechanism (like `stdout`) but will also be captured for parsing and
returning to the caller.

=== Variables

==== `+<<cwd>>+`

The working directory for the job that is running in the current request.

.Example
----
/workspace/12345
----

==== `+<<date>>+`

The current date formatted as `YYYY-MM-DD`.

.Example
----
1994-02-13
----

==== `+<<date-time>>+`

The current datetime in RFC3339 format.

.Example
----
2018-10-31T23:37:18.013557+0500
----

==== `+<<ds-description>>+`

The user provided description for the current dataset upload.

.Example
----
My dataset upload containing foo and bar
----

==== `+<<ds-name>>+`

The user provided name for the current dataset upload.

.Example
----
My Dataset 3
----

==== `+<<ds-summary>>+`

The user provided summary of the current dataset upload.

.Example
----
Some summary text for my dataset upload
----

==== `+<<ds-origin>>+`

The source/origin of the user dataset should be either `galaxy` or `direct-upload`.

.Example
----
direct-upload
----

==== `+<<ds-user-id>>+`

WDK user ID of the user that uploaded the dataset.

.Example
----
123456
----

==== `+<<input-files>>+`

A space separated list of the files that were unpacked from the uploaded zip or
tar file sorted by name ascending.

===== Examples

[#upload-tgz]
.Upload Contents
----
dataset.tgz
 ├─ foo.txt
 ├─ bar.xml
 ├─ fizz.json
 └─ buzz/
     └─ fazz.yml
----

.+<<input-files>>+
----
bar.xml buzz fizz.json foo.txt
----

===== `+<<input-files[n]>>+`

Array style access of the input file list allowing retrieval of a single file
name from the input file list.

====== Examples

These use <<#upload-tgz, this example input>>.

.+<<input-files[0]>>+
----
bar.xml
----

.+<<input-files[2]>>+
----
fizz.json
----

==== `+<<time>>+`

The current time formatted as `HH:MM:SS`.

.Example
----
03:47:58
----

==== `+<<timestamp>>+`

The current unix timestamp in seconds.

.Example
----
783647299
----

==== `<<projects>>`

A comma-separated list of project identifiers.

.Example
----
VectorBase,PlasmoDB
----

==== `+<<handler-params.p1>>+`

Access of handler-specific parameter values from the request body of the import request.

===== Examples
.Request Body
[source, json]
----
{
    "datasetName": "foo",
    "projects": [
        "PlasmoDB"
    ],
    "datasetType": "gene-list",
    "origin": "direct-upload",
    "formatParams": {
        "genome": "test-genome-1",
        "otherParam": "other-param-value"
    }
}
----

.+<<handler_params.genome>>+
----
test-genome-1
----

.+<<handler_params.otherParam>>+
----
other-param-value
----

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL