medifor

command
v0.0.0-...-5a762db Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 26, 2021 License: Apache-2.0 Imports: 1 Imported by: 0

README

MediFor command-line utility.

The medifor utility is useful for sending requests to analytics; it packages up images or videos into a format that is understood by the gRPC interface and asks the analytic to do the detection.

Basic Use

Run the utility with the help command to get help about usage.

Many of the commands attempt to connect to an analytic server. The default host and port for that server is :50051, which means "port 50051 on the local host, visible to the local network". You can change it using the --host flag if needed.

If you want to check that you can talk to your analytic service at all, use the health command to see if you can connect to it and it returns a "serving" status. That can be really helpful for debugging.

If you are running your service inside of a container and the health check is not coming back properly, try running the container without detaching so you can see its output logs easily, and run the health check against it from a different terminal. That can help you find bugs.

Running Single-File Detections

The medifor detect command has several subcommands, including imgmanip, vidmanip, and others. These do essentially what you think they do: medifor detect imgmanip sends an image manipuation detection request to your analytic, and outputs the response (which will point to a mask file if you write one).

The output can be formatted as JSON or as a text-serialized protocol buffer. Both are readable formats.

Defaults Are Not Shown

If a value in your output is the default "zero" value, it will not be displayed. That is fine and expected. Defaults are never sent over the wire because the absence of a value means, in proto3, that it has its default value. Thus, the absence of an int will be seen as that int having the value 0. The absence of a string means it has the value "", etc.

If you do not use a value, leave it as its default. Resist the temptation to specify something like -1 to mean "I don't set this." Just leave things as their defaults when unused. The protocol is set up to make it easy, for example, to specify that you are not providing a score. That's what OPT_OUT_DETECTION means. Use those fields that indicate inaction to specify inaction, rather than producing magic values.

Running Batch Detections

To run a set of detections, e.g., image manipulations, we have provided the medifor detect batch command. This accepts a single .csv file as input, using the NIST input CSV specification, where the following columns are present (for the image manipulation "MDL" task, at least---see NIST documentation for more details):

  • TaskID
  • ProbeFileID
  • ProbeFileName
  • ProbeWidth
  • ProbeHeight
  • HPDeviceID (usually not present)
  • HPSensorID (usually not present)

The detect batch command will read the rows from a CSV of this format, and attempt to run requests for each row against your analytic. The output is a JSON-serialized Detection protocol buffer, which contains information like when the request was made, what status it returned, the contents of the request, and the contents of the response.

In normal use, this output should be written to a file, but if you do not specify an output file, it is written to STDOUT. If you only redirect STDOUT to a file (not STDERR!), you will capture the structured log output.

To run the batch detection command, you might choose to use the sample assets in this project, e.g.,

$ medifor detect batch assets/manipulations.csv output.json

This instructs the tool to read its tasks from the given CSV file and output its results to the specified output.json file.

Packaging Batch Results for Validation/Scoring

To package up the results of your batch run into a format accepted by NIST (which we use for validating take-home results, for example), you can use the nist package command, like so:

$ medifor nist package output.json my-batch-output.tar.gz

This also takes two arguments, namely the JSON log from a detect batch command, and the name of a new tarball that it will create. This command looks through all of the entries in output.json, finds all of the corresponding output mask files, and packages them all into a NIST tarball containing a CSV file representing those rows, and a mask directory containing the mask files.

When submitting results, we strongly suggest that you keep your original files, including the JSON log for debugging purposes, and send the tarball to the program-specified submission location.

Sending Requests to an Analytic Service Container

Note that any request you send will necessarily reference files. Those file paths need to be either visible from the analytic's perspective, or translatable into a path that is thus visible.

In other words, if you run something like

$ medifor detect imgmanip /path/to/images/img.png

The analytic server needs to be able to see /path/to/images/img.png. If it's running in a container, you might have mounted that directory somewhere else, in which case you need to do path translation.

Let's suppose that you have set up a mount point in your container like this (host -> container):

  • Host: /path/to/images
  • Container: /data/images

You can still call medifor detect imgmanip /path/to/images, but you will need to supply a flag telling it how the mount point is mapped:

$ medifor -s /path/to/images -t /data/images detect imgmanip /path/to/images

There are similar flags for mapping the output directory, as well. So, for example, if you want the output mask images written to /path/to/outputs, but the container has that location mapped to /data/outputs, you would specify additional flags like this:

$ medifor -s /path/to/images -t /data/images \
          -S /path/to/outputs -T /data/outputs \
          detect imgmanip /path/to/images/img.png

This all assumes that you have run your analytic inside of a docker container with appropriate volume mounts. Here's how you might go about doing that:

docker run \
  --mount type=bind,writable=false,source=/path/to/images,target=/data/images \
  --mount type=bind,writable=true,source=/path/to/outputs,target=/data/outputs \
  --publish 50051:50051 \
  --detach my_analytic_image

That will start up my_analytic_image in a container that can see files in /data/images and can write to /data/outputs, as well as publishing port 50051 to the host network.

Then you can run medifor on the host, provided that you only reference images in /path/to/images and its subdirectories.

File mappings can get people turned around, so just remember this: they mirror your mounts exactly. Thus, if you have mounts like the following from your host to your container:

  • /host/inputs -> /container/inputs
  • /host/outputs -> /container/outputs

Then you can specify those mappings in the medifor command with these flags:

-s /host/inputs -t /container/inputs -S /host/outputs -T /container/outputs

If you get tired of typing that out all the time, you can use a config YAML file, like so:

source: /host/inputs
target: /container/inputs
out_source: /host/outputs
out_target: /container/outputs

and then you can refer to the config file using the --config flag, or drop it in $HOME/.config/medifor.yml.

Building

This utility can be built from the root of this repository by issuing

$ go build ./...

This creates a release binary for your platform inside of the release directory.

Documentation

Overview

Binary medifor sends requests to a medifor gRPC service and receives replies.

Directories

Path Synopsis
Package cmd contains individual commands used in the medifor command line client.
Package cmd contains individual commands used in the medifor command line client.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL