rais-image-server

module
v2.10.1+incompatible Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jan 17, 2018 License: CC0-1.0

README

Go Report Card

RAIS Image Server

RAIS was originally built by eikeon as a 100% open source, no-commercial-products-required, proof-of-concept tile server for JP2 images within chronam.

It has been updated to allow more command-line options, more source file formats, more features, conformance to the IIIF spec, and experimental DeepZoom support.

The University of Oregon's primary use case is the Historic Oregon Newspapers project.

Setup

Docker is the preferred way to install and run RAIS.

See the manual installation instructions if you don't want to use Docker, or you want to see exactly what's going on behind the scenes.

Note that specific build and production environments can be found in the docker files and the Makefile's docker target, which may be useful for manual installation. docker/README.md describes this in a little more detail.

Dockerhub

If you pull RAIS from Dockerhub, please note that you'll be getting the latest stable version. It may not have the same features as the development version. You can look at the latest stable version in github by browsing our master branch.

For an example of running a docker image as a RAIS server, look at rundocker.sh.

On the first run, there will be a large download to get the image, but after that it will be cached locally.

Once the container has been created, it can then be started and stopped via the usual docker commands.

Docker Demo

You can run a quick demo on a Linux system by pulling the "httpd:2.4" docker image, running ./apache.sh and then visiting http://localhost. Click the IIIF link, and you'll be able to see the test JP2 in two different forms using Open Seadragon: it will look the same, but its INFO response will be slightly different from one to the other due to one of the two images having an explicitly overridden info response.

Additionally, if you put other files into docker/images, the apache.sh script will automatically add them to the OSD tile source list, allowing you to quickly test a variety of files.

Finally, you can test out the experimental DeepZoom support by choosing the second link.

Build locally

You can clone the repository if you want to create your own RAIS Docker image:

git clone https://github.com/uoregon-libraries/rais-image-server.git
cd rais-image-server
make docker

For contributors: note that make docker, in addition to creating a production image, will produce an image called "rais-build:f27" which can be used to compile and run tests. See docker/Dockerfile.build for examples of how to make this happen. Also consider using buildrun.sh to ease compiling and testing. dev.sh is also available for easing the edit-compile-run loop on a system with no JP2 libraries, where compilation has to go through docker.

Configuration

RAIS uses a configuration system that allows environment variables, a config file, and/or command-line flags. See rais-example.toml for an example of a configuration file. RAIS will use a configuration file if one exists at /etc/rais.toml.

The configuration file's values can be overridden by environment variables, while command-line flags will override both configuration files and environtmental variables. Configuration is best explained and understood by reading the example file above, which describes all the values in detail.

Using with chronam

To make this tile server work with chronam, you have two options.

You can modify chronam directly, which is easier for a quick test, but can make it tougher when chronam is updated.

For a longer-term solution, you can instead make your web server proxy all traffic for /images/tiles/ to the tile server. In Apache, you'd need to enable proxy and proxy_http mods, and add this to your config:

ProxyPass /images/tiles/ http://localhost:8888/images/tiles/

Unfortunately, the version of chronam we're using has a lot of other dynamic image URLs, so serving JP2s exclusively ended up requiring a lot of other chronam hacks. Our work isn't portable due to the extensive customizations we have done to the site, but you can see the branch merge commit where we centralized all dynamic image URLs in this commit to the oregonnews project

Using with Open ONI

RAIS works out of the box with Open ONI, a fork of chronam. No hacking required!

IIIF Features

When running as an IIIF server, you can browse to any valid Image's INFO page to see the features supported.

To use a custom info.json response, you can create a file with the same name as the JP2, with "-info.json" appended at the end. e.g., source.jp2-info.json. This can be useful for limiting features, custom resize values, etc. To keep the system working on any URL, you can set the @id value in the custom JSON to %ID%. Since IIIF ids are a full URL, changing paths, URLs, or ports will break custom info.json files unless you allow the system to fill in the ID. See testfile/info.json for an example.

To customize the capabilities for all images, a custom capabilities TOML file can be specified on the command-line via --capabilities-file [filename], the config value CapabilitiesFile, or using the environment variable RAIS_CAPABILITIESFILE. You can remove undesired capabilities from the list of what RAIS supports, which will prevent them from working if a client requests them. This can be helpful to avoid denial-of-service vectors, such as the extremely slow GIF output. See cap-max.toml for an example that shows all currently supported features.

Other than possible bugs, we are ensuring we support level 2 at a minimum, as well as a handful other features beyond level 2.

An example INFO request would look like http://example.com/iiif/source.jp2/info.json, assuming your server is at example.com, the IIIF prefix is iiif, and the file "source.jp2" exists relative to the configured tile path.

Full list of features supported:

  • Region:
    • "full"
    • "x,y,w,h": regionByPx
    • "pct:x,y,w,h": regionByPct
  • Size:
    • "full"
    • "w," / sizeByW
    • ",h" / sizeByH
    • "pct:x" / sizeByPct
    • "w,h" / sizeByForcedWH
    • "!w,h" / sizeByWH
    • "sizeAboveFull"
  • Rotation:
    • 0
    • "90,180,270" / rotationBy90s
    • "!0,!90,!180,!270" / mirroring
  • Quality:
    • "default"
    • "native" (same as "default")
    • "color"
    • "gray"
    • "bitonal"
  • Format:
    • jpg (This is the best format for a speedy encode and small download)
    • png
    • tif
    • gif (Note that this is VERY slow for some reason!)
  • HTTP Features:
    • baseUriRedirect
    • cors
    • jsonldMediaType

Caching

info.json responses

We've implemented a simple LRU cache for info.json responses, which holds 10,000 entries by default. The info.json data is very small, making this a fairly efficient cache. But the info.json data is very easy to generate, so the value of caching is minimal, and may be removed in the future.

Image responses

The server can optionally cache generated tiles under specific circumstances, but doesn't inherently cache the other images such as thumbnails. Tiles which are requested at a width of 1024, 512, or 256, in JPG format, can be cached by setting TileCacheLen in /etc/rais.toml, or the RAIS_TILECACHELEN environment variable. This defaults to not be enabled due to the cost of caching tiles compared to the benefits, particularly on large collections of images.

Setting the tile cache length to anything greater than zero will enable the cache. This is only recommended for systems with a small number of images or systems that expect a lot of traffic to hit a small subset of the collection, such as might be the case if there's a few featured images.

For resize requests such as thumbnails, caching can prove very beneficial, but for now RAIS doesn't have resize-only caching, as Apache handles this for us very effectively.

The server returns a valid Last-Modified header based on the last time the JP2 file changed, which Apache can use to create a simple disk-based cache:

# Cache thumbnails (and only thumbnails)
CacheRoot /var/cache/httpd/mod_disk_cache
CacheEnable disk /images/resize

# Allow a total of 4096 content directories at two levels so we never have
# more than 64 directories in any other directory.  If we cache a million
# thumbnails, we'll still only end up with about 250 files per content
# directory.
CacheDirLength 1
CacheDirLevels 2

# Change !RAIS_HOST! below to serve tiles and thumbnails from RAIS
AllowEncodedSlashes NoDecode
ProxyPassMatch ^/images/resize/([^/]*)/full/([0-6][0-9][0-9],.*jpg)$ http://!RAIS_HOST!:12415/images/iiif/$1/full/$2 nocanon
ProxyPassMatch ^/images/iiif/(.*(jpg|info\.json))$ http://!RAIS_HOST!:12415/images/iiif/$1 nocanon

This kind of configuration allows us to split the resize requests from other requests in order to have a reasonably intelligent disk cache, which is kept separate from the in-memory tile cache.

This won't be the smartest cache, but it will help when search results pages are used on large collections. It is highly advisable that the htcacheclean tool be used in tandem with Apache cache directives, and it's probably worth reading the Apache caching guide.

Known Limitations

RAIS was built first and foremost to serve tiles for JP2s that always have exactly six resolution factors ("zoom levels") and are tiled. It has been amazing for us within that context, but we don't know much about other uses, so outside of that context, there may be issues worth consideration.

JP2: Slow on huge files

Very large images (as in, hundreds of megapixels) can take a while to decode tiles. In some cases, 2-3 seconds per tile. Unfortunately, this seems to be a limitation of openjpeg. If serving up files of this size, external caching may be a good idea.

JP2: only supports RGB and Grayscale

YCC isn't supported directly (unless openjpeg does magic conversions for us, which we haven't tested). RGBa should work, but the alpha channel will be ignored. Embedded color profiles probably don't work, but they haven't been tested.

RAM usage should be monitored

Huge images and/or high traffic can cause the JP2 processor to chew up large amounts of RAM. The good news is that since compiling RAIS under Go 1.6, our RAM is significantly lower and more predictable than with Go 1.4.

Stats from about two months of monitoring:

  • Under Go 1.4, RAIS would slowly grow in RAM use until it was routinely above 1 gig of RAM (even when under relatively low load), with spikes above 2 gigs
  • Under Go 1.6, RAIS is typically under 80 megs of RAM, with spikes being few and far between, with the worst spike just over 400 megs

(For reference, RAIS serves about 800,000 tiles a week)

Please note: if you enable tile caching, RAM usage will increase, possibly by a very significant amount.

IIIF Support isn't perfect

The IIIF support adheres to level 2 of the spec (as well as some extra features), but it isn't as customizable as we would prefer.

When you don't provide your own info.json response (as described above), the default response's quality choices are hard-coded to include color, gray, and bitonal, even for gray/bitonal images.

It should also be noted that GIF output is amazingly slow. Given that GIF output isn't even an IIIF level 2 feature, we aren't planning to put much time into troubleshooting the issue. GIFs are available, but should be avoided except as one-offs.

Not all JP2 files are created equally

Our newspaper JP2s are encoded in a way that makes them very friendly to pan-and-zoom systems. They are encoded with tiling, which allows pieces of the JP2 to be read independently, and significantly reduces the memory needed to serve the data up to a viewer on the fly.

JP2s that aren't encoded like this will not be nearly as memory- and CPU-efficient. We'd recommend tiling JP2s at a size of around 1024x1024. If using graphics magick, a command like this can help:

gm convert input.tiff -flatten -quality 70 \
    -define jp2:prg=rlcp \
    -define jp2:numrlvls=7 \
    -define jp2:tilewidth=1024 \
    -define jp2:tileheight=1024 output.jp2

Additionally, grayscale images will require one-third the memory and processing power when compared to color images. If your sources are grayscale, but you encode to RGB for better preservation, consider building grayscale derivatives for web display.

Poor performance for non-JP2 files

These aren't well-tested since our system is exclusively JP2. Non-JP2 types that are supported (TIFF, JPG, PNG, and GIF) have to be read in fully and then cropped and resized in Go. This will not be as fast as image formats built for deep zooming (tiled JP2s for RAIS).

As an example: TIFF files are usually fast to process, but can potentially take up a great deal of memory. Sometimes this is okay, but it's a bottleneck quickly when running a tiling server. In our limited testing, TIFFs outperform tiled JP2s only when load is extremely light.

License

CC0

RAIS Image Server is in the public domain under a CC0 license.

Directories

Path Synopsis
src
color-assert
Package assert offers some very simple helper methods for testing.
Package assert offers some very simple helper methods for testing.
fakehttp
Package fakehttp provides a fake response writer for use in tests.
Package fakehttp provides a fake response writer for use in tests.
iiif
Package iiif defines various parsing options and compliance-level information for dealing with a web request for an IIIF resource/operation
Package iiif defines various parsing options and compliance-level information for dealing with a web request for an IIIF resource/operation
magick
Package magick is a hacked up port of the minimal functionality we need to satisfy the IIIFImageDecoder interface.
Package magick is a hacked up port of the minimal functionality we need to satisfy the IIIFImageDecoder interface.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL