replds
Maintains a (small) set of files, replicated across multiple servers.
It is targeted at small datasets that are managed by automation
workflows and need to be propagated to machines at runtime.
Data replication is eventually consistent; conflict resolution uses
last-write-wins semantics. Writes are immediately forwarded to all
peers, but only one copy needs to succeed in order for the write to be
acknowledged. The last written data will appear on all nodes as soon
as network partitions are resolved.
Given the replication model, it is not safe to use this service with
multiple writers on an overlapping key space. For read-modify-write
workflows, it is best to implement a separate locking mechanism so
that only a single workflow accesses the data at any given time: since
the service itself provides no locking, this is necessary to prevent
unexpected out-of-order updates.
There is no dynamic cluster control: the full list of peers must be
provided to each daemon, so it is best to generate the daemon
configuration with a configuration management system.
Configuration
The replds tool requires a YAML-encoded configuration file (which
you can specify using the --config command-line option). This file
should contain the following attributes:
client
- configuration for the replds client commands
url
- service URL (the hostname can resolve to multiple IP addresses)
tls
- TLS configuration for the client
cert
- path to the certificate
key
- path to the private key
ca
- path to the CA file
server
- configuration for the replds server command
path
- path of the locally managed repository
peers
- list of URLs of cluster peers
tls_client
- TLS configuration for the peer-to-peer client
cert
- path to the certificate
key
- path to the private key
ca
- path to the CA file
http_server
- configuration for the HTTP server
tls
- server-side TLS configuration
cert
- path to the server certificate
key
- path to the server's private key
ca
- path to the CA used to validate clients
acl
- TLS-based access controls, a list of entries with the
following attributes:
path
- regular expression to match the request URL path
cn
- regular expression that must match the CommonName part of the
subject of the client certificate
max_inflight_requests
- maximum number of in-flight requests to
allow before server-side throttling kicks in
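Putting the attributes above together, a complete configuration file
might look like the following sketch (hostnames, paths, the port, and
the exact nesting are assumptions to be checked against your
installation):

```yaml
# Example /etc/replds/foo.yml (all values are placeholders).
client:
  url: https://replds.example.com:4000
  tls:
    cert: /etc/replds/foo/client-cert.pem
    key: /etc/replds/foo/client-key.pem
    ca: /etc/replds/foo/ca.pem

server:
  path: /var/lib/replds/foo
  peers:
    - https://host1.example.com:4000
    - https://host2.example.com:4000
  tls_client:
    cert: /etc/replds/foo/peer-cert.pem
    key: /etc/replds/foo/peer-key.pem
    ca: /etc/replds/foo/ca.pem

http_server:
  tls:
    cert: /etc/replds/foo/server-cert.pem
    key: /etc/replds/foo/server-key.pem
    ca: /etc/replds/foo/ca.pem
  acl:
    - path: ^/api/internal/
      cn: ^peer-
    - path: ^/api/
      cn: .*
  max_inflight_requests: 100
```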
TLS Setup
For safe usage, you will want to secure both peer-to-peer and
client-to-server communication with TLS, using separate
credentials. You can then set ACLs so that only peers may access the
/api/internal/ URL prefix, while everything else under /api/ is
available to all clients.
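As a sketch, such an ACL could look like this (the CommonName patterns
are assumptions and depend on how you issue the certificates):

```yaml
acl:
  # Peers (certificates with a CN starting with "peer-") may use the
  # internal replication API.
  - path: ^/api/internal/
    cn: ^peer-
  # All other certificate holders may use the rest of the API.
  - path: ^/api/
    cn: .*
```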
Service integration
The replication strategy adopted by replds puts severe limits on how
it can be used; however, there are at least two use cases worth
examining in more detail. In both cases, a single master server
controls the workflow (i.e. the key space is not partitioned).
Letsencrypt automation
In this scenario, SSL certificates are automatically generated at
runtime with Letsencrypt (from a cron job), and we need to propagate
them to front-end servers.
This scenario is relatively simple because the timeouts and delays
involved in the workflow are so much greater than propagation delays
and expected fault durations that data convergence is not an issue:
when we refresh an SSL certificate 30 days before its expiration, it
is fine if the application servers pick it up a day or more later.
The workflow is going to look like this:
- A cron job (on a single node) examines the local repository to find
certificates that are about to expire, and renews them using the
ACME API. We are ignoring the details of the challenge/response
validation process as they are not relevant to data propagation
issues.
- The cron job stores the results in replds.
- Periodically, the application servers are reloaded to pick up the
new certificates, possibly via another cron job.
With an independent data reload cycle, it is possible to end up in a
situation where the application is reloaded while the certificate and
the private key do not (yet) match. One possible
strategy for handling this situation is for the service to crash, and
rely on an automatic service restart policy to keep trying to start it
again until the data is up to date: not optimal perhaps, but simple
and guaranteed to converge.
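The mismatch check itself is easy to implement in a start-up wrapper:
compare the public key embedded in the certificate with the one
derived from the private key. A sketch (paths and the service binary
in the commented usage are hypothetical):

```shell
#!/bin/sh
# check_keypair CERT KEY: succeed only if the pair matches.
# Hashes the public key taken from each side; the hashes are equal
# if and only if the certificate was issued for this private key.
check_keypair() {
    cert_pub=$(openssl x509 -noout -pubkey -in "$1" | openssl sha256)
    key_pub=$(openssl pkey -pubout -in "$2" | openssl sha256)
    [ "$cert_pub" = "$key_pub" ]
}

# Hypothetical wrapper usage: crash (exit non-zero) on mismatch and
# let the service manager's restart policy retry until data converges.
# check_keypair /var/lib/replds/foo/site.crt /var/lib/replds/foo/site.key \
#     || exit 1
# exec /usr/sbin/my-frontend-server
```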
Package repository
Here, we need to propagate a Debian package repository across multiple
servers for redundancy. The incoming packages are sent to the master
repository server (in our case, over SSH), where some processing takes
place that results in a bunch of files being updated (the new
packages, and the repository metadata). This processing stage needs to
access the entire repository.
We're wrapping external functionality and tools, and they may be
complex enough that we can't simply make them use the replds API, so
we're going to let the tools use the local filesystem as they normally
would. At the same time, we can't just run the repository tools on the
filesystem copy managed by replds itself, because in that case we
would not be able to detect changes. So we use a separate staging
directory to run the repository tools on, and the final workflow is:
- rsync data from the replds-managed dir to the staging dir;
- run the metadata-generation tools on the staging dir;
- synchronize the data back to replds using the sync command.
Usage
The Debian package comes with a
replds-instance-create script that
can be used to set up multiple replds instances. For an instance named
foo, the script will set up the replds@foo systemd service, and it
will create the replds-foo user and group. Add users that need to
read the repository files to that group. The configuration will be
read from /etc/replds/foo.yml.
Note that files created by the daemon will be world-readable by
default. Set the process umask if you wish to restrict this further.
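For a systemd-managed instance, a drop-in can set the umask; UMask= is
a standard systemd service directive (the drop-in path below assumes
the replds@foo unit from above):

```ini
# /etc/systemd/system/replds@foo.service.d/umask.conf
[Service]
# rw for the daemon, read-only for the replds-foo group, no access
# for other users.
UMask=0027
```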