v0.9.5 · Published: Jul 4, 2023 · License: BSD-3-Clause


desync


This project re-implements many features of upstream casync in Go. It seeks to maintain compatibility with casync's data structures, protocols and types, such as chunk stores (castr), index files (caibx/caidx) and archives (catar), in order to function as a drop-in replacement in many use cases. It also tries to maintain support for platforms other than Linux and to simplify build and installation. It consists of a library that implements the features, available for integration into any 3rd-party product, as well as a command-line tool.

For support and discussion, see Gitter chat. Feature requests should be discussed there before filing, unless you're interested in doing the work to implement them yourself.

Goals And Non-Goals

Among the distinguishing factors:

  • Supported on macOS, though there could be incompatibilities when exchanging catar files between Linux and macOS, since devices and file modes differ slightly. *BSD should work as well but hasn't been tested. Windows supports a subset of commands.
  • Where the upstream command has chosen to optimize for storage efficiency (for example, being able to use local files as "seeds", building temporary indexes into them), this command chooses to optimize for runtime performance (maintaining a local explicit chunk store, avoiding the need to reindex) at a cost to storage efficiency.
  • Where the upstream command has chosen to take full advantage of Linux platform features, this client chooses to implement a minimum featureset and, while high-value platform-specific features (such as support for btrfs reflinks into a decompressed local chunk cache) might be added in the future, the ability to build without them on other platforms will be maintained.
  • Both SHA512-256 and SHA256 are supported as hash functions.
  • Only chunk stores using zstd compression, as well as uncompressed stores, are supported at this point.
  • Supports local stores as well as remote stores (as client) over SSH, SFTP and HTTP.
  • Built-in HTTP(S) chunk server that can proxy multiple local or remote stores and also supports caching and deduplication for concurrent requests.
  • Drop-in replacement for casync on SSH servers when serving chunks read-only
  • Support for catar files exists, but SELinux information and ACLs that may be present in existing catar files are ignored, and they won't be present when creating a new catar with the tar command; FCAPs are supported only as a verbatim copy of the security.capability xattr.
  • Supports chunking with the same algorithm used by casync (see make command) but executed in parallel. Results are identical to what casync produces, same chunks and index files, but with significantly better performance. For example, up to 10x faster than casync if the chunks are already present in the store. If the chunks are new, it heavily depends on I/O, but it's still likely several times faster than casync.
  • While casync supports very small min chunk sizes, optimizations in desync require min chunk sizes larger than the window size of the rolling hash used (currently 48 bytes). The tool's default chunk sizes match the defaults used in casync, min 16k, avg 64k, max 256k.
  • Allows FUSE mounting of blob indexes
  • S3/GC protocol support to access chunk stores for read operations and some commands that write chunks
  • Stores and retrieves index files from remote index stores such as HTTP, SFTP, Google Storage and S3
  • Built-in HTTP(S) index server to read/write indexes
  • Reflinking matching blocks (rather than copying) from seed files if supported by the filesystem (currently only Btrfs and XFS)
  • catar archives can be created from standard tar archives, and they can also be extracted to GNU tar format.

Terminology

The documentation below uses terms that may not be clear to readers not already familiar with casync.

  • chunk - A chunk is a section of data from a file. Typically it's between 16kB and 256kB. Chunks are identified by the SHA512-256 checksum of their uncompressed data. Files are split into several chunks with the make command which tries to find chunk boundaries intelligently using the algorithm outlined in this blog post. By default, chunks are stored as files compressed with zstd and extension .cacnk.
  • chunk store - Location, either local or remote that stores chunks. In its most basic form, a chunk store can be a local directory, containing chunk files named after the checksum of the chunk. Other protocols like HTTP, S3, GC, SFTP and SSH are available as well.
  • index - Indexes are data structures containing references to chunks and their location within a file. An index is a small representation of a much larger file. Given an index and a chunk store, it's possible to re-assemble the large file or make it available via a FUSE mount. Indexes are produced during chunking operations such as the create command. The most common file extension for an index is .caibx. When catar archives are chunked, the extension .caidx is used instead.
  • index store - Index stores are used to keep index files. It could simply be a local directory, or accessed over SFTP, S3, GC or HTTP.
  • catar - Archives of directory trees, similar to what is produced by the tar command. These commonly have the .catar extension.
  • caidx - Index file of a chunked catar.
  • caibx - Index of a chunked regular blob.
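Putting the chunk and chunk store terms together: in a local store, a chunk's path is typically derived from its checksum, with the first four hex characters of the ID used as a subdirectory name. A minimal sketch, assuming a hypothetical chunk ID and store path:

```shell
# Hypothetical chunk ID (SHA512-256 of the chunk's uncompressed data)
id="8a39d2abd3999ab73c34db2476849cddf303ce389b35826850f9a700589b4a90"
store="/path/to/store"
# Local stores shard chunks into subdirectories named after the
# first four hex characters of the ID; compressed chunks use .cacnk
prefix=$(printf '%.4s' "$id")
path="$store/$prefix/$id.cacnk"
echo "$path"
```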

Parallel chunking

One of the significant differences to casync is that desync attempts to make chunking faster by utilizing more CPU resources, chunking data in parallel. Depending on the chosen degree of concurrency, the file is split into N equal parts and each part is chunked independently. While the chunking of each part is ongoing, part1 tries to align with part2, part3 tries to align with part4, and so on. Alignment is achieved once a common split point is found in the overlapping area. When that happens, the process chunking the previous part stops, e.g. the part1 chunker stops while the part2 chunker keeps going until it aligns with part3, and so on until all split points have been found. Once all split points have been determined, the file is opened again (N times) to read, compress and store the chunks.

While in most cases this process achieves significantly reduced chunking times at the cost of CPU, there are edge cases where chunking is only about as fast as upstream casync (with more CPU usage). This happens when no split points can be found in the data between the min and max chunk sizes, for example when most or all of the file consists of 0-bytes. In that situation, the concurrent chunking processes for each part never align with each other and a lot of effort is wasted. The table below shows how the type of data being chunked can influence the runtime of each operation. make refers to the process of chunking, while extract refers to re-assembly of blobs from chunks.

Command | Mostly/All 0-bytes                                            | Typical data
--------|---------------------------------------------------------------|-----------------------------------------------
make    | Slow (worst case), likely comparable to casync                | Fast, parallel chunking
extract | Extremely fast, effectively the speed of a truncate() syscall | Fast, done in parallel, usually limited by I/O
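The initial split into N equal parts is plain arithmetic; the real work is in refining those boundaries with the rolling hash until neighboring chunkers align. A sketch of just the first step, assuming a hypothetical 1000-byte blob and 4 chunkers:

```shell
# Starting offsets for 4 chunkers over a hypothetical 1000-byte blob.
# Each chunker starts at size*i/n and runs until its split points
# align with those of the next chunker (refinement not shown).
size=1000
n=4
for i in 0 1 2 3; do
  echo "chunker $i starts at byte $((size * i / n))"
done
```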

Copy-on-write filesystems such as Btrfs and XFS support cloning of blocks between files in order to save disk space as well as improve extraction performance. To utilize this feature, desync uses several seeds to clone sections of files rather than reading the data from chunk-stores and copying it in place:

  • A built-in seed for null chunks (a chunk of max chunk size containing only 0-bytes). This can significantly reduce disk usage of files with large 0-byte ranges, such as VM images. It will effectively turn an eager-zeroed VM disk into a sparse disk while retaining all the advantages of eager-zeroed disk images.
  • A built-in self-seed. As chunks are being written to the destination file, the file itself becomes a seed. If a chunk, or a series of chunks, is used again later in the file, it'll be cloned from the position written previously. This saves storage when the file contains several repetitive sections.
  • Seed files and their indexes can be provided when extracting a file. For this feature, it's necessary to already have the index plus its blob on disk. So for example image-v1.vmdk and image-v1.vmdk.caibx can be used as seed for the extract operation of image-v2.vmdk. The amount of additional disk space required to store image-v2.vmdk will be the delta between it and image-v1.vmdk.
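The effect of the null-chunk seed can be illustrated outside of desync: a file whose content is all zeroes can be logically large while allocating almost no blocks on disk. A sketch using a hypothetical 1 GiB image on a filesystem that supports sparse files:

```shell
# Create a 1 GiB file consisting entirely of a hole (all zero bytes)
truncate -s 1G sparse.img
du -h --apparent-size sparse.img  # logical size as seen by readers
du -h sparse.img                  # blocks actually allocated on disk
```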


Even if cloning is not available, seeds are still useful. desync automatically determines if reflinks are available (and the block size used in the filesystem). If cloning is not supported, sections are copied instead of cloned. Copying still improves performance and reduces the load created by retrieving chunks over the network and decompressing them.

Reading and writing tar streams

In addition to packing local filesystem trees into catar archives, it is possible to read a tar archive stream. Various tar formats such as GNU and BSD tar are supported; see https://golang.org/pkg/archive/tar/ for details on supported formats. When reading from tar archives, the content is not re-ordered; it is written to the catar in the same order as it appears in the stream. This may create output files that differ from those produced using the local filesystem as input, since the order depends entirely on how the tar file was created. Since the catar format does not support hardlinks, the input tar stream needs to follow hardlinks for desync to process them correctly. See the --hard-dereference option in the tar utility.

catar archives can also be extracted to GNU tar archive streams. All files in the output stream are ordered the same as in the catar.

Tool

The tool is provided for convenience. It uses the desync library and makes most of its features available in a consistent fashion. It does not match upstream casync's syntax exactly, but tries to be similar where possible.

Installation

The following builds the binary and installs it into $HOME/go/bin by default.

GO111MODULE=on go get -v github.com/folbricht/desync/cmd/desync

Alternative method using a clone, building from the tip of the master branch.

git clone https://github.com/folbricht/desync.git
cd desync/cmd/desync && go install
Subcommands
  • extract - build a blob from an index file, optionally using seed indexes+blobs
  • verify - verify the integrity of a local store
  • list-chunks - list all chunk IDs contained in an index file
  • cache - populate a cache from index files without extracting a blob or archive
  • chop - split a blob according to an existing caibx and store the chunks in a local store
  • pull - serve chunks using the casync protocol over stdin/stdout. Set CASYNC_REMOTE_PATH=desync on the client to use it.
  • tar - pack a catar file, optionally chunk the catar and create an index file.
  • untar - unpack a catar file or an index referencing a catar. Device entries in tar files are unsupported, and the --no-same-owner and --no-same-permissions options are ignored on Windows.
  • prune - remove unreferenced chunks from a local, S3 or GC store. Use with caution, can lead to data loss.
  • verify-index - verify that an index file matches a given blob
  • chunk-server - start an HTTP(S) chunk server/store
  • index-server - start an HTTP(S) index server/store
  • make - split a blob into chunks and create an index file
  • mount-index - FUSE mount a blob index. Will make the blob available as single file inside the mountpoint.
  • info - show information about an index file, such as the number of chunks and, optionally, which chunks from the index are present in a store
  • mtree - print the content of an archive or index in mtree-compatible format
Options (not all apply to all commands)
  • -s <store> Location of the chunk store, can be local directory or a URL like ssh://hostname/path/to/store. Multiple stores can be specified, they'll be queried for chunks in the same order. The chop, make, tar and prune commands support updating chunk stores in S3, while verify only operates on a local store.
  • --seed <indexfile> Specifies a seed file and index for the extract command. The tool expects the matching file to be present and have the same name as the index file, without the .caibx extension.
  • --seed-dir <dir> Specifies a directory containing seed files and their indexes for the extract command. For each index file in the directory (*.caibx) there needs to be a matching blob without the extension.
  • -c <store> Location of a chunk store to be used as cache. Needs to be writable.
  • -n <int> Number of concurrent download jobs and ssh sessions to the chunk store.
  • -r Repair a local cache by removing invalid chunks. Only valid for the verify command.
  • -y Answer with yes when asked for confirmation. Only supported by the prune command.
  • -l Listening address for the HTTP chunk server. Can be used multiple times to run on more than one interface or more than one port. Only supported by the chunk-server command.
  • -m Specify the min/avg/max chunk sizes in kB. Only applicable to the make command. Defaults to 16:64:256; for best results, min should be avg/4 and max should be 4*avg.
  • -i When packing/unpacking an archive, don't create/read an archive file but instead store/read the chunks and use an index file (caidx) for the archive. Only applicable to tar and untar commands.
  • -t Trust all certificates presented by HTTPS stores. Allows the use of self-signed certs when using a HTTPS chunk server.
  • --key Key file in PEM format used for HTTPS chunk-server and index-server commands. Also requires a certificate with --cert
  • --cert Certificate file in PEM format used for HTTPS chunk-server and index-server commands. Also requires --key.
  • -k Keep partially assembled files in place when extract fails or is interrupted. The command can then be restarted and it'll not have to retrieve completed parts again. Also use this option to write to block devices.
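The -m guidance above (min = avg/4, max = 4*avg) can be computed rather than remembered. A small sketch deriving the argument for a hypothetical average chunk size of 64 kB:

```shell
# Derive a -m argument from a desired average chunk size (in kB),
# following the min = avg/4 and max = 4*avg rule of thumb
avg=64
min=$((avg / 4))
max=$((avg * 4))
echo "-m ${min}:${avg}:${max}"
```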
Environment variables
  • CASYNC_SSH_PATH overrides the default "ssh" with a command to run when connecting to a remote SSH or SFTP chunk store
  • CASYNC_REMOTE_PATH defines the command to run on the chunk store when using SSH, default "casync"
  • S3_ACCESS_KEY, S3_SECRET_KEY, S3_SESSION_TOKEN, S3_REGION can be used to define S3 store credentials if only one store is used. If S3_ACCESS_KEY and S3_SECRET_KEY are not defined, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN are also considered. Caution, these values take precedence over any S3 credentials set in the config file.
  • DESYNC_PROGRESSBAR_ENABLED enables the progress bar if set to anything other than an empty string. By default, the progressbar is only turned on if STDERR is found to be a terminal.
  • DESYNC_ENABLE_PARSABLE_PROGRESS prints in STDERR the current operation name, the completed percentage and the estimated remaining time if it is set to anything other than an empty string. This is similar to the default progress bar but without the actual bar.
  • DESYNC_HTTP_AUTH sets the expected value in the HTTP Authorization header from clients when using chunk-server or index-server. It needs to be the full string, with type and encoding like "Basic dXNlcjpwYXNzd29yZAo=". Any authorization value provided in the command line takes precedence over the environment variable.
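For basic authentication, the DESYNC_HTTP_AUTH value is just the string Basic followed by the Base64 encoding of user:password. A sketch generating it for a hypothetical credential pair:

```shell
# Build a basic-auth value for DESYNC_HTTP_AUTH from hypothetical
# credentials; printf avoids the trailing newline that echo would add
auth="Basic $(printf '%s' 'user:password' | base64)"
echo "$auth"
export DESYNC_HTTP_AUTH="$auth"
```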
Caching

The -c <store> option can be used either to specify an existing store to act as a cache or to populate a new store. Whenever a chunk is requested, it is first looked up in the cache before the request is routed to the next (possibly remote) store. Any chunks downloaded from the main stores are added to the cache. In addition, when a chunk is read from the cache and the cache is a local store, the mtime of the chunk is updated to allow for basic garbage collection based on file age. The cache store is expected to be writable. If the cache contains an invalid chunk (its checksum does not match the chunk ID), the operation will fail. Invalid chunks are not skipped or removed from the cache automatically; verify -r can be used to evict bad chunks from a local store or cache.

Multiple chunk stores

One of the main features of desync is the ability to combine/chain multiple chunk stores of different types and to combine them with a cache store. For example, for a command that reads chunks when assembling a blob, stores can be chained on the command line like so: -s <store1> -s <store2> -s <store3>. A chunk will first be requested from store1; if not found there, the request is routed to store2 and so on. Typically, the fastest chunk store should be listed first to improve performance. It is also possible to combine multiple chunk stores with a cache. In most cases the cache would be a local store, but that is not a requirement. When combining stores and a cache like so: -s <store1> -s <store2> -c <cache>, a chunk request will first be routed to the cache store, then to store1, followed by store2. Any chunk that is not yet in the cache will be stored there upon first request.

Not all types of stores support all operations. The table below lists the supported operations on all store types.

Operation    | Local store | S3 store | HTTP store | SFTP | SSH (casync protocol)
-------------|-------------|----------|------------|------|----------------------
Read chunks  | yes         | yes      | yes        | yes  | yes
Write chunks | yes         | yes      | yes        | yes  | no
Use as cache | yes         | yes      | yes        | yes  | no
Prune        | yes         | yes      | no         | yes  | no
Verify       | yes         | yes      | no         | no   | no
Store failover

Given stores with identical content (same chunks in each), it is possible to group them in a way that provides resilience to failures. Store groups are specified in the command line using | as separator in the same -s option. For example using -s "http://server1/|http://server2/", requests will normally be sent to server1, but if a failure is encountered, all subsequent requests will be routed to server2. There is no automatic fail-back. A failure in server2 will cause it to switch back to server1. Any number of stores can be grouped this way. Note that a missing chunk is treated as a failure immediately, no other servers will be tried, hence the need for all grouped stores to hold the same content.

Dynamic store configuration

Some long-running processes, namely chunk-server and mount-index, may require reconfiguration without a restart. This can be achieved by starting them with the --store-file option, which provides the arguments that are normally passed via the command-line flags --store and --cache from a JSON file instead. Once the server is running, a SIGHUP to the process will trigger a reload of the configuration and replace the stores internally without a restart. This can be done under load. If the configuration in the file is found to be invalid, an error is printed to STDERR and the reload is ignored. The structure of the store-file is as follows:

{
  "stores": [
    "/path/to/store1",
    "/path/to/store2"
  ],
  "cache": "/path/to/cache"
}

This can be combined with store failover by providing the same syntax as is used on the command line, for example {"stores":["/path/to/main|/path/to/backup"]}. See Examples for details on how to use the --store-file option.
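The reload cycle can be sketched end to end: write a valid store-file, start the server against it, and send SIGHUP after editing the file. The paths and the PID lookup below are hypothetical examples; any way of finding the server process works.

```shell
# Write a store-file combining failover (|) with a cache
cat > stores.json <<'EOF'
{
  "stores": ["/path/to/main|/path/to/backup"],
  "cache": "/path/to/cache"
}
EOF

# After editing stores.json, signal the running server to reload:
# kill -HUP "$(pidof desync)"
```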

Remote indexes

Indexes can be stored and retrieved from remote locations via SFTP, S3, and HTTP. Storing indexes remotely is optional and deliberately separate from chunk storage. While it's possible to store indexes in the same location as chunks in the case of SFTP and S3, this should only be done in secured environments. The built-in HTTP chunk store (chunk-server command) cannot be used as an index server. Use the index-server command instead to start an index server that serves indexes and can optionally store them as well (with -w).

Using remote indexes, it is possible to use desync completely file-less. For example when wanting to share a large file with mount-index, one could read the index from an index store like this:

desync mount-index -s http://chunk.store/store http://index.store/myindex.caibx /mnt/image

No file would need to be stored on disk in this case.

S3 chunk stores

desync supports reading from and writing to chunk stores that offer an S3 API, for example hosted in AWS or running on a local server. When using such a store, credentials are passed into the tool either via the environment variables S3_ACCESS_KEY, S3_SECRET_KEY and S3_SESSION_TOKEN (if needed) or, if multiple sets of credentials are required, in the config file. Care is required when building those URLs. Below are a few examples:

AWS

This store is hosted in eu-west-3 in AWS. s3 signals that the S3 protocol is to be used, https should be specified for SSL connections. The first path element of the URL contains the bucket, desync.bucket in this example. Note, when using AWS, no port should be given in the URL!

s3+https://s3-eu-west-3.amazonaws.com/desync.bucket

It's possible to use prefixes (or "directories") in object names like so:

s3+https://s3-eu-west-3.amazonaws.com/desync.bucket/prefix
Other service with S3 API

This is a store running on the local machine on port 9000 without SSL.

s3+http://127.0.0.1:9000/store
Setting S3 bucket addressing style for other services

desync uses minio as an S3 client library. It has an auto-detection mechanism for determining the addressing style of the buckets which should work for Amazon and Google S3 services but could potentially fail for your custom implementation. You can manually specify the addressing style by appending the "lookup" query parameter to the URL.

By default, the value of "?lookup=auto" is implied.

s3+http://127.0.0.1:9000/bucket/prefix?lookup=path
s3+https://s3.internal.company/bucket/prefix?lookup=dns
s3+https://example.com/bucket/prefix?lookup=auto
Compressed vs Uncompressed chunk stores

By default, desync reads and writes chunks in compressed form to all supported stores. This is in line with upstream casync's goal of storing in the most efficient way. It is however possible to change this behavior by providing desync with a config file (see the Configuration section below). Disabling compression and storing chunks uncompressed may reduce latency in some use cases and improve performance. desync supports reading and writing uncompressed chunks to SFTP, S3, HTTP and local stores and caches. If more than one store is used, each can be configured independently; for example, it's possible to read compressed chunks from S3 while using a local uncompressed cache for best performance. However, care needs to be taken when using the chunk-server command and building chains of chunk store proxies to avoid shifting the decompression load onto the server (though it's possible this is actually desirable).

In the setup below, a client reads chunks from an HTTP chunk server which itself gets chunks from S3.

<Client> ---> <HTTP chunk server> ---> <S3 store>

If the client configures the HTTP chunk server to be uncompressed (chunk-server needs to be started with the -u option), and the chunk server reads compressed chunks from S3, then the chunk server will have to decompress every chunk that's requested before responding to the client. If the chunk server was reading uncompressed chunks from S3, there would be no overhead.

Compressed and uncompressed chunks can live in the same store and don't interfere with each other. A store that's configured for compressed chunks by configuring it client-side will not see the uncompressed chunks that may be present. prune and verify too will ignore any chunks written in the other format. Both kinds of chunks can be accessed by multiple clients concurrently and independently.

Configuration

For most use cases, the tool's default configuration is sufficient and no config file is required. Having a config file at $HOME/.config/desync/config.json allows further customization of timeouts, error retry behaviour or credentials that can't be set via command-line options or environment variables. All values have sensible defaults if unconfigured; only add configuration for values that differ from the defaults. To view the current configuration, use desync config. If no config file is present, this shows the defaults. To create a config file allowing custom values, use desync config -w, which writes the current configuration to the file, then edit the file.

Available configuration values:

  • http-timeout - DEPRECATED, see store-options.<Location>.timeout - HTTP request timeout used in HTTP stores (not S3), in nanoseconds
  • http-error-retry - DEPRECATED, see store-options.<Location>.error-retry - Number of times to retry failed chunk requests from HTTP stores
  • s3-credentials - Defines credentials for use with S3 stores. Especially useful if more than one S3 store is used. The key in the config needs to be the URL scheme and host used for the store, excluding the path, but including the port number if used in the store URL. The key can also contain glob patterns; the available wildcards are *, ? and […]. Please refer to the filepath.Match documentation for additional information. It is also possible to use a standard AWS credentials file to store S3 credentials.
  • store-options - Allows customization of chunk and index stores, for example compression settings, timeouts, retry behavior and keys. Not all options are applicable to every store; some, like timeout, are ignored for local stores. Some of these options, such as the client certificates, are overwritten with any values set on the command line. Note that the store location used on the command line needs to match the key under store-options exactly for these options to be used. As with s3-credentials, glob patterns are also supported. A configuration file where more than one key matches a single store location is considered invalid.
    • timeout - Time limit for chunk read or write operation in nanoseconds. Default: 1 minute. If set to a negative value, timeout is infinite.
    • error-retry - Number of times to retry failed chunk requests. Default: 0.
    • error-retry-base-interval - Number of nanoseconds to wait before first retry attempt. Retry attempt number N for the same request will wait N times this interval. Default: 0.
    • client-cert - Certificate file to be used for stores where the server requires mutual SSL.
    • client-key - Key file to be used for stores where the server requires mutual SSL.
    • ca-cert - Certificate file containing trusted certs or CAs.
    • trust-insecure - Trust any certificate presented by the server.
    • skip-verify - Disables data integrity verification when reading chunks to improve performance. Only recommended when chaining chunk stores with the chunk-server command using compressed stores.
    • uncompressed - Reads and writes uncompressed chunks from/to this store. This can improve performance, especially for local stores or caches. Compressed and uncompressed chunks can coexist in the same store, but only one kind is read or written by one client.
    • http-auth - Value of the Authorization header in HTTP requests. This could be a bearer token with "Bearer <token>" or a Base64-encoded username and password pair for basic authentication like "Basic dXNlcjpwYXNzd29yZAo=".
    • http-cookie - Value of the Cookie header in HTTP requests. This should be in the form of a list of name-value pairs separated by a semicolon and a space ('; ') like "name=value; name2=value2; name3=value3".
Example config
{
  "s3-credentials": {
       "http://localhost": {
           "access-key": "MYACCESSKEY",
           "secret-key": "MYSECRETKEY"
       },
       "https://127.0.0.1:9000": {
           "aws-credentials-file": "/Users/user/.aws/credentials"
       },
       "https://127.0.0.1:8000": {
           "aws-credentials-file": "/Users/user/.aws/credentials",
           "aws-profile": "profile_static"
       },
       "https://s3.us-west-2.amazonaws.com": {
           "aws-credentials-file": "/Users/user/.aws/credentials",
           "aws-region": "us-west-2",
           "aws-profile": "profile_refreshable"
       }
  },
  "store-options": {
    "https://192.168.1.1/store": {
      "client-cert": "/path/to/crt",
      "client-key": "/path/to/key",
      "error-retry": 1
    },
    "https://10.0.0.1/": {
      "http-auth": "Bearer abcabcabc"
    },
    "https://example.com/*/*/": {
      "http-auth": "Bearer dXNlcjpwYXNzd29yZA=="
    },
    "https://cdn.example.com/": {
      "http-cookie": "PHPSESSID=298zf09hf012fh2; csrftoken=u32t4o3tb3gg43"
    },
    "/path/to/local/cache": {
      "uncompressed": true
    }
  }
}
Example aws credentials
[default]
aws_access_key_id = DEFAULT_PROFILE_KEY
aws_secret_access_key = DEFAULT_PROFILE_SECRET

[profile_static]
aws_access_key_id = OTHERACCESSKEY
aws_secret_access_key = OTHERSECRETKEY

[profile_refreshable]
aws_access_key_id = PROFILE_REFRESHABLE_KEY
aws_secret_access_key = PROFILE_REFRESHABLE_SECRET
aws_session_token = PROFILE_REFRESHABLE_TOKEN
Examples

Re-assemble somefile.tar using a remote chunk store and a blob index file.

desync extract -s ssh://192.168.1.1/path/to/casync.store/ -c /tmp/store somefile.tar.caibx somefile.tar

Use multiple stores, specify the local one first to improve performance.

desync extract -s /some/local/store -s ssh://192.168.1.1/path/to/casync.store/ somefile.tar.caibx somefile.tar

Extract version 3 of a disk image using the previous 2 versions as seed for cloning (if supported), or copying. Note, when providing a seed like --seed <file>.ext.caibx, it is assumed that <file>.ext is available next to the index file, and matches the index.

desync extract -s /local/store \
  --seed image-v1.qcow2.caibx \
  --seed image-v2.qcow2.caibx \
  image-v3.qcow2.caibx image-v3.qcow2

Extract an image using several seeds present in a directory. Each of the .caibx files in the directory needs to have a matching blob of the same name. It is possible for the source index file to be in the same directory also (it'll be skipped automatically).

desync extract -s /local/store --seed-dir /path/to/images image-v3.qcow2.caibx image-v3.qcow2

Mix and match remote stores and use a local cache store to improve performance. Also group two identical HTTP stores with | to provide failover in case of errors on one.

desync extract \
       -s "http://192.168.1.101/casync.store/|http://192.168.1.102/casync.store/" \
       -s ssh://192.168.1.1/path/to/casync.store/ \
       -s https://192.168.1.3/ssl.store/ \
       -c /path/to/cache \
       somefile.tar.caibx somefile.tar

Extract a file in-place (-k option). If this operation fails, the file will remain partially complete and can be restarted without the need to re-download chunks from the remote SFTP store. Use -k when a local cache is not available and the extract may be interrupted.

desync extract -k -s sftp://192.168.1.1/path/to/store file.caibx file.tar

Extract an image directly onto a block device. The -k or --in-place option is needed.

desync extract -k -s /mnt/store image.caibx /dev/sdc

Extract a file using a remote index stored in an HTTP index store

desync extract -k -s sftp://192.168.1.1/path/to/store http://192.168.1.2/file.caibx file.tar

Verify a local cache. Errors will be reported to STDOUT; since -r is not given, nothing invalid will be removed.

desync verify -s /some/local/store

Cache the chunks used in a couple of index files in a local store without actually writing the blob.

desync cache -s ssh://192.168.1.1/path/to/casync.store/ -c /local/cache somefile.tar.caibx other.file.caibx

List the chunks referenced in a caibx.

desync list-chunks somefile.tar.caibx

Chop an existing file according to an existing caibx and store the chunks in a local store. This can be used to populate a local cache from a possibly large blob that already exists on the target system.

desync chop -s /some/local/store somefile.tar.caibx somefile.tar

Chop a blob according to an existing index, while ignoring any chunks that are referenced in another index. This can be used to improve performance when it is known that all chunks referenced in image-v1.caibx are already present in the target store and can be ignored when chopping image-v2.iso.

desync chop -s /some/local/store --ignore image-v1.iso.caibx image-v2.iso.caibx image-v2.iso

Pack a directory tree into a catar file.

desync tar archive.catar /some/dir

Pack a directory tree into an archive and chunk the archive, producing an index file.

desync tar -i -s /some/local/store archive.caidx /some/dir

Unpack a catar file.

desync untar archive.catar /some/dir

Unpack a directory tree using an index file referencing a chunked archive.

desync untar -i -s /some/local/store archive.caidx /some/dir

Pack a directory tree, currently available as a tar archive, into a catar. The tar input stream can also be read from STDIN by providing '-' instead of the file name.

desync tar --input-format=tar archive.catar /path/to/archive.tar

Process a tar stream into a catar. Since catar archives don't support hardlinks, those need to be dereferenced in the input stream.

tar --hard-dereference -C /path/to/dir -c . | desync tar --input-format tar archive.catar -

Unpack a directory tree from an index file and store the output as a GNU tar file rather than writing to the local filesystem. Instead of an archive file, the output can be given as '-' to write to STDOUT.

desync untar -i -s /some/local/store --output-format=gnu-tar archive.caidx /path/to/archive.tar

Prune a store to only contain chunks that are referenced in the provided index files. Possible data loss.

desync prune -s /some/local/store index1.caibx index2.caibx

Start a chunk server serving up a local store via port 80.

desync chunk-server -s /some/local/store

Start a chunk server on port 8080 acting as proxy for other remote HTTP and SSH stores and populate a local cache.

desync chunk-server -s http://192.168.1.1/ -s ssh://192.168.1.2/store -c cache -l :8080

Start a chunk server with a store file; this allows the configuration to be re-read on SIGHUP without a restart.

# Create store file
echo '{"stores": ["http://192.168.1.1/"], "cache": "/tmp/cache"}' > stores.json

# Start the server
desync chunk-server --store-file stores.json -l :8080

# Modify
echo '{"stores": ["http://192.168.1.2/"], "cache": "/tmp/cache"}' > stores.json

# Reload
killall -1 desync

Start a writable index server, chunk a file and store the index.

server# desync index-server -s /mnt/indexes --writable -l :8080

client# desync make -s /some/store http://192.168.1.1:8080/file.vmdk.caibx file.vmdk

Copy all chunks referenced in an index file from a remote SSH store to a remote SFTP store.

desync cache -s ssh://192.168.1.2/store -c sftp://192.168.1.3/path/to/store /path/to/index.caibx

Start a TLS chunk server on port 443 acting as proxy for a remote chunk store in AWS with a local cache. The credentials for AWS are expected to be in the config file under the key https://s3-eu-west-3.amazonaws.com.

desync chunk-server -s s3+https://s3-eu-west-3.amazonaws.com/desync.bucket/prefix -c cache -l 127.0.0.1:https --cert cert.pem --key key.pem
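A sketch of what that config file entry might look like. The key names ("s3-credentials", "access-key", "secret-key") are assumptions based on desync's config conventions and may differ between versions; check the config documentation for your release.

```json
{
  "s3-credentials": {
    "https://s3-eu-west-3.amazonaws.com": {
      "access-key": "AKIA...",
      "secret-key": "..."
    }
  }
}
```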

Split a blob, store the chunks and create an index file.

desync make -s /some/local/store index.caibx /some/blob

Split a blob, create an index file and store the chunks in an S3 bucket named store.

S3_ACCESS_KEY=mykey S3_SECRET_KEY=mysecret desync make -s s3+http://127.0.0.1:9000/store index.caibx /some/blob

FUSE mount an index file. This makes the indexed blob available as a file underneath the mount point. The filename in the mount matches the name of the index with the extension removed; in this example, /some/mnt/ will contain one file, index.

desync mount-index -s /some/local/store index.caibx /some/mnt

FUSE mount a chunked, remote index file. First, a (small) index file is read from the index server and used to re-assemble a larger index file, which is piped into the second command that mounts it.

desync cat -s http://192.168.1.1/store http://192.168.1.2/small.caibx | desync mount-index -s http://192.168.1.1/store - /mnt/point

Long-running FUSE mount that may need to have its store setup changed without unmounting. This can be done by using the --store-file option rather than specifying store and cache on the command line. The process will then reload the file when a SIGHUP is sent.

# Create the store file
echo '{"stores": ["http://192.168.1.1/"], "cache": "/tmp/cache"}' > stores.json

# Start the mount
desync mount-index --store-file stores.json index.caibx /some/mnt

# Modify the store setup
echo '{"stores": ["http://192.168.1.2/"], "cache": "/tmp/cache"}' > stores.json

# Reload
killall -1 desync

Show information about an index file to see how many of its chunks are present in a local store or an S3 store. The local store is queried first, S3 is only queried if the chunk is not present in the local store. The output will be in JSON format (--format=json) for easier processing in scripts.

desync info --format=json -s /tmp/store -s s3+http://127.0.0.1:9000/store /path/to/index

Start an HTTP chunk server that will store uncompressed chunks locally, configured via JSON config file, and serve uncompressed chunks over the network (-u option). This chunk server could be used as a cache, minimizing latency by storing and serving uncompressed chunks. Clients will need to be configured to request uncompressed chunks from this server.

# Chunk server
echo '{"store-options": {"/path/to/store/":{"uncompressed": true}}}' > /path/to/server.json

desync --config /path/to/server.json chunk-server -w -u -s /path/to/store/ -l :8080

# Client
echo '{"store-options": {"http://store.host:8080/":{"uncompressed": true}}}' > /path/to/client.json

desync --config /path/to/client.json cache -s sftp://remote.host/store -c http://store.host:8080/ /path/to/blob.caibx

HTTP chunk server using basic authorization. The server is configured to expect an Authorization header with the correct value in every request. The client configuration defines what the value should be on a per-server basis. The client config could be added to the default $HOME/.config/desync/config.json instead.

# Server
DESYNC_HTTP_AUTH="Bearer abcabcabc" desync chunk-server -s /path/to/store -l :8080

# Client
echo '{"store-options": {"http://127.0.0.1:8080/":{"http-auth": "Bearer abcabcabc"}}}' > /path/to/client.json

desync --config /path/to/client.json extract -s http://127.0.0.1:8080/ /path/to/blob.caibx /path/to/blob

HTTPS chunk server using key and certificate signed by custom CA.

# Building the CA and server certificate
openssl genrsa -out ca.key 4096
openssl req -x509 -new -nodes -key ca.key -sha256 -days 3650 -out ca.crt
openssl genrsa -out server.key 2048
openssl req -new -key server.key -out server.csr   # Common Name should be the server name
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt -days 3650 -sha256

# Chunk server
desync chunk-server -s /path/to/store --key server.key --cert server.crt -l :8443

# Client
desync extract --ca-cert ca.crt -s https://hostname:8443/ image.iso.caibx image.iso

HTTPS chunk server with client authentication (mutual-TLS).

# Building the CA, server and client certificates
openssl genrsa -out ca.key 4096
openssl req -x509 -new -nodes -key ca.key -sha256 -days 3650 -out ca.crt
openssl genrsa -out server.key 2048
openssl req -new -key server.key -out server.csr   # Common Name should be the server name
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt -days 3650 -sha256
openssl genrsa -out client.key 2048
openssl req -new -key client.key -out client.csr
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client.crt -days 3650 -sha256

# Chunk server
desync chunk-server -s /path/to/store --key server.key --cert server.crt --mutual-tls --client-ca ca.crt -l :8443

# Client
desync extract --client-key client.key --client-cert client.crt --ca-cert ca.crt -s https://hostname:8443/ image.iso.caibx image.iso

Documentation

Overview

Package desync implements data structures, protocols and features of https://github.com/systemd/casync in order to allow support for additional platforms and improve performance by way of concurrency and caching.

Supports the following casync data structures: catar archives, caibx/caidx index files, castr stores (local or remote).

See desync/cmd for reference implementations of the available features.

Index

Constants

const (
	// Format identifiers used in archive files
	CaFormatEntry             = 0x1396fabcea5bbb51
	CaFormatUser              = 0xf453131aaeeaccb3
	CaFormatGroup             = 0x25eb6ac969396a52
	CaFormatXAttr             = 0xb8157091f80bc486
	CaFormatACLUser           = 0x297dc88b2ef12faf
	CaFormatACLGroup          = 0x36f2acb56cb3dd0b
	CaFormatACLGroupObj       = 0x23047110441f38f3
	CaFormatACLDefault        = 0xfe3eeda6823c8cd0
	CaFormatACLDefaultUser    = 0xbdf03df9bd010a91
	CaFormatACLDefaultGroup   = 0xa0cb1168782d1f51
	CaFormatFCaps             = 0xf7267db0afed0629
	CaFormatSELinux           = 0x46faf0602fd26c59
	CaFormatSymlink           = 0x664a6fb6830e0d6c
	CaFormatDevice            = 0xac3dace369dfe643
	CaFormatPayload           = 0x8b9e1d93d6dcffc9
	CaFormatFilename          = 0x6dbb6ebcb3161f0b
	CaFormatGoodbye           = 0xdfd35c5e8327c403
	CaFormatGoodbyeTailMarker = 0x57446fa533702943
	CaFormatIndex             = 0x96824d9c7b129ff9
	CaFormatTable             = 0xe75b9e112f17417d
	CaFormatTableTailMarker   = 0x4b4f050e5549ecd1

	// SipHash key used in Goodbye elements to hash the filename. It's 16 bytes,
	// split into 2x64bit values, upper and lower part of the key
	CaFormatGoodbyeHashKey0 = 0x8574442b0f1d84b3
	CaFormatGoodbyeHashKey1 = 0x2736ed30d1c22ec1

	// Format feature flags
	CaFormatWith16BitUIDs   = 0x1
	CaFormatWith32BitUIDs   = 0x2
	CaFormatWithUserNames   = 0x4
	CaFormatWithSecTime     = 0x8
	CaFormatWithUSecTime    = 0x10
	CaFormatWithNSecTime    = 0x20
	CaFormatWith2SecTime    = 0x40
	CaFormatWithReadOnly    = 0x80
	CaFormatWithPermissions = 0x100
	CaFormatWithSymlinks    = 0x200
	CaFormatWithDeviceNodes = 0x400
	CaFormatWithFIFOs       = 0x800
	CaFormatWithSockets     = 0x1000

	/* DOS file flags */
	CaFormatWithFlagHidden  = 0x2000
	CaFormatWithFlagSystem  = 0x4000
	CaFormatWithFlagArchive = 0x8000

	/* chattr() flags */
	CaFormatWithFlagAppend         = 0x10000
	CaFormatWithFlagNoAtime        = 0x20000
	CaFormatWithFlagCompr          = 0x40000
	CaFormatWithFlagNoCow          = 0x80000
	CaFormatWithFlagNoDump         = 0x100000
	CaFormatWithFlagDirSync        = 0x200000
	CaFormatWithFlagImmutable      = 0x400000
	CaFormatWithFlagSync           = 0x800000
	CaFormatWithFlagNoComp         = 0x1000000
	CaFormatWithFlagProjectInherit = 0x2000000

	/* btrfs magic */
	CaFormatWithSubvolume   = 0x4000000
	CaFormatWithSubvolumeRO = 0x8000000

	/* Extended Attribute metadata */
	CaFormatWithXattrs  = 0x10000000
	CaFormatWithACL     = 0x20000000
	CaFormatWithSELinux = 0x40000000
	CaFormatWithFcaps   = 0x80000000

	CaFormatExcludeFile      = 0x1000000000000000
	CaFormatSHA512256        = 0x2000000000000000
	CaFormatExcludeSubmounts = 0x4000000000000000
	CaFormatExcludeNoDump    = 0x8000000000000000

	// Protocol message types
	CaProtocolHello      = 0x3c71d0948ca5fbee
	CaProtocolIndex      = 0xb32a91dd2b3e27f8
	CaProtocolIndexEOF   = 0x4f0932f1043718f5
	CaProtocolArchive    = 0x95d6428a69eddcc5
	CaProtocolArchiveEOF = 0x450bef663f24cbad
	CaProtocolRequest    = 0x8ab427e0f89d9210
	CaProtocolChunk      = 0x5213dd180a84bc8c
	CaProtocolMissing    = 0xd010f9fac82b7b6c
	CaProtocolGoodbye    = 0xad205dbf1a3686c3
	CaProtocolAbort      = 0xe7d9136b7efea352

	// Provided services
	CaProtocolReadableStore   = 0x1
	CaProtocolWritableStore   = 0x2
	CaProtocolReadableIndex   = 0x4
	CaProtocolWritableIndex   = 0x8
	CaProtocolReadableArchive = 0x10
	CaProtocolWritableArchive = 0x20

	// Wanted services
	CaProtocolPullChunks      = 0x40
	CaProtocolPullIndex       = 0x80
	CaProtocolPullArchive     = 0x100
	CaProtocolPushChunks      = 0x200
	CaProtocolPushIndex       = 0x400
	CaProtocolPushIndexChunks = 0x800
	CaProtocolPushArchive     = 0x1000

	// Protocol request flags
	CaProtocolRequestHighPriority = 1

	// Chunk properties
	CaProtocolChunkCompressed = 1
)
const ChunkerWindowSize = 48

ChunkerWindowSize is the number of bytes in the rolling hash window

const CompressedChunkExt = ".cacnk"

CompressedChunkExt is the file extension used for compressed chunks

const DefaultBlockSize = 4096

DefaultBlockSize is used when the actual filesystem block size cannot be determined automatically

TarFeatureFlags are used as feature flags in the header of catar archives. These should be used in index files when chunking a catar as well. TODO: Find out what CaFormatWithPermissions is, as that's not set in casync-produced catar archives.

const UncompressedChunkExt = ""

UncompressedChunkExt is the file extension of uncompressed chunks

Variables

var (
	FormatString = map[uint64]string{
		CaFormatEntry:             "CaFormatEntry",
		CaFormatUser:              "CaFormatUser",
		CaFormatGroup:             "CaFormatGroup",
		CaFormatXAttr:             "CaFormatXAttr",
		CaFormatACLUser:           "CaFormatACLUser",
		CaFormatACLGroup:          "CaFormatACLGroup",
		CaFormatACLGroupObj:       "CaFormatACLGroupObj",
		CaFormatACLDefault:        "CaFormatACLDefault",
		CaFormatACLDefaultUser:    "CaFormatACLDefaultUser",
		CaFormatACLDefaultGroup:   "CaFormatACLDefaultGroup",
		CaFormatFCaps:             "CaFormatFCaps",
		CaFormatSELinux:           "CaFormatSELinux",
		CaFormatSymlink:           "CaFormatSymlink",
		CaFormatDevice:            "CaFormatDevice",
		CaFormatPayload:           "CaFormatPayload",
		CaFormatFilename:          "CaFormatFilename",
		CaFormatGoodbye:           "CaFormatGoodbye",
		CaFormatGoodbyeTailMarker: "CaFormatGoodbyeTailMarker",
		CaFormatIndex:             "CaFormatIndex",
		CaFormatTable:             "CaFormatTable",
		CaFormatTableTailMarker:   "CaFormatTableTailMarker",
	}
)
var Log = logrus.New()

Functions

func CanClone added in v0.4.0

func CanClone(dstFile, srcFile string) bool

CanClone tries to determine if the filesystem allows cloning of blocks between two files. It creates two tempfiles in the same directories and attempts to perform a 0-byte block clone. If that succeeds, it returns true.

func ChopFile added in v0.2.0

func ChopFile(ctx context.Context, name string, chunks []IndexChunk, ws WriteStore, n int, pb ProgressBar) error

ChopFile splits a file according to a list of chunks obtained from an Index and stores them in the provided store.

func CloneRange added in v0.4.0

func CloneRange(dst, src *os.File, srcOffset, srcLength, dstOffset uint64) error

CloneRange uses the FICLONERANGE ioctl to de-dupe blocks between two files when using XFS or btrfs. Only works at block-boundaries.

func Compress added in v0.2.0

func Compress(src []byte) ([]byte, error)

Compress a block using the only (currently) supported algorithm

func Copy added in v0.2.0

func Copy(ctx context.Context, ids []ChunkID, src Store, dst WriteStore, n int, pb ProgressBar) error

Copy reads a list of chunks from the provided src store and copies the ones not already present in the dst store. The goal is to load chunks from a remote store to populate a cache. If a progress bar is provided, it'll be updated when a chunk has been processed; it can be nil.

func Decompress added in v0.2.0

func Decompress(dst, src []byte) ([]byte, error)

Decompress a block using the only supported algorithm. If you already have a buffer, it can be passed in dst and will be used. If dst is nil, a buffer will be allocated.

func FilemodeToStatMode added in v0.8.0

func FilemodeToStatMode(mode os.FileMode) uint32

FilemodeToStatMode converts Go's os.Filemode value into the syscall equivalent.

func GetFileSize added in v0.9.3

func GetFileSize(fileName string) (size uint64, err error)

GetFileSize determines the size, in Bytes, of the file located at the given fileName.

func IndexFromFile added in v0.2.0

func IndexFromFile(ctx context.Context,
	name string,
	n int,
	min, avg, max uint64,
	pb ProgressBar,
) (Index, ChunkingStats, error)

IndexFromFile chunks a file in parallel and returns an index. It does not store chunks! Each concurrent chunker starts filesize/n bytes apart and splits independently. Each chunk worker tries to sync with its next neighbor and, if successful, stops processing, letting the next one continue. The main routine reads and assembles a list of (confirmed) chunks from the workers, starting with the first worker. This algorithm wastes some CPU and I/O if the data doesn't contain chunk boundaries, for example if the whole file consists of nil bytes. If progress is not nil, it'll be updated with the confirmed chunk position in the file.

func MountIndex added in v0.2.0

func MountIndex(ctx context.Context, idx Index, ifs MountFS, path string, s Store, n int) error

MountIndex mounts an index file under a FUSE mount point. The mount will only expose a single blob file as represented by the index.

func NewHTTPHandler added in v0.2.0

func NewHTTPHandler(s Store, writable, skipVerifyWrite bool, converters Converters, auth string) http.Handler

NewHTTPHandler initializes and returns a new HTTP handler for a chunks server.

func NewHTTPIndexHandler added in v0.3.0

func NewHTTPIndexHandler(s IndexStore, writable bool, auth string) http.Handler

NewHTTPIndexHandler initializes an HTTP index store handler

func SipHash added in v0.2.0

func SipHash(b []byte) uint64

SipHash is used to calculate the hash in Goodbye element items, hashing the filename.

func StatModeToFilemode added in v0.8.0

func StatModeToFilemode(mode uint32) os.FileMode

StatModeToFilemode converts syscall mode to Go's os.Filemode value.

func Tar added in v0.2.0

Tar implements the tar command which recursively parses a directory tree, and produces a stream of encoded casync format elements (catar file).

func UnTar added in v0.2.0

func UnTar(ctx context.Context, r io.Reader, fs FilesystemWriter) error

UnTar implements the untar command, decoding a catar file and writing the contained tree to a target directory.

func UnTarIndex added in v0.2.0

func UnTarIndex(ctx context.Context, fs FilesystemWriter, index Index, s Store, n int, pb ProgressBar) error

UnTarIndex takes an index file (of a chunked catar), re-assembles the catar and decodes it on-the-fly into the target filesystem. Uses n goroutines to retrieve and decompress the chunks.

func VerifyIndex added in v0.2.0

func VerifyIndex(ctx context.Context, name string, idx Index, n int, pb ProgressBar) error

VerifyIndex re-calculates the checksums of a blob comparing it to a given index. Fails if the index does not match the blob.

Types

type ArchiveDecoder added in v0.2.0

type ArchiveDecoder struct {
	// contains filtered or unexported fields
}

ArchiveDecoder is used to decode a catar archive.

func NewArchiveDecoder added in v0.2.0

func NewArchiveDecoder(r io.Reader) ArchiveDecoder

NewArchiveDecoder initializes a decoder for a catar archive.

func (*ArchiveDecoder) Next added in v0.2.0

func (a *ArchiveDecoder) Next() (interface{}, error)

Next returns a node from an archive, or nil if the end is reached. If NodeFile is returned, the caller should read the file body before calling Next() again as that invalidates the reader.

type AssembleOptions added in v0.9.3

type AssembleOptions struct {
	N                 int
	InvalidSeedAction InvalidSeedAction
}

type Cache

type Cache struct {
	// contains filtered or unexported fields
}

Cache is used to connect a (typically remote) store with a local store which functions as disk cache. Any request to the cache for a chunk will first be routed to the local store, and if that fails to the slower remote store. Any chunks retrieved from the remote store will be stored in the local one.

func NewCache

func NewCache(s Store, l WriteStore) Cache

NewCache returns a cache router that uses a local store as cache before accessing a (supposedly slower) remote one.

func (Cache) Close added in v0.2.0

func (c Cache) Close() error

Close the underlying writable chunk store

func (Cache) GetChunk

func (c Cache) GetChunk(id ChunkID) (*Chunk, error)

GetChunk first asks the local store for the chunk and then the remote one. If we get a chunk from the remote, it's stored locally too.
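The routing described for GetChunk is a classic read-through cache. Below is a stdlib-only sketch of the same logic, with plain maps standing in for desync's Store and WriteStore interfaces; store and cachedGet are hypothetical names for illustration, not desync API.

```go
package main

import (
	"errors"
	"fmt"
)

var errMissing = errors.New("chunk missing")

// store is a minimal stand-in for a chunk store: ID -> chunk data.
type store map[string][]byte

// cachedGet asks the local store first and falls back to the remote one.
// Chunks fetched remotely are written to the local store on the way back,
// mirroring the behavior documented for Cache.GetChunk.
func cachedGet(local, remote store, id string) ([]byte, error) {
	if b, ok := local[id]; ok {
		return b, nil // cache hit, no remote round trip
	}
	b, ok := remote[id]
	if !ok {
		return nil, errMissing
	}
	local[id] = b // populate the cache for the next request
	return b, nil
}

func main() {
	local := store{}
	remote := store{"abc": []byte("chunk data")}
	b, _ := cachedGet(local, remote, "abc")
	fmt.Printf("got %q, now cached locally: %v\n", b, len(local) == 1)
}
```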

func (Cache) HasChunk added in v0.2.0

func (c Cache) HasChunk(id ChunkID) (bool, error)

HasChunk first checks the cache for the chunk, then the store.

func (Cache) String added in v0.2.0

func (c Cache) String() string

type Chunk added in v0.4.0

type Chunk struct {
	// contains filtered or unexported fields
}

Chunk holds chunk data in plain form, storage format, or both. If a chunk is created from storage data, such as read from a compressed chunk store, and the application later requires the plain data, it'll be converted on demand by applying the given storage converters in reverse order. The converters can only be used to read the plain data, not to convert back to storage format.

func NewChunk added in v0.9.1

func NewChunk(b []byte) *Chunk

NewChunk creates a new chunk from plain data. The data is trusted and the ID is calculated on demand.

func NewChunkFromStorage added in v0.9.1

func NewChunkFromStorage(id ChunkID, b []byte, modifiers Converters, skipVerify bool) (*Chunk, error)

NewChunkFromStorage builds a new chunk from data that is not in plain format. It uses the raw storage format from its source, and the modifiers are used to convert into plain data as needed.

func NewChunkWithID added in v0.4.0

func NewChunkWithID(id ChunkID, b []byte, skipVerify bool) (*Chunk, error)

NewChunkWithID creates a new chunk from either compressed or uncompressed data (or both if available). It also expects an ID and validates that it matches the uncompressed data unless skipVerify is true. If called with just compressed data, it'll decompress it for the ID validation.

func (*Chunk) Data added in v0.9.1

func (c *Chunk) Data() ([]byte, error)

Data returns the chunk data in uncompressed form. If the chunk was created with compressed data only, it'll be decompressed, stored and returned. The caller must not modify the data in the returned slice.

func (*Chunk) ID added in v0.4.0

func (c *Chunk) ID() ChunkID

ID returns the checksum/ID of the uncompressed chunk data. The ID is stored after the first call and doesn't need to be re-calculated. Note that calculating the ID may mean decompressing the data first.

type ChunkID

type ChunkID [32]byte

ChunkID is the SHA512/256 in binary encoding

func ChunkIDFromSlice

func ChunkIDFromSlice(b []byte) (ChunkID, error)

ChunkIDFromSlice converts a SHA512/256 encoded as byte slice into a ChunkID. The slice is expected to be of the correct length.

func ChunkIDFromString

func ChunkIDFromString(id string) (ChunkID, error)

ChunkIDFromString converts a SHA512/256 encoded as string into a ChunkID

func (ChunkID) String

func (c ChunkID) String() string

type ChunkInvalid added in v0.2.0

type ChunkInvalid struct {
	ID  ChunkID
	Sum ChunkID
}

ChunkInvalid means the hash of the chunk content doesn't match its ID

func (ChunkInvalid) Error added in v0.2.0

func (e ChunkInvalid) Error() string

type ChunkMissing

type ChunkMissing struct {
	ID ChunkID
}

ChunkMissing is returned by a store that can't find a requested chunk

func (ChunkMissing) Error

func (e ChunkMissing) Error() string

type ChunkStorage added in v0.2.0

type ChunkStorage struct {
	sync.Mutex
	// contains filtered or unexported fields
}

ChunkStorage stores chunks in a writable store. It can safely be used by multiple goroutines and contains an internal cache of which chunks have been stored previously.

func NewChunkStorage added in v0.2.0

func NewChunkStorage(ws WriteStore) *ChunkStorage

NewChunkStorage initializes a ChunkStorage object.

func (*ChunkStorage) StoreChunk added in v0.2.0

func (s *ChunkStorage) StoreChunk(chunk *Chunk) (err error)

StoreChunk stores a single chunk in a synchronous manner.

type Chunker added in v0.2.0

type Chunker struct {
	// contains filtered or unexported fields
}

Chunker is used to break up a data stream into chunks of data.

func NewChunker added in v0.2.0

func NewChunker(r io.Reader, min, avg, max uint64) (Chunker, error)

NewChunker initializes a chunker for a data stream according to min/avg/max chunk size.

func (*Chunker) Advance added in v0.8.0

func (c *Chunker) Advance(n int) error

Advance n bytes without producing chunks. This can be used if the content of the next section in the file is known (i.e. it is known that there are a number of null chunks coming). This resets everything in the chunker and behaves as if the stream starts at (current position+n).

func (*Chunker) Avg added in v0.2.0

func (c *Chunker) Avg() uint64

Avg returns the average chunk size

func (*Chunker) Max added in v0.2.0

func (c *Chunker) Max() uint64

Max returns the maximum chunk size

func (*Chunker) Min added in v0.2.0

func (c *Chunker) Min() uint64

Min returns the minimum chunk size

func (*Chunker) Next added in v0.2.0

func (c *Chunker) Next() (uint64, []byte, error)

Next returns the starting position as well as the chunk data. Returns an empty byte slice when complete
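Next's contract (start offset plus data, empty slice once the stream is exhausted) leads to a standard consumption loop. The sketch below runs the loop against a stand-in type with the same signature; fixedChunker cuts fixed-size pieces purely for illustration, where the real Chunker cuts at content-defined boundaries.

```go
package main

import "fmt"

// fixedChunker mimics the Next signature documented above, cutting
// fixed-size pieces; the real chunker cuts at content-defined boundaries.
type fixedChunker struct {
	data []byte
	pos  uint64
	size int
}

// Next returns the chunk's starting offset and its data, and an empty
// slice once the stream is exhausted -- the same contract as above.
func (c *fixedChunker) Next() (uint64, []byte, error) {
	if int(c.pos) >= len(c.data) {
		return c.pos, nil, nil
	}
	end := int(c.pos) + c.size
	if end > len(c.data) {
		end = len(c.data)
	}
	start := c.pos
	b := c.data[start:end]
	c.pos = uint64(end)
	return start, b, nil
}

func main() {
	c := &fixedChunker{data: []byte("abcdefghij"), size: 4}
	for {
		start, b, err := c.Next()
		if err != nil {
			panic(err)
		}
		if len(b) == 0 {
			break // empty slice signals completion
		}
		fmt.Printf("chunk at %d: %q\n", start, b)
	}
}
```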

type ChunkingStats added in v0.2.0

type ChunkingStats struct {
	ChunksAccepted uint64
	ChunksProduced uint64
}

ChunkingStats is used to report statistics of a parallel chunking operation.

type Compressor added in v0.9.1

type Compressor struct{}

Compression layer

type ConsoleIndexStore added in v0.3.0

type ConsoleIndexStore struct{}

ConsoleIndexStore is used for writing/reading indexes from STDOUT/STDIN

func NewConsoleIndexStore added in v0.3.0

func NewConsoleIndexStore() (ConsoleIndexStore, error)

NewConsoleIndexStore creates an instance of an index store that reads from and writes to the console

func (ConsoleIndexStore) Close added in v0.3.0

func (s ConsoleIndexStore) Close() error

Close the index store.

func (ConsoleIndexStore) GetIndex added in v0.3.0

func (s ConsoleIndexStore) GetIndex(string) (i Index, e error)

GetIndex reads an index from STDIN and returns it.

func (ConsoleIndexStore) GetIndexReader added in v0.3.0

func (s ConsoleIndexStore) GetIndexReader(string) (io.ReadCloser, error)

GetIndexReader returns a reader from STDIN

func (ConsoleIndexStore) StoreIndex added in v0.3.0

func (s ConsoleIndexStore) StoreIndex(name string, idx Index) error

StoreIndex writes the provided index to STDOUT. The name is ignored.

func (ConsoleIndexStore) String added in v0.3.0

func (s ConsoleIndexStore) String() string

type Converters added in v0.9.1

type Converters []converter

Converters are modifiers for chunk data, such as compression or encryption. They are used to prepare chunk data for storage, or to read it from storage. The order of the conversion layers matters. When plain data is prepared for storage, the toStorage method is used in the order the layers are defined. To read from storage, the fromStorage method is called for each layer in reverse order.

type DedupQueue added in v0.7.0

type DedupQueue struct {
	// contains filtered or unexported fields
}

DedupQueue wraps a store and provides deduplication of incoming chunk requests. This is useful when a burst of requests for the same chunk is received and the chunk store serving those is slow. With the DedupQueue wrapper, concurrent requests for the same chunk will result in just one request to the upstream store. Implements the Store interface.

func NewDedupQueue added in v0.7.0

func NewDedupQueue(store Store) *DedupQueue

NewDedupQueue initializes a new instance of the wrapper.

func (*DedupQueue) Close added in v0.7.0

func (q *DedupQueue) Close() error

func (*DedupQueue) GetChunk added in v0.7.0

func (q *DedupQueue) GetChunk(id ChunkID) (*Chunk, error)

func (*DedupQueue) HasChunk added in v0.7.0

func (q *DedupQueue) HasChunk(id ChunkID) (bool, error)

func (*DedupQueue) String added in v0.7.0

func (q *DedupQueue) String() string

type DefaultProgressBar added in v0.9.3

type DefaultProgressBar struct {
	*pb.ProgressBar
}

DefaultProgressBar wraps https://github.com/cheggaaa/pb and implements desync.ProgressBar

func (DefaultProgressBar) Set added in v0.9.3

func (p DefaultProgressBar) Set(current int)

Set the current value

func (DefaultProgressBar) SetTotal added in v0.9.3

func (p DefaultProgressBar) SetTotal(total int)

SetTotal sets the upper bounds for the progress bar

func (DefaultProgressBar) Start added in v0.9.3

func (p DefaultProgressBar) Start()

Start displaying the progress bar

func (DefaultProgressBar) Write added in v0.9.3

func (p DefaultProgressBar) Write(b []byte) (n int, err error)

Write the current state of the progressbar

type ExtractStats added in v0.4.0

type ExtractStats struct {
	ChunksFromSeeds uint64 `json:"chunks-from-seeds"`
	ChunksFromStore uint64 `json:"chunks-from-store"`
	ChunksInPlace   uint64 `json:"chunks-in-place"`
	BytesCopied     uint64 `json:"bytes-copied-from-seeds"`
	BytesCloned     uint64 `json:"bytes-cloned-from-seeds"`
	Blocksize       uint64 `json:"blocksize"`
	BytesTotal      int64  `json:"bytes-total"`
	ChunksTotal     int    `json:"chunks-total"`
	Seeds           int    `json:"seeds"`
}

ExtractStats contains detailed statistics about a file extract operation, such as if data chunks were copied from seeds or cloned.

func AssembleFile added in v0.2.0

func AssembleFile(ctx context.Context, name string, idx Index, s Store, seeds []Seed, options AssembleOptions) (*ExtractStats, error)

AssembleFile re-assembles a file based on a list of index chunks. It runs n goroutines, creating one filehandle for the file "name" per goroutine and writes to the file simultaneously. If progress is provided, it'll be called when a chunk has been processed. If the input file exists and is not empty, the algorithm will first confirm if the data matches what is expected and only populate areas that differ from the expected content. This can be used to complete partly written files.

type FailoverGroup added in v0.7.0

type FailoverGroup struct {
	// contains filtered or unexported fields
}

FailoverGroup wraps multiple stores to provide failover when one or more stores in the group fail. Only one of the stores in the group is considered "active" at a time. If an unexpected error is returned from the active store, the next store in the group becomes the active one and the request is retried. When all stores have returned a failure, the group passes the failure up to the caller. The active store rotates through all available stores. All stores in the group are expected to contain the same chunks; there is no failover for missing chunks. Implements the Store interface.

func NewFailoverGroup added in v0.7.0

func NewFailoverGroup(stores ...Store) *FailoverGroup

NewFailoverGroup initializes and returns a store that wraps multiple stores to form a group that can fail over between them when one fails.

func (*FailoverGroup) Close added in v0.7.0

func (g *FailoverGroup) Close() error

func (*FailoverGroup) GetChunk added in v0.7.0

func (g *FailoverGroup) GetChunk(id ChunkID) (*Chunk, error)

func (*FailoverGroup) HasChunk added in v0.7.0

func (g *FailoverGroup) HasChunk(id ChunkID) (bool, error)

func (*FailoverGroup) String added in v0.7.0

func (g *FailoverGroup) String() string

type File added in v0.8.0

type File struct {
	Name string
	Path string
	Mode os.FileMode

	Size uint64

	// Link target for symlinks
	LinkTarget string

	// Modification time
	ModTime time.Time

	// User/group IDs
	Uid int
	Gid int

	// Major/Minor for character or block devices
	DevMajor uint64
	DevMinor uint64

	// Extended attributes
	Xattrs map[string]string

	// File content. Nil for non-regular files.
	Data io.ReadCloser
}

File represents a filesystem object such as directory, file, symlink or device. It's used when creating archives from a source filesystem which can be a real OS filesystem, or another archive stream such as tar.

func (*File) Close added in v0.8.0

func (f *File) Close() error

Close closes the file data reader if any. It's safe to call for non-regular files as well.

func (*File) IsDevice added in v0.8.0

func (f *File) IsDevice() bool

func (*File) IsDir added in v0.8.0

func (f *File) IsDir() bool

func (*File) IsRegular added in v0.8.0

func (f *File) IsRegular() bool
func (f *File) IsSymlink() bool

type FileSeed added in v0.4.0

type FileSeed struct {
	// contains filtered or unexported fields
}

FileSeed is used to copy or clone blocks from an existing index+blob during file extraction.

func NewIndexSeed added in v0.4.0

func NewIndexSeed(dstFile string, srcFile string, index Index) (*FileSeed, error)

NewIndexSeed initializes a new seed that uses an existing index and its blob

func (*FileSeed) IsInvalid added in v0.9.3

func (s *FileSeed) IsInvalid() bool

func (*FileSeed) LongestMatchWith added in v0.4.0

func (s *FileSeed) LongestMatchWith(chunks []IndexChunk) (int, SeedSegment)

LongestMatchWith returns the longest sequence of chunks anywhere in Source that match `chunks` starting at chunks[0], limiting the maximum number of chunks if reflinks are not supported. If there is no match, it returns a length of zero and a nil SeedSegment.

func (*FileSeed) RegenerateIndex added in v0.9.3

func (s *FileSeed) RegenerateIndex(ctx context.Context, n int, attempt int, seedNumber int) error

func (*FileSeed) SetInvalid added in v0.9.3

func (s *FileSeed) SetInvalid(value bool)

type FilesystemReader added in v0.8.0

type FilesystemReader interface {
	Next() (*File, error)
}

FilesystemReader is an interface for a source filesystem used during tar operations. Next() is expected to return files and directories in a consistent and stable order, and to return io.EOF when no further files are available.

type FilesystemWriter added in v0.8.0

type FilesystemWriter interface {
	CreateDir(n NodeDirectory) error
	CreateFile(n NodeFile) error
	CreateSymlink(n NodeSymlink) error
	CreateDevice(n NodeDevice) error
}

FilesystemWriter is a filesystem implementation that a catar archive can be extracted (untar'ed) into.

type FormatACLDefault added in v0.2.0

type FormatACLDefault struct {
	FormatHeader
	UserObjPermissions  uint64
	GroupObjPermissions uint64
	OtherPermissions    uint64
	MaskPermissions     uint64
}

type FormatACLGroup added in v0.2.0

type FormatACLGroup struct {
	FormatHeader
	GID         uint64
	Permissions uint64
	Name        string
}

type FormatACLGroupObj added in v0.2.0

type FormatACLGroupObj struct {
	FormatHeader
	Permissions uint64
}

type FormatACLUser added in v0.2.0

type FormatACLUser struct {
	FormatHeader
	UID         uint64
	Permissions uint64
	Name        string
}

type FormatDecoder added in v0.2.0

type FormatDecoder struct {
	// contains filtered or unexported fields
}

FormatDecoder is used to parse and break up a stream of casync format elements found in archives or index files.

func NewFormatDecoder added in v0.2.0

func NewFormatDecoder(r io.Reader) FormatDecoder

func (*FormatDecoder) Next added in v0.2.0

func (d *FormatDecoder) Next() (interface{}, error)

Next returns the next format element from the stream. If an element contains a reader, that reader should be consumed before any subsequent call to Next, which invalidates it. Returns nil when the end of the stream is reached.

type FormatDevice added in v0.2.0

type FormatDevice struct {
	FormatHeader
	Major uint64
	Minor uint64
}

type FormatEncoder added in v0.2.0

type FormatEncoder struct {
	// contains filtered or unexported fields
}

FormatEncoder takes casync format elements and encodes them into a stream.

func NewFormatEncoder added in v0.2.0

func NewFormatEncoder(w io.Writer) FormatEncoder

func (*FormatEncoder) Encode added in v0.2.0

func (e *FormatEncoder) Encode(v interface{}) (int64, error)

type FormatEntry added in v0.2.0

type FormatEntry struct {
	FormatHeader
	FeatureFlags uint64
	Mode         os.FileMode
	Flags        uint64
	UID          int
	GID          int
	MTime        time.Time
}

type FormatFCaps added in v0.2.0

type FormatFCaps struct {
	FormatHeader
	Data []byte
}

type FormatFilename added in v0.2.0

type FormatFilename struct {
	FormatHeader
	Name string
}

type FormatGoodbye added in v0.2.0

type FormatGoodbye struct {
	FormatHeader
	Items []FormatGoodbyeItem
}

type FormatGoodbyeItem added in v0.2.0

type FormatGoodbyeItem struct {
	Offset uint64
	Size   uint64
	Hash   uint64 // The last item in a list has the CaFormatGoodbyeTailMarker here
}

type FormatGroup added in v0.2.0

type FormatGroup struct {
	FormatHeader
	Name string
}

type FormatHeader added in v0.2.0

type FormatHeader struct {
	Size uint64
	Type uint64
}

type FormatIndex added in v0.2.0

type FormatIndex struct {
	FormatHeader
	FeatureFlags uint64
	ChunkSizeMin uint64
	ChunkSizeAvg uint64
	ChunkSizeMax uint64
}

type FormatPayload added in v0.2.0

type FormatPayload struct {
	FormatHeader
	Data io.Reader
}

type FormatSELinux added in v0.2.0

type FormatSELinux struct {
	FormatHeader
	Label string
}
type FormatSymlink struct {
	FormatHeader
	Target string
}

type FormatTable added in v0.2.0

type FormatTable struct {
	FormatHeader
	Items []FormatTableItem
}

type FormatTableItem added in v0.2.0

type FormatTableItem struct {
	Offset uint64
	Chunk  ChunkID
}

type FormatUser added in v0.2.0

type FormatUser struct {
	FormatHeader
	Name string
}

type FormatXAttr added in v0.2.0

type FormatXAttr struct {
	FormatHeader
	NameAndValue string
}

type GCIndexStore added in v0.9.0

type GCIndexStore struct {
	GCStoreBase
}

GCIndexStore is a read-write index store with Google Storage backing

func NewGCIndexStore added in v0.9.0

func NewGCIndexStore(location *url.URL, opt StoreOptions) (s GCIndexStore, e error)

NewGCIndexStore creates an index store with Google Storage backing. The URL should be provided like this: gs://bucket/prefix

func (GCIndexStore) GetIndex added in v0.9.0

func (s GCIndexStore) GetIndex(name string) (i Index, e error)

GetIndex returns an Index structure from the store

func (GCIndexStore) GetIndexReader added in v0.9.0

func (s GCIndexStore) GetIndexReader(name string) (r io.ReadCloser, err error)

GetIndexReader returns a reader for an index from a Google Storage store. Fails if the specified index file does not exist.

func (GCIndexStore) StoreIndex added in v0.9.0

func (s GCIndexStore) StoreIndex(name string, idx Index) error

StoreIndex writes the index file to the Google Storage store

type GCStore added in v0.9.0

type GCStore struct {
	GCStoreBase
}

GCStore is a read-write store with Google Storage backing

func NewGCStore added in v0.9.0

func NewGCStore(location *url.URL, opt StoreOptions) (s GCStore, e error)

NewGCStore creates a chunk store with Google Storage backing. The URL should be provided like this: gs://bucketname/prefix. Credentials are passed in via environment variables.

func (GCStore) GetChunk added in v0.9.0

func (s GCStore) GetChunk(id ChunkID) (*Chunk, error)

GetChunk reads and returns one chunk from the store

func (GCStore) HasChunk added in v0.9.0

func (s GCStore) HasChunk(id ChunkID) (bool, error)

HasChunk returns true if the chunk is in the store

func (GCStore) Prune added in v0.9.0

func (s GCStore) Prune(ctx context.Context, ids map[ChunkID]struct{}) error

Prune removes any chunks from the store that are not contained in the given list (map) of chunk IDs

func (GCStore) RemoveChunk added in v0.9.0

func (s GCStore) RemoveChunk(id ChunkID) error

RemoveChunk deletes a chunk, typically an invalid one, from the store. Used when verifying and repairing caches.

func (GCStore) StoreChunk added in v0.9.0

func (s GCStore) StoreChunk(chunk *Chunk) error

StoreChunk adds a new chunk to the store

type GCStoreBase added in v0.9.0

type GCStoreBase struct {
	Location string
	// contains filtered or unexported fields
}

GCStoreBase is the base object for all chunk and index stores with Google Storage backing

func NewGCStoreBase added in v0.9.0

func NewGCStoreBase(u *url.URL, opt StoreOptions) (GCStoreBase, error)

NewGCStoreBase initializes a base object used for chunk or index stores backed by Google Storage.

func (GCStoreBase) Close added in v0.9.0

func (s GCStoreBase) Close() error

Close the GCS base store. NOP operation, but needed to implement the store interface.

func (GCStoreBase) String added in v0.9.0

func (s GCStoreBase) String() string

type GetReaderForRequestBody added in v0.9.0

type GetReaderForRequestBody func() io.Reader

type HTTPHandler added in v0.2.0

type HTTPHandler struct {
	HTTPHandlerBase

	SkipVerifyWrite bool
	// contains filtered or unexported fields
}

HTTPHandler is the server-side handler for a HTTP chunk store.

func (HTTPHandler) ServeHTTP added in v0.2.0

func (h HTTPHandler) ServeHTTP(w http.ResponseWriter, r *http.Request)

type HTTPHandlerBase added in v0.3.0

type HTTPHandlerBase struct {
	// contains filtered or unexported fields
}

HTTPHandlerBase is the base object for a HTTP chunk or index store.

type HTTPIndexHandler added in v0.3.0

type HTTPIndexHandler struct {
	HTTPHandlerBase
	// contains filtered or unexported fields
}

HTTPIndexHandler is the HTTP handler for index stores.

func (HTTPIndexHandler) ServeHTTP added in v0.3.0

func (h HTTPIndexHandler) ServeHTTP(w http.ResponseWriter, r *http.Request)

type Hash added in v0.2.0

type Hash struct {
	// contains filtered or unexported fields
}

Hash implements the rolling hash algorithm used to find chunk boundaries in a stream of bytes.

func NewHash added in v0.2.0

func NewHash(size int, discriminator uint32) Hash

NewHash returns a new instance of a hash. size determines the length of the hash window used and the discriminator is used to find the boundary.

func (*Hash) Initialize added in v0.2.0

func (h *Hash) Initialize(b []byte)

Initialize the window used for the rolling hash calculation. The size of the slice must match the window size

func (*Hash) IsBoundary added in v0.2.0

func (h *Hash) IsBoundary() bool

IsBoundary returns true if the discriminator and hash match to signal a chunk boundary has been reached

func (*Hash) Reset added in v0.2.0

func (h *Hash) Reset()

Reset the hash window and value

func (*Hash) Roll added in v0.2.0

func (h *Hash) Roll(b byte)

Roll adds a new byte to the hash calculation. No useful value is returned until the hash window has been populated.

type HashAlgorithm added in v0.8.0

type HashAlgorithm interface {
	Sum([]byte) [32]byte
	Algorithm() crypto.Hash
}

HashAlgorithm is a digest algorithm used to hash chunks.

var Digest HashAlgorithm = SHA512256{}

Digest algorithm used globally for all chunk hashing. Can be set to SHA512256 (default) or to SHA256.

type Index

type Index struct {
	Index  FormatIndex
	Chunks []IndexChunk
}

Index represents the content of an index file

func ChunkStream added in v0.2.0

func ChunkStream(ctx context.Context, c Chunker, ws WriteStore, n int) (Index, error)

ChunkStream splits up a blob into chunks using the provided chunker (single stream), populates a store with the chunks and returns an index. Chunking itself runs serially, while hashing and compression are performed in n goroutines.

func IndexFromReader added in v0.2.0

func IndexFromReader(r io.Reader) (c Index, err error)

IndexFromReader parses a caibx structure (from a reader) and returns a populated Index

func (*Index) Length added in v0.2.0

func (i *Index) Length() int64

Length returns the total (uncompressed) size of the indexed stream

func (*Index) WriteTo added in v0.2.0

func (i *Index) WriteTo(w io.Writer) (int64, error)

WriteTo writes the index and chunk table into a stream

type IndexChunk added in v0.2.0

type IndexChunk struct {
	ID    ChunkID
	Start uint64
	Size  uint64
}

IndexChunk is a table entry in an index file containing the chunk ID (SHA256). Similar to a FormatTableItem, but with Start and Size instead of just an offset, making it easier to use throughout the application.

type IndexMountFS added in v0.2.0

type IndexMountFS struct {
	fs.Inode

	FName string // File name in the mountpoint
	Idx   Index  // Index of the blob
	Store Store
}

IndexMountFS is used to FUSE mount an index file (as a blob, not an archive). It presents a single file underneath the mountpoint.

func NewIndexMountFS added in v0.2.0

func NewIndexMountFS(idx Index, name string, s Store) *IndexMountFS

NewIndexMountFS initializes a FUSE filesystem mount based on an index and a chunk store.

func (*IndexMountFS) Close added in v0.9.1

func (r *IndexMountFS) Close() error

func (*IndexMountFS) OnAdd added in v0.9.0

func (r *IndexMountFS) OnAdd(ctx context.Context)

OnAdd is used to build the static filesystem structure at the start of the mount.

type IndexPos added in v0.2.0

type IndexPos struct {
	Store  Store
	Index  Index
	Length int64 // total length of file
	// contains filtered or unexported fields
}

IndexPos represents a position inside an index file, to permit a seeking reader

func NewIndexReadSeeker added in v0.2.0

func NewIndexReadSeeker(i Index, s Store) *IndexPos

NewIndexReadSeeker initializes a ReadSeeker for indexes.

func (*IndexPos) Read added in v0.2.0

func (ip *IndexPos) Read(p []byte) (n int, err error)

func (*IndexPos) Seek added in v0.2.0

func (ip *IndexPos) Seek(offset int64, whence int) (int64, error)

Seek implements the io.Seeker interface. Sets the offset for the next Read operation.

type IndexSegment added in v0.4.0

type IndexSegment struct {
	// contains filtered or unexported fields
}

IndexSegment represents a contiguous section of an index which is used when assembling a file from seeds. first/last are positions in the index.

type IndexStore added in v0.3.0

type IndexStore interface {
	GetIndexReader(name string) (io.ReadCloser, error)
	GetIndex(name string) (Index, error)
	io.Closer
	fmt.Stringer
}

IndexStore is implemented by stores that hold indexes.

type IndexWriteStore added in v0.3.0

type IndexWriteStore interface {
	IndexStore
	StoreIndex(name string, idx Index) error
}

IndexWriteStore is used by stores that support reading and writing of indexes.

type Interrupted added in v0.2.0

type Interrupted struct{}

Interrupted is returned when a user interrupted a long-running operation, for example by pressing Ctrl+C

func (Interrupted) Error added in v0.2.0

func (e Interrupted) Error() string

type InvalidFormat added in v0.2.0

type InvalidFormat struct {
	Msg string
}

InvalidFormat is returned when an error occurred when parsing an archive file

func (InvalidFormat) Error added in v0.2.0

func (e InvalidFormat) Error() string

type InvalidSeedAction added in v0.9.3

type InvalidSeedAction int

InvalidSeedAction represents the action taken when a seed turns out to be invalid. There are currently three options:

- fail with an error
- skip the invalid seed and try to continue
- regenerate the invalid seed index

const (
	InvalidSeedActionBailOut InvalidSeedAction = iota
	InvalidSeedActionSkip
	InvalidSeedActionRegenerate
)

type LocalFS added in v0.8.0

type LocalFS struct {
	// Base directory
	Root string
	// contains filtered or unexported fields
}

LocalFS uses the local filesystem for tar/untar operations.

func NewLocalFS added in v0.8.0

func NewLocalFS(root string, opts LocalFSOptions) *LocalFS

NewLocalFS initializes a new instance of a local filesystem that can be used for tar/untar operations.

func (*LocalFS) CreateDevice added in v0.8.0

func (fs *LocalFS) CreateDevice(n NodeDevice) error

func (*LocalFS) CreateDir added in v0.8.0

func (fs *LocalFS) CreateDir(n NodeDirectory) error

func (*LocalFS) CreateFile added in v0.8.0

func (fs *LocalFS) CreateFile(n NodeFile) error
func (fs *LocalFS) CreateSymlink(n NodeSymlink) error

func (*LocalFS) Next added in v0.8.0

func (fs *LocalFS) Next() (*File, error)

Next returns the next filesystem entry or io.EOF when done. The caller is responsible for closing the returned File object.

func (*LocalFS) SetDirPermissions added in v0.9.0

func (fs *LocalFS) SetDirPermissions(n NodeDirectory) error

func (*LocalFS) SetFilePermissions added in v0.9.0

func (fs *LocalFS) SetFilePermissions(n NodeFile) error

func (*LocalFS) SetSymlinkPermissions added in v0.9.0

func (fs *LocalFS) SetSymlinkPermissions(n NodeSymlink) error

type LocalFSOptions added in v0.8.0

type LocalFSOptions struct {
	// Only used when reading from the filesystem. Will only return
	// files from the same device as the first read operation.
	OneFileSystem bool

	// When writing files, use the current owner and don't try to apply the original owner.
	NoSameOwner bool

	// Ignore the incoming permissions when writing files. Use the current default instead.
	NoSamePermissions bool

	// Reads all timestamps as zero. Used in tar operations to avoid unnecessary changes.
	NoTime bool
}

LocalFSOptions influence the behavior of the filesystem when reading from or writing to it.

type LocalIndexStore added in v0.3.0

type LocalIndexStore struct {
	Path string
}

LocalIndexStore is used to read/write index files on local disk

func NewLocalIndexStore added in v0.4.0

func NewLocalIndexStore(path string) (LocalIndexStore, error)

NewLocalIndexStore creates an instance of a local index store; it only checks for the presence of the store

func (LocalIndexStore) Close added in v0.3.0

func (s LocalIndexStore) Close() error

Close the index store. NOP operation, needed to implement IndexStore interface

func (LocalIndexStore) GetIndex added in v0.3.0

func (s LocalIndexStore) GetIndex(name string) (i Index, e error)

GetIndex returns an Index structure from the store

func (LocalIndexStore) GetIndexReader added in v0.3.0

func (s LocalIndexStore) GetIndexReader(name string) (rdr io.ReadCloser, e error)

GetIndexReader returns a reader of an index file in the store or an error if the specified index file does not exist.

func (LocalIndexStore) StoreIndex added in v0.3.0

func (s LocalIndexStore) StoreIndex(name string, idx Index) error

StoreIndex stores an index in the index store with the given name.

func (LocalIndexStore) String added in v0.3.0

func (s LocalIndexStore) String() string

type LocalStore

type LocalStore struct {
	Base string

	// When accessing chunks, should mtime be updated? Useful when this is
	// a cache. Old chunks can be identified and removed from the store that way
	UpdateTimes bool
	// contains filtered or unexported fields
}

LocalStore is a casync chunk store backed by a local directory

func NewLocalStore

func NewLocalStore(dir string, opt StoreOptions) (LocalStore, error)

NewLocalStore creates an instance of a local castore; it only checks for the presence of the store

func (LocalStore) Close added in v0.2.0

func (s LocalStore) Close() error

Close the store. NOP operation, needed to implement Store interface.

func (LocalStore) GetChunk

func (s LocalStore) GetChunk(id ChunkID) (*Chunk, error)

GetChunk reads and returns one (compressed!) chunk from the store

func (LocalStore) HasChunk added in v0.2.0

func (s LocalStore) HasChunk(id ChunkID) (bool, error)

HasChunk returns true if the chunk is in the store

func (LocalStore) Prune added in v0.2.0

func (s LocalStore) Prune(ctx context.Context, ids map[ChunkID]struct{}) error

Prune removes any chunks from the store that are not contained in a list of chunks

func (LocalStore) RemoveChunk added in v0.2.0

func (s LocalStore) RemoveChunk(id ChunkID) error

RemoveChunk deletes a chunk, typically an invalid one, from the filesystem. Used when verifying and repairing caches.

func (LocalStore) StoreChunk

func (s LocalStore) StoreChunk(chunk *Chunk) error

StoreChunk adds a new chunk to the store

func (LocalStore) String added in v0.2.0

func (s LocalStore) String() string

func (LocalStore) Verify added in v0.2.0

func (s LocalStore) Verify(ctx context.Context, n int, repair bool, w io.Writer) error

Verify all chunks in the store. If repair is set true, bad chunks are deleted. n determines the number of concurrent operations. w is used to write any messages intended for the user, typically os.Stderr.

type Message

type Message struct {
	Type uint64
	Body []byte
}

Message represents a command sent to, or received from the communication partner.

type MountFS added in v0.9.1

type MountFS interface {
	fs.InodeEmbedder

	Close() error
}

type MtreeFS added in v0.8.0

type MtreeFS struct {
	// contains filtered or unexported fields
}

MtreeFS prints the filesystem operations to a writer (which can be os.Stdout) in mtree format.

func NewMtreeFS added in v0.8.0

func NewMtreeFS(w io.Writer) (MtreeFS, error)

NewMtreeFS initializes a new instance of an mtree writer that sends its output into the provided stream.

func (MtreeFS) CreateDevice added in v0.8.0

func (fs MtreeFS) CreateDevice(n NodeDevice) error

func (MtreeFS) CreateDir added in v0.8.0

func (fs MtreeFS) CreateDir(n NodeDirectory) error

func (MtreeFS) CreateFile added in v0.8.0

func (fs MtreeFS) CreateFile(n NodeFile) error
func (fs MtreeFS) CreateSymlink(n NodeSymlink) error

type NoSuchObject added in v0.3.0

type NoSuchObject struct {
	// contains filtered or unexported fields
}

NoSuchObject is returned by a store that can't find a requested object

func (NoSuchObject) Error added in v0.3.0

func (e NoSuchObject) Error() string

type NodeDevice added in v0.2.0

type NodeDevice struct {
	Name   string
	UID    int
	GID    int
	Mode   os.FileMode
	Major  uint64
	Minor  uint64
	Xattrs Xattrs
	MTime  time.Time
}

NodeDevice holds device information in a catar archive

type NodeDirectory added in v0.2.0

type NodeDirectory struct {
	Name   string
	UID    int
	GID    int
	Mode   os.FileMode
	MTime  time.Time
	Xattrs Xattrs
}

NodeDirectory represents a directory in a catar archive

type NodeFile added in v0.2.0

type NodeFile struct {
	UID    int
	GID    int
	Mode   os.FileMode
	Name   string
	MTime  time.Time
	Xattrs Xattrs
	Size   uint64
	Data   io.Reader
}

NodeFile holds file permissions and data in a catar archive

type NodeSymlink struct {
	Name   string
	UID    int
	GID    int
	Mode   os.FileMode
	MTime  time.Time
	Xattrs Xattrs
	Target string
}

NodeSymlink holds symlink information in a catar archive

type NullChunk added in v0.2.0

type NullChunk struct {
	Data []byte
	ID   ChunkID
}

NullChunk is used in places where it's common to see requests for chunks containing only 0-bytes. When a chunked file has large areas of 0-bytes, the chunking algorithm does not produce split boundaries, which results in many chunks of 0-bytes of size MAX (max chunk size). The NullChunk can be used to make requesting this kind of chunk more efficient by serving it from memory, rather than requesting it from disk or network and decompressing it repeatedly.

func NewNullChunk added in v0.2.0

func NewNullChunk(size uint64) *NullChunk

NewNullChunk returns an initialized chunk consisting of 0-bytes of 'size', which must match the max chunk size used in the index to be effective

type NullProgressBar added in v0.9.3

type NullProgressBar struct {
}

NullProgressBar wraps https://github.com/cheggaaa/pb and is used when we don't want to show a progressbar.

func (NullProgressBar) Add added in v0.9.3

func (p NullProgressBar) Add(add int) int

func (NullProgressBar) Finish added in v0.9.3

func (p NullProgressBar) Finish()

func (NullProgressBar) Increment added in v0.9.3

func (p NullProgressBar) Increment() int

func (NullProgressBar) Set added in v0.9.3

func (p NullProgressBar) Set(current int)

func (NullProgressBar) SetTotal added in v0.9.3

func (p NullProgressBar) SetTotal(total int)

func (NullProgressBar) Start added in v0.9.3

func (p NullProgressBar) Start()

func (NullProgressBar) Write added in v0.9.3

func (p NullProgressBar) Write(b []byte) (n int, err error)

type Plan added in v0.9.3

type Plan []SeedSegmentCandidate

func (Plan) Validate added in v0.9.3

func (p Plan) Validate(ctx context.Context, n int, pb ProgressBar) (err error)

Validate validates a proposed plan by checking if all the chosen chunks are correctly provided from the seeds. In case a seed has invalid chunks, the entire seed is marked as invalid and an error is returned.

type ProgressBar added in v0.2.0

type ProgressBar interface {
	SetTotal(total int)
	Start()
	Finish()
	Increment() int
	Add(add int) int
	Set(current int)
	io.Writer
}

ProgressBar allows clients to provide their own implementations of graphical progress visualizations. Optional, can be nil to disable this feature.

func NewProgressBar added in v0.9.3

func NewProgressBar(prefix string) ProgressBar

NewProgressBar initializes a wrapper for a https://github.com/cheggaaa/pb progressbar that implements desync.ProgressBar

type Protocol added in v0.2.0

type Protocol struct {
	// contains filtered or unexported fields
}

Protocol handles the casync protocol when using remote stores via SSH

func NewProtocol added in v0.2.0

func NewProtocol(r io.Reader, w io.Writer) *Protocol

NewProtocol creates a new casync protocol handler

func StartProtocol added in v0.2.0

func StartProtocol(u *url.URL) (*Protocol, error)

StartProtocol initiates a connection to the remote store server using the value in CASYNC_SSH_PATH (default "ssh"), and executes the command in CASYNC_REMOTE_PATH (default "casync"). It then performs the HELLO handshake to initialize the connection

func (*Protocol) Initialize added in v0.2.0

func (p *Protocol) Initialize(flags uint64) (uint64, error)

Initialize exchanges HELLOs with the other side to start a protocol session. Returns the (capability) flags provided by the other party.

func (*Protocol) ReadMessage added in v0.2.0

func (p *Protocol) ReadMessage() (Message, error)

ReadMessage reads a generic message from the other end, verifies the length, extracts the type and returns the message body as a byte slice

func (*Protocol) RecvHello added in v0.2.0

func (p *Protocol) RecvHello() (uint64, error)

RecvHello waits for the server to send a HELLO, fails if anything else is received. Returns the flags provided by the server.

func (*Protocol) RequestChunk added in v0.2.0

func (p *Protocol) RequestChunk(id ChunkID) (*Chunk, error)

RequestChunk sends a request for a specific chunk to the server, waits for the response and returns the bytes in the chunk. Returns an error if the server reports the chunk as missing

func (*Protocol) SendGoodbye added in v0.2.0

func (p *Protocol) SendGoodbye() error

SendGoodbye tells the other side to terminate gracefully

func (*Protocol) SendHello added in v0.2.0

func (p *Protocol) SendHello(flags uint64) error

SendHello sends a HELLO message to the server, with the flags signaling which service is being requested from it.

func (*Protocol) SendMissing added in v0.2.0

func (p *Protocol) SendMissing(id ChunkID) error

SendMissing tells the client that the requested chunk is not available

func (*Protocol) SendProtocolChunk added in v0.2.0

func (p *Protocol) SendProtocolChunk(id ChunkID, flags uint64, chunk []byte) error

SendProtocolChunk responds to a request with the content of a chunk

func (*Protocol) SendProtocolRequest added in v0.2.0

func (p *Protocol) SendProtocolRequest(id ChunkID, flags uint64) error

SendProtocolRequest requests a chunk from a server

func (*Protocol) WriteMessage added in v0.2.0

func (p *Protocol) WriteMessage(m Message) error

WriteMessage sends a generic message to the server

type ProtocolServer added in v0.2.0

type ProtocolServer struct {
	// contains filtered or unexported fields
}

ProtocolServer serves up chunks from a local store using the casync protocol

func NewProtocolServer added in v0.2.0

func NewProtocolServer(r io.Reader, w io.Writer, s Store) *ProtocolServer

NewProtocolServer returns an initialized server that can serve chunks from a chunk store via the casync protocol

func (*ProtocolServer) Serve added in v0.2.0

func (s *ProtocolServer) Serve(ctx context.Context) error

Serve starts the protocol server. Blocks until an error is encountered

type PruneStore added in v0.2.0

type PruneStore interface {
	WriteStore
	Prune(ctx context.Context, ids map[ChunkID]struct{}) error
}

PruneStore is a store that supports read, write and pruning of chunks

type RemoteHTTP added in v0.2.0

type RemoteHTTP struct {
	*RemoteHTTPBase
}

RemoteHTTP is a remote casync store accessed via HTTP.

func NewRemoteHTTPStore added in v0.2.0

func NewRemoteHTTPStore(location *url.URL, opt StoreOptions) (*RemoteHTTP, error)

NewRemoteHTTPStore initializes a new store that pulls chunks via HTTP(S) from a remote web server. n defines the number of idle connections allowed.

func (*RemoteHTTP) GetChunk added in v0.2.0

func (r *RemoteHTTP) GetChunk(id ChunkID) (*Chunk, error)

GetChunk reads and returns one chunk from the store

func (*RemoteHTTP) HasChunk added in v0.2.0

func (r *RemoteHTTP) HasChunk(id ChunkID) (bool, error)

HasChunk returns true if the chunk is in the store

func (*RemoteHTTP) StoreChunk added in v0.2.0

func (r *RemoteHTTP) StoreChunk(chunk *Chunk) error

StoreChunk adds a new chunk to the store

type RemoteHTTPBase added in v0.3.0

type RemoteHTTPBase struct {
	// contains filtered or unexported fields
}

RemoteHTTPBase is the base object for remote, HTTP-based chunk or index stores.

func NewRemoteHTTPStoreBase added in v0.3.0

func NewRemoteHTTPStoreBase(location *url.URL, opt StoreOptions) (*RemoteHTTPBase, error)

NewRemoteHTTPStoreBase initializes a base object for HTTP index or chunk stores.

func (*RemoteHTTPBase) Close added in v0.3.0

func (r *RemoteHTTPBase) Close() error

Close the HTTP store. NOP operation but needed to implement the interface.

func (*RemoteHTTPBase) GetObject added in v0.3.0

func (r *RemoteHTTPBase) GetObject(name string) ([]byte, error)

GetObject reads and returns an object in the form of []byte from the store

func (*RemoteHTTPBase) IssueHttpRequest added in v0.9.0

func (r *RemoteHTTPBase) IssueHttpRequest(method string, u *url.URL, getReader GetReaderForRequestBody, attempt int) (int, []byte, error)

IssueHttpRequest sends a single HTTP request.

func (*RemoteHTTPBase) IssueRetryableHttpRequest added in v0.9.0

func (r *RemoteHTTPBase) IssueRetryableHttpRequest(method string, u *url.URL, getReader GetReaderForRequestBody) (int, []byte, error)

IssueRetryableHttpRequest sends a single HTTP request, retrying when a retryable error occurs.

func (*RemoteHTTPBase) StoreObject added in v0.3.0

func (r *RemoteHTTPBase) StoreObject(name string, getReader GetReaderForRequestBody) error

StoreObject stores an object to the store.

func (*RemoteHTTPBase) String added in v0.3.0

func (r *RemoteHTTPBase) String() string

type RemoteHTTPIndex added in v0.3.0

type RemoteHTTPIndex struct {
	*RemoteHTTPBase
}

RemoteHTTPIndex is a remote index store accessed via HTTP.

func NewRemoteHTTPIndexStore added in v0.3.0

func NewRemoteHTTPIndexStore(location *url.URL, opt StoreOptions) (*RemoteHTTPIndex, error)

NewRemoteHTTPIndexStore initializes a new store that pulls the specified index file via HTTP(S) from a remote web server.

func (*RemoteHTTPIndex) GetIndex added in v0.3.0

func (r *RemoteHTTPIndex) GetIndex(name string) (i Index, e error)

GetIndex returns an Index structure from the store

func (RemoteHTTPIndex) GetIndexReader added in v0.3.0

func (r RemoteHTTPIndex) GetIndexReader(name string) (rdr io.ReadCloser, e error)

GetIndexReader returns an index reader from an HTTP store. Fails if the specified index file does not exist.

func (*RemoteHTTPIndex) StoreIndex added in v0.3.0

func (r *RemoteHTTPIndex) StoreIndex(name string, idx Index) error

StoreIndex stores an index in the remote index store with the given name

type RemoteSSH

type RemoteSSH struct {
	// contains filtered or unexported fields
}

RemoteSSH is a remote casync store accessed via SSH. Supports running multiple sessions to improve throughput.

func NewRemoteSSHStore

func NewRemoteSSHStore(location *url.URL, opt StoreOptions) (*RemoteSSH, error)

NewRemoteSSHStore establishes up to n connections with a casync chunk server

func (*RemoteSSH) Close added in v0.2.0

func (r *RemoteSSH) Close() error

Close terminates all client connections

func (*RemoteSSH) GetChunk

func (r *RemoteSSH) GetChunk(id ChunkID) (*Chunk, error)

GetChunk requests a chunk from the server and returns a (compressed) one. It uses any of the n sessions this store maintains in its pool. Blocks until one session becomes available

func (*RemoteSSH) HasChunk added in v0.2.0

func (r *RemoteSSH) HasChunk(id ChunkID) (bool, error)

HasChunk returns true if the chunk is in the store. TODO: Implementing it this way, pulling the whole chunk just to see if it's present, is very inefficient. I'm not aware of a way to implement it with the casync protocol any other way.

func (*RemoteSSH) String added in v0.2.0

func (r *RemoteSSH) String() string

type RepairableCache added in v0.9.3

type RepairableCache struct {
	// contains filtered or unexported fields
}

RepairableCache is a cache whose GetChunk() returns a ChunkMissing error instead of ChunkInvalid, so the caller can re-download an invalid chunk from the store.

func NewRepairableCache added in v0.9.3

func NewRepairableCache(l WriteStore) RepairableCache

NewRepairableCache creates a new RepairableCache that wraps a WriteStore and modifies its GetChunk() so that ChunkInvalid errors are replaced by ChunkMissing errors.

func (RepairableCache) Close added in v0.9.3

func (r RepairableCache) Close() error

func (RepairableCache) GetChunk added in v0.9.3

func (r RepairableCache) GetChunk(id ChunkID) (*Chunk, error)

func (RepairableCache) HasChunk added in v0.9.3

func (r RepairableCache) HasChunk(id ChunkID) (bool, error)

func (RepairableCache) StoreChunk added in v0.9.3

func (r RepairableCache) StoreChunk(c *Chunk) error

func (RepairableCache) String added in v0.9.3

func (r RepairableCache) String() string

type S3IndexStore added in v0.3.0

type S3IndexStore struct {
	S3StoreBase
}

S3IndexStore is a read-write index store with S3 backing

func NewS3IndexStore added in v0.3.0

func NewS3IndexStore(location *url.URL, s3Creds *credentials.Credentials, region string, opt StoreOptions, lookupType minio.BucketLookupType) (s S3IndexStore, e error)

NewS3IndexStore creates an index store with S3 backing. The URL should be provided in the form s3+http://host:port/bucket. Credentials are passed in via the environment variables S3_ACCESS_KEY and S3_SECRET_KEY, or via the desync config file.

func (S3IndexStore) GetIndex added in v0.3.0

func (s S3IndexStore) GetIndex(name string) (i Index, e error)

GetIndex returns an Index structure from the store

func (S3IndexStore) GetIndexReader added in v0.3.0

func (s S3IndexStore) GetIndexReader(name string) (r io.ReadCloser, e error)

GetIndexReader returns a reader for an index from an S3 store. Fails if the specified index file does not exist.

func (S3IndexStore) StoreIndex added in v0.3.0

func (s S3IndexStore) StoreIndex(name string, idx Index) error

StoreIndex writes the index file to the S3 store

type S3Store added in v0.2.0

type S3Store struct {
	S3StoreBase
}

S3Store is a read-write store with S3 backing

func NewS3Store added in v0.2.0

func NewS3Store(location *url.URL, s3Creds *credentials.Credentials, region string, opt StoreOptions, lookupType minio.BucketLookupType) (s S3Store, e error)

NewS3Store creates a chunk store with S3 backing. The URL should be provided in the form s3+http://host:port/bucket. Credentials are passed in via the environment variables S3_ACCESS_KEY and S3_SECRET_KEY, or via the desync config file.

func (S3Store) GetChunk added in v0.2.0

func (s S3Store) GetChunk(id ChunkID) (*Chunk, error)

GetChunk reads and returns one chunk from the store

func (S3Store) HasChunk added in v0.2.0

func (s S3Store) HasChunk(id ChunkID) (bool, error)

HasChunk returns true if the chunk is in the store

func (S3Store) Prune added in v0.2.0

func (s S3Store) Prune(ctx context.Context, ids map[ChunkID]struct{}) error

Prune removes any chunks from the store that are not contained in a list (map)

func (S3Store) RemoveChunk added in v0.2.0

func (s S3Store) RemoveChunk(id ChunkID) error

RemoveChunk deletes a chunk, typically an invalid one, from the filesystem. Used when verifying and repairing caches.

func (S3Store) StoreChunk added in v0.2.0

func (s S3Store) StoreChunk(chunk *Chunk) error

StoreChunk adds a new chunk to the store

type S3StoreBase added in v0.3.0

type S3StoreBase struct {
	Location string
	// contains filtered or unexported fields
}

S3StoreBase is the base object for all chunk and index stores with S3 backing

func NewS3StoreBase added in v0.3.0

func NewS3StoreBase(u *url.URL, s3Creds *credentials.Credentials, region string, opt StoreOptions, lookupType minio.BucketLookupType) (S3StoreBase, error)

NewS3StoreBase initializes a base object used for chunk or index stores backed by S3.

func (S3StoreBase) Close added in v0.3.0

func (s S3StoreBase) Close() error

Close the S3 base store. No-op, but needed to implement the store interface.

func (S3StoreBase) String added in v0.3.0

func (s S3StoreBase) String() string

type SFTPIndexStore added in v0.3.0

type SFTPIndexStore struct {
	*SFTPStoreBase
}

SFTPIndexStore is an index store backed by SFTP over SSH

func NewSFTPIndexStore added in v0.3.0

func NewSFTPIndexStore(location *url.URL, opt StoreOptions) (*SFTPIndexStore, error)

NewSFTPIndexStore initializes an index store backed by SFTP over SSH.

func (*SFTPIndexStore) GetIndex added in v0.3.0

func (s *SFTPIndexStore) GetIndex(name string) (i Index, e error)

GetIndex reads an index from an SFTP store, returns an error if the specified index file does not exist.

func (*SFTPIndexStore) GetIndexReader added in v0.3.0

func (s *SFTPIndexStore) GetIndexReader(name string) (r io.ReadCloser, e error)

GetIndexReader returns a reader of an index from an SFTP store. Fails if the specified index file does not exist.

func (*SFTPIndexStore) StoreIndex added in v0.3.0

func (s *SFTPIndexStore) StoreIndex(name string, idx Index) error

StoreIndex adds a new index to the store

type SFTPStore added in v0.2.0

type SFTPStore struct {
	// contains filtered or unexported fields
}

SFTPStore is a chunk store that uses SFTP over SSH.

func NewSFTPStore added in v0.2.0

func NewSFTPStore(location *url.URL, opt StoreOptions) (*SFTPStore, error)

NewSFTPStore initializes a chunk store using SFTP over SSH.

func (*SFTPStore) Close added in v0.2.0

func (s *SFTPStore) Close() error

Close terminates all client connections

func (*SFTPStore) GetChunk added in v0.2.0

func (s *SFTPStore) GetChunk(id ChunkID) (*Chunk, error)

GetChunk returns a chunk from an SFTP store, returns ChunkMissing if the file does not exist

func (*SFTPStore) HasChunk added in v0.2.0

func (s *SFTPStore) HasChunk(id ChunkID) (bool, error)

HasChunk returns true if the chunk is in the store

func (*SFTPStore) Prune added in v0.2.0

func (s *SFTPStore) Prune(ctx context.Context, ids map[ChunkID]struct{}) error

Prune removes any chunks from the store that are not contained in a list of chunks

func (*SFTPStore) RemoveChunk added in v0.2.0

func (s *SFTPStore) RemoveChunk(id ChunkID) error

RemoveChunk deletes a chunk, typically an invalid one, from the filesystem. Used when verifying and repairing caches.

func (*SFTPStore) StoreChunk added in v0.2.0

func (s *SFTPStore) StoreChunk(chunk *Chunk) error

StoreChunk adds a new chunk to the store

func (*SFTPStore) String added in v0.2.0

func (s *SFTPStore) String() string

type SFTPStoreBase added in v0.3.0

type SFTPStoreBase struct {
	// contains filtered or unexported fields
}

SFTPStoreBase is the base object for SFTP chunk and index stores.

func (*SFTPStoreBase) Close added in v0.3.0

func (s *SFTPStoreBase) Close() error

Close terminates all client connections

func (*SFTPStoreBase) StoreObject added in v0.3.0

func (s *SFTPStoreBase) StoreObject(name string, r io.Reader) error

StoreObject adds a new object to a writable index or chunk store.

func (*SFTPStoreBase) String added in v0.3.0

func (s *SFTPStoreBase) String() string

type SHA256 added in v0.8.0

type SHA256 struct{}

SHA256 hashing algorithm for Digest.

func (SHA256) Algorithm added in v0.8.0

func (h SHA256) Algorithm() crypto.Hash

func (SHA256) Sum added in v0.8.0

func (h SHA256) Sum(data []byte) [32]byte

type SHA512256 added in v0.8.0

type SHA512256 struct{}

SHA512-256 hashing algorithm for Digest.

func (SHA512256) Algorithm added in v0.8.0

func (h SHA512256) Algorithm() crypto.Hash

func (SHA512256) Sum added in v0.8.0

func (h SHA512256) Sum(data []byte) [32]byte

type Seed added in v0.4.0

type Seed interface {
	LongestMatchWith(chunks []IndexChunk) (int, SeedSegment)
	RegenerateIndex(ctx context.Context, n int, attempt int, seedNumber int) error
	SetInvalid(value bool)
	IsInvalid() bool
}

Seed represents a source of chunks other than the store. Typically a seed is another index+blob pair that is already present on disk and is used to copy or clone existing chunks or blocks into the target.

type SeedSegment added in v0.4.0

type SeedSegment interface {
	FileName() string
	Size() uint64
	Validate(file *os.File) error
	WriteInto(dst *os.File, offset, end, blocksize uint64, isBlank bool) (copied uint64, cloned uint64, err error)
}

SeedSegment represents a matching range between a Seed and a file being assembled from an Index. It's used to copy or reflink data from seeds into a target file during an extract operation.

type SeedSegmentCandidate added in v0.9.3

type SeedSegmentCandidate struct {
	// contains filtered or unexported fields
}

SeedSegmentCandidate represents a single segment that we expect to use in a Plan

type SeedSequencer added in v0.4.0

type SeedSequencer struct {
	// contains filtered or unexported fields
}

SeedSequencer is used to find sequences of chunks from seed files when assembling a file from an index. Using seeds reduces the need to download and decompress chunks from chunk stores. It also enables the use of reflinking/cloning of sections of files from a seed file where supported to reduce disk usage.

func NewSeedSequencer added in v0.4.0

func NewSeedSequencer(idx Index, src ...Seed) *SeedSequencer

NewSeedSequencer initializes a new sequencer from a number of seeds.

func (*SeedSequencer) Next added in v0.4.0

func (r *SeedSequencer) Next() (seed Seed, segment IndexSegment, source SeedSegment, done bool)

Next returns a sequence of index chunks (from the target index) and the longest matching segment from one of the seeds. If source is nil, no match was found in the seeds and the chunk needs to be retrieved from a store. If done is true, the sequencer is complete.

func (*SeedSequencer) Plan added in v0.9.3

func (r *SeedSequencer) Plan() (plan Plan)

Plan returns a new possible plan, representing an ordered list of segments that can be used to re-assemble the requested file

func (*SeedSequencer) RegenerateInvalidSeeds added in v0.9.3

func (r *SeedSequencer) RegenerateInvalidSeeds(ctx context.Context, n int, attempt int) error

RegenerateInvalidSeeds regenerates the index to match the unexpected seed content

func (*SeedSequencer) Rewind added in v0.9.3

func (r *SeedSequencer) Rewind()

Rewind resets the current target index to the beginning.

type SparseFile added in v0.9.1

type SparseFile struct {
	// contains filtered or unexported fields
}

SparseFile represents a file that is written as it is read (Copy-on-read). It is used as a fast cache. Any chunk read from the store to satisfy a read operation is written to the file.

func NewSparseFile added in v0.9.1

func NewSparseFile(name string, idx Index, s Store, opt SparseFileOptions) (*SparseFile, error)

func (*SparseFile) Length added in v0.9.1

func (sf *SparseFile) Length() int64

Length returns the size of the index used for the sparse file.

func (*SparseFile) Open added in v0.9.1

func (sf *SparseFile) Open() (*SparseFileHandle, error)

Open returns a handle for a sparse file.

func (*SparseFile) WriteState added in v0.9.1

func (sf *SparseFile) WriteState() error

WriteState saves the state of the file, basically which chunks were loaded and which ones weren't.

type SparseFileHandle added in v0.9.1

type SparseFileHandle struct {
	// contains filtered or unexported fields
}

SparseFileHandle is used to access a sparse file. All read operations performed on the handle are either done on the file if the required ranges are available or loaded from the store and written to the file.

func (*SparseFileHandle) Close added in v0.9.1

func (h *SparseFileHandle) Close() error

func (*SparseFileHandle) ReadAt added in v0.9.1

func (h *SparseFileHandle) ReadAt(b []byte, offset int64) (int, error)

ReadAt reads from the sparse file. All accessed ranges are first written to the file and then returned.

type SparseFileOptions added in v0.9.1

type SparseFileOptions struct {
	// Optional, save the state of the sparse file on exit or SIGHUP. The state file
	// contains information which chunks from the index have been read and are
	// populated in the sparse file. If the state and sparse file exist and match,
	// the sparse file is used as is (not re-populated).
	StateSaveFile string

	// Optional, load all chunks that are marked as read in this state file. It is used
	// to pre-populate a new sparse file if the sparse file or the save state file aren't
	// present or don't match the index. StateSaveFile and StateInitFile can be the same.
	StateInitFile string

	// Optional, number of goroutines to preload chunks from StateInitFile.
	StateInitConcurrency int
}

type SparseMountFS added in v0.9.1

type SparseMountFS struct {
	fs.Inode

	FName string // File name in the mountpoint
	// contains filtered or unexported fields
}

SparseMountFS is used to FUSE mount an index file (as a blob, not an archive). It uses a (local) sparse file as cache to improve performance. Every chunk that is being read is written into the sparse file.

func NewSparseMountFS added in v0.9.1

func NewSparseMountFS(idx Index, name string, s Store, sparseFile string, opt SparseFileOptions) (*SparseMountFS, error)

NewSparseMountFS initializes a FUSE filesystem mount based on an index, a sparse file and a chunk store.

func (*SparseMountFS) Close added in v0.9.1

func (r *SparseMountFS) Close() error

Close the sparse file and save its state.

func (*SparseMountFS) OnAdd added in v0.9.1

func (r *SparseMountFS) OnAdd(ctx context.Context)

OnAdd is used to build the static filesystem structure at the start of the mount.

func (*SparseMountFS) WriteState added in v0.9.1

func (r *SparseMountFS) WriteState() error

Save the state of the sparse file.

type Store

type Store interface {
	GetChunk(id ChunkID) (*Chunk, error)
	HasChunk(id ChunkID) (bool, error)
	io.Closer
	fmt.Stringer
}

Store is a generic interface implemented by read-only stores, like SSH or HTTP remote stores currently.

type StoreOptions added in v0.4.0

type StoreOptions struct {
	// Concurrency used in the store. Depending on store type, it's used for
	// the number of goroutines, processes, or connection pool size.
	N int `json:"n,omitempty"`

	// Cert file name for HTTP SSL connections that require mutual SSL.
	ClientCert string `json:"client-cert,omitempty"`
	// Key file name for HTTP SSL connections that require mutual SSL.
	ClientKey string `json:"client-key,omitempty"`

	// CA certificates to trust in TLS connections. If not set, the system's CA store is used.
	CACert string `json:"ca-cert,omitempty"`

	// Trust any certificate presented by the remote chunk store.
	TrustInsecure bool `json:"trust-insecure,omitempty"`

	// Authorization header value for HTTP stores
	HTTPAuth string `json:"http-auth,omitempty"`

	// Cookie header value for HTTP stores
	HTTPCookie string `json:"http-cookie,omitempty"`

	// Timeout for waiting for objects to be retrieved. Infinite if negative. Default: 1 minute
	Timeout time.Duration `json:"timeout,omitempty"`

	// Number of times object retrieval should be attempted on error. Useful when dealing
	// with unreliable connections. Default: 0
	ErrorRetry int `json:"error-retry,omitempty"`

	// Number of nanoseconds to wait before first retry attempt.
	// Retry attempt number N for the same request will wait N times this interval.
	// Default: 0 nanoseconds
	ErrorRetryBaseInterval time.Duration `json:"error-retry-base-interval,omitempty"`

	// If SkipVerify is true, this store will not verify the data it reads and serves up. This is
	// helpful when a store is merely a proxy and the data will pass through additional stores
	// before being used. Verifying the checksum of a chunk requires it be uncompressed, so if
	// a compressed chunkstore is being proxied, all chunks would have to be decompressed first.
	// This setting avoids the extra overhead. While this could be used in other cases, it's not
	// recommended as a damaged chunk might be processed further leading to unpredictable results.
	SkipVerify bool `json:"skip-verify,omitempty"`

	// Store and read chunks uncompressed, without chunk file extension
	Uncompressed bool `json:"uncompressed"`
}

StoreOptions provide additional common settings used in chunk stores, such as compression, error retry, or timeouts. Not all options available are applicable to all types of stores.

type StoreRouter added in v0.2.0

type StoreRouter struct {
	Stores []Store
}

StoreRouter is used to route requests to multiple stores. When a chunk is requested from the router, it'll query the first store and if that returns ChunkMissing, it'll move on to the next.

func NewStoreRouter added in v0.2.0

func NewStoreRouter(stores ...Store) StoreRouter

NewStoreRouter returns an initialized router

func (StoreRouter) Close added in v0.2.0

func (r StoreRouter) Close() error

Close calls the Close() method on every store in the router. Returns only the first error encountered.

func (StoreRouter) GetChunk added in v0.2.0

func (r StoreRouter) GetChunk(id ChunkID) (*Chunk, error)

GetChunk queries the available stores in order and moves to the next if it gets a ChunkMissing. Fails if any store returns a different error.

func (StoreRouter) HasChunk added in v0.2.0

func (r StoreRouter) HasChunk(id ChunkID) (bool, error)

HasChunk returns true if one of the containing stores has the chunk. It goes through the stores in order and returns as soon as the chunk is found.

func (StoreRouter) String added in v0.2.0

func (r StoreRouter) String() string

type SwapStore added in v0.9.0

type SwapStore struct {
	// contains filtered or unexported fields
}

SwapStore wraps another store and provides the ability to swap out the underlying store with another one while under load. Typically used to reload config for long-running processes, perhaps reloading a store config file on SIGHUP and updating the store on-the-fly without restart.

func NewSwapStore added in v0.9.0

func NewSwapStore(s Store) *SwapStore

NewSwapStore creates an instance of a swap store wrapper that allows replacing the wrapped store at runtime.

func (*SwapStore) Close added in v0.9.0

func (s *SwapStore) Close() error

Close the store. No-op, needed to implement the Store interface.

func (*SwapStore) GetChunk added in v0.9.0

func (s *SwapStore) GetChunk(id ChunkID) (*Chunk, error)

GetChunk reads and returns one (compressed!) chunk from the store

func (*SwapStore) HasChunk added in v0.9.0

func (s *SwapStore) HasChunk(id ChunkID) (bool, error)

HasChunk returns true if the chunk is in the store

func (*SwapStore) String added in v0.9.0

func (s *SwapStore) String() string

func (*SwapStore) Swap added in v0.9.0

func (s *SwapStore) Swap(new Store) error

Swap replaces the store wrapped by this SwapStore with a new one at runtime.

type SwapWriteStore added in v0.9.0

type SwapWriteStore struct {
	SwapStore
}

SwapWriteStore does the same as SwapStore but implements WriteStore as well.

func NewSwapWriteStore added in v0.9.0

func NewSwapWriteStore(s Store) *SwapWriteStore

NewSwapWriteStore initializes a new instance of a swap store that supports writing and swapping at runtime.

func (*SwapWriteStore) StoreChunk added in v0.9.0

func (s *SwapWriteStore) StoreChunk(chunk *Chunk) error

StoreChunk adds a new chunk to the store

type TarReader added in v0.8.0

type TarReader struct {
	// contains filtered or unexported fields
}

TarReader uses a GNU tar archive as source for a tar operation (to produce a catar).

func NewTarReader added in v0.8.0

func NewTarReader(r io.Reader, opts TarReaderOptions) *TarReader

NewTarReader initializes a new instance of a GNU tar archive reader that can be used for catar archive tar/untar operations.

func (*TarReader) Next added in v0.8.0

func (fs *TarReader) Next() (f *File, err error)

Next returns the next filesystem entry or io.EOF when done. The caller is responsible for closing the returned File object.

type TarReaderOptions added in v0.9.0

type TarReaderOptions struct {
	AddRoot bool
}

type TarWriter added in v0.8.0

type TarWriter struct {
	// contains filtered or unexported fields
}

TarWriter uses a GNU tar archive for tar/untar operations of a catar archive.

func NewTarWriter added in v0.8.0

func NewTarWriter(w io.Writer) TarWriter

NewTarWriter initializes a new instance of a GNU tar archive writer that can be used for catar archive tar/untar operations.

func (TarWriter) Close added in v0.8.0

func (fs TarWriter) Close() error

func (TarWriter) CreateDevice added in v0.8.0

func (fs TarWriter) CreateDevice(n NodeDevice) error

func (TarWriter) CreateDir added in v0.8.0

func (fs TarWriter) CreateDir(n NodeDirectory) error

func (TarWriter) CreateFile added in v0.8.0

func (fs TarWriter) CreateFile(n NodeFile) error
func (fs TarWriter) CreateSymlink(n NodeSymlink) error

type WriteDedupQueue added in v0.9.0

type WriteDedupQueue struct {
	S WriteStore
	*DedupQueue
	// contains filtered or unexported fields
}

WriteDedupQueue wraps a writable store and provides deduplication of incoming chunk requests and store operations. This is useful when a burst of requests for the same chunk is received and the chunk store serving those is slow, or when the underlying filesystem does not support atomic rename operations (Windows). With the DedupQueue wrapper, concurrent requests for the same chunk will result in just one request to the upstream store. Implements the WriteStore interface.

func NewWriteDedupQueue added in v0.9.0

func NewWriteDedupQueue(store WriteStore) *WriteDedupQueue

NewWriteDedupQueue initializes a new instance of the wrapper.

func (*WriteDedupQueue) GetChunk added in v0.9.0

func (q *WriteDedupQueue) GetChunk(id ChunkID) (*Chunk, error)

func (*WriteDedupQueue) HasChunk added in v0.9.0

func (q *WriteDedupQueue) HasChunk(id ChunkID) (bool, error)

func (*WriteDedupQueue) StoreChunk added in v0.9.0

func (q *WriteDedupQueue) StoreChunk(chunk *Chunk) error

type WriteStore added in v0.2.0

type WriteStore interface {
	Store
	StoreChunk(c *Chunk) error
}

WriteStore is implemented by stores supporting both read and write operations such as a local store or an S3 store.

type Xattrs added in v0.5.0

type Xattrs map[string]string

Directories

Path Synopsis
cmd
