shade

package module
v0.0.0-...-c426cd6 Latest
Published: Apr 22, 2020 License: Apache-2.0 Imports: 12 Imported by: 15

README

shade

shade (the SHA Drive Engine) stores files locally and/or in the cloud, in a flexible fashion, optionally encrypted.

The primary interface is a FUSE filesystem for interacting with shade. There is a command line tool, "throw", which can cheaply add new files to shade. It can be configured so that it can add to the repository but cannot read the encrypted contents. There is also a command line debugging tool, shadeutil, for investigating and tinkering with the repository.

The basic method of file storage

  1. Represent the file as a series of chunks, of a configurable size (16MB by default).
  2. Calculate a SHA-256 hash for each chunk.
  3. Store the chunk in the configured Drive client.
  4. Create a manifest file (a shade.File struct) with:
    • Filename
    • Chunk size
    • Indexed list of chunks
  5. Calculate a SHA-256 hash of the manifest.
  6. Store the shade.File in 1 or more Drive implementations (just like Chunk, but retrievable separately).
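
The steps above can be sketched in a few lines of Go. This is only an illustration, not shade's code: chunkRef, manifest, store and put are hypothetical names, the manifest is serialized with plain encoding/json, and an in-memory map stands in for the Drive client.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"errors"
	"fmt"
)

// chunkRef and manifest are hypothetical stand-ins for shade.Chunk and
// shade.File, reduced to the fields listed in the steps above.
type chunkRef struct {
	Index  int
	Sha256 []byte
}

type manifest struct {
	Filename  string
	Chunksize int
	Chunks    []chunkRef
}

// store is a hypothetical in-memory stand-in for a Drive client: two
// separate buckets, each keyed by the hex-encoded SHA-256 sum.
type store struct {
	chunks map[string][]byte
	files  map[string][]byte
}

// put splits data into chunks of at most chunksize bytes, stores each chunk
// under its SHA-256 sum, then stores the JSON manifest under its own sum.
// It returns the sum of the manifest.
func (s *store) put(filename string, data []byte, chunksize int) ([]byte, error) {
	if chunksize <= 0 {
		return nil, errors.New("chunksize must be positive")
	}
	m := manifest{Filename: filename, Chunksize: chunksize}
	for i := 0; i*chunksize < len(data); i++ {
		end := (i + 1) * chunksize
		if end > len(data) {
			end = len(data)
		}
		chunk := data[i*chunksize : end]
		sum := sha256.Sum256(chunk)
		s.chunks[fmt.Sprintf("%x", sum)] = chunk
		m.Chunks = append(m.Chunks, chunkRef{Index: i, Sha256: sum[:]})
	}
	mj, err := json.Marshal(m)
	if err != nil {
		return nil, err
	}
	msum := sha256.Sum256(mj)
	s.files[fmt.Sprintf("%x", msum)] = mj
	return msum[:], nil
}

func main() {
	s := &store{chunks: map[string][]byte{}, files: map[string][]byte{}}
	// A tiny chunk size keeps the example readable; shade defaults to 16MB.
	sum, err := s.put("docs/hello.txt", []byte("0123456789"), 4)
	if err != nil {
		panic(err)
	}
	fmt.Printf("manifest %x, %d chunks stored\n", sum, len(s.chunks))
}
```

Because chunks are keyed by their hash, storing the same unencrypted chunk twice overwrites the same key, which is where deduplication comes from.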

Retrieving a file works much the same, just in reverse:

  1. Download all of the manifest files.
  2. Find the filename with the latest ModifiedTime which matches the request.
  3. If necessary, decrypt the chunk(s).

shade/drive Drive interface

The Drive interface provides a way to store and retrieve two separate buckets of bytes, called Files and Chunks, each identified by their sha256sum. It also provides a way to list the sha256sum of all known Files. There is also an interface for iteratively listing the sums of all the Chunks (potentially a much larger amount of data).

The interface also provides a bit of metadata about the implementation, such as a name for identifying it, whether it stores files persistently and/or remotely, and a way to retrieve the configuration that initialized the implementation.
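
To make the shape of such an interface concrete, here is a hypothetical reduction with a map-backed implementation. The names sketchDrive and memDrive, and every method signature below, are assumptions made for illustration; consult the godoc for the drive package for the real interface. Chunk listing and configuration retrieval are left out.

```go
package main

import (
	"errors"
	"fmt"
)

// sketchDrive is a hypothetical reduction of the interface described above.
// It is NOT the real drive package API.
type sketchDrive interface {
	PutFile(sum, data []byte) error
	GetFile(sum []byte) ([]byte, error)
	ListFiles() ([][]byte, error)
	PutChunk(sum, data []byte) error
	GetChunk(sum []byte) ([]byte, error)
	Name() string
	Persistent() bool
	Remote() bool
}

// memDrive keeps both buckets in maps keyed by the raw sum bytes.
type memDrive struct {
	files  map[string][]byte
	chunks map[string][]byte
}

// Compile-time check that memDrive satisfies the sketch interface.
var _ sketchDrive = (*memDrive)(nil)

func newMemDrive() *memDrive {
	return &memDrive{files: map[string][]byte{}, chunks: map[string][]byte{}}
}

func (m *memDrive) PutFile(sum, data []byte) error {
	m.files[string(sum)] = data
	return nil
}

func (m *memDrive) GetFile(sum []byte) ([]byte, error) {
	if d, ok := m.files[string(sum)]; ok {
		return d, nil
	}
	return nil, errors.New("file not found")
}

func (m *memDrive) ListFiles() ([][]byte, error) {
	sums := make([][]byte, 0, len(m.files))
	for s := range m.files {
		sums = append(sums, []byte(s))
	}
	return sums, nil
}

func (m *memDrive) PutChunk(sum, data []byte) error {
	m.chunks[string(sum)] = data
	return nil
}

func (m *memDrive) GetChunk(sum []byte) ([]byte, error) {
	if d, ok := m.chunks[string(sum)]; ok {
		return d, nil
	}
	return nil, errors.New("chunk not found")
}

func (m *memDrive) Name() string     { return "memory-sketch" }
func (m *memDrive) Persistent() bool { return false }
func (m *memDrive) Remote() bool     { return false }

func main() {
	var d sketchDrive = newMemDrive()
	if err := d.PutFile([]byte{0x01}, []byte("manifest bytes")); err != nil {
		panic(err)
	}
	sums, err := d.ListFiles()
	if err != nil {
		panic(err)
	}
	fmt.Println(d.Name(), "holds", len(sums), "file(s)")
}
```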

drive.Drive implementations

There are several implementations of drive.Drive clients. Some are only for testing (e.g. drive/win, drive/fail), some are for local caching (drive/memory, drive/local), and some are for remote/cloud storage (drive/amazon, drive/google). There are a few special implementations which allow you to combine (drive/cache) or augment (drive/encrypt) the other implementations.

These implementations can be combined in novel ways by the config package. Trust your local machine? You can create a config which will encrypt only the bytes that leave your machine and go to a remote provider. Want to always encrypt bytes at rest? You can build a config which will encrypt even the local disk storage, but still cache all File objects unencrypted in memory for more efficient reads.

Encryption overview

The drive/encrypt module will encrypt writes to its child client. It will AES-256 encrypt the chunked contents of the files, the File objects that describe the metadata, and even the sha256sums of the chunks. It then RSA encrypts the AES-256 key and stores the encrypted key with the File object.
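
The pattern described here is usually called envelope encryption. The sketch below shows the chunk-and-key part only, using the Go standard library. The helper names seal and open are hypothetical, and RSA-OAEP with SHA-256 is an assumption: the text above does not say which RSA padding drive/encrypt uses, nor how it encrypts File objects or sums.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD from a 32-byte key.
func newGCM(key *[32]byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key[:])
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext and returns nonce || ciphertext || tag.
func seal(key *[32]byte, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal, reading the nonce from the front of the sealed data.
func open(key *[32]byte, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("sealed data shorter than nonce")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	priv, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		panic(err)
	}
	var key [32]byte
	if _, err := rand.Read(key[:]); err != nil {
		panic(err)
	}

	// Writer side: needs only the RSA public key.
	sealed, err := seal(&key, []byte("chunk contents"))
	if err != nil {
		panic(err)
	}
	wrapped, err := rsa.EncryptOAEP(sha256.New(), rand.Reader, &priv.PublicKey, key[:], nil)
	if err != nil {
		panic(err)
	}

	// Reader side: needs the RSA private key to unwrap the AES key.
	unwrapped, err := rsa.DecryptOAEP(sha256.New(), rand.Reader, priv, wrapped, nil)
	if err != nil {
		panic(err)
	}
	var recovered [32]byte
	copy(recovered[:], unwrapped)
	plaintext, err := open(&recovered, sealed)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", plaintext)
}
```

Only the unwrap step touches the private key, which is why a writer that holds just the public key can add data it can never read back.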

RSA public and private keys are provided via the config package. Providing only the public RSA key is supported. This is useful with cmd/throw/throw.go, which is a "write only" tool that cannot read back any of the data once it is written.

For additional details on the implementation, see the godoc for the drive/encrypt module.

NB: Encrypting the contents stored in Drive comes with two penalties:

  1. Modest CPU usage to encrypt/decrypt on the way in/out.
  2. The chunks of identical files will not be deduplicated.

Cleanup

When files are modified or overwritten (functionally the same), the previous File manifest, and some (or all) of the Chunks of the previous File may be orphaned. The umbrella package contains code for cleaning up these orphaned files and chunks. You can invoke single passes of it with shadeutil. A tool to do periodic cleanup is planned.

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func ConfigDir

func ConfigDir() string

ConfigDir identifies the correct path to store persistent configuration data on various operating systems.

func NewNonce

func NewNonce() []byte

NewNonce generates a random Nonce for AES-GCM. It panics if the source of randomness fails.

func NewSymmetricKey

func NewSymmetricKey() *[32]byte

NewSymmetricKey generates a random 256-bit AES key for File{}s. It panics if the source of randomness fails.
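
Functions with the behavior documented for NewNonce and NewSymmetricKey can be written with crypto/rand alone. The following is a sketch, not the package source: the lowercase names are hypothetical, and the 12-byte nonce length is taken from the AES-GCM nonce size mentioned in the File documentation.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newNonce returns 12 random bytes, the standard AES-GCM nonce size.
// It panics if the system randomness source fails.
func newNonce() []byte {
	nonce := make([]byte, 12)
	if _, err := rand.Read(nonce); err != nil {
		panic(err)
	}
	return nonce
}

// newSymmetricKey returns a random 256-bit key.
// It panics if the system randomness source fails.
func newSymmetricKey() *[32]byte {
	key := [32]byte{}
	if _, err := rand.Read(key[:]); err != nil {
		panic(err)
	}
	return &key
}

func main() {
	fmt.Printf("nonce: %x\n", newNonce())
	fmt.Printf("key:   %x\n", newSymmetricKey()[:])
}
```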

func Sum

func Sum(data []byte) []byte

Sum is the uniform hash calculation used for all operations on Shade data.

func SumString

func SumString(data []byte) string

SumString returns a string representation of a Shade Sum.
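
Given that the README identifies Files and Chunks by their sha256sum, the two helpers plausibly reduce to the following. This is a sketch with hypothetical lowercase names; hex encoding for the string form is an assumption.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sum returns the SHA-256 digest of data as a 32-byte slice.
func sum(data []byte) []byte {
	h := sha256.Sum256(data)
	return h[:]
}

// sumString returns the digest as a lowercase hex string (an assumed
// representation; the real SumString may format it differently).
func sumString(data []byte) string {
	return hex.EncodeToString(sum(data))
}

func main() {
	fmt.Println(len(sum([]byte("abc"))), sumString([]byte("abc")))
}
```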

Types

type Chunk

type Chunk struct {
	Index  int
	Sha256 []byte
	Nonce  []byte // If encrypted, use this Nonce to store/retrieve the Sum.
}

Chunk represents a portion of the content of the File being stored.

func NewChunk

func NewChunk() Chunk

NewChunk returns a new Chunk object.

It ensures that each new chunk has a unique cryptographically secure Nonce.

func (*Chunk) String

func (c *Chunk) String() string

type File

type File struct {
	// Filename is a fully qualified path, with no leading slash.
	Filename string
	Filesize int64 // Bytes

	// ModifiedTime represents the "commit" time of this File object.  A given
	// Filename is represented by the valid File with the latest ModifiedTime.
	ModifiedTime time.Time

	// Chunks represents an ordered list of the bytes in the file.
	Chunks []Chunk

	// Chunksize is the maximum size of each plaintext Chunk, in bytes.
	Chunksize int

	// LastChunksize is the size of the last chunk in the File.  Storing this
	// explicitly avoids the need to fetch the last chunk to update the Filesize.
	LastChunksize int

	// Deleted indicates all previous versions of this file should be suppressed.
	Deleted bool

	// AesKey is a 256 bit key used to encrypt the Chunks with AES-GCM.  If no
	// key is provided, the blocks are not encrypted.  The GCM nonce is stored at
	// the front of the encrypted Chunk using gcm.Seal(); use gcm.Open() to
	// recover the Nonce when decrypting.  Nb: This increases the encrypted
	// Chunk's size by gcm.NonceSize(), currently 12 bytes.
	AesKey *[32]byte
}

File represents the metadata of a file stored in Shade. It is stored and retrieved by the drive.Client API, and boiled down

func NewFile

func NewFile(filename string) *File

NewFile returns a new File object for the given filename.

It initializes an AesKey, sets the ModifiedTime to time.Now(), and sets the default Chunksize based on --chunksize.

func (*File) FromJSON

func (f *File) FromJSON(fj []byte) error

FromJSON populates the fields of this File struct from a JSON representation. It primarily provides a convenient error message if this fails.

func (*File) String

func (f *File) String() string

func (*File) ToJSON

func (f *File) ToJSON() ([]byte, error)

ToJSON returns a JSON representation of the File struct.

func (*File) UpdateFilesize

func (f *File) UpdateFilesize()

UpdateFilesize calculates the size of the associated Chunks and sets the Filesize member of the struct.
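
From the field comments on File, the arithmetic is presumably every chunk but the last at full Chunksize, plus LastChunksize. A minimal sketch under that assumption, using a hypothetical reduced struct:

```go
package main

import "fmt"

type chunk struct {
	Index int
}

// file is a hypothetical reduction of shade.File to the size-related fields.
type file struct {
	Filesize      int64
	Chunks        []chunk
	Chunksize     int
	LastChunksize int
}

// updateFilesize sets Filesize to (n-1)*Chunksize + LastChunksize for a
// file with n chunks, or to 0 when there are no chunks.
func (f *file) updateFilesize() {
	if len(f.Chunks) == 0 {
		f.Filesize = 0
		return
	}
	f.Filesize = int64(len(f.Chunks)-1)*int64(f.Chunksize) + int64(f.LastChunksize)
}

func main() {
	f := &file{Chunks: make([]chunk, 3), Chunksize: 16, LastChunksize: 5}
	f.updateFilesize()
	fmt.Println(f.Filesize) // 2*16 + 5 = 37
}
```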

Directories

Path                   Synopsis
cmd/shade              shade presents a fuse filesystem interface.
cmd/shadeutil          shadeutil contains tools for inspecting shade repositories.
cmd/throw              throw stores a file in the cloud, encrypted.
config                 Package config reads and parses a JSON config which must represent a single Drive object.
drive/cache            Package cache is an interface to multiple storage backends for Shade.
drive/encrypt          Package encrypt is an interface to manage encrypted storage backends.
drive/fail             Package fail is a test client.
drive/google           Package google provides a Shade storage implementation for Google Drive.
drive/google/zerobyte  zerobyte iterates all the shade files, reads their first byte, and adds it as a Property of the file.
drive/local            Package local is a persistent local storage backend for Shade.
drive/memory           Package memory is an in memory storage backend for Shade.
drive/win              Package win is a test client.
umbrella               Package umbrella provides utility functions to maintain your Shade repository, such as cleaning up orphaned files and chunks.
