pail

package module
v0.0.0-...-6ca95a4
Published: Mar 19, 2024 License: Apache-2.0 Imports: 30 Imported by: 33

README

===========================================
``pail`` -- Blob Storage System Abstraction
===========================================

Overview
--------

Pail is a high-level Go interface to blob storage containers such as AWS
S3 and similar services. Pail also provides an implementation backed by
the local file system, used mostly for testing.

Documentation
-------------

The core API documentation is in the `godoc
<https://godoc.org/github.com/evergreen-ci/pail/>`_.

Contribute
----------

Open tickets in the `EVG project <http://jira.mongodb.org/browse/EVG>`_, and
feel free to open pull requests here.

Development
-----------

The pail project uses a ``makefile`` to coordinate testing. Use the following
command to build the pail binary: ::

  make build

The artifact is at ``build/pail``. The makefile provides the following
targets:

``test``
   Runs all tests, sequentially, for all packages.

``test-<package>``
   Runs all tests for a specific package.

``RACE_DETECTOR=1 make test``, ``RACE_DETECTOR=1 make test-<package>``
   As with their plain ``test`` counterparts, these invocations run
   the tests with the race detector enabled.

``lint``, ``lint-<package>``
   Installs and runs ``gometalinter`` with appropriate settings to
   lint the project.

Documentation

Index

Constants

const PresignExpireTime = 24 * time.Hour

PresignExpireTime sets the amount of time a presigned link is live before expiring.

Variables

This section is empty.

Functions

func CreateAWSCredentials

func CreateAWSCredentials(awsKey, awsPassword, awsToken string) *credentials.Credentials

CreateAWSCredentials is a wrapper for creating AWS credentials.

func GetHeadObject

func GetHeadObject(r PreSignRequestParams) (*s3.HeadObjectOutput, error)

GetHeadObject fetches the metadata of an S3 object.

func IsKeyNotFoundError

func IsKeyNotFoundError(err error) bool

IsKeyNotFoundError checks an error object to see if it is a key not found error.

func MakeKeyNotFoundError

func MakeKeyNotFoundError(err error) error

MakeKeyNotFoundError constructs a key not found error from an existing error of any type.

func NewKeyNotFoundError

func NewKeyNotFoundError(msg string) error

NewKeyNotFoundError creates a new error object to represent a key not found error.

func NewKeyNotFoundErrorf

func NewKeyNotFoundErrorf(msg string, args ...interface{}) error

NewKeyNotFoundErrorf creates a new error object to represent a key not found error with a formatted message.
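
The error helpers compose: errors constructed with NewKeyNotFoundError, NewKeyNotFoundErrorf, or MakeKeyNotFoundError are all detected by IsKeyNotFoundError. A minimal sketch (the key name and wrapped error text are invented):

package main

import (
	"errors"
	"fmt"

	"github.com/evergreen-ci/pail"
)

func main() {
	// Construct a key-not-found error directly.
	err := pail.NewKeyNotFoundErrorf("key %q does not exist", "builds/123/out.tgz")
	fmt.Println(pail.IsKeyNotFoundError(err)) // true

	// Wrap an arbitrary error so callers can detect it uniformly.
	wrapped := pail.MakeKeyNotFoundError(errors.New("s3: 404"))
	fmt.Println(pail.IsKeyNotFoundError(wrapped)) // true
}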

func PreSign

func PreSign(r PreSignRequestParams) (string, error)

PreSign returns a presigned URL that expires in 24 hours.
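
A minimal sketch of generating a presigned URL; the bucket, key, region, and credential values are all placeholders:

package main

import (
	"fmt"
	"log"

	"github.com/evergreen-ci/pail"
)

func main() {
	url, err := pail.PreSign(pail.PreSignRequestParams{
		Bucket:    "my-bucket",               // placeholder
		FileKey:   "builds/123/artifact.tgz", // placeholder
		AwsKey:    "access-key-id",           // placeholder credentials
		AwsSecret: "secret-key",
		Region:    "us-east-1",
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(url) // valid for PresignExpireTime (24 hours)
}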

Types

type Bucket

type Bucket interface {
	// Check validity of the bucket. This is dependent on the underlying
	// implementation.
	Check(context.Context) error

	// Exists returns whether the given key exists in the bucket or not.
	Exists(context.Context, string) (bool, error)

	// Join concatenates elements with the appropriate path separator of
	// the bucket, ignoring empty elements. This is analogous to
	// `filepath.Join`.
	Join(...string) string

	// Produces a Writer and Reader interface to the file named by
	// the string.
	Writer(context.Context, string) (io.WriteCloser, error)
	Reader(context.Context, string) (io.ReadCloser, error)

	// Put and Get write simple byte streams (in the form of
	// io.Readers) to/from specified keys.
	//
	// TODO: consider whether these, particularly Get, are
	// substantively different from the Writer/Reader methods, or
	// might just be wrappers.
	Put(context.Context, string, io.Reader) error
	Get(context.Context, string) (io.ReadCloser, error)

	// Upload and Download write files from the local file
	// system to the specified key.
	Upload(context.Context, string, string) error
	Download(context.Context, string, string) error

	SyncBucket

	// Copy does a special copy operation that does not require downloading
	// a file. Note that CopyOptions.DestinationBucket must have the same
	// type as the calling bucket object.
	Copy(context.Context, CopyOptions) error

	// Remove the specified object(s) from the bucket.
	// RemoveMany continues on error and returns any accumulated errors.
	Remove(context.Context, string) error
	RemoveMany(context.Context, ...string) error

	// Remove all objects with the given prefix, continuing on error and
	// returning any accumulated errors.
	// Note that this operation is not atomic.
	RemovePrefix(context.Context, string) error

	// Remove all objects matching the given regular expression, continuing
	// on error and returning any accumulated errors.
	// Note that this operation is not atomic.
	RemoveMatching(context.Context, string) error

	// List returns an iterator over the contents of a bucket with
	// the given prefix. Contents are iterated lexicographically by
	// key name.
	List(context.Context, string) (BucketIterator, error)
}

Bucket defines an interface for accessing a remote blob store, like S3. It should be generic enough to be implemented for a GCP equivalent.

Other goals of this project are to provide a single interface for interacting with blob storage, to allow us to fully move off of our legacy goamz package, and to stabilize all blob-storage operations across all projects. There should be no interface dependencies on external packages required to use this library.

See the following implementations for previous approaches.

The preferred AWS SDK is here: https://docs.aws.amazon.com/sdk-for-go/api/

In no particular order:

  • Implementation constructors should make it possible to use custom http.Clients (to aid in pooling).
  • We should probably implement String methods.
  • Use the grip package for logging.
  • Get/Put should support multipart upload/download?
  • We'll want to do retries with back-off (potentially configurable in bucketinfo?).
  • We might need variants that Put/Get byte slices rather than readers.
  • Pass contexts to requests for timeouts.

func NewGridFSBucket

func NewGridFSBucket(ctx context.Context, opts GridFSOptions) (Bucket, error)

NewGridFSBucket returns a bucket backed by GridFS with the given options.

func NewGridFSBucketWithClient

func NewGridFSBucketWithClient(ctx context.Context, client *mongo.Client, opts GridFSOptions) (Bucket, error)

NewGridFSBucketWithClient returns a new bucket backed by GridFS with the existing Mongo client and given options.

func NewLocalBucket

func NewLocalBucket(opts LocalOptions) (Bucket, error)

NewLocalBucket returns an implementation of the Bucket interface that stores files in the local file system. Returns an error if the directory doesn't exist.
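
A minimal sketch of round-tripping an object through a local bucket; the directory path and key are placeholders, and the directory must already exist:

package main

import (
	"context"
	"fmt"
	"io"
	"log"
	"strings"

	"github.com/evergreen-ci/pail"
)

func main() {
	ctx := context.Background()

	bucket, err := pail.NewLocalBucket(pail.LocalOptions{Path: "/tmp/pail-demo"})
	if err != nil {
		log.Fatal(err)
	}

	// Write a small payload, then read it back.
	if err := bucket.Put(ctx, "greeting.txt", strings.NewReader("hello")); err != nil {
		log.Fatal(err)
	}
	r, err := bucket.Get(ctx, "greeting.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer r.Close()

	data, err := io.ReadAll(r)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(data)) // hello
}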

func NewLocalTemporaryBucket

func NewLocalTemporaryBucket(opts LocalOptions) (Bucket, error)

NewLocalTemporaryBucket returns a "local" bucket implementation that stores resources in the local file system in a temporary directory created for this purpose. Returns an error if there were issues creating the temporary directory. This implementation does not provide a mechanism to delete the temporary directory.

func NewParallelSyncBucket

func NewParallelSyncBucket(opts ParallelBucketOptions, b Bucket) (Bucket, error)

NewParallelSyncBucket returns a layered bucket implementation that supports parallel sync operations.

func NewS3Bucket

func NewS3Bucket(options S3Options) (Bucket, error)

NewS3Bucket returns a Bucket implementation backed by S3. This implementation does not support multipart uploads; if you would like to add objects larger than 5 gigabytes, see NewS3MultiPartBucket.
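
A hedged sketch of constructing an S3-backed bucket; the bucket name, region, and credential values are placeholders:

package main

import (
	"log"

	"github.com/evergreen-ci/pail"
)

func main() {
	opts := pail.S3Options{
		Name:        "my-bucket", // placeholder
		Region:      "us-east-1",
		Permissions: pail.S3PermissionsPrivate,
		// Credentials is optional; when unset, the default AWS
		// credentials chain is used.
		Credentials: pail.CreateAWSCredentials("access-key-id", "secret-key", ""),
	}

	bucket, err := pail.NewS3Bucket(opts)
	if err != nil {
		log.Fatal(err)
	}
	_ = bucket // use Put/Get/Upload/Download as with any Bucket
}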

func NewS3BucketWithHTTPClient

func NewS3BucketWithHTTPClient(client *http.Client, options S3Options) (Bucket, error)

NewS3BucketWithHTTPClient returns a Bucket implementation backed by S3 using an existing HTTP client connection. This implementation does not support multipart uploads; if you would like to add objects larger than 5 gigabytes, see NewS3MultiPartBucket.

func NewS3MultiPartBucket

func NewS3MultiPartBucket(options S3Options) (Bucket, error)

NewS3MultiPartBucket returns a Bucket implementation backed by S3 that supports multipart uploads for large objects.

func NewS3MultiPartBucketWithHTTPClient

func NewS3MultiPartBucketWithHTTPClient(client *http.Client, options S3Options) (Bucket, error)

NewS3MultiPartBucketWithHTTPClient returns a Bucket implementation backed by S3 with an existing HTTP client connection that supports multipart uploads for large objects.

type BucketItem

type BucketItem interface {
	Bucket() string
	Name() string
	Hash() string
	Get(context.Context) (io.ReadCloser, error)
}

BucketItem provides a basic interface for getting an object from a bucket.

type BucketIterator

type BucketIterator interface {
	Next(context.Context) bool
	Err() error
	Item() BucketItem
}

BucketIterator provides a way to interact with the contents of a bucket, as in the output of the List operation.
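
A minimal sketch of consuming the iterator returned by List; the package and function names are hypothetical:

package example // hypothetical package name

import (
	"context"
	"fmt"

	"github.com/evergreen-ci/pail"
)

// ListKeys prints every key under the given prefix, then surfaces any
// error accumulated during iteration, per the Next/Item/Err contract.
func ListKeys(ctx context.Context, bucket pail.Bucket, prefix string) error {
	iter, err := bucket.List(ctx, prefix)
	if err != nil {
		return err
	}
	for iter.Next(ctx) {
		fmt.Println(iter.Item().Name())
	}
	return iter.Err()
}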

type CopyOptions

type CopyOptions struct {
	SourceKey         string
	DestinationKey    string
	DestinationBucket Bucket
	IsDestination     bool
}

CopyOptions describes the arguments to the Copy method for moving objects between Buckets.
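
A hedged sketch of a copy between two buckets of the same type; the keys and helper name are hypothetical:

package example // hypothetical package name

import (
	"context"

	"github.com/evergreen-ci/pail"
)

// CopyArtifact copies an object without downloading it; src and dst
// must be buckets of the same underlying type.
func CopyArtifact(ctx context.Context, src, dst pail.Bucket) error {
	return src.Copy(ctx, pail.CopyOptions{
		SourceKey:         "builds/123/artifact.tgz",
		DestinationKey:    "releases/artifact.tgz",
		DestinationBucket: dst,
	})
}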

type GridFSOptions

type GridFSOptions struct {
	Name         string
	Prefix       string
	Database     string
	MongoDBURI   string
	DryRun       bool
	DeleteOnSync bool
	DeleteOnPush bool
	DeleteOnPull bool
	Verbose      bool
}

GridFSOptions support the use and creation of GridFS backed buckets.
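
A minimal sketch of constructing a GridFS-backed bucket; the bucket name, database, and MongoDB URI are placeholders:

package main

import (
	"context"
	"log"

	"github.com/evergreen-ci/pail"
)

func main() {
	ctx := context.Background()
	bucket, err := pail.NewGridFSBucket(ctx, pail.GridFSOptions{
		Name:       "artifacts",                 // placeholder
		Database:   "pail_demo",                 // placeholder
		MongoDBURI: "mongodb://localhost:27017", // placeholder
	})
	if err != nil {
		log.Fatal(err)
	}
	_ = bucket // use like any other Bucket
}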

type LocalOptions

type LocalOptions struct {
	Path   string
	Prefix string
	// UseSlash sets the prefix separator to the slash ('/') character
	// instead of the OS specific separator.
	UseSlash     bool
	DryRun       bool
	DeleteOnSync bool
	DeleteOnPush bool
	DeleteOnPull bool
	Verbose      bool
}

LocalOptions describes the configuration of a local Bucket.

type ParallelBucketOptions

type ParallelBucketOptions struct {
	// Workers sets the number of worker threads.
	Workers int
	// DryRun enables running in a mode that will not execute any
	// operations that modify the bucket.
	DryRun bool
	// DeleteOnSync will delete all objects from the target that do not
	// exist in the source after the completion of a sync operation
	// (Push/Pull).
	DeleteOnSync bool
	// DeleteOnPush will delete all objects from the target that do not
	// exist in the source after the completion of Push.
	DeleteOnPush bool
	// DeleteOnPull will delete all objects from the target that do not
	// exist in the source after the completion of Pull.
	DeleteOnPull bool
}

ParallelBucketOptions support the use and creation of parallel sync buckets.
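
A minimal sketch of layering parallel sync behavior over an existing bucket; the path and worker count are placeholders:

package main

import (
	"log"

	"github.com/evergreen-ci/pail"
)

func main() {
	base, err := pail.NewLocalBucket(pail.LocalOptions{Path: "/tmp/pail-demo"})
	if err != nil {
		log.Fatal(err)
	}

	// Wrap the base bucket so Push/Pull fan out across eight workers.
	parallel, err := pail.NewParallelSyncBucket(pail.ParallelBucketOptions{Workers: 8}, base)
	if err != nil {
		log.Fatal(err)
	}
	_ = parallel
}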

type PreSignRequestParams

type PreSignRequestParams struct {
	Bucket          string
	FileKey         string
	AwsKey          string
	AwsSecret       string
	AwsSessionToken string
	Region          string
}

PreSignRequestParams holds all the parameters needed to sign a URL or fetch S3 object metadata.

type S3Options

type S3Options struct {
	// DryRun enables running in a mode that will not execute any
	// operations that modify the bucket.
	DryRun bool
	// DeleteOnSync will delete all objects from the target that do not
	// exist in the source after the completion of a sync operation
	// (Push/Pull).
	DeleteOnSync bool
	// DeleteOnPush will delete all objects from the target that do not
	// exist in the source after the completion of Push.
	DeleteOnPush bool
	// DeleteOnPull will delete all objects from the target that do not
	// exist in the source after the completion of Pull.
	DeleteOnPull bool
	// Compress enables gzipping of uploaded objects.
	Compress bool
	// UseSingleFileChecksums forces the bucket to checksum files before
	// running upload and download operations (rather than doing these
	// operations independently). Useful for large files, particularly in
	// coordination with the parallel sync bucket implementations.
	UseSingleFileChecksums bool
	// Verbose sets the logging mode to "debug".
	Verbose bool
	// MaxRetries sets the number of retry attempts for S3 operations.
	// By default it defers to the AWS SDK's default.
	MaxRetries *int
	// Credentials allows the passing in of explicit AWS credentials. These
	// will override the default credentials chain. (Optional)
	Credentials *credentials.Credentials
	// SharedCredentialsFilepath, when not empty, will override the default
	// credentials chain and the Credentials value (see above). (Optional)
	SharedCredentialsFilepath string
	// SharedCredentialsProfile, when not empty, will fetch the given
	// credentials profile from the shared credentials file. (Optional)
	SharedCredentialsProfile string
	// AssumeRoleARN specifies an IAM role ARN. When not empty, it will be
	// used to assume the given role for this session. See
	// `https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html` for
	// more information. (Optional)
	AssumeRoleARN string
	// AssumeRoleOptions provide a mechanism to override defaults by
	// applying changes to the AssumeRoleProvider struct created with this
	// session. This field is ignored if AssumeRoleARN is not set.
	// (Optional)
	AssumeRoleOptions []func(*stscreds.AssumeRoleProvider)
	// Region specifies the AWS region.
	Region string
	// Name specifies the name of the bucket.
	Name string
	// Prefix specifies the prefix to use. (Optional)
	Prefix string
	// Permissions sets the S3 permissions to use for each object. Defaults
	// to FULL_CONTROL. See
	// `https://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html`
	// for more information.
	Permissions S3Permissions
	// ContentType sets the standard MIME type of the object data.
	// Empty by default. See
	// `https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.17`
	// for more information.
	ContentType string
}

S3Options support the use and creation of S3 backed buckets.

type S3Permissions

type S3Permissions string

S3Permissions is a type that describes the object canned ACL from S3.

const (
	S3PermissionsPrivate                S3Permissions = s3.ObjectCannedACLPrivate
	S3PermissionsPublicRead             S3Permissions = s3.ObjectCannedACLPublicRead
	S3PermissionsPublicReadWrite        S3Permissions = s3.ObjectCannedACLPublicReadWrite
	S3PermissionsAuthenticatedRead      S3Permissions = s3.ObjectCannedACLAuthenticatedRead
	S3PermissionsAWSExecRead            S3Permissions = s3.ObjectCannedACLAwsExecRead
	S3PermissionsBucketOwnerRead        S3Permissions = s3.ObjectCannedACLBucketOwnerRead
	S3PermissionsBucketOwnerFullControl S3Permissions = s3.ObjectCannedACLBucketOwnerFullControl
)

Valid S3 permissions.

func (S3Permissions) Validate

func (p S3Permissions) Validate() error

Validate checks that the S3Permissions string is valid.

type SyncBucket

type SyncBucket interface {
	// Push and Pull are the recursive, efficient sync methods that
	// copy trees of files between the remote blob store and the
	// local file system.
	Push(context.Context, SyncOptions) error
	Pull(context.Context, SyncOptions) error
}

SyncBucket defines an interface to access a remote blob store and synchronize the local file system tree with the remote store.

func NewS3ArchiveBucket

func NewS3ArchiveBucket(options S3Options) (SyncBucket, error)

NewS3ArchiveBucket returns a SyncBucket implementation backed by S3 that supports syncing the local file system as a single archive file in S3 rather than creating an individual object for each file. This SyncBucket is not compatible with regular Bucket implementations.

func NewS3ArchiveBucketWithHTTPClient

func NewS3ArchiveBucketWithHTTPClient(client *http.Client, options S3Options) (SyncBucket, error)

NewS3ArchiveBucketWithHTTPClient is the same as NewS3ArchiveBucket but allows the user to specify an existing HTTP client connection.

type SyncOptions

type SyncOptions struct {
	Local   string
	Remote  string
	Exclude string
}

SyncOptions describes the arguments to the sync operations (Push and Pull). Note that Exclude is a regular expression.
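
A minimal sketch of pushing a local tree to a remote prefix; the paths, pattern, and helper name are hypothetical:

package example // hypothetical package name

import (
	"context"

	"github.com/evergreen-ci/pail"
)

// PushBuild uploads a local directory tree to a remote prefix,
// skipping anything that matches the Exclude regular expression.
func PushBuild(ctx context.Context, bucket pail.SyncBucket) error {
	return bucket.Push(ctx, pail.SyncOptions{
		Local:   "./artifacts",
		Remote:  "builds/123",
		Exclude: `\.tmp$`,
	})
}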

Directories

Path Synopsis
cmd
