s3

package module
v2.1.0
Published: Dec 6, 2023 License: MIT Imports: 17 Imported by: 12

README

dp-s3

Client to interact with AWS S3

Getting started

Setting up AWS credentials

In order to access AWS S3, this library requires your access key ID and secret access key. You can either set up a default profile in the ~/.aws/credentials file:

[default]
aws_access_key_id=<id>
aws_secret_access_key=<secret>
region=eu-west-1

Or export the values as environment variables:

export AWS_ACCESS_KEY_ID=<id>
export AWS_SECRET_ACCESS_KEY=<secret>

More information in the Amazon documentation

Setting up IAM policy

The functionality implemented by this library requires that the user has some permissions defined by an IAM policy.

  • Health-check functionality performs a HEAD bucket operation, requiring allowed s3:ListBucket for all resources.

  • Get functionality requires allowed s3:GetObject for the objects under the hierarchy you want to allow (e.g. my-bucket/prefix/*).

  • Upload (PUT) functionality requires allowed s3:PutObject for the objects under the hierarchy you want to allow (e.g. my-bucket/prefix/*).

  • Multipart upload functionality requires allowed s3:PutObject, s3:GetObject, s3:AbortMultipartUpload, s3:ListMultipartUploadParts for objects under the hierarchy you want to allow (e.g. my-bucket/prefix/*); and s3:ListBucketMultipartUploads for the bucket (e.g. my-bucket).

Please see our terraform repository for more information.
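
For illustration only, a minimal IAM policy covering the health-check, Get and Upload functionality might look like the sketch below (the bucket name my-bucket and the prefix are placeholders, not part of this repository; the real policies live in the terraform repository referenced above):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::my-bucket/prefix/*"
    }
  ]
}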

S3 Client Usage

The S3 client wraps the necessary AWS SDK structs and offers functionality to check buckets, and read and write objects from/to S3.

The client is configured with a specific bucket and region. Note that the bucket must have been created in the region you provide in order to access it.

There are two available constructors:

  • Constructor without an AWS session (will create a new session):

import dps3 "github.com/ONSdigital/dp-s3/v2"

s3cli, err := dps3.NewClient(region, bucket)

  • Constructor with an AWS session (will reuse the provided session):

import dps3 "github.com/ONSdigital/dp-s3/v2"

s3cli := dps3.NewClientWithSession(bucket, awsSession)

It is recommended to create a single AWS session in your service and reuse it if you need other clients. The client offers a session getter: s3cli.Session()

A bucket name getter is also offered for convenience: s3cli.BucketName()
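
For example, a minimal sketch (bucket names and region are illustrative) that creates one client and then reuses its session for a second client on a different bucket:

import dps3 "github.com/ONSdigital/dp-s3/v2"

// Create the first client; this establishes a new AWS session.
imagesCli, err := dps3.NewClient("eu-west-1", "my-images-bucket")
if err != nil {
    // handle error
}

// Reuse the same session for a client on a different bucket.
exportsCli := dps3.NewClientWithSession("my-exports-bucket", imagesCli.Session())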

Get

The S3 client exposes functions to get S3 objects by using the vanilla SDK or the crypto client, for user-defined encryption keys.

Functions that have the suffix WithPSK allow you to provide a psk for encryption. For example:

  • Get an un-encrypted object from S3:

file, contentLength, err := s3cli.Get("my/s3/file")

  • Get an encrypted object from S3, using a psk:

file, contentLength, err := s3cli.GetWithPSK("my/s3/file", psk)

You can get a file's metadata via a Head call:

out, err := s3cli.Head("my/s3/file")
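
Note that the caller is responsible for closing the returned reader. A minimal sketch (the object key is illustrative), assuming the v2 signatures listed in the documentation below:

file, contentLength, err := s3cli.Get("my/s3/file")
if err != nil {
    // handle error
}
defer file.Close()

// contentLength is a *int64 holding the object size in bytes.
// io.ReadAll requires the standard "io" package.
b, err := io.ReadAll(file)
if err != nil {
    // handle error
}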
Upload

The client also wraps the AWS SDK s3manager uploader, a high-level client that automatically splits large files into chunks and uploads them concurrently.

This offers functionality to put objects in S3 in a single function call, hiding the low-level details of chunking. More information here

Functions that have the suffix WithPSK allow you to provide a psk for encryption and functions with the suffix WithContext allow you to pass a context, which may be cancelled to abort the operation. For example:

  • Upload an un-encrypted object to S3:

result, err := s3cli.Upload(
    &s3manager.UploadInput{
        Body: file.Reader,
        Key:  &filename,
    },
)

  • Upload an encrypted object to S3, using a psk:

result, err := s3cli.UploadWithPSK(
    &s3manager.UploadInput{
        Body: file.Reader,
        Key:  &filename,
    },
    psk,
)

  • Upload an encrypted object to S3, passing a context:

result, err := s3cli.UploadWithPSKAndContext(
    ctx,
    &s3manager.UploadInput{
        Body: file.Reader,
        Key:  &filename,
    },
    psk,
)
Multipart Upload

You may use the low-level AWS SDK S3 client multipart upload methods to upload objects in chunks, an AWS SDK feature for performing uploads part by part. More information here

Chunk Size

The minimum chunk size allowed by AWS S3 is 5 megabytes (MB). If any chunk (excluding the final chunk) is under this size, an ErrChunkTooSmall error will be returned from the UploadPart and UploadPartWithPsk functions once all chunks have been uploaded.
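
As a rough sketch (object key, file name and content type are illustrative, and data is assumed to hold the full payload as a byte slice), a chunked upload loop using UploadPart and the UploadPartRequest type documented below might look like:

const chunkSize = 5 * 1024 * 1024 // 5 MB: minimum size for all but the final chunk

totalChunks := (len(data) + chunkSize - 1) / chunkSize

for i := 0; i < totalChunks; i++ {
    start := i * chunkSize
    end := start + chunkSize
    if end > len(data) {
        end = len(data)
    }

    resp, err := s3cli.UploadPart(ctx, &dps3.UploadPartRequest{
        UploadKey:   "my/s3/file",
        Type:        "text/plain", // assumed to be the content type
        ChunkNumber: int64(i + 1), // assuming 1-based chunk numbers, as with S3 part numbers
        TotalChunks: totalChunks,
        FileName:    "file.txt",
    }, data[start:end])
    if err != nil {
        // handle error
    }
    if resp.AllPartsUploaded {
        // the final part has been uploaded and the multipart upload completed
    }
}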

URL

S3Url is a structure intended to be used for S3 URL string manipulation in its different formats. To create a new structure you need to provide region, bucketName and object key, and optionally the scheme:

s3Url, err := dps3.NewURL(region, bucket, s3ObjectKey)
s3Url, err := dps3.NewURLWithScheme(scheme, region, bucket, s3ObjectKey)

If you want to parse a URL string into an S3Url value, you can use the ParseURL() function, providing the format style:

s3Url, err := dps3.ParseURL(rawURL, URLStyle)

Once you have a valid S3Url value, you can obtain its string representation in the required format style by calling the String() method:

str, err := s3Url.String(URLStyle)
Valid URL format Styles

The following URL styles are supported:

  • PathStyle: https://s3-eu-west-1.amazonaws.com/myBucket/my/s3/object/key
  • GlobalPathStyle: https://s3.amazonaws.com/myBucket/my/s3/object/key
  • VirtualHostedStyle: https://myBucket.s3-eu-west-1.amazonaws.com/my/s3/object/key
  • GlobalVirtualHostedStyle: https://myBucket.s3.amazonaws.com/my/s3/object/key
  • AliasVirtualHostedStyle: https://myBucket/my/s3/object/key

More information in S3 official documentation
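
For example, a small sketch (the URL values are illustrative) that parses a path-style URL and re-renders it in virtual-hosted style:

rawURL := "https://s3-eu-west-1.amazonaws.com/myBucket/my/s3/object/key"

s3Url, err := dps3.ParseURL(rawURL, dps3.PathStyle)
if err != nil {
    // handle error
}

vhURL, err := s3Url.String(dps3.VirtualHostedStyle)
if err != nil {
    // handle error
}
// vhURL is now "https://myBucket.s3-eu-west-1.amazonaws.com/my/s3/object/key"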

Health check

The S3 checker function performs a HEAD bucket operation. The health check will succeed only if the bucket can be accessed using the client (i.e. the client must be authenticated correctly, and the bucket must exist and have been created in the same region as the client).

Read the Health Check Specification for details.

After creating an S3 client as described above, call the S3 checker with s3cli.Checker(ctx, state); this updates the provided check state, which has the following structure:

{
    "name": "string",
    "status": "string",
    "message": "string",
    "status_code": "int",
    "last_checked": "ISO8601 - UTC date time",
    "last_success": "ISO8601 - UTC date time",
    "last_failure": "ISO8601 - UTC date time"
}
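
A minimal sketch of wiring up the checker, assuming the dp-healthcheck library provides the CheckState type and a NewCheckState constructor (the import path and constructor name are assumptions based on that library, not part of this package):

import health "github.com/ONSdigital/dp-healthcheck/healthcheck"

state := health.NewCheckState(dps3.ServiceName) // assumed constructor from dp-healthcheck

if err := s3cli.Checker(context.Background(), state); err != nil {
    // handle error updating the check state
}
// state now carries the name, status, message and timestamps shown above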

Contributing

See CONTRIBUTING for details.

License

Copyright © 2020, Office for National Statistics (https://www.ons.gov.uk)

Released under MIT license, see LICENSE for details.

Documentation

Overview

file: client.go

Contains the Client struct definition and constructors, as well as getters to read some private fields like bucketName or session.

If multiple clients are required, it is advised to reuse the same AWS session.

file: get.go

Contains methods to get objects or their metadata from S3, with or without a user-defined psk for encryption, passing either the object key or a full S3 URL in one of the AWS-supported styles.

Requires "s3:GetObject" action allowed by IAM policy for objects inside the bucket, as defined by `read-{bucketName}-bucket` policies in dp-setup

file: healthcheck.go

Contains methods to check the health state of the S3 client, by verifying that the bucket exists in the provided region.

Requires "s3:ListBucket" action allowed by IAM policy for the bucket, as defined by `check-{bucketName}-bucket` policies in dp-setup

file: upload.go

Contains methods to efficiently upload files to S3 using the high-level SDK s3manager uploader methods, which automatically split large objects into chunks and upload them concurrently.

Requires "s3:PutObject" action allowed by IAM policy for the bucket, as defined by `write-{bucketName}-bucket` policies in dp-setup

file: upload_multipart.go

Contains methods to upload files to S3 in chunks by using the low level SDK methods that give the caller control over the multipart uploading process.

Requires "s3:PutObject", "s3:GetObject" and "s3:AbortMultipartUpload" actions allowed by IAM policy for the bucket, as defined by `multipart-{bucketName}-bucket` policies in dp-setup

file: url.go

Contains string manipulation methods to obtain an S3 URL in the different styles supported by AWS and translate from one to another.

Index

Constants

const (
	// PathStyle example: 'https://s3-eu-west-1.amazonaws.com/myBucket/my/s3/object/key'
	PathStyle = iota
	// GlobalPathStyle example: 'https://s3.amazonaws.com/myBucket/my/s3/object/key'
	GlobalPathStyle
	// VirtualHostedStyle example: 'https://myBucket.s3-eu-west-1.amazonaws.com/my/s3/object/key'
	VirtualHostedStyle
	// GlobalVirtualHostedStyle example: 'https://myBucket.s3.amazonaws.com/my/s3/object/key'
	GlobalVirtualHostedStyle
	// AliasVirtualHostedStyle example: 'https://myBucket/my/s3/object/key'
	AliasVirtualHostedStyle
)

Possible S3 URL format styles, as defined in https://docs.aws.amazon.com/AmazonS3/latest/dev/VirtualHosting.html

const MsgHealthy = "S3 is healthy"

MsgHealthy is the message in the Check structure when S3 is healthy

const ServiceName = "S3"

ServiceName S3

Variables

This section is empty.

Functions

This section is empty.

Types

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client is an S3 client wrapping the SDK client, CryptoClient and BucketName

func InstantiateClient

func InstantiateClient(sdkClient S3SDKClient, cryptoClient S3CryptoClient, sdkUploader S3SDKUploader, cryptoUploader S3CryptoUploader, bucketName, region string, s *session.Session) *Client

InstantiateClient creates a new instance of S3 struct with the provided clients, bucket and region.

func NewClient

func NewClient(region string, bucketName string) (*Client, error)

NewClient creates a new S3 Client configured for the given region and bucket name. Note: this function will create a new session; if you already have a session, please use NewClientWithSession instead. Any error establishing the AWS session will be returned

func NewClientWithCredentials

func NewClientWithCredentials(region string, bucketName string, awsAccessKey string, awsSecretKey string) (*Client, error)

NewClientWithCredentials creates a new S3 Client configured for the given region and bucket name, using the provided credentials. Note: this function will create a new session; if you already have a session, please use NewClientWithSession instead. Any error establishing the AWS session will be returned

func NewClientWithSession

func NewClientWithSession(bucketName string, s *session.Session) *Client

NewClientWithSession creates a new S3 Client configured for the given bucket name, using the provided session and region within it.

func (*Client) BucketName

func (cli *Client) BucketName() string

BucketName returns the bucket name used by this S3 client

func (*Client) CheckPartUploaded

func (cli *Client) CheckPartUploaded(ctx context.Context, req *UploadPartRequest) (bool, error)

CheckPartUploaded returns true only if the chunk corresponding to the provided chunkNumber has been uploaded. If all the chunks have been uploaded, we complete the upload operation. A boolean value which indicates if the call uploaded the last part is returned. If an error happens, it will be wrapped and returned.

func (*Client) Checker

func (cli *Client) Checker(ctx context.Context, state *health.CheckState) error

Checker validates that the S3 bucket exists, and updates the provided CheckState accordingly. Any error during the state update will be returned

func (*Client) FileExists

func (cli *Client) FileExists(key string) (bool, error)

func (*Client) Get

func (cli *Client) Get(key string) (io.ReadCloser, *int64, error)

Get returns an io.ReadCloser instance for the given path (inside the bucket configured for this client) and the content length (size in bytes). The 'key' parameter refers to the path for the file under the bucket.

The caller is responsible for closing the returned ReadCloser. For example, it may be closed in a defer statement: defer r.Close()

func (*Client) GetBucketPolicy

func (cli *Client) GetBucketPolicy(BucketName string) (*s3.GetBucketPolicyOutput, error)

func (*Client) GetFromS3URL

func (cli *Client) GetFromS3URL(rawURL string, style URLStyle) (io.ReadCloser, *int64, error)

GetFromS3URL returns an io.ReadCloser instance and the content length (size in bytes) for the given S3 URL, in the format specified by URLStyle. More information about S3 URL styles: https://docs.aws.amazon.com/AmazonS3/latest/dev/VirtualHosting.html If the URL's region (when present) or bucket differs from the one configured in this client, an error will be returned.

The caller is responsible for closing the returned ReadCloser. For example, it may be closed in a defer statement: defer r.Close()

func (*Client) GetFromS3URLWithPSK

func (cli *Client) GetFromS3URLWithPSK(rawURL string, style URLStyle, psk []byte) (io.ReadCloser, *int64, error)

GetFromS3URLWithPSK returns an io.ReadCloser instance and the content length (size in bytes) for the given S3 URL, in the format specified by URLStyle, using the provided PSK for encryption. More information about S3 URL styles: https://docs.aws.amazon.com/AmazonS3/latest/dev/VirtualHosting.html If the URL's region (when present) or bucket differs from the one configured in this client, an error will be returned.

The caller is responsible for closing the returned ReadCloser. For example, it may be closed in a defer statement: defer r.Close()

func (*Client) GetWithPSK

func (cli *Client) GetWithPSK(key string, psk []byte) (io.ReadCloser, *int64, error)

GetWithPSK returns an io.ReadCloser instance for the given path (inside the bucket configured for this client) and the content length (size in bytes). It uses the provided PSK for encryption. The 'key' parameter refers to the path for the file under the bucket.

The caller is responsible for closing the returned ReadCloser. For example, it may be closed in a defer statement: defer r.Close()

func (*Client) Head

func (cli *Client) Head(key string) (*s3.HeadObjectOutput, error)

Head returns a HeadObjectOutput containing the object metadata obtained from an HTTP HEAD call

func (*Client) ListObjects added in v2.1.0

func (cli *Client) ListObjects(BucketName string) (*s3.ListObjectsOutput, error)

func (*Client) PutBucketPolicy

func (cli *Client) PutBucketPolicy(BucketName string, policy string) (*s3.PutBucketPolicyOutput, error)

func (*Client) PutWithPSK

func (cli *Client) PutWithPSK(key *string, reader *bytes.Reader, psk []byte) error

PutWithPSK uploads the provided contents to the key in the bucket configured for this client, using the provided PSK. The 'key' parameter refers to the path for the file under the bucket.

func (*Client) Session

func (cli *Client) Session() *session.Session

Session returns the Session of this client

func (*Client) Upload

func (cli *Client) Upload(input *s3manager.UploadInput, options ...func(*s3manager.Uploader)) (*s3manager.UploadOutput, error)

Upload uploads a file to S3 using the AWS s3Manager, which will automatically split up large objects and upload them concurrently.

func (*Client) UploadPart

func (cli *Client) UploadPart(ctx context.Context, req *UploadPartRequest, payload []byte) (MultipartUploadResponse, error)

UploadPart handles the uploading of a file part to AWS S3, into the bucket configured for this client

func (*Client) UploadPartWithPsk

func (cli *Client) UploadPartWithPsk(ctx context.Context, req *UploadPartRequest, payload []byte, psk []byte) (MultipartUploadResponse, error)

UploadPartWithPsk handles the uploading of a file part to AWS S3, into the bucket configured for this client, using a user-defined psk

func (*Client) UploadWithContext

func (cli *Client) UploadWithContext(ctx context.Context, input *s3manager.UploadInput, options ...func(*s3manager.Uploader)) (*s3manager.UploadOutput, error)

UploadWithContext uploads a file to S3 using the AWS s3Manager with context, which will automatically split up large objects and upload them concurrently. The provided context may be used to abort the operation.

func (*Client) UploadWithPSK

func (cli *Client) UploadWithPSK(input *s3manager.UploadInput, psk []byte) (*s3manager.UploadOutput, error)

UploadWithPSK uploads a file to S3 using cryptoclient, which allows you to encrypt the file with a given psk.

func (*Client) UploadWithPSKAndContext

func (cli *Client) UploadWithPSKAndContext(ctx context.Context, input *s3manager.UploadInput, psk []byte, options ...func(*s3manager.Uploader)) (*s3manager.UploadOutput, error)

UploadWithPSKAndContext uploads a file to S3 using cryptoclient, which allows you to encrypt the file with a given psk. The provided context may be used to abort the operation.

func (*Client) ValidateBucket

func (cli *Client) ValidateBucket() error

ValidateBucket checks that the bucket exists and returns an error if it does not exist or there was some other error trying to get this information.

func (*Client) ValidateUploadInput

func (cli *Client) ValidateUploadInput(input *s3manager.UploadInput) (log.Data, error)

ValidateUploadInput checks the upload input and returns an error if there is a bucket override mismatch or the S3 key is not provided

type ErrChunkNumberNotFound

type ErrChunkNumberNotFound struct {
	S3Error
}

ErrChunkNumberNotFound if a chunk number could not be found in an existing multipart upload.

func NewChunkNumberNotFound

func NewChunkNumberNotFound(err error, logData map[string]interface{}) *ErrChunkNumberNotFound

type ErrChunkTooSmall

type ErrChunkTooSmall struct {
	S3Error
}

func NewChunkTooSmallError

func NewChunkTooSmallError(err error, logData map[string]interface{}) *ErrChunkTooSmall

type ErrListParts

type ErrListParts struct {
	S3Error
}

ErrListParts represents an error returned by S3 ListParts

func NewListPartsError

func NewListPartsError(err error, logData map[string]interface{}) *ErrListParts

type ErrNotUploaded

type ErrNotUploaded struct {
	S3Error
}

ErrNotUploaded if an s3Key could not be found in ListMultipartUploads

func NewErrNotUploaded

func NewErrNotUploaded(err error, logData map[string]interface{}) *ErrNotUploaded

type ErrUnexpectedBucket

type ErrUnexpectedBucket struct {
	S3Error
}

ErrUnexpectedBucket if a request tried to access an unexpected bucket

func NewUnexpectedBucketError

func NewUnexpectedBucketError(err error, logData map[string]interface{}) *ErrUnexpectedBucket

type ErrUnexpectedRegion

type ErrUnexpectedRegion struct {
	S3Error
}

ErrUnexpectedRegion if a request tried to access an unexpected region

func NewUnexpectedRegionError

func NewUnexpectedRegionError(err error, logData map[string]interface{}) *ErrUnexpectedRegion

type MultipartUploadResponse

type MultipartUploadResponse struct {
	Etag             string
	AllPartsUploaded bool
}

type S3CryptoClient

type S3CryptoClient interface {
	UploadPartWithPSK(in *s3.UploadPartInput, psk []byte) (out *s3.UploadPartOutput, err error)
	GetObjectWithPSK(in *s3.GetObjectInput, psk []byte) (out *s3.GetObjectOutput, err error)
	PutObjectWithPSK(in *s3.PutObjectInput, psk []byte) (out *s3.PutObjectOutput, err error)
}

S3CryptoClient represents the cryptoclient with methods required to upload parts with encryption

type S3CryptoUploader

type S3CryptoUploader interface {
	UploadWithPSK(ctx context.Context, in *s3manager.UploadInput, psk []byte) (out *s3manager.UploadOutput, err error)
}

S3CryptoUploader represents the s3crypto Uploader with methods required to upload parts with encryption

type S3Error

type S3Error struct {
	// contains filtered or unexported fields
}

S3Error is the s3 package's error type

func NewError

func NewError(err error, logData map[string]interface{}) *S3Error

NewError creates a new S3Error

func (*S3Error) Error

func (e *S3Error) Error() string

S3Error implements the Go standard error interface

func (*S3Error) LogData

func (e *S3Error) LogData() map[string]interface{}

LogData implements the DataLogger interface, which allows you to extract embedded log.Data from an error

func (*S3Error) Unwrap

func (e *S3Error) Unwrap() error

Unwrap returns the wrapped error

type S3SDKClient

type S3SDKClient interface {
	ListMultipartUploads(in *s3.ListMultipartUploadsInput) (out *s3.ListMultipartUploadsOutput, err error)
	ListParts(in *s3.ListPartsInput) (out *s3.ListPartsOutput, err error)
	CompleteMultipartUpload(in *s3.CompleteMultipartUploadInput) (out *s3.CompleteMultipartUploadOutput, err error)
	CreateMultipartUpload(in *s3.CreateMultipartUploadInput) (out *s3.CreateMultipartUploadOutput, err error)
	UploadPart(in *s3.UploadPartInput) (out *s3.UploadPartOutput, err error)
	HeadBucket(in *s3.HeadBucketInput) (out *s3.HeadBucketOutput, err error)
	HeadObject(in *s3.HeadObjectInput) (out *s3.HeadObjectOutput, err error)
	GetObject(in *s3.GetObjectInput) (out *s3.GetObjectOutput, err error)
	GetBucketPolicy(in *s3.GetBucketPolicyInput) (out *s3.GetBucketPolicyOutput, err error)
	PutBucketPolicy(in *s3.PutBucketPolicyInput) (out *s3.PutBucketPolicyOutput, err error)
	ListObjects(in *s3.ListObjectsInput) (out *s3.ListObjectsOutput, err error)
}

S3SDKClient represents the sdk client with methods required by dp-s3 client

type S3SDKUploader

type S3SDKUploader interface {
	Upload(in *s3manager.UploadInput, options ...func(*s3manager.Uploader)) (out *s3manager.UploadOutput, err error)
	UploadWithContext(ctx context.Context, in *s3manager.UploadInput, options ...func(*s3manager.Uploader)) (out *s3manager.UploadOutput, err error)
}

S3SDKUploader represents the sdk uploader with methods required by dp-s3 client

type S3Url

type S3Url struct {
	Scheme     string
	Region     string
	BucketName string
	Key        string
}

S3Url represents an S3 URL with bucketName, key and region (optional). This struct is intended to be used for S3 URL string manipulation/translation in its possible format styles.

func NewURL

func NewURL(region, bucketName, key string) (*S3Url, error)

NewURL instantiates a new S3Url struct with the provided region, bucket name and object key

func NewURLWithScheme

func NewURLWithScheme(scheme, region, bucketName, key string) (*S3Url, error)

NewURLWithScheme instantiates a new S3Url struct with the provided scheme, region, bucket and object key

func ParseAliasVirtualHostedURL

func ParseAliasVirtualHostedURL(avhURL string) (*S3Url, error)

ParseAliasVirtualHostedURL creates an S3Url struct from the provided dns-alias-virtual-hosted-style URL string. Example: 'https://myBucket/my/s3/object/key'

func ParseGlobalPathStyleURL

func ParseGlobalPathStyleURL(gpURL string) (*S3Url, error)

ParseGlobalPathStyleURL creates an S3Url struct from the provided global-path-style URL string. Example: 'https://s3.amazonaws.com/myBucket/my/s3/object/key'. This function is compatible with the PathStyle format (if a region is present in the URL, it will be ignored)

func ParseGlobalVirtualHostedURL

func ParseGlobalVirtualHostedURL(gvhURL string) (*S3Url, error)

ParseGlobalVirtualHostedURL creates an S3Url struct from the provided global-virtual-hosted-style URL string. Example: 'https://myBucket.s3.amazonaws.com/my/s3/object/key'

func ParsePathStyleURL

func ParsePathStyleURL(pathStyleURL string) (*S3Url, error)

ParsePathStyleURL creates an S3Url struct from the provided path-style URL string. Example: 'https://s3-eu-west-1.amazonaws.com/myBucket/my/s3/object/key'.

func ParseURL

func ParseURL(rawURL string, style URLStyle) (*S3Url, error)

ParseURL creates an S3Url struct from the provided rawURL and format style

func ParseVirtualHostedURL

func ParseVirtualHostedURL(vhURL string) (*S3Url, error)

ParseVirtualHostedURL creates an S3Url struct from the provided virtual-hosted-style URL string. Example: 'https://myBucket.s3-eu-west-1.amazonaws.com/my/s3/object/key'

func (*S3Url) String

func (s3Url *S3Url) String(style URLStyle) (string, error)

String returns the S3 URL string in the requested format style.

type URLStyle

type URLStyle int

URLStyle is the type that defines the URL style iota enumeration corresponding to an S3 URL format (path, virtualHosted, etc.)

func (URLStyle) String

func (style URLStyle) String() string

String returns the string value of the format style

type UploadPartRequest

type UploadPartRequest struct {
	UploadKey   string
	Type        string
	ChunkNumber int64
	TotalChunks int
	FileName    string
}

UploadPartRequest represents a part upload request

Directories

Path Synopsis
File copied from the s3crypto repository. Original repo: https://github.com/ONSdigital/s3crypto
File copied from the s3crypto repository. Original repo: https://github.com/ONSdigital/s3crypto
