network

package
v2.1.2+incompatible Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 13, 2019 License: Apache-2.0 Imports: 27 Imported by: 10

Documentation

Index

Constants

View Source
const (
	PharosIntellectualObject PharosObjectType = "IntellectualObject"
	PharosInstitution                         = "Institution"
	PharosGenericFile                         = "GenericFile"
	PharosChecksum                            = "Checksum"
	PharosPremisEvent                         = "PremisEvent"
	PharosWorkItem                            = "WorkItem"
	PharosWorkItemState                       = "WorkItemState"
	PharosDPNWorkItem                         = "DPNWorkItem"
	PharosDPNBag                              = "PharosDPNBag"
)
View Source
const BIG_CHUNK_SIZE = int64(50 * 1024 * 1024)
View Source
const S3_MIN_CHUNK_SIZE = int64(5 * 1024 * 1024)

S3_MIN_CHUNK_SIZE is the minimum chunk size that aws-go-sdk will accept for uploads to S3: 5MB.

Variables

This section is empty.

Functions

func GetS3Session

func GetS3Session(awsRegion, accessKeyId, secretAccessKey string) (*session.Session, error)

Returns an S3 session for this objectList.

func S3HeadHandler

func S3HeadHandler(w http.ResponseWriter, r *http.Request)

func S3HeadRestoreCompletedHandler

func S3HeadRestoreCompletedHandler(w http.ResponseWriter, r *http.Request)

Handles an S3 Head request and replies that a restore is complete.

func S3HeadRestoreInProgressHandler

func S3HeadRestoreInProgressHandler(w http.ResponseWriter, r *http.Request)

Handles an S3 Head request and replies that a restore is already in progress.

func S3RestoreCompletedHandler

func S3RestoreCompletedHandler(w http.ResponseWriter, r *http.Request)

Handles a request to restore a Glacier object to S3, and replies with a 200 to say the restoration is already completed (item already restored to active tier)

func S3RestoreHandler

func S3RestoreHandler(w http.ResponseWriter, r *http.Request)

Accepts a request to restore a Glacier object to S3.

func S3RestoreInProgressHandler

func S3RestoreInProgressHandler(w http.ResponseWriter, r *http.Request)

Handles a request to restore a Glacier object to S3, and replies with a 409 to say the restoration is already in progress. See https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPOSTrestore.html

func S3RestoreRejectHandler

func S3RestoreRejectHandler(w http.ResponseWriter, r *http.Request)

Rejects a request to restore a Glacier object to S3.

Types

type NSQClient

type NSQClient struct {
	URL string
}

NSQClient provides methods for queueing items and querying stats from the NSQ server at URL.

func NewNSQClient

func NewNSQClient(url string) *NSQClient

NewNSQClient returns a new NSQ client that will connect to the NSQ server and the specified url. The URL is typically available through Config.NsqdHttpAddress, and usually ends with :4151. This is the URL to which we post items we want to queue, and from which our workers read.

Note that this client provides write access to queue, so we can add things. It does not provide read access. The workers do the reading.

func (*NSQClient) Enqueue

func (client *NSQClient) Enqueue(topic string, workItemId int) error

Enqueue posts data to NSQ, which essentially means putting it into a work topic. Param topic is the topic under which you want to queue something. For example, prepare_topic, fixity_topic, etc. Param workItemId is the id of the WorkItem record in Pharos we want to queue.

func (*NSQClient) EnqueueString

func (client *NSQClient) EnqueueString(topic string, data string) error

EnqueueString posts string data to the specified NSQ topic

func (*NSQClient) GetStats

func (client *NSQClient) GetStats() (*NSQStatsData, error)

GetStats allows us to get some basic stats from NSQ. The NSQ /stats endpoint returns a richer set of stats than what this fuction returns, but we only need some basic data for integration tests, so that's all we're parsing. The return value is a map whose key is the topic name and whose value is an NSQTopicStats object. NSQ is supposed to support topic_name as a query param, but this doesn't seem to be working in NSQ 0.3.0, so we're just returning stats for all topics right now. Also note that requests to /stats/ (with trailing slash) produce a 404.

type NSQStatsData

type NSQStatsData struct {
	Version   string            `json:"version"`
	Health    string            `json:"status_code"`
	StartTime uint64            `json:"start_time"`
	Topics    []nsqd.TopicStats `json:"topics"`
}

NSQStatsData contains the important info returned by a call to NSQ's /stats endpoint, including the number of items in each topic and queue.

type PharosClient

type PharosClient struct {
	// contains filtered or unexported fields
}

PharosClient supports basic calls to the Pharos Admin REST API. This client does not support the Member API.

func NewPharosClient

func NewPharosClient(hostUrl, apiVersion, apiUser, apiKey string) (*PharosClient, error)

NewPharosClient creates a new pharos client. Param hostUrl should come from the config.json file.

func (*PharosClient) BuildUrl

func (client *PharosClient) BuildUrl(relativeUrl string) string

BuildUrl combines the host and protocol in client.hostUrl with relativeUrl to create an absolute URL. For example, if client.hostUrl is "http://localhost:3456", then client.BuildUrl("/path/to/action.json") would return "http://localhost:3456/path/to/action.json".

func (*PharosClient) ChecksumGet

func (client *PharosClient) ChecksumGet(id int) *PharosResponse

ChecksumGet returns the checksum with the specified id

func (*PharosClient) ChecksumList

func (client *PharosClient) ChecksumList(params url.Values) *PharosResponse

ChecksumList returns a list of checksums. Params include:

  • generic_file_identifier - The identifier of the file to which the checksum belongs.
  • algorithm - The checksum algorithm (constants.AldMd5, constants.AlgSha256)

func (*PharosClient) ChecksumSave

func (client *PharosClient) ChecksumSave(obj *models.Checksum, gfIdentifier string) *PharosResponse

ChecksumSave saves a Checksum to Pharos. The checksum Id should be zero, since we can create but not update Checksums. Param gfIdentifier is the identifier of the GenericFile to which the checksum belongs. The response object will have a new copy of the Checksum if the save was successful.

func (*PharosClient) DPNBagGet

func (client *PharosClient) DPNBagGet(id int) *PharosResponse

DPNBagGet returns the PharosDPNBag with the specified Id. Param id is Pharos' DPNBag.id, not the DPN UUID or the Pharos object_identifier.

func (*PharosClient) DPNBagList

func (client *PharosClient) DPNBagList(params url.Values) *PharosResponse

DPNBagList returns a list of PharosDPNBags that match the specified criteria. Valid params include institution_id, object_identifier, dpn_identifier, node_1, node_2, node_3, created_before, created_after, updated_before, updated_after. Sort by 'dpn_upated_at DESC' to get latest.

func (*PharosClient) DPNBagSave

func (client *PharosClient) DPNBagSave(obj *models.PharosDPNBag) *PharosResponse

DPNBagSave saves a PharosDPNBag record to Pharos.

func (*PharosClient) DPNWorkItemGet

func (client *PharosClient) DPNWorkItemGet(id int) *PharosResponse

DPNWorkItemGet returns the DPNWorkItem with the specified Id.

func (*PharosClient) DPNWorkItemList

func (client *PharosClient) DPNWorkItemList(params url.Values) *PharosResponse

DPNWorkItemList returns a list of DPNWorkItems that match the specified criteria. Valid params include node, task, identifier, queued_before, queued_after, completed_before and completed_after.

func (*PharosClient) DPNWorkItemSave

func (client *PharosClient) DPNWorkItemSave(obj *models.DPNWorkItem) *PharosResponse

DPNWorkItemSave saves a DPNWorkItem record to Pharos. If the WorkItems's ID is zero, this performs a POST to create a new record. For non-zero IDs, this performs a PUT to update the existing record. The response object will include a new copy of the WorkItem if it was saved successfully.

func (*PharosClient) DoRequest

func (client *PharosClient) DoRequest(resp *PharosResponse, method, absoluteUrl string, requestData io.Reader)

DoRequest issues an HTTP request, reads the response, and closes the connection to the remote server.

Param resp should be a PharosResponse.

For a description of the other params, see NewJsonRequest.

If an error occurs, it will be recorded in resp.Error.

func (*PharosClient) FinishRestorationSpotTest

func (client *PharosClient) FinishRestorationSpotTest(workItemId int) *PharosResponse

FinishRestorationSpotTest tells Pharos to send an email to institutional admins saying APTrust has randomly restored one of their bags as part of a spot test.

func (*PharosClient) GenericFileFinishDelete

func (client *PharosClient) GenericFileFinishDelete(identifier string) *PharosResponse

GenericFileFinishDelete tells Pharos we've finished deleting a generic file. We have to create the deletion PREMIS event before calling this. This call returns no data. If response.Error is nil, it succeeded.

func (*PharosClient) GenericFileGet

func (client *PharosClient) GenericFileGet(identifier string, includeRelations bool) *PharosResponse

GenericFileGet returns the GenericFile having the specified identifier. The identifier should be in the format "institution.edu/object_name/path/to/file.ext" If param includeRelations is true, this call will return the GenericFile along with its checksums and premis events. Otherwise, you get just the GenericFile.

func (*PharosClient) GenericFileList

func (client *PharosClient) GenericFileList(params url.Values) *PharosResponse

GenericFileList returns a list of Generic Files. Params include:

  • intellectual_object_identifier - The identifier of the object to which the files belong.
  • not_checked_since [datetime] - Returns a list of files that have not had a fixity check since the specified datetime [yyyy-mm-dd]
  • include_relations=true - Include the file's PremisEvents and Checksums in the response.
  • with_ingest_state=true - Include ingest state data in the response.
  • storage_option - "Standard", "Glacier-OH", "Glacier-OR", "Glacier-VA", "Glacier-Deep-OH", "Glacier-Deep-OR", "Glacier-Deep-VA"

func (*PharosClient) GenericFileRequestRestore

func (client *PharosClient) GenericFileRequestRestore(identifier string) *PharosResponse

GenericFileRequestRestore creates a restore request in Pharos for the file with the specified identifier. This is used in integration testing to create restore requests.

func (*PharosClient) GenericFileSave

func (client *PharosClient) GenericFileSave(obj *models.GenericFile) *PharosResponse

GenericFileSave saves a Generic File record to Pharos. If the Generic File's ID is zero, this performs a POST to create a new record. For non-zero IDs, this performs a PUT to update the existing record. Either way, the record must have an IntellectualObject ID. The response object will have a new copy of the GenericFile if the save was successful.

func (*PharosClient) GenericFileSaveBatch

func (client *PharosClient) GenericFileSaveBatch(objList []*models.GenericFile) *PharosResponse

GenericFileSaveBatch saves a batch of Generic File records to Pharos. This performs a POST to create a new records, so all of the GenericFiles passed in param objList should have Ids of zero. Each record must also have an IntellectualObject ID. The response object will be a list containing a new copy of each GenericFile that was saved. The new copies have correct ids and timestamps. On the Pharos end, the batch insert is run as a transaction, so either all inserts succeed, or the whole transaction is rolled back and no inserts occur.

func (*PharosClient) InstitutionGet

func (client *PharosClient) InstitutionGet(identifier string) *PharosResponse

InstitutionGet returns the institution with the specified identifier.

func (*PharosClient) InstitutionList

func (client *PharosClient) InstitutionList(params url.Values) *PharosResponse

InstitutionList returns a list of APTrust depositor institutions.

func (*PharosClient) IntellectualObjectFinishDelete

func (client *PharosClient) IntellectualObjectFinishDelete(identifier string) *PharosResponse

IntellectualObjectFinishDelete tells Pharos to mark an IntellectualObject as deleted, once we've finished deleting it.

func (*PharosClient) IntellectualObjectGet

func (client *PharosClient) IntellectualObjectGet(identifier string, includeFiles, includeEvents bool) *PharosResponse

IntellectualObjectGet returns the object with the specified identifier, if it exists. Param identifier is an IntellectualObject identifier in the format "institution.edu/object_name". If param includeFiles is true, Pharos will return an IntellectualObject with all of its GenericFiles and their checksums. If param includeEvents is true, Pharos will return the object with its PREMIS events. If both boolean flags are true, Pharos will return the object will all files, checksums and events, resulting in a huge blob of JSON (many megabytes).

func (*PharosClient) IntellectualObjectList

func (client *PharosClient) IntellectualObjectList(params url.Values) *PharosResponse

IntellectualObjectList returns a list of IntellectualObjects matching the filter criteria specified in params. Params include:

* institution - Return objects belonging to this institution. * updated_since - Return object updated since this date. * name_contains - Return objects whose name contains the specified string. * name_exact - Return only object with the exact name specified. * state = 'A' for active records, 'D' for deleted. Default is 'A' * storage_option - "Standard", "Glacier-OH", "Glacier-OR", "Glacier-VA", * "Glacier-Deep-OH", "Glacier-Deep-OR", "Glacier-Deep-VA"

func (*PharosClient) IntellectualObjectPushToDPN

func (client *PharosClient) IntellectualObjectPushToDPN(identifier string) *PharosResponse

IntellectualObjectPushToDPN is used only in integration tests. It creates a WorkItem in Pharos requesting that the IntellectualObject with the specified identifier be ingested into DPN. Check the value of WorkItem() (not IntellectualObject()) in the response.

func (*PharosClient) IntellectualObjectRequestDelete

func (client *PharosClient) IntellectualObjectRequestDelete(identifier string) *PharosResponse

IntellectualObjectRequestDelete creates a delete request in Pharos for the object with the specified identifier. This is used in integration testing to create a set of file deletion requests. This call returns no data.

func (*PharosClient) IntellectualObjectRequestRestore

func (client *PharosClient) IntellectualObjectRequestRestore(identifier string) *PharosResponse

IntellectualObjectRequestRestore creates a restore request in Pharos for the object with the specified identifier. This is used in integration testing to create restore requests.

func (*PharosClient) IntellectualObjectSave

func (client *PharosClient) IntellectualObjectSave(obj *models.IntellectualObject) *PharosResponse

IntellectualObjectSave saves the intellectual object to Pharos. If the object has an ID of zero, this performs a POST to create a new Intellectual Object. If the ID is non-zero, this updates the existing object with a PUT. The response object will contain a new copy of the IntellectualObject if it was successfully saved.

func (*PharosClient) NewJsonRequest

func (client *PharosClient) NewJsonRequest(method, absoluteUrl string, requestData io.Reader) (*http.Request, error)

NewJsonRequest returns a new request with headers indicating JSON request and response formats.

Param method can be "GET", "POST", or "PUT". The Pharos service currently only supports those three.

Param absoluteUrl should be the absolute URL. For get requests, include params in the query string rather than in the requestData param.

Param requestData will be nil for GET requests, and can be constructed from bytes.NewBuffer([]byte) for POST and PUT. For the PharosClient, we're typically sending JSON data in the request body.

func (*PharosClient) PremisEventGet

func (client *PharosClient) PremisEventGet(identifier string) *PharosResponse

PremisEventGet returns the PREMIS event with the specified identifier. The identifier should be a UUID in string format, with dashes. E.g. "49a7d6b5-cdc1-4912-812e-885c08e90c68"

func (*PharosClient) PremisEventList

func (client *PharosClient) PremisEventList(params url.Values) *PharosResponse

PremisEventList returns a list of PREMIS events matching the specified criteria. Parameters include:

  • object_identifier - (string) Return events associated with the specified intellectual object (but not its generic files).
  • file_identifier - (string) Return events associated with the specified generic file.
  • event_type - (string) Return events of the specified type. See the event types listed in contants/constants.go
  • created_since - (iso 8601 datetime string) Return events created on or after the specified datetime.

func (*PharosClient) PremisEventSave

func (client *PharosClient) PremisEventSave(obj *models.PremisEvent) *PharosResponse

PremisEventSave saves a PREMIS event to Pharos. If the event ID is zero, this issues a POST request to create a new event record. If the ID is non-zero, this issues a PUT to update the existing event. The response object will have a new copy of the Premis event if the save was successful.

func (*PharosClient) WorkItemGet

func (client *PharosClient) WorkItemGet(id int) *PharosResponse

WorkItemGet returns the WorkItem with the specified ID.

func (*PharosClient) WorkItemList

func (client *PharosClient) WorkItemList(params url.Values) *PharosResponse

WorkItemList lists the work items meeting the specified filters, or all work items if no filter params are set. Params include:

created_before - DateTime in RFC3339 format created_after - DateTime in RFC3339 format updated_before - DateTime in RFC3339 format updated_after - DateTime in RFC3339 format bag_date - DateTime in RFC3339 format name - Name of the tar file that appeared in the receiving bucket. name_contains - Match on partial tar file name etag - The etag of the file uploaded to the receiving bucket. etag_contains - Match on partial etag. object_identifier - The IntellectualObject identifier (null in some WorkItems) object_identifier_contains - Match on partial IntelObj file_identifier - The GenericFile identifier (null on most WorkItems) file_identifier_contains - Match on partiak GenericFile identifier status - String enum value from constants. StatusFetch, StatusUnpack, etc. stage - String enum value from constants. StageReceive, StageCleanup, etc. item_action - String enum value from constants. ActionIngest, ActionRestore, etc. access - String enum value from constants.AccessRights. state - "A" for active items, "D" for deleted items.

func (*PharosClient) WorkItemSave

func (client *PharosClient) WorkItemSave(obj *models.WorkItem) *PharosResponse

WorkItemSave saves a WorkItem record to Pharos. If the WorkItems's ID is zero, this performs a POST to create a new record. For non-zero IDs, this performs a PUT to update the existing record. The response object will include a new copy of the WorkItem if it was saved successfully.

func (*PharosClient) WorkItemStateGet

func (client *PharosClient) WorkItemStateGet(workItemStateId int) *PharosResponse

WorkItemStateGet returns the WorkItemState with the specified WorkItemStateId.

func (*PharosClient) WorkItemStateSave

func (client *PharosClient) WorkItemStateSave(obj *models.WorkItemState) *PharosResponse

WorkItemStateSave saves a WorkItemState record to Pharos. If the WorkItemState's ID is zero, this performs a POST to create a new record. For non-zero IDs, this performs a PUT to update the existing record. The response object will include a new copy of the WorkItemState if it was saved successfully.

type PharosObjectType

type PharosObjectType string

type PharosResponse

type PharosResponse struct {
	// Count is the total number of items matching the
	// specified filters. This is useful for List requests.
	// Note that the number of items returned in the response
	// may be fewer than ItemCount. For example, the remote
	// server may return only 10 of 10,000 matching records
	// at a time.
	Count int

	// The URL of the next page of results.
	Next *string

	// The URL of the next page of results.
	Previous *string

	// The HTTP request that was (or would have been) sent to
	// the Pharos REST server. This is useful for logging and
	// debugging.
	Request *http.Request

	// The HTTP Response from the server. You can get the
	// HTTP status code, headers, etc. through this. See
	// https://golang.org/pkg/net/http/#Response for more info.
	//
	// Do not try to read Response.Body, since it's already been read
	// and the stream has been closed. Use the RawResponseData()
	// method instead.
	Response *http.Response

	// The error, if any, that occurred while processing this
	// request. Errors may come from the server (4xx or 5xx
	// responses) or from the client (e.g. if it could not
	// parse the JSON response).
	Error error
	// contains filtered or unexported fields
}

func NewPharosResponse

func NewPharosResponse(objType PharosObjectType) *PharosResponse

Creates a new PharosResponse and returns a pointer to it.

func (*PharosResponse) Checksum

func (resp *PharosResponse) Checksum() *models.Checksum

Returns the Checksum parsed from the HTTP response body, or nil.

func (*PharosResponse) Checksums

func (resp *PharosResponse) Checksums() []*models.Checksum

Returns a list of Checksums parsed from the HTTP response body.

func (*PharosResponse) DPNBag

func (resp *PharosResponse) DPNBag() *models.PharosDPNBag

Returns the PharosDPNBag parsed from the HTTP response body, or nil.

func (*PharosResponse) DPNBags

func (resp *PharosResponse) DPNBags() []*models.PharosDPNBag

Returns a list of PharosDPNBag parsed from the HTTP response body.

func (*PharosResponse) DPNWorkItem

func (resp *PharosResponse) DPNWorkItem() *models.DPNWorkItem

Returns the DPNWorkItem parsed from the HTTP response body, or nil.

func (*PharosResponse) DPNWorkItems

func (resp *PharosResponse) DPNWorkItems() []*models.DPNWorkItem

Returns a list of DPNWorkItems parsed from the HTTP response body.

func (*PharosResponse) GenericFile

func (resp *PharosResponse) GenericFile() *models.GenericFile

Returns the GenericFile parsed from the HTTP response body, or nil.

func (*PharosResponse) GenericFiles

func (resp *PharosResponse) GenericFiles() []*models.GenericFile

Returns a list of GenericFiles parsed from the HTTP response body.

func (*PharosResponse) HasNextPage

func (resp *PharosResponse) HasNextPage() bool

Returns true if the response includes a link to the next page of results.

func (*PharosResponse) HasPreviousPage

func (resp *PharosResponse) HasPreviousPage() bool

Returns true if the response includes a link to the previous page of results.

func (*PharosResponse) Institution

func (resp *PharosResponse) Institution() *models.Institution

Returns the Institution parsed from the HTTP response body, or nil.

func (*PharosResponse) Institutions

func (resp *PharosResponse) Institutions() []*models.Institution

Returns a list of Institutions parsed from the HTTP response body.

func (*PharosResponse) IntellectualObject

func (resp *PharosResponse) IntellectualObject() *models.IntellectualObject

Returns the IntellectualObject parsed from the HTTP response body, or nil.

func (*PharosResponse) IntellectualObjects

func (resp *PharosResponse) IntellectualObjects() []*models.IntellectualObject

Returns a list of IntellectualObjects parsed from the HTTP response body.

func (*PharosResponse) ObjectType

func (resp *PharosResponse) ObjectType() PharosObjectType

Returns the type of object(s) contained in this response.

func (*PharosResponse) ParamsForNextPage

func (resp *PharosResponse) ParamsForNextPage() url.Values

Returns the URL parameters to request the next page of results, or nil if there is no next page.

func (*PharosResponse) ParamsForPreviousPage

func (resp *PharosResponse) ParamsForPreviousPage() url.Values

Returns the URL parameters to request the previous page of results, or nil if there is no previous page.

func (*PharosResponse) PremisEvent

func (resp *PharosResponse) PremisEvent() *models.PremisEvent

Returns the PremisEvent parsed from the HTTP response body, or nil.

func (*PharosResponse) PremisEvents

func (resp *PharosResponse) PremisEvents() []*models.PremisEvent

Returns a list of PremisEvents parsed from the HTTP response body.

func (*PharosResponse) RawResponseData

func (resp *PharosResponse) RawResponseData() ([]byte, error)

Returns the raw body of the HTTP response as a byte slice. The return value may be nil.

func (*PharosResponse) UnmarshalJsonList

func (resp *PharosResponse) UnmarshalJsonList() error

UnmarshalJsonList converts JSON response from the Pharos server into a list of usable objects. The Pharos list response has this structure:

{
  "count": 500
  "next": "https://example.com/objects/per_page=20&page=11"
  "previous": "https://example.com/objects/per_page=20&page=9"
  "results": [... array of arbitrary objects ...]
}

func (*PharosResponse) WorkItem

func (resp *PharosResponse) WorkItem() *models.WorkItem

Returns the WorkItem parsed from the HTTP response body, or nil.

func (*PharosResponse) WorkItemState

func (resp *PharosResponse) WorkItemState() *models.WorkItemState

Returns the WorkItemState parsed from the HTTP response body, or nil.

func (*PharosResponse) WorkItemStates

func (resp *PharosResponse) WorkItemStates() []*models.WorkItemState

Returns a list of WorkItemStates parsed from the HTTP response body.

func (*PharosResponse) WorkItems

func (resp *PharosResponse) WorkItems() []*models.WorkItem

Returns a list of WorkItems parsed from the HTTP response body.

type RestoreRequestInfo

type RestoreRequestInfo struct {
	RequestInProgress bool
	RequestIsComplete bool
	S3ExpiryDate      time.Time
}

Contains info parsed from x-amz-restore header, if that header is present. The header will only exist if we recently requested the item be retrieved from Glacier into S3.

type S3Copy

type S3Copy struct {
	AWSRegion         string
	SourceBucket      string
	SourceKey         string
	DestinationBucket string
	DestinationKey    string
	ErrorMessage      string
	Response          *s3.CopyObjectOutput
	// contains filtered or unexported fields
}

func NewS3Copy

func NewS3Copy(accessKeyId, secretAccessKey, region, sourceBucket, sourceKey, destinationBucket, destinationKey string) *S3Copy

Sets up a new S3Copy object. Params:

accessKeyId - The AWS Access Key Id used to authenticate with AWS. secretAccessKey - The AWS secret access key. region - The name of the AWS region where the source file is stored.

E.g. us-east-1 (VA), us-west-2 (Oregon), or use
constants.AWSVirginia, constants.AWSOregon

sourceBucket - The name of the bucket to copy from. sourceKey - The name/key S3 object to be copied. destinationBucket - The name of the bucket to copy to. destinationKey - The name/key of the S3 object in the destination bucket.

func (*S3Copy) Copy

func (client *S3Copy) Copy()

Fetch the file from S3.

func (*S3Copy) CopySource

func (client *S3Copy) CopySource() string

AWS docs say CopySource must be URL encoded.

func (*S3Copy) GetSession

func (client *S3Copy) GetSession() *session.Session

Returns an S3 session for this copy operation.

type S3Download

type S3Download struct {
	AWSRegion       string
	BucketName      string
	KeyName         string
	LocalPath       string
	CalculateMd5    bool
	CalculateSha256 bool
	Md5Digest       string
	Sha256Digest    string
	BytesCopied     int64
	ErrorMessage    string

	// The response from S3 for the attempted download.
	// Don't try to read Response.Body, because if this
	// object is non-nil, the response will already have
	// been read and closed.
	Response *s3.GetObjectOutput
	// contains filtered or unexported fields
}

func NewS3Download

func NewS3Download(accessKeyId, secretAccessKey, region, bucket, key, localPath string, calculateMd5, calculateSha256 bool) *S3Download

Sets up a new S3 download. Params:

accessKeyId - The AWS Access Key Id used to authenticate with AWS. secretAccessKey - The AWS secret access key. region - The name of the AWS region to download from.

E.g. us-east-1 (VA), us-west-2 (Oregon), or use
constants.AWSVirginia, constants.AWSOregon

bucket - The name of the bucket to download from. key - The name of the file to download. localPath - Path to which to save the downloaded file.

This may be /dev/null in cases where we're
just running a fixity check.

calculateMd5 - Should we calculate an md5 checksum on

the download?

calculateSha256 - Should we calculate a sha256 checksum

on the download?

func (*S3Download) Fetch

func (client *S3Download) Fetch()

Fetch the file from S3.

func (*S3Download) GetSession

func (client *S3Download) GetSession() *session.Session

Returns an S3 session for this download.

type S3Head

type S3Head struct {
	AWSRegion    string
	BucketName   string
	ErrorMessage string
	Response     *s3.HeadObjectOutput
	// contains filtered or unexported fields
}

func NewS3Head

func NewS3Head(accessKeyId, secretAccessKey, region, bucket string) *S3Head

Sets up a new S3 head request. Params:

accessKeyId - The AWS Access Key Id used to authenticate with AWS. secretAccessKey - The AWS secret access key. region - The name of the AWS region to download from.

E.g. us-east-1 (VA), us-west-2 (Oregon), or use
constants.AWSVirginia, constants.AWSOregon

bucket - The name of the bucket to download from.

func (*S3Head) DPNStoredFile

func (client *S3Head) DPNStoredFile() *dpn_models.DPNStoredFile

func (*S3Head) GetHeaderMetadata

func (client *S3Head) GetHeaderMetadata(key string) string

GetHeaderMetadata

func (*S3Head) GetRestoreRequestInfo

func (client *S3Head) GetRestoreRequestInfo() (*RestoreRequestInfo, error)

func (*S3Head) GetSession

func (client *S3Head) GetSession() *session.Session

Returns an S3 session for this head request.

func (*S3Head) Head

func (client *S3Head) Head(key string)

Head sends a HEAD request to S3 for the specified key. After calling this, check client.ErrorMessage and client.Response, which contains a HeadObjectOutput struct. See the docs here: https://godoc.org/github.com/aws/aws-sdk-go/service/s3#HeadObjectOutput

The most relevant items for us in the HeadObjectOutput struct are ContentLength, ContentType, LastModified, Metadata, and VersionId.

func (*S3Head) SetSessionEndpoint

func (client *S3Head) SetSessionEndpoint(url string)

SetSessionUrl is a hack that allows us to override the URL endpoint that the S3 client talks to. We do this only during testing, when we want our client to talk to a local test server.

func (*S3Head) StoredFile

func (client *S3Head) StoredFile() *models.StoredFile

type S3ObjectDelete

type S3ObjectDelete struct {
	AWSRegion    string
	ErrorMessage string

	DeleteObjectsInput *s3.DeleteObjectsInput
	Response           *s3.DeleteObjectsOutput
	// contains filtered or unexported fields
}

S3ObjectDelete wraps an S3 client that performs delete operations on S3 objects.

func NewS3ObjectDelete

func NewS3ObjectDelete(accessKeyId, secretAccessKey, region, bucket string, keys []string) *S3ObjectDelete

NewS3ObjectDelete returns a new S3ObjectDelete object. Params:

accessKeyId - The AWS Access Key Id used to authenticate with AWS. secretAccessKey - The AWS secret access key. region - is the S3 region you want to connect to. Regions are listed at

http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region,
and are configured in config settings APTrustS3Region,
APTrustGlacierRegion, and DPNGlacierRegion.

bucket - is the name of the bucket that contains the keys you want to delete. keys - is a list of keys you want to delete from that bucket.

func (*S3ObjectDelete) DeleteList

func (client *S3ObjectDelete) DeleteList()

DeleteList deletes the list of keys you specified. Check s3ObjectDelete.ErrorMessage afterward to see if anything failed. Detailed errors will be in s3ObjectDelete.Response.Errors. The S3 Error type is defined at http://docs.aws.amazon.com/sdk-for-go/api/service/s3.html#type-Error

Note that if you try to delete keys that don't exist, you will not get an error, and those keys will be shown as deleted in s3ObjectDelete.Response.Deleted. That's AWS' design decision.

func (*S3ObjectDelete) GetSession

func (client *S3ObjectDelete) GetSession() *session.Session

GetSession returns an S3 session for this object.

type S3ObjectList

type S3ObjectList struct {
	AWSRegion    string
	ErrorMessage string

	ListObjectsInput *s3.ListObjectsInput
	Response         *s3.ListObjectsOutput
	// contains filtered or unexported fields
}

func NewS3ObjectList

func NewS3ObjectList(accessKeyId, secretAccessKey, region, bucket string, maxKeys int64) *S3ObjectList

NewS3ObjectList returns an object that will list items in an S3 bucket.

accessKeyId - The AWS Access Key Id used to authenticate with AWS. secretAccessKey - The AWS secret access key. region - The S3 region to connect to. bucket - The bucket to list maxKeys - The maximum number of items to list

func (*S3ObjectList) GetList

func (client *S3ObjectList) GetList(prefix string)

Returns a list of objects from this S3 bucket. If param prefix is not an empty string, this returns only keys with the specified prefix. Check *s3ObjectList.Response.IsTruncated to see if you got the complete list. If not, keep calling GetList until IsTruncated == false.

func (*S3ObjectList) GetSession

func (client *S3ObjectList) GetSession() *session.Session

Returns an S3 session for this objectList.

type S3Restore

type S3Restore struct {
	AWSRegion                         string
	BucketName                        string
	KeyName                           string
	Tier                              string
	Days                              int64
	ErrorMessage                      string
	Response                          *s3.RestoreObjectOutput
	RestoreAlreadyInProgress          bool
	AlreadyInActiveTier               bool
	RequestRejectedServiceUnavailable bool

	// TestURL is the URL of a mock S3 server
	// for use in unit tests only.
	TestURL string
	// contains filtered or unexported fields
}

func NewS3Restore

func NewS3Restore(accessKeyId, secretAccessKey, region, bucket, key, tier string, days int64) *S3Restore

Sets up as S3 restore request, which is for S3 items that have been archived into Glacier. Normal S3 items do not need restore requests. You can just use s3_download to get them directly.

s3_restore simply initiates a restore request, which generally takes several hours to complete. Check the S3 bucket periodically to see if the item has been restored.

Params:

accessKeyId - The AWS Access Key Id used to authenticate with AWS. secretAccessKey - The AWS secret access key. region - The name of the AWS region to download from.

E.g. us-east-1 (VA), us-west-2 (Oregon), or use
constants.AWSVirginia, constants.AWSOregon

bucket - The name of the bucket to download from. key - The name of the file to download. tier - The Glacier retrieval tier. Values are "Expedited",

"Standard" and "Bulk". We almost always want "Standard".
"Expedited" is expensive, and "Bulk" doesn't really
saves us much.

days - The number of days to leave the restored item in

the S3 bucket after retrieving it.

func (*S3Restore) GetSession

func (client *S3Restore) GetSession() *session.Session

Returns an S3 session for this restore request.

func (*S3Restore) RequestAccepted

func (client *S3Restore) RequestAccepted() bool

RequestAccepted returns true if the restore request was accepted, or if it is already in progress or has already been completed.

func (*S3Restore) Restore

func (client *S3Restore) Restore()

Restore the archived file from Glacier to S3.

type S3Upload

type S3Upload struct {
	AWSRegion    string
	ErrorMessage string
	UploadInput  *s3manager.UploadInput
	Response     *s3manager.UploadOutput
	// contains filtered or unexported fields
}

Typical usage:

upload := NewS3Upload(constants.AWSVirginia, config.PreservationBucket,

"some_uuid", "application/xml")

upload.AddMetadata("institution", "college.edu") upload.AddMetadata("bag", "college.edu/bag") upload.AddMetadata("bagpath", "data/file.xml") upload.AddMetadata("md5", "12345678") upload.AddMetadata("sha256", "87654321") reader, err := os.Open("/path/to/file.txt")

if err != nil {
   ... whatever ...
}

defer reader.Close() upload.Send(reader)

if upload.ErrorMessage != "" {
   ... do something ...
}

urlOfNewItem := upload.Response.Location

func NewS3Upload

func NewS3Upload(accessKeyId, secretAccessKey, region, bucket, key, contentType string) *S3Upload

Creates a new S3 upload object using the s3Manager.Uploader described at https://godoc.org/github.com/aws/aws-sdk-go/service/s3/s3manager#Uploader

The uploader uses concurrent goroutines for speed, and is smart enough to handle both normal and multi-part uploads. It also cleans up stray file parts in cases where a multi-part upload fails.

Params:

accessKeyId - The AWS Access Key Id used to authenticate with AWS. secretAccessKey - The AWS secret access key. region - The name of the AWS region to download from.

E.g. us-east-1 (VA), us-west-2 (Oregon), or use
constants.AWSVirginia, constants.AWSOregon

bucket - The name of the bucket to download from. key - The name of the file to download. contentType - A standard Content-Type header, like text/html. APTrust

ingest uploads should always set this, but the apt_upload
utility in partner_apps doesn't have to. If contentType
is an empty string, the uploader will ignore it.

func (*S3Upload) AddMetadata

func (client *S3Upload) AddMetadata(key, value string)

Adds metadata to the upload. We should be adding the following:

x-amz-meta-institution x-amz-meta-bag x-amz-meta-bagpath x-amz-meta-md5 x-amz-meta-sha256

func (*S3Upload) Concurrency

func (client *S3Upload) Concurrency() int

func (*S3Upload) GetSession

func (client *S3Upload) GetSession() *session.Session

Returns an S3 session for this upload.

func (*S3Upload) PartSize

func (client *S3Upload) PartSize() int64

func (*S3Upload) Send

func (client *S3Upload) Send(reader io.Reader)

Upload a file to S3. If ErrorMessage == "", the upload succeeded. Check S3Upload.Response.Localtion for the item's S3 URL. Caller is responsible for closing the reader.

If you're sending anything larger than constants.S3LargeFileSize, don't pass a tarReader into this function. Pass in a File, or something that supports Seek() and ReadAt(). Otherwise, Amazon's s3 upload library will be very stupid and read ALL of the buffered chunks into memory at once. (Seriously, what's the point of buffering and chunking if you do that?) That causes the worker process to crash due to lack of memory. (Esp. when we're dealing with 1TB files.) See apt_storer for an example.

func (*S3Upload) SendWithSize

func (client *S3Upload) SendWithSize(reader io.Reader, fileSize int64)

SendWithSize attempts to work around what seems to be a bug in the underlying AWS S3 library. The underlying library is not setting a correct chunk size on files over 50GB, causing uploads to fail with this message:

MultipartUpload: upload multipart failed caused by: TotalPartsExceeded: exceeded total allowed configured MaxUploadParts (10000). Adjust PartSize to fit in this limit

PT #148913619 https://www.pivotaltracker.com/story/show/148913619

type VolumeClient

type VolumeClient struct {
	// contains filtered or unexported fields
}

VolumeClient connects to the VolumeService, which keeps track of how much disk space is used/available in our staging area. Workers use this service to determine whether there is enough disk space to start work on a job. We don't even want to start downloading a 250GB bag if we're going to run out of disk space before the download complete. Doing so will likely cause other worker tasks to fail due to lack of disk space.

func NewVolumeClient

func NewVolumeClient(port int) *VolumeClient

NewVolumeClient returns a new VolumeClient. Param port is the port number on which the service is running. That info should be available in config.VolumeServicePort.

func (*VolumeClient) BaseURL

func (client *VolumeClient) BaseURL() string

BaseURL returns the base URL of the VolumeService, which should always be running on localhost. (The service has to be able to stat local disks, so it should be running on localhost.)

func (*VolumeClient) Ping

func (client *VolumeClient) Ping(msTimeout int) error

Ping sends a message to the VolumeService to see if it's running. If the service isn't running, you'll get an error. Otherwise, in the immortal words of Judge Spaulding Smails, "You'll get nothing and like it."

func (*VolumeClient) Release

func (client *VolumeClient) Release(path string) error

Release tells the VolumeService that you're done with whatever disk space you reserved for the file at path.

func (*VolumeClient) Report

func (client *VolumeClient) Report(path string) (map[string]uint64, error)

Report returns information about all current disk space reservations from the VolumeService. In the map this function returns, the keys are file paths, and the values are the number of bytes reserved for those file paths.

func (*VolumeClient) Reserve

func (client *VolumeClient) Reserve(path string, bytes uint64) (bool, error)

Reserve tells the VolumeService that you want to reserve space on the local staging volume. Param path is the file path you're reserving space for, and bytes is the number of bytes you want to reserve.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL