dyndump: github.com/gwatts/dyndump/dyndump

package dyndump

import "github.com/gwatts/dyndump/dyndump"

Package dyndump exports and imports an entire DynamoDB table.

It supports parallel connections to the DynamoDB and S3 services for increased throughput along with rate limiting to a specific read & write capacity.

Items are written to an ItemWriter interface until the table is exhausted, or the Stop method is called.

It also provides an S3Writer type that can be passed to a Fetcher to stream received data to an S3 bucket.


Package Files

doc.go fetcher.go json.go loader.go metadata.go s3deleter.go s3reader.go s3writer.go util.go


const (
    // DefaultPartSize sets the default maximum size of objects sent to S3.
    DefaultPartSize = 50 * 1024 * 1024 // 50 MiB

    // DefaultS3MaxParallel sets the default maximum number of concurrent
    // write requests for S3.
    DefaultS3MaxParallel = 2

    // MinPartSize defines the minimum value that can be used for PartSize.
    MinPartSize = 1000
)

type DynPuter

type DynPuter interface {
    PutItem(input *dynamodb.PutItemInput) (*dynamodb.PutItemOutput, error)
}

DynPuter defines the portion of the DynamoDB service the Loader requires.

type DynScanner

type DynScanner interface {
    Scan(input *dynamodb.ScanInput) (*dynamodb.ScanOutput, error)
}

DynScanner defines the portion of the DynamoDB service that Fetcher requires.

type Fetcher

type Fetcher struct {
    Dyn            DynScanner
    TableName      string
    ConsistentRead bool       // Setting to true will use double the read capacity.
    MaxParallel    int        // Maximum number of parallel requests to make to Dynamo.
    MaxItems       int64      // Approximate maximum number of items to read from Dynamo.
    ReadCapacity   float64    // Average global read capacity to use for the scan.
    Writer         ItemWriter // Retrieved items are sent to this ItemWriter.
    // contains filtered or unexported fields
}

Fetcher fetches data from DynamoDB at a specified capacity and writes fetched items to a writer implementing the ItemWriter interface.

func (*Fetcher) Run

func (f *Fetcher) Run() error

Run executes the fetcher, starting as many parallel reads as specified by the MaxParallel option and returns when the read has finished, failed, or been stopped.
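To make the wiring concrete, here is a minimal sketch of a full-table dump to a local gzipped JSON file, using a SimpleEncoder as the ItemWriter. The table name, file name, and capacity values are hypothetical, and default AWS credentials in the environment are assumed:

```go
package main

import (
	"compress/gzip"
	"log"
	"os"

	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/dynamodb"
	"github.com/gwatts/dyndump/dyndump"
)

func main() {
	out, err := os.Create("mytable.json.gz") // local backup file
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()
	gz := gzip.NewWriter(out)
	defer gz.Close()

	f := &dyndump.Fetcher{
		Dyn:          dynamodb.New(session.Must(session.NewSession())),
		TableName:    "mytable", // hypothetical table name
		MaxParallel:  4,         // four concurrent scan requests
		ReadCapacity: 10,        // target ~10 read capacity units
		Writer:       dyndump.NewSimpleEncoder(gz), // JSON-encode items into the gzip stream
	}
	if err := f.Run(); err != nil {
		log.Fatal(err)
	}
	log.Printf("read %d items (%d bytes)", f.Stats().ItemsRead, f.Stats().BytesRead)
}
```

Because Run blocks until the scan completes, Stop or Stats would be called from another goroutine in a real program.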

func (*Fetcher) Stats

func (f *Fetcher) Stats() FetcherStats

Stats returns current aggregate statistics about an ongoing or completed run. It is safe to call from concurrent goroutines.

func (*Fetcher) Stop

func (f *Fetcher) Stop()

Stop requests a clean shutdown of active readers. Active readers will complete the current request and then exit.

type FetcherStats

type FetcherStats struct {
    ItemsRead    int64
    BytesRead    int64
    CapacityUsed float64
}

FetcherStats is returned by Fetcher.Stats and reports current global throughput statistics.

type ItemReader

type ItemReader interface {
    ReadItem() (item map[string]*dynamodb.AttributeValue, err error)
}

ItemReader is the interface expected by a Loader to retrieve items from a source for loading into a DynamoDB table.

type ItemWriter

type ItemWriter interface {
    WriteItem(item map[string]*dynamodb.AttributeValue) error
}

ItemWriter is the interface expected by a Fetcher when writing retrieved DynamoDB items. Implementations must support writes from concurrent goroutines.

type Loader

type Loader struct {
    Dyn            DynPuter
    TableName      string     // Table name to restore to.
    MaxParallel    int        // Maximum number of put operations to execute concurrently.
    MaxItems       int64      // Approximate maximum number of items to write to Dynamo.
    WriteCapacity  float64    // Maximum Dynamo write capacity to use for writes.
    Source         ItemReader // The source to fetch items from.
    AllowOverwrite bool       // If true then any existing records will be overwritten.
    HashKey        string     // The attribute name of the hash key for the table.
    // contains filtered or unexported fields
}

Loader reads records from an ItemReader and loads them into a DynamoDB table.

func (*Loader) Run

func (ld *Loader) Run() error

Run executes the loader, starting goroutines to execute parallel puts as required. Returns when the load has finished, failed or been stopped.
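A restore is the mirror image of a dump; the following sketch feeds a gzipped JSON backup through a SimpleDecoder into a Loader. The target table, file name, hash key, and capacity values are hypothetical:

```go
package main

import (
	"compress/gzip"
	"log"
	"os"

	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/dynamodb"
	"github.com/gwatts/dyndump/dyndump"
)

func main() {
	in, err := os.Open("mytable.json.gz") // backup produced by a Fetcher/SimpleEncoder
	if err != nil {
		log.Fatal(err)
	}
	defer in.Close()
	gz, err := gzip.NewReader(in)
	if err != nil {
		log.Fatal(err)
	}

	ld := &dyndump.Loader{
		Dyn:            dynamodb.New(session.Must(session.NewSession())),
		TableName:      "mytable-restore", // hypothetical target table
		MaxParallel:    4,
		WriteCapacity:  25,    // target ~25 write capacity units
		Source:         dyndump.NewSimpleDecoder(gz),
		AllowOverwrite: false, // skip items that already exist
		HashKey:        "id",  // hypothetical hash key attribute
	}
	if err := ld.Run(); err != nil {
		log.Fatal(err)
	}
	st := ld.Stats()
	log.Printf("wrote %d items, skipped %d", st.ItemsWritten, st.ItemsSkipped)
}
```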

func (*Loader) Stats

func (ld *Loader) Stats() LoaderStats

Stats returns the current loader statistics.

func (*Loader) Stop

func (ld *Loader) Stop()

Stop requests a clean shutdown of current put operations. It does not block. It will cause Run to exit when the loaders finish.

type LoaderStats

type LoaderStats struct {
    ItemsWritten int64
    ItemsSkipped int64
    BytesWritten int64
    CapacityUsed float64
}

LoaderStats is returned by Loader.Stats.

type Metadata

type Metadata struct {
    TableName         string             `json:"table_name"`
    TableARN          string             `json:"table_arn"`
    Status            MetadataStatus     `json:"status"`             // "running", "failed" or "completed"
    Type              MetadataBackupType `json:"backup_type"`        // "full" or "query"
    StartTime         time.Time          `json:"backup_start_time"`  // The time the backup started.
    EndTime           *time.Time         `json:"backup_end_time"`    // The time the backup completed, or failed.
    UncompressedBytes int64              `json:"uncompressed_bytes"` // Size of the uncompressed JSON, in bytes.
    CompressedBytes   int64              `json:"compressed_bytes"`   // Size of the gzipped JSON, in bytes.
    ItemCount         int64              `json:"item_count"`         // Number of items in the backup.
    PartCount         int64              `json:"part_count"`         // Number of S3 objects comprising the backup.
}

Metadata is stored alongside backups pushed to S3.

type MetadataBackupType

type MetadataBackupType string

MetadataBackupType represents the type or mode of backup.

const (
    // BackupFull is set on a complete backup of a DynamoDB table.
    BackupFull MetadataBackupType = "full"

    // BackupQuery is set on a selective backup of a DynamoDB table.
    BackupQuery MetadataBackupType = "query"
)

type MetadataStatus

type MetadataStatus string

MetadataStatus represents the state of the backup.

const (
    // StatusRunning represents a backup in progress.
    StatusRunning MetadataStatus = "running"

    // StatusFailed represents an aborted or failed backup.
    StatusFailed MetadataStatus = "failed"

    // StatusCompleted represents a successfully completed backup.
    StatusCompleted MetadataStatus = "completed"
)

type S3DeleteGetLister

type S3DeleteGetLister interface {
    DeleteObjects(input *s3.DeleteObjectsInput) (*s3.DeleteObjectsOutput, error)
}

S3DeleteGetLister defines the portion of the S3 service required by S3Deleter.

type S3Deleter

type S3Deleter struct {
    // contains filtered or unexported fields
}

S3Deleter deletes all parts of a Dynamo backup from S3.

Given a bucket and path prefix, it will check that the backup has a valid metadata file and then remove all of the parts that are associated with it, before finally removing the metadata file itself.

func NewS3Deleter

func NewS3Deleter(s3 S3DeleteGetLister, bucket, pathPrefix string) (*S3Deleter, error)

NewS3Deleter creates and initializes an S3Deleter. It will attempt to fetch the metadata object from S3 before returning to confirm that a valid backup actually exists at the given pathPrefix.

func (*S3Deleter) Abort

func (d *S3Deleter) Abort()

Abort requests that the deleter stop deleting the backup.

func (*S3Deleter) Completed

func (d *S3Deleter) Completed() int64

Completed returns the number of parts that have been deleted from S3 so far. It may be called while a delete is in progress.

func (*S3Deleter) Delete

func (d *S3Deleter) Delete() (err error)

Delete starts deleting the configured backup. It will block until the delete operations complete.
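A short sketch of deleting a backup; the bucket and prefix names are hypothetical, and default AWS credentials are assumed:

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"github.com/gwatts/dyndump/dyndump"
)

func main() {
	svc := s3.New(session.Must(session.NewSession()))

	// NewS3Deleter fails if no valid backup metadata exists at the prefix.
	del, err := dyndump.NewS3Deleter(svc, "my-backup-bucket", "backups/mytable")
	if err != nil {
		log.Fatal(err)
	}
	md := del.Metadata()
	log.Printf("deleting backup of %s (%d parts)", md.TableName, md.PartCount)

	if err := del.Delete(); err != nil { // blocks until all parts are removed
		log.Fatal(err)
	}
	log.Printf("deleted %d parts", del.Completed())
}
```

Since Delete blocks, Abort and Completed would be called from another goroutine, e.g. to report progress.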

func (*S3Deleter) Metadata

func (d *S3Deleter) Metadata() Metadata

Metadata returns the metadata read by NewS3Deleter.

type S3GetLister

type S3GetLister interface {
    GetObject(input *s3.GetObjectInput) (*s3.GetObjectOutput, error)
    ListObjectsPages(input *s3.ListObjectsInput, fn func(p *s3.ListObjectsOutput, lastPage bool) (shouldContinue bool)) error
}

S3GetLister defines the portion of the S3 service required by S3Reader.

type S3Puter

type S3Puter interface {
    PutObject(input *s3.PutObjectInput) (*s3.PutObjectOutput, error)
}

S3Puter defines the portion of the S3 service required by S3Writer.

type S3Reader

type S3Reader struct {
    S3         S3GetLister
    Bucket     string // Bucket is the name of the S3 bucket to read from.
    PathPrefix string // PathPrefix is the prefix used to store the backup.
    // contains filtered or unexported fields
}

S3Reader reads raw decompressed data from S3 and exposes it as a single byte stream by implementing the io.Reader interface.

func (*S3Reader) Metadata

func (r *S3Reader) Metadata() (md Metadata, err error)

Metadata returns the backup's metadata information.

func (*S3Reader) Read

func (r *S3Reader) Read(p []byte) (n int, err error)

Read reads a block of data from the backup. It is not safe to call this concurrently from different goroutines.
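Because S3Reader implements io.Reader, a backup can be streamed with standard library tools. A sketch that copies a backup to a local file, with hypothetical bucket, prefix, and file names:

```go
package main

import (
	"io"
	"log"
	"os"

	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"github.com/gwatts/dyndump/dyndump"
)

func main() {
	r := &dyndump.S3Reader{
		S3:         s3.New(session.Must(session.NewSession())),
		Bucket:     "my-backup-bucket", // hypothetical bucket
		PathPrefix: "backups/mytable",  // hypothetical backup prefix
	}

	md, err := r.Metadata()
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("downloading backup of %s (%d items)", md.TableName, md.ItemCount)

	out, err := os.Create("mytable.dump")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	// io.Copy drives Read, which stitches the S3 parts into one stream.
	if _, err := io.Copy(out, r); err != nil {
		log.Fatal(err)
	}
}
```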

type S3Writer

type S3Writer struct {
    S3          S3Puter
    Bucket      string // S3 bucket name to upload to.
    PathPrefix  string // Prefix to apply to each part of the backup.
    PartSize    int    // Maximum number of bytes to store in each part.
    MaxParallel int    // Maximum number of parallel uploads to perform to S3.
    // contains filtered or unexported fields
}

S3Writer takes a stream of JSON data and uploads it in parallel to S3.

It divides the stream into multiple pieces which store a maximum of approximately PartSize bytes each.

Each part is given a key name beginning with PathPrefix. On completion, a metadata file summarizing the table is also uploaded.

func NewS3Writer

func NewS3Writer(s3 S3Puter, bucket, pathPrefix string, metadata Metadata) *S3Writer

NewS3Writer creates and initializes a new S3Writer.
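One plausible wiring of a Fetcher directly to S3 via an S3Writer is sketched below. Run is started in a goroutine so that writes can proceed, and Close is called once the fetch completes; the bucket, prefix, table name, and capacity values are hypothetical:

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/dynamodb"
	"github.com/aws/aws-sdk-go/service/s3"
	"github.com/gwatts/dyndump/dyndump"
)

func main() {
	sess := session.Must(session.NewSession())

	w := dyndump.NewS3Writer(s3.New(sess), "my-backup-bucket", "backups/mytable",
		dyndump.Metadata{TableName: "mytable"}) // hypothetical bucket, prefix and table

	uploadErr := make(chan error, 1)
	go func() { uploadErr <- w.Run() }() // upload parts as data arrives

	f := &dyndump.Fetcher{
		Dyn:          dynamodb.New(sess),
		TableName:    "mytable",
		MaxParallel:  4,
		ReadCapacity: 10,
		Writer:       dyndump.NewSimpleEncoder(w), // JSON items flow straight into the S3 stream
	}
	if err := f.Run(); err != nil {
		w.Abort() // mark the backup metadata as failed
		log.Fatal(err)
	}
	if err := w.Close(); err != nil { // finish uploads; causes Run to exit
		log.Fatal(err)
	}
	if err := <-uploadErr; err != nil {
		log.Fatal(err)
	}
}
```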

func (*S3Writer) Abort

func (w *S3Writer) Abort() error

Abort closes the writer and marks the metadata state as failed.

func (*S3Writer) Close

func (w *S3Writer) Close() error

Close causes the writers to finish processing their uploads and will cause Run to exit once they finish.

func (*S3Writer) Run

func (w *S3Writer) Run() error

Run starts goroutines to feed incoming data sent to Write to S3.

func (*S3Writer) Write

func (w *S3Writer) Write(p []byte) (n int, err error)

Write takes a single block of JSON text and sends it to S3. It will return an error if a Put to S3 has failed.

type SimpleDecoder

type SimpleDecoder struct {
    // contains filtered or unexported fields
}

SimpleDecoder implements the ItemReader interface, converting JSON entries into DynamoDB attribute items.

func NewSimpleDecoder

func NewSimpleDecoder(r io.Reader) *SimpleDecoder

NewSimpleDecoder creates and initializes a new SimpleDecoder.

func (*SimpleDecoder) ReadItem

func (d *SimpleDecoder) ReadItem() (item map[string]*dynamodb.AttributeValue, err error)

ReadItem implements ItemReader.

type SimpleEncoder

type SimpleEncoder struct {
    // contains filtered or unexported fields
}

SimpleEncoder implements the ItemWriter interface to convert DynamoDB items to a JSON stream.

func NewSimpleEncoder

func NewSimpleEncoder(w io.Writer) *SimpleEncoder

NewSimpleEncoder creates and initializes a new SimpleEncoder.

func (*SimpleEncoder) WriteItem

func (e *SimpleEncoder) WriteItem(item map[string]*dynamodb.AttributeValue) error

WriteItem implements ItemWriter.

Package dyndump imports 19 packages and is imported by 1 package. Updated 2016-07-30.