gcsext

package module
v0.0.18 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 3, 2019 License: Apache-2.0 Imports: 21 Imported by: 0

README

google-cloudstorage-ext - Utilities for working with google cloud storage

testing

will unfourtuntely fail for non Kvantic users as list-permissions are required; will maybe be fixed later.

Examples

ReadAllByPrefix(ctx context.Context, bucket *storage.BucketHandle, prefix string) (io.ReadCloser, error)


import (
	"context"

	gcsext "github.com/kvanticoss/google-cloudstorage-ext"

	"cloud.google.com/go/storage"
)

func main(){
	ctx := context.Background()
	client, err := storage.NewClient(ctx)
	if err != nil {
		panic(err)
	}

	r, err := gcsext.ReadAllByPrefix(ctx, client.Bucket(baseBucket), "/some/gcs/path/in/the/bucket")
	if err != nil {
		panic(err)
	}
	// Do something with r which is a continous byte-stream of all objects in the bucket that
    // starts with the prefix provided. virtual folders ("placehodlers") are ignored. mostly useful
    // for reading json and text files.
}

Documentation

Index

Constants

This section is empty.

Variables

View Source
var (
	// ErrIteratorStop is returned by RecordIterators where there are not more records to be found.
	ErrIteratorStop = iterator.ErrIteratorStop
)

Functions

func CombineFilters added in v0.0.6

func CombineFilters(filters ...func(*storage.ObjectAttrs) bool) func(*storage.ObjectAttrs) bool

CombineFilters creates an iterator-filter function by "AND"-ing all filters.

func FilterOutVirtualGcsFolders added in v0.0.6

func FilterOutVirtualGcsFolders(objAttr *storage.ObjectAttrs) bool

FilterOutVirtualGcsFolders is a predicate function which removes the GCS virtual folders by requiring the name to end with "/" and the hash to match "placeholder" content

func FolderReadersByPrefixWithFilter added in v0.0.10

func FolderReadersByPrefixWithFilter(
	ctx context.Context,
	bucket *storage.BucketHandle,
	prefix string,
	predicate func(*storage.ObjectAttrs) bool,
) func() (string, []io.ReadCloser, error)

FolderReadersByPrefixWithFilter returns an interator which in turn returns (potentially uncompressed gzip) readers for each unique folder found under the prefix

func GetGCSWriterFactory added in v0.0.11

func GetGCSWriterFactory(ctx context.Context, bucket *storage.BucketHandle) writerfactory.WriterFactory

GetGCSWriterFactory returns a writer factory backed by GCS

func IterateJSONRecordsByFoldersSorted added in v0.0.11

func IterateJSONRecordsByFoldersSorted(
	ctx context.Context,
	bucket *storage.BucketHandle,
	prefix string,
	new func() interface{},
	predicate func(*storage.ObjectAttrs) bool,
) func() (string, interface{}, error)

IterateJSONRecordsByFoldersSorted returns a RecordIterator with the guarratee that records will come in sorted order (assumes the record implements the Lesser interface and that each object in the GCS folder is saved in a sorted order). Files between folders are not guarranteed to be sorted as folders are read sequencially

func IterateJSONRecordsByFoldersSortedCB added in v0.0.12

func IterateJSONRecordsByFoldersSortedCB(
	ctx context.Context,
	bucket *storage.BucketHandle,
	prefix string,
	new func() interface{},
	predicate func(*storage.ObjectAttrs) bool,
	callback func(string, func() (interface{}, error)) error,
) error

IterateJSONRecordsByFoldersSortedCB works like IterateJSONRecordsByFoldersSorted but through an callback pattern

func ReadAllByPrefix

func ReadAllByPrefix(ctx context.Context, bucket *storage.BucketHandle, prefix string) (io.ReadCloser, error)

ReadAllByPrefix Reads all files one into 1 combined bytestream. Autoamtically handles decompression of .gz First error will close the stream.

func ReadFilteredByPrefix

func ReadFilteredByPrefix(ctx context.Context, bucket *storage.BucketHandle, prefix string, predicate func(*storage.ObjectAttrs) bool) (io.ReadCloser, error)

ReadFilteredByPrefix Reads all files one into 1 combined bytestream. Autoamtically handles decompression of .gz only objects that predicate(*storage.ObjectAttrs) bool returns true will be kept First error will close the stream.

func ReadFoldersByPrefixWithFilter added in v0.0.10

func ReadFoldersByPrefixWithFilter(
	ctx context.Context,
	bucket *storage.BucketHandle,
	prefix string,
	predicate func(*storage.ObjectAttrs) bool,
) func() (string, io.ReadCloser, error)

ReadFoldersByPrefixWithFilter Reads all files one into 1 combined bytestream per folder, autoamtically handles decompression of .gz only objects that predicate(*storage.ObjectAttrs) bool returns true will be kept First error will close the stream.

func RemoveFolder added in v0.0.12

func RemoveFolder(
	ctx context.Context,
	bucket *storage.BucketHandle,
	prefix string,
	predicate func(*storage.ObjectAttrs) bool,
) error

RemoveFolder remove all contents under the specificed prefix; unless a predicate function is present and returns false Will stop and return on first error

func SortGCSFolders added in v0.0.12

func SortGCSFolders(
	ctx context.Context,
	bucket *storage.BucketHandle,
	prefix string,
	newer func() iterator.Lesser,
	srcPredicate func(*storage.ObjectAttrs) bool,
	destinationPrefix string,
	cacheFactory recordbuffer.ReadWriteResetterFactory,
	bo *backoff.RandExpBackoff,
	removeDuplicates bool,
	removeSrcOnSuccess bool,
) error

SortGCSFolders sorts all files picked up by the prefix + predicate and saves them into sorted NewLineJson under the filename given by destination prefix. If the destination prefix contains a .gz suffix the contents will be gzipped new line JSON

@ctx - context @bucket - *storage.BucketHandle to operate on @prefix - The prefix under which to start searching for folders. @newer - A factory for creating a new instances of records to unmarshal into; For sorting to take place the record must implment github.com/kvanticoss/goutils/iterator.Lesser @srcPredicate - an optional predicate to identify ONLY the files to be merged. @destinationPrefix - The destination file-name for merged files. The final name will be folder/destinationPrefix @cacheFactory - The bytesBuffer to use for soring. @bo - A backoff time; can be left null for default of 5 re-attempts with at least 15 sleep intervals @removeDuplicates - Should duplicated records be removed. @removeSrcOnSuccess - Should we remove the original files after compacting them. Will reuse srcPredicate for file removals

func TouchFile added in v0.0.8

func TouchFile(
	ctx context.Context,
	bucket *storage.BucketHandle,
	path string,
) (*storage.ObjectHandle, error)

TouchFile ensures a files exists by creating it if it doesn't exists and/or returning it otherwise. Will add gzip headers if the file ends in ".gz"

Types

type Lesser

type Lesser interface {
	Less(other interface{}) bool
}

Lesser is implemented by type which can be compared to each other and should answer i'm I/this/self less than the other record (argument 1)

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL