pullman

package
v0.12.0
Published: Dec 29, 2023 License: Apache-2.0 Imports: 9 Imported by: 0

README

PullMan

Manage your file pulls with PullMan. The primary use-case is supporting a long-running process that can have remote repositories configured dynamically and concurrent pulls from those repositories handled efficiently.

This project is a work in progress.

API

There is a single method in the functional API:

func (p *PullManager) Pull(ctx context.Context, pc PullCommand) error

The PullCommand contains all the information needed to process a request to pull resources. A PullManager instance is intended to be used concurrently, with multiple goroutines calling Pull().

See Concepts for details.

Example Usage
package main

import (
	"context"
	"encoding/json"
	"fmt"

	"github.com/go-logr/zapr"
	"go.uber.org/zap"

	"github.com/kserve/modelmesh-runtime-adapter/pullman"
	_ "github.com/kserve/modelmesh-runtime-adapter/pullman/storageproviders/http"
)

func main() {
	// set-up the logger
	zaplog, err := zap.NewDevelopment()
	if err != nil {
		panic("Error creating logger...")
	}
	// create a manager
	manager := pullman.NewPullManager(zapr.NewLogger(zaplog))

	// construct the PullCommand
	configJSON := []byte(`{
		"type": "http",
		"url": "http://httpbin.org"
	}`)
	rc := &pullman.RepositoryConfig{}
	if err := json.Unmarshal(configJSON, rc); err != nil {
		panic(fmt.Sprintf("invalid repository config: %v", err))
	}

	pts := []pullman.Target{
		{
			RemotePath: "uuid",
		},
		{
			RemotePath: "/image/jpeg",
			LocalPath:  "random_image.jpg",
		},
	}

	pc := pullman.PullCommand{
		RepositoryConfig: rc,
		Directory:        "./output",
		Targets:          pts,
	}

	pullErr := manager.Pull(context.Background(), pc)
	if pullErr != nil {
		fmt.Printf("Failed to pull files: %v\n", pullErr)
	}
}

Executing the above code results in two files being downloaded:

  • a random JPEG image at output/random_image.jpg
  • a file containing JSON with a uuid at output/uuid

Concepts

Storage Provider

The StorageProvider interface abstracts the creation of clients for a remote service that files can be pulled from. For generic providers, a service can be identified by its communication protocol (s3, http, ftp, etc.). A StorageProvider is identified by a string type, and the available storage providers are typically registered with PullMan at start-up via an init function:

func init() {
	p := Provider{
		// some configurations for the provider
	}
	pullman.RegisterProvider(providerType, p)
}

This allows a user of PullMan to control which provider implementations are available. A StorageProvider is a factory for RepositoryClients and creates them from a provider-specific configuration abstracted as a Config.
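
The registration pattern can be sketched with a small self-contained registry. This is not PullMan's implementation: `Provider`, `httpProvider`, `register` and `lookup` are hypothetical names, and the real `RegisterProvider` may behave differently, for example on duplicate registration.

```go
package main

import (
	"fmt"
	"sync"
)

// Provider is a hypothetical, trimmed-down stand-in for a storage provider.
type Provider interface {
	Name() string
}

type httpProvider struct{}

func (httpProvider) Name() string { return "http" }

// registry maps a provider type string to its implementation.
var (
	registryMu sync.RWMutex
	registry   = map[string]Provider{}
)

// register stores a provider under its type string. In this sketch a later
// registration silently replaces an earlier one.
func register(providerType string, p Provider) {
	registryMu.Lock()
	defer registryMu.Unlock()
	registry[providerType] = p
}

// lookup returns the provider registered for the given type, if any.
func lookup(providerType string) (Provider, bool) {
	registryMu.RLock()
	defer registryMu.RUnlock()
	p, ok := registry[providerType]
	return p, ok
}

// init runs when the package is loaded, mirroring how a provider package
// would register itself when imported for side effects.
func init() {
	register("http", httpProvider{})
}

func main() {
	if p, ok := lookup("http"); ok {
		fmt.Println("found provider:", p.Name())
	}
	if _, ok := lookup("ftp"); !ok {
		fmt.Println("no provider registered for ftp")
	}
}
```

Because registration happens in init, a blank import of a provider package (as in the Example Usage above) is enough to make that provider available.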

Repository Client

A RepositoryClient encapsulates the connections to a remote service and knows how to pull resources from it.

Creating and updating RepositoryClient instances can happen dynamically and asynchronously from pulling any resources. PullMan manages a cache of RepositoryClients based on requests it has processed and will re-use clients where possible.
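
The client re-use described above can be pictured as a keyed cache. The sketch below is a simplified assumption of how such a cache might look: `client`, `clientCache` and `get` are hypothetical names, and it omits the expiry and clean-up behavior that the `CacheOptions` function suggests the real cache has.

```go
package main

import (
	"fmt"
	"sync"
)

// client is a hypothetical stand-in for a RepositoryClient.
type client struct {
	key string
}

// clientCache hands out one client per key and counts how many it created.
type clientCache struct {
	mu      sync.Mutex
	clients map[string]*client
	created int
}

func newClientCache() *clientCache {
	return &clientCache{clients: map[string]*client{}}
}

// get returns the cached client for key, creating it on first use.
func (c *clientCache) get(key string) *client {
	c.mu.Lock()
	defer c.mu.Unlock()
	if cl, ok := c.clients[key]; ok {
		return cl
	}
	cl := &client{key: key}
	c.clients[key] = cl
	c.created++
	return cl
}

func main() {
	cache := newClientCache()
	a := cache.get("http|http://httpbin.org")
	b := cache.get("http|http://httpbin.org")
	fmt.Println("same client re-used:", a == b, "clients created:", cache.created)
}
```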

Pull Command

A PullCommand contains all the information PullMan needs to process a request to pull resources. Both remote and local resources are identified by paths. LocalPath is a filesystem path, and RemotePath is always composed of segments separated by forward slashes. The RemotePath may point to a single resource or to an abstraction that groups multiple resources (analogous to a directory). The definition of a "directory" may differ between storage providers but must always be compatible with a filesystem path. For example, an HTTP request that gets a multipart/form-data body as a response could result in writing multiple files when pulling that resource.

// Represents the request to the puller to be fulfilled
type PullCommand struct {
	// repository from which files will be pulled
	RepositoryConfig Config
	// local directory where files will be pulled to
	Directory string
	// the list of targets to be pulled
	Targets []Target
}

type Target struct {
	// remote path to the desired resource(s)
	RemotePath string
	// path to local file to pull the resource to (may have default based on RemotePath)
	LocalPath string
}

Documentation

Overview

Copyright 2021 IBM Corporation

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CacheOptions

func CacheOptions(ttl time.Duration, cleanupPeriod time.Duration) func(*PullManager) error

func GetString

func GetString(c Config, key string) (string, bool)

GetString is a helper function that provides consistent behavior when working with Configs.
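
A helper of this shape can be sketched against the Config interface listed under Types. The `mapConfig` type and the lowercase `getString` are hypothetical. How the real GetString treats a value that is present but not a string is not stated here, so this sketch simply reports it as missing.

```go
package main

import "fmt"

// Config is redeclared from the Types section to keep the sketch runnable.
type Config interface {
	GetType() string
	Get(name string) (interface{}, bool)
}

// mapConfig is a hypothetical map-backed Config used only for illustration.
type mapConfig struct {
	typ    string
	values map[string]interface{}
}

func (m mapConfig) GetType() string { return m.typ }

func (m mapConfig) Get(name string) (interface{}, bool) {
	v, ok := m.values[name]
	return v, ok
}

// getString returns the value for key when it is present and is a string.
// Assumption: non-string values are treated the same as missing keys.
func getString(c Config, key string) (string, bool) {
	v, ok := c.Get(key)
	if !ok {
		return "", false
	}
	s, ok := v.(string)
	return s, ok
}

func main() {
	cfg := mapConfig{
		typ:    "http",
		values: map[string]interface{}{"url": "http://httpbin.org", "retries": 3},
	}
	url, ok := getString(cfg, "url")
	fmt.Println(url, ok)
	_, ok = getString(cfg, "retries")
	fmt.Println("retries is a string:", ok)
}
```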

func HashStrings

func HashStrings(strings ...string) string

HashStrings generates a hash from the concatenation of the passed strings. It provides a common way for providers to implement GetKey when some of the configuration's values are considered secret: use the hash as part of the key instead of the raw values.
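
The idea of keeping secrets out of a cache key can be sketched as follows. The choice of SHA-256 with hex encoding is an assumption made for illustration, since the hash the package actually uses is not stated here. The lowercase `hashStrings`, the key layout and the credential values are likewise hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// hashStrings hashes the plain concatenation of its arguments.
// Assumption: SHA-256 with hex encoding stands in for the real algorithm.
func hashStrings(parts ...string) string {
	sum := sha256.Sum256([]byte(strings.Join(parts, "")))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Hypothetical key layout: non-secret fields in the clear, secret
	// fields represented only by their hash.
	endpoint := "https://storage.example.invalid"
	key := "s3|" + endpoint + "|" + hashStrings("my-access-key", "my-secret-key")
	fmt.Println(key)
}
```

With plain concatenation, `hashStrings("a", "b")` and `hashStrings("ab")` produce the same value in this sketch. A caller who needs those cases to differ would have to add a separator.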

func OpenFile

func OpenFile(path string) (*os.File, error)

OpenFile will check the path and the filesystem for mismatch errors

func RegisterProvider

func RegisterProvider(providerType string, provider StorageProvider)

RegisterProvider should only be called when initializing the application

Types

type Config

type Config interface {
	// GetType returns the type of the config
	GetType() string

	// Get returns a key's value and a bool if it was specified
	Get(name string) (interface{}, bool)
}

Config represents simple key/value configuration with a type/class

type PullCommand

type PullCommand struct {
	// repository from which files will be pulled
	RepositoryConfig Config
	// local directory where files will be pulled to
	Directory string
	// the list of paths referring to resources to be pulled
	Targets []Target
}

Represents the command sent to PullMan to be fulfilled

type PullManager

type PullManager struct {
	// contains filtered or unexported fields
}

func NewPullManager

func NewPullManager(log logr.Logger, options ...func(*PullManager)) *PullManager

func (*PullManager) Pull

func (p *PullManager) Pull(ctx context.Context, pc PullCommand) error

Pull processes the PullCommand, pulling files to the local filesystem

type RepositoryClient

type RepositoryClient interface {
	Pull(context.Context, PullCommand) error
}

A RepositoryClient is the worker that executes a PullCommand

type RepositoryConfig

type RepositoryConfig struct {
	// contains filtered or unexported fields
}

Generic config abstraction used by PullMan

func NewRepositoryConfig

func NewRepositoryConfig(storageType string, config map[string]interface{}) *RepositoryConfig

func (*RepositoryConfig) Get

func (rc *RepositoryConfig) Get(key string) (interface{}, bool)

func (*RepositoryConfig) GetString

func (rc *RepositoryConfig) GetString(key string) (string, bool)

func (*RepositoryConfig) GetType

func (rc *RepositoryConfig) GetType() string

func (*RepositoryConfig) MarshalJSON

func (rc *RepositoryConfig) MarshalJSON() ([]byte, error)

func (*RepositoryConfig) Set

func (rc *RepositoryConfig) Set(key string, val interface{})

func (*RepositoryConfig) UnmarshalJSON

func (rc *RepositoryConfig) UnmarshalJSON(bs []byte) error

type StorageProvider

type StorageProvider interface {
	NewRepository(config Config, log logr.Logger) (RepositoryClient, error)
	// GetKey generates a string from the config that only includes fields
	// required to build the connection to the storage service. If the key
	// of two configs match, a single RepositoryClient must be able to
	// handle pulls with both configs.
	// Note: GetKey should not validate the config
	GetKey(config Config) string
}

A StorageProvider is a factory for RepositoryClients

type Target

type Target struct {
	// remote path to resource(s) to be pulled
	RemotePath string
	// filepath to write the file(s) to
	LocalPath string
}

Directories

Path    Synopsis
storageproviders
    gcs
    pvc
    s3
