scaling

package
v0.0.0-...-37f5ccb
Warning

This package is not in the latest version of its module.

Published: Aug 11, 2023 License: Apache-2.0 Imports: 9 Imported by: 0

Documentation

Constants

const (
	// DefaultMinReplicas is the minimal amount of replicas for a service.
	DefaultMinReplicas = 1

	// DefaultMaxReplicas is the amount of replicas a service will auto-scale up to.
	DefaultMaxReplicas = 5

	DefaultZeroDuration = 3 * time.Minute

	// DefaultScalingFactor is the defining proportion for the scaling increments.
	DefaultScalingFactor = 10

	ScaleTypeRPS      ScaleType = "rps"
	ScaleTypeCapacity ScaleType = "capacity"

	// MinScaleLabel is the label indicating the minimum scale for an Inference.
	MinScaleLabel = "ai.tensorchord.scale.min"

	// MaxScaleLabel is the label indicating the maximum scale for an Inference.
	MaxScaleLabel = "ai.tensorchord.scale.max"

	// ScalingFactorLabel is the label indicating the scaling factor for an Inference.
	ScalingFactorLabel = "ai.tensorchord.scale.factor"

	// TargetLoadLabel is the label indicating the target load for an Inference.
	TargetLoadLabel = "ai.tensorchord.scale.target"

	// ZeroDurationLabel is the label indicating the zero duration for an Inference.
	ZeroDurationLabel = "ai.tensorchord.scale.zero-duration"

	// ScaleTypeLabel is the label indicating the scale type for an Inference.
	ScaleTypeLabel = "ai.tensorchord.scale.type"

	FrameworkLabel = "ai.tensorchord.framework"
)
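
The label constants and defaults above suggest a lookup-with-fallback pattern. The following is a minimal sketch of that pattern, not the package's own code: `replicasFromLabel` is a hypothetical helper, and the constants are copied locally only so the example compiles on its own.

```go
package main

import (
	"fmt"
	"strconv"
)

// Values mirrored from the constants listed above so this sketch is self-contained.
const (
	DefaultMinReplicas = 1
	DefaultMaxReplicas = 5

	MinScaleLabel = "ai.tensorchord.scale.min"
	MaxScaleLabel = "ai.tensorchord.scale.max"
)

// replicasFromLabel is a hypothetical helper (not part of the package).
// It reads an unsigned integer from labels[key] and returns fallback
// when the label is missing or cannot be parsed.
func replicasFromLabel(labels map[string]string, key string, fallback uint64) uint64 {
	raw, ok := labels[key]
	if !ok {
		return fallback
	}
	n, err := strconv.ParseUint(raw, 10, 64)
	if err != nil {
		return fallback
	}
	return n
}

func main() {
	labels := map[string]string{MaxScaleLabel: "8"}

	// The min label is absent, so the default applies; the max label is parsed.
	fmt.Println(replicasFromLabel(labels, MinScaleLabel, DefaultMinReplicas))
	fmt.Println(replicasFromLabel(labels, MaxScaleLabel, DefaultMaxReplicas))
}
```

Falling back silently on a malformed label is a choice made for this sketch; a real caller may prefer to surface the parse error.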

Variables

This section is empty.

Functions

func Retry

func Retry(r routine, label string, attempts int, interval time.Duration) error
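
The unexported `routine` type is not shown on this page, so its shape is unknown here. The sketch below assumes it is `func(attempt int) error` and illustrates a fixed-interval retry loop with the same parameter list. `retrySketch` is a hypothetical stand-in, not the package's `Retry`.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// routine is ASSUMED to have this shape; the real type is unexported
// and not documented on this page.
type routine func(attempt int) error

// errNotReady is a placeholder error used by the demo below.
var errNotReady = errors.New("not ready")

// retrySketch calls r up to attempts times, sleeping interval between
// failed attempts. It returns nil on the first success, otherwise the
// last error seen.
func retrySketch(r routine, label string, attempts int, interval time.Duration) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = r(i); err == nil {
			return nil
		}
		fmt.Printf("[%s] attempt %d/%d failed: %v\n", label, i+1, attempts, err)
		if i < attempts-1 {
			time.Sleep(interval)
		}
	}
	return err
}

func main() {
	calls := 0
	err := retrySketch(func(attempt int) error {
		calls++
		if attempt < 2 {
			return errNotReady // fail on the first two attempts
		}
		return nil
	}, "demo", 5, 10*time.Millisecond)

	fmt.Println(calls, err)
}
```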

Types

type FunctionScaleResult

type FunctionScaleResult struct {
	Available bool
	Error     error
	Found     bool
	Duration  time.Duration
}

FunctionScaleResult holds the result of scaling from zero

type InferenceScaler

type InferenceScaler struct {
	// contains filtered or unexported fields
}

InferenceScaler scales from zero

func NewInferenceScaler

func NewInferenceScaler(r runtime.Runtime,
	defaultTTL time.Duration) (*InferenceScaler, error)

NewInferenceScaler creates a new InferenceScaler from the given runtime and default TTL.

func (*InferenceScaler) Scale

func (s *InferenceScaler) Scale(ctx context.Context,
	namespace, inferenceName string) FunctionScaleResult

Scale scales a function from zero replicas to 1, or to the value set in the minimum replicas metadata.
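
A caller of Scale has to decide what to do with the returned FunctionScaleResult. One possible interpretation is sketched below. It assumes, from the field names only, that `Found` reports whether the inference exists and `Available` reports whether it is ready to serve. `statusFor` is a hypothetical helper, and the struct is mirrored locally so the example compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"time"
)

// FunctionScaleResult mirrors the struct documented above.
type FunctionScaleResult struct {
	Available bool
	Error     error
	Found     bool
	Duration  time.Duration
}

// errDemo is a placeholder error for the demo below.
var errDemo = errors.New("scale failed")

// statusFor is a hypothetical mapping from a scale result to an HTTP
// status code; the field semantics are assumptions based on their names.
func statusFor(res FunctionScaleResult) int {
	switch {
	case !res.Found:
		return http.StatusNotFound
	case res.Error != nil:
		return http.StatusInternalServerError
	case !res.Available:
		return http.StatusServiceUnavailable
	default:
		return http.StatusOK
	}
}

func main() {
	ok := FunctionScaleResult{Found: true, Available: true, Duration: 2 * time.Second}
	failed := FunctionScaleResult{Found: true, Error: errDemo}

	fmt.Println(statusFor(ok), statusFor(failed))
}
```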

type ScaleType

type ScaleType string

type ServiceQuery

type ServiceQuery interface {
	GetReplicas(service, namespace string) (response ServiceQueryResponse, err error)
	SetReplicas(service, namespace string, count uint64) error
}

ServiceQuery provides interface for replica querying/setting

type ServiceQueryResponse

type ServiceQueryResponse struct {
	Framework         string
	TargetLoad        uint64
	ZeroDuration      time.Duration
	Replicas          uint64
	MaxReplicas       uint64
	MinReplicas       uint64
	ScalingFactor     uint64
	AvailableReplicas uint64
	Annotations       map[string]string
}

ServiceQueryResponse is the response from querying a function's status.
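
Because ServiceQuery is an interface, it can be satisfied by a test double. The sketch below shows a hypothetical in-memory implementation. `memoryQuery` and its `namespace/service` key scheme are illustrative choices, and both types are mirrored locally so the example compiles on its own.

```go
package main

import (
	"fmt"
	"time"
)

// ServiceQueryResponse and ServiceQuery mirror the types documented above.
type ServiceQueryResponse struct {
	Framework         string
	TargetLoad        uint64
	ZeroDuration      time.Duration
	Replicas          uint64
	MaxReplicas       uint64
	MinReplicas       uint64
	ScalingFactor     uint64
	AvailableReplicas uint64
	Annotations       map[string]string
}

type ServiceQuery interface {
	GetReplicas(service, namespace string) (response ServiceQueryResponse, err error)
	SetReplicas(service, namespace string, count uint64) error
}

// memoryQuery is a hypothetical in-memory ServiceQuery, keyed by "namespace/service".
type memoryQuery struct {
	services map[string]ServiceQueryResponse
}

// Compile-time check that memoryQuery satisfies the interface.
var _ ServiceQuery = (*memoryQuery)(nil)

func (m *memoryQuery) GetReplicas(service, namespace string) (ServiceQueryResponse, error) {
	resp, ok := m.services[namespace+"/"+service]
	if !ok {
		return ServiceQueryResponse{}, fmt.Errorf("service %q not found in namespace %q", service, namespace)
	}
	return resp, nil
}

func (m *memoryQuery) SetReplicas(service, namespace string, count uint64) error {
	key := namespace + "/" + service
	resp, ok := m.services[key]
	if !ok {
		return fmt.Errorf("service %q not found in namespace %q", service, namespace)
	}
	resp.Replicas = count
	m.services[key] = resp // map values are copies, so write the update back
	return nil
}

func main() {
	q := &memoryQuery{services: map[string]ServiceQueryResponse{
		"default/demo": {MinReplicas: 1, MaxReplicas: 5},
	}}
	if err := q.SetReplicas("demo", "default", 3); err != nil {
		fmt.Println(err)
		return
	}
	resp, _ := q.GetReplicas("demo", "default")
	fmt.Println(resp.Replicas)
}
```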

func AsServerQueryResponse

func AsServerQueryResponse(inf *types.InferenceDeployment) (*ServiceQueryResponse, error)
