keepbalance

package
v0.0.0-...-288f078 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 26, 2024 License: AGPL-3.0, Apache-2.0, CC-BY-SA-3.0 Imports: 38 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

View Source
var Command = command{}

Functions

func EachCollection

func EachCollection(ctx context.Context, db *sqlx.DB, c *arvados.Client, f func(arvados.Collection) error, progress func(done, total int)) error

EachCollection calls f once for every readable collection. EachCollection stops if it encounters an error, such as f returning a non-nil error.

The progress function is called periodically with done (number of times f has been called) and total (number of times f is expected to be called).

Types

type Balancer

type Balancer struct {
	DB      *sqlx.DB
	Logger  logrus.FieldLogger
	Dumper  logrus.FieldLogger
	Metrics *metrics

	ChunkPrefix    string
	LostBlocksFile string

	*BlockStateMap
	KeepServices       map[string]*KeepService
	DefaultReplication int
	MinMtime           int64
	// contains filtered or unexported fields
}

Balancer compares the contents of keepstore servers with the collections stored in Arvados, and issues pull/trash requests needed to get (closer to) the optimal data layout.

In the optimal data layout: every data block referenced by a collection is replicated at least as many times as desired by the collection; there are no unreferenced data blocks older than BlobSignatureTTL; and all N existing replicas of a given data block are in the N best positions in rendezvous probe order.

func (*Balancer) CheckSanityEarly

func (bal *Balancer) CheckSanityEarly(c *arvados.Client) error

CheckSanityEarly checks for configuration and runtime errors that can be detected before GetCurrentState() and ComputeChangeSets() are called.

If it returns an error, it is pointless to run GetCurrentState or ComputeChangeSets: after doing so, the statistics would be meaningless and it would be dangerous to run any Commit methods.

func (*Balancer) CheckSanityLate

func (bal *Balancer) CheckSanityLate() error

CheckSanityLate checks for configuration and runtime errors after GetCurrentState() and ComputeChangeSets() have finished.

If it returns an error, it is dangerous to run any Commit methods.

func (*Balancer) ClearTrashLists

func (bal *Balancer) ClearTrashLists(ctx context.Context, c *arvados.Client) error

ClearTrashLists sends an empty trash list to each keep service. Calling this before GetCurrentState avoids races.

When a block appears in an index, we assume that replica will still exist after we delete other replicas on other servers. However, it's possible that a previous rebalancing operation made different decisions (e.g., servers were added/removed, and rendezvous order changed). In this case, the replica might already be on that server's trash list, and it might be deleted before we send a replacement trash list.

We avoid this problem if we clear all trash lists before getting indexes. (We also assume there is only one rebalancing process running at a time.)

func (*Balancer) CommitPulls

func (bal *Balancer) CommitPulls(ctx context.Context, c *arvados.Client) error

CommitPulls sends the computed lists of pull requests to the keepstore servers. This has the effect of increasing replication of existing blocks that are either underreplicated or poorly distributed according to rendezvous hashing.

func (*Balancer) CommitTrash

func (bal *Balancer) CommitTrash(ctx context.Context, c *arvados.Client) error

CommitTrash sends the computed lists of trash requests to the keepstore servers. This has the effect of deleting blocks that are overreplicated or unreferenced.

func (*Balancer) ComputeChangeSets

func (bal *Balancer) ComputeChangeSets()

ComputeChangeSets compares, for each known block, the current and desired replication states. If it is possible to get closer to the desired state by copying or deleting blocks, it adds those changes to the relevant KeepServices' ChangeSets.

It does not actually apply any of the computed changes.

func (*Balancer) DiscoverKeepServices

func (bal *Balancer) DiscoverKeepServices(c *arvados.Client) error

DiscoverKeepServices sets the list of KeepServices by calling the API to get a list of all services, and selecting the ones whose ServiceType is "disk"

func (*Balancer) GetCurrentState

func (bal *Balancer) GetCurrentState(ctx context.Context, c *arvados.Client, pageSize, bufs int) error

GetCurrentState determines the current replication state, and the desired replication level, for every block that is either retrievable or referenced.

It determines the current replication state by reading the block index from every known Keep service.

It determines the desired replication level by retrieving all collection manifests in the database (API server).

It encodes the resulting information in BlockStateMap.

func (*Balancer) PrintStatistics

func (bal *Balancer) PrintStatistics()

PrintStatistics writes statistics about the computed changes to bal.Logger. It should not be called until ComputeChangeSets has finished.

func (*Balancer) Run

func (bal *Balancer) Run(ctx context.Context, client *arvados.Client, cluster *arvados.Cluster, runOptions RunOptions) (nextRunOptions RunOptions, err error)

Run performs a balance operation using the given config and runOptions, and returns RunOptions suitable for passing to a subsequent balance operation.

Run should only be called once on a given Balancer object.

func (*Balancer) SetKeepServices

func (bal *Balancer) SetKeepServices(srvList arvados.KeepServiceList) error

SetKeepServices sets the list of KeepServices to operate on.

type BlockState

type BlockState struct {
	Refs     map[string]bool // pdh => true (only tracked when len(Replicas)==0)
	RefCount int
	Replicas []Replica
	Desired  map[string]int
}

BlockState indicates the desired storage class and number of replicas (according to the collections we know about) and the replicas actually stored (according to the keepstore indexes we know about).

type BlockStateMap

type BlockStateMap struct {
	// contains filtered or unexported fields
}

BlockStateMap is a goroutine-safe wrapper around a map[arvados.SizedDigest]*BlockState.

func NewBlockStateMap

func NewBlockStateMap() *BlockStateMap

NewBlockStateMap returns a newly allocated BlockStateMap.

func (*BlockStateMap) AddReplicas

func (bsm *BlockStateMap) AddReplicas(mnt *KeepMount, idx []arvados.KeepServiceIndexEntry)

AddReplicas updates the map to indicate that mnt has a replica of each block in idx.

func (*BlockStateMap) Apply

func (bsm *BlockStateMap) Apply(f func(arvados.SizedDigest, *BlockState))

Apply runs f on each entry in the map.

func (*BlockStateMap) GetConfirmedReplication

func (bsm *BlockStateMap) GetConfirmedReplication(blkids []arvados.SizedDigest, classes []string) int

GetConfirmedReplication returns the replication level of the given blocks, considering only the specified storage classes.

If len(classes)==0, returns the replication level without regard to storage classes.

Safe to call concurrently with other calls to GetCurrent, but not with different BlockStateMap methods.

func (*BlockStateMap) IncreaseDesired

func (bsm *BlockStateMap) IncreaseDesired(pdh string, classes []string, n int, blocks []arvados.SizedDigest)

IncreaseDesired updates the map to indicate the desired replication for the given blocks in the given storage class is at least n.

If pdh is non-empty, it will be tracked and reported in the "lost blocks" report.

type ChangeSet

type ChangeSet struct {
	PullLimit  int
	TrashLimit int

	Pulls           []Pull
	PullsDeferred   int // number that weren't added because of PullLimit
	Trashes         []Trash
	TrashesDeferred int // number that weren't added because of TrashLimit
	// contains filtered or unexported fields
}

ChangeSet is a set of change requests that will be sent to a keepstore server.

func (*ChangeSet) AddPull

func (cs *ChangeSet) AddPull(p Pull)

AddPull adds a Pull operation.

func (*ChangeSet) AddTrash

func (cs *ChangeSet) AddTrash(t Trash)

AddTrash adds a Trash operation

func (*ChangeSet) String

func (cs *ChangeSet) String() string

String implements fmt.Stringer.

type KeepMount

type KeepMount struct {
	arvados.KeepMount
	KeepService *KeepService
}

func (*KeepMount) String

func (mnt *KeepMount) String() string

String implements fmt.Stringer.

type KeepService

type KeepService struct {
	arvados.KeepService

	*ChangeSet
	// contains filtered or unexported fields
}

KeepService represents a keepstore server that is being rebalanced.

func (*KeepService) CommitPulls

func (srv *KeepService) CommitPulls(ctx context.Context, c *arvados.Client) error

CommitPulls sends the current list of pull requests to the storage server (even if the list is empty).

func (*KeepService) CommitTrash

func (srv *KeepService) CommitTrash(ctx context.Context, c *arvados.Client) error

CommitTrash sends the current list of trash requests to the storage server (even if the list is empty).

func (*KeepService) String

func (srv *KeepService) String() string

String implements fmt.Stringer.

func (*KeepService) URLBase

func (srv *KeepService) URLBase() string

URLBase returns scheme://host:port for this server.

type Pull

type Pull struct {
	arvados.SizedDigest
	From *KeepService
	To   *KeepMount
}

Pull is a request to retrieve a block from a remote server, and store it locally.

func (Pull) MarshalJSON

func (p Pull) MarshalJSON() ([]byte, error)

MarshalJSON formats a pull request the way keepstore wants to see it.

type Replica

type Replica struct {
	*KeepMount
	Mtime int64
}

Replica is a file on disk (or object in an S3 bucket, or blob in an Azure storage container, etc.) as reported in a keepstore index response.

type RunOptions

type RunOptions struct {
	Once                  bool
	CommitConfirmedFields bool
	ChunkPrefix           string
	Logger                logrus.FieldLogger
	Dumper                logrus.FieldLogger

	// SafeRendezvousState from the most recent balance operation,
	// or "" if unknown. If this changes from one run to the next,
	// we need to watch out for races. See
	// (*Balancer)ClearTrashLists.
	SafeRendezvousState string
}

RunOptions controls runtime behavior. The flags/options that belong here are the ones that are useful for interactive use. For example, "CommitTrash" is a runtime option rather than a config item because it invokes a troubleshooting feature rather than expressing how balancing is meant to be done at a given site.

RunOptions fields are controlled by command line flags.

type Server

type Server struct {
	http.Handler

	Cluster    *arvados.Cluster
	ArvClient  *arvados.Client
	RunOptions RunOptions
	Metrics    *metrics

	Logger logrus.FieldLogger
	Dumper logrus.FieldLogger

	DB *sqlx.DB
}

func (*Server) CheckHealth

func (srv *Server) CheckHealth() error

CheckHealth implements service.Handler.

func (*Server) Done

func (srv *Server) Done() <-chan struct{}

Done implements service.Handler.

type Trash

type Trash struct {
	arvados.SizedDigest
	Mtime int64
	From  *KeepMount
}

Trash is a request to delete a block.

func (Trash) MarshalJSON

func (t Trash) MarshalJSON() ([]byte, error)

MarshalJSON formats a trash request the way keepstore wants to see it, i.e., as a bare locator with no +size hint.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL