autopilot

package
v0.0.0-...-52b58a3 Latest
Warning

This package is not in the latest version of its module.

Published: Jan 31, 2022 License: Apache-2.0, MPL-2.0 Imports: 9 Imported by: 0

README

raft-autopilot

Raft Autopilot

Documentation

Overview

This code was taken from the same implementation in a branch from Consul and then had the package updated and the mutex type unexported.

Index

Constants

const (
	DefaultUpdateInterval    = 2 * time.Second
	DefaultReconcileInterval = 10 * time.Second
)

Variables

This section is empty.

Functions

func ServerLessThan

func ServerLessThan(id1 raft.ServerID, id2 raft.ServerID, s *State) bool

ServerLessThan looks up both servers in the given State and returns true if the first ID corresponds to a server that is logically less than (lower than, better than, etc.) the second server. The following criteria are considered, from most important to least important:

1. A leader server is always less than all others.
2. A voter is less than a non-voter.
3. Healthy servers are less than unhealthy servers.
4. Servers that have been stable longer are considered less than servers that have been stable for a shorter time.

func SortServers

func SortServers(ids []raft.ServerID, s *State)

SortServers will take a list of raft ServerIDs and sort it using information from the State. See the ServerLessThan function for details about how two servers get compared.
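The ordering described for ServerLessThan can be sketched in isolation. The example below is only an illustration: serverInfo, less and sortedIDs are hypothetical stand-ins that use plain fields instead of the package's State and raft.ServerID types, and the real functions may differ in detail.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// serverInfo is a hypothetical stand-in for the per-server data that the
// package keeps in its State. It is not one of the package's own types.
type serverInfo struct {
	ID          string
	Leader      bool
	Voter       bool
	Healthy     bool
	StableSince time.Time
}

// less applies the four documented criteria in order of importance.
func less(a, b serverInfo) bool {
	if a.Leader != b.Leader {
		return a.Leader
	}
	if a.Voter != b.Voter {
		return a.Voter
	}
	if a.Healthy != b.Healthy {
		return a.Healthy
	}
	// An earlier StableSince means the server has been stable for longer.
	return a.StableSince.Before(b.StableSince)
}

// sortedIDs returns the server IDs ordered from "least" (best) to "greatest".
func sortedIDs(servers []serverInfo) []string {
	sorted := append([]serverInfo(nil), servers...) // leave the input untouched
	sort.SliceStable(sorted, func(i, j int) bool { return less(sorted[i], sorted[j]) })
	ids := make([]string, len(sorted))
	for i, s := range sorted {
		ids[i] = s.ID
	}
	return ids
}

func main() {
	now := time.Now()
	servers := []serverInfo{
		{ID: "c", Healthy: true, StableSince: now.Add(-time.Hour)},
		{ID: "b", Voter: true, StableSince: now.Add(-time.Hour)},
		{ID: "a", Leader: true, Voter: true, Healthy: true, StableSince: now},
		{ID: "d", Voter: true, Healthy: true, StableSince: now.Add(-time.Minute)},
	}
	fmt.Println(sortedIDs(servers)) // prints [a d b c]
}
```

Here the leader sorts first even though it became stable most recently, and the healthy non-voter sorts last because voting status outranks health.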

Types

type ApplicationIntegration

type ApplicationIntegration interface {
	// AutopilotConfig is used to retrieve the latest configuration from the delegate
	AutopilotConfig() *Config

	// NotifyState will be called when the autopilot state is updated. The application may choose to emit metrics
	// or perform other actions based on this information.
	NotifyState(*State)

	// FetchServerStats will be called to request the application fetch the ServerStats out of band. Usually this
	// will require an RPC to each server.
	FetchServerStats(context.Context, map[raft.ServerID]*Server) map[raft.ServerID]*ServerStats

	// KnownServers fetches the list of servers as known to the application
	KnownServers() map[raft.ServerID]*Server

	// RemoveFailedServer notifies the application to forcefully remove the server in the failed state.
	// It is expected that this returns nearly immediately, so if a longer running operation needs to be
	// performed then the Delegate implementation should spawn a goroutine itself.
	RemoveFailedServer(*Server)
}

type Autopilot

type Autopilot struct {
	// contains filtered or unexported fields
}

Autopilot is the type to manage a running Raft instance.

Each Raft node in the cluster will have a corresponding Autopilot instance, but only one Autopilot instance should run at a time in the cluster. When a node gains Raft leadership, the corresponding Autopilot instance should have its Start method called. If leadership is later lost, that node should call the Stop method on the Autopilot instance.
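A minimal sketch of that start-on-leadership, stop-on-loss pattern follows. It assumes leadership changes arrive on a channel of booleans (hashicorp/raft's LeaderCh is one possible source) and that the channel returned by Stop is closed once the background routines have exited. The runner interface, fakeAutopilot, followLeadership and replay are hypothetical names used only for this illustration.

```go
package main

import (
	"context"
	"fmt"
)

// runner captures only the two Autopilot methods used in this sketch.
type runner interface {
	Start(ctx context.Context)
	Stop() <-chan struct{}
}

// fakeAutopilot is a hypothetical stand-in that records the calls it receives.
type fakeAutopilot struct {
	events []string
}

func (f *fakeAutopilot) Start(ctx context.Context) {
	f.events = append(f.events, "start")
}

func (f *fakeAutopilot) Stop() <-chan struct{} {
	f.events = append(f.events, "stop")
	done := make(chan struct{})
	close(done) // the fake has nothing to wait for
	return done
}

// followLeadership starts ap when leadership is gained and stops it when
// leadership is lost. It returns once the leadership channel is closed.
func followLeadership(ctx context.Context, ap runner, leadership <-chan bool) {
	isLeader := false
	for gained := range leadership {
		switch {
		case gained && !isLeader:
			ap.Start(ctx)
		case !gained && isLeader:
			<-ap.Stop() // wait for the background routines to exit
		}
		isLeader = gained
	}
	if isLeader {
		<-ap.Stop()
	}
}

// replay feeds a fixed sequence of leadership changes through followLeadership
// and returns the calls the fake observed.
func replay(changes []bool) []string {
	ch := make(chan bool, len(changes))
	for _, c := range changes {
		ch <- c
	}
	close(ch)
	ap := &fakeAutopilot{}
	followLeadership(context.Background(), ap, ch)
	return ap.events
}

func main() {
	fmt.Println(replay([]bool{true, false, true})) // prints [start stop start stop]
}
```

Tracking isLeader locally keeps repeated notifications of the same state from calling Start or Stop twice in a row.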

func New

func New(raft Raft, delegate ApplicationIntegration, options ...Option) *Autopilot

New will create a new Autopilot instance utilizing the given Raft and Delegate. If the WithPromoter option is not provided the default StablePromoter will be used.

func (*Autopilot) AddServer

func (a *Autopilot) AddServer(s *Server) error

AddServer is a helper for adding a new server to the Raft configuration. It may first remove servers with duplicate addresses or IDs, and once the server has been added it will trigger autopilot to remove dead servers if there are any. Servers added by this method start in a non-voting state; autopilot will later promote them to voting status if the configured promoter desires it. If so many removals would be required that leadership would be lost, an error is returned instead of performing any Raft configuration changes.

func (*Autopilot) GetServerHealth

func (a *Autopilot) GetServerHealth(id raft.ServerID) *ServerHealth

GetServerHealth returns the latest ServerHealth for a given server. The returned struct should not be modified or else it will im

func (*Autopilot) GetState

func (a *Autopilot) GetState() *State

GetState retrieves the current autopilot State

func (*Autopilot) IsRunning

func (a *Autopilot) IsRunning() (ExecutionStatus, <-chan struct{})

IsRunning returns the current execution status of the autopilot go routines as well as a chan which will be closed when the routines are no longer running

func (*Autopilot) NumVoters

func (a *Autopilot) NumVoters() (int, error)

NumVoters is a helper for calculating the number of voting peers in the current raft configuration. This function ignores any autopilot state and will make the calculation based on a newly retrieved Raft configuration.
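As a rough illustration of counting voting peers from a Raft configuration, the sketch below uses local stand-in types modeled on hashicorp/raft's voter, non-voter and staging suffrage values. The names suffrage, configServer and countVoters are hypothetical, and the package's own calculation may differ.

```go
package main

import "fmt"

// suffrage is a local stand-in for a server's voting status in a Raft
// configuration.
type suffrage int

const (
	voter suffrage = iota
	nonvoter
	staging
)

// configServer is a hypothetical stand-in for one entry in a Raft configuration.
type configServer struct {
	ID       string
	Suffrage suffrage
}

// countVoters counts only the servers that currently hold full voting rights.
func countVoters(servers []configServer) int {
	n := 0
	for _, s := range servers {
		if s.Suffrage == voter {
			n++
		}
	}
	return n
}

func main() {
	servers := []configServer{
		{ID: "a", Suffrage: voter},
		{ID: "b", Suffrage: voter},
		{ID: "c", Suffrage: nonvoter},
		{ID: "d", Suffrage: staging},
	}
	fmt.Println(countVoters(servers)) // prints 2
}
```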

func (*Autopilot) RemoveDeadServers

func (a *Autopilot) RemoveDeadServers()

RemoveDeadServers will trigger an immediate removal of dead/failed servers.

func (*Autopilot) RemoveServer

func (a *Autopilot) RemoveServer(id raft.ServerID) error

RemoveServer is a helper to remove a server from Raft if it exists in the latest Raft configuration

func (*Autopilot) Start

func (a *Autopilot) Start(ctx context.Context)

Start will launch the go routines in the background to perform Autopilot. When the context passed in is cancelled or the Stop method is called then these routines will exit.

func (*Autopilot) Stop

func (a *Autopilot) Stop() <-chan struct{}

Stop will terminate the go routines being executed to perform autopilot.

type Config

type Config struct {
	// CleanupDeadServers controls whether to remove dead servers when a new
	// server is added to the Raft peers.
	CleanupDeadServers bool

	// LastContactThreshold is the limit on the amount of time a server can go
	// without leader contact before being considered unhealthy.
	LastContactThreshold time.Duration

	// MaxTrailingLogs is the amount of entries in the Raft Log that a server can
	// be behind before being considered unhealthy.
	MaxTrailingLogs uint64

	// MinQuorum sets the minimum number of servers required in a cluster
	// before autopilot can prune dead servers.
	MinQuorum uint

	// ServerStabilizationTime is the minimum amount of time a server must be
	// in a stable, healthy state before it can be added to the cluster. Only
	// applicable with Raft protocol version 3 or higher.
	ServerStabilizationTime time.Duration

	Ext interface{}
}

Config represents all the tunables of autopilot
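To make the two health thresholds concrete, here is a small sketch of how LastContactThreshold and MaxTrailingLogs might be applied to a single server's stats. It is an assumption-laden illustration: thresholds, followerStats and withinThresholds are hypothetical names, the numbers are arbitrary, and the package's actual health check may take further factors into account.

```go
package main

import (
	"fmt"
	"time"
)

// thresholds mirrors the two Config fields used in this sketch.
type thresholds struct {
	LastContactThreshold time.Duration
	MaxTrailingLogs      uint64
}

// followerStats is a hypothetical stand-in for the stats reported by a server.
type followerStats struct {
	LastContact time.Duration
	LastIndex   uint64
}

// withinThresholds reports whether a server is within both limits relative to
// the leader's last log index.
func withinThresholds(t thresholds, s followerStats, leaderLastIndex uint64) bool {
	if s.LastContact > t.LastContactThreshold {
		return false
	}
	// Compare before subtracting so the unsigned difference cannot wrap around.
	if leaderLastIndex > s.LastIndex && leaderLastIndex-s.LastIndex > t.MaxTrailingLogs {
		return false
	}
	return true
}

func main() {
	t := thresholds{LastContactThreshold: 200 * time.Millisecond, MaxTrailingLogs: 250}
	fmt.Println(withinThresholds(t, followerStats{LastContact: 50 * time.Millisecond, LastIndex: 900}, 1000)) // prints true
	fmt.Println(withinThresholds(t, followerStats{LastContact: 50 * time.Millisecond, LastIndex: 700}, 1000)) // prints false
}
```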

type ExecutionStatus

type ExecutionStatus string

ExecutionStatus represents the current status of the autopilot background go routines

const (
	NotRunning   ExecutionStatus = "not-running"
	Running      ExecutionStatus = "running"
	ShuttingDown ExecutionStatus = "shutting-down"
)

type FailedServers

type FailedServers struct {
	// StaleNonVoters are the ids of those servers in the raft configuration as non-voters
	// that are not present in the delegate's view of what servers should be available
	StaleNonVoters []raft.ServerID
	// StaleVoters are the ids of those servers in the raft configuration as voters that
	// are not present in the delegate's view of what servers should be available
	StaleVoters []raft.ServerID

	// FailedNonVoters are the servers without voting rights in the cluster that the
	// delegate has indicated are in a failed state
	FailedNonVoters []*Server
	// FailedVoters are the servers with voting rights in the cluster that the
	// delegate has indicated are in a failed state
	FailedVoters []*Server
}

type NodeStatus

type NodeStatus string

NodeStatus represents the health of a server as known to the autopilot consumer. This should not take into account Raft health and the server being on a new enough term and index.

const (
	NodeUnknown NodeStatus = "unknown"
	NodeAlive   NodeStatus = "alive"
	NodeFailed  NodeStatus = "failed"
	NodeLeft    NodeStatus = "left"
)

type NodeType

type NodeType string

const (
	NodeVoter NodeType = "voter"
)

type Option

type Option func(*Autopilot)

Option is an option to be used when creating a new Autopilot instance

func WithLogger

func WithLogger(logger hclog.Logger) Option

WithLogger returns an Option to set the Autopilot instance's logger

func WithPromoter

func WithPromoter(promoter Promoter) Option

WithPromoter returns an Option to set the Promoter type that Autopilot will use. When the option is not given, the default StablePromoter from this package will be used.

func WithReconcileInterval

func WithReconcileInterval(t time.Duration) Option

WithReconcileInterval returns an Option to set the Autopilot instance's reconcile interval.

func WithUpdateInterval

func WithUpdateInterval(t time.Duration) Option

WithUpdateInterval returns an Option to set the Autopilot instance's update interval.
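The Option functions follow the functional options pattern. The sketch below shows how that pattern typically works, using the two default intervals listed under Constants. The names pilot, option, withUpdateInterval, withReconcileInterval and newPilot are hypothetical stand-ins, not the package's own code.

```go
package main

import (
	"fmt"
	"time"
)

const (
	defaultUpdateInterval    = 2 * time.Second
	defaultReconcileInterval = 10 * time.Second
)

// pilot is a hypothetical stand-in for a type configured through options.
type pilot struct {
	updateInterval    time.Duration
	reconcileInterval time.Duration
}

// option mutates a pilot while it is being constructed.
type option func(*pilot)

func withUpdateInterval(t time.Duration) option {
	return func(p *pilot) { p.updateInterval = t }
}

func withReconcileInterval(t time.Duration) option {
	return func(p *pilot) { p.reconcileInterval = t }
}

// newPilot sets the defaults first and then applies each option in order, so
// later options override earlier ones.
func newPilot(opts ...option) *pilot {
	p := &pilot{
		updateInterval:    defaultUpdateInterval,
		reconcileInterval: defaultReconcileInterval,
	}
	for _, opt := range opts {
		opt(p)
	}
	return p
}

func main() {
	p := newPilot(withUpdateInterval(500 * time.Millisecond))
	fmt.Println(p.updateInterval, p.reconcileInterval) // prints 500ms 10s
}
```

Because defaults are applied before the options, callers only need to pass the settings they want to change.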

type Promoter

type Promoter interface {
	// GetServerExt returns some object that should be stored in the Ext field of the Server
	// This value will not be used by the code in this repo but may be used by the other
	// Promoter methods and the application utilizing autopilot. If the value returned is
	// nil the extended state will not be updated.
	GetServerExt(*Config, *ServerState) interface{}

	// GetStateExt returns some object that should be stored in the Ext field of the State
	// This value will not be used by the code in this repo but may be used by the other
	// Promoter methods and the application utilizing autopilot. If the value returned is
	// nil the extended state will not be updated.
	GetStateExt(*Config, *State) interface{}

	// GetNodeTypes returns a map of ServerID to NodeType for all the servers which
	// should have their NodeType field updated
	GetNodeTypes(*Config, *State) map[raft.ServerID]NodeType

	// CalculatePromotionsAndDemotions returns the promotions and demotions to be done
	// as well as the server ID of the desired leader.
	CalculatePromotionsAndDemotions(*Config, *State) RaftChanges

	// FilterFailedServerRemovals takes in the current state and structure outlining all the
	// failed/stale servers and will return those failed servers which the promoter thinks
	// should be allowed to be removed.
	FilterFailedServerRemovals(*Config, *State, *FailedServers) *FailedServers
}

Promoter is an interface to provide promotion/demotion algorithms to the core autopilot type. The StablePromoter satisfies this interface and will promote any stable servers, but other algorithms could be implemented. Implementations of these methods should not block: although they are called synchronously, autopilot expects the algorithms not to make any network or other requests which may cause an indefinite amount of waiting.

Note that all parameters passed to these functions should be considered read-only and their modification could result in undefined behavior of the core autopilot routines including potential crashes.

func DefaultPromoter

func DefaultPromoter() Promoter

type Raft

type Raft interface {
	AddNonvoter(id raft.ServerID, address raft.ServerAddress, prevIndex uint64, timeout time.Duration) raft.IndexFuture
	AddVoter(id raft.ServerID, address raft.ServerAddress, prevIndex uint64, timeout time.Duration) raft.IndexFuture
	DemoteVoter(id raft.ServerID, prevIndex uint64, timeout time.Duration) raft.IndexFuture
	LastIndex() uint64
	Leader() raft.ServerAddress
	GetConfiguration() raft.ConfigurationFuture
	RemoveServer(id raft.ServerID, prevIndex uint64, timeout time.Duration) raft.IndexFuture
	Stats() map[string]string
	LeadershipTransferToServer(id raft.ServerID, address raft.ServerAddress) raft.Future
}

Raft is the interface of all the methods on the Raft type that autopilot needs to function. Autopilot will take in an interface for Raft instead of a concrete type to allow for dependency injection in tests.
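The dependency-injection idea can be sketched with a much smaller interface. In the example below, indexReader is a hypothetical one-method slice of the Raft interface, fakeRaft is a test double, and trailingEntries is an illustrative helper that is not part of the package.

```go
package main

import "fmt"

// indexReader is a hypothetical, narrowed-down view of the Raft interface.
type indexReader interface {
	LastIndex() uint64
}

// fakeRaft is a test double that returns a fixed last index.
type fakeRaft struct {
	lastIndex uint64
}

func (f fakeRaft) LastIndex() uint64 { return f.lastIndex }

// trailingEntries reports how many log entries followerIndex is behind the
// last index reported by r. It accepts the interface, so tests can pass a fake
// while production code passes a real Raft instance.
func trailingEntries(r indexReader, followerIndex uint64) uint64 {
	leader := r.LastIndex()
	if followerIndex >= leader {
		return 0
	}
	return leader - followerIndex
}

func main() {
	fmt.Println(trailingEntries(fakeRaft{lastIndex: 1000}, 940)) // prints 60
}
```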

type RaftChanges

type RaftChanges struct {
	Promotions []raft.ServerID
	Demotions  []raft.ServerID
	Leader     raft.ServerID
}

type RaftState

type RaftState string

RaftState is the status of a single server in the Raft cluster.

const (
	RaftNone     RaftState = "none"
	RaftLeader   RaftState = "leader"
	RaftVoter    RaftState = "voter"
	RaftNonVoter RaftState = "non-voter"
	RaftStaging  RaftState = "staging"
)

func (RaftState) IsPotentialVoter

func (s RaftState) IsPotentialVoter() bool

type Server

type Server struct {
	ID          raft.ServerID
	Name        string
	Address     raft.ServerAddress
	NodeStatus  NodeStatus
	Version     string
	Meta        map[string]string
	RaftVersion int
	IsLeader    bool

	NodeType NodeType
	Ext      interface{}
}

Server represents one Raft server

type ServerHealth

type ServerHealth struct {
	// Healthy is whether or not the server is healthy according to the current
	// Autopilot config.
	Healthy bool

	// StableSince is the last time this server's Healthy value changed.
	StableSince time.Time
}

func (*ServerHealth) IsStable

func (h *ServerHealth) IsStable(now time.Time, minStableDuration time.Duration) bool

IsStable returns true if the ServerHealth shows a stable, passing state as of the given time, that is, the server has been healthy for at least the given minimum stable duration.

type ServerState

type ServerState struct {
	Server Server
	State  RaftState
	Stats  ServerStats
	Health ServerHealth
}

func (*ServerState) HasVotingRights

func (s *ServerState) HasVotingRights() bool

type ServerStats

type ServerStats struct {
	// LastContact is the time since this node's last contact with the leader.
	LastContact time.Duration

	// LastTerm is the highest leader term this server has a record of in its Raft log.
	LastTerm uint64

	// LastIndex is the last log index this server has a record of in its Raft log.
	LastIndex uint64
}

ServerStats holds miscellaneous Raft metrics for a server

type StablePromoter

type StablePromoter struct{}

func (*StablePromoter) CalculatePromotionsAndDemotions

func (_ *StablePromoter) CalculatePromotionsAndDemotions(c *Config, s *State) RaftChanges

CalculatePromotionsAndDemotions will return a list of all promotions and demotions to be done as well as the server id of the desired leader. This particular interface implementation maintains a stable leader and will promote healthy servers to voting status. It will never change the leader ID nor will it perform demotions.
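The behavior described above can be sketched as a simple filter: select the non-voters that are healthy and have been stable for long enough, and propose nothing else. The member type and the promotions function are hypothetical, and the stabilization time is passed in directly rather than read from a Config.

```go
package main

import (
	"fmt"
	"time"
)

// member is a hypothetical, flattened view of one server's state.
type member struct {
	ID        string
	Voter     bool
	Healthy   bool
	StableFor time.Duration
}

// promotions returns the IDs of non-voters that are healthy and have been
// stable for at least the stabilization time. It never proposes demotions or
// a leader change, matching the documented behavior.
func promotions(members []member, stabilization time.Duration) []string {
	var ids []string
	for _, m := range members {
		if !m.Voter && m.Healthy && m.StableFor >= stabilization {
			ids = append(ids, m.ID)
		}
	}
	return ids
}

func main() {
	members := []member{
		{ID: "a", Voter: true, Healthy: true, StableFor: time.Hour},
		{ID: "b", Healthy: true, StableFor: 30 * time.Second},
		{ID: "c", Healthy: true, StableFor: 2 * time.Second},
		{ID: "d", StableFor: time.Hour},
	}
	fmt.Println(promotions(members, 10*time.Second)) // prints [b]
}
```

Existing voters are skipped even when they are unhealthy, since this promoter only ever adds voting rights.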

func (*StablePromoter) FilterFailedServerRemovals

func (_ *StablePromoter) FilterFailedServerRemovals(_ *Config, _ *State, failed *FailedServers) *FailedServers

func (*StablePromoter) GetNodeTypes

func (_ *StablePromoter) GetNodeTypes(_ *Config, s *State) map[raft.ServerID]NodeType

func (*StablePromoter) GetServerExt

func (_ *StablePromoter) GetServerExt(_ *Config, srv *ServerState) interface{}

func (*StablePromoter) GetStateExt

func (_ *StablePromoter) GetStateExt(_ *Config, _ *State) interface{}

type State

type State struct {
	Healthy          bool
	FailureTolerance int
	Servers          map[raft.ServerID]*ServerState
	Leader           raft.ServerID
	Voters           []raft.ServerID
	Ext              interface{}
	// contains filtered or unexported fields
}

func (*State) ServerStabilizationTime

func (s *State) ServerStabilizationTime(c *Config) time.Duration
