multiraft

package
v0.0.0-...-9741dc9 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 26, 2014 License: Apache-2.0 Imports: 9 Imported by: 0

Documentation

Overview

Package multiraft implements the Raft distributed consensus algorithm.

In contrast to other Raft implementations, this version is optimized for the case where one server is a part of many Raft consensus groups (likely with overlapping membership). This entails the use of a shared log and coalesced timers for heartbeats.

A cluster consists of a collection of nodes; the local node is represented by a MultiRaft object. Each node may participate in any number of groups. Nodes must have a globally unique ID (a string), and groups have a globally unique name. The application is responsible for providing a Transport interface that knows how to communicate with other nodes based on their IDs, and a Storage interface to manage persistent data.

The Raft protocol is documented in "In Search of an Understandable Consensus Algorithm" by Diego Ongaro and John Ousterhout. https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ChangeMembershipOperation

type ChangeMembershipOperation int8

ChangeMembershipOperation indicates the operation being performed by a ChangeMembershipPayload.

const (
	// ChangeMembershipAddObserver adds a non-voting node. The given node will
	// retrieve a snapshot and catch up with logs.
	ChangeMembershipAddObserver ChangeMembershipOperation = iota
	// ChangeMembershipRemoveObserver removes a non-voting node.
	ChangeMembershipRemoveObserver

	// ChangeMembershipAddMember adds a full (voting) node. The given node must already be an
	// observer; it will be removed from the Observers list when this
	// operation is processed.
	// TODO(bdarnell): enforce the requirement that a node be added as an observer first.
	ChangeMembershipAddMember

	// ChangeMembershipRemoveMember removes a voting node. It is not possible to remove the
	// last node; the result of attempting to do so is undefined.
	ChangeMembershipRemoveMember
)

Values for ChangeMembershipOperation.

type ChangeMembershipPayload

type ChangeMembershipPayload struct {
	Operation ChangeMembershipOperation
	Node      uint64
}

ChangeMembershipPayload is the Payload of an entry with Type LogEntryChangeMembership. Nodes are added or removed one at a time to minimize the risk of quorum failures in the new configuration.

type ClientInterface

type ClientInterface interface {
	Go(serviceMethod string, args interface{}, reply interface{}, done chan *rpc.Call) *rpc.Call
	Close() error
}

ClientInterface is the interface expected of the client provided by a transport. It is satisfied by rpc.Client, but could be implemented in other ways (using rpc.Call as a dumb data structure)

type Config

type Config struct {
	Storage   Storage
	Transport Transport
	// Ticker may be nil to use real time and TickInterval.
	Ticker Ticker

	// A new election is called if the ElectionTimeout elapses with no contact from the leader.
	// The actual ElectionTimeout is chosen randomly from the range [ElectionTimeoutMin,
	// ElectionTimeoutMax) to minimize the chances of several servers trying to become leaders
	// simultaneously. The Raft paper suggests a range of 150-300ms for local networks;
	// geographically distributed installations should use higher values to account for the
	// increased round trip time.
	ElectionTimeoutTicks   int
	HeartbeatIntervalTicks int
	TickInterval           time.Duration

	// If Strict is true, some warnings become fatal panics and additional (possibly expensive)
	// sanity checks will be done.
	Strict bool
}

Config contains the parameters necessary to construct a MultiRaft object.

func (*Config) Validate

func (c *Config) Validate() error

Validate returns an error if any required elements of the Config are missing or invalid. Called automatically by NewMultiRaft.

type EventCommandCommitted

type EventCommandCommitted struct {
	Command []byte
}

An EventCommandCommitted is broadcast whenever a command has been committed.

type EventLeaderElection

type EventLeaderElection struct {
	GroupID uint64
	NodeID  uint64
}

An EventLeaderElection is broadcast when a group completes an election. TODO(bdarnell): emit EventLeaderElection from follower nodes as well.

type GroupPersistentState

type GroupPersistentState struct {
	GroupID   uint64
	HardState raftpb.HardState
}

GroupPersistentState is a unified view of the readable data (except for log entries) about a group; used by Storage.LoadGroups.

type LogEntry

type LogEntry struct {
	Entry raftpb.Entry
}

LogEntry is the persistent form of a raft log entry.

type LogEntryState

type LogEntryState struct {
	Index int
	Entry LogEntry
	Error error
}

LogEntryState is used by Storage.GetLogEntries to bundle a LogEntry with its index and an optional error.

type MemoryStorage

type MemoryStorage struct {
	// contains filtered or unexported fields
}

MemoryStorage is an in-memory implementation of Storage for testing.

func NewMemoryStorage

func NewMemoryStorage() *MemoryStorage

NewMemoryStorage creates a MemoryStorage.

func (*MemoryStorage) AppendLogEntries

func (m *MemoryStorage) AppendLogEntries(groupID uint64, entries []*LogEntry) error

AppendLogEntries implements the Storage interface.

func (*MemoryStorage) SetGroupState

func (m *MemoryStorage) SetGroupState(groupID uint64,
	state *GroupPersistentState) error

SetGroupState implements the Storage interface.

type MultiRaft

type MultiRaft struct {
	Config

	Events chan interface{}
	// contains filtered or unexported fields
}

MultiRaft represents a local node in a raft cluster. The owner is responsible for consuming the Events channel in a timely manner.

func NewMultiRaft

func NewMultiRaft(nodeID uint64, config *Config) (*MultiRaft, error)

NewMultiRaft creates a MultiRaft object.

func (*MultiRaft) ChangeGroupMembership

func (m *MultiRaft) ChangeGroupMembership(groupID uint64, changeOp ChangeMembershipOperation,
	nodeID uint64) error

ChangeGroupMembership submits a proposed membership change to the cluster. TODO(bdarnell): same concerns as SubmitCommand TODO(bdarnell): do we expose ChangeMembershipAdd{Member,Observer} to the application level or does MultiRaft take care of the non-member -> observer -> full member cycle?

func (*MultiRaft) CreateGroup

func (m *MultiRaft) CreateGroup(groupID uint64, initialMembers []uint64) error

CreateGroup creates a new consensus group and joins it. The application should arrange to call CreateGroup on all nodes named in initialMembers.

func (*MultiRaft) DoRPC

func (m *MultiRaft) DoRPC(name string, req, resp interface{}) error

DoRPC implements ServerInterface

func (*MultiRaft) Start

func (m *MultiRaft) Start()

Start runs the raft algorithm in a background goroutine.

func (*MultiRaft) Stop

func (m *MultiRaft) Stop()

Stop terminates the running raft instance and shuts down all network interfaces.

func (*MultiRaft) SubmitCommand

func (m *MultiRaft) SubmitCommand(groupID uint64, command []byte) error

SubmitCommand sends a command (a binary blob) to the cluster. This method returns when the command has been successfully sent, not when it has been committed. TODO(bdarnell): should SubmitCommand wait until the commit? TODO(bdarnell): what do we do if we lose leadership before a command we proposed commits?

type RPCInterface

type RPCInterface interface {
	SendMessage(req *SendMessageRequest, resp *SendMessageResponse) error
}

RPCInterface is the methods we expose for use by net/rpc.

type SendMessageRequest

type SendMessageRequest struct {
	GroupID uint64
	Message raftpb.Message
}

SendMessageRequest wraps a raft message.

type SendMessageResponse

type SendMessageResponse struct {
}

SendMessageResponse is empty (raft uses a one-way messaging model; if a response is needed it will be sent as a separate message).

type ServerInterface

type ServerInterface interface {
	DoRPC(name string, req, resp interface{}) error
}

ServerInterface is a generic interface based on net/rpc.

type Storage

type Storage interface {

	// SetGroupState is called to update the persistent state for the given group.
	SetGroupState(groupID uint64, state *GroupPersistentState) error

	// AppendLogEntries is called to add entries to the log. The entries will always span
	// a contiguous range of indices just after the current end of the log.
	AppendLogEntries(groupID uint64, entries []*LogEntry) error
}

The Storage interface is supplied by the application to manage persistent storage of raft data.

type Ticker

type Ticker interface {
	// This channel will be readable once per tick. The time value returned is unspecified;
	// the channel has this type for compatibility with time.Ticker but other implementations
	// may not return real times.
	Chan() <-chan time.Time
}

Ticker encapsulates the timing-related parts of the raft protocol.

type Transport

type Transport interface {
	// Listen informs the Transport of the local node's ID and callback interface.
	// The Transport should associate the given id with the server object so other Transport's
	// Connect methods can find it.
	Listen(id uint64, server ServerInterface) error

	// Stop undoes a previous Listen.
	Stop(id uint64)

	// Connect looks up a node by id and returns a stub interface to submit RPCs to it.
	Connect(id uint64) (ClientInterface, error)
}

The Transport interface is supplied by the application to manage communication with other nodes. It is responsible for mapping from IDs to some communication channel (in the simplest case, a host:port pair could be used as an ID, although this would make it impossible to move an instance from one host to another except by syncing up a new node from scratch).

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL