raft

package
v0.0.0-...-a1c4c83 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 28, 2022 License: Apache-2.0 Imports: 12 Imported by: 0

Documentation

Index

Constants

View Source
const (
	// StateFollower indicates the node is currently a follower and only responds to requests.
	StateFollower = "follower"

	// StateCandidate indicates the node is currently a candidate and is requesting votes from the other nodes.
	StateCandidate = "candidate"

	// StateLeader indicates the node is the leader and is servicing user requests along with replicating log entires.
	StateLeader = "leader"
)

Variables

View Source
var ErrLeaderUnknown = errors.New("leader node not found")

ErrLeaderUnknown is returned if the node is not the leader and doesn't know where the leader is.

View Source
var ErrNotFound = errors.New("not found")

ErrNotFound is returned if something was not found (usually a key).

Functions

This section is empty.

Types

type AppendEntriesInput

type AppendEntriesInput struct {
	// Term is the leaders term.
	Term Term `json:"term"`

	// ID is the leaders id so follower can redirect clients.
	ID Peer `json:"id"`

	// PreviousLogIndex is the index of log entry immediately preceding new ones.
	PreviousLogIndex int `json:"previous_log_index"`

	// PreviousLogTerm is the term of prevLogIndex entry.
	PreviousLogTerm Term `json:"previous_log_term"`

	// Entries is a log entries to store (empty for heartbeat; may send more than one for efficiency).
	Entries []Entry `json:"entries"`

	// CommitIndex is the leader’s commitIndex.
	CommitIndex int `json:"commit_index"`
}

AppendEntriesInput encapsulates the request options available for the 'AppendEntries' API.

type AppendEntriesOutput

type AppendEntriesOutput struct {
	// Term is the current term, for leader to update itself.
	Term Term `json:"term"`

	// Success is a bollean set to true if follower contained an entry matching prevLogIndex and prevLogTerm.
	Success bool `json:"success"`
}

AppendEntriesOutput encapsulates the response options available for the 'AppendEntries' API.

type Client

type Client interface {
	// Address returns the address client is using to communicate to the node.
	Address() *url.URL

	// AppendEntries is invoked by leader to replicate log entries (§5.3); also used as heartbeat.
	AppendEntries(ctx context.Context, input AppendEntriesInput) (output AppendEntriesOutput, err error)

	// RequestVote is invoked by candidates to gather votes.
	RequestVote(ctx context.Context, input RequestVoteInput) (output RequestVoteOutput, err error)
}

Client is a client allowing a node to communicate to other nodes in the cluster.

type DeleteInput

type DeleteInput struct {
	// Key is the key being deleted.
	Key string `json:"key"`
}

DeleteInput encapsulates the request options for the 'Delete' API.

type Entry

type Entry struct {
	// Term is the term when the log entry was applied.
	Term Term `json:"term"`

	// Key is the key being mutated.
	Key string `json:"key"`

	// Value is the value being mutated.
	Value any `json:"value"`

	// System indicates that this is a system event and shouldn't be returned to users.
	//
	//nolint:godox
	// TODO(jamesl33): Use to allow log compaction/node addition/removal.
	System bool `json:"system"`

	// Deleted indicates that this is a deletion.
	Deleted bool `json:"deleted"`
}

Entry in the replicated log.

type Follower

type Follower interface {
	// AppendEntries heartbeats/replicates the log to the node.
	//
	// 1. Reply false if term < currentTerm.
	// 2. Reply false if log doesn’t contain an entry at prevLogIndex whose term matches prevLogTerm.
	// 3. If an existing entry conflicts with a new one (same index but different terms), delete the existing entry and
	//    all that follow it.
	// 4. Append any new entries not already in the log
	// 5. If leaderCommit > commitIndex, set commitIndex = min(leaderCommit, index of last new entry)
	AppendEntries(input AppendEntriesInput) (output AppendEntriesOutput, err error)
}

Follower is the API for heartbeating/log replication to a node in the cluster.

  1. Respond to RPCs from candidates and leaders
  2. If election timeout elapses without receiving AppendEntries RPC from current leader or granting vote to candidate: convert to candidate

type GetInput

type GetInput struct {
	// Key is the get being fetched.
	Key string `json:"key"`
}

GetInput encapsulates the request options for the 'Get' API.

type GetOutput

type GetOutput struct {
	// Key is the key that was fetched.
	Key string `json:"key"`

	// Value is the value associated with the key.
	Value any `json:"value"`
}

GetOutput encapsulates the response options for the 'Get' API.

type Leader

type Leader interface {
	// Set a key/value pair.
	Set(input SetInput) (err error)

	// Get a key value pair.
	Get(input GetInput) (output GetOutput, err error)

	// Delete a key.
	Delete(input DeleteInput) (err error)
}

Leader is the API for servicing client requests to the replicated log.

  1. Upon election: send initial empty AppendEntries RPCs (heartbeat) to each server; repeat during idle periods to prevent election timeouts.

  2. If command received from client: append entry to local log, respond after entry applied to state machine.

  3. If last log index ≥ nextIndex for a follower: send AppendEntries RPC with log entries starting at nextIndex:

    - If successful: update nextIndex and matchIndex for follower.

    - If AppendEntries fails because of log inconsistency: decrement nextIndex and retry.

  4. If there exists an N such that N > commitIndex, a majority of matchIndex[i] ≥ N, and log[N].term == currentTerm: set commitIndex = N.

type Log

type Log []Entry

Log is a replicated log where entries contain a command (key/value pair) which is applied to a state machine after having been replicated to a quorum of nodes.

func (Log) Get

func (l Log) Get(key string, system bool) (Entry, bool)

Get the latest value for the given key.

func (Log) Replicate

func (l Log) Replicate(prev int, entries ...Entry) Log

Replicate appends the given entries to the log, truncating any entries which have a mismatched term.

type Node

type Node struct {
	// contains filtered or unexported fields
}

Node implements the 'Protocol' interface exposing a distributed key/value data store using Raft.

func NewNode

func NewNode(id Peer, peers map[Peer]Client) *Node

NewNode creates a node with the given id, connected to the provided peers.

func (*Node) AppendEntries

func (n *Node) AppendEntries(input AppendEntriesInput) (AppendEntriesOutput, error)

AppendEntries performs client side log replication and transitions state where required.

func (*Node) Delete

func (n *Node) Delete(input DeleteInput) error

Delete adds an entry to the log indicating the deletion of a key.

NOTE: This is a tombstone; both the entry and the tombstone will remain indefinitely.

func (*Node) Get

func (n *Node) Get(input GetInput) (GetOutput, error)

Get returns an entry from the log if it exists (and is not deleted).

NOTE: System entries will not be returned.

func (*Node) RequestVote

func (n *Node) RequestVote(input RequestVoteInput) (RequestVoteOutput, error)

RequestVote casts this nodes vote and transitions state where required.

func (*Node) Set

func (n *Node) Set(input SetInput) error

Set creates a new entry in the log and triggers replication.

type NotLeaderError

type NotLeaderError struct {
	URL *url.URL
}

NotLeaderError is returned if the node is not the leader but does know where the leader is.

func (NotLeaderError) Error

func (n NotLeaderError) Error() string

Error implements the 'error' interface returning a useful error indicating whether the leader node is.

type Peer

type Peer uint

Peer is the identifier of a peer node in the cluster.

type Protocol

type Protocol interface {
	Follower
	Voter
	Leader
}

Protocol is a definition of the Raft consensus protocol for managing a replicated log.

All Nodes:

  1. If commitIndex > lastApplied: increment lastApplied, apply log[lastApplied] to state machine.
  2. If RPC request or response contains term T > currentTerm: set currentTerm = T, convert to follower.

Followers:

  1. Respond to RPCs from candidates and leaders
  2. If election timeout elapses without receiving AppendEntries RPC from current leader or granting vote to candidate: convert to candidate

Candidates:

  1. On conversion to candidate, start election:

    - Increment currentTerm.

    - Vote for self.

    - Reset election timer.

    - Send RequestVote RPCs to all other servers.

  2. If votes received from majority of servers: become leader

  3. If AppendEntries RPC received from new leader: convert to follower

  4. If election timeout elapses: start new election

Leaders:

  1. Upon election: send initial empty AppendEntries RPCs (heartbeat) to each server; repeat during idle periods to prevent election timeouts.

  2. If command received from client: append entry to local log, respond after entry applied to state machine.

  3. If last log index ≥ nextIndex for a follower: send AppendEntries RPC with log entries starting at nextIndex:

    - If successful: update nextIndex and matchIndex for follower.

    - If AppendEntries fails because of log inconsistency: decrement nextIndex and retry.

  4. If there exists an N such that N > commitIndex, a majority of matchIndex[i] ≥ N, and log[N].term == currentTerm: set commitIndex = N.

type RequestVoteInput

type RequestVoteInput struct {
	// Term is the candidate’s term.
	Term Term `json:"term"`

	// CandidateID is the id of the candidate requesting the vote.
	CandidateID Peer `json:"candidate_id"`

	// LastLogIndex is the index of candidate’s last log entry.
	LastLogIndex int `json:"last_log_index"`

	// LastLogTerm is the term of candidate’s last log entry.
	LastLogTerm Term `json:"last_log_term"`
}

RequestVoteInput encapsulates the request options available for the 'RequestVote' API.

type RequestVoteOutput

type RequestVoteOutput struct {
	// Term is the current term, for candidate to update itself.
	Term Term `json:"term"`

	// Granted is set to true to indicate that the candidate received the vote.
	Granted bool `json:"granted"`
}

RequestVoteOutput encapsulates the response options for the 'RequestVote' API.

type SetInput

type SetInput struct {
	// Key is the key being mutated.
	Key string `json:"key"`

	// Value is the new value that will be associated with the key.
	Value any `json:"value"`
}

SetInput encapsulates the request options for the 'Set' API.

type State

type State string

State is the state of the node i.e. follower, candidate or leader.

type Term

type Term int

Term is an election term, each term must only have one leader; new elections increment the term.

type Voter

type Voter interface {
	// RequestVote requests a vote from the node.
	//
	// 1. Reply false if term < currentTerm.
	// 2. If votedFor is null or candidateId, and candidate’s log is at least as up-to-date as receiver’s log, grant
	//    vote.
	RequestVote(input RequestVoteInput) (output RequestVoteOutput, err error)
}

Voter is the API for requesting a vote from a node in the cluster.

  1. On conversion to candidate, start election:

    - Increment currentTerm.

    - Vote for self.

    - Reset election timer.

    - Send RequestVote RPCs to all other servers.

  2. If votes received from majority of servers: become leader

  3. If AppendEntries RPC received from new leader: convert to follower

  4. If election timeout elapses: start new election

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL