package iss

v0.0.0-...-bd4e2c0
Published: Sep 19, 2023 License: Apache-2.0 Imports: 16 Imported by: 0

README

Insanely Scalable SMR (ISS)

ISS is the first modular algorithm to make leader-driven total order broadcast scale in a robust way. At its interface, ISS is a classic state machine replication (SMR) system that establishes a total order of client requests with typical liveness and safety properties, applicable to any replicated service, such as resilient databases or a blockchain ordering layer. It is a further development of, and successor to, the Mir-BFT protocol (not to be confused with the MirBFT library, within which ISS is implemented; see the description of MirBFT).

ISS achieves scalability without requiring a primary node to periodically decide on the protocol configuration. It multiplexes multiple instances of a leader-driven consensus protocol that operate concurrently and (almost) independently. We abstract away the logic of the underlying consensus protocol and only define an interface, which we call "Sequenced Broadcast" (SB), that such a consensus protocol must use to interact with ISS.

ISS maintains a contiguous log of (batches of) client requests at each node. Each position in the log corresponds to a unique sequence number and ISS agrees on the assignment of a unique request batch to each sequence number. Our goal is to introduce as much parallelism as possible in assigning batches to sequence numbers while avoiding request duplication, i.e., assigning the same request to more than one sequence number. To this end, ISS subdivides the log into non-overlapping segments. Each segment, representing a subset of the log's sequence numbers, corresponds to an independent consensus protocol instance that has its own leader and executes concurrently with other instances.
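
As an illustration of this segmentation (a simplified sketch, not the package's internal code), the following partitions an epoch's sequence numbers among leaders in a round-robin fashion, yielding interleaved, non-overlapping segments:

// Illustration only (not the package's internal code): partitioning an epoch's
// sequence numbers into interleaved, non-overlapping segments, one per leader.
package main

import "fmt"

// segments assigns sequence numbers offset, ..., offset+epochLen-1 to the
// leaders in round-robin order: leader i receives every len(leaders)-th
// sequence number starting at offset+i.
func segments(leaders []string, offset, epochLen int) map[string][]int {
	segs := make(map[string][]int, len(leaders))
	for i := 0; i < epochLen; i++ {
		leader := leaders[i%len(leaders)]
		segs[leader] = append(segs[leader], offset+i)
	}
	return segs
}

func main() {
	// 3 leaders and an epoch of 12 sequence numbers starting at 0:
	// node0 gets {0,3,6,9}, node1 gets {1,4,7,10}, node2 gets {2,5,8,11}.
	for leader, sns := range segments([]string{"node0", "node1", "node2"}, 0, 12) {
		fmt.Println(leader, sns)
	}
}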

To prevent the leaders of two different segments from concurrently proposing the same request, and thus wasting resources, while also preventing malicious leaders from censoring (i.e., not proposing) certain requests, we adopt and generalize the partitioning of the request space introduced by Mir-BFT. At any point in time, ISS assigns a different subset of client requests (that we call a bucket) to each segment. ISS periodically changes this assignment, such that each request is guaranteed to eventually be assigned to a segment with a correct leader.
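
The bucket mechanism can be illustrated by the following simplified sketch (again, not the actual implementation): each request is hashed into one of a fixed number of buckets, and the bucket-to-leader assignment is rotated every epoch so that each bucket is eventually handled by a correct leader.

// Simplified illustration of request buckets; not the package's actual code.
package main

import (
	"fmt"
	"hash/fnv"
)

// bucketOf maps a request, identified here by its client ID and request number,
// to one of numBuckets buckets using a hash of that identifier.
func bucketOf(clientID string, reqNo uint64, numBuckets int) int {
	h := fnv.New32a()
	fmt.Fprintf(h, "%s/%d", clientID, reqNo)
	return int(h.Sum32() % uint32(numBuckets))
}

// leaderOfBucket rotates the bucket-to-leader assignment each epoch, so that
// every bucket is eventually assigned to every leader (and thus, eventually,
// to a correct one).
func leaderOfBucket(bucket, epoch, numLeaders int) int {
	return (bucket + epoch) % numLeaders
}

func main() {
	b := bucketOf("client1", 42, 12)
	fmt.Println("bucket:", b)
	fmt.Println("leader index in epoch 0:", leaderOfBucket(b, 0, 4))
	fmt.Println("leader index in epoch 1:", leaderOfBucket(b, 1, 4))
}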

The figure below shows the high-level architecture of the ISS protocol.

[Figure: High-level architecture of the ISS protocol]

Documentation

Overview

Package iss contains the implementation of the ISS protocol, the new generation of MirBFT. For the details of the protocol, see (TODO). To use ISS, instantiate it by calling `iss.New` and use it as the Protocol module when instantiating a mirbft.Node. A default configuration (to pass, among other arguments, to `iss.New`) can be obtained from `iss.DefaultConfig`.
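
A minimal sketch of this instantiation follows. It is not a complete working program: the import paths, the numeric node ID literals, and the way the logger is obtained are assumptions that may need adjusting to the actual repository layout.

// Minimal sketch only. Import paths and the t.NodeID literals below are
// assumptions about the MirBFT repository layout and types; adjust as needed.
package main

import (
	"github.com/hyperledger-labs/mirbft/pkg/iss"     // path assumed
	"github.com/hyperledger-labs/mirbft/pkg/logging" // path assumed
	t "github.com/hyperledger-labs/mirbft/pkg/types" // path assumed
)

func main() {
	// IDs of all nodes executing the protocol (numeric literals assumed to match t.NodeID).
	membership := []t.NodeID{0, 1, 2, 3}

	// Obtain a default configuration for this membership and validate it.
	config := iss.DefaultConfig(membership)
	if err := iss.CheckConfig(config); err != nil {
		panic(err)
	}

	// Instantiate the ISS protocol module for the node with ID 0.
	var logger logging.Logger // supply a logging.Logger implementation here
	protocol, err := iss.New(membership[0], config, logger)
	if err != nil {
		panic(err)
	}

	// protocol would now be passed as the Protocol module when instantiating a
	// mirbft.Node; the node setup itself is omitted from this sketch.
	_ = protocol
}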

Current status: This package is currently being implemented and is not yet functional.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CheckConfig

func CheckConfig(c *Config) error

CheckConfig checks whether the given configuration satisfies all necessary constraints.

func CheckpointMessage

func CheckpointMessage(epoch t.EpochNr, sn t.SeqNr) *messagepb.Message

func Event

func Event(event *isspb.ISSEvent) *eventpb.Event

func HashOrigin

func HashOrigin(origin *isspb.ISSHashOrigin) *eventpb.HashOrigin

func LogEntryHashOrigin

func LogEntryHashOrigin(logEntrySN t.SeqNr) *eventpb.HashOrigin

func Message

func Message(msg *isspb.ISSMessage) *messagepb.Message

func PbftCommitMessage

func PbftCommitMessage(content *isspbftpb.Commit) *isspb.SBInstanceMessage

func PbftPersistCommit

func PbftPersistCommit(commit *isspbftpb.Commit) *isspb.SBInstanceEvent

func PbftPersistPrepare

func PbftPersistPrepare(prepare *isspbftpb.Prepare) *isspb.SBInstanceEvent

func PbftPersistPreprepare

func PbftPersistPreprepare(preprepare *isspbftpb.Preprepare) *isspb.SBInstanceEvent

func PbftPrepareMessage

func PbftPrepareMessage(content *isspbftpb.Prepare) *isspb.SBInstanceMessage

func PbftPreprepareMessage

func PbftPreprepareMessage(content *isspbftpb.Preprepare) *isspb.SBInstanceMessage

func PbftReqWaitReference

func PbftReqWaitReference(sn t.SeqNr, view t.PBFTViewNr) *isspb.SBReqWaitReference

func PersistCheckpointEvent

func PersistCheckpointEvent(sn t.SeqNr, appSnapshot []byte) *eventpb.Event

func PersistStableCheckpointEvent

func PersistStableCheckpointEvent(stableCheckpoint *isspb.StableCheckpoint) *eventpb.Event

func RetransmitRequestsMessage

func RetransmitRequestsMessage(requests []*requestpb.RequestRef) *messagepb.Message

func SBBatchReadyEvent

func SBBatchReadyEvent(batch *requestpb.Batch, pendingReqsLeft t.NumRequests) *isspb.SBInstanceEvent

func SBCutBatchEvent

func SBCutBatchEvent(maxSize t.NumRequests) *isspb.SBInstanceEvent

func SBDeliverEvent

func SBDeliverEvent(sn t.SeqNr, batch *requestpb.Batch, aborted bool) *isspb.SBInstanceEvent

func SBEvent

func SBEvent(epoch t.EpochNr, instance t.SBInstanceID, event *isspb.SBInstanceEvent) *eventpb.Event

func SBHashOrigin

func SBHashOrigin(epoch t.EpochNr, instance t.SBInstanceID, origin *isspb.SBInstanceHashOrigin) *eventpb.HashOrigin

func SBHashResultEvent

func SBHashResultEvent(digest []byte, origin *isspb.SBInstanceHashOrigin) *isspb.SBInstanceEvent

func SBInitEvent

func SBInitEvent() *isspb.SBInstanceEvent

func SBMessage

func SBMessage(epoch t.EpochNr, instance t.SBInstanceID, msg *isspb.SBInstanceMessage) *messagepb.Message

func SBMessageReceivedEvent

func SBMessageReceivedEvent(message *isspb.SBInstanceMessage, from t.NodeID) *isspb.SBInstanceEvent

func SBPendingRequestsEvent

func SBPendingRequestsEvent(numRequests t.NumRequests) *isspb.SBInstanceEvent

func SBRequestsReady

func SBRequestsReady(ref *isspb.SBReqWaitReference) *isspb.SBInstanceEvent

func SBTickEvent

func SBTickEvent() *isspb.SBInstanceEvent

func SBWaitForRequestsEvent

func SBWaitForRequestsEvent(reference *isspb.SBReqWaitReference, requests []*requestpb.RequestRef) *isspb.SBInstanceEvent

func StableCheckpointEvent

func StableCheckpointEvent(stableCheckpoint *isspb.StableCheckpoint) *eventpb.Event

Types

type CommitLogEntry

type CommitLogEntry struct {
	// Sequence number at which this entry has been ordered.
	Sn t.SeqNr

	// The delivered request batch.
	Batch *requestpb.Batch

	// The digest (hash) of the batch.
	Digest []byte

	// A flag indicating whether this entry is an actual request batch (false)
	// or whether the orderer delivered a special abort value (true).
	Aborted bool

	// In case Aborted is true, this field indicates the ID of the node
	// that is suspected to be the reason for the orderer aborting (usually the leader).
	// This information can be used by the leader selection policy at epoch transition.
	Suspect t.NodeID
}

The CommitLogEntry type represents an entry of the commit log, the final output of the ordering process. Whenever an orderer delivers a batch (or a special abort value), it is inserted into the commit log in the form of a CommitLogEntry.

type Config

type Config struct {

	// The IDs of all nodes that execute the protocol.
	// Must not be empty.
	Membership []t.NodeID

	// The length of an ISS segment, in sequence numbers.
	// This is the number of commitLog entries each orderer needs to output in an epoch.
	// Depending on the number of leaders (and thus orderers), this will result in epochs of different lengths.
	// If set to 0, the EpochLength parameter must be non-zero and will be used to calculate the length of the segments
	// such that their lengths sum up to EpochLength.
	// Must not be negative.
	SegmentLength int

	// The length of an ISS epoch, in sequence numbers.
	// If EpochLength is non-zero, the epoch will always have a fixed
	// length, regardless of the number of leaders.
	// In each epoch, the corresponding segment lengths will be calculated to sum up to EpochLength,
	// potentially resulting in different segment lengths across epochs as well as within an epoch.
	// If set to zero, SegmentLength must be non-zero and will be used directly to set the length of each segment.
	// Must not be negative.
	// TODO: EpochLength is not yet implemented; SegmentLength has to be used.
	EpochLength int

	// The maximal number of requests in a proposed request batch.
	// As soon as the buckets assigned to an orderer contain MaxBatchSize requests,
	// the orderer may decide to immediately propose a new request batch.
	// This is meaningful, for example, for the PBFT orderer.
	// Setting MaxBatchSize to zero signifies no limit on batch size.
	// TODO: Consider making this orderer-specific.
	MaxBatchSize t.NumRequests

	// The maximum number of logical time ticks between two proposals of an orderer, where applicable.
	// For orderers that wait for a request batch to fill before proposing it (e.g. PBFT),
	// this parameter caps the waiting time in order to bound latency.
	// When MaxProposeDelay ticks have elapsed since the last proposal made by an orderer,
	// the orderer proposes a new request batch, even if the batch is not full (or even completely empty).
	// Must not be negative.
	MaxProposeDelay int

	// Total number of buckets used by ISS.
	// In each epoch, these buckets are re-distributed evenly among the orderers.
	// Must be greater than 0.
	NumBuckets int

	// The logic for selecting leader nodes in each epoch.
	// For details see the documentation of the LeaderSelectionPolicy type.
	// ATTENTION: The leader selection policy is stateful!
	// Must not be nil.
	LeaderPolicy LeaderSelectionPolicy

	// Number of logical time ticks to wait until demanding retransmission of missing requests.
	// If a node receives a proposal containing requests that are not in the node's buckets,
	// it cannot accept the proposal.
	// In such a case, the node will wait for RequestNAckTimeout ticks
	// before trying to fetch those requests from other nodes.
	// Must be positive.
	RequestNAckTimeout int

	// Maximal number of bytes used for message backlogging buffers
	// (only message payloads are counted towards MsgBufCapacity).
	// On reception of a message that the node is not yet ready to process
	// (e.g., a message from a future epoch received from another node that already transitioned to that epoch),
	// the message is stored in a buffer for later processing (e.g., when this node also transitions to that epoch).
	// This total buffer capacity is evenly split among multiple buffers, one for each node,
	// so that one misbehaving node cannot exhaust the whole buffer space.
	// The most recently received messages that together do not exceed the capacity are stored.
	// If the capacity is set to 0, all messages that cannot yet be processed are dropped on reception.
	// Must not be negative.
	MsgBufCapacity int

	// View change timeout for the PBFT sub-protocol, in ticks.
	// TODO: Separate this in a sub-group of the ISS config, maybe even use a field of type PBFTConfig in Config.
	PBFTViewChangeBatchTimeout   int
	PBFTViewChangeSegmentTimeout int
}

The Config type defines all the ISS configuration parameters. Note that some fields specify delays in ticks of the logical clock. To obtain real-time delays, these need to be multiplied by the period of the ticker provided to the Node at runtime.
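
For example (a sketch assuming a ticker period of 100 ms, which is an arbitrary example value), a MaxProposeDelay of 10 ticks corresponds to a real-time delay of one second:

// Sketch: converting tick-based configuration values to real-time durations.
// The tick period is whatever ticker the application provides to the Node;
// 100 ms is just an assumed example value.
package main

import (
	"fmt"
	"time"
)

func main() {
	tickPeriod := 100 * time.Millisecond // period of the ticker given to the Node (example value)
	maxProposeDelayTicks := 10           // e.g. Config.MaxProposeDelay

	realDelay := time.Duration(maxProposeDelayTicks) * tickPeriod
	fmt.Println(realDelay) // 1s
}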

func DefaultConfig

func DefaultConfig(membership []t.NodeID) *Config

DefaultConfig returns the default configuration for a given membership. There is no guarantee that this configuration ensures good performance, but it will pass the CheckConfig test. DefaultConfig is intended for use during testing and hello-world examples. A proper deployment is expected to craft a custom configuration, for which DefaultConfig can serve as a starting point.
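
A sketch of such a custom configuration is shown below (reusing the iss and t import aliases from the earlier sketch; all concrete values are arbitrary examples):

// Sketch only: craft a custom configuration by starting from the default one,
// overriding selected fields, and validating the result with CheckConfig.
func customConfig(membership []t.NodeID) (*iss.Config, error) {
	config := iss.DefaultConfig(membership)
	config.SegmentLength = 16               // example value, in sequence numbers
	config.MaxProposeDelay = 4              // example value, in logical clock ticks
	config.NumBuckets = 2 * len(membership) // example choice
	config.LeaderPolicy = &iss.SimpleLeaderPolicy{Membership: membership}

	// Reject parameter combinations that violate the documented constraints.
	if err := iss.CheckConfig(config); err != nil {
		return nil, err
	}
	return config, nil
}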

type ISS

type ISS struct {
	// contains filtered or unexported fields
}

The ISS type represents the ISS protocol module to be used when instantiating a node. The type should not be instantiated directly; only properly initialized values returned by the New() function should be used.

func New

func New(ownID t.NodeID, config *Config, logger logging.Logger) (*ISS, error)

New returns a new initialized instance of the ISS protocol module to be used when instantiating a mirbft.Node. Arguments:

  • ownID: the ID of the node being instantiated with ISS.
  • config: ISS protocol-specific configuration (e.g., number of buckets, batch size, etc.). See the documentation of the Config type for details.
  • logger: Logger the ISS implementation uses to output log messages.

func (*ISS) ApplyEvent

func (iss *ISS) ApplyEvent(event *eventpb.Event) *events.EventList

ApplyEvent receives one event and applies it to the ISS protocol state machine, potentially altering its state and producing a (potentially empty) list of more events to be applied to other modules.

func (*ISS) Status

func (iss *ISS) Status() (s *statuspb.ProtocolStatus, err error)

Status returns a protobuf representation of the current protocol state that can later be printed (TODO: Say how). This functionality is meant mostly for debugging and is *not* meant to provide an interface for serializing and deserializing the whole protocol state.

type LeaderSelectionPolicy

type LeaderSelectionPolicy interface {

	// Leaders returns the (ordered) list of leaders based on the given epoch e and on the state of this policy object.
	Leaders(e t.EpochNr) []t.NodeID

	// Suspect updates the state of the policy object by announcing to it that node `node` has been suspected in epoch `e`.
	Suspect(e t.EpochNr, node t.NodeID)
}

A LeaderSelectionPolicy implements the algorithm for selecting a set of leaders in each ISS epoch. In a nutshell, it gathers information about suspected leaders in past epochs and uses it to calculate the set of leaders for future epochs. Its state can be updated using Suspect(), and the leader set for an epoch is queried using Leaders(). A leader selection policy must be deterministic, i.e., calling Leaders() after the same sequence of Suspect() invocations always returns the same set of leaders at every Node.
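
As an illustration (a sketch that is not part of this package, assuming t aliases the MirBFT types package used in the signatures above), a deterministic policy that excludes nodes suspected in any past epoch, falling back to the full membership if every node has been suspected, could look as follows:

// Sketch of a custom LeaderSelectionPolicy (not part of this package):
// nodes suspected in any past epoch are excluded from the leader set,
// falling back to the full membership if every node has been suspected.
type ExcludeSuspectedPolicy struct {
	Membership []t.NodeID
	suspected  map[t.NodeID]bool
}

// Leaders returns all non-suspected members, in membership order.
// The result depends only on the sequence of Suspect() calls, keeping the policy deterministic.
func (p *ExcludeSuspectedPolicy) Leaders(e t.EpochNr) []t.NodeID {
	leaders := make([]t.NodeID, 0, len(p.Membership))
	for _, node := range p.Membership {
		if !p.suspected[node] {
			leaders = append(leaders, node)
		}
	}
	if len(leaders) == 0 {
		// Never return an empty leader set; fall back to the full membership.
		return append([]t.NodeID{}, p.Membership...)
	}
	return leaders
}

// Suspect records that node has been suspected (the epoch is ignored in this simple sketch).
func (p *ExcludeSuspectedPolicy) Suspect(e t.EpochNr, node t.NodeID) {
	if p.suspected == nil {
		p.suspected = make(map[t.NodeID]bool)
	}
	p.suspected[node] = true
}

A production policy would typically be more elaborate, e.g., re-admitting suspected nodes after a number of epochs, so that a transient fault does not permanently shrink the leader set.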

type PBFTConfig

type PBFTConfig struct {

	// The IDs of all nodes that execute this instance of the protocol.
	// Must not be empty.
	Membership []t.NodeID

	// The maximum number of logical time ticks between two proposals of new batches during normal operation.
	// This parameter caps the waiting time in order to bound latency.
	// When MaxProposeDelay ticks have elapsed since the last proposal,
	// the protocol tries to propose a new request batch, even if the batch is not full (or even completely empty).
	// Must not be negative.
	MaxProposeDelay int

	// Maximal number of bytes used for message backlogging buffers
	// (only message payloads are counted towards MsgBufCapacity).
	// Same as Config.MsgBufCapacity, but used only for one instance of PBFT.
	// Must not be negative.
	MsgBufCapacity int

	// The maximal number of requests in a proposed request batch.
	// As soon as the number of pending requests reaches MaxBatchSize,
	// the PBFT instance may decide to immediately propose a new request batch.
	// Setting MaxBatchSize to zero signifies no limit on batch size.
	MaxBatchSize t.NumRequests

	// Per-batch view change timeout for view 0, in ticks.
	// If no batch is delivered by a PBFT instance within this timeout, the node triggers a view change.
	// With each new view, the timeout doubles (without changing this value).
	ViewChangeBatchTimeout int

	// View change timeout for view 0 for the whole segment, in ticks.
	// If not all batches of the associated segment are delivered by a PBFT instance within this timeout,
	// the node triggers a view change.
	// With each new view, the timeout doubles (without changing this value).
	ViewChangeSegmentTimeout int
}

PBFTConfig holds PBFT-specific configuration parameters used by a concrete instance of PBFT. They are mostly inherited from the ISS configuration at the time of creating the PBFT instance.

type SimpleLeaderPolicy

type SimpleLeaderPolicy struct {
	Membership []t.NodeID
}

The SimpleLeaderPolicy is a trivial leader selection policy. It must be initialized with a set of node IDs and always returns that full set as leaders, regardless of which nodes have been suspected. In other words, with this policy, every node is a leader in every epoch.

func (*SimpleLeaderPolicy) Leaders

func (simple *SimpleLeaderPolicy) Leaders(e t.EpochNr) []t.NodeID

Leaders always returns the whole membership for the SimpleLeaderPolicy. All nodes are always leaders.

func (*SimpleLeaderPolicy) Suspect

func (simple *SimpleLeaderPolicy) Suspect(e t.EpochNr, node t.NodeID)

Suspect does nothing for the SimpleLeaderPolicy.
