consistent

package module
v0.10.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 9, 2022 License: MIT Imports: 6 Imported by: 59

README

consistent

Go Reference Build Status Linter Coverage Go Report Card Mentioned in Awesome Go

This library provides a consistent hashing function which simultaneously achieves both uniformity and consistency.

For detailed information about the concept, you should take a look at the following resources:

Table of Content

Overview

In this package's context, the keys are distributed among partitions and partitions are distributed among members as well.

When you create a new consistent instance or call Add/Remove:

  • The member's name is hashed and inserted into the hash ring,
  • Average load is calculated by the algorithm defined in the paper,
  • Partitions are distributed among members by hashing partition IDs and none of them exceed the average load.

Average load cannot be exceeded. So if all members are loaded at the maximum while trying to add a new member, it panics.

When you want to locate a key by calling LocateKey:

  • The key(byte slice) is hashed,
  • The result of the hash is mod by the number of partitions,
  • The result of this modulo - MOD(hash result, partition count) - is the partition in which the key will be located,
  • Owner of the partition is already determined before calling LocateKey. So it returns the partition owner immediately.

No memory is allocated by consistent except hashing when you want to locate a key.

Note that the number of partitions cannot be changed after creation.

Notable Users

buraksezer/consistent is used at production by the following projects:

Install

With a correctly configured Go environment:

go get github.com/buraksezer/consistent

You will find some useful usage samples in examples folder.

Configuration

type Config struct {
	// Hasher is responsible for generating unsigned, 64 bit hash of provided byte slice.
	Hasher Hasher

	// Keys are distributed among partitions. Prime numbers are good to
	// distribute keys uniformly. Select a big PartitionCount if you have
	// too many keys.
	PartitionCount int

	// Members are replicated on consistent hash ring. This number controls
	// the number each member is replicated on the ring.
	ReplicationFactor int

	// Load is used to calculate average load. See the code, the paper and Google's 
	// blog post to learn about it.
	Load float64
}

Any hash algorithm can be used as hasher which implements Hasher interface. Please take a look at the Sample section for an example.

Usage

LocateKey function finds a member in the cluster for your key:

// With a properly configured and initialized consistent instance
key := []byte("my-key")
member := c.LocateKey(key)

It returns a thread-safe copy of the member you added before.

The second most frequently used function is GetClosestN.

// With a properly configured and initialized consistent instance

key := []byte("my-key")
members, err := c.GetClosestN(key, 2)

This may be useful to find backup nodes to store your key.

Benchmarks

On an early 2015 Macbook:

BenchmarkAddRemove-4     	  100000	     22006 ns/op
BenchmarkLocateKey-4     	 5000000	       252 ns/op
BenchmarkGetClosestN-4   	  500000	      2974 ns/op

Examples

The most basic use of consistent package should be like this. For detailed list of functions, visit godoc.org. More sample code can be found under _examples.

package main

import (
	"fmt"

	"github.com/buraksezer/consistent"
	"github.com/cespare/xxhash"
)

// In your code, you probably have a custom data type 
// for your cluster members. Just add a String function to implement 
// consistent.Member interface.
type myMember string

func (m myMember) String() string {
	return string(m)
}

// consistent package doesn't provide a default hashing function. 
// You should provide a proper one to distribute keys/members uniformly.
type hasher struct{}

func (h hasher) Sum64(data []byte) uint64 {
	// you should use a proper hash function for uniformity.
	return xxhash.Sum64(data)
}

func main() {
	// Create a new consistent instance
	cfg := consistent.Config{
		PartitionCount:    7,
		ReplicationFactor: 20,
		Load:              1.25,
		Hasher:            hasher{},
	}
	c := consistent.New(nil, cfg)

	// Add some members to the consistent hash table.
	// Add function calculates average load and distributes partitions over members
	node1 := myMember("node1.olric.com")
	c.Add(node1)

	node2 := myMember("node2.olric.com")
	c.Add(node2)

	key := []byte("my-key")
	// calculates partition id for the given key
	// partID := hash(key) % partitionCount
	// the partitions are already distributed among members by Add function.
	owner := c.LocateKey(key)
	fmt.Println(owner.String())
	// Prints node2.olric.com
}

Another useful example is _examples/relocation_percentage.go. It creates a consistent object with 8 members and distributes partitions among them. Then adds 9th member, here is the result with a proper configuration and hash function:

bloom:consistent burak$ go run _examples/relocation_percentage.go
partID: 218 moved to node2.olric from node0.olric
partID: 173 moved to node9.olric from node3.olric
partID: 225 moved to node7.olric from node0.olric
partID:  85 moved to node9.olric from node7.olric
partID: 220 moved to node5.olric from node0.olric
partID:  33 moved to node9.olric from node5.olric
partID: 254 moved to node9.olric from node4.olric
partID:  71 moved to node9.olric from node3.olric
partID: 236 moved to node9.olric from node2.olric
partID: 118 moved to node9.olric from node3.olric
partID: 233 moved to node3.olric from node0.olric
partID:  50 moved to node9.olric from node4.olric
partID: 252 moved to node9.olric from node2.olric
partID: 121 moved to node9.olric from node2.olric
partID: 259 moved to node9.olric from node4.olric
partID:  92 moved to node9.olric from node7.olric
partID: 152 moved to node9.olric from node3.olric
partID: 105 moved to node9.olric from node2.olric

6% of the partitions are relocated

Moved partition count is highly dependent on your configuration and quailty of hash function. You should modify the configuration to find an optimum set of configurations for your system.

_examples/load_distribution.go is also useful to understand load distribution. It creates a consistent object with 8 members and locates 1M key. It also calculates average load which cannot be exceeded by any member. Here is the result:

Maximum key count for a member should be around this:  147602
member: node2.olric, key count: 100362
member: node5.olric, key count: 99448
member: node0.olric, key count: 147735
member: node3.olric, key count: 103455
member: node6.olric, key count: 147069
member: node1.olric, key count: 121566
member: node4.olric, key count: 147932
member: node7.olric, key count: 132433

Average load can be calculated by using the following formula:

load := (consistent.AverageLoad() * float64(keyCount)) / float64(config.PartitionCount)

Contributions

Please don't hesitate to fork the project and send a pull request or just e-mail me to ask questions and share ideas.

License

MIT License, - see LICENSE for more details.

Documentation

Overview

Package consistent provides a consistent hashing function with bounded loads. This implementation also adds partitioning logic on top of the original algorithm. For more information about the underlying algorithm, please take a look at https://research.googleblog.com/2017/04/consistent-hashing-with-bounded-loads.html

Example Use:

cfg := consistent.Config{
	PartitionCount:    71,
	ReplicationFactor: 20,
	Load:              1.25,
	Hasher:            hasher{},
}

Now you can create a new Consistent instance. This function can take a list of the members.

c := consistent.New(members, cfg)

In the following sample, you add a new Member to the consistent hash ring. myMember is just a Go struct that implements the Member interface. You should know that modifying the consistent hash ring distributes partitions among members using the algorithm defined on Google Research Blog.

c.Add(myMember)

Remove a member from the consistent hash ring:

c.Remove(member-name)

LocateKey hashes the key and calculates partition ID with this modulo operation: MOD(hash result, partition count) The owner of the partition is already calculated by New/Add/Remove. LocateKey just returns the member that is responsible for the key.

key := []byte("my-key")
member := c.LocateKey(key)

Index

Constants

View Source
const (
	DefaultPartitionCount    int     = 271
	DefaultReplicationFactor int     = 20
	DefaultLoad              float64 = 1.25
)

Variables

View Source
var ErrInsufficientMemberCount = errors.New("insufficient member count")

ErrInsufficientMemberCount represents an error which means there are not enough members to complete the task.

Functions

This section is empty.

Types

type Config

type Config struct {
	// Hasher is responsible for generating unsigned, 64-bit hash of provided byte slice.
	Hasher Hasher

	// Keys are distributed among partitions. Prime numbers are good to
	// distribute keys uniformly. Select a big PartitionCount if you have
	// too many keys.
	PartitionCount int

	// Members are replicated on consistent hash ring. This number means that a member
	// how many times replicated on the ring.
	ReplicationFactor int

	// Load is used to calculate average load. See the code, the paper and Google's blog post to learn about it.
	Load float64
}

Config represents a structure to control consistent package.

type Consistent

type Consistent struct {
	// contains filtered or unexported fields
}

Consistent holds the information about the members of the consistent hash circle.

func New

func New(members []Member, config Config) *Consistent

New creates and returns a new Consistent object.

func (*Consistent) Add

func (c *Consistent) Add(member Member)

Add adds a new member to the consistent hash circle.

func (*Consistent) AverageLoad

func (c *Consistent) AverageLoad() float64

AverageLoad exposes the current average load.

func (*Consistent) FindPartitionID

func (c *Consistent) FindPartitionID(key []byte) int

FindPartitionID returns partition id for given key.

func (*Consistent) GetClosestN

func (c *Consistent) GetClosestN(key []byte, count int) ([]Member, error)

GetClosestN returns the closest N member to a key in the hash ring. This may be useful to find members for replication.

func (*Consistent) GetClosestNForPartition

func (c *Consistent) GetClosestNForPartition(partID, count int) ([]Member, error)

GetClosestNForPartition returns the closest N member for given partition. This may be useful to find members for replication.

func (*Consistent) GetMembers

func (c *Consistent) GetMembers() []Member

GetMembers returns a thread-safe copy of members. If there are no members, it returns an empty slice of Member.

func (*Consistent) GetPartitionOwner

func (c *Consistent) GetPartitionOwner(partID int) Member

GetPartitionOwner returns the owner of the given partition.

func (*Consistent) LoadDistribution

func (c *Consistent) LoadDistribution() map[string]float64

LoadDistribution exposes load distribution of members.

func (*Consistent) LocateKey

func (c *Consistent) LocateKey(key []byte) Member

LocateKey finds a home for given key

func (*Consistent) Remove

func (c *Consistent) Remove(name string)

Remove removes a member from the consistent hash circle.

type Hasher

type Hasher interface {
	Sum64([]byte) uint64
}

Hasher is responsible for generating unsigned, 64-bit hash of provided byte slice. Hasher should minimize collisions (generating same hash for different byte slice) and while performance is also important fast functions are preferable (i.e. you can use FarmHash family).

type Member

type Member interface {
	String() string
}

Member interface represents a member in consistent hash ring.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL