kbucket

package module
v1.0.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jul 21, 2023 License: MIT Imports: 9 Imported by: 0

README

KBucket

Go Report Card Go Reference license

Kademlia DHT K-bucket implementation as a binary tree. Ported from Tristan Slominski's k-bucket.

Installation

$ go get github.com/attilabuti/k-bucket@latest

Usage

For more information, please see the Package Docs.

Overview

A Distributed Hash Table (DHT) is a decentralized distributed system that provides a lookup table similar to a hash table.

k-bucket is an implementation of a storage mechanism for keys within a DHT. It stores contact objects which represent locations and addresses of nodes in the decentralized distributed system. contact objects are typically identified by a SHA-1 hash, however this restriction is lifted in this implementation. Additionally, node ids of different lengths can be compared.

This Kademlia DHT k-bucket implementation is meant to be as minimal as possible. It assumes that contact objects consist only of id. It is useful, and necessary, to attach other properties to a contact. For example, one may want to attach ip and port properties, which allow an application to send IP traffic to the contact. However, this information is extraneous and irrelevant to the operation of a k-bucket.

arbiter function

This k-bucket implementation implements a conflict resolution mechanism using an arbiter function. The purpose of the arbiter is to choose between two contact objects with the same id but perhaps different properties and determine which one should be stored. As the arbiter function returns the actual object to be stored, it does not need to make an either/or choice, but instead could perform some sort of operation and return the result as a new object that would then be stored. See kBucket.update(node, index, contact) for detailed semantics of which contact (incumbent or candidate) is selected.

For example, an arbiter function implementing a VectorClock mechanism would look something like:

// Contact example
contact := Contact{
    Id: []byte("contactId"),
    VectorClock: 0
};

func arbiter(incumbent Contact, candidate Contact) Contact {
	if incumbent.VectorClock > candidate.VectorClock {
		return incumbent
	}

	return candidate
}
Documentation

For more information, please see the Package Docs.

Implementation of a Kademlia DHT k-bucket used for storing contact (peer node) information.

For a step by step example of k-bucket operation you may find the following slideshow useful: Distribute All The Things.

KBucket starts off as a single k-bucket with capacity of k. As contacts are added, once the k+1 contact is added, the k-bucket is split into two k-buckets. The split happens according to the first bit of the contact node id. The k-bucket that would contain the local node id is the "near" k-bucket, and the other one is the "far" k-bucket. The "far" k-bucket is marked as don't split in order to prevent further splitting. The contact nodes that existed are then redistributed along the two new k-buckets and the old k-bucket becomes an inner node within a tree data structure.

As even more contacts are added to the "near" k-bucket, the "near" k-bucket will split again as it becomes full. However, this time it is split along the second bit of the contact node id. Again, the two newly created k-buckets are marked "near" and "far" and the "far" k-bucket is marked as don't split. Again, the contact nodes that existed in the old bucket are redistributed. This continues as long as nodes are being added to the "near" k-bucket, until the number of splits reaches the length of the local node id.

As more contacts are added to the "far" k-bucket and it reaches its capacity, it does not split. Instead, the k-bucket emits a "ping" event (register a listener: emitter.On("kbucket.ping", function (old Contacts, new Contact) {...}); and includes a slice of old contact nodes that it hasn't heard from in a while and requires you to confirm that those contact nodes still respond (literally respond to a PING RPC). If an old contact node still responds, it should be re-added (kBucket.Add(old Contact)) back to the k-bucket. This puts the old contact on the "recently heard from" end of the list of nodes in the k-bucket. If the old contact does not respond, it should be removed (kBucket.Remove(oldContact.Id []byte)) and the new contact being added now has room to be stored (kBucket.Add(new Contact)).

Events
  • kbucket.added

    • newContact Contact: The new contact that was added.
    • Emitted only when "newContact" was added to bucket and it was not stored in the bucket before.
  • kbucket.ping

    • old Contacts: The slice of contacts to ping.
    • new Contact: The new contact to be added if one of old contacts does not respond.
    • Emitted every time a contact is added that would exceed the capacity of a "don't split" k-bucket it belongs to.
  • kbucket.removed

    • contact Contact: The contact that was removed.
    • Emitted when "contact" was removed from the bucket.
  • kbucket.updated

    • old Contact: The contact that was stored prior to the update.
    • new Contact: The new contact that is now stored after the update.
    • Emitted when a previously existing ("previously existing" means "oldContact.id" equals "newContact.id") contact was added to the bucket and it was replaced with "newContact".

Further reading

Issues

Submit the issues if you find any bug or have any suggestion.

Contribution

Fork the repo and submit pull requests.

License

This extension is licensed under the MIT License.

Documentation

Overview

KBucket

Kademlia DHT K-bucket implementation as a binary tree. KBucket was ported from Tristan Slominski's k-bucket: github.com/tristanls/k-bucket

A Distributed Hash Table (DHT) is a decentralized distributed system that provides a lookup table similar to a hash table.

KBucket is an implementation of a storage mechanism for keys within a DHT. It stores Contact objects which represent locations and addresses of nodes in the decentralized distributed system. Contact objects are typically identified by a SHA-1 hash, however this restriction is lifted in this implementation. Additionally, node ids of different lengths can be compared.

This Kademlia DHT k-bucket implementation is meant to be as minimal as possible. It assumes that Contact objects consist only of Id. It is useful, and necessary, to attach other properties to a Contact. For example, one may want to attach ip and port properties, which allow an application to send IP traffic to the Contact. However, this information is extraneous and irrelevant to the operation of a k-bucket.

KBucket events:

kbucket.added
	  	newContact Contact: The new contact that was added.
	Emitted only when "newContact" was added to bucket and it was not stored
	in the bucket before.

kbucket.ping
	  	old Contacts: The slice of contacts to ping.
		new Contact: The new contact to be added if one of old contacts does not respond.
	Emitted every time a contact is added that would exceed the capacity of a
	"don't split" k-bucket it belongs to.

kbucket.removed
	  	contact Contact: The contact that was removed.
	Emitted when "contact" was removed from the bucket.

kbucket.updated
	  	old Contact: The contact that was stored prior to the update.
		new Contact: The new contact that is now stored after the update.
	Emitted when a previously existing ("previously existing" means "oldContact.id"
	equals "newContact.id") contact was added to the bucket and it was replaced with
	"newContact".

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CompareAddrPorts

func CompareAddrPorts(a, b netip.AddrPort) bool

CompareAddrPorts compares two netip.AddrPorts. It will return true if the two AddrPorts are equal, false otherwise.

func GenerateId

func GenerateId() ([]byte, error)

GenerateId generates a random 160-bit ID (SHA-1). It will return an error if the system's secure random number generator fails to function correctly, in which case the caller should not continue.

func GenerateRandomBytes

func GenerateRandomBytes(n int) ([]byte, error)

GenerateRandomBytes returns securely generated random bytes. It will return an error if the system's secure random number generator fails to function correctly, in which case the caller should not continue.

Types

type Contact

type Contact struct {
	Id          []byte         // The node id.
	AddrPort    netip.AddrPort // The address and port of the node.
	SeenAt      time.Time      // SeenAt is the time this node was last seen.
	VectorClock int
	Metadata    map[string]any // Optional satellite data to include with the Contact.
	// contains filtered or unexported fields
}

type Contacts

type Contacts []Contact

type KBucket

type KBucket struct {

	// Optional satellite data to include with the KBucket. Metadata property is
	// guaranteed not be altered, it is provided as an explicit container for users
	// of KBucket to store implementation-specific data.
	Metadata map[string]any
	// contains filtered or unexported fields
}

func NewKBucket

func NewKBucket(options KBucketOptions, emitter *eventemitter.Emitter) (*KBucket, error)

Implementation of a Kademlia DHT k-bucket used for storing contact (peer node) information. NewKBucket creates a new KBucket with the given options.

func (*KBucket) Add

func (b *KBucket) Add(contact Contact) *KBucket

Adds a contact to the KBucket.

func (*KBucket) Clear

func (b *KBucket) Clear()

Clear removes all contacts from the KBucket.

func (*KBucket) Closest

func (b *KBucket) Closest(id []byte, n uint) Contacts

Get the n closest contacts to the provided node id. "Closest" here means: closest according to the XOR metric of the contact node id. Return the maximum of n closest contacts to the node id.

func (*KBucket) Count

func (b *KBucket) Count() int

Count returns the number of contacts in the KBucket.

func (*KBucket) Distance

func (b *KBucket) Distance(fid, sid []byte) int

Default distance function. Finds the XOR distance between first id and second id. Returns the XOR distance between first id and second id.

func (*KBucket) Get

func (b *KBucket) Get(id []byte) Contact

Get a contact by its exact id. Returns an empty Contact if the contact is not found.

func (*KBucket) GetId

func (b *KBucket) GetId() []byte

GetId returns the local node id.

func (*KBucket) Has

func (b *KBucket) Has(id []byte) bool

Has returns true if the contact with the given id is in the KBucket, false otherwise.

func (*KBucket) Remove

func (b *KBucket) Remove(id []byte)

Removes contact with the provided id.

func (*KBucket) Seen

func (b *KBucket) Seen(id []byte) bool

Seen updates the contact's last seen time. Returns true if the contact was found, false otherwise.

func (*KBucket) SetArbiterFn

func (b *KBucket) SetArbiterFn(arbiterFn func(Contact, Contact) Contact)

SetArbiterFn overrides the default arbiter function.

func (*KBucket) SetDistanceFn

func (b *KBucket) SetDistanceFn(distanceFn func([]byte, []byte) int)

SetDistanceFn overrides the default distance function.

func (*KBucket) ToSlice

func (b *KBucket) ToSlice() Contacts

ToSlice returns a slice with all contacts in the KBucket.

type KBucketOptions

type KBucketOptions struct {
	// An optional byte slice representing the local node id. If not provided, a
	// local node id will be created via "GenerateId()". (Default: randomly generated)
	LocalNodeId []byte

	// The number of nodes that a KBucket can contain before being full or split.
	// (Default: 20)
	NodesPerKBucket int

	// The number of nodes to ping when a bucket that should not be split becomes
	// full. KBucket will emit a "kbucket.ping" event that contains "NodesToPing"
	// nodes that have not been contacted the longest. (Default: 3)
	NodesToPing int
}

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL