dht

package module
v0.0.0-...-5a20f31 Latest
Warning

This package is not in the latest version of its module.
Published: Dec 19, 2020 License: MIT Imports: 21 Imported by: 22

README

See the video on YouTube.

README in Chinese

Introduction

DHT implements the BitTorrent DHT protocol in Go.

It has two modes: standard mode and crawling mode. Standard mode follows the BEPs, so you can use it as a standard DHT server. Crawling mode aims to crawl as much metadata as possible and does not follow the standard BEP protocol. With crawling mode, you can build another BTDigg.

bthub.io is a BT search engine based on the crawling mode.

Installation

go get github.com/shiyanhui/dht

Example

Below is a simple spider. More samples are in the sample directory.

package main

import (
    "fmt"

    "github.com/shiyanhui/dht"
)

func main() {
    // NewWire takes (blackListSize, requestQueueSize, workerQueueSize);
    // the sizes used here are illustrative.
    downloader := dht.NewWire(65536, 1024, 256)
    go func() {
        // print each metadata response as it arrives
        for resp := range downloader.Response() {
            fmt.Println(resp.InfoHash, resp.MetadataInfo)
        }
    }()
    go downloader.Run()

    config := dht.NewCrawlConfig()
    config.OnAnnouncePeer = func(infoHash, ip string, port int) {
        // ask the announcing peer for the metadata info
        downloader.Request([]byte(infoHash), ip, port)
    }
    d := dht.New(config)

    // Run starts the DHT node
    d.Run()
}

Download

You can download a compiled demo binary here.

Note

  • The default crawl mode configuration uses about 300 MB of RAM. Adjust MaxNodes and BlackListMaxSize to fit your environment.
  • It currently can't run inside a LAN because of NAT.

TODO

  • NAT Traversal.
  • Implement the full BEP-3.
  • Optimization.

FAQ

Why is it slow compared to other spiders?

There are several possible reasons.

  • DHT aims to implement the standard BitTorrent DHT protocol; it was not designed primarily for crawling the DHT network.
  • NAT traversal issues, if you run the crawler inside a local network.
  • It blocks IPs that look bad, and a good IP may be misjudged.

License

MIT, read more here

Documentation

Overview

Package dht implements the bittorrent dht protocol. For more information see http://www.bittorrent.org/beps/bep_0005.html.

Index

Constants

const (
	// StandardMode follows the standard protocol
	StandardMode = iota
	// CrawlMode for crawling the dht network.
	CrawlMode
)
const (
	// REQUEST represents request message type
	REQUEST = iota
	// DATA represents data message type
	DATA
	// REJECT represents reject message type
	REJECT
)
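The values 0, 1 and 2 match the msg_type codes that BEP 9 (the metadata extension) assigns to request, data and reject messages, so these constants appear to correspond to them. As a rough illustration, the sketch below builds the bencoded payload of such a message by hand. The helper name metadataMessage and the lowercase constants are hypothetical and not part of this package.

```go
package main

import (
	"fmt"
	"strconv"
)

// Local stand-ins for the package constants (hypothetical names).
const (
	request = iota // 0: ask a peer for one metadata piece
	data           // 1: a peer sends one metadata piece
	reject         // 2: a peer refuses the request
)

// metadataMessage builds the bencoded dictionary
// {"msg_type": msgType, "piece": piece} as described in BEP 9.
func metadataMessage(msgType, piece int) string {
	return "d8:msg_typei" + strconv.Itoa(msgType) +
		"e5:piecei" + strconv.Itoa(piece) + "ee"
}

func main() {
	// Request piece 0 of the metadata.
	fmt.Println(metadataMessage(request, 0)) // d8:msg_typei0e5:piecei0ee
}
```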
const (
	// BLOCK is 2 ^ 14
	BLOCK = 16384
	// MaxMetadataSize represents the max metadata size it can accept
	MaxMetadataSize = BLOCK * 1000
	// EXTENDED represents it is an extended message
	EXTENDED = 20
	// HANDSHAKE represents handshake bit
	HANDSHAKE = 0
)
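To make the arithmetic concrete: with 16384-byte blocks, MaxMetadataSize works out to 16,384,000 bytes, that is, at most 1000 pieces. The sketch below shows one way to compute the number of pieces for a given metadata size. The function pieceCount and the range check are assumptions for illustration; the package's own handling of oversized metadata is not documented here.

```go
package main

import "fmt"

// Local copies of the documented values (hypothetical lowercase names).
const (
	block           = 16384        // 2^14 bytes per metadata piece
	maxMetadataSize = block * 1000 // 16,384,000 bytes
)

// pieceCount returns how many block-sized pieces are needed to carry
// size bytes, or an error if size is outside (0, maxMetadataSize].
func pieceCount(size int) (int, error) {
	if size <= 0 || size > maxMetadataSize {
		return 0, fmt.Errorf("metadata size %d out of range", size)
	}
	// Ceiling division: the last piece may be shorter than a full block.
	return (size + block - 1) / block, nil
}

func main() {
	for _, size := range []int{1, block, block + 1, maxMetadataSize} {
		n, err := pieceCount(size)
		fmt.Println(size, n, err)
	}
}
```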

Variables

var (
	// ErrNotReady is the error when DHT is not initialized.
	ErrNotReady = errors.New("dht is not ready")
	// ErrOnGetPeersResponseNotSet is the error returned by dht.GetPeers
	// when the config callback OnGetPeersResponse is not set.
	ErrOnGetPeersResponseNotSet = errors.New("OnGetPeersResponse is not set")
)

Functions

func Decode

func Decode(data []byte) (result interface{}, err error)

Decode decodes a bencoded string to string, int, list or map.

func DecodeDict

func DecodeDict(data []byte, start int) (
	result interface{}, index int, err error)

DecodeDict decodes a map value.

func DecodeInt

func DecodeInt(data []byte, start int) (
	result interface{}, index int, err error)

DecodeInt decodes int value in the data.

func DecodeList

func DecodeList(data []byte, start int) (
	result interface{}, index int, err error)

DecodeList decodes a list value.

func DecodeString

func DecodeString(data []byte, start int) (
	result interface{}, index int, err error)

DecodeString decodes a string in the data. It returns a tuple (decoded result, the end position, error).
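The "decoded result, end position, error" convention lets a caller continue parsing from where the previous value ended. A minimal sketch of that idea for bencoded strings (format `<length>:<bytes>`) follows. The function decodeString is a hypothetical stand-in written for illustration, not this package's implementation, and its error messages are invented.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"strconv"
)

// decodeString reads one bencoded string starting at data[start] and
// returns the string plus the index of the first byte after it.
func decodeString(data []byte, start int) (string, int, error) {
	if start < 0 || start >= len(data) {
		return "", start, errors.New("start out of range")
	}
	colon := bytes.IndexByte(data[start:], ':')
	if colon < 0 {
		return "", start, errors.New("missing ':' after length")
	}
	colon += start
	n, err := strconv.Atoi(string(data[start:colon]))
	if err != nil || n < 0 {
		return "", start, errors.New("invalid length prefix")
	}
	if n > len(data)-colon-1 {
		return "", start, errors.New("string is truncated")
	}
	end := colon + 1 + n
	return string(data[colon+1 : end]), end, nil
}

func main() {
	data := []byte("4:spam3:egg")
	first, next, _ := decodeString(data, 0)
	second, end, _ := decodeString(data, next) // resume at the end position
	fmt.Println(first, next, second, end)      // spam 6 egg 11
}
```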

func Encode

func Encode(data interface{}) string

Encode encodes a string, int, dict or list value to a bencoded string.

func EncodeDict

func EncodeDict(data map[string]interface{}) string

EncodeDict encodes a dict value.

func EncodeInt

func EncodeInt(data int) string

EncodeInt encodes an int value.

func EncodeList

func EncodeList(data []interface{}) string

EncodeList encodes a list value.

func EncodeString

func EncodeString(data string) string

EncodeString encodes a string value.
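For readers unfamiliar with bencoding, the four encoders map onto the four bencode forms: strings become `<length>:<bytes>`, ints become `i<n>e`, lists become `l...e`, and dicts become `d...e`. The sketch below is a self-contained illustration of those rules; the function bencode is hypothetical and is not this package's code. It sorts dictionary keys as the bencoding specification requires; whether EncodeDict does the same is not stated in this documentation.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// bencode encodes a string, int, list or dict value.
// It panics on any other type to keep the sketch short.
func bencode(v interface{}) string {
	switch x := v.(type) {
	case string:
		return strconv.Itoa(len(x)) + ":" + x
	case int:
		return "i" + strconv.Itoa(x) + "e"
	case []interface{}:
		var b strings.Builder
		b.WriteByte('l')
		for _, item := range x {
			b.WriteString(bencode(item))
		}
		b.WriteByte('e')
		return b.String()
	case map[string]interface{}:
		keys := make([]string, 0, len(x))
		for k := range x {
			keys = append(keys, k)
		}
		sort.Strings(keys) // the spec requires sorted keys
		var b strings.Builder
		b.WriteByte('d')
		for _, k := range keys {
			b.WriteString(bencode(k))
			b.WriteString(bencode(x[k]))
		}
		b.WriteByte('e')
		return b.String()
	default:
		panic(fmt.Sprintf("unsupported type %T", v))
	}
}

func main() {
	fmt.Println(bencode("spam"))                                          // 4:spam
	fmt.Println(bencode(42))                                              // i42e
	fmt.Println(bencode([]interface{}{"a", 1}))                           // l1:ai1ee
	fmt.Println(bencode(map[string]interface{}{"y": "q", "t": "aa"})) // d1:t2:aa1:y1:qe
}
```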

func ParseKey

func ParseKey(data map[string]interface{}, key string, t string) error

ParseKey parses the key in dict data. `t` is type of the keyed value. It's one of "int", "string", "map", "list".

func ParseKeys

func ParseKeys(data map[string]interface{}, pairs [][]string) error

ParseKeys parses keys. It just wraps ParseKey.
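The sketch below illustrates the kind of check these two functions describe: confirm that a key exists in a decoded dict and that its value has the expected type. The functions parseKey and parseKeys are hypothetical reimplementations for illustration only. The assumption that each entry of pairs is a {key, type} pair, and all error messages, are guesses rather than documented behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// parseKey checks that data[key] exists and has the type named by t,
// where t is one of "int", "string", "map", "list".
func parseKey(data map[string]interface{}, key, t string) error {
	val, ok := data[key]
	if !ok {
		return errors.New("missing key " + key)
	}
	switch t {
	case "string":
		_, ok = val.(string)
	case "int":
		_, ok = val.(int)
	case "map":
		_, ok = val.(map[string]interface{})
	case "list":
		_, ok = val.([]interface{})
	default:
		return errors.New("unknown type name " + t)
	}
	if !ok {
		return fmt.Errorf("key %q is not of type %s", key, t)
	}
	return nil
}

// parseKeys applies parseKey to every {key, type} pair (assumed layout).
func parseKeys(data map[string]interface{}, pairs [][]string) error {
	for _, p := range pairs {
		if len(p) != 2 {
			return errors.New("each pair must be {key, type}")
		}
		if err := parseKey(data, p[0], p[1]); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	msg := map[string]interface{}{"t": "aa", "y": "q", "a": map[string]interface{}{}}
	fmt.Println(parseKeys(msg, [][]string{{"t", "string"}, {"y", "string"}, {"a", "map"}}))
	fmt.Println(parseKey(msg, "t", "int"))
}
```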

Types

type Config

type Config struct {
	// in mainline dht, k = 8
	K int
	// for crawling mode, we put all nodes in one bucket, so KBucketSize may
	// not be K
	KBucketSize int
	// candidates are udp, udp4, udp6
	Network string
	// format is `ip:port`
	Address string
	// the prime nodes through which we can join in dht network
	PrimeNodes []string
	// the kbucket expired duration
	KBucketExpiredAfter time.Duration
	// the node expired duration
	NodeExpriedAfter time.Duration
	// how often it checks whether a bucket has expired
	CheckKBucketPeriod time.Duration
	// peer token expired duration
	TokenExpiredAfter time.Duration
	// the max transaction id
	MaxTransactionCursor uint64
	// how many nodes routing table can hold
	MaxNodes int
	// callback when got get_peers request
	OnGetPeers func(string, string, int)
	// callback when receive get_peers response
	OnGetPeersResponse func(string, *Peer)
	// callback when got announce_peer request
	OnAnnouncePeer func(string, string, int)
	// blocked ips
	BlockedIPs []string
	// blacklist size
	BlackListMaxSize int
	// StandardMode or CrawlMode
	Mode int
	// how many times it retries when a send fails
	Try int
	// the max number of packets waiting to be handled
	PacketJobLimit int
	// the max number of packet handler workers
	PacketWorkerLimit int
	// the number of nodes to refresh in a kbucket
	RefreshNodeNum int
}

Config represents the configuration of dht.

func NewCrawlConfig

func NewCrawlConfig() *Config

NewCrawlConfig returns a config in crawling mode.

func NewStandardConfig

func NewStandardConfig() *Config

NewStandardConfig returns a Config pointer with default values.

type DHT

type DHT struct {
	*Config

	Ready bool
	// contains filtered or unexported fields
}

DHT represents a DHT node.

func New

func New(config *Config) *DHT

New returns a DHT pointer. If config is nil, then config will be set to the default config.

func (*DHT) GetPeers

func (dht *DHT) GetPeers(infoHash string) error

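GetPeers looks up peers who have announced having infoHash. The peers are delivered through the OnGetPeersResponse callback; the return value only reports errors, such as ErrNotReady or ErrOnGetPeersResponseNotSet.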

func (*DHT) IsCrawlMode

func (dht *DHT) IsCrawlMode() bool

IsCrawlMode returns whether mode is CrawlMode.

func (*DHT) IsStandardMode

func (dht *DHT) IsStandardMode() bool

IsStandardMode returns whether mode is StandardMode.

func (*DHT) Run

func (dht *DHT) Run()

Run starts the dht.

type Peer

type Peer struct {
	IP   net.IP
	Port int
	// contains filtered or unexported fields
}

Peer represents a peer contact.

func (*Peer) CompactIPPortInfo

func (p *Peer) CompactIPPortInfo() string

CompactIPPortInfo returns "Compact IP-address/port info". See http://www.bittorrent.org/beps/bep_0005.html.
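In BEP 5, compact IP-address/port info is a 6-byte string: the 4-byte IPv4 address followed by the 2-byte port, both in network byte order. The sketch below shows that layout using only the standard library. The function compactIPPort is a hypothetical helper, not the method above, and it handles IPv4 only.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net"
)

// compactIPPort packs an IPv4 address and a port into the 6-byte
// compact form: 4 address bytes, then the port in big-endian order.
func compactIPPort(ip string, port int) (string, error) {
	v4 := net.ParseIP(ip).To4()
	if v4 == nil {
		return "", errors.New("not an IPv4 address: " + ip)
	}
	if port < 0 || port > 65535 {
		return "", errors.New("port out of range")
	}
	buf := make([]byte, 6)
	copy(buf, v4)
	binary.BigEndian.PutUint16(buf[4:], uint16(port))
	return string(buf), nil
}

func main() {
	s, err := compactIPPort("1.2.3.4", 6881)
	fmt.Printf("%x %v\n", s, err) // 010203041ae1 <nil>
}
```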

type Request

type Request struct {
	InfoHash []byte
	IP       string
	Port     int
}

Request represents the request context.

type Response

type Response struct {
	Request
	MetadataInfo []byte
}

Response contains the request context and the metadata info.

type Wire

type Wire struct {
	// contains filtered or unexported fields
}

Wire represents the wire protocol.

func NewWire

func NewWire(blackListSize, requestQueueSize, workerQueueSize int) *Wire

NewWire returns a Wire pointer.

  • blackListSize: the blacklist size
  • requestQueueSize: the max number of requests it can buffer
  • workerQueueSize: the max number of downloading worker goroutines

func (*Wire) Request

func (wire *Wire) Request(infoHash []byte, ip string, port int)

Request pushes the request to the queue.

func (*Wire) Response

func (wire *Wire) Response() <-chan Response

Response returns a chan of Response.

func (*Wire) Run

func (wire *Wire) Run()

Run starts the peer wire protocol.

Directories

Path Synopsis
sample
