chunk

package module v0.4.0
Published: May 26, 2023 License: MIT Imports: 14 Imported by: 10

README

go-btfs-chunker

go-btfs-chunker implements data Splitters for go-btfs.

go-btfs-chunker provides the Splitter interface. BTFS splitters read data from a reader and create "chunks". These chunks are used to build the BTFS DAGs (Merkle trees) and are the base unit for the hashes that BTFS uses to address content.

The package provides a SizeSplitter, which creates chunks of equal size and is used by default in most cases, and a Rabin fingerprint chunker. The Rabin chunker attempts to split data so that the resulting blocks are identical when the data contains repetitive patterns, thus optimizing the resulting DAGs.

Table of Contents

- Install
- Usage
- Contribute
- License

Install

go-btfs-chunker works like a regular Go module:

> go get github.com/bittorrent/go-btfs-chunker

Usage

import "github.com/bittorrent/go-btfs-chunker"

Contribute

PRs accepted.

Small note: If editing the README, please conform to the standard-readme specification.

License

MIT © TRON-US.

Documentation

Overview

Package chunk implements streaming block splitters. Splitters read data from a reader and provide byte slices (chunks). The size and contents of these slices depend on the splitting method used.

Constants

const (
	PrefixForDefault     = "default"
	PrefixForSize        = "size-"
	PrefixForRabin       = "rabin"
	PrefixForReedSolomon = "reed-solomon"

	// DefaultBlockSize is the chunk size that splitters produce (or aim to).
	DefaultBlockSize int64 = 1024 * 256

	// No leaf block should contain more than 1MiB of payload data ( wrapping overhead aside )
	// This effectively mandates the maximum chunk size
	// See discussion at https://github.com/ipfs/go-ipfs-chunker/pull/21#discussion_r369124879 for background
	ChunkSizeLimit int = 1048576
)
const (
	DefaultReedSolomonDataShards   = 10
	DefaultReedSolomonParityShards = 20
	DefaultReedSolomonShardSize    = DefaultBlockSize
)

Variables

var (
	ErrRabinMin = errors.New("rabin min must be greater than 16")
	ErrSize     = errors.New("chunker size must be greater than 0")
	ErrSizeMax  = fmt.Errorf("chunker parameters may not exceed the maximum chunk size of %d", ChunkSizeLimit)
)
var IpfsRabinPoly = chunker.Pol(17437180132763653)

IpfsRabinPoly is the irreducible polynomial of degree 53 used for Rabin fingerprinting.

Functions

func Chan

func Chan(s Splitter) (<-chan []byte, <-chan error)

Chan returns a channel that receives each of the chunks produced by a splitter, along with another one for errors.

func IsReedSolomon

func IsReedSolomon(chunker string) bool

func NewReedSolomonSplitter

func NewReedSolomonSplitter(r io.Reader, numData, numParity, size uint64) (
	*reedSolomonSplitter, error)

NewReedSolomonSplitter takes the number of data and parity shards, plus a shard size for splitting, and returns a reedSolomonSplitter.

Types

type Buzhash

type Buzhash struct {
	// contains filtered or unexported fields
}

func NewBuzhash

func NewBuzhash(r io.Reader) *Buzhash

func (*Buzhash) ChunkSize

func (b *Buzhash) ChunkSize() uint64

ChunkSize returns the chunk size of this Splitter.

func (*Buzhash) MetaData

func (b *Buzhash) MetaData() interface{}

MetaData returns metadata object from this chunker (none).

func (*Buzhash) NextBytes

func (b *Buzhash) NextBytes() ([]byte, error)

func (*Buzhash) Reader

func (b *Buzhash) Reader() io.Reader

func (*Buzhash) SetIsDir

func (b *Buzhash) SetIsDir(v bool)

type MetaSplitter

type MetaSplitter struct {
	// contains filtered or unexported fields
}

func (*MetaSplitter) ChunkSize

func (ms *MetaSplitter) ChunkSize() uint64

ChunkSize returns the chunk size of this Splitter.

func (*MetaSplitter) MetaData

func (ms *MetaSplitter) MetaData() interface{}

MetaData returns metadata object from this chunker (none).

func (*MetaSplitter) NextBytes

func (ms *MetaSplitter) NextBytes() ([]byte, error)

NextBytes produces a new chunk.

func (*MetaSplitter) Reader

func (ms *MetaSplitter) Reader() io.Reader

Reader returns the io.Reader associated with this Splitter.

func (*MetaSplitter) SetIsDir

func (rss *MetaSplitter) SetIsDir(v bool)

type MultiSplitter

type MultiSplitter interface {
	Splitter

	Splitters() []Splitter
}

A MultiSplitter encapsulates multiple splitters, which is useful for concurrent reading of chunks and for specialized DAG-building schemes. Each MultiSplitter also provides a Splitter-compatible interface for reading sequentially (the default Splitter behavior).

type Rabin

type Rabin struct {
	// contains filtered or unexported fields
}

Rabin implements the Splitter interface and splits content with Rabin fingerprints.

func NewRabin

func NewRabin(r io.Reader, avgBlkSize uint64) *Rabin

NewRabin creates a new Rabin splitter with the given average block size.

func NewRabinMinMax

func NewRabinMinMax(r io.Reader, min, avg, max uint64) *Rabin

NewRabinMinMax returns a new Rabin splitter which uses the given min, average and max block sizes.

func (*Rabin) ChunkSize

func (r *Rabin) ChunkSize() uint64

ChunkSize returns the chunk size of this Splitter.

func (*Rabin) MetaData

func (r *Rabin) MetaData() interface{}

MetaData returns metadata object from this chunker (none).

func (*Rabin) NextBytes

func (r *Rabin) NextBytes() ([]byte, error)

NextBytes reads the next bytes from the reader and returns a slice.

func (*Rabin) Reader

func (r *Rabin) Reader() io.Reader

Reader returns the io.Reader associated with this Splitter.

func (*Rabin) SetIsDir

func (rss *Rabin) SetIsDir(v bool)

type RsMetaMap

type RsMetaMap struct {
	NumData   uint64
	NumParity uint64
	FileSize  uint64
	IsDir     bool
}

func GetRsMetaMapFromString

func GetRsMetaMapFromString(str string) (*RsMetaMap, error)

type Splitter

type Splitter interface {
	Reader() io.Reader
	NextBytes() ([]byte, error)
	ChunkSize() uint64
	MetaData() interface{}
	SetIsDir(bool)
}

A Splitter reads bytes from a Reader and creates "chunks" (byte slices) that can be used to build DAG nodes.

func DefaultSplitter

func DefaultSplitter(r io.Reader) Splitter

DefaultSplitter returns a SizeSplitter with the DefaultBlockSize.

func FromString

func FromString(r io.Reader, chunker string) (Splitter, error)

FromString returns a Splitter depending on the given string: it supports "default" (""), "size-{size}", "rabin", "rabin-{blocksize}", "rabin-{min}-{avg}-{max}", "reed-solomon", "reed-solomon-{#data}-{#parity}-{size}" and "buzhash".
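The "size-{size}" form, for example, is a prefix followed by a decimal chunk size. A simplified, self-contained sketch of parsing just that one case (the real FromString also handles "default", the "rabin" variants, "reed-solomon" variants and "buzhash"):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSize handles only the "size-{n}" chunker string, returning the
// requested chunk size. This is an illustration, not the package's
// actual parser.
func parseSize(spec string) (int64, error) {
	if !strings.HasPrefix(spec, "size-") {
		return 0, fmt.Errorf("not a size spec: %q", spec)
	}
	n, err := strconv.ParseInt(strings.TrimPrefix(spec, "size-"), 10, 64)
	if err != nil {
		return 0, err
	}
	if n <= 0 {
		return 0, fmt.Errorf("chunker size must be greater than 0")
	}
	return n, nil
}

func main() {
	// "size-262144" requests 256 KiB chunks, the DefaultBlockSize.
	n, err := parseSize("size-262144")
	fmt.Println(n, err)
}
```

Validating the parsed size against bounds like ChunkSizeLimit, as the error variables above suggest, keeps malformed specs from producing degenerate splitters.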

func NewMetaSplitter

func NewMetaSplitter(r io.Reader, size uint64) Splitter

func NewSizeSplitter

func NewSizeSplitter(r io.Reader, size int64) Splitter

NewSizeSplitter returns a new size-based Splitter with the given block size.

type SplitterGen

type SplitterGen func(r io.Reader) Splitter

SplitterGen is a splitter generator, given a reader.

func MetaSplitterGen

func MetaSplitterGen(size int64) SplitterGen

func SizeSplitterGen

func SizeSplitterGen(size int64) SplitterGen

SizeSplitterGen returns a SplitterGen function which will create a splitter with the given size when called.

Directories

Path Synopsis
This file generates bytehash LUT
