reedsolomon: github.com/klauspost/reedsolomon Index | Examples | Files

package reedsolomon

import "github.com/klauspost/reedsolomon"

Package reedsolomon enables Erasure Coding in Go

For usage and examples, see https://github.com/klauspost/reedsolomon

Index

Examples

Package Files

galois.go galoisAvx512_amd64.go galois_amd64.go inversion_tree.go matrix.go options.go reedsolomon.go streaming.go

Variables

var ErrInvShardNum = errors.New("cannot create Encoder with zero or less data/parity shards")

ErrInvShardNum will be returned by New, if you attempt to create an Encoder where either data or parity shards is zero or less.

var ErrInvalidInput = errors.New("invalid input")

ErrInvalidInput is returned if invalid input parameter of Update.

var ErrMaxShardNum = errors.New("cannot create Encoder with more than 256 data+parity shards")

ErrMaxShardNum will be returned by New, if you attempt to create an Encoder where data and parity shards are bigger than the order of GF(2^8).

var ErrReconstructMismatch = errors.New("valid shards and fill shards are mutually exclusive")

ErrReconstructMismatch is returned by the StreamEncoder, if you supply "valid" and "fill" streams on the same index. Therefore it is impossible to see if you consider the shard valid or would like to have it reconstructed.

var ErrReconstructRequired = errors.New("reconstruction required as one or more required data shards are nil")

ErrReconstructRequired is returned if too few data shards are intact and a reconstruction is required before you can successfully join the shards.

var ErrShardNoData = errors.New("no shard data")

ErrShardNoData will be returned if there are no shards, or if the length of all shards is zero.

var ErrShardSize = errors.New("shard sizes do not match")

ErrShardSize is returned if shard length isn't the same for all shards.

var ErrShortData = errors.New("not enough data to fill the number of requested shards")

ErrShortData will be returned by Split(), if there isn't enough data to fill the number of shards.

var ErrTooFewShards = errors.New("too few shards given")

ErrTooFewShards is returned if too few shards where given to Encode/Verify/Reconstruct/Update. It will also be returned from Reconstruct if there were too few shards to reconstruct the missing data.

type Encoder Uses

type Encoder interface {
    // Encode parity for a set of data shards.
    // Input is 'shards' containing data shards followed by parity shards.
    // The number of shards must match the number given to New().
    // Each shard is a byte array, and they must all be the same size.
    // The parity shards will always be overwritten and the data shards
    // will remain the same, so it is safe for you to read from the
    // data shards while this is running.
    Encode(shards [][]byte) error

    // Verify returns true if the parity shards contain correct data.
    // The data is the same format as Encode. No data is modified, so
    // you are allowed to read from data while this is running.
    Verify(shards [][]byte) (bool, error)

    // Reconstruct will recreate the missing shards if possible.
    //
    // Given a list of shards, some of which contain data, fills in the
    // ones that don't have data.
    //
    // The length of the array must be equal to the total number of shards.
    // You indicate that a shard is missing by setting it to nil or zero-length.
    // If a shard is zero-length but has sufficient capacity, that memory will
    // be used, otherwise a new []byte will be allocated.
    //
    // If there are too few shards to reconstruct the missing
    // ones, ErrTooFewShards will be returned.
    //
    // The reconstructed shard set is complete, but integrity is not verified.
    // Use the Verify function to check if data set is ok.
    Reconstruct(shards [][]byte) error

    // ReconstructData will recreate any missing data shards, if possible.
    //
    // Given a list of shards, some of which contain data, fills in the
    // data shards that don't have data.
    //
    // The length of the array must be equal to Shards.
    // You indicate that a shard is missing by setting it to nil or zero-length.
    // If a shard is zero-length but has sufficient capacity, that memory will
    // be used, otherwise a new []byte will be allocated.
    //
    // If there are too few shards to reconstruct the missing
    // ones, ErrTooFewShards will be returned.
    //
    // As the reconstructed shard set may contain missing parity shards,
    // calling the Verify function is likely to fail.
    ReconstructData(shards [][]byte) error

    // Update parity is use for change a few data shards and update it's parity.
    // Input 'newDatashards' containing data shards changed.
    // Input 'shards' containing old data shards (if data shard not changed, it can be nil) and old parity shards.
    // new parity shards will in shards[DataShards:]
    // Update is very useful if  DataShards much larger than ParityShards and changed data shards is few. It will
    // faster than Encode and not need read all data shards to encode.
    Update(shards [][]byte, newDatashards [][]byte) error

    // Split a data slice into the number of shards given to the encoder,
    // and create empty parity shards.
    //
    // The data will be split into equally sized shards.
    // If the data size isn't dividable by the number of shards,
    // the last shard will contain extra zeros.
    //
    // There must be at least 1 byte otherwise ErrShortData will be
    // returned.
    //
    // The data will not be copied, except for the last shard, so you
    // should not modify the data of the input slice afterwards.
    Split(data []byte) ([][]byte, error)

    // Join the shards and write the data segment to dst.
    //
    // Only the data shards are considered.
    // You must supply the exact output size you want.
    // If there are to few shards given, ErrTooFewShards will be returned.
    // If the total data size is less than outSize, ErrShortData will be returned.
    Join(dst io.Writer, shards [][]byte, outSize int) error
}

Encoder is an interface to encode Reed-Salomon parity sets for your data.

Simple example of how to use all functions of the Encoder. Note that all error checks have been removed to keep it short.

Code:

// Create some sample data
var data = make([]byte, 250000)
fillRandom(data)

// Create an encoder with 17 data and 3 parity slices.
enc, _ := reedsolomon.New(17, 3)

// Split the data into shards
shards, _ := enc.Split(data)

// Encode the parity set
_ = enc.Encode(shards)

// Verify the parity set
ok, _ := enc.Verify(shards)
if ok {
    fmt.Println("ok")
}

// Delete two shards
shards[10], shards[11] = nil, nil

// Reconstruct the shards
_ = enc.Reconstruct(shards)

// Verify the data set
ok, _ = enc.Verify(shards)
if ok {
    fmt.Println("ok")
}

Output:

ok
ok

This demonstrates that shards can be arbitrary sliced and merged and still remain valid.

Code:

// Create some sample data
var data = make([]byte, 250000)
fillRandom(data)

// Create 5 data slices of 50000 elements each
enc, _ := reedsolomon.New(5, 3)
shards, _ := enc.Split(data)
err := enc.Encode(shards)
if err != nil {
    panic(err)
}

// Check that it verifies
ok, err := enc.Verify(shards)
if ok && err == nil {
    fmt.Println("encode ok")
}

// Split the data set of 50000 elements into two of 25000
splitA := make([][]byte, 8)
splitB := make([][]byte, 8)

// Merge into a 100000 element set
merged := make([][]byte, 8)

// Split/merge the shards
for i := range shards {
    splitA[i] = shards[i][:25000]
    splitB[i] = shards[i][25000:]

    // Concencate it to itself
    merged[i] = append(make([]byte, 0, len(shards[i])*2), shards[i]...)
    merged[i] = append(merged[i], shards[i]...)
}

// Each part should still verify as ok.
ok, err = enc.Verify(shards)
if ok && err == nil {
    fmt.Println("splitA ok")
}

ok, err = enc.Verify(splitB)
if ok && err == nil {
    fmt.Println("splitB ok")
}

ok, err = enc.Verify(merged)
if ok && err == nil {
    fmt.Println("merge ok")
}

Output:

encode ok
splitA ok
splitB ok
merge ok

This demonstrates that shards can xor'ed and still remain a valid set.

The xor value must be the same for element 'n' in each shard, except if you xor with a similar sized encoded shard set.

Code:

// Create some sample data
var data = make([]byte, 25000)
fillRandom(data)

// Create 5 data slices of 5000 elements each
enc, _ := reedsolomon.New(5, 3)
shards, _ := enc.Split(data)
err := enc.Encode(shards)
if err != nil {
    panic(err)
}

// Check that it verifies
ok, err := enc.Verify(shards)
if !ok || err != nil {
    fmt.Println("falied initial verify", err)
}

// Create an xor'ed set
xored := make([][]byte, 8)

// We xor by the index, so you can see that the xor can change,
// It should however be constant vertically through your slices.
for i := range shards {
    xored[i] = make([]byte, len(shards[i]))
    for j := range xored[i] {
        xored[i][j] = shards[i][j] ^ byte(j&0xff)
    }
}

// Each part should still verify as ok.
ok, err = enc.Verify(xored)
if ok && err == nil {
    fmt.Println("verified ok after xor")
}

Output:

verified ok after xor

func New Uses

func New(dataShards, parityShards int, opts ...Option) (Encoder, error)

New creates a new encoder and initializes it to the number of data shards and parity shards that you want to use. You can reuse this encoder. Note that the maximum number of total shards is 256. If no options are supplied, default options are used.

type Option Uses

type Option func(*options)

Option allows to override processing parameters.

func WithAutoGoroutines Uses

func WithAutoGoroutines(shardSize int) Option

WithAutoGoroutines will adjust the number of goroutines for optimal speed with a specific shard size. Send in the shard size you expect to send. Other shard sizes will work, but may not run at the optimal speed. Overwrites WithMaxGoroutines. If shardSize <= 0, it is ignored.

func WithCauchyMatrix Uses

func WithCauchyMatrix() Option

WithCauchyMatrix will make the encoder build a Cauchy style matrix. The output of this is not compatible with the standard output. A Cauchy matrix is faster to generate. This does not affect data throughput, but will result in slightly faster start-up time.

func WithMaxGoroutines Uses

func WithMaxGoroutines(n int) Option

WithMaxGoroutines is the maximum number of goroutines number for encoding & decoding. Jobs will be split into this many parts, unless each goroutine would have to process less than minSplitSize bytes (set with WithMinSplitSize). For the best speed, keep this well above the GOMAXPROCS number for more fine grained scheduling. If n <= 0, it is ignored.

func WithMinSplitSize Uses

func WithMinSplitSize(n int) Option

WithMinSplitSize is the minimum encoding size in bytes per goroutine. See WithMaxGoroutines on how jobs are split. If n <= 0, it is ignored.

func WithPAR1Matrix Uses

func WithPAR1Matrix() Option

WithPAR1Matrix causes the encoder to build the matrix how PARv1 does. Note that the method they use is buggy, and may lead to cases where recovery is impossible, even if there are enough parity shards.

type StreamEncoder Uses

type StreamEncoder interface {
    // Encode parity shards for a set of data shards.
    //
    // Input is 'shards' containing readers for data shards followed by parity shards
    // io.Writer.
    //
    // The number of shards must match the number given to NewStream().
    //
    // Each reader must supply the same number of bytes.
    //
    // The parity shards will be written to the writer.
    // The number of bytes written will match the input size.
    //
    // If a data stream returns an error, a StreamReadError type error
    // will be returned. If a parity writer returns an error, a
    // StreamWriteError will be returned.
    Encode(data []io.Reader, parity []io.Writer) error

    // Verify returns true if the parity shards contain correct data.
    //
    // The number of shards must match the number total data+parity shards
    // given to NewStream().
    //
    // Each reader must supply the same number of bytes.
    // If a shard stream returns an error, a StreamReadError type error
    // will be returned.
    Verify(shards []io.Reader) (bool, error)

    // Reconstruct will recreate the missing shards if possible.
    //
    // Given a list of valid shards (to read) and invalid shards (to write)
    //
    // You indicate that a shard is missing by setting it to nil in the 'valid'
    // slice and at the same time setting a non-nil writer in "fill".
    // An index cannot contain both non-nil 'valid' and 'fill' entry.
    // If both are provided 'ErrReconstructMismatch' is returned.
    //
    // If there are too few shards to reconstruct the missing
    // ones, ErrTooFewShards will be returned.
    //
    // The reconstructed shard set is complete, but integrity is not verified.
    // Use the Verify function to check if data set is ok.
    Reconstruct(valid []io.Reader, fill []io.Writer) error

    // Split a an input stream into the number of shards given to the encoder.
    //
    // The data will be split into equally sized shards.
    // If the data size isn't dividable by the number of shards,
    // the last shard will contain extra zeros.
    //
    // You must supply the total size of your input.
    // 'ErrShortData' will be returned if it is unable to retrieve the
    // number of bytes indicated.
    Split(data io.Reader, dst []io.Writer, size int64) (err error)

    // Join the shards and write the data segment to dst.
    //
    // Only the data shards are considered.
    //
    // You must supply the exact output size you want.
    // If there are to few shards given, ErrTooFewShards will be returned.
    // If the total data size is less than outSize, ErrShortData will be returned.
    Join(dst io.Writer, shards []io.Reader, outSize int64) error
}

StreamEncoder is an interface to encode Reed-Salomon parity sets for your data. It provides a fully streaming interface, and processes data in blocks of up to 4MB.

For small shard sizes, 10MB and below, it is recommended to use the in-memory interface, since the streaming interface has a start up overhead.

For all operations, no readers and writers should not assume any order/size of individual reads/writes.

For usage examples, see "stream-encoder.go" and "streamdecoder.go" in the examples folder.

This will show a simple stream encoder where we encode from a []io.Reader which contain a reader for each shard.

Input and output can be exchanged with files, network streams or what may suit your needs.

Code:

dataShards := 5
parityShards := 2

// Create a StreamEncoder with the number of data and
// parity shards.
rs, err := reedsolomon.NewStream(dataShards, parityShards)
if err != nil {
    log.Fatal(err)
}

shardSize := 50000

// Create input data shards.
input := make([][]byte, dataShards)
for s := range input {
    input[s] = make([]byte, shardSize)
    fillRandom(input[s])
}

// Convert our buffers to io.Readers
readers := make([]io.Reader, dataShards)
for i := range readers {
    readers[i] = io.Reader(bytes.NewBuffer(input[i]))
}

// Create our output io.Writers
out := make([]io.Writer, parityShards)
for i := range out {
    out[i] = ioutil.Discard
}

// Encode from input to output.
err = rs.Encode(readers, out)
if err != nil {
    log.Fatal(err)
}

fmt.Println("ok")

Output:

ok

func NewStream Uses

func NewStream(dataShards, parityShards int, o ...Option) (StreamEncoder, error)

NewStream creates a new encoder and initializes it to the number of data shards and parity shards that you want to use. You can reuse this encoder. Note that the maximum number of data shards is 256.

func NewStreamC Uses

func NewStreamC(dataShards, parityShards int, conReads, conWrites bool, o ...Option) (StreamEncoder, error)

NewStreamC creates a new encoder and initializes it to the number of data shards and parity shards given.

This functions as 'NewStream', but allows you to enable CONCURRENT reads and writes.

type StreamReadError Uses

type StreamReadError struct {
    Err    error // The error
    Stream int   // The stream number on which the error occurred
}

StreamReadError is returned when a read error is encountered that relates to a supplied stream. This will allow you to find out which reader has failed.

func (StreamReadError) Error Uses

func (s StreamReadError) Error() string

Error returns the error as a string

func (StreamReadError) String Uses

func (s StreamReadError) String() string

String returns the error as a string

type StreamWriteError Uses

type StreamWriteError struct {
    Err    error // The error
    Stream int   // The stream number on which the error occurred
}

StreamWriteError is returned when a write error is encountered that relates to a supplied stream. This will allow you to find out which reader has failed.

func (StreamWriteError) Error Uses

func (s StreamWriteError) Error() string

Error returns the error as a string

func (StreamWriteError) String Uses

func (s StreamWriteError) String() string

String returns the error as a string

Package reedsolomon imports 9 packages (graph) and is imported by 66 packages. Updated 2019-09-28. Refresh now. Tools for package owners.