seekable

v0.8.2
Published: Jun 23, 2023 License: MIT Imports: 11 Imported by: 4

README

ZSTD seekable compression format implementation in Go

Seekable ZSTD compression format implemented in Golang.

This library provides a random access reader (using uncompressed file offsets) for ZSTD-compressed streams. This can be used for creating transparent compression layers. Coupled with Content Defined Chunking (CDC), it can also be used as a robust de-duplication layer. It was forked from https://github.com/SaveTheRbtz/zstd-seekable-format-go, made 32-bit safe, and stripped of non-required libraries and build infrastructure.

Installation

go get -u gitlab.com/rackn/seekable-zstd

Using the seekable format

Writing is done through the Encoder returned by NewWriter, which implements io.WriteCloser:

import (
	"github.com/klauspost/compress/zstd"
	seekable "gitlab.com/rackn/seekable-zstd"
)

enc, err := zstd.NewWriter(nil, zstd.WithEncoderLevel(zstd.SpeedFastest))
if err != nil {
	log.Fatal(err)
}
defer enc.Close()

w, err := seekable.NewWriter(f, enc)
if err != nil {
	log.Fatal(err)
}

// Write data in chunks.
for _, b := range [][]byte{[]byte("Hello"), []byte(" "), []byte("World!")} {
	_, err = w.Write(b)
	if err != nil {
		log.Fatal(err)
	}
}

// Close and flush seek table.
err = w.Close()
if err != nil {
	log.Fatal(err)
}

NB! Do not forget to call Close since it is responsible for flushing the seek table.
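Because Close is what flushes the seek table, its error is worth propagating rather than dropping in a bare defer. The following is a minimal sketch of one way to do that with a named return value. It depends only on io.WriteCloser; writeAll and flakyCloser are hypothetical names introduced for illustration and are not part of this package.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// writeAll writes every chunk to w and always closes it, reporting the
// first error from either Write or Close.
func writeAll(w io.WriteCloser, chunks [][]byte) (err error) {
	defer func() {
		// Close flushes the seek table, so its error must not be lost.
		if cerr := w.Close(); err == nil {
			err = cerr
		}
	}()
	for _, c := range chunks {
		if _, err = w.Write(c); err != nil {
			return err
		}
	}
	return nil
}

// flakyCloser is a hypothetical stand-in for a writer whose Close fails.
type flakyCloser struct{ closed bool }

func (f *flakyCloser) Write(p []byte) (int, error) { return len(p), nil }

func (f *flakyCloser) Close() error {
	f.closed = true
	return errors.New("seek table not flushed")
}

func main() {
	fc := &flakyCloser{}
	err := writeAll(fc, [][]byte{[]byte("Hello"), []byte(" World!")})
	fmt.Println(fc.closed, err)
}
```

Since *Encoder has both Write and Close, it should be usable wherever this sketch expects an io.WriteCloser.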

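Reading can be done through the ReaderAt interface of the Decoder:

dec, err := zstd.NewReader(nil)
if err != nil {
	log.Fatal(err)
}
defer dec.Close()

// NewDecoder also takes the total size of f, as in the package example.
sz, err := f.Seek(0, io.SeekEnd)
if err != nil {
	log.Fatal(err)
}

r, err := seekable.NewDecoder(f, sz, dec)
if err != nil {
	log.Fatal(err)
}
defer r.Close()

ello := make([]byte, 4)
// ReaderAt
if _, err := r.ReadAt(ello, 1); err != nil {
	log.Fatal(err)
}
if !bytes.Equal(ello, []byte("ello")) {
	log.Fatalf("%s != ello", ello)
}

Or through the io.ReadSeeker returned by ReadSeeker:

world := make([]byte, 5)
rs := r.ReadSeeker()
// Seeker
if _, err := rs.Seek(-6, io.SeekEnd); err != nil {
	log.Fatal(err)
}
// Reader
if _, err := rs.Read(world); err != nil {
	log.Fatal(err)
}
if !bytes.Equal(world, []byte("World")) {
	log.Fatalf("%s != World", world)
}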

Seekable format utilizes ZSTD skippable frames so it is a valid ZSTD stream:

// Standard ZSTD Reader
f.Seek(0, io.SeekStart)
dec, err := zstd.NewReader(f)
if err != nil {
	log.Fatal(err)
}
defer dec.Close()

all, err := io.ReadAll(dec)
if err != nil {
	log.Fatal(err)
}
if !bytes.Equal(all, []byte("Hello World!")) {
	log.Fatalf("%+v != Hello World!", all)
}
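To make the skippable-frame point concrete, here is a small sketch of the generic skippable frame layout described in RFC 8878: a 4-byte little-endian magic number in the range 0x184D2A50 to 0x184D2A5F, a 4-byte little-endian payload size, then the payload. It does not reproduce this package's seek table encoding; skippableFrame, skipFrame, and the payload contents are hypothetical and exist only for illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// skippableFrame wraps payload in a ZSTD skippable frame.
// nibble selects the low 4 bits of the magic number (0x0 to 0xF).
func skippableFrame(nibble uint8, payload []byte) []byte {
	out := make([]byte, 8, 8+len(payload))
	binary.LittleEndian.PutUint32(out[0:4], 0x184D2A50|uint32(nibble&0x0F))
	binary.LittleEndian.PutUint32(out[4:8], uint32(len(payload)))
	return append(out, payload...)
}

// skipFrame returns the total number of bytes a standard decoder would
// jump over if b starts with a complete skippable frame.
func skipFrame(b []byte) (int64, error) {
	if len(b) < 8 {
		return 0, errors.New("short frame header")
	}
	if binary.LittleEndian.Uint32(b[0:4])&0xFFFFFFF0 != 0x184D2A50 {
		return 0, errors.New("not a skippable frame")
	}
	// Use int64 so the sum cannot overflow on 32-bit platforms.
	total := int64(8) + int64(binary.LittleEndian.Uint32(b[4:8]))
	if int64(len(b)) < total {
		return 0, errors.New("truncated frame")
	}
	return total, nil
}

func main() {
	frame := skippableFrame(0xE, []byte("opaque metadata"))
	n, err := skipFrame(frame)
	fmt.Println(n == int64(len(frame)), err)
}
```

A decoder that does not understand the payload only needs the 8-byte header to skip it, which is why a stream with such a frame appended still reads as plain ZSTD.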

Documentation

Overview

Package seekable adds the ability to create ZSTD files in seekable format and to randomly access them using uncompressed offsets.

Example
package main

import (
	"fmt"
	"io"
	"log"
	"os"

	"github.com/klauspost/compress/zstd"

	seekable "gitlab.com/rackn/seekable-zstd"
)

func main() {
	f, err := os.CreateTemp("", "example")
	if err != nil {
		log.Fatal(err)
	}
	defer os.Remove(f.Name())

	enc, err := zstd.NewWriter(nil, zstd.WithEncoderLevel(zstd.SpeedFastest))
	if err != nil {
		log.Fatal(err)
	}
	defer enc.Close()

	w, err := seekable.NewWriter(f, enc)
	if err != nil {
		log.Fatal(err)
	}

	// Write data in chunks.
	for _, b := range [][]byte{[]byte("Hello"), []byte(" "), []byte("World!")} {
		_, err = w.Write(b)
		if err != nil {
			log.Fatal(err)
		}
	}

	// Close and flush seek table.
	err = w.Close()
	if err != nil {
		log.Fatal(err)
	}

	dec, err := zstd.NewReader(nil)
	if err != nil {
		log.Fatal(err)
	}
	defer dec.Close()
	sz, err := f.Seek(0, io.SeekEnd)
	if err != nil {
		log.Fatal(err)
	}

	rdr, err := seekable.NewDecoder(f, sz, dec)
	if err != nil {
		log.Fatal(err)
	}
	defer rdr.Close()

	ello := make([]byte, 4)
	// ReaderAt
	_, err = rdr.ReadAt(ello, 1)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Offset: 1 from the start: %s\n", string(ello))
	r := rdr.ReadSeeker()
	world := make([]byte, 5)
	// Seeker
	_, err = r.Seek(-6, io.SeekEnd)
	if err != nil {
		log.Fatal(err)
	}
	// Reader
	_, err = r.Read(world)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Offset: -6 from the end: %s\n", string(world))

	_, _ = f.Seek(0, io.SeekStart)

	// Standard ZSTD Reader.
	dec, err = zstd.NewReader(f)
	if err != nil {
		log.Fatal(err)
	}
	defer dec.Close()

	all, err := io.ReadAll(dec)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("Whole string: %s\n", string(all))

}
Output:

Offset: 1 from the start: ello
Offset: -6 from the end: World
Whole string: Hello World!

Types

type Decoder

type Decoder struct {
	// contains filtered or unexported fields
}

Decoder implements an io.ReaderAt for a seekable zstd stream.

func NewDecoder

func NewDecoder(rs io.ReaderAt, sz int64, decoder ZSTDDecoder) (*Decoder, error)

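NewDecoder returns a ZSTD stream reader that can be randomly accessed using uncompressed data offsets. In the package example, sz is the total size of the compressed stream, obtained by seeking to the end of the file.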
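As a rough illustration of how random access by uncompressed offset can work, the sketch below maps an offset to a frame using running totals of uncompressed frame sizes and a binary search. This is an assumption about the general technique, not a description of this package's internals; frameIndex, newFrameIndex, and locate are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// frameIndex is a hypothetical in-memory seek table: ends[i] is the
// uncompressed offset just past frame i.
type frameIndex struct{ ends []int64 }

// newFrameIndex builds the running totals from per-frame uncompressed sizes.
func newFrameIndex(sizes []int64) frameIndex {
	ends := make([]int64, len(sizes))
	var total int64
	for i, s := range sizes {
		total += s
		ends[i] = total
	}
	return frameIndex{ends: ends}
}

// locate returns the frame containing uncompressed offset off and the
// position inside that frame. ok is false when off is out of range.
func (x frameIndex) locate(off int64) (frame int, within int64, ok bool) {
	if off < 0 || len(x.ends) == 0 || off >= x.ends[len(x.ends)-1] {
		return 0, 0, false
	}
	// First frame whose end lies strictly beyond off.
	frame = sort.Search(len(x.ends), func(i int) bool { return x.ends[i] > off })
	var start int64
	if frame > 0 {
		start = x.ends[frame-1]
	}
	return frame, off - start, true
}

func main() {
	// Sizes of the three segments "Hello", " ", "World!".
	idx := newFrameIndex([]int64{5, 1, 6})
	for _, off := range []int64{1, 6, 12} {
		frame, within, ok := idx.locate(off)
		fmt.Println(off, frame, within, ok)
	}
}
```

With a lookup like this, only the frame that covers the requested range has to be decompressed, so read cost scales with frame size rather than with the size of the whole stream.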

func (*Decoder) Close

func (r *Decoder) Close() error

func (*Decoder) NumFrames

func (r *Decoder) NumFrames() int64

NumFrames returns the total number of encoded data frames making up the encoded data.

func (*Decoder) ReadAt added in v0.8.0

func (r *Decoder) ReadAt(p []byte, off int64) (n int, err error)

func (*Decoder) ReadSeeker added in v0.8.0

func (r *Decoder) ReadSeeker() io.ReadSeeker

func (*Decoder) Size

func (r *Decoder) Size() int64

Size returns the total uncompressed size of the encoded data.

type Encoder

type Encoder struct {
	// contains filtered or unexported fields
}

Encoder implements support for encoding data into a seekable zstd stream. It can be used in two modes: either as an io.WriteCloser, or through a series of Encode calls followed by an EndStream call.

func NewWriter

func NewWriter(w io.Writer, encoder ZSTDEncoder) (*Encoder, error)

NewWriter wraps the passed io.Writer and ZSTDEncoder into an indexed ZSTD stream. The resulting stream can then be randomly accessed through a Decoder, using either ReadAt or ReadSeeker.

func (*Encoder) Close added in v0.8.0

func (s *Encoder) Close() (err error)

Close closes the stream, writing out the seek table. The underlying io.Writer must be closed separately.

func (*Encoder) Encode

func (s *Encoder) Encode(src, dst []byte) ([]byte, error)

Encode encodes data from src and returns dst with the new compressed data appended to it. If this would violate seekable format constraints, an error is returned instead, and dst is returned without the compressed data from src appended to it.

func (*Encoder) EndStream

func (s *Encoder) EndStream(dst []byte) ([]byte, error)

EndStream appends the seek table to dst and returns the result. If there are too many seek frames, an error is returned.
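The buffer-based mode can be sketched as a loop of Encode calls followed by one EndStream call. To keep the example runnable without external dependencies, it is written against a small local interface that copies the two documented method signatures; frameEncoder, encodeChunks, and fakeEncoder are hypothetical names, and fakeEncoder does no real compression.

```go
package main

import "fmt"

// frameEncoder lists the two methods documented on *Encoder for
// buffer-based use.
type frameEncoder interface {
	Encode(src, dst []byte) ([]byte, error)
	EndStream(dst []byte) ([]byte, error)
}

// encodeChunks appends one frame per chunk to a single buffer and then
// appends the seek table.
func encodeChunks(enc frameEncoder, chunks [][]byte) ([]byte, error) {
	var out []byte
	var err error
	for i, c := range chunks {
		if out, err = enc.Encode(c, out); err != nil {
			return nil, fmt.Errorf("chunk %d: %w", i, err)
		}
	}
	return enc.EndStream(out)
}

// fakeEncoder is a hypothetical stand-in: it copies bytes unchanged and
// ends the stream with a one-byte frame count instead of a seek table.
type fakeEncoder struct{ frames byte }

func (f *fakeEncoder) Encode(src, dst []byte) ([]byte, error) {
	f.frames++
	return append(dst, src...), nil
}

func (f *fakeEncoder) EndStream(dst []byte) ([]byte, error) {
	return append(dst, f.frames), nil
}

func main() {
	chunks := [][]byte{[]byte("Hello"), []byte(" "), []byte("World!")}
	out, err := encodeChunks(&fakeEncoder{}, chunks)
	fmt.Printf("%q %v\n", out, err)
}
```

Given the signatures listed above, a *seekable.Encoder should satisfy frameEncoder as well. This mode suits callers that want the encoded bytes in memory rather than written to an io.Writer.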

func (*Encoder) Write added in v0.8.0

func (s *Encoder) Write(src []byte) (int, error)

Write implements io.Writer for Encoder. Each call to Write will encode a single seekable segment.
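Since each Write call becomes one seekable segment, the caller controls segment granularity through how it slices its input. The helper below is a sketch of that idea using only io.Writer. writeInSegments, countingWriter, and the segment size of 5 are hypothetical choices for illustration.

```go
package main

import (
	"fmt"
	"io"
)

// writeInSegments calls w.Write once per segSize-byte slice of data, so a
// writer that encodes one segment per Write produces segments of at most
// segSize uncompressed bytes.
func writeInSegments(w io.Writer, data []byte, segSize int) (int, error) {
	if segSize <= 0 {
		return 0, fmt.Errorf("segment size must be positive, got %d", segSize)
	}
	written := 0
	for len(data) > 0 {
		n := segSize
		if n > len(data) {
			n = len(data)
		}
		m, err := w.Write(data[:n])
		written += m
		if err != nil {
			return written, err
		}
		data = data[n:]
	}
	return written, nil
}

// countingWriter is a hypothetical stand-in that records the size of
// every Write call it receives.
type countingWriter struct{ calls []int }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls = append(c.calls, len(p))
	return len(p), nil
}

func main() {
	cw := &countingWriter{}
	n, err := writeInSegments(cw, []byte("Hello World!"), 5)
	fmt.Println(n, cw.calls, err)
}
```

Smaller segments make random reads cheaper but usually compress worse, so the right size depends on the access pattern. Note that wrapping the Encoder in a buffering writer would change where segment boundaries fall.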

type ZSTDDecoder

type ZSTDDecoder interface {
	DecodeAll(input, dst []byte) ([]byte, error)
}

ZSTDDecoder is the decompressor. Tested with github.com/klauspost/compress/zstd.

type ZSTDEncoder

type ZSTDEncoder interface {
	EncodeAll(src, dst []byte) []byte
}

ZSTDEncoder is the compressor. Tested with github.com/klauspost/compress/zstd.
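Because ZSTDEncoder is a single-method interface, any type with a matching EncodeAll can be passed where one is expected, including wrappers around a real encoder. The sketch below shows a wrapper that tracks input and output byte counts. The interface is redeclared locally so the example runs on its own; meteredEncoder and copyEncoder are hypothetical, and copyEncoder performs no compression.

```go
package main

import "fmt"

// ZSTDEncoder mirrors the interface documented above.
type ZSTDEncoder interface {
	EncodeAll(src, dst []byte) []byte
}

// meteredEncoder wraps another ZSTDEncoder and records how many bytes
// went in and how many encoded bytes were appended.
type meteredEncoder struct {
	inner   ZSTDEncoder
	in, out int64
}

func (m *meteredEncoder) EncodeAll(src, dst []byte) []byte {
	before := len(dst)
	res := m.inner.EncodeAll(src, dst)
	m.in += int64(len(src))
	m.out += int64(len(res) - before)
	return res
}

// copyEncoder is a hypothetical stand-in that appends src unchanged.
type copyEncoder struct{}

func (copyEncoder) EncodeAll(src, dst []byte) []byte {
	return append(dst, src...)
}

func main() {
	m := &meteredEncoder{inner: copyEncoder{}}
	res := m.EncodeAll([]byte("Hello"), []byte("xx"))
	fmt.Printf("%q in=%d out=%d\n", res, m.in, m.out)
}
```

In real use, inner would be an encoder such as the one from github.com/klauspost/compress/zstd, and the ratio of out to in gives a running compression ratio.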
