blobsfile

v0.3.8
Published: Nov 11, 2019 License: MIT Imports: 17 Imported by: 8

README

BlobsFile

BlobsFile is an append-only (i.e. no update and no delete) content-addressed blob store (using BLAKE2b as the hash function).

It draws inspiration from Facebook's Haystack: blobs are stored in flat files (each called a BlobFile) and indexed by a small kv database for fast lookup.

BlobsFile is BlobStash's storage engine.

Features

  • Durable (data is fsynced before returning)
  • Immutable (append-only, can't mutate or delete blobs)
  • Optional compression (Snappy or Zstandard)
  • Extra parity data is added to each BlobFile (using Reed-Solomon error correcting code), allowing the database to repair itself in case of corruption.
    • The test suite literally punches holes at random places
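
The self-repair idea can be illustrated with a much simpler scheme than Reed-Solomon. The sketch below is not the library's code: it swaps in a single XOR parity shard, which can only rebuild one lost shard, and the names `xorParity` and `recoverShard` are hypothetical.

```go
package main

import "fmt"

// xorParity returns a parity shard that is the byte-wise XOR of all data
// shards. All shards are assumed to have the same length.
func xorParity(shards [][]byte) []byte {
	parity := make([]byte, len(shards[0]))
	for _, s := range shards {
		for i, b := range s {
			parity[i] ^= b
		}
	}
	return parity
}

// recoverShard rebuilds the shard at index lost by XOR-ing the parity shard
// with every surviving data shard.
func recoverShard(shards [][]byte, parity []byte, lost int) []byte {
	out := make([]byte, len(parity))
	copy(out, parity)
	for i, s := range shards {
		if i == lost {
			continue // the damaged shard is never read
		}
		for j, b := range s {
			out[j] ^= b
		}
	}
	return out
}

func main() {
	shards := [][]byte{[]byte("blob"), []byte("file"), []byte("data")}
	parity := xorParity(shards)

	// Simulate corruption by dropping the second shard entirely.
	shards[1] = nil
	repaired := recoverShard(shards, parity, 1)
	fmt.Printf("%s\n", repaired) // prints "file"
}
```

Reed-Solomon generalizes this idea to several parity shards, so more than one damaged region can be rebuilt.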

Documentation

Overview

Package blobsfile implements the BlobsFile backend for storing blobs.

It stores multiple blobs (optionally compressed with Snappy) inside a "BlobsFile" (also known as a fat file or packed file; 256MB by default). Blobs are indexed by a kv file (which can be rebuilt from the BlobsFiles).

New blobs are appended to the current file, and when the file exceeds the limit, a new file is created.
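
The append-and-roll-over behavior can be pictured with a toy in-memory model. This is only a sketch under stated assumptions, not the package's implementation: `packStore`, `entry`, `put` and `get` are hypothetical names, byte slices stand in for files on disk, a map stands in for the kv index, and plain string keys stand in for blob hashes.

```go
package main

import "fmt"

// entry records where a blob lives: which file, at what offset, and its size.
type entry struct {
	n, offset, size int
}

// packStore is a toy model: files stand in for BlobsFiles on disk and index
// stands in for the kv index.
type packStore struct {
	maxSize int
	files   [][]byte
	index   map[string]entry
}

func newPackStore(maxSize int) *packStore {
	return &packStore{maxSize: maxSize, files: [][]byte{nil}, index: map[string]entry{}}
}

// put appends data to the current file and records its location in the
// index. Once the current file exceeds maxSize, a new file is started.
func (s *packStore) put(key string, data []byte) {
	if _, ok := s.index[key]; ok {
		return // already stored
	}
	cur := len(s.files) - 1
	s.index[key] = entry{n: cur, offset: len(s.files[cur]), size: len(data)}
	s.files[cur] = append(s.files[cur], data...)
	if len(s.files[cur]) > s.maxSize {
		s.files = append(s.files, nil)
	}
}

// get looks the blob up in the index and slices it out of its file.
func (s *packStore) get(key string) ([]byte, bool) {
	e, ok := s.index[key]
	if !ok {
		return nil, false
	}
	return s.files[e.n][e.offset : e.offset+e.size], true
}

func main() {
	s := newPackStore(8) // tiny limit so the rollover is visible
	s.put("a", []byte("hello"))
	s.put("b", []byte("world!")) // file 0 grows to 11 bytes > 8, so a new file is opened
	s.put("c", []byte("x"))      // lands in file 1
	fmt.Println(len(s.files), s.index["c"].n) // prints "2 1"
}
```

Because every lookup goes through the index, reading a blob costs one map lookup plus one slice of the right file, regardless of how many files exist.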

Constants

const (
	// Version is the current BlobsFile binary format version
	Version = 1
)

Variables

var (
	// ErrBlobNotFound reports that the blob could not be found
	ErrBlobNotFound = errors.New("blob not found")

	// ErrBlobsfileCorrupted reports that one of the BlobsFile is corrupted and could not be repaired
	ErrBlobsfileCorrupted = errors.New("blobsfile is corrupted")
)

Functions

func ScanBlobsFile added in v0.3.4

func ScanBlobsFile(path string) ([]string, error)

ScanBlobsFile scans a single BlobsFile (#n) and executes `iterFunc` for each indexed blob. `iterFunc` is optional; without it, this func checks the consistency of each blob and returns a `corruptedError` if a blob is corrupted.

Types

type Blob

type Blob struct {
	Hash string
	Size int
	N    int
}

Blob represents a blob hash and size when enumerating the DB.

type BlobsFiles

type BlobsFiles struct {
	sync.Mutex
	// contains filtered or unexported fields
}

BlobsFiles represents the DB.

func New

func New(opts *Opts) (*BlobsFiles, error)

New initializes a new BlobsFileBackend.

func (*BlobsFiles) CheckBlobsFiles

func (backend *BlobsFiles) CheckBlobsFiles() error

CheckBlobsFiles checks the consistency of all the BlobsFiles.

func (*BlobsFiles) Close

func (backend *BlobsFiles) Close() error

Close closes all the indexes and data files.

func (*BlobsFiles) Enumerate

func (backend *BlobsFiles) Enumerate(blobs chan<- *Blob, start, end string, limit int) error

Enumerate outputs all the blobs into the given chan (ordered lexicographically).

func (*BlobsFiles) EnumeratePrefix added in v0.2.0

func (backend *BlobsFiles) EnumeratePrefix(blobs chan<- *Blob, prefix string, limit int) error

EnumeratePrefix outputs all the blobs whose hash starts with the given prefix into the given chan (ordered lexicographically).
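
Since the enumeration methods write into a caller-supplied channel, they are typically run in a goroutine while the caller drains the channel. The sketch below shows that pattern against a hypothetical in-memory stand-in (`fakeStore`), not the real `*BlobsFiles`. Two behaviors are assumptions of this sketch and are not stated in the documentation above: that the channel is closed when the enumeration finishes, and that a limit of 0 means "no limit".

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Blob mirrors the documented struct so the sketch compiles on its own.
type Blob struct {
	Hash string
	Size int
	N    int
}

// fakeStore is a hypothetical in-memory stand-in for *BlobsFiles.
type fakeStore struct {
	sizes map[string]int // hash -> size
}

// EnumeratePrefix has the documented signature. Closing the channel when
// done, and treating limit <= 0 as "no limit", are assumptions of this sketch.
func (s *fakeStore) EnumeratePrefix(blobs chan<- *Blob, prefix string, limit int) error {
	defer close(blobs)
	hashes := make([]string, 0, len(s.sizes))
	for h := range s.sizes {
		if strings.HasPrefix(h, prefix) {
			hashes = append(hashes, h)
		}
	}
	sort.Strings(hashes) // lexicographic order
	for i, h := range hashes {
		if limit > 0 && i >= limit {
			break
		}
		blobs <- &Blob{Hash: h, Size: s.sizes[h]}
	}
	return nil
}

// collect runs the enumeration in a goroutine and drains the channel.
func collect(s *fakeStore, prefix string, limit int) ([]string, error) {
	blobs := make(chan *Blob)
	errc := make(chan error, 1)
	go func() { errc <- s.EnumeratePrefix(blobs, prefix, limit) }()
	var out []string
	for b := range blobs {
		out = append(out, b.Hash)
	}
	return out, <-errc
}

func main() {
	s := &fakeStore{sizes: map[string]int{"ab12": 3, "ab01": 5, "cd99": 1}}
	hashes, err := collect(s, "ab", 0)
	fmt.Println(hashes, err) // prints "[ab01 ab12] <nil>"
}
```

If the real method does not close the channel, the `for range` loop would block forever, so the caller would have to close it after the method returns instead.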

func (*BlobsFiles) Exists

func (backend *BlobsFiles) Exists(hash string) (bool, error)

Exists returns true if the blob is already stored.

func (*BlobsFiles) Get

func (backend *BlobsFiles) Get(hash string) ([]byte, error)

Get returns the blob for the given hash.

func (*BlobsFiles) Put

func (backend *BlobsFiles) Put(hash string, data []byte) (err error)

Put saves a new blob; hash must be the hex-encoded BLAKE2b hash of the data.

If the blob is already stored, then Put will be a no-op, so it's not necessary to call Exists before saving a new blob.
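
The calling convention (hex-encoded digest as the key, duplicate writes ignored) can be sketched with the standard library alone. This is an illustration under assumptions, not the package's code: `memStore` and `hashHex` are hypothetical, and SHA-256 is swapped in for BLAKE2b only because BLAKE2b is not part of the Go standard library. A real caller would need a BLAKE2b digest of whatever size the store expects.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashHex returns the hex-encoded digest of data. SHA-256 is a stand-in
// here; the real store expects a BLAKE2b digest.
func hashHex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// memStore is a hypothetical in-memory stand-in for the blob store.
type memStore struct {
	blobs  map[string][]byte
	writes int // number of blobs actually written
}

// Put stores data under hash. A blob that is already present is left
// untouched, which makes repeated calls harmless.
func (m *memStore) Put(hash string, data []byte) error {
	if _, ok := m.blobs[hash]; ok {
		return nil // no-op: already stored
	}
	m.blobs[hash] = append([]byte(nil), data...) // keep a private copy
	m.writes++
	return nil
}

func main() {
	m := &memStore{blobs: map[string][]byte{}}
	data := []byte("hello")
	h := hashHex(data)

	_ = m.Put(h, data)
	_ = m.Put(h, data) // second call changes nothing

	fmt.Println(len(h), m.writes) // prints "64 1"
}
```

Because the key is derived from the content, two callers storing the same bytes always agree on the key, which is what makes the duplicate check safe.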

func (*BlobsFiles) RebuildIndex

func (backend *BlobsFiles) RebuildIndex() error

RebuildIndex removes the index files and rebuilds them by re-scanning all the BlobsFiles.

func (*BlobsFiles) SealedPacks added in v0.3.3

func (backend *BlobsFiles) SealedPacks() []string

func (*BlobsFiles) SetBlobsFilesSealedFunc added in v0.3.6

func (backend *BlobsFiles) SetBlobsFilesSealedFunc(f func(string))

func (*BlobsFiles) Size added in v0.3.0

func (backend *BlobsFiles) Size(hash string) (int, error)

Size returns the blob size for the given hash.

func (*BlobsFiles) Stats

func (backend *BlobsFiles) Stats() (*Stats, error)

Stats returns some stats about the DB.

func (*BlobsFiles) String

func (backend *BlobsFiles) String() string

String implements the Stringer interface.

type CompressionAlgorithm

type CompressionAlgorithm byte
const (
	Snappy CompressionAlgorithm = 1 << iota
)

Compression algorithm flags.

type ErrInterventionNeeded

type ErrInterventionNeeded struct {
	// contains filtered or unexported fields
}

ErrInterventionNeeded is an error indicating that a manual action must be performed before BlobsFile can be used.

func (*ErrInterventionNeeded) Error

func (ein *ErrInterventionNeeded) Error() string

type Opts

type Opts struct {
	// Compression algorithm
	Compression CompressionAlgorithm

	// The max size of a BlobsFile, will be 256MB by default if not set
	BlobsFileSize int64

	// Where the data and indexes will be stored
	Directory string

	// Allows catching some events
	LogFunc func(msg string)

	// When trying to self-heal during recovery, some steps need to be performed by the user
	AskConfirmationFunc func(msg string) bool

	BlobsFilesSealedFunc func(path string)
}

Opts represents the DB options

type Stats

type Stats struct {
	// The total number of blobs stored
	BlobsCount int

	// The size of all the blobs stored
	BlobsSize int64

	// The number of BlobsFile
	BlobsFilesCount int

	// The size of all the BlobsFile
	BlobsFilesSize int64
}

Stats represents some stats about the DB state
