nbs

package
v0.0.0-...-e5fa29d Latest
Warning

This package is not in the latest version of its module.

Published: Aug 27, 2021 License: Apache-2.0 Imports: 39 Imported by: 11

README

Noms Block Store

A horizontally-scalable storage backend for Noms.

Overview

NBS is a storage layer optimized for the needs of the Noms database.

NBS can run in two configurations: backed by local disk, or backed by Amazon Web Services (AWS).

When backed by local disk, NBS is significantly faster than LevelDB for our workloads and supports full multiprocess concurrency.

When backed by AWS, NBS stores its data mainly in S3, along with a single DynamoDB item. This configuration makes Noms "effectively CA", in the sense that Noms is always consistent, and Noms+NBS is as available as DynamoDB and S3 are. This configuration also gives Noms the cost profile of S3 with power closer to that of a traditional database.

Details

  • NBS provides storage for a content-addressed DAG of nodes (with exactly one root), where each node is encoded as a sequence of bytes and addressed by a 20-byte hash of the byte-sequence.
  • There is no update or delete -- only insert, update root and garbage collect.
  • Insertion of any novel byte-sequence is durable only upon updating the root.
  • File-level multiprocess concurrency is supported, with optimistic locking for multiple writers.
  • Writers need not worry about re-writing duplicate chunks. NBS will efficiently detect and drop (most) duplicates.
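The semantics listed above can be illustrated with a toy in-memory store. This is only a sketch under stated assumptions, not the NBS implementation: `toyStore`, `addrOf`, `put`, and `commit` are hypothetical names, and truncating SHA-512 to 20 bytes is an illustrative way to get a 20-byte address. It shows three of the listed properties: chunks are addressed by a hash of their bytes, duplicate inserts are dropped, and inserted chunks only count as durable once the root is updated with a compare-and-set.

```go
package main

import (
	"crypto/sha512"
	"fmt"
	"sync"
)

// addr is a 20-byte content address. Truncating SHA-512 is an
// illustrative choice, not a statement about NBS internals.
type addr [20]byte

func addrOf(data []byte) addr {
	sum := sha512.Sum512(data)
	var a addr
	copy(a[:], sum[:20])
	return a
}

// toyStore is a hypothetical in-memory stand-in for a block store.
type toyStore struct {
	mu      sync.Mutex
	pending map[addr][]byte // inserted, not yet durable
	durable map[addr][]byte // made durable by a successful commit
	root    addr
}

func newToyStore() *toyStore {
	return &toyStore{pending: map[addr][]byte{}, durable: map[addr][]byte{}}
}

// put buffers a chunk and reports whether it was novel.
// Duplicates of pending or durable chunks are dropped.
func (s *toyStore) put(data []byte) (addr, bool) {
	a := addrOf(data)
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.durable[a]; ok {
		return a, false
	}
	if _, ok := s.pending[a]; ok {
		return a, false
	}
	s.pending[a] = append([]byte(nil), data...)
	return a, true
}

// commit is an optimistic compare-and-set on the root: it succeeds only
// if the root still equals last, and only then are pending chunks durable.
func (s *toyStore) commit(current, last addr) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.root != last {
		return false // another writer moved the root first
	}
	for a, d := range s.pending {
		s.durable[a] = d
		delete(s.pending, a)
	}
	s.root = current
	return true
}

func main() {
	s := newToyStore()
	var empty addr
	a, novel := s.put([]byte("hello"))
	_, again := s.put([]byte("hello"))
	fmt.Println(novel, again)       // true false
	fmt.Println(s.commit(a, empty)) // true
	fmt.Println(s.commit(a, empty)) // false: the root is no longer empty
}
```

A writer whose commit fails would re-read the root and retry, which is the usual shape of optimistic locking.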

Perf

For the file back-end, perf is substantially better than LevelDB, mainly because LDB spends substantial IO keeping KV pairs in key order, which doesn't benefit Noms at all. NBS locates related chunks together, so reading data from an NBS store can be done quite a lot faster. As an example, storing and retrieving a 1.1 GB MP4 video file on a MBP i5 2.9 GHz:

  • LDB
    • Initial import: 44 MB/s, size on disk: 1.1 GB.
    • Import exact same bytes: 35 MB/s, size on disk: 1.4 GB.
    • Export: 60 MB/s
  • NBS
    • Initial import: 72 MB/s, size on disk: 1.1 GB.
    • Import exact same bytes: 92 MB/s, size on disk: 1.1 GB.
    • Export: 300 MB/s
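To put the figures above side by side, the short program below divides each NBS throughput by the corresponding LDB throughput. It uses only the numbers listed above; the `speedup` helper is a hypothetical name introduced for this illustration.

```go
package main

import "fmt"

// speedup returns how many times faster the NBS figure is than the LDB figure.
func speedup(nbsMBps, ldbMBps float64) float64 {
	return nbsMBps / ldbMBps
}

func main() {
	fmt.Printf("initial import: %.1fx\n", speedup(72, 44))  // 1.6x
	fmt.Printf("re-import:      %.1fx\n", speedup(92, 35))  // 2.6x
	fmt.Printf("export:         %.1fx\n", speedup(300, 60)) // 5.0x
}
```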

Status

NBS is more-or-less "beta". There's still work we want to do, but it now works better than LevelDB for our purposes and so we have made it the default local backend for Noms:

# This uses nbs locally:
./csv-import foo.csv /Users/bob/csv-store::data

The AWS backend is available via the aws: scheme:

./csv-import foo.csv aws:table/bucket/database::data
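As a rough illustration of how a spec of the shape shown above could be taken apart, here is a minimal sketch. It assumes only the form visible in the example (`aws:` followed by three slash-separated names, then `::` and a dataset name). `awsSpec` and `parseAWSSpec` are hypothetical names, and this is not the parser Noms itself uses.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// awsSpec holds the pieces of a spec shaped like
// aws:table/bucket/database::dataset (a shape assumed from the example).
type awsSpec struct {
	Table, Bucket, Database, Dataset string
}

// parseAWSSpec is a hypothetical helper that splits such a spec.
func parseAWSSpec(spec string) (awsSpec, error) {
	if !strings.HasPrefix(spec, "aws:") {
		return awsSpec{}, errors.New("spec does not use the aws: scheme")
	}
	rest := strings.TrimPrefix(spec, "aws:")

	// Split the store location from the dataset name.
	parts := strings.SplitN(rest, "::", 2)
	if len(parts) != 2 || parts[1] == "" {
		return awsSpec{}, errors.New("spec is missing the ::dataset suffix")
	}

	// The location is expected to be exactly three non-empty names.
	path := strings.Split(parts[0], "/")
	if len(path) != 3 {
		return awsSpec{}, errors.New("expected table/bucket/database before ::")
	}
	for _, p := range path {
		if p == "" {
			return awsSpec{}, errors.New("empty path component in spec")
		}
	}
	return awsSpec{Table: path[0], Bucket: path[1], Database: path[2], Dataset: parts[1]}, nil
}

func main() {
	s, err := parseAWSSpec("aws:table/bucket/database::data")
	fmt.Printf("%+v %v\n", s, err)
}
```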

Documentation


Constants

const (
	// StorageVersion is the version of the on-disk Noms Chunks Store data format.
	StorageVersion = "4"
)

Variables

This section is empty.

Functions

func NewAWSStoreFactory

func NewAWSStoreFactory(sess *session.Session, table, bucket string, maxOpenFiles int, indexCacheSize, tableCacheSize uint64, tableCacheDir string) chunks.Factory

NewAWSStoreFactory returns a ChunkStore factory that vends NomsBlockStore instances that store manifests in the named DynamoDB table, and chunk data in the named S3 bucket. All connections to AWS services share |sess|.

func NewLocalStoreFactory

func NewLocalStoreFactory(dir string, indexCacheSize uint64, maxOpenFiles int) chunks.Factory

func ParseAddr

func ParseAddr(b []byte) (h addr)

func ValidateAddr

func ValidateAddr(s string) bool
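ParseAddr and ValidateAddr carry no doc comments here, so the following is only a sketch of what validating a textual address could look like. It assumes that an address is the 20-byte hash mentioned in the README, written as unpadded lowercase base32 text. The alphabet, `addrEncoding`, and `validAddrString` are assumptions made for this illustration and are not taken from the package.

```go
package main

import (
	"encoding/base32"
	"fmt"
)

// Assumption: addresses are rendered with this lowercase base32 alphabet
// and no padding. 20 bytes encode to exactly 32 characters.
var addrEncoding = base32.NewEncoding("0123456789abcdefghijklmnopqrstuv").
	WithPadding(base32.NoPadding)

const addrSize = 20

// validAddrString is a hypothetical check: the string must decode
// cleanly and yield exactly addrSize bytes.
func validAddrString(s string) bool {
	b, err := addrEncoding.DecodeString(s)
	return err == nil && len(b) == addrSize
}

func main() {
	s := addrEncoding.EncodeToString(make([]byte, addrSize))
	fmt.Println(len(s), validAddrString(s))        // 32 true
	fmt.Println(validAddrString("not-an-address")) // false
}
```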

Types

type AWSStoreFactory

type AWSStoreFactory struct {
	// contains filtered or unexported fields
}

AWSStoreFactory vends NomsBlockStores built on top of DynamoDB and S3.

func (*AWSStoreFactory) CreateStore

func (asf *AWSStoreFactory) CreateStore(ns string) chunks.ChunkStore

func (*AWSStoreFactory) CreateStoreFromCache

func (asf *AWSStoreFactory) CreateStoreFromCache(ns string) chunks.ChunkStore

func (*AWSStoreFactory) Shutter

func (asf *AWSStoreFactory) Shutter()

type LocalStoreFactory

type LocalStoreFactory struct {
	// contains filtered or unexported fields
}

func (*LocalStoreFactory) CreateStore

func (lsf *LocalStoreFactory) CreateStore(ns string) chunks.ChunkStore

func (*LocalStoreFactory) CreateStoreFromCache

func (lsf *LocalStoreFactory) CreateStoreFromCache(ns string) chunks.ChunkStore

func (*LocalStoreFactory) Shutter

func (lsf *LocalStoreFactory) Shutter()

type NomsBlockCache

type NomsBlockCache struct {
	// contains filtered or unexported fields
}

NomsBlockCache holds Chunks, allowing them to be retrieved by hash or enumerated in hash order.

func NewCache

func NewCache() *NomsBlockCache

func (*NomsBlockCache) Count

func (nbc *NomsBlockCache) Count() uint32

Count returns the number of items in the cache.

func (*NomsBlockCache) Destroy

func (nbc *NomsBlockCache) Destroy() error

Destroy drops the cache and deletes any backing storage.

func (*NomsBlockCache) ExtractChunks

func (nbc *NomsBlockCache) ExtractChunks(chunkChan chan *chunks.Chunk)

ExtractChunks writes the entire contents of the cache to chunkChan. The chunks are extracted in insertion order.

func (*NomsBlockCache) Get

func (nbc *NomsBlockCache) Get(hash hash.Hash) chunks.Chunk

Get retrieves the chunk referenced by hash. If the chunk is not present, Get returns the empty Chunk.

func (*NomsBlockCache) GetMany

func (nbc *NomsBlockCache) GetMany(hashes hash.HashSet, foundChunks chan *chunks.Chunk)

GetMany gets the Chunks with |hashes| from the store. By the time GetMany returns, every chunk that was found will have been sent to |foundChunks|. Hashes with no corresponding chunk are silently ignored.
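Because GetMany reports results through a channel and sends everything before it returns, one way to consume it is to run the call in a goroutine, close the channel once the call returns, and range over the channel in the caller. The sketch below shows that pattern with a toy stand-in. `toyCache`, `getMany`, and `collect` are hypothetical names, and the real method takes a hash.HashSet and a chan *chunks.Chunk rather than the simplified types used here.

```go
package main

import (
	"fmt"
	"sort"
)

// toyCache is a hypothetical stand-in mapping a key to chunk bytes.
type toyCache map[string][]byte

// getMany sends every found value to found and returns once all of them
// have been sent. Missing keys are skipped silently.
func (c toyCache) getMany(keys []string, found chan<- []byte) {
	for _, k := range keys {
		if v, ok := c[k]; ok {
			found <- v
		}
	}
}

// collect runs getMany in a goroutine, closes the channel after the call
// returns, and drains the channel until it is closed.
func collect(c toyCache, keys []string) []string {
	found := make(chan []byte, 4)
	go func() {
		defer close(found)
		c.getMany(keys, found)
	}()

	var out []string
	for v := range found {
		out = append(out, string(v))
	}
	sort.Strings(out) // sort only to make the printed result stable
	return out
}

func main() {
	c := toyCache{"a": []byte("alpha"), "b": []byte("beta")}
	fmt.Println(collect(c, []string{"a", "missing", "b"})) // [alpha beta]
}
```

Closing the channel on the producer side, after the call has returned, lets the consumer use a plain range loop without knowing in advance how many chunks were found.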

func (*NomsBlockCache) Has

func (nbc *NomsBlockCache) Has(hash hash.Hash) bool

Has checks if the chunk referenced by hash is in the cache.

func (*NomsBlockCache) HasMany

func (nbc *NomsBlockCache) HasMany(hashes hash.HashSet) hash.HashSet

HasMany returns a set containing the members of hashes present in the cache.

func (*NomsBlockCache) Insert

func (nbc *NomsBlockCache) Insert(c chunks.Chunk)

Insert stores c in the cache.

type NomsBlockStore

type NomsBlockStore struct {
	// contains filtered or unexported fields
}

func NewAWSStore

func NewAWSStore(table, ns, bucket string, s3 s3svc, ddb ddbsvc, memTableSize uint64) *NomsBlockStore

func NewLocalStore

func NewLocalStore(dir string, memTableSize uint64) *NomsBlockStore

func (*NomsBlockStore) CalcReads

func (nbs *NomsBlockStore) CalcReads(hashes hash.HashSet, blockSize uint64) (reads int, split bool)

func (*NomsBlockStore) Close

func (nbs *NomsBlockStore) Close() (err error)

func (*NomsBlockStore) Commit

func (nbs *NomsBlockStore) Commit(current, last hash.Hash) bool

func (*NomsBlockStore) Count

func (nbs *NomsBlockStore) Count() uint32

func (*NomsBlockStore) Get

func (nbs *NomsBlockStore) Get(h hash.Hash) chunks.Chunk

func (*NomsBlockStore) GetMany

func (nbs *NomsBlockStore) GetMany(hashes hash.HashSet, foundChunks chan *chunks.Chunk)

func (*NomsBlockStore) Has

func (nbs *NomsBlockStore) Has(h hash.Hash) bool

func (*NomsBlockStore) HasMany

func (nbs *NomsBlockStore) HasMany(hashes hash.HashSet) hash.HashSet

func (*NomsBlockStore) Put

func (nbs *NomsBlockStore) Put(c chunks.Chunk)

func (*NomsBlockStore) Rebase

func (nbs *NomsBlockStore) Rebase()

func (*NomsBlockStore) Root

func (nbs *NomsBlockStore) Root() hash.Hash

func (*NomsBlockStore) Stats

func (nbs *NomsBlockStore) Stats() interface{}

func (*NomsBlockStore) StatsSummary

func (nbs *NomsBlockStore) StatsSummary() string

func (*NomsBlockStore) Version

func (nbs *NomsBlockStore) Version() string

type Stats

type Stats struct {
	OpenLatency   metrics.Histogram
	CommitLatency metrics.Histogram

	IndexReadLatency  metrics.Histogram
	IndexBytesPerRead metrics.Histogram

	GetLatency   metrics.Histogram
	ChunksPerGet metrics.Histogram

	FileReadLatency  metrics.Histogram
	FileBytesPerRead metrics.Histogram

	S3ReadLatency  metrics.Histogram
	S3BytesPerRead metrics.Histogram

	MemReadLatency  metrics.Histogram
	MemBytesPerRead metrics.Histogram

	DynamoReadLatency  metrics.Histogram
	DynamoBytesPerRead metrics.Histogram

	HasLatency      metrics.Histogram
	AddressesPerHas metrics.Histogram

	PutLatency metrics.Histogram

	PersistLatency  metrics.Histogram
	BytesPerPersist metrics.Histogram

	ChunksPerPersist                 metrics.Histogram
	CompressedChunkBytesPerPersist   metrics.Histogram
	UncompressedChunkBytesPerPersist metrics.Histogram

	ConjoinLatency   metrics.Histogram
	BytesPerConjoin  metrics.Histogram
	ChunksPerConjoin metrics.Histogram
	TablesPerConjoin metrics.Histogram

	ReadManifestLatency  metrics.Histogram
	WriteManifestLatency metrics.Histogram
}

func NewStats

func NewStats() *Stats

func (*Stats) Add

func (s *Stats) Add(other Stats)

func (Stats) Delta

func (s Stats) Delta(other Stats) Stats

func (Stats) String

func (s Stats) String() string

Directories

Path Synopsis
gen
