shardingdb

package module
v1.1.0
Published: Oct 30, 2023 License: Apache-2.0 Imports: 13 Imported by: 0

README

shardingdb

ShardingDB is an open-source, sharded database that extends LevelDB with support for concurrent reads and writes. In the benchmarks below it speeds up PutData by up to about 60x and GetData by about 7x, making it well suited as a drop-in replacement for LevelDB.

Requirements

  • Requires Go 1.14 or newer.

How to use

1. Resharding
1.0 Build the resharding tool
make
cd bin
1.1 Migrate data from LevelDB to new shardingdb

For example, to migrate one LevelDB data folder into three shardingdb folders and print a summary log (`-l 1`), run:

./resharding -i /data1 -o /newfolder1,/newfolder2,/newfolder3 -l 1
1.2 Add sharding db

For example, to keep one existing LevelDB folder, add two more folders to shardingdb, and print no log (level 0), run:


```bash
./resharding -i /data1 -o /data1,/data2,/data3
```

For example, to keep three existing LevelDB folders, add one more folder to shardingdb, and print a detailed log (`-l 2`), run:


```bash
./resharding -i /data1,/data2,/data3 -o /data1,/data2,/data3,/data4 -l 2
```
2. Code example
2.0 Get the package
go get github.com/studyzy/shardingdb
2.1 Import the package
import "github.com/studyzy/shardingdb"
2.2 Use shardingdb
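inputPathList := []string{"/data1", "/data2"}
sdb, err := shardingdb.OpenFile(inputPathList, nil)
if err != nil {
    log.Fatal(err) // needs import "log"
}
defer sdb.Close() // closes every underlying db
if err := sdb.Put([]byte("key"), []byte("value"), nil); err != nil {
    log.Fatal(err)
}
value, err := sdb.Get([]byte("key"), nil)
...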
2.3 Another example
db1, err := leveldb.OpenFile(getTempDir(), nil)
if err != nil {
    t.Fatal(err)
}
db2, err := leveldb.OpenFile(getTempDir(), nil)
if err != nil {
    t.Fatal(err)
}
// Create a new sharding db
sdb, err := shardingdb.NewShardingDb(shardingdb.WithDbHandles(db1, db2), shardingdb.WithShardingFunc(shardingdb.MurmurSharding))
...

Performance Benchmark

Environment
  • Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz * 10 Core
  • 40GB RAM
  • 3 SSD: /data, /data1, /

Generate the data with:

go test -timeout 60m -run "TestCompareDbPerformance"

Test case: 1,000,000 key-value pairs in total, processed by 100 goroutines in batches of 100 key-value pairs. Each result is the time, in seconds, taken by the whole operation.
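
The workload divides evenly: 1,000,000 pairs across 100 goroutines is 10,000 pairs per goroutine, or 100 batches of 100 pairs each. A minimal sketch of that partitioning follows, using only the standard library. `runWorkload` and the counting callback are hypothetical stand-ins; the real test writes to LevelDB or ShardingDB instead.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runWorkload splits total items across workers goroutines and hands them to
// write in batches of batchSize. It assumes total is divisible by workers.
// It returns the number of batches issued and the elapsed wall-clock time.
func runWorkload(total, workers, batchSize int, write func(batch []int)) (int64, time.Duration) {
	var batches int64
	var wg sync.WaitGroup
	perWorker := total / workers
	start := time.Now()
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(first int) {
			defer wg.Done()
			for off := 0; off < perWorker; off += batchSize {
				batch := make([]int, 0, batchSize)
				for i := 0; i < batchSize && off+i < perWorker; i++ {
					batch = append(batch, first+off+i)
				}
				write(batch)
				atomic.AddInt64(&batches, 1)
			}
		}(w * perWorker)
	}
	wg.Wait()
	return batches, time.Since(start)
}

func main() {
	var written int64
	batches, elapsed := runWorkload(1000000, 100, 100, func(batch []int) {
		// A real benchmark would write the batch to the database here.
		atomic.AddInt64(&written, int64(len(batch)))
	})
	fmt.Printf("pairs=%d batches=%d elapsed=%s\n", written, batches, elapsed)
}
```

With the parameters above, the sketch issues 10,000 batches covering all 1,000,000 items; the elapsed time only becomes meaningful once the callback performs real I/O.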

1. PutData

| Data Size | LevelDB | ShardingDB(3 folders) | ShardingDB(6 folders) | ShardingDB(encrypt 3 folders) |
|---|---|---|---|---|
| 100B | 2.27 | 0.659 | 0.581 | 0.953 |
| 200B | 4.45 | 1.07 | 0.683 | 1.9 |
| 500B | 15.3 | 3.36 | 1.49 | 6.4 |
| 1KB | 48.9 | 9.42 | 3.74 | 17.69 |
| 10KB | 1117 | 351 | 123 | 308 |

2. GetData

| Data Size | LevelDB | ShardingDB(3 folders) | ShardingDB(6 folders) | ShardingDB(encrypt 3 folders) |
|---|---|---|---|---|
| 100B | 2.23 | 1.25 | 1.02 | 1.86 |
| 200B | 3.09 | 1.42 | 1.27 | 2.24 |
| 500B | 4.17 | 1.91 | 1.62 | 3.73 |
| 1KB | 7.97 | 2.37 | 2.26 | 4.53 |
| 10KB | 12.75 | 9.54 | 11.03 | 13.85 |

3. GetData not found

| Data Size | LevelDB | ShardingDB(3 folders) | ShardingDB(6 folders) | ShardingDB(encrypt 3 folders) |
|---|---|---|---|---|
| 100B | 2.14 | 1.36 | 0.87 | 1.43 |
| 200B | 2.07 | 1.47 | 0.9 | 1.6 |
| 500B | 2.05 | 1.51 | 0.93 | 1.81 |
| 1KB | 2.35 | 1.64 | 0.891 | 2.28 |
| 10KB | 8.68 | 5.56 | 2.48 | 7.75 |

4. DeleteData

| Data Size | LevelDB | ShardingDB(3 folders) | ShardingDB(6 folders) | ShardingDB(encrypt 3 folders) |
|---|---|---|---|---|
| 100B | 3.82 | 2.76 | 1.02 | 1.72 |
| 200B | 3.81 | 1.71 | 1.02 | 1.74 |
| 500B | 3.85 | 1.76 | 1.05 | 1.69 |
| 1KB | 3.84 | 1.72 | 1.04 | 1.74 |
| 10KB | 3.844 | 1.78 | 1.06 | 1.76 |

5. Iterator

| Data Size | LevelDB | ShardingDB(3 folders) | ShardingDB(6 folders) | ShardingDB(encrypt 3 folders) |
|---|---|---|---|---|
| 100B | 0.133 | 0.184 | 0.222 | 0.18 |
| 200B | 0.151 | 0.246 | 0.246 | 0.191 |
| 500B | 0.282 | 0.351 | 0.41 | 0.344 |
| 1KB | 0.514 | 0.419 | 0.472 | 0.541 |
| 10KB | 2.46 | 2.39 | 1.96 | 2.3 |

6. Sharding count compare

run command:

go test -timeout 60m -run "TestCompareShardingCountPerformance"
6.1 PutData

| Data Size | ShardingDB(3 folders) | ShardingDB(6 folders) | ShardingDB(9 folders) | ShardingDB(30 folders) | ShardingDB(60 folders) |
|---|---|---|---|---|---|
| 100B | 0.659 | 0.581 | 0.506 | 0.564 | 0.728 |
| 200B | 1.07 | 0.683 | 0.624 | 0.685 | 0.782 |
| 500B | 3.36 | 1.49 | 1.20 | 1.18 | 1.21 |
| 1KB | 9.42 | 3.74 | 2.33 | 1.92 | 1.96 |
| 10KB | 351 | 123 | 54 | 26 | 18.2 |

6.2 GetData

| Data Size | ShardingDB(3 folders) | ShardingDB(6 folders) | ShardingDB(9 folders) | ShardingDB(30 folders) | ShardingDB(60 folders) |
|---|---|---|---|---|---|
| 100B | 1.25 | 1.02 | 1.03 | 0.343 | 0.366 |
| 200B | 1.42 | 1.27 | 1.01 | 0.66 | 0.373 |
| 500B | 1.91 | 1.62 | 1.21 | 0.96 | 1.34 |
| 1KB | 2.37 | 2.26 | 1.83 | 1.18 | 1.19 |
| 10KB | 9.54 | 11.03 | 7.67 | 4.8 | 3.4 |
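
Speedup factors can be read off the tables by dividing the LevelDB time by the ShardingDB time for the same data size. The short worked example below uses two figures reported above: PutData at 10KB (LevelDB 1117 s, 60 folders 18.2 s) and GetData at 100B (LevelDB 2.23 s, 30 folders 0.343 s). The `speedup` helper is purely illustrative.

```go
package main

import "fmt"

// speedup returns how many times faster the candidate is than the baseline,
// given both wall-clock times in seconds.
func speedup(baselineSec, candidateSec float64) float64 {
	return baselineSec / candidateSec
}

func main() {
	// PutData, 10KB values: LevelDB vs ShardingDB with 60 folders.
	fmt.Printf("PutData 10KB: %.1fx\n", speedup(1117, 18.2)) // 61.4x
	// GetData, 100B values: LevelDB vs ShardingDB with 30 folders.
	fmt.Printf("GetData 100B: %.1fx\n", speedup(2.23, 0.343)) // 6.5x
}
```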

Most interfaces are the same as in goleveldb. For the interface definitions, refer to the DbHandle types (CommonDbHandle, LevelDbHandle, ShardingDbHandle).

Documentation

Overview

Package shardingdb provides a sharding db based on goleveldb

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Migration

func Migration(dbReaders []LevelDbHandle, sdb *ShardingDb) error

Migration reorganizes all data from the given LevelDB readers into the new sharding db after the LevelDB count has changed. @param dbReaders @param sdb @return error

func MurmurSharding

func MurmurSharding(key []byte, max uint16) uint16

MurmurSharding sharding function @param key @param max @return uint16

func Sha256Sharding

func Sha256Sharding(key []byte, max uint16) uint16

Sha256Sharding sharding function @param key @param max @return uint16
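
One plausible way to build a function with this signature, shown as an assumption rather than the package's actual implementation: hash the key with SHA-256, read the first two digest bytes as a big-endian integer, and reduce the result modulo `max`. The name `sha256Shard` is hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// sha256Shard maps key to a shard index in [0, max).
// Assumption: only the first two digest bytes are used; the real package may differ.
func sha256Shard(key []byte, max uint16) uint16 {
	if max == 0 {
		return 0 // avoid a division by zero when no shards are configured
	}
	sum := sha256.Sum256(key)
	return binary.BigEndian.Uint16(sum[:2]) % max
}

func main() {
	for _, k := range []string{"alpha", "beta", "gamma"} {
		fmt.Printf("%s -> shard %d\n", k, sha256Shard([]byte(k), 3))
	}
}
```

A cryptographic hash spreads keys evenly regardless of their structure, at the cost of more CPU per lookup than a simple byte fold.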

func XorSharding added in v1.1.0

func XorSharding(key []byte, max uint8) uint8

XorSharding sharding function use XOR
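
A minimal sketch of the XOR idea, assuming the function folds every key byte together and reduces the result modulo `max`. The name `xorShard` is hypothetical and the package's own code may differ in detail.

```go
package main

import "fmt"

// xorShard folds every byte of key together with XOR and reduces the result
// modulo max. Assumption: this mirrors the idea, not the package's exact code.
func xorShard(key []byte, max uint8) uint8 {
	if max == 0 {
		return 0 // avoid a division by zero when no shards are configured
	}
	var acc uint8
	for _, b := range key {
		acc ^= b
	}
	return acc % max
}

func main() {
	// 0x01 ^ 0x02 ^ 0x04 = 0x07, and 7 % 4 = 3.
	fmt.Println(xorShard([]byte{0x01, 0x02, 0x04}, 4)) // 3
}
```

Because the accumulator is a single byte, an approach like this can distinguish at most 256 shard indexes.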

func XorSharding16 added in v1.1.0

func XorSharding16(key []byte, max uint16) uint16

XorSharding16 is a sharding function with a uint16 signature, but max is effectively limited to the uint8 range. @param key @param max @return uint16

Types

type AESCryptor added in v1.1.0

type AESCryptor struct {
	// contains filtered or unexported fields
}

AESCryptor is an encryptor using AES

func NewAESCryptor added in v1.1.0

func NewAESCryptor(key []byte) *AESCryptor

NewAESCryptor creates a new AESCryptor

func (*AESCryptor) Decrypt added in v1.1.0

func (a *AESCryptor) Decrypt(data []byte) ([]byte, error)

Decrypt decrypts data

func (*AESCryptor) Encrypt added in v1.1.0

func (a *AESCryptor) Encrypt(data []byte) ([]byte, error)

Encrypt encrypts data with AES

type Batch

type Batch interface {
	// Put sets the value for the given key.
	// @param key
	// @param value
	Put(key, value []byte)
	// Delete deletes the value for the given key.
	// @param key
	Delete(key []byte)
	// Dump returns the serialized representation of the batch.
	// @return []byte
	Dump() []byte
	// Load loads the batch from the serialized representation returned by Dump.
	// @param data
	// @return error
	Load(data []byte) error
	// Replay replays the batch contents into the given handler.
	// @param r
	// @return error
	Replay(r leveldb.BatchReplay) error
	// Len returns the number of updates in the batch.
	// @return int
	Len() int
	// Reset resets the batch contents.
	Reset()
}

Batch is the interface that wraps the basic methods of a leveldb.Batch

type CommonDbHandle added in v1.1.0

type CommonDbHandle interface {
	// Get returns the value for the given key.
	// @param key
	// @param ro
	// @return value
	// @return err
	Get(key []byte, ro *opt.ReadOptions) (value []byte, err error)
	// Has reports whether the DB contains the given key.
	// @param key
	// @param ro
	// @return ret
	// @return err
	Has(key []byte, ro *opt.ReadOptions) (ret bool, err error)
	// NewIterator returns an iterator for the latest snapshot of the DB.
	// @param slice
	// @param ro
	// @return iterator.Iterator
	NewIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator

	// GetProperty returns the value of the given property for the DB.
	// @param name
	// @return value
	// @return err
	GetProperty(name string) (value string, err error)
	// Stats returns the DB's leveldb.DBStats.
	// @param s
	// @return error
	Stats(s *leveldb.DBStats) error
	// SizeOf returns the approximate file system space used by keys in the given ranges.
	// @param ranges
	// @return leveldb.Sizes
	// @return error
	SizeOf(ranges []util.Range) (leveldb.Sizes, error)
	// Close closes the DB.
	// @return error
	Close() error

	// Write writes the given batch to the DB.
	// @param batch
	// @param wo
	// @return error
	Write(batch *leveldb.Batch, wo *opt.WriteOptions) error
	// Put sets the value for the given key.
	// @param key
	// @param value
	// @param wo
	// @return error
	Put(key, value []byte, wo *opt.WriteOptions) error
	// Delete deletes the value for the given key.
	// @param key
	// @param wo
	// @return error
	Delete(key []byte, wo *opt.WriteOptions) error
	// CompactRange manually compacts the underlying DB for the given key range.
	// @param r
	// @return error
	CompactRange(r util.Range) error
	// SetReadOnly sets the DB to read-only mode.
	// @return error
	SetReadOnly() error
}

CommonDbHandle is the interface that wraps the basic methods of a leveldb.DB

type DbOption added in v1.1.0

type DbOption func(db *ShardingDb)

DbOption is used to set options for ShardingDb

func WithDbHandles added in v1.1.0

func WithDbHandles(dbHandles ...LevelDbHandle) DbOption

WithDbHandles sets dbHandles for ShardingDb

func WithDbPaths added in v1.1.0

func WithDbPaths(paths ...string) DbOption

WithDbPaths sets the db paths for ShardingDb

func WithEncryptor added in v1.1.0

func WithEncryptor(e Encryptor) DbOption

WithEncryptor sets encryptor for ShardingDb

func WithLogger added in v1.1.0

func WithLogger(l Logger) DbOption

WithLogger sets logger for ShardingDb

func WithShardingFunc added in v1.1.0

func WithShardingFunc(f ShardingFunc) DbOption

WithShardingFunc sets shardingFunc for ShardingDb
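
The `With...` functions follow Go's functional options pattern: each returns a function that mutates the db under construction, and the constructor applies them in order. A self-contained sketch of the pattern is below. `store`, `option`, `withPaths`, `withShardFn`, and the default routing rule are all hypothetical stand-ins, not the package's internals.

```go
package main

import "fmt"

// store is a stand-in for ShardingDb; its fields are hypothetical.
type store struct {
	paths   []string
	shardFn func(key []byte, max uint16) uint16
}

// option mirrors the shape of DbOption: a function that mutates the store.
type option func(*store)

func withPaths(paths ...string) option {
	return func(s *store) { s.paths = append(s.paths, paths...) }
}

func withShardFn(f func(key []byte, max uint16) uint16) option {
	return func(s *store) { s.shardFn = f }
}

// newStore applies options in order and rejects an unusable configuration.
func newStore(opts ...option) (*store, error) {
	s := &store{}
	for _, o := range opts {
		o(s)
	}
	if len(s.paths) == 0 {
		return nil, fmt.Errorf("no paths configured")
	}
	if s.shardFn == nil {
		// Default: route by the first key byte (illustrative only).
		s.shardFn = func(key []byte, max uint16) uint16 {
			if len(key) == 0 || max == 0 {
				return 0
			}
			return uint16(key[0]) % max
		}
	}
	return s, nil
}

func main() {
	s, err := newStore(withPaths("/data1", "/data2"))
	if err != nil {
		panic(err)
	}
	// 'k' is 107, and 107 % 2 = 1.
	fmt.Println(len(s.paths), s.shardFn([]byte("k"), uint16(len(s.paths)))) // 2 1
}
```

The pattern lets a constructor gain new settings without changing its signature, which is why `NewShardingDb` can accept any mix of options.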

type Encryptor added in v1.1.0

type Encryptor interface {
	// Encrypt encrypts the given data.
	Encrypt(data []byte) ([]byte, error)
	// Decrypt decrypts the given data.
	Decrypt(data []byte) ([]byte, error)
}

Encryptor is the interface that wraps the basic methods of an encryptor
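
Any type with these two methods can be passed to `WithEncryptor`. As an illustration, here is a sketch of a custom implementation built on AES-GCM from the standard library. `gcmEncryptor` and `newGCMEncryptor` are hypothetical names, and this is not how `AESCryptor` itself is implemented; its cipher mode is not documented here.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// Encryptor repeats the interface documented above so the sketch compiles alone.
type Encryptor interface {
	Encrypt(data []byte) ([]byte, error)
	Decrypt(data []byte) ([]byte, error)
}

// gcmEncryptor is a hypothetical implementation; it is not AESCryptor.
type gcmEncryptor struct{ aead cipher.AEAD }

func newGCMEncryptor(key []byte) (*gcmEncryptor, error) {
	block, err := aes.NewCipher(key) // key must be 16, 24, or 32 bytes
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return &gcmEncryptor{aead: aead}, nil
}

// Encrypt prepends a fresh random nonce to the ciphertext.
func (g *gcmEncryptor) Encrypt(data []byte) ([]byte, error) {
	nonce := make([]byte, g.aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return g.aead.Seal(nonce, nonce, data, nil), nil
}

// Decrypt splits off the nonce and authenticates the remainder.
func (g *gcmEncryptor) Decrypt(data []byte) ([]byte, error) {
	n := g.aead.NonceSize()
	if len(data) < n {
		return nil, errors.New("ciphertext too short")
	}
	return g.aead.Open(nil, data[:n], data[n:], nil)
}

func main() {
	g, err := newGCMEncryptor([]byte("0123456789abcdef"))
	if err != nil {
		panic(err)
	}
	var e Encryptor = g
	ct, err := e.Encrypt([]byte("value"))
	if err != nil {
		panic(err)
	}
	pt, err := e.Decrypt(ct)
	if err != nil {
		panic(err)
	}
	fmt.Println(bytes.Equal(pt, []byte("value"))) // true
}
```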

type LevelDbHandle

type LevelDbHandle interface {
	CommonDbHandle
	// GetSnapshot returns a new snapshot of the DB.
	GetSnapshot() (*leveldb.Snapshot, error)
	// OpenTransaction opens a transaction.
	OpenTransaction() (*leveldb.Transaction, error)
}

LevelDbHandle is the interface that wraps the basic LevelDB methods.

type Logger

type Logger interface {
	// Debug logs a debug message.
	Debug(msg string)
	// Info logs an info message.
	Info(msg string)
}

Logger is the interface that wraps the basic methods of a logger
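
Because the interface has only two methods, adapting an existing logger takes a few lines. The sketch below wraps the standard library's `*log.Logger`; `stdLogger`, its `verbose` switch, and `newBufferLogger` are hypothetical names chosen for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// Logger repeats the interface documented above so the sketch compiles alone.
type Logger interface {
	Debug(msg string)
	Info(msg string)
}

// stdLogger adapts *log.Logger to Logger; verbose gates Debug output.
type stdLogger struct {
	out     *log.Logger
	verbose bool
}

func (l stdLogger) Debug(msg string) {
	if l.verbose {
		l.out.Print("DEBUG " + msg)
	}
}

func (l stdLogger) Info(msg string) { l.out.Print("INFO " + msg) }

// newBufferLogger returns a logger that writes into an in-memory buffer,
// which keeps the example free of global state.
func newBufferLogger(verbose bool) (Logger, *bytes.Buffer) {
	buf := &bytes.Buffer{}
	return stdLogger{out: log.New(buf, "", 0), verbose: verbose}, buf
}

func main() {
	lg, buf := newBufferLogger(false)
	lg.Debug("hidden because verbose is false")
	lg.Info("opened 2 shards")
	fmt.Print(buf.String()) // prints "INFO opened 2 shards"
}
```

A value like this could then be handed to `WithLogger` when constructing the db.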

type ShardingBatch

type ShardingBatch struct {
	// contains filtered or unexported fields
}

ShardingBatch is a batch of multiple db

func NewShardingBatch

func NewShardingBatch(len uint16, shardingFunc ShardingFunc, e Encryptor) *ShardingBatch

NewShardingBatch returns a new ShardingBatch

func (*ShardingBatch) Delete

func (s *ShardingBatch) Delete(key []byte)

Delete deletes the value for the given key

func (*ShardingBatch) GetSplitBatch

func (s *ShardingBatch) GetSplitBatch() map[uint16]*leveldb.Batch

GetSplitBatch returns a map of db index to batch
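
Conceptually, splitting a batch means grouping its operations by the shard index that the sharding function assigns to each key. A sketch of that grouping with plain slices is below. The `op` type, `splitByShard`, and `firstByte` are hypothetical; the real method returns `*leveldb.Batch` values rather than slices.

```go
package main

import "fmt"

// op is a hypothetical stand-in for one batched Put (or a Delete when value is nil).
type op struct {
	key, value []byte
}

// splitByShard groups ops by the shard index that shardFn assigns to each key,
// preserving the original order within every shard.
func splitByShard(ops []op, shards uint16, shardFn func(key []byte, max uint16) uint16) map[uint16][]op {
	out := make(map[uint16][]op)
	for _, o := range ops {
		idx := shardFn(o.key, shards)
		out[idx] = append(out[idx], o)
	}
	return out
}

// firstByte is an illustrative sharding function, not one the package exports.
func firstByte(key []byte, max uint16) uint16 {
	if len(key) == 0 || max == 0 {
		return 0
	}
	return uint16(key[0]) % max
}

func main() {
	ops := []op{
		{[]byte("a"), []byte("1")},
		{[]byte("b"), []byte("2")},
		{[]byte("c"), nil},
	}
	// "a" and "c" have odd first bytes and land on shard 1; "b" lands on shard 0.
	for idx, group := range splitByShard(ops, 2, firstByte) {
		fmt.Printf("shard %d: %d ops\n", idx, len(group))
	}
}
```

Each per-shard group can then be written to its own db independently of the others.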

func (*ShardingBatch) Put

func (s *ShardingBatch) Put(key, value []byte)

Put sets the value for the given key

type ShardingDb

type ShardingDb struct {
	// contains filtered or unexported fields
}

ShardingDb is a db of multiple db

func NewShardingDb

func NewShardingDb(options ...DbOption) (*ShardingDb, error)

NewShardingDb creates a new ShardingDb configured by the given options. @param options @return *ShardingDb @return error

func OpenFile

func OpenFile(path []string, o *opt.Options) (db *ShardingDb, err error)

OpenFile opens multiple dbs; it mirrors leveldb.OpenFile. @param path @param o @return db @return err

func (*ShardingDb) Close

func (sdb *ShardingDb) Close() error

Close closes all dbs. @return error

func (*ShardingDb) CompactRange

func (sdb *ShardingDb) CompactRange(r util.Range) error

CompactRange compacts the given key range. @param r @return error

func (*ShardingDb) Debugf

func (sdb *ShardingDb) Debugf(msg string, a ...interface{})

Debugf logs a formatted debug message. @param msg @param a

func (*ShardingDb) Delete

func (sdb *ShardingDb) Delete(key []byte, wo *opt.WriteOptions) error

Delete removes the given key from the database. If there are multiple replicas, it removes the key from all replicas concurrently and waits for all of them to complete. @param key @param wo @return error

func (*ShardingDb) Get

func (sdb *ShardingDb) Get(key []byte, ro *opt.ReadOptions) (value []byte, err error)

Get returns the value for the given key. @param key @param ro @return value @return err

func (*ShardingDb) GetProperty

func (sdb *ShardingDb) GetProperty(name string) (value string, err error)

GetProperty returns the value of the given property. @param name @return value @return err

func (*ShardingDb) GetSnapshot

func (sdb *ShardingDb) GetSnapshot() (Snapshot, error)

GetSnapshot returns a new snapshot. @return Snapshot @return error

func (*ShardingDb) Has

func (sdb *ShardingDb) Has(key []byte, ro *opt.ReadOptions) (ret bool, err error)

Has checks if the given key exists in the database. If there are multiple replicas, it checks for the key in all replicas concurrently and returns true if any replica contains the key. @param key @param ro @return ret @return err
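
The "check everywhere, succeed if any hit" behavior described above can be sketched with a `sync.WaitGroup`. Plain maps stand in for the underlying LevelDB handles, and `hasAny` is a hypothetical helper, not the package's code.

```go
package main

import (
	"fmt"
	"sync"
)

// hasAny reports whether any store contains key, querying all stores concurrently.
// Plain maps stand in for the underlying LevelDB handles (an assumption).
func hasAny(stores []map[string][]byte, key string) bool {
	var wg sync.WaitGroup
	found := make([]bool, len(stores)) // one slot per goroutine, so no locking is needed
	for i, s := range stores {
		wg.Add(1)
		go func(i int, s map[string][]byte) {
			defer wg.Done()
			_, found[i] = s[key]
		}(i, s)
	}
	wg.Wait()
	for _, f := range found {
		if f {
			return true
		}
	}
	return false
}

func main() {
	stores := []map[string][]byte{
		{"a": []byte("1")},
		{"b": []byte("2")},
	}
	fmt.Println(hasAny(stores, "b"), hasAny(stores, "z")) // true false
}
```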

func (*ShardingDb) Infof

func (sdb *ShardingDb) Infof(msg string, a ...interface{})

Infof logs a formatted info message. @param msg @param a

func (*ShardingDb) NewIterator

func (sdb *ShardingDb) NewIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator

NewIterator creates a new iterator. @param slice @param ro @return iterator.Iterator

func (*ShardingDb) OpenTransaction

func (sdb *ShardingDb) OpenTransaction() (Transaction, error)

OpenTransaction opens a transaction. @return Transaction @return error

func (*ShardingDb) Put

func (sdb *ShardingDb) Put(key, value []byte, wo *opt.WriteOptions) error

Put writes the given key-value pair to the database. If there are multiple replicas, it writes the key-value pair to all replicas concurrently and waits for all of them to complete. @param key @param value @param wo @return error

func (*ShardingDb) Resharding

func (sdb *ShardingDb) Resharding() error

Resharding reorganizes all data in the original LevelDB folders after the LevelDB count has changed. @return error
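
When the shard count changes, a key's index under the sharding function may no longer match the folder it is stored in, so it has to move. The sketch below illustrates that idea with in-memory maps. `reshard` and `byLength` are hypothetical, and the package's actual procedure (batching, ordering, error handling) is not described in this document.

```go
package main

import "fmt"

// reshard moves every key whose shard index under shardFn no longer matches the
// shard it is stored in. shards must already have its new length; maps stand in
// for LevelDB folders (an assumption). It returns the number of keys moved.
func reshard(shards []map[string]string, shardFn func(key []byte, max uint16) uint16) int {
	type move struct {
		from, to uint16
		key      string
	}
	var moves []move
	n := uint16(len(shards))
	for i, s := range shards {
		for k := range s {
			if to := shardFn([]byte(k), n); to != uint16(i) {
				moves = append(moves, move{uint16(i), to, k})
			}
		}
	}
	// Apply moves after scanning so no map is modified while it is being ranged over.
	for _, m := range moves {
		shards[m.to][m.key] = shards[m.from][m.key]
		delete(shards[m.from], m.key)
	}
	return len(moves)
}

// byLength is an illustrative sharding function that routes by key length.
func byLength(key []byte, max uint16) uint16 { return uint16(len(key)) % max }

func main() {
	// One populated shard grows to three: "a" moves to shard 1, "bb" to shard 2,
	// and "ccc" (length 3, 3 % 3 = 0) stays where it is.
	shards := []map[string]string{{"a": "1", "bb": "2", "ccc": "3"}, {}, {}}
	moved := reshard(shards, byLength)
	fmt.Println("moved:", moved, "sizes:", len(shards[0]), len(shards[1]), len(shards[2]))
	// prints "moved: 2 sizes: 1 1 1"
}
```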

func (*ShardingDb) SetReadOnly

func (sdb *ShardingDb) SetReadOnly() error

SetReadOnly sets the db to read-only mode. @return error

func (*ShardingDb) ShardCount added in v1.1.0

func (sdb *ShardingDb) ShardCount() uint16

ShardCount returns the number of shards

func (*ShardingDb) SizeOf

func (sdb *ShardingDb) SizeOf(ranges []util.Range) (leveldb.Sizes, error)

SizeOf returns the sizes of the given ranges. @param ranges @return leveldb.Sizes @return error

func (*ShardingDb) Stats

func (sdb *ShardingDb) Stats(s *leveldb.DBStats) error

Stats fills s with the db statistics. @param s @return error

func (*ShardingDb) Write

func (sdb *ShardingDb) Write(batch *leveldb.Batch, wo *opt.WriteOptions) error

Write applies the given batch to the database. If there are multiple replicas, it applies the batch to all replicas concurrently and waits for all of them to complete. @param batch @param wo @return error
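
The "apply concurrently and wait for all" behavior can be sketched as a fan-out that records one error slot per underlying db. `writeAll` and `failOn` are hypothetical helpers; how the package itself reports partial failures is not stated here, so returning the first error by shard order is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// writeAll runs write once per shard index concurrently, waits for all of them,
// and returns the first error by shard order (nil if every write succeeded).
func writeAll(shards int, write func(shard int) error) error {
	errs := make([]error, shards) // one slot per goroutine, so no locking is needed
	var wg sync.WaitGroup
	for i := 0; i < shards; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			errs[i] = write(i)
		}(i)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}

// failOn returns a write function that fails only for the given shard.
func failOn(bad int) func(int) error {
	return func(shard int) error {
		if shard == bad {
			return fmt.Errorf("shard %d: write failed", shard)
		}
		return nil
	}
}

func main() {
	fmt.Println(writeAll(3, failOn(-1))) // <nil>
	fmt.Println(writeAll(3, failOn(1)))  // shard 1: write failed
}
```

Note that a failure on one shard does not undo writes that already succeeded on the others; a caller that needs atomicity across shards would have to handle that separately.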

type ShardingDbHandle added in v1.1.0

type ShardingDbHandle interface {
	CommonDbHandle
	// GetSnapshot returns a new snapshot of the DB.
	// @return Snapshot
	// @return error
	GetSnapshot() (Snapshot, error)
	// OpenTransaction opens a transaction.
	// @return Transaction
	// @return error
	OpenTransaction() (Transaction, error)
	// Resharding resharding the DB.
	// @return error
	Resharding() error
	// ShardCount returns the shard count of the DB.
	ShardCount() uint16
}

ShardingDbHandle is the interface that wraps the basic methods of a leveldb.DB

type ShardingFunc added in v1.1.0

type ShardingFunc func(key []byte, max uint16) uint16

ShardingFunc is a function to calculate the index of db

type ShardingSnapshot

type ShardingSnapshot struct {
	// contains filtered or unexported fields
}

ShardingSnapshot is a snapshot of multiple db

func (ShardingSnapshot) Get

func (s ShardingSnapshot) Get(key []byte, ro *opt.ReadOptions) (value []byte, err error)

Get returns the value for the given key

func (ShardingSnapshot) Has

func (s ShardingSnapshot) Has(key []byte, ro *opt.ReadOptions) (ret bool, err error)

Has reports whether the DB contains the given key

func (ShardingSnapshot) NewIterator

func (s ShardingSnapshot) NewIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator

NewIterator returns an iterator for the latest snapshot of the DB

func (ShardingSnapshot) Release

func (s ShardingSnapshot) Release()

Release releases the snapshot

func (ShardingSnapshot) String

func (s ShardingSnapshot) String() string

String returns a string representation of the snapshot

type ShardingTransaction

type ShardingTransaction struct {
	// contains filtered or unexported fields
}

ShardingTransaction is a transaction of multiple db

func (ShardingTransaction) Commit

func (s ShardingTransaction) Commit() error

Commit commits the transaction.

func (ShardingTransaction) Delete

func (s ShardingTransaction) Delete(key []byte, wo *opt.WriteOptions) error

Delete deletes the value for the given key.

func (ShardingTransaction) Discard

func (s ShardingTransaction) Discard()

Discard discards the transaction.

func (ShardingTransaction) Get

func (s ShardingTransaction) Get(key []byte, ro *opt.ReadOptions) ([]byte, error)

Get returns the value for the given key

func (ShardingTransaction) Has

func (s ShardingTransaction) Has(key []byte, ro *opt.ReadOptions) (bool, error)

Has reports whether the DB contains the given key

func (ShardingTransaction) NewIterator

func (s ShardingTransaction) NewIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator

NewIterator returns an iterator for the latest snapshot of the DB.

func (ShardingTransaction) Put

func (s ShardingTransaction) Put(key, value []byte, wo *opt.WriteOptions) error

Put sets the value for the given key.

func (ShardingTransaction) Write

Write writes the given batch to the DB.

type Snapshot

type Snapshot interface {
	// String returns a string representation of the snapshot
	String() string
	// Get returns the value for the given key
	Get(key []byte, ro *opt.ReadOptions) (value []byte, err error)
	// Has reports whether the DB contains the given key
	Has(key []byte, ro *opt.ReadOptions) (ret bool, err error)
	// NewIterator returns an iterator for the latest snapshot of the DB
	NewIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator
	// Release releases the snapshot
	Release()
}

Snapshot is the interface that wraps the basic methods of a leveldb.Snapshot

type Transaction

type Transaction interface {
	// Get returns the value for the given key.
	Get(key []byte, ro *opt.ReadOptions) ([]byte, error)
	// Has reports whether the DB contains the given key.
	Has(key []byte, ro *opt.ReadOptions) (bool, error)
	// NewIterator returns an iterator for the latest snapshot of the DB.
	NewIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator
	// Put sets the value for the given key.
	Put(key, value []byte, wo *opt.WriteOptions) error
	// Delete deletes the value for the given key.
	Delete(key []byte, wo *opt.WriteOptions) error
	// Write writes the given batch to the DB.
	Write(b *leveldb.Batch, wo *opt.WriteOptions) error
	// Commit commits the transaction.
	Commit() error
	// Discard discards the transaction.
	Discard()
}

Transaction is the interface that wraps the basic methods of a leveldb.Transaction

Directories

Path Synopsis
cmd
