schemaless

package module
v0.0.0-...-4406380 Latest
Published: Oct 14, 2021 License: MIT Imports: 6 Imported by: 0

README

This is an open-source, MIT-licensed implementation of Uber's Schemaless (immutable BigTable-style sharded RDBMS).

This is only a learning project and is not production-ready.

The GitHub issues list describes what has been intentionally left unimplemented and how this implementation differs from Uber's (based on the materials linked at the end).

All code is in Go.

API SUPPORTED

Get(ctx context.Context, tableName, rowKey, columnKey string, refKey int64) (cell models.Cell, found bool, err error)

GetLatest(ctx context.Context, tableName, rowKey, columnKey string) (cell models.Cell, found bool, err error)

PartitionRead(ctx context.Context, tableName string, partitionNumber int, location string, value int64, limit int) (cells []models.Cell, found bool, err error)

FindPartition(tblName, rowKey string) (int, error) 

Put(ctx context.Context, tableName, rowKey, columnKey string, refKey int64, jsonBody string) (err error)

ResetConnection(ctx context.Context, key string) error

Destroy(ctx context.Context) error
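
The signatures above address a cell by (row key, column key, ref key), and GetLatest returns the newest version of a cell. The following is a minimal in-memory sketch of that model, not code from this package. The Cell and memStore types are hypothetical, and two behaviours are assumptions made for illustration: "latest" means the highest ref key, and Put refuses to overwrite an existing cell. Context, table names, and sharding are left out.

```go
package main

import (
	"errors"
	"fmt"
)

// Cell is a hypothetical stand-in for models.Cell, whose fields are not shown here.
type Cell struct {
	RowKey    string
	ColumnKey string
	RefKey    int64
	Body      string
}

// cellID addresses a cell by (row key, column key, ref key).
type cellID struct {
	row, col string
	ref      int64
}

// memStore is an illustrative in-memory store, not part of this package.
type memStore struct {
	cells map[cellID]Cell
}

func newMemStore() *memStore {
	return &memStore{cells: make(map[cellID]Cell)}
}

var errCellExists = errors.New("cell already exists")

// Put adds a new cell and refuses to overwrite an existing one (assumed immutability).
func (s *memStore) Put(rowKey, columnKey string, refKey int64, body string) error {
	id := cellID{rowKey, columnKey, refKey}
	if _, ok := s.cells[id]; ok {
		return errCellExists
	}
	s.cells[id] = Cell{rowKey, columnKey, refKey, body}
	return nil
}

// Get returns the cell with exactly the given ref key.
func (s *memStore) Get(rowKey, columnKey string, refKey int64) (Cell, bool) {
	c, ok := s.cells[cellID{rowKey, columnKey, refKey}]
	return c, ok
}

// GetLatest returns the cell with the highest ref key for (rowKey, columnKey).
func (s *memStore) GetLatest(rowKey, columnKey string) (Cell, bool) {
	var latest Cell
	found := false
	for id, c := range s.cells {
		if id.row != rowKey || id.col != columnKey {
			continue
		}
		if !found || id.ref > latest.RefKey {
			latest, found = c, true
		}
	}
	return latest, found
}

func main() {
	s := newMemStore()
	_ = s.Put("trip-1", "BASE", 1, `{"status":"requested"}`)
	_ = s.Put("trip-1", "BASE", 2, `{"status":"completed"}`)

	if c, ok := s.Get("trip-1", "BASE", 1); ok {
		fmt.Println("ref 1:", c.Body)
	}
	if c, ok := s.GetLatest("trip-1", "BASE"); ok {
		fmt.Println("latest:", c.Body)
	}
}
```

An update in this model is a Put with a new ref key, so earlier versions stay readable through Get.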

DATABASE SUPPORT

For learning and experimentation:

* SQLite

For more serious testing and usage:

* MySQL

* Postgres

DISCLAIMER

I do not work for Uber Technologies.

VIDEOS

"Taking Storage for a Ride With Uber", https://www.youtube.com/watch?v=Dg76cNaeB4s (30 mins)

"GOTO 2016 • Taking Storage for a Ride", https://www.youtube.com/watch?v=kq4gp90QUcs (1 hour)

ARTICLES

"Designing Schemaless, Uber Engineering’s Scalable Datastore Using MySQL"

https://eng.uber.com/schemaless-part-one/ - "Part One"

https://eng.uber.com/schemaless-part-two/ - "Part Two"

https://eng.uber.com/schemaless-part-three/ - "Part Three"

https://eng.uber.com/schemaless-rewrite/ - "Code Migration in Production: Rewriting the Sharding Layer of Uber’s Schemaless Datastore"

https://eng.uber.com/mezzanine-codebase-data-migration/ - "Project Mezzanine: The Great Migration"

OTHER RESOURCES

https://backchannel.org/blog/friendfeed-schemaless-mysql - FriendFeed's original design

https://engineering.pinterest.com/blog/sharding-pinterest-how-we-scaled-our-mysql-fleet - Pinterest's original design

https://martinfowler.com/articles/schemaless/ - Martin Fowler's slides on Schemaless Data Structures

SIMILAR OPEN-SOURCE WORK

https://github.com/hoteltonight/shameless - A similar append-only data store in Ruby influenced by Schemaless.

https://github.com/dgryski/go-shardedkv - Much of the implementation is a derivative of this work.

THANKS

To Damian Gryski for releasing https://github.com/dgryski/go-shardedkv

To Uber Technologies for releasing numerous materials on the design and implementation of Mezzanine, their Schemaless store.

To John Rinehart for NixOS support.

To Simon Dassow for Postgres debugging and support.

And to many others :)

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Chooser

type Chooser interface {
	// SetBuckets sets the list of known buckets from which the chooser should select
	SetBuckets([]string) error
	// Choose returns a bucket for a given key
	Choose(key string) string
	// Buckets returns the list of known buckets
	Buckets() []string
}

Chooser maps keys to shards.
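
One way to satisfy this interface is to hash the key and take the result modulo the number of buckets. The sketch below copies the Chooser interface from above so that it compiles on its own; modChooser and the choice of FNV-1a are assumptions for illustration and are not necessarily what this package uses.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
)

// Chooser is copied from the interface documented above.
type Chooser interface {
	SetBuckets([]string) error
	Choose(key string) string
	Buckets() []string
}

// modChooser is a hypothetical Chooser: FNV-1a hash of the key, modulo bucket count.
type modChooser struct {
	buckets []string
}

// SetBuckets stores a private copy of the bucket names.
func (c *modChooser) SetBuckets(buckets []string) error {
	if len(buckets) == 0 {
		return errors.New("at least one bucket is required")
	}
	c.buckets = append([]string(nil), buckets...)
	return nil
}

// Choose returns the same bucket for the same key as long as the bucket list is unchanged.
func (c *modChooser) Choose(key string) string {
	if len(c.buckets) == 0 {
		return ""
	}
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return c.buckets[h.Sum32()%uint32(len(c.buckets))]
}

// Buckets returns a copy so callers cannot mutate the chooser's state.
func (c *modChooser) Buckets() []string {
	return append([]string(nil), c.buckets...)
}

func main() {
	var c Chooser = &modChooser{}
	if err := c.SetBuckets([]string{"shard0", "shard1", "shard2"}); err != nil {
		panic(err)
	}
	for _, key := range []string{"trip-1", "trip-2", "user-42"} {
		fmt.Printf("%s -> %s\n", key, c.Choose(key))
	}
}
```

Modulo placement is simple but remaps most keys when the number of buckets changes, which is why the interface leaves the strategy open.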

type DataStore

type DataStore struct {
	// contains filtered or unexported fields
}

DataStore is our overall datastore structure, backed by at least one KVStore.

func New

func New() *DataStore

New is an empty constructor for DataStore.

func (*DataStore) Destroy

func (ds *DataStore) Destroy(ctx context.Context) error

Destroy implements Storage.Destroy()

func (*DataStore) FindPartition

func (ds *DataStore) FindPartition(tblName, rowKey string) (int, error)

FindPartition implements Storage.FindPartition()

func (*DataStore) Get

func (ds *DataStore) Get(ctx context.Context, tblName, rowKey, columnKey string, refKey int64) (cell models.Cell, found bool, err error)

Get implements Storage.Get()

func (*DataStore) GetLatest

func (ds *DataStore) GetLatest(ctx context.Context, tblName, rowKey, columnKey string) (cell models.Cell, found bool, err error)

GetLatest implements Storage.GetLatest()

func (*DataStore) PartitionRead

func (ds *DataStore) PartitionRead(ctx context.Context, tblName string, partitionNumber int, location string, value int64, limit int) (cells []models.Cell, found bool, err error)

PartitionRead implements Storage.PartitionRead()

func (*DataStore) Put

func (ds *DataStore) Put(ctx context.Context, tblName, rowKey, columnKey string, refKey int64, body string) error

Put implements Storage.Put()

func (*DataStore) ResetConnection

func (ds *DataStore) ResetConnection(ctx context.Context, tblName, rowKey string) error

ResetConnection implements Storage.ResetConnection()

func (*DataStore) WithName

func (ds *DataStore) WithName(tblName string, bucketName string) *DataStore

func (*DataStore) WithSources

func (ds *DataStore) WithSources(tblName string, shards []core.Shard) *DataStore

type Shard

type Shard struct {
	Name    string
	Backend Storage
}

Shard is a named storage backend.
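
Because a Shard pairs a name with a backend, a bucket name produced by a Chooser can be resolved to the backend that should serve a key. The sketch below shows that lookup under stated assumptions: Storage is cut down to a single Put method, and mapBackend, router, and chooseByLength are hypothetical helpers that do not exist in this package.

```go
package main

import "fmt"

// Storage is reduced to a single method here for brevity (an assumption);
// the interface documented in this package has more methods.
type Storage interface {
	Put(rowKey, body string) error
}

// Shard mirrors the documented struct: a named storage backend.
type Shard struct {
	Name    string
	Backend Storage
}

// mapBackend is a hypothetical in-memory backend used only for this sketch.
type mapBackend struct {
	data map[string]string
}

func newMapBackend() *mapBackend {
	return &mapBackend{data: make(map[string]string)}
}

func (m *mapBackend) Put(rowKey, body string) error {
	m.data[rowKey] = body
	return nil
}

// router indexes shards by name and delegates key placement to choose.
type router struct {
	byName map[string]Shard
	choose func(key string) string
}

func newRouter(shards []Shard, choose func(key string) string) *router {
	byName := make(map[string]Shard, len(shards))
	for _, s := range shards {
		byName[s.Name] = s
	}
	return &router{byName: byName, choose: choose}
}

// backendFor returns the backend of the shard chosen for rowKey.
func (r *router) backendFor(rowKey string) (Storage, error) {
	name := r.choose(rowKey)
	s, ok := r.byName[name]
	if !ok {
		return nil, fmt.Errorf("no shard named %q", name)
	}
	return s.Backend, nil
}

// chooseByLength is a toy placement rule: even-length keys go to "even".
func chooseByLength(key string) string {
	if len(key)%2 == 0 {
		return "even"
	}
	return "odd"
}

func main() {
	even, odd := newMapBackend(), newMapBackend()
	r := newRouter([]Shard{{"even", even}, {"odd", odd}}, chooseByLength)

	for _, key := range []string{"ab", "abc"} {
		backend, err := r.backendFor(key)
		if err != nil {
			panic(err)
		}
		if err := backend.Put(key, "{}"); err != nil {
			panic(err)
		}
	}
	fmt.Println("even shard keys:", len(even.data), "odd shard keys:", len(odd.data))
}
```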

type Storage

type Storage interface {
	// Get returns the cell designated by (row key, column key, ref key)
	Get(ctx context.Context, tblName, rowKey, columnKey string, refKey int64) (cell models.Cell, found bool, err error)

	// GetLatest returns the latest value for a given rowKey and columnKey, and a bool indicating if the key was present
	GetLatest(ctx context.Context, tblName, rowKey, columnKey string) (cell models.Cell, found bool, err error)

	// PartitionRead returns 'limit' cells after 'location' from partition 'partitionNumber'
	PartitionRead(ctx context.Context, tblName string, partitionNumber int, location string, value int64, limit int) (cells []models.Cell, found bool, err error)

	// Put inits a cell with given row key, column key, and ref key
	Put(ctx context.Context, tblName, rowKey, columnKey string, refKey int64, body string) (err error)

	// FindPartition returns the partition number for a specific rowKey
	FindPartition(tblName, rowKey string) int

	// ResetConnection reinitializes the connection for the shard responsible for a key
	ResetConnection(ctx context.Context, key string) error

	// Destroy cleans up any resources, etc.
	Destroy(ctx context.Context) error
}

Storage is a key-value storage backend.

Directories

Path Synopsis
examples
schemalessd/pkg/middleware/zap
Package zap is a mirror of https://github.com/treastech/logger/blob/master/logger.go with some slight tweaks in names.
storage
mysql
Package mysql is a mysql-backed Schemaless store.
postgres
Package postgres is a postgres-backed Schemaless store.
tools
