disktable

package module

Version: v0.0.0-...-423a4a7 (latest) | Published: Oct 1, 2022 | License: MIT | Imports: 22 | Imported by: 0

Repository

github.com/element-of-surprise/disktable

README ¶

DiskTable

NOTE

This disk format is not stable yet, nor is the API. So unless you can afford to erase the data on a new build, don't use this.

If you want to use it simply as a disk cache that you create on startup, then it should be fine. But I make no guarantees.

Introduction

I needed a very basic local NoSQL store that I could embed in my Go program, allowing me to serve data off disk with some semblance of speed.

It needed to be:

Write once
Read many
Support a main data repo
Support a bunch of indexes on the data
Support duplicates in indexes
Lookup by indexes
Stream all data fast

These disktables are built off of outcaste.io/dgraph.

Now, this lacks SQL-like characteristics. It is simply a bunch of key/value stores that let you do exact matches on indexes in order to find matching data. So I can say things like: "find cars that are blue, have a V8 and are made by Chevy".
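
To make the exact-match model concrete, here is a minimal in-memory sketch of how several exact-match index lookups can be combined with a logical AND. It is not how disktable is implemented; the car type, the index map and the fetch helper are hypothetical names used only for illustration.

```go
package main

import "fmt"

// car is a hypothetical record type used only for this illustration.
type car struct {
	Make, Color, Engine string
}

// index maps an exact key to the positions of the records that carry it.
// Duplicates are allowed, so each key holds a slice of positions.
type index map[string][]int

// buildIndex indexes cars by whatever value the key function extracts.
func buildIndex(cars []car, key func(car) string) index {
	idx := index{}
	for pos, c := range cars {
		k := key(c)
		idx[k] = append(idx[k], pos)
	}
	return idx
}

// lookup pairs an index with the exact key to match in it.
type lookup struct {
	idx index
	key string
}

// contains reports whether want is present in positions.
func contains(positions []int, want int) bool {
	for _, p := range positions {
		if p == want {
			return true
		}
	}
	return false
}

// fetch returns the positions that match the primary lookup and every
// secondary lookup.
func fetch(primary lookup, secondaries ...lookup) []int {
	var out []int
	for _, pos := range primary.idx[primary.key] {
		match := true
		for _, s := range secondaries {
			if !contains(s.idx[s.key], pos) {
				match = false
				break
			}
		}
		if match {
			out = append(out, pos)
		}
	}
	return out
}

// sampleCars returns a small made-up data set.
func sampleCars() []car {
	return []car{
		{Make: "chevy", Color: "blue", Engine: "v8"},
		{Make: "chevy", Color: "red", Engine: "v8"},
		{Make: "ford", Color: "blue", Engine: "v8"},
		{Make: "chevy", Color: "blue", Engine: "v6"},
	}
}

func main() {
	cars := sampleCars()
	byMake := buildIndex(cars, func(c car) string { return c.Make })
	byColor := buildIndex(cars, func(c car) string { return c.Color })
	byEngine := buildIndex(cars, func(c car) string { return c.Engine })

	matches := fetch(
		lookup{idx: byColor, key: "blue"},
		lookup{idx: byEngine, key: "v8"},
		lookup{idx: byMake, key: "chevy"},
	)
	for _, pos := range matches {
		fmt.Println(cars[pos]) // prints {chevy blue v8}
	}
}
```

Only exact keys are compared, which is why prefix or range queries do not fall out of this model for free.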

In the future I may allow things like searching by prefix, but those aren't in here today.

Documentation ¶

Overview ¶

Package disktable provides a write-once, read-many table with index support. It is built on top of badgerDB, which is basically a key/value SSTable storage mechanism.

Let's create a table with some data:

dir := filepath.Join(os.TempDir(), "your_table")
// Remove it if exists, may or may not want to do this. However you cannot
// create a table on a directory that exists.
os.RemoveAll(dir)

// These are our indexes on the data. AllowDuplicates allows duplicate entries
// in the index.
indexes := NewIndexes(
	&Index{Name: "First Name", AllowDuplicates: true},
	&Index{Name: "Last Name", AllowDuplicates: true},
	&Index{Name: "ID"},
)

w, err := New(dir, WithIndexes(indexes))
if err != nil {
	panic(err)
}

for _, data := range someData {
	b, err := proto.Marshal(data)
	if err != nil {
		panic(err)
	}

	insert := indexes.Insert(b).AddIndexKey(
		"First Name", UnsafeGetBytes(data.First),
	).AddIndexKey(
		"Last Name", UnsafeGetBytes(data.Last),
	).AddIndexKey(
		"ID", NumToByte(data.ID),
	)

	if err = w.WriteData(insert); err != nil {
		panic(err)
	}
}

if err := w.Close(); err != nil {
	panic(err)
}

Now let's open it and stream all records:

table, err := Open(dir)
if err != nil {
	panic(err)
}

results, err := table.FetchAll(ctx)
if err != nil {
	panic(err)
}

for result := range results {
	if result.Err != nil {
		panic(result.Err)
	}

	entry := &pb.MyData{}
	if err := proto.Unmarshal(result.Value, entry); err != nil {
		panic(err)
	}

	fmt.Println("found: ", pretty.Sprint(entry))
}

Let's look for all entries that have the first name John:

results, err := table.Fetch(
	ctx,
	Lookup{IndexName: "First Name", Key: UnsafeGetBytes("John")},
)

if err != nil {
	panic(err)
}

for result := range results {
	if result.Err != nil {
		panic(result.Err)
	}

	entry := &pb.MyData{}
	if err := proto.Unmarshal(result.Value, entry); err != nil {
		panic(err)
	}

	fmt.Println("found: ", pretty.Sprint(entry))
}

Index ¶

func ByteSlice2String(bs []byte) string
func ByteToNum[N Number](b []byte) (N, error)
func NumStreamGoroutines(n int) interface{ ... }
func NumToByte[N Number](n N) []byte
func UnsafeGetBytes(s string) []byte
func WithInMemory() interface{ ... }
func WithIndexes(indexes Indexes) interface{ ... }
func WithLogger(l badger.Logger) interface{ ... }
type FetchAllOption
type Index
type Indexes
- func NewIndexes(indexes ...*Index) Indexes
- func (i Indexes) Insert(value []byte) Insert
type Insert
- func NewInsert(value []byte) Insert
- func (i Insert) AddIndexKey(indexName string, key []byte) Insert
type Lookup
type Number
type OpenOption
type Result
type Table
- func Open(pathDir string, options ...OpenOption) (*Table, error)
- func (t *Table) Close() error
- func (t *Table) Fetch(ctx context.Context, primary Lookup, secondaries ...Lookup) (chan Result, error)
- func (t *Table) FetchAll(ctx context.Context, options ...FetchAllOption) (chan Result, error)
- func (t *Table) Get(ctx context.Context, i uint64) ([]byte, error)
- func (t *Table) Len() uint64
type WriteOption
type Writer
- func New(dirPath string, options ...WriteOption) (*Writer, error)
- func (d *Writer) Close() error
- func (d *Writer) Flatten()
- func (d *Writer) GC(discardRatio float64)
- func (d *Writer) WriteData(insert Insert) error

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func ByteSlice2String ¶

func ByteSlice2String(bs []byte) string

ByteSlice2String converts a []byte to a string without incurring the cost of a copy of the given []byte parameter. This is an unsafe operation and requires that you never modify the []byte slice you passed in.

func ByteToNum ¶

func ByteToNum[N Number](b []byte) (N, error)

ByteToNum returns a number stored in b that represents N. That number should be encoded in BigEndian, usually by NumToByte().

func NumStreamGoroutines ¶

func NumStreamGoroutines(n int) interface {
	FetchAllOption
	calloptions.CallOption
}

NumStreamGoroutines sets the number of goroutines to be used in FetchAll(). By default this is 16.
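
As a rough illustration of what a configurable goroutine count means for a streaming call, the sketch below fans items out to a fixed number of workers that all send on one result channel. It is a generic worker-pool pattern, not disktable's code; result, streamAll and countAll are hypothetical names, and ordering across workers is not preserved here.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// result mirrors the shape of a streamed record: a value or an error.
type result struct {
	Value []byte
	Err   error
}

// streamAll sends every item on the returned channel using the given number
// of worker goroutines. The channel is closed once all workers are done or
// the context is cancelled.
func streamAll(ctx context.Context, items [][]byte, workers int) chan result {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan []byte)
	out := make(chan result, workers)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for item := range jobs {
				select {
				case out <- result{Value: item}:
				case <-ctx.Done():
					return
				}
			}
		}()
	}

	go func() {
		// Deferred calls run last-in, first-out: stop feeding workers,
		// wait for them to finish, then close the output channel.
		defer close(out)
		defer wg.Wait()
		defer close(jobs)
		for _, item := range items {
			select {
			case jobs <- item:
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

// countAll drains the stream and returns how many error-free results arrived.
func countAll(items [][]byte, workers int) int {
	n := 0
	for r := range streamAll(context.Background(), items, workers) {
		if r.Err == nil {
			n++
		}
	}
	return n
}

func main() {
	items := make([][]byte, 100)
	for i := range items {
		items[i] = []byte{byte(i)}
	}
	fmt.Println(countAll(items, 16)) // 100
}
```

More workers help only while the underlying reads can actually proceed in parallel, so the right number is workload dependent.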

func NumToByte ¶

func NumToByte[N Number](n N) []byte

NumToByte converts a number into a BigEndian []byte sequence.
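
The following is a minimal sketch of a big-endian round trip using the standard encoding/binary package. It is fixed to uint64 and an 8-byte width, which is an assumption for illustration; the width and exact layout that NumToByte() and ByteToNum() use are not stated here, and encodeUint64, decodeUint64 and sortsBefore are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeUint64 writes n as 8 big-endian bytes (most significant byte first).
func encodeUint64(n uint64) []byte {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, n)
	return b
}

// decodeUint64 is the inverse of encodeUint64 and rejects wrong-sized input.
func decodeUint64(b []byte) (uint64, error) {
	if len(b) != 8 {
		return 0, errors.New("decodeUint64: want exactly 8 bytes")
	}
	return binary.BigEndian.Uint64(b), nil
}

// sortsBefore reports whether a's encoding compares byte-wise before b's.
func sortsBefore(a, b uint64) bool {
	return bytes.Compare(encodeUint64(a), encodeUint64(b)) < 0
}

func main() {
	key := encodeUint64(258)
	fmt.Println(key) // [0 0 0 0 0 0 1 2]

	n, err := decodeUint64(key)
	fmt.Println(n, err) // 258 <nil>

	fmt.Println(sortsBefore(255, 256)) // true
}
```

For unsigned integers of the same width, byte-wise comparison of big-endian encodings agrees with numeric order, which is a convenient property for keys in a sorted key/value store.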

func UnsafeGetBytes ¶

func UnsafeGetBytes(s string) []byte

UnsafeGetBytes retrieves the underlying []byte held in string "s" without doing a copy. Do not modify the []byte or suffer the consequences.
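
For readers unfamiliar with zero-copy conversions, here is a sketch of how they can be written with the unsafe helpers added in Go 1.20. This is not necessarily how UnsafeGetBytes() or ByteSlice2String() are implemented; stringToBytes and bytesToString are hypothetical names.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringToBytes returns a []byte that shares memory with s (no copy).
// The result must be treated as read-only: strings are immutable, and
// writing through this slice is undefined behavior.
func stringToBytes(s string) []byte {
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

// bytesToString returns a string that shares memory with b (no copy).
// b must not be modified for as long as the string is in use.
func bytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	key := stringToBytes("John")
	fmt.Println(len(key), bytesToString(key)) // 4 John
}
```

The saving is one allocation and copy per conversion, which matters mostly when building many index keys in a tight loop. When in doubt, the ordinary []byte(s) and string(b) conversions are the safe choice.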

func WithInMemory ¶

func WithInMemory() interface {
	WriteOption
	calloptions.CallOption
}

WithInMemory causes the DB to run from memory with no disk persistence. Great for tests. Can be used with:

New()

func WithIndexes ¶

func WithIndexes(indexes Indexes) interface {
	WriteOption
	calloptions.CallOption
}

WithIndexes provides the indexes that will be used on this database. Can be used with:

New()

func WithLogger ¶

func WithLogger(l badger.Logger) interface {
	WriteOption
	OpenOption
	calloptions.CallOption
}

WithLogger sets the logger for badger. By default log output is discarded. Can be used in:

New()
Open()

Types ¶

type FetchAllOption ¶

type FetchAllOption interface {
	// contains filtered or unexported methods
}

type Index ¶

type Index struct {
	// Name of the index. This must be unique.
	Name string
	// AllowDuplicates indicates if this index allows duplicate keys for the index.
	AllowDuplicates bool
	// contains filtered or unexported fields
}

Index represents an index on our database.

type Indexes ¶

type Indexes struct {
	Err error
	// contains filtered or unexported fields
}

func NewIndexes ¶

func NewIndexes(indexes ...*Index) Indexes

func (Indexes) Insert ¶

func (i Indexes) Insert(value []byte) Insert

Insert creates an Insert type that can be used to write data to the database. See Insert for more information.

type Insert ¶

type Insert struct {
	Err error
	// contains filtered or unexported fields
}

Insert represents a data insert into the table and is created from Indexes. You must use Insert.AddIndexKey() to add all index keys defined in Indexes.

func NewInsert ¶

func NewInsert(value []byte) Insert

NewInsert creates a new Insert for writing into the table. This is only used when there are no indexes defined on the table. Otherwise you must use Indexes.Insert().

func (Insert) AddIndexKey ¶

func (i Insert) AddIndexKey(indexName string, key []byte) Insert

AddIndexKey adds a key for a given index. You must capture the returned Insert as AddIndexKey() does not have a pointer receiver.
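
The sketch below shows why the returned value has to be captured when a builder method has a value receiver. The insert type here is a hypothetical stand-in, not disktable's Insert; only the receiver behavior is the point.

```go
package main

import "fmt"

// indexKey is one (index name, key) pair attached to an insert.
type indexKey struct {
	index string
	key   []byte
}

// insert is a hypothetical builder whose methods return a modified copy.
type insert struct {
	value []byte
	keys  []indexKey
}

// newInsert starts a builder for the given value.
func newInsert(value []byte) insert {
	return insert{value: value}
}

// addIndexKey has a value receiver, so it operates on a copy of i. The
// caller only sees the new key through the returned value.
func (i insert) addIndexKey(index string, key []byte) insert {
	keys := make([]indexKey, len(i.keys), len(i.keys)+1)
	copy(keys, i.keys)
	i.keys = append(keys, indexKey{index: index, key: key})
	return i
}

func main() {
	ins := newInsert([]byte("payload"))

	ins.addIndexKey("ID", []byte{1}) // result discarded: ins is unchanged
	fmt.Println(len(ins.keys))       // 0

	ins = ins.addIndexKey("ID", []byte{1}) // result captured
	fmt.Println(len(ins.keys))             // 1
}
```

Chaining the calls, as the package overview does, sidesteps the problem because each call operates on the value returned by the previous one.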

type Lookup ¶

type Lookup struct {
	// IndexName is the name of the index to do the lookup in.
	IndexName string
	// Key is the key in the index to lookup.
	Key []byte
}

Lookup provides the index name and the key that must match for an entry to be returned.

type Number ¶

type Number interface {
	~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64 |
		~int | ~int8 | ~int16 | ~int32 | ~int64 |
		~float32 | ~float64
}

Number represents any uint*, int* or float* type.

type OpenOption ¶

type OpenOption interface {
	// contains filtered or unexported methods
}

OpenOption is an optional argument for Open().

type Result ¶

type Result struct {
	Value []byte
	Err   error
}

Result is the result of a table lookup.

type Table ¶

type Table struct {
	// contains filtered or unexported fields
}

Table represents our read-only table.

func Open ¶

func Open(pathDir string, options ...OpenOption) (*Table, error)

Open opens an existing disktable for reading.

func (*Table) Close ¶

func (t *Table) Close() error

Close closes all the databases.

func (*Table) Fetch ¶

func (t *Table) Fetch(ctx context.Context, primary Lookup, secondaries ...Lookup) (chan Result, error)

Fetch retrieves specific rows that match all index lookups. You cannot currently specify multiple searches in the same index. If you wish to fetch all rows, use FetchAll(). Here is an example:

results, err := table.Fetch(
	ctx,
	Lookup{IndexName: "First Name", Key: UnsafeGetBytes("John")},
)

if err != nil {
	panic(err)
}

for result := range results {
	if result.Err != nil {
		panic(result.Err)
	}

	entry := &pb.MyData{}
	if err := proto.Unmarshal(result.Value, entry); err != nil {
		panic(err)
	}

	fmt.Println("found: ", pretty.Sprint(entry))
}

func (*Table) FetchAll ¶

func (t *Table) FetchAll(ctx context.Context, options ...FetchAllOption) (chan Result, error)

FetchAll fetches all of the table's entries.

func (*Table) Get ¶

func (t *Table) Get(ctx context.Context, i uint64) ([]byte, error)

Get gets the i'th entry stored in the table.

func (*Table) Len ¶

func (t *Table) Len() uint64

Len() returns the number of entries in the table.

type WriteOption ¶

type WriteOption interface {
	// contains filtered or unexported methods
}

WriteOption is an optional argument for New().

type Writer ¶

type Writer struct {
	// contains filtered or unexported fields
}

Writer represents our disk database.

func New ¶

func New(dirPath string, options ...WriteOption) (*Writer, error)

New creates a new instance of our table store. "dirPath" is the path to a directory that will be created. This must not already exist.

func (*Writer) Close ¶

func (d *Writer) Close() error

Close closes out the Writer.

func (*Writer) Flatten ¶

func (d *Writer) Flatten()

Flatten flattens the LSM tree.

func (*Writer) GC ¶

func (d *Writer) GC(discardRatio float64)

GC does garbage collection on the value log. If interested in everything it does, check out badger.DB.RunValueLogGC(). A discardRatio of 0 is treated as 0.5.

func (*Writer) WriteData ¶

func (d *Writer) WriteData(insert Insert) error

WriteData writes data to our database. The index keys must be in the same order as when you created this DB, and there must be the same number of them as there are indexes. You cannot reuse any "value" or "indexValues" passed until all data has been written. This is because a single WriteData() call does not by itself cause the data to be written.
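
To see why buffers must not be reused before the data is actually written, consider this sketch of a writer that queues the caller's slices and only persists them later. batchWriter, writeReusing and writeCopying are hypothetical and only model the hazard; they say nothing about how Writer batches internally.

```go
package main

import "fmt"

// batchWriter is a hypothetical writer that queues values and persists them
// only on flush. It keeps the caller's slices rather than copying them.
type batchWriter struct {
	pending [][]byte
	stored  []string
}

// write queues value without copying it.
func (w *batchWriter) write(value []byte) {
	w.pending = append(w.pending, value)
}

// flush persists everything queued so far.
func (w *batchWriter) flush() {
	for _, v := range w.pending {
		w.stored = append(w.stored, string(v))
	}
	w.pending = nil
}

// writeReusing pushes every record through one shared buffer. Each queued
// slice aliases the same memory, so later records overwrite earlier ones.
func writeReusing(records []string) []string {
	w := &batchWriter{}
	buf := make([]byte, 0, 16)
	for _, r := range records {
		buf = append(buf[:0], r...)
		w.write(buf)
	}
	w.flush()
	return w.stored
}

// writeCopying hands the writer a fresh copy for every record.
func writeCopying(records []string) []string {
	w := &batchWriter{}
	buf := make([]byte, 0, 16)
	for _, r := range records {
		buf = append(buf[:0], r...)
		w.write(append([]byte(nil), buf...))
	}
	w.flush()
	return w.stored
}

func main() {
	records := []string{"alice", "bobby"}
	fmt.Println(writeReusing(records)) // [bobby bobby]
	fmt.Println(writeCopying(records)) // [alice bobby]
}
```

The same caution applies to keys produced by zero-copy helpers: the backing memory has to stay untouched until the writer is closed.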

Directories ¶

Path	Synopsis
testing
good
