diskmap

package
v1.2.2
Published: Jan 5, 2024 License: Apache-2.0 Imports: 13 Imported by: 0

Documentation

Overview

Package diskmap provides disk storage of key/value pairs. The data is immutable once written. In addition, diskmap uses mmap on reads to make random access faster. On Linux, diskmap uses directio to speed up writes.

Unlike a regular map, keys can have duplicates. If you need this filtered out you must do it before adding to the diskmap.

Usage is simple:

// Create a new diskmap.
p := path.Join(os.TempDir(), nextSuffix())
w, err := diskmap.New(p)
if err != nil {
  panic(err)
}

// Write a key/value to the diskmap.
if err := w.Write([]byte("hello"), []byte("world")); err != nil {
  panic(err)
}

// Close the writer, which synchronizes the file to disk.
if err := w.Close(); err != nil {
  panic(err)
}

// Open the file for reading.
m, err := diskmap.Open(p)
if err != nil {
  panic(err)
}

// Read the value at key "hello".
v, err := m.Read([]byte("hello"))
if err != nil {
  panic(err)
}

// Print the value at key "hello".
fmt.Println(string(v))

// Loop through all entries in the map.
ctx, cancel := context.WithCancel(context.Background())
defer cancel() // Make sure if we end the "range" early we don't leave any leaky goroutines.

for kv := range m.Range(ctx) {
  // Err is set when there was an error in the return stream.
  if kv.Err != nil {
    panic(kv.Err)
  }
  fmt.Printf("key: %s, value: %s\n", string(kv.Key), string(kv.Value))
}

Storage details:

The disk storage is fairly straightforward. Once all values have been written to disk, an index of the keys and offsets is written to disk. 64 bytes are reserved at the start of the file to hold two int64 values: the offset where the index starts and the number of key/value pairs stored. The additional space is reserved for future use. All numbers are int64 values.
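As an illustration of the reserved header, here is a minimal sketch that packs and unpacks the two int64 values in a 64-byte block. It is not the package's code: the header type and both functions are hypothetical, little-endian byte order is an assumption (the text names none), and placing the pair count in bytes 8 to 16 is also an assumption.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerSize is the space reserved at the start of the file.
const headerSize = 64

// header holds the two int64 values carried in the reserved space.
type header struct {
	IndexOffset int64 // where the index section starts
	NumPairs    int64 // number of key/value pairs stored
}

// encodeHeader packs h into a 64-byte block; the unused bytes stay zero.
// Little-endian and the position of NumPairs are assumptions of this sketch.
func encodeHeader(h header) []byte {
	b := make([]byte, headerSize)
	binary.LittleEndian.PutUint64(b[0:8], uint64(h.IndexOffset))
	binary.LittleEndian.PutUint64(b[8:16], uint64(h.NumPairs))
	return b
}

// decodeHeader reverses encodeHeader.
func decodeHeader(b []byte) (header, error) {
	if len(b) < headerSize {
		return header{}, errors.New("header is shorter than 64 bytes")
	}
	return header{
		IndexOffset: int64(binary.LittleEndian.Uint64(b[0:8])),
		NumPairs:    int64(binary.LittleEndian.Uint64(b[8:16])),
	}, nil
}

func main() {
	raw := encodeHeader(header{IndexOffset: 1024, NumPairs: 3})
	h, err := decodeHeader(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(raw), h.IndexOffset, h.NumPairs) // prints "64 1024 3"
}
```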

The file structure looks as follows:

<file>
  <reserve header space>
    [index offset]
    [number of key/value pairs]
  </reserve header space>
  <data section>
    [byte value]
    [byte value]
    ...
  </data section>
  <index section>
    [data offset]
    [data length]
    [key length]
    [key]
    ...
  </index section>
</file>

Reading the file is simply:

  • read the initial 8 bytes into an int64 to get the offset to the index
  • seek to the index offset
  • read the data storage offset
  • read the data length
  • read the key length
  • read the key
  • repeat for each index entry, building a map of key to disk offset using the data above.

Variables

var ErrKeyNotFound = fmt.Errorf("key was not found")

ErrKeyNotFound indicates that the requested key was not found.

Functions

func Clone added in v1.2.2

func Clone(b []byte) []byte

Clone returns a copy of the byte slice. This is for use with a Reader opened with OpenInMemory(), where it is unsafe to alter the returned data.
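To show why a copy is needed before altering returned data, here is a minimal sketch. cloneBytes is a hypothetical local helper that behaves as the description of Clone suggests; it is not the package's source, and returning nil for a nil input is this sketch's own choice.

```go
package main

import "fmt"

// cloneBytes is a hypothetical stand-in for a copying helper such as Clone.
// It returns a new slice with its own backing array.
func cloneBytes(b []byte) []byte {
	if b == nil {
		return nil
	}
	out := make([]byte, len(b))
	copy(out, b)
	return out
}

func main() {
	shared := []byte("world") // imagine this was returned by an in-memory Reader
	mine := cloneBytes(shared)
	mine[0] = 'W' // safe: only the copy changes

	fmt.Println(string(shared), string(mine)) // prints "world World"
}
```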

func UnsafeGetBytes added in v1.1.2

func UnsafeGetBytes(s string) []byte

UnsafeGetBytes retrieves the underlying []byte held in string "s" without doing a copy. Do not modify the []byte or suffer the consequences.
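For readers unfamiliar with zero-copy conversions, the sketch below shows one way such a function can be written with the standard unsafe package (Go 1.20 or later). stringBytes is a hypothetical name, and this is not necessarily how UnsafeGetBytes is implemented. The returned slice shares memory with the string, which is why it must never be modified.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringBytes is a hypothetical zero-copy string to []byte conversion.
// The result aliases the string's memory and must be treated as read-only.
func stringBytes(s string) []byte {
	if s == "" {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	key := "hello"
	b := stringBytes(key)

	fmt.Println(len(b), string(b))                // prints "5 hello"
	fmt.Println(unsafe.StringData(key) == &b[0]) // prints "true": no copy was made
}
```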

Types

type KeyValue

type KeyValue struct {
	// Err indicates that there was an error in the return stream.
	Err error

	// Key is the key the value was stored at.
	Key []byte

	// Value is the value stored at Key.
	Value []byte
}

KeyValue holds a key/value pair.

type OpenOption added in v1.2.2

type OpenOption func(r *reader) error

func WithNumReaders added in v1.2.2

func WithNumReaders(n int) OpenOption

WithNumReaders sets the number of readers to use when reading from the file. By default, 10 readers are used. Range() concurrently uses all available readers.

type Reader

type Reader interface {
	// Exists returns true if the key exists in the diskmap. Thread-safe.
	Exists(key []byte) bool

	// Keys returns all the keys in the diskmap. This reads only from memory
	// and does not access disk. Thread-safe.
	Keys(ctx context.Context) chan []byte

	// Read fetches key "k" and returns the value. If there are multi-key matches,
	// it returns the last key added. Errors when key not found. Thread-safe.
	Read(k []byte) ([]byte, error)

	// ReadAll fetches all matches to key "k". Does not error if not found. Thread-safe.
	ReadAll(k []byte) ([][]byte, error)

	// Range allows iteration over all the key/value pairs stored in the diskmap. If not iterating
	// over all values, the Context should be cancelled or given a timeout to prevent a goroutine leak.
	Range(ctx context.Context) chan KeyValue

	// Close closes the diskmap file.
	Close() error
}

Reader provides read access to the diskmap file. If you fake this, you need to embed it in your fake.
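A minimal sketch of a fake that embeds the interface follows. The Reader declared here is a trimmed local stand-in so the example compiles on its own; in real code the embedded type would be diskmap.Reader. fakeReader, errNotFound and lookup are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Reader is a trimmed local stand-in for the interface documented above.
type Reader interface {
	Exists(key []byte) bool
	Read(k []byte) ([]byte, error)
	Close() error
}

var errNotFound = errors.New("key was not found")

// fakeReader embeds Reader so it satisfies the whole interface while
// overriding only the methods a test needs. Calling a method that is not
// overridden panics, because the embedded interface value is nil.
type fakeReader struct {
	Reader
	data map[string][]byte
}

func (f fakeReader) Exists(key []byte) bool {
	_, ok := f.data[string(key)]
	return ok
}

func (f fakeReader) Read(k []byte) ([]byte, error) {
	v, ok := f.data[string(k)]
	if !ok {
		return nil, errNotFound
	}
	return v, nil
}

// lookup is example code under test; it depends only on the interface.
func lookup(r Reader, key string) string {
	v, err := r.Read([]byte(key))
	if err != nil {
		return "<missing>"
	}
	return string(v)
}

func main() {
	var r Reader = fakeReader{data: map[string][]byte{"hello": []byte("world")}}
	fmt.Println(lookup(r, "hello"), lookup(r, "nope")) // prints "world <missing>"
}
```

Embedding also keeps the fake compiling if methods are later added to the interface, at the cost of a runtime panic when an unimplemented method is called.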

func Open

func Open(p string, options ...OpenOption) (Reader, error)

Open returns a Reader for a file written by a Writer.

func OpenInMemory added in v1.2.2

func OpenInMemory(p string) (Reader, error)

OpenInMemory opens a Reader that reads the entire diskmap into memory. All lookups are done against an ordered in-memory map. Altering data that is returned will alter it in the map, which is unsafe. All data should be copied before altering; use Clone() for this. Note that this uses about 2x the amount of memory as the diskmap on disk during the OpenInMemory() call, and the size of data + 2x keyspace afterwards to maintain the ordered map.

type Writer

type Writer interface {
	// Write writes a key/value pair to disk.  Thread-safe.
	Write(k, v []byte) error

	// Exists returns true if the key exists in the diskmap. Thread-safe.
	Exists(key []byte) bool

	// Close synchronizes the file to disk and closes it.
	Close() error
}

Writer provides write access to the diskmap file. An error on write makes the Writer unusable. If you fake this, you need to embed it in your fake.

func New

func New(p string, options ...WriterOption) (Writer, error)

New returns a new Writer that writes to file "p".

type WriterOption added in v1.2.2

type WriterOption func(writerOptions) (writerOptions, error)

WriterOption is an option for New().

func WithBufferSize added in v1.2.2

func WithBufferSize(size int) WriterOption

WithBufferSize sets the buffer size for the Writer. The default is 64MB.
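The WriterOption signature follows the functional options pattern, where each option takes the current settings and returns an updated copy. Below is a minimal sketch of how such options could be applied. All names are local, lowercase stand-ins rather than the package's code; the positive-size check is an assumption, and the 64MB default is read here as 64 * 1024 * 1024 bytes, which is also an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// writerOptions is a hypothetical settings struct for this sketch.
type writerOptions struct {
	bufferSize int
}

// option mirrors the shape of WriterOption: settings in, settings out.
type option func(writerOptions) (writerOptions, error)

// withBufferSize returns an option that sets the buffer size.
// Rejecting non-positive sizes is an assumption of this sketch.
func withBufferSize(size int) option {
	return func(o writerOptions) (writerOptions, error) {
		if size <= 0 {
			return o, errors.New("buffer size must be positive")
		}
		o.bufferSize = size
		return o, nil
	}
}

// applyOptions starts from the defaults and applies each option in order,
// stopping at the first error.
func applyOptions(options ...option) (writerOptions, error) {
	opts := writerOptions{bufferSize: 64 * 1024 * 1024} // assumed meaning of "64MB"
	for _, opt := range options {
		var err error
		if opts, err = opt(opts); err != nil {
			return writerOptions{}, err
		}
	}
	return opts, nil
}

func main() {
	opts, err := applyOptions(withBufferSize(1 << 20))
	if err != nil {
		panic(err)
	}
	fmt.Println(opts.bufferSize) // prints "1048576"
}
```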

Directories

Path Synopsis
internal
ordered
Package ordered provides a map that maintains insertion order.
testing
