bakemono


README

bakemono


bakemono is a cache storage engine implemented in Go.

Design goals:

  • Lightweight: easy to embed in your project
  • High-performance: high throughput and low latency
  • Code-readable: simple but powerful storage design, easy to read and understand

It is heavily inspired by Apache Traffic Server and was implemented for our cache-proxy project, hitori.

Cache Storage Engine

What is a cache storage engine, and how does it differ from an embeddable key-value database?

Similarities: They both are:

  • key-value storage
  • embeddable
  • persistent storage on SSD/HDD

Differences: cache storage engines are:

  • allowed to drop data when conditions are met
  • fault-tolerant (just return a MISS when a disk failure happens)

Cache storage is common in CDNs (Content Delivery Networks), where it caches frequently accessed data to reduce the load on backend servers.

The size of cache data is usually ~100TiB per bare-metal server.

Usage

Install

You can use bakemono as a package in your project.

go get github.com/bocchi-the-cache/bakemono
Init

Then simply import and init a Vol in your code:

package main

import (
	"log"

	"github.com/bocchi-the-cache/bakemono"
)

func main() {
	cfg, err := bakemono.NewDefaultVolOptions("/tmp/bakemono-test.vol", 1024*512*100000, 1024*1024)
	if err != nil {
		panic(err)
	}
	
	v := &bakemono.Vol{}
	corrupted, err := v.Init(cfg)
	if err != nil {
		panic(err)
	}
	if corrupted {
		log.Printf("vol is corrupted, but fixed. ignore this if first time running.")
	}
	
	// ...
}
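Here the second argument to NewDefaultVolOptions is the volume file size in bytes (1024*512*100000, roughly 49 GiB) and the third is the average chunk size (1024*1024, i.e. 1 MiB); see NewDefaultVolOptions in the API reference below.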
Read/Write
func main() {
	// ...

	// write
	err = v.Set([]byte("key"), []byte("value"))
	if err != nil {
		panic(err)
	}

	// read
	hit, data, err := v.Get([]byte("key"))
	if err != nil {
		// note: err can be non-nil when a disk failure happens.
		// consider it a MISS when err != nil, or log it for further processing.
		panic(err)
	}
	if !hit {
		panic("key should be hit")
	}
	if string(data) != "value" {
		panic("value should be 'value'")
	}
	log.Printf("value: %s", data)

	// close
	err = v.Close()
	if err != nil {
		panic(err)
	}
}
Note

Concurrent reads and writes are supported.

In this version, operations share several RWMutexes; we will provide more tuning options in the future.

We highly recommend reading the tech design doc before using bakemono in high-load scenarios.
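For example, a minimal sketch that drives one Vol from several goroutines (the helper name and keys are illustrative; imports of fmt, log, sync, and the bakemono package are assumed):

func concurrentDemo(v *bakemono.Vol) {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		i := i
		wg.Add(1)
		go func() {
			defer wg.Done()
			key := []byte(fmt.Sprintf("key-%d", i))
			if err := v.Set(key, []byte("value")); err != nil {
				log.Printf("set failed: %v", err)
				return
			}
			hit, data, err := v.Get(key)
			if err != nil || !hit {
				// treat errors as a MISS, as in the Read/Write example above
				log.Printf("miss: hit=%v err=%v", hit, err)
				return
			}
			log.Printf("got: %s", data)
		}()
	}
	wg.Wait()
}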

Tech Design

TBD

Data Structure

TBD

Read/Write

TBD

Metadata Persistence

TBD

Performance

TBD

Other Information

Roadmap
  • We are working on basic caching functions at this stage.
  • When caching functions are stable, we will concentrate on performance tuning.
Name Origin

Bakemono is a Japanese word meaning "monster". In Chinese, it is called "贵物".

We wish it could be a lightweight but high-performance cache storage engine like a "bakemono"!

The logo is designed by Yige.

Who are bocchi-the-cache?

We are a group of engineers who are interested in storage, networking, and the Go programming language.

We are excited to build projects using new technologies and share our experience with others.

Documentation

Index

Constants

const (
	MajorVersion = 0
	MinorVersion = 1
)
const (
	DirDataSizeLv0 = SectorSize << (0 * 3)              // 512B
	DirDataSizeLv1 = SectorSize << (1 * 3)              // 4KB
	DirDataSizeLv2 = SectorSize << (2 * 3)              // 32KB
	DirDataSizeLv3 = SectorSize << (3 * 3)              // 256KB
	DirMaxDataSize = (SectorSize << (3 * 3)) * (1 << 6) // 16MB
)

Dir constants
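The levels step up by a factor of 8 (a 3-bit shift) per level. A minimal sketch that prints the resulting sizes:

package main

import (
	"fmt"

	"github.com/bocchi-the-cache/bakemono"
)

func main() {
	// 512 B, 4 KiB, 32 KiB, 256 KiB: each level is 8x the previous one.
	fmt.Println(bakemono.DirDataSizeLv0, bakemono.DirDataSizeLv1, bakemono.DirDataSizeLv2, bakemono.DirDataSizeLv3)
	// (SectorSize << 9) * 64 = 16 MiB.
	fmt.Println(bakemono.DirMaxDataSize)
}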

const (
	ChunkHeaderSizeFixed = 8 * 1 << 10 // 8KB
	ChunkKeyMaxSize      = 4 * 1 << 10 // 4KB
	ChunkDataSize        = 1 * 1 << 20 // 1MB
)
const (
	MagicBocchi = 0x000b0cc1
	MagicChunk  = 0x00114514

	DirDepth = 4

	MaxBucketsPerSegment = 1 << 16 / DirDepth
)

Vol constants

const BlockSize = 1 << 12

const MaxKeyLength = 4096

const (
	SectorSize = 512
)

Variables

var (
	HeaderSize = binary.Size(&VolHeaderFooter{})
	DirSize    = binary.Size(&Dir{})
)
var ErrCacheMiss = errors.New("cache miss")

var ErrChunkDataTooLarge = errors.New("chunk data too large")

var ErrChunkKeyTooLarge = errors.New("chunk key too large")

var ErrChunkVerifyFailed = errors.New("chunk verify failed")

var ErrKeyTooLong = errors.New("key too long")

var ErrVolFileCorrupted = errors.New("vol file corrupted")

Functions

This section is empty.

Types

type Chunk

type Chunk struct {
	Header  ChunkHeader
	DataRaw []byte
}

Chunk is the unit of data storage. It contains a header (metadata) and data.

func (*Chunk) GetBinaryLength

func (c *Chunk) GetBinaryLength() Offset

GetBinaryLength returns the binary length of the chunk.

func (*Chunk) GetKeyData

func (c *Chunk) GetKeyData() ([]byte, []byte)

GetKeyData returns the key and data of the chunk. Note: the key is trimmed at the null character.

func (*Chunk) MarshalBinary

func (c *Chunk) MarshalBinary() ([]byte, error)

MarshalBinary returns the binary of the chunk.

func (*Chunk) ReadAt

func (c *Chunk) ReadAt(r io.ReaderAt, off, size int64) error

ReadAt reads the chunk from the reader at the offset.

func (*Chunk) Set

func (c *Chunk) Set(key, data []byte) error

Set sets the key and data of the chunk.

func (*Chunk) UnmarshalBinary

func (c *Chunk) UnmarshalBinary(data []byte) error

UnmarshalBinary unmarshals the binary representation of the chunk and verifies it. Note: the data must be the whole chunk.

func (*Chunk) Verify

func (c *Chunk) Verify() error

Verify verifies the chunk. It returns nil if the chunk is valid.

func (*Chunk) WriteAt

func (c *Chunk) WriteAt(w io.WriterAt, off int64) error

WriteAt writes the chunk to the writer at the offset.
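As an illustration of how these methods compose, a minimal sketch that builds a chunk, marshals it, and decodes it again. The helper name is made up, imports of log and the bakemono package are assumed, and we assume MarshalBinary and UnmarshalBinary round-trip as the method pairing suggests:

func chunkRoundTrip() {
	c := &bakemono.Chunk{}
	if err := c.Set([]byte("key"), []byte("value")); err != nil {
		log.Fatal(err)
	}

	// the binary of the chunk; assumed to be the whole chunk (header + data)
	raw, err := c.MarshalBinary()
	if err != nil {
		log.Fatal(err)
	}

	// UnmarshalBinary expects the whole chunk and verifies it.
	decoded := &bakemono.Chunk{}
	if err := decoded.UnmarshalBinary(raw); err != nil {
		log.Fatal(err)
	}

	// GetKeyData returns the key (trimmed at the null character) and the data.
	key, data := decoded.GetKeyData()
	log.Printf("key=%s data=%s", key, data)
}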

type ChunkHeader

type ChunkHeader struct {
	Magic          uint32
	Checksum       uint32
	Key            [ChunkKeyMaxSize]byte
	DataLength     uint32
	HeaderSize     uint32
	HeaderChecksum uint32
}

ChunkHeader is the meta of a chunk.

func (*ChunkHeader) GenerateHeaderChecksum

func (c *ChunkHeader) GenerateHeaderChecksum() uint32

func (*ChunkHeader) MarshalBinary

func (c *ChunkHeader) MarshalBinary() ([]byte, error)

MarshalBinary returns the binary representation of the chunk header. TODO: could use a buffer pool to avoid allocating a new buffer every time.

func (*ChunkHeader) UnmarshalBinary

func (c *ChunkHeader) UnmarshalBinary(data []byte) error

UnmarshalBinary unmarshals the binary representation of the chunk header.

type Dir

type Dir struct {
	// contains filtered or unexported fields
}

func (*Dir) MarshalBinary

func (d *Dir) MarshalBinary() ([]byte, error)

func (*Dir) UnmarshalBinary

func (d *Dir) UnmarshalBinary(data []byte) error

type DirManager

type DirManager struct {
	ChunksNum            Offset
	SegmentsNum          Offset
	BucketsNum           Offset
	BucketsNumPerSegment Offset

	// map segment id to dirs
	Dirs         map[segId][]*Dir
	DirFreeStart map[segId]uint16

	// rw mutex for each segment
	SegMutexes map[segId]*sync.RWMutex
}

DirManager manages the dirs attached to a vol.

func (*DirManager) DiagDumpAllDirs

func (dm *DirManager) DiagDumpAllDirs()

func (*DirManager) DiagDumpAllDirsToString

func (dm *DirManager) DiagDumpAllDirsToString() string

func (*DirManager) DiagHangFreeDirs

func (dm *DirManager) DiagHangFreeDirs() (int, error)

func (*DirManager) DiagHangUsedDirs

func (dm *DirManager) DiagHangUsedDirs() (int, error)

func (*DirManager) DiagPanicHangUpDirs

func (dm *DirManager) DiagPanicHangUpDirs() error

func (*DirManager) Get

func (dm *DirManager) Get(key []byte) (hit bool, dirOffset Offset, d Dir)

Get returns, on a HIT, the offset of the dir entry with the given key; on a MISS, the offset of the last dir entry in the bucket.

func (*DirManager) Init

func (dm *DirManager) Init(dirNum Offset) Offset

Init initializes the dir manager. Dirs are initialized as empty by default.

func (*DirManager) InitEmptyDirs

func (dm *DirManager) InitEmptyDirs()

InitEmptyDirs initializes all dirs as empty and chains them together.

func (*DirManager) MarshalBinary

func (dm *DirManager) MarshalBinary() (data []byte, err error)

MarshalBinary converts the Dirs to binary format

func (*DirManager) Set

func (dm *DirManager) Set(key []byte, off Offset, size int) (dirOffset Offset, err error)

func (*DirManager) UnmarshalBinary

func (dm *DirManager) UnmarshalBinary(data []byte) (err error)

type Engine

type Engine struct {
	SizeMb      uint32
	SliceSizeKb uint32

	Volume *Vol
	// contains filtered or unexported fields
}

func NewEngine

func NewEngine(cfg *EngineConfig) *Engine

func (*Engine) Delete

func (e *Engine) Delete(key []byte) error

func (*Engine) Get

func (e *Engine) Get(key []byte) ([]byte, error)

func (*Engine) Init

func (e *Engine) Init() error

func (*Engine) Set

func (e *Engine) Set(key, value []byte) error

type EngineConfig

type EngineConfig struct {
	Path        string
	SizeMb      uint32
	SliceSizeKb uint32
}
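Putting the pieces together, a hedged sketch of the higher-level Engine API. The path and sizes are illustrative, and the assumption that a miss surfaces as ErrCacheMiss is ours, based on the variable listed above:

package main

import (
	"errors"
	"log"

	"github.com/bocchi-the-cache/bakemono"
)

func main() {
	e := bakemono.NewEngine(&bakemono.EngineConfig{
		Path:        "/tmp/bakemono-engine.vol", // illustrative
		SizeMb:      1024,                       // illustrative
		SliceSizeKb: 1024,                       // illustrative
	})
	if err := e.Init(); err != nil {
		log.Fatal(err)
	}
	if err := e.Set([]byte("key"), []byte("value")); err != nil {
		log.Fatal(err)
	}
	data, err := e.Get([]byte("key"))
	if errors.Is(err, bakemono.ErrCacheMiss) {
		// assumption: a cache miss is reported via ErrCacheMiss
		log.Println("miss")
		return
	}
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("value: %s", data)
}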

type Offset

type Offset uint64

type OffsetReaderWriterCloser

type OffsetReaderWriterCloser interface {
	io.WriterAt
	io.ReaderAt
	io.Closer
}

type Vol

type Vol struct {
	Path     string
	Fp       OffsetReaderWriterCloser
	Dm       *DirManager
	WritePos Offset

	Header *VolHeaderFooter

	SectorSize   uint32
	Length       Offset
	ChunkAvgSize Offset // average chunk size, adjusted by user.
	ChunksMaxNum Offset // max chunks num in this vol. calculated from ChunkAvgSize and Length

	HeaderAOffset Offset
	FooterAOffset Offset
	HeaderBOffset Offset
	FooterBOffset Offset
	DataOffset    Offset
	DirAOffset    Offset
	// contains filtered or unexported fields
}

Vol is a volume that represents a file on disk. Structure: Meta_A (header, dirs, footer) + Meta_B (header, dirs, footer) + Data (chunks). Dirs are organized logically as segment -> bucket -> dir.

func (*Vol) Close

func (v *Vol) Close() error

Close closes the Vol.

func (*Vol) Get

func (v *Vol) Get(key []byte) (hit bool, value []byte, err error)

func (*Vol) Init

func (v *Vol) Init(cfg *VolOptions) (corrupted bool, err error)

func (*Vol) Set

func (v *Vol) Set(key, value []byte) (err error)

func (*Vol) SyncFlushLoop

func (v *Vol) SyncFlushLoop(interval time.Duration)

SyncFlushLoop flushes metadata to disk periodically.
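A usage sketch, assuming v is an initialized *Vol, that the loop blocks, and that the caller is expected to start it; the interval is illustrative:

	go v.SyncFlushLoop(10 * time.Second) // flush metadata every 10s (illustrative)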

type VolHeaderFooter

type VolHeaderFooter struct {
	Magic          uint32
	CreateUnixTime int64
	WritePos       Offset
	MajorVersion   uint32
	MinorVersion   uint32
	SyncSerial     uint64
	DirsChecksum   uint32

	Checksum uint32
}

func (*VolHeaderFooter) GenerateChecksum

func (v *VolHeaderFooter) GenerateChecksum() uint32

func (*VolHeaderFooter) MarshalBinary

func (v *VolHeaderFooter) MarshalBinary() (data []byte, err error)

func (*VolHeaderFooter) UnmarshalBinary

func (v *VolHeaderFooter) UnmarshalBinary(data []byte) error

type VolOptions

type VolOptions struct {
	Fp        OffsetReaderWriterCloser
	FileSize  Offset
	ChunkSize Offset

	FlushMetaInterval time.Duration
}

VolOptions holds the options to init a Vol. Note: file open/truncate must be done outside (NewDefaultVolOptions handles this for you).
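A minimal sketch of filling VolOptions by hand, assuming an *os.File (which satisfies OffsetReaderWriterCloser) that the caller has already opened and truncated. The path, sizes, and interval are illustrative, and imports of log, os, time, and the bakemono package are assumed:

	f, err := os.OpenFile("/tmp/bakemono-manual.vol", os.O_RDWR|os.O_CREATE, 0o644)
	if err != nil {
		log.Fatal(err)
	}
	if err := f.Truncate(1 << 30); err != nil { // 1 GiB, illustrative
		log.Fatal(err)
	}
	cfg := &bakemono.VolOptions{
		Fp:                f,
		FileSize:          1 << 30, // assumed to match the truncated size
		ChunkSize:         1 << 20, // 1 MiB average chunk size
		FlushMetaInterval: 10 * time.Second,
	}
	if err := cfg.Check(); err != nil {
		log.Fatal(err)
	}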

func NewDefaultVolOptions

func NewDefaultVolOptions(path string, fileSize, avgChunkSize uint64) (*VolOptions, error)

NewDefaultVolOptions creates a VolOptions with a file path. Note: it will create the file if it does not exist and truncate it to the given size.

func (*VolOptions) Check

func (cfg *VolOptions) Check() error

Check checks if the VolOptions is valid.

Directories

Path Synopsis
demo-app
