table

package
v0.0.0-...-e603270 Latest Latest
Published: Oct 15, 2023 License: Apache-2.0 Imports: 21 Imported by: 0

README

Size of table is 123,217,667 bytes for all benchmarks.

BenchmarkRead

$ go test -bench ^BenchmarkRead$ -run ^$ -count 3
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger/table
BenchmarkRead-16    	      10	 154074944 ns/op
BenchmarkRead-16    	      10	 154340411 ns/op
BenchmarkRead-16    	      10	 151914489 ns/op
PASS
ok  	github.com/dgraph-io/badger/table	22.467s

Size of table is 123,217,667 bytes, which is ~118MB.

The rate is ~762MB/s using LoadToRAM (that is, when the table is in RAM).

To read a 64MB table, this would take ~0.084s, which is negligible.
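The rate and the 64MB estimate above follow directly from the benchmark output. The sketch below redoes that arithmetic, assuming MB means 1024*1024 bytes (as in this package's MB constant) and averaging the three ns/op figures; the helper names are illustrative and not part of the package. The result lands close to the ~762MB/s and ~0.084s quoted above.

```go
package main

import "fmt"

// MB mirrors the value of the package's MB constant (KB * 1024).
const MB = 1024 * 1024

// throughputMBps converts a benchmark result (bytes processed per op and
// ns per op) into MB/s. Hypothetical helper, for illustration only.
func throughputMBps(bytesPerOp int64, nsPerOp float64) float64 {
	seconds := nsPerOp / 1e9
	return float64(bytesPerOp) / MB / seconds
}

// secondsFor estimates how long sizeMB megabytes take at rateMBps.
func secondsFor(sizeMB, rateMBps float64) float64 {
	return sizeMB / rateMBps
}

func main() {
	const tableBytes = 123217667
	// ns/op values copied from the three BenchmarkRead runs above.
	runs := []float64{154074944, 154340411, 151914489}
	var sum float64
	for _, r := range runs {
		sum += r
	}
	rate := throughputMBps(tableBytes, sum/float64(len(runs)))
	fmt.Printf("read rate: %.0f MB/s\n", rate)
	fmt.Printf("64MB table: %.3f s\n", secondsFor(64, rate))
}
```

The same two helpers can be applied to the ns/op figures of the other benchmarks in this README.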

BenchmarkReadAndBuild

$ go test -bench BenchmarkReadAndBuild -run ^$ -count 3
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger/table
BenchmarkReadAndBuild-16    	       1	1026755231 ns/op
BenchmarkReadAndBuild-16    	       1	1009543316 ns/op
BenchmarkReadAndBuild-16    	       1	1039920546 ns/op
PASS
ok  	github.com/dgraph-io/badger/table	12.081s

The rate is ~115MB/s (~118MB in ~1.03s). To build a 64MB table, this would take ~0.56s. Note that this does NOT include flushing the table to disk. All we are doing above is reading one table (which is in RAM) and writing one table in memory.

Subtracting the read time, the table building alone takes 0.56s - 0.084s ≈ 0.476s.

BenchmarkReadMerged

Below, we merge 5 tables. The total size remains unchanged at 123,217,667 bytes (~118MB).

$ go test -bench ReadMerged -run ^$ -count 3
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger/table
BenchmarkReadMerged-16    	       2	 977588975 ns/op
BenchmarkReadMerged-16    	       2	 982140738 ns/op
BenchmarkReadMerged-16    	       2	 962046017 ns/op
PASS
ok  	github.com/dgraph-io/badger/table	27.433s

The rate is ~120MB/s. To read a 64MB table using the merge iterator, this would take ~0.53s.

BenchmarkRandomRead

$ go test -bench BenchmarkRandomRead$ -run ^$ -count 3
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger/table
BenchmarkRandomRead-16    	  500000	      2645 ns/op
BenchmarkRandomRead-16    	  500000	      2648 ns/op
BenchmarkRandomRead-16    	  500000	      2614 ns/op
PASS
ok  	github.com/dgraph-io/badger/table	50.850s

For random read benchmarking, we are randomly reading a key and verifying its value.

DB Open benchmark

  1. Create badger DB with 2 billion key-value pairs (about 380GB of data)

$ badger fill -m 2000 --dir="/tmp/data" --sorted

  2. Clear buffers and swap memory

$ free -mh && sync && echo 3 | sudo tee /proc/sys/vm/drop_caches && sudo swapoff -a && sudo swapon -a && free -mh

Also flush disk buffers

$ blockdev --flushbufs /dev/nvme0n1p4

  3. Run the benchmark

$ go test -run=^$ github.com/dgraph-io/badger -bench ^BenchmarkDBOpen$ -benchdir="/tmp/data" -v

badger 2019/06/04 17:15:56 INFO: 126 tables out of 1028 opened in 3.017s
badger 2019/06/04 17:15:59 INFO: 257 tables out of 1028 opened in 6.014s
badger 2019/06/04 17:16:02 INFO: 387 tables out of 1028 opened in 9.017s
badger 2019/06/04 17:16:05 INFO: 516 tables out of 1028 opened in 12.025s
badger 2019/06/04 17:16:08 INFO: 645 tables out of 1028 opened in 15.013s
badger 2019/06/04 17:16:11 INFO: 775 tables out of 1028 opened in 18.008s
badger 2019/06/04 17:16:14 INFO: 906 tables out of 1028 opened in 21.003s
badger 2019/06/04 17:16:17 INFO: All 1028 tables opened in 23.851s
badger 2019/06/04 17:16:17 INFO: Replaying file id: 1998 at offset: 332000
badger 2019/06/04 17:16:17 INFO: Replay took: 9.81µs
goos: linux
goarch: amd64
pkg: github.com/dgraph-io/badger
BenchmarkDBOpen-16    	       1	23930082140 ns/op
PASS
ok  	github.com/dgraph-io/badger	24.076s

It takes about 23.851s to open a DB with 2 billion sorted key-value entries.

Documentation

Constants

const (
	KB = 1024
	MB = KB * 1024
)

Variables

This section is empty.

Functions

func NewMergeIterator

func NewMergeIterator(iters []y.Iterator, reverse bool) y.Iterator

NewMergeIterator creates a merge iterator.

Types

type Builder

type Builder struct {
	// contains filtered or unexported fields
}

Builder is used in building a table.

func NewTableBuilder

func NewTableBuilder(stream streamclient.StreamClient, ct CompressionType) *Builder

NewTableBuilder makes a new TableBuilder.

func (*Builder) Add

func (b *Builder) Add(key []byte, value y.ValueStruct)

Add adds a key-value pair to the block.

func (*Builder) Close

func (b *Builder) Close()

Close closes the TableBuilder. FinishAll closes the goroutine and waits for it.

func (*Builder) Empty

func (b *Builder) Empty() bool

Empty returns whether it's empty.

func (*Builder) FinishAll

func (b *Builder) FinishAll(headExtentID uint64, headOffset uint32, seqNum uint64, discards map[uint64]int64,
	memorySize uint64) (uint64, uint32, error)

FinishAll finishes the table by appending the index.

The table structure looks like

+---------+------------+-----------+---------------+
| Block 1 | Block 2    | Block 3   | Block 4       |
+---------+------------+-----------+---------------+
| Block 5 | Block 6    | Block ... | Block N       |
+---------+------------+-----------+---------------+
| MetaBlock                                        |
+---------+------------+-----------+---------------+

It returns the metablock position (extentID, offset, error). tailExtentID and tailOffset mark the end of the current commit log: after the commit log is opened, data is read from the blocks starting at (tailExtentID, tailOffset) to generate mt.

func (*Builder) FinishBlock

func (b *Builder) FinishBlock()

Structure of Block.

+-------------------+---------------------+--------------------+--------------+------------------+
| Entry1            | Entry2              | Entry3             | Entry4       | Entry5           |
+-------------------+---------------------+--------------------+--------------+------------------+
| Entry6            | ...                 | ...                | ...          | EntryN           |
+-------------------+---------------------+--------------------+--------------+------------------+
| Block Meta(contains list of offsets used| Block Meta Size    |              |                  |
| to perform binary search in the block)  | (4 Bytes)          |              |                  |
+-----------------------------------------+--------------------+--------------+------------------+

In case the data is encrypted, the "IV" is added to the end of the block.
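To make the block layout concrete, here is a minimal sketch of a block that stores entries back to back, then the list of entry offsets (the block meta), then the 4-byte meta size, and that uses the offsets for binary search. The entry encoding (two 16-bit lengths followed by key and value), the big-endian byte order, and all function names are assumptions made for illustration; the builder's real entry format is not described here, and the sketch ignores compression, the two unlabeled trailing fields, and the IV.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sort"
)

// buildBlock writes entries, then the offsets list, then the size of that list.
// Hypothetical entry encoding: keyLen uint16 | valLen uint16 | key | val.
func buildBlock(keys, vals [][]byte) []byte {
	var buf bytes.Buffer
	offsets := make([]uint32, 0, len(keys))
	for i := range keys {
		offsets = append(offsets, uint32(buf.Len()))
		var hdr [4]byte
		binary.BigEndian.PutUint16(hdr[0:2], uint16(len(keys[i])))
		binary.BigEndian.PutUint16(hdr[2:4], uint16(len(vals[i])))
		buf.Write(hdr[:])
		buf.Write(keys[i])
		buf.Write(vals[i])
	}
	var u32 [4]byte
	for _, off := range offsets {
		binary.BigEndian.PutUint32(u32[:], off)
		buf.Write(u32[:])
	}
	// Block meta size: 4 bytes, counting only the offsets list.
	binary.BigEndian.PutUint32(u32[:], uint32(4*len(offsets)))
	buf.Write(u32[:])
	return buf.Bytes()
}

// entryAt decodes the entry that starts at byte offset off.
func entryAt(block []byte, off uint32) (key, val []byte) {
	kl := uint32(binary.BigEndian.Uint16(block[off:]))
	vl := uint32(binary.BigEndian.Uint16(block[off+2:]))
	key = block[off+4 : off+4+kl]
	val = block[off+4+kl : off+4+kl+vl]
	return key, val
}

// search binary-searches the offsets list and returns the value stored
// under target, if the block contains that exact key.
func search(block, target []byte) (val []byte, found bool) {
	metaSize := binary.BigEndian.Uint32(block[len(block)-4:])
	metaStart := uint32(len(block)) - 4 - metaSize
	n := int(metaSize / 4)
	offsetOf := func(i int) uint32 {
		return binary.BigEndian.Uint32(block[metaStart+uint32(4*i):])
	}
	i := sort.Search(n, func(i int) bool {
		k, _ := entryAt(block, offsetOf(i))
		return bytes.Compare(k, target) >= 0
	})
	if i == n {
		return nil, false
	}
	k, v := entryAt(block, offsetOf(i))
	if !bytes.Equal(k, target) {
		return nil, false
	}
	return v, true
}

func main() {
	keys := [][]byte{[]byte("apple"), []byte("banana"), []byte("cherry")}
	vals := [][]byte{[]byte("1"), []byte("2"), []byte("3")}
	block := buildBlock(keys, vals)
	v, ok := search(block, []byte("banana"))
	fmt.Printf("%s %v\n", v, ok)
}
```

Keeping the offsets at the end lets the writer append entries as they arrive and emit the meta only once the block is complete, while a reader can locate the meta from the trailing size field alone.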

type CompressionType

type CompressionType uint32
const (
	None   CompressionType = 0
	Snappy CompressionType = 1
	ZSTD   CompressionType = 2
)

type ConcatIterator

type ConcatIterator struct {
	// contains filtered or unexported fields
}

ConcatIterator concatenates the sequences defined by several iterators. (It only works with TableIterators, probably just because it's faster to not be so generic.)

func NewConcatIterator

func NewConcatIterator(tbls []*Table, reversed bool) *ConcatIterator

NewConcatIterator creates a new concatenated iterator

func (*ConcatIterator) Close

func (s *ConcatIterator) Close() error

Close implements y.Iterator.

func (*ConcatIterator) Key

func (s *ConcatIterator) Key() []byte

Key implements y.Iterator.

func (*ConcatIterator) Next

func (s *ConcatIterator) Next()

Next advances our concat iterator.

func (*ConcatIterator) Rewind

func (s *ConcatIterator) Rewind()

Rewind implements y.Iterator.

func (*ConcatIterator) Seek

func (s *ConcatIterator) Seek(key []byte)

Seek brings us to element >= key if reversed is false. Otherwise, <= key.

func (*ConcatIterator) Valid

func (s *ConcatIterator) Valid() bool

Valid implements y.Iterator.

func (*ConcatIterator) Value

func (s *ConcatIterator) Value() y.ValueStruct

Value implements y.Iterator.
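The idea behind concatenating iterators can be sketched without the package's own types. The example below assumes the tables are non-empty, individually sorted, and cover non-overlapping key ranges in ascending order, and it handles only the forward (reversed == false) direction. sliceTable and concatIter are hypothetical stand-ins, and choosing the table by its biggest key during Seek is an assumption of this sketch rather than a statement about ConcatIterator's internals.

```go
package main

import (
	"fmt"
	"sort"
)

// sliceTable stands in for a table: a sorted, non-empty run of keys.
type sliceTable []string

// concatIter walks several tables as one sorted sequence.
type concatIter struct {
	tables []sliceTable
	ti, ki int // current table index, current key index within that table
}

// Valid reports whether the iterator points at a key.
func (c *concatIter) Valid() bool { return c.ti < len(c.tables) }

// Key returns the current key; call it only while Valid is true.
func (c *concatIter) Key() string { return c.tables[c.ti][c.ki] }

// settle moves on to the next table once the current one is exhausted.
func (c *concatIter) settle() {
	for c.ti < len(c.tables) && c.ki >= len(c.tables[c.ti]) {
		c.ti++
		c.ki = 0
	}
}

// Rewind positions the iterator at the first key of the first table.
func (c *concatIter) Rewind() {
	c.ti, c.ki = 0, 0
	c.settle()
}

// Next advances by one key, crossing table boundaries as needed.
func (c *concatIter) Next() {
	c.ki++
	c.settle()
}

// Seek positions the iterator at the first key >= target.
func (c *concatIter) Seek(target string) {
	// Find the first table whose biggest key is >= target.
	c.ti = sort.Search(len(c.tables), func(i int) bool {
		t := c.tables[i]
		return t[len(t)-1] >= target
	})
	c.ki = 0
	if c.ti < len(c.tables) {
		c.ki = sort.SearchStrings(c.tables[c.ti], target)
	}
	c.settle()
}

func main() {
	it := &concatIter{tables: []sliceTable{{"a", "c"}, {"e", "g"}, {"k"}}}
	for it.Seek("d"); it.Valid(); it.Next() {
		fmt.Println(it.Key())
	}
}
```

Because the key ranges do not overlap, only one table is ever active at a time, which is what makes concatenation cheaper than a general merge.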

type Iterator

type Iterator struct {
	// contains filtered or unexported fields
}

Iterator is an iterator for a Table.

func (*Iterator) Close

func (itr *Iterator) Close() error

Close closes the iterator (and it must be called).

func (*Iterator) Key

func (itr *Iterator) Key() []byte

Key follows the y.Iterator interface. Returns the key with timestamp.

func (*Iterator) Next

func (itr *Iterator) Next()

Next follows the y.Iterator interface

func (*Iterator) Rewind

func (itr *Iterator) Rewind()

Rewind follows the y.Iterator interface

func (*Iterator) Seek

func (itr *Iterator) Seek(key []byte)

Seek follows the y.Iterator interface

func (*Iterator) Valid

func (itr *Iterator) Valid() bool

Valid follows the y.Iterator interface

func (*Iterator) Value

func (itr *Iterator) Value() (ret y.ValueStruct)

Value follows the y.Iterator interface

func (*Iterator) ValueCopy

func (itr *Iterator) ValueCopy() (ret y.ValueStruct)

ValueCopy copies the current value and returns it as decoded ValueStruct.

type MergeIterator

type MergeIterator struct {
	// contains filtered or unexported fields
}

MergeIterator merges multiple iterators. NOTE: MergeIterator owns the array of iterators and is responsible for closing them.

func (*MergeIterator) Close

func (mi *MergeIterator) Close() error

Close implements y.Iterator.

func (*MergeIterator) Key

func (mi *MergeIterator) Key() []byte

Key returns the key associated with the current iterator.

func (*MergeIterator) Next

func (mi *MergeIterator) Next()

Next advances to the next element. If its key is the same as the current key, it is skipped.

func (*MergeIterator) Rewind

func (mi *MergeIterator) Rewind()

Rewind seeks to first element (or last element for reverse iterator).

func (*MergeIterator) Seek

func (mi *MergeIterator) Seek(key []byte)

Seek brings us to element with key >= given key.

func (*MergeIterator) Valid

func (mi *MergeIterator) Valid() bool

Valid returns whether the MergeIterator is at a valid element.

func (*MergeIterator) Value

func (mi *MergeIterator) Value() y.ValueStruct

Value returns the value associated with the iterator.
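As an illustration of what merging sorted inputs with duplicate-key skipping involves, here is a self-contained k-way merge built on container/heap. It is a sketch of the general technique, not this package's implementation: the cursor type, the rule that the lower input index wins ties, and the merge function are all assumptions introduced for the example, and it works on plain string slices rather than y.Iterator values.

```go
package main

import (
	"container/heap"
	"fmt"
)

// cursor is a position inside one sorted input (stand-in for an iterator).
type cursor struct {
	keys []string
	pos  int
	src  int // index of the input; the lower index wins ties
}

// cursorHeap orders cursors by their current key, then by input index.
type cursorHeap []*cursor

func (h cursorHeap) Len() int { return len(h) }
func (h cursorHeap) Less(i, j int) bool {
	a, b := h[i].keys[h[i].pos], h[j].keys[h[j].pos]
	if a != b {
		return a < b
	}
	return h[i].src < h[j].src
}
func (h cursorHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x interface{}) { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() interface{} {
	old := *h
	c := old[len(old)-1]
	*h = old[:len(old)-1]
	return c
}

// merge returns the keys of all inputs in sorted order, emitting each
// distinct key once.
func merge(inputs ...[]string) []string {
	h := &cursorHeap{}
	for i, in := range inputs {
		if len(in) > 0 {
			*h = append(*h, &cursor{keys: in, src: i})
		}
	}
	heap.Init(h)
	var out []string
	for h.Len() > 0 {
		c := (*h)[0] // cursor with the smallest current key
		k := c.keys[c.pos]
		// Skip a key equal to the one just emitted.
		if len(out) == 0 || out[len(out)-1] != k {
			out = append(out, k)
		}
		c.pos++
		if c.pos < len(c.keys) {
			heap.Fix(h, 0) // restore heap order after advancing the top cursor
		} else {
			heap.Pop(h) // this input is exhausted
		}
	}
	return out
}

func main() {
	fmt.Println(merge(
		[]string{"a", "c", "e"},
		[]string{"b", "c", "f"},
	))
}
```

Breaking ties by input index means that, when two inputs hold the same key, the copy from the earlier input is the one that is emitted and the later copy is skipped.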

type Table

type Table struct {
	utils.SafeMutex

	// Stores the total size of key-values in skiplist.
	EstimatedSize uint64

	Loc     pspb.Location //saved address in rowStream
	LastSeq uint64
	//all data before [vpExtentID, vpOffset] is in rowStream. log replay starts from [vpExtentID, vpOffset]
	VpExtentID uint64
	VpOffset   uint32
	//extentID => discard count
	Discards map[uint64]int64

	CompressionType  CompressionType
	CompressedSize   uint32
	UncompressedSize uint32
	// contains filtered or unexported fields
}

func OpenTable

func OpenTable(streamReader streamclient.StreamClient,
	extentID uint64, offset uint32) (*Table, error)

func (*Table) Biggest

func (t *Table) Biggest() []byte

Biggest is its biggest key, or nil if there are none

func (*Table) Close

func (t *Table) Close()

func (*Table) DoesNotHave

func (t *Table) DoesNotHave(hash uint64) bool

func (*Table) FirstOccurrence

func (t *Table) FirstOccurrence() uint64

func (*Table) MidKey

func (t *Table) MidKey() []byte

MidKey returns the first key of the block with zero-based index n / 2, where n is the total number of blocks in the table.
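The index arithmetic is easy to check with a small sketch. It assumes that n counts the blocks of the table and that the first key of each block is available in a slice; midKey and firstKeys are hypothetical names, not part of the package.

```go
package main

import "fmt"

// midKey returns the first key of the block at zero-based index n/2,
// where n is the number of blocks, or nil if there are no blocks.
// firstKeys[i] is assumed to hold the first key of block i.
func midKey(firstKeys [][]byte) []byte {
	n := len(firstKeys)
	if n == 0 {
		return nil
	}
	return firstKeys[n/2] // integer division rounds down
}

func main() {
	blocks := [][]byte{[]byte("a"), []byte("f"), []byte("k"), []byte("p"), []byte("u")}
	fmt.Printf("%s\n", midKey(blocks))
}
```

With an even number of blocks, n/2 selects the first block of the upper half, so the key splits the table into two halves of equal block count.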

func (*Table) NewIterator

func (t *Table) NewIterator(reversed bool) *Iterator

NewIterator returns a new iterator of the Table

func (*Table) Smallest

func (t *Table) Smallest() []byte

Smallest is its smallest key, or nil if there are none

type TableInterface

type TableInterface interface {
	Smallest() []byte
	Biggest() []byte
	DoesNotHave(hash uint64) bool
}

TableInterface is useful for testing.
