rumorlog

package module
v1.0.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 4, 2020 License: MIT Imports: 1 Imported by: 0

README

Go Rumorlog builds.sr.ht status

Package rumorlog tries to implement a fast way to map hashed identifiers to sequence numbers. These sequence numbers can then be used to build integer bitmaps. These bitmaps can then be used for set operations.

The API can be seen on godoc.

Installation

go get go.mindeco.de/rumorlog

Simple usage examples can be seen on the trie implementation package.

Implementations

A very naive first draft for the idea is done as a SQLite implementation with this schema:

CREATE TABLE simple_index (
	seq integer PRIMARY KEY,
	key blob unique
);

CREATE INDEX key_tree ON simple_index (key);

The idea is to use a trie data strucutre together with an append-only log to optimize both mapping directions. The offset log will impose an (arbitrary) sequence on the hashes and the trie is used to find the sequence number for a given hash.

For comparisson there also is a flat mkv but the GetKey lookup is implemented with a naive search loop.

Benchmark

We insert a fixed size set of identifiers before running benchmarks of GetKey and GetSequence to see if the lookups are stable with growing set sizes.

$ go test ./tests -timeout 20m -bench=. -benchtime=3s -benchmem | tee tests/first-draft.bench

goos: linux
goarch: amd64
pkg: go.mindeco.de/rumorlog/tests
BenchmarkAll/add-trie+ofst-8         	          683858	      5073 ns/op	    1304 B/op	      13 allocs/op
BenchmarkAll/fill/trie+ofst-20-8     	            4797	    706998 ns/op	   87296 B/op	    1064 allocs/op
BenchmarkAll/fill/trie+ofst-100-8    	            1308	   2729179 ns/op	  327041 B/op	    5055 allocs/op
BenchmarkAll/fill/trie+ofst-1000-8   	             141	  25566906 ns/op	 2909403 B/op	   50053 allocs/op
BenchmarkAll/fill/trie+ofst-10000-8  	              13	 267205672 ns/op	29858106 B/op	  495028 allocs/op
BenchmarkAll/getSeq/trie+ofst-20-8   	         1658960	      2071 ns/op	     655 B/op	       6 allocs/op
BenchmarkAll/getSeq/trie+ofst-100-8  	         1733563	      2121 ns/op	     655 B/op	       6 allocs/op
BenchmarkAll/getSeq/trie+ofst-1000-8 	         1649378	      2176 ns/op	     655 B/op	       6 allocs/op
BenchmarkAll/getSeq/trie+ofst-10000-8         	 1592440	      2293 ns/op	     655 B/op	       6 allocs/op
BenchmarkAll/getKey/trie+ofst-20-8            	  417561	      8318 ns/op	    2911 B/op	      17 allocs/op
BenchmarkAll/getKey/trie+ofst-100-8           	  441717	      8705 ns/op	    2911 B/op	      17 allocs/op
BenchmarkAll/getKey/trie+ofst-1000-8          	  385377	      8540 ns/op	    2911 B/op	      17 allocs/op
BenchmarkAll/getKey/trie+ofst-10000-8         	  421308	      9080 ns/op	    2911 B/op	      17 allocs/op
BenchmarkAll/add-modernkv-8                   	   16825	    251873 ns/op	   62567 B/op	     258 allocs/op
BenchmarkAll/fill/modernkv-20-8               	      61	  58563973 ns/op	  116278 B/op	    1382 allocs/op
BenchmarkAll/fill/modernkv-100-8              	      75	  47505093 ns/op	  717073 B/op	    8895 allocs/op
BenchmarkAll/fill/modernkv-1000-8             	      21	 156441060 ns/op	25172910 B/op	  130513 allocs/op
BenchmarkAll/fill/modernkv-10000-8            	       2	1891127095 ns/op	278148252 B/op	 1695248 allocs/op
BenchmarkAll/getSeq/modernkv-20-8             	  537592	      6851 ns/op	    3940 B/op	      48 allocs/op
BenchmarkAll/getSeq/modernkv-100-8            	  356325	     10230 ns/op	   10701 B/op	      65 allocs/op
BenchmarkAll/getSeq/modernkv-1000-8           	  111956	     28992 ns/op	   73344 B/op	      96 allocs/op
BenchmarkAll/getSeq/modernkv-10000-8          	   82911	     38719 ns/op	   57426 B/op	     123 allocs/op
BenchmarkAll/getKey/modernkv-20-8             	  431254	      8504 ns/op	    3978 B/op	      80 allocs/op
BenchmarkAll/getKey/modernkv-100-8            	  136754	     25516 ns/op	   14417 B/op	     321 allocs/op
BenchmarkAll/getKey/modernkv-1000-8           	    7161	    536460 ns/op	  138602 B/op	    3338 allocs/op
BenchmarkAll/getKey/modernkv-10000-8          	     631	   4929963 ns/op	  935402 B/op	   28195 allocs/op
BenchmarkAll/add-sqlite-8                     	     793	   4979380 ns/op	    2296 B/op	      45 allocs/op
BenchmarkAll/fill/sqlite-20-8                 	      32	 114943073 ns/op	   63338 B/op	     656 allocs/op
BenchmarkAll/fill/sqlite-100-8                	       7	 524763195 ns/op	  193181 B/op	    2739 allocs/op
BenchmarkAll/fill/sqlite-1000-8               	       1	5066513896 ns/op	 1623088 B/op	   26122 allocs/op
BenchmarkAll/fill/sqlite-10000-8              	       1	50756266937 ns/op	16168112 B/op	  260126 allocs/op
BenchmarkAll/getSeq/sqlite-20-8               	   83376	     43069 ns/op	    2063 B/op	      50 allocs/op
BenchmarkAll/getSeq/sqlite-100-8              	   83034	     45102 ns/op	    2063 B/op	      50 allocs/op
BenchmarkAll/getSeq/sqlite-1000-8             	   84038	     43742 ns/op	    2063 B/op	      50 allocs/op
BenchmarkAll/getSeq/sqlite-10000-8            	   82750	     44007 ns/op	    2063 B/op	      50 allocs/op
BenchmarkAll/getKey/sqlite-20-8               	  177464	     21390 ns/op	    1439 B/op	      29 allocs/op
BenchmarkAll/getKey/sqlite-100-8              	  170108	     20689 ns/op	    1439 B/op	      29 allocs/op
BenchmarkAll/getKey/sqlite-1000-8             	  171957	     20939 ns/op	    1439 B/op	      29 allocs/op
BenchmarkAll/getKey/sqlite-10000-8            	  176138	     21546 ns/op	    1439 B/op	      29 allocs/op
PASS
ok  	go.mindeco.de/rumorlog/tests	707.135s

We can see that while sqlite is stable, it's orders of magnitude slower compared to the specialzied approach of trie+ofset.

ps: These results were generated on an Intel Core i7-4770 CPU @ 3.40GHz with 16Gb RAM.

License

MIT

Documentation

Overview

Package rumorlog tries to implement a fast way to map hashed identifiers to sequence numbers. These sequence numbers can then be used to build integer bitmaps. These bitmaps can then be used for set operations.

To achive this, we impose an order on the hashes by appending them to a sequence log. A trie will be used to find the sequence number for a given hash.

Index

Constants

This section is empty.

Variables

View Source
var ErrNoSuchKey = errors.New("rumorlog: no such key")

Functions

This section is empty.

Types

type Index

type Index interface {

	// GetSequence looks for key and returns the sequence if it exists.
	// if it doesn't exist it should return ErrNoSuchKey.
	// if make is true, it is created if it doesn't exist yet.
	GetSequence(key []byte, make bool) (uint64, error)

	// GetKey does the revere lookup to GetSequence.
	// If the sequence is out of bounds ErrNoSuchKey is returned.
	GetKey(seq uint64) ([]byte, error)

	// Close closes the underlying data store
	Close() error

	/* TODO: optional? */
	Has(key []byte) bool
	Insert(key []byte) (uint64, error)
}

Index matches a set of unique keys to sequence numbers

Directories

Path Synopsis
internal

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL