Documentation ¶
Overview ¶
Package local provides an implementation of a key-value store on your filesystem.
It implements the fsdb.Local interface.
Layout ¶
With default options (SHA-512/224 hash, 3 directory levels), a key of "key" will be hashed into
6cb1b0e50d74419e2244eaa7328235f71b48c7e1c33b23f6f9517d14
and the files will be stored under:
    <fsdb-root>/
      data/
        6c/
          b1/
            b0/
              e50d74419e2244eaa7328235f71b48c7e1c33b23f6f9517d14/
                key     // Key file
                data    // Data file if no compression
                data.gz // Data file if gzip enabled
There could also be temporary files for unfinished write operations under
<fsdb-root>/_tmp/fsdb_<tmpdir>/
Both hash function and directory levels are configurable.
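The mapping from a hash to a directory can be sketched as follows. This is only an illustration of the layout described above, not the package's actual code: the helper name dirForHash is hypothetical, the separator is hard-coded to "/", and it assumes the hash is computed over the raw key bytes.

```go
package main

import (
	"crypto/sha512"
	"encoding/hex"
	"fmt"
	"strings"
)

// dirForHash (hypothetical helper) splits a hex-encoded hash into `levels`
// two-character directory names followed by the remainder of the hash.
// The result always ends with "/".
func dirForHash(hexHash string, levels int) string {
	var sb strings.Builder
	for i := 0; i < levels && len(hexHash) > 2; i++ {
		sb.WriteString(hexHash[:2])
		sb.WriteString("/")
		hexHash = hexHash[2:]
	}
	sb.WriteString(hexHash)
	sb.WriteString("/")
	return sb.String()
}

func main() {
	// Assumption: the key bytes are hashed as-is with SHA-512/224.
	sum := sha512.Sum512_224([]byte("key"))
	fmt.Println(dirForHash(hex.EncodeToString(sum[:]), 3))

	// With 2 levels, "deadbeef" becomes "de/ad/beef/".
	fmt.Println(dirForHash("deadbeef", 2))
}
```

More directory levels mean fewer entries per directory at the cost of deeper paths.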
Atomicity ¶
There are no extra locks in the implementation. Atomicity relies on the atomicity your filesystem guarantees for operations like move (rename), delete, open, etc.
Read Before Overwriting Finishes on the Same Key ¶
If you issue a read operation before an overwrite operation (a write operation on an existing key) on the same key finishes (returns), you could get either the previous data or the new data, depending on timing. You will never read corrupt or incomplete data when fsdb returns a nil error.
In detail, the write operation sequence is:
1. Check for key collision
2. Write key-value data into the temporary directory
3. Move new key-value data from the temporary directory to the actual directory
4. Delete old value(s), if any
Read operations issued before Step 3 will get the old data. Read operations issued after Step 3 will get the new data.
Two Write Operations on the Same Key ¶
If you issue a write operation before another write operation on the same key finishes, the one that finishes first will be overwritten by the other.
Compression ¶
This implementation supports optional gzip compression with configurable compression levels.
If you change the compression option on a non-empty local fsdb, the old data remains readable and new data is stored using the new compression option.
Running

    go test -bench .

shows read and write benchmark results for the different compression options, on the filesystem under the current directory, for several typical sizes of random binary data. Note that the write time includes the time spent generating the random binary data. Also note that the reported read time can be much smaller than in reality when the corresponding write benchmark ran only a few times (large samples, like the 256M results below): those read benchmarks read the same file over and over again, and most filesystems optimize for that access pattern.
You should choose your compression options based on your own benchmark results, typical data size, and estimated read/write ratio.
A sample result on Debian sid (kernel 4.16) ext4 non-SSD is:
    $ vgo test -bench=.
    goos: linux
    goarch: amd64
    pkg: github.com/fishy/fsdb/local
    BenchmarkReadWrite/1K/gzip-min/write-2	3000	484932 ns/op
    BenchmarkReadWrite/1K/gzip-min/read-2	30000	38385 ns/op
    BenchmarkReadWrite/1K/gzip-default/write-2	5000	499867 ns/op
    BenchmarkReadWrite/1K/gzip-default/read-2	50000	37513 ns/op
    BenchmarkReadWrite/1K/gzip-max/write-2	5000	450503 ns/op
    BenchmarkReadWrite/1K/gzip-max/read-2	50000	37402 ns/op
    BenchmarkReadWrite/1K/nocompression/write-2	5000	492013 ns/op
    BenchmarkReadWrite/1K/nocompression/read-2	50000	36236 ns/op
    BenchmarkReadWrite/10K/nocompression/write-2	5000	1083269 ns/op
    BenchmarkReadWrite/10K/nocompression/read-2	50000	38601 ns/op
    BenchmarkReadWrite/10K/gzip-min/write-2	5000	457293 ns/op
    BenchmarkReadWrite/10K/gzip-min/read-2	50000	37044 ns/op
    BenchmarkReadWrite/10K/gzip-default/write-2	5000	501657 ns/op
    BenchmarkReadWrite/10K/gzip-default/read-2	50000	37068 ns/op
    BenchmarkReadWrite/10K/gzip-max/write-2	5000	483487 ns/op
    BenchmarkReadWrite/10K/gzip-max/read-2	50000	35943 ns/op
    BenchmarkReadWrite/1M/gzip-min/write-2	1000	7456528 ns/op
    BenchmarkReadWrite/1M/gzip-min/read-2	50000	37182 ns/op
    BenchmarkReadWrite/1M/gzip-default/write-2	1000	8039125 ns/op
    BenchmarkReadWrite/1M/gzip-default/read-2	50000	35744 ns/op
    BenchmarkReadWrite/1M/gzip-max/write-2	1000	11203349 ns/op
    BenchmarkReadWrite/1M/gzip-max/read-2	50000	37702 ns/op
    BenchmarkReadWrite/1M/nocompression/write-2	1000	8565408 ns/op
    BenchmarkReadWrite/1M/nocompression/read-2	50000	38373 ns/op
    BenchmarkReadWrite/10M/gzip-default/write-2	100	72915402 ns/op
    BenchmarkReadWrite/10M/gzip-default/read-2	50000	36421 ns/op
    BenchmarkReadWrite/10M/gzip-max/write-2	100	79632091 ns/op
    BenchmarkReadWrite/10M/gzip-max/read-2	50000	35591 ns/op
    BenchmarkReadWrite/10M/nocompression/write-2	100	69915961 ns/op
    BenchmarkReadWrite/10M/nocompression/read-2	50000	36835 ns/op
    BenchmarkReadWrite/10M/gzip-min/write-2	100	66220149 ns/op
    BenchmarkReadWrite/10M/gzip-min/read-2	50000	36418 ns/op
    BenchmarkReadWrite/256M/nocompression/write-2	2	2380085276 ns/op
    BenchmarkReadWrite/256M/nocompression/read-2	50000	34536 ns/op
    BenchmarkReadWrite/256M/gzip-min/write-2	2	2916555472 ns/op
    BenchmarkReadWrite/256M/gzip-min/read-2	50000	34695 ns/op
    BenchmarkReadWrite/256M/gzip-default/write-2	2	2347434829 ns/op
    BenchmarkReadWrite/256M/gzip-default/read-2	50000	35709 ns/op
    BenchmarkReadWrite/256M/gzip-max/write-2	2	2311051473 ns/op
    BenchmarkReadWrite/256M/gzip-max/read-2	50000	36053 ns/op
    PASS
    ok  	github.com/fishy/fsdb/local	181.685s
A sample result on macOS 10.13.4 HFS+ hybrid-disk:
    $ vgo test -bench=.
    goos: darwin
    goarch: amd64
    pkg: github.com/fishy/fsdb/local
    BenchmarkReadWrite/1K/gzip-default/write-4	3000	551077 ns/op
    BenchmarkReadWrite/1K/gzip-default/read-4	30000	45361 ns/op
    BenchmarkReadWrite/1K/gzip-max/write-4	2000	579064 ns/op
    BenchmarkReadWrite/1K/gzip-max/read-4	30000	45244 ns/op
    BenchmarkReadWrite/1K/nocompression/write-4	3000	589749 ns/op
    BenchmarkReadWrite/1K/nocompression/read-4	30000	44290 ns/op
    BenchmarkReadWrite/1K/gzip-min/write-4	3000	601753 ns/op
    BenchmarkReadWrite/1K/gzip-min/read-4	30000	41715 ns/op
    BenchmarkReadWrite/10K/gzip-min/write-4	2000	579109 ns/op
    BenchmarkReadWrite/10K/gzip-min/read-4	30000	41290 ns/op
    BenchmarkReadWrite/10K/gzip-default/write-4	2000	622533 ns/op
    BenchmarkReadWrite/10K/gzip-default/read-4	30000	45317 ns/op
    BenchmarkReadWrite/10K/gzip-max/write-4	3000	592523 ns/op
    BenchmarkReadWrite/10K/gzip-max/read-4	30000	41447 ns/op
    BenchmarkReadWrite/10K/nocompression/write-4	2000	640485 ns/op
    BenchmarkReadWrite/10K/nocompression/read-4	30000	41452 ns/op
    BenchmarkReadWrite/1M/gzip-min/write-4	300	6990783 ns/op
    BenchmarkReadWrite/1M/gzip-min/read-4	30000	43776 ns/op
    BenchmarkReadWrite/1M/gzip-default/write-4	200	10915348 ns/op
    BenchmarkReadWrite/1M/gzip-default/read-4	30000	42378 ns/op
    BenchmarkReadWrite/1M/gzip-max/write-4	200	10628149 ns/op
    BenchmarkReadWrite/1M/gzip-max/read-4	30000	40477 ns/op
    BenchmarkReadWrite/1M/nocompression/write-4	300	8656265 ns/op
    BenchmarkReadWrite/1M/nocompression/read-4	30000	44789 ns/op
    BenchmarkReadWrite/10M/nocompression/write-4	20	114984454 ns/op
    BenchmarkReadWrite/10M/nocompression/read-4	50000	38741 ns/op
    BenchmarkReadWrite/10M/gzip-min/write-4	50	71933590 ns/op
    BenchmarkReadWrite/10M/gzip-min/read-4	50000	39491 ns/op
    BenchmarkReadWrite/10M/gzip-default/write-4	20	67794399 ns/op
    BenchmarkReadWrite/10M/gzip-default/read-4	50000	38596 ns/op
    BenchmarkReadWrite/10M/gzip-max/write-4	30	116057303 ns/op
    BenchmarkReadWrite/10M/gzip-max/read-4	50000	38180 ns/op
    BenchmarkReadWrite/256M/gzip-default/write-4	1	1921484298 ns/op
    BenchmarkReadWrite/256M/gzip-default/read-4	50000	39500 ns/op
    BenchmarkReadWrite/256M/gzip-max/write-4	1	1657779284 ns/op
    BenchmarkReadWrite/256M/gzip-max/read-4	50000	40050 ns/op
    BenchmarkReadWrite/256M/nocompression/write-4	1	1438590200 ns/op
    BenchmarkReadWrite/256M/nocompression/read-4	50000	39634 ns/op
    BenchmarkReadWrite/256M/gzip-min/write-4	1	1445182668 ns/op
    BenchmarkReadWrite/256M/gzip-min/read-4	50000	39145 ns/op
    PASS
    ok  	github.com/fishy/fsdb/local	96.428s
And a sample result on macOS 10.13.4 HFS+ SSD:
    $ vgo test -bench=.
    goos: darwin
    goarch: amd64
    pkg: github.com/fishy/fsdb/local
    BenchmarkReadWrite/256M/nocompression/write-8	5	356710471 ns/op
    BenchmarkReadWrite/256M/nocompression/read-8	30000	43209 ns/op
    BenchmarkReadWrite/256M/gzip-min/write-8	5	228341933 ns/op
    BenchmarkReadWrite/256M/gzip-min/read-8	30000	44043 ns/op
    BenchmarkReadWrite/256M/gzip-default/write-8	5	265429180 ns/op
    BenchmarkReadWrite/256M/gzip-default/read-8	30000	45661 ns/op
    BenchmarkReadWrite/256M/gzip-max/write-8	5	280203940 ns/op
    BenchmarkReadWrite/256M/gzip-max/read-8	30000	44006 ns/op
    BenchmarkReadWrite/1K/nocompression/write-8	2000	824098 ns/op
    BenchmarkReadWrite/1K/nocompression/read-8	30000	45584 ns/op
    BenchmarkReadWrite/1K/gzip-min/write-8	2000	766742 ns/op
    BenchmarkReadWrite/1K/gzip-min/read-8	20000	85293 ns/op
    BenchmarkReadWrite/1K/gzip-default/write-8	2000	858178 ns/op
    BenchmarkReadWrite/1K/gzip-default/read-8	30000	59477 ns/op
    BenchmarkReadWrite/1K/gzip-max/write-8	2000	839374 ns/op
    BenchmarkReadWrite/1K/gzip-max/read-8	30000	46870 ns/op
    BenchmarkReadWrite/10K/nocompression/write-8	2000	805031 ns/op
    BenchmarkReadWrite/10K/nocompression/read-8	20000	51670 ns/op
    BenchmarkReadWrite/10K/gzip-min/write-8	2000	929401 ns/op
    BenchmarkReadWrite/10K/gzip-min/read-8	20000	80976 ns/op
    BenchmarkReadWrite/10K/gzip-default/write-8	2000	818654 ns/op
    BenchmarkReadWrite/10K/gzip-default/read-8	30000	44932 ns/op
    BenchmarkReadWrite/10K/gzip-max/write-8	2000	752227 ns/op
    BenchmarkReadWrite/10K/gzip-max/read-8	30000	47205 ns/op
    BenchmarkReadWrite/1M/gzip-min/write-8	1000	1371292 ns/op
    BenchmarkReadWrite/1M/gzip-min/read-8	10000	101911 ns/op
    BenchmarkReadWrite/1M/gzip-default/write-8	1000	1347627 ns/op
    BenchmarkReadWrite/1M/gzip-default/read-8	20000	54486 ns/op
    BenchmarkReadWrite/1M/gzip-max/write-8	1000	1408124 ns/op
    BenchmarkReadWrite/1M/gzip-max/read-8	20000	51155 ns/op
    BenchmarkReadWrite/1M/nocompression/write-8	1000	1238631 ns/op
    BenchmarkReadWrite/1M/nocompression/read-8	30000	45901 ns/op
    BenchmarkReadWrite/10M/nocompression/write-8	100	10761830 ns/op
    BenchmarkReadWrite/10M/nocompression/read-8	30000	42879 ns/op
    BenchmarkReadWrite/10M/gzip-min/write-8	100	11185495 ns/op
    BenchmarkReadWrite/10M/gzip-min/read-8	30000	43027 ns/op
    BenchmarkReadWrite/10M/gzip-default/write-8	100	11036515 ns/op
    BenchmarkReadWrite/10M/gzip-default/read-8	30000	43005 ns/op
    BenchmarkReadWrite/10M/gzip-max/write-8	100	11564331 ns/op
    BenchmarkReadWrite/10M/gzip-max/read-8	30000	43158 ns/op
    PASS
    ok  	github.com/fishy/fsdb/local	88.676s
Other Notes ¶
Remember to set an appropriate file number limit on your filesystem.
Example ¶
    package main

    import (
    	"context"
    	"fmt"
    	"io/ioutil"
    	"os"
    	"strings"

    	"github.com/fishy/fsdb"
    	"github.com/fishy/fsdb/local"
    )

    func main() {
    	root, _ := ioutil.TempDir("", "fsdb_")
    	defer os.RemoveAll(root)
    	db := local.Open(local.NewDefaultOptions(root).SetUseGzip(true))
    	key := fsdb.Key("name")
    	ctx := context.Background()

    	db.Write(ctx, key, strings.NewReader("Anakin Skywalker"))
    	reader, err := db.Read(ctx, key)
    	if err != nil {
    		// TODO: handle error
    	}
    	name, err := ioutil.ReadAll(reader)
    	reader.Close()
    	if err != nil {
    		// TODO: handle error
    	}
    	fmt.Println(string(name))

    	db.Write(ctx, key, strings.NewReader("Darth Vader"))
    	reader, err = db.Read(ctx, key)
    	if err != nil {
    		// TODO: handle error
    	}
    	name, err = ioutil.ReadAll(reader)
    	reader.Close()
    	if err != nil {
    		// TODO: handle error
    	}
    	fmt.Println(string(name))

    	db.Delete(ctx, key)
    	_, err = db.Read(ctx, key)
    	if fsdb.IsNoSuchKeyError(err) {
    		fmt.Println("Joined force")
    	}
    }

Output:

    Anakin Skywalker
    Darth Vader
    Joined force
Constants ¶
    const (
    	KeyFilename      = "key"
    	DataFilename     = "data"
    	GzipDataFilename = "data.gz"
    )
Filenames used under the entry directory.
    const (
    	DefaultDataDir   = "data" + PathSeparator
    	DefaultTempDir   = "_tmp" + PathSeparator
    	DefaultDirLevel  = 3
    	DefaultUseGzip   = false
    	DefaultGzipLevel = gzip.DefaultCompression
    )
Default options values.
const PathSeparator = string(os.PathSeparator)
PathSeparator is the string version of os.PathSeparator.
Variables ¶
    var (
    	FileModeForFiles os.FileMode = 0600
    	FileModeForDirs  os.FileMode = 0700
    )
Permissions for files and directories.
var DefaultHashFunc = sha512.New512_224
DefaultHashFunc is the default hash function, which is SHA-512/224.
It was chosen because it produces relatively short hashes, and thus shorter filenames.
Functions ¶
Types ¶
type KeyCollisionError ¶
KeyCollisionError is an error returned when two keys have the same hash.
func (*KeyCollisionError) Error ¶
func (err *KeyCollisionError) Error() string
type Options ¶
    type Options interface {
    	// GetRootDataDir returns the full path of the root data directory,
    	// guaranteed to end with PathSeparator.
    	GetRootDataDir() string

    	// GetRootTempDir returns the full path of the root temporary directory,
    	// guaranteed to end with PathSeparator.
    	GetRootTempDir() string

    	// GetHashFunc returns the hash function used in keys.
    	GetHashFunc() func() hash.Hash

    	// GetDirForKey returns the directory to put entry in,
    	// guaranteed to end with PathSeparator and guaranteed to be under root data
    	// directory.
    	GetDirForKey(key fsdb.Key) string

    	GetUseGzip() bool
    	GetGzipLevel() int
    }

Options defines a read-only view of options used by local fsdb.
type OptionsBuilder ¶
    type OptionsBuilder interface {
    	Options

    	// Build returns the read-only version of options.
    	Build() Options

    	// SetDataDir sets the relative data directory within the root directory.
    	SetDataDir(dir string) OptionsBuilder

    	// SetTempDir sets the relative temporary directory within the root directory.
    	//
    	// It should be on the same mount point as data directory.
    	SetTempDir(dir string) OptionsBuilder

    	// SetHashFunc sets the hash function used for keys.
    	SetHashFunc(f func() hash.Hash) OptionsBuilder

    	// SetDirLevel sets the directory level used in filenames.
    	// Its purpose is to limit the number of files under the same directory.
    	//
    	// For example, if directory level was set to 2, hash value "deadbeef" will
    	// convert to directory name "de/ad/beef/".
    	SetDirLevel(level int) OptionsBuilder

    	// SetUseGzip sets whether to use gzip for storage.
    	SetUseGzip(gzip bool) OptionsBuilder

    	// SetGzipLevel sets the level used in gzip compression.
    	SetGzipLevel(level int) OptionsBuilder
    }
OptionsBuilder defines a read-write view of options used by local fsdb.
Gzip related options are safe to change on an existing FSDB system. Changing other options will break the existing FSDB system.
func NewDefaultOptions ¶
func NewDefaultOptions(root string) OptionsBuilder
NewDefaultOptions creates an OptionsBuilder with default options.