streamvbyte

package module
v0.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 22, 2020 License: MIT Imports: 2 Imported by: 1

README

streamvbyte

This package implements integer compression using the Stream VByte algorithm.

"Stream VByte: Faster Byte-Oriented Integer Compression" Daniel Lemire, Nathan Kurz, Christoph Rupp Information Processing Letters Volume 130, February 2018, Pages 1-6 https://arxiv.org/abs/1709.08990

The focus of this method is fast decoding speed on processors with SIMD instructions. As such, this package contains fast assembly implementations for decoding on amd64 processors with SSE3 instructions. Other processors will use the slower pure go implementation.

Assembly implementations were generated using the excellent avo

Reference benchmarks on 1,000,000 Zipfian-distributed 32-bit integers.

Intel(R) Core(TM) i3-4010U CPU @ 1.70GHz go version go1.14.2 linux/amd64

Type Decode (SSE3) Decode (pure go) Encode (pure go) compression ratio
uint32 4.35GB/s ± 1% 291MB/s ± 1% 272MB/s ± 0% 0.51
uint32 (delta) 3.90GB/s ± 4% 890MB/s ± 5% 948MB/s ± 1% 0.34
int32 3.99GB/s ± 4% 276MB/s ± 0% 257MB/s ± 2% 0.51
int32 (delta) 3.08GB/s ± 3% 887MB/s ± 1% 866MB/s ± 2% 0.35

For reference, a pure memory copy of the uncompressed data runs at 6.58GB/s ± 2%.

Documentation

Overview

Package streamvbyte implements integer compression using the Stream VByte algorithm.

"Stream VByte: Faster Byte-Oriented Integer Compression" Daniel Lemire, Nathan Kurz, Christoph Rupp Information Processing Letters Volume 130, February 2018, Pages 1-6 https://arxiv.org/abs/1709.08990

The focus of this method is fast decoding speed on processors with SIMD instructions. As such, this package contains fast assembly implementations for decoding on amd64 processors with SSE3 instructions. Other processors will use the slower pure go implementation.

Assembly implementations were generated using the excellent avo package https://github.com/mmcloughlin/avo

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func DecodeDeltaInt32

func DecodeDeltaInt32(data []int32, encoded []byte, previous int32)

DecodeDeltaInt32 decodes len(data) int32 from encoded using the Stream Vbyte algorithm with delta and zigzag encoding using the initial value previous. encoded must contain exactly len(data) encoded uint32.

func DecodeDeltaUint32

func DecodeDeltaUint32(data []uint32, encoded []byte, previous uint32)

DecodeDeltaUint32 decodes len(data) uint32 from encoded using the Stream Vbyte algorithm with delta encoding using the initial value previous. encoded must contain exactly len(data) encoded uint32.

func DecodeInt32

func DecodeInt32(data []int32, encoded []byte)

DecodeInt32 decodes len(data) int32 from encoded using the Stream Vbyte algorithm. encoded must contain exactly len(data) encoded int32.

func DecodeUint32

func DecodeUint32(data []uint32, encoded []byte)

DecodeUint32 decodes len(data) uint32 from encoded using the Stream Vbyte algorithm. encoded must contain exactly len(data) encoded uint32.

func EncodeDeltaInt32

func EncodeDeltaInt32(encoded []byte, data []int32, previous int32) int

EncodeDeltaInt32 encodes data using the Stream VByte algorithm and delta encoding with a step size of 1, i.e. it encodes

delta[n] = data[n] - data[n-1]

where the initial value

data[-1] := previous

followed by zigzag encoding the deltas. The return value is the encoded size. This function assumes that the size of encoded is sufficient to hold the compressed data. Use

encoded := make([]byte, MaxSize32(len(data)))

to obtain a worst case size.

func EncodeDeltaUint32

func EncodeDeltaUint32(encoded []byte, data []uint32, previous uint32) int

EncodeDeltaUint32 encodes data using the Stream VByte algorithm and delta encoding with a step size of 1, i.e. it encodes

delta[n] = data[n] - data[n-1],

where the initial value

data[-1] := previous

The return value is the encoded size. This function assumes that the size of encoded is sufficient to hold the compressed data. Use

encoded := make([]byte, MaxSize32(len(data)))

to obtain a worst case size.

func EncodeInt32

func EncodeInt32(encoded []byte, data []int32) int

EncodeInt32 encodes data using the Stream VByte algorithm with zigzag encoding into encoded and returns the encoded size. This function assumes that the size of encoded is sufficient to hold the compressed data. Use

encoded := make([]byte, MaxSize32(len(data)))

to obtain a worst case size.

func EncodeUint32

func EncodeUint32(encoded []byte, data []uint32) int

EncodeUint32 encodes data using the Stream VByte algorithm into encoded and returns the encoded size. This function assumes that the size of encoded is sufficient to hold the compressed data. Use

encoded := make([]byte, MaxSize32(len(data)))

to obtain a worst case size.

func MaxSize32

func MaxSize32(length int) int

MaxSize32 returns the maximum possible size of an encoded slice of 32-bit integers. Usage:

encoded := make([]byte, MaxSize32(len(data)))

This will ensure that the slice is large enough to hold the encoded data in the worst case of no compression.

Types

This section is empty.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL