Documentation ¶
Overview ¶
hllpp implements the HyperLogLog++ cardinality estimator as specified in the HyperLogLog++ paper http://goo.gl/Z5Sqgu. hllpp uses a built-in non-streaming implementation of murmur3 to hash data as you add it to the estimator.
Example ¶
h := New() h.Add([]byte("barclay")) h.Add([]byte("reginald")) h.Add([]byte("barclay")) h.Add([]byte("broccoli")) fmt.Println(h.Count())
Output: 3
Index ¶
- Constants
- Variables
- func MurmurSum64(data []byte) uint64
- type Config
- type HLLPP
- func (h *HLLPP) Add(v []byte)
- func (h *HLLPP) AddHash(x uint64)
- func (h *HLLPP) AsPipeline() ([]byte, error)
- func (h *HLLPP) Copy() *HLLPP
- func (h *HLLPP) Count() uint64
- func (h *HLLPP) DumpRegisters() []Register
- func (h *HLLPP) IsSparse() bool
- func (h *HLLPP) Marshal() []byte
- func (h *HLLPP) Merge(other *HLLPP) error
- type Register
- type RegisterSlice
Examples ¶
Constants ¶
const ( // Public constants. PipelineDenseDirty = 'd' PipelineDenseClean = 'D' PipelineExplicitDirty = 'e' PipelineExplicitClean = 'E' PipelineSparseClean = 'S' PipelineSparseDirty = 's' )
Variables ¶
var ( // Whether to mark the cardinality calculation as clean or dirty. Clean is // better for production, but dirty is better for testing. WriteDirtyEncoding = false )
Functions ¶
func MurmurSum64 ¶
This is a port of MurmurHash3_x64_128 from MurmurHash3.cpp
Types ¶
type Config ¶
type Config struct { // Precision (p). Must be in the range [4..16]. This value can be used // to adjust the typical relative error of the estimate. Space requirements // grow exponentially as this value is increased. Defaults to 14, the // recommended value, which gives an expected error of about 0.8% Precision uint8 // Precision in sparse mode (p'). Must be in the range [p..25] for this // implementation. This value can be used to adjust the typical relative // error of the estimate when using the sparse representation (typically // for cardinalities below 8000 at p'=20). Lowering p' will allow the // estimator to remain in sparse mode longer, but will increase the relative // error. The HyperLogLog++ paper recommends 20 or 25. Defaults to 20 since // that still gives you a much lower error vs. p=14, but saves a signficant // amount of space vs. p'=25 (20-25% for cardinalities less than 5000). SparsePrecision uint8 }
Config is used to set configurable fields on a HyperLogLog++ via NewWithConfig.
type HLLPP ¶
type HLLPP struct {
// contains filtered or unexported fields
}
HLLPP represents a single HyperLogLog++ estimator. Create one via New(). It is not safe to interact with an HLLPP object from multiple goroutines at once.
func NewWithConfig ¶
NewWithConfig creates a HyperLogLog++ estimator with the given Config.
Example ¶
h, err := NewWithConfig(Config{ Precision: 12, SparsePrecision: 14, }) if err != nil { panic(err) } h.Add([]byte("qapla'")) h.Add([]byte("qapla'")) fmt.Println(h.Count())
Output: 1
func (*HLLPP) Add ¶
Add will hash v and add the result to the HyperLogLog++ estimator h. hllpp uses a built-in non-streaming implementation of murmur3.
func (*HLLPP) AsPipeline ¶
Converts dense or sparse data structure to Pipeline padded struct.
func (*HLLPP) DumpRegisters ¶
func (*HLLPP) Marshal ¶
Marshal serializes h into a byte slice that can be deserialized via Unmarshal. The data is naturally compressed, so don't bother trying to compress it any more.
Example ¶
h := New() h.Add([]byte("hobbledehoyhood")) serialized := h.Marshal() h, err := Unmarshal(serialized) if err != nil { panic(err) } fmt.Println(h.Count())
Output: 1
func (*HLLPP) Merge ¶
Merge turns h into the union of h and other. h and other must have the same p and p' values.
Example ¶
h := New() h.Add([]byte("picard")) h.Add([]byte("janeway")) other := New() other.Add([]byte("picard")) other.Add([]byte("kirk")) err := h.Merge(other) if err != nil { panic(err) } fmt.Println(h.Count())
Output: 3
type RegisterSlice ¶
type RegisterSlice []Register
func (RegisterSlice) Len ¶
func (s RegisterSlice) Len() int
func (RegisterSlice) Less ¶
func (s RegisterSlice) Less(i, j int) bool
func (RegisterSlice) Swap ¶
func (s RegisterSlice) Swap(i, j int)