hyperloglog

package module
v0.0.0-...-a0107a5 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 27, 2017 License: MIT Imports: 5 Imported by: 22

README

Build Status Coverage Status GoDoc

HyperLogLog and HyperLogLog++

Implements the HyperLogLog and HyperLogLog++ algorithms.

HyperLogLog paper: http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf

HyperLogLog++ paper: http://research.google.com/pubs/pub40671.html

Documentation

Documentation can be found here.

Comparison of Algorithms

The HyperLogLog++ algorithm has much lower error for small cardinalities. This is because it uses a different representation of data for small sets of data. Data generated using this library shows the difference for N < 10000:

HyperLogLog++ also has bias correction which helps offset estimation errors in the original HyperLogLog algorithm. This correction can be seen here, again using data generated using this library:

Future Improvements

  • Right now HLL++ uses 8 bits per register. It could use 6 bits and take less memory.
  • The list compression algorithm could be improved, allowing the sparse representation to be used longer.

Documentation

Overview

Package hyperloglog implements the HyperLogLog and HyperLogLog++ cardinality estimation algorithms. These algorithms are used for accurately estimating the cardinality of a multiset using constant memory. HyperLogLog++ has multiple improvements over HyperLogLog, with a much lower error rate for smaller cardinalities.

HyperLogLog is described here: http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf

HyperLogLog++ is described here: http://research.google.com/pubs/pub40671.html

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Hash32

type Hash32 interface {
	Sum32() uint32
}

type Hash64

type Hash64 interface {
	Sum64() uint64
}

type HyperLogLog

type HyperLogLog struct {
	// contains filtered or unexported fields
}

func New

func New(precision uint8) (*HyperLogLog, error)

New returns a new initialized HyperLogLog.

func (*HyperLogLog) Add

func (h *HyperLogLog) Add(item Hash32)

Add adds a new item to HyperLogLog h.

func (*HyperLogLog) Clear

func (h *HyperLogLog) Clear()

Clear sets HyperLogLog h back to its initial state.

func (*HyperLogLog) Count

func (h *HyperLogLog) Count() uint64

Count returns the cardinality estimate.

func (*HyperLogLog) GobDecode

func (h *HyperLogLog) GobDecode(b []byte) error

Decode gob into a HyperLogLog structure

func (*HyperLogLog) GobEncode

func (h *HyperLogLog) GobEncode() ([]byte, error)

Encode HyperLogLog into a gob

func (*HyperLogLog) Merge

func (h *HyperLogLog) Merge(other *HyperLogLog) error

Merge takes another HyperLogLog and combines it with HyperLogLog h.

type HyperLogLogPlus

type HyperLogLogPlus struct {
	// contains filtered or unexported fields
}

func NewPlus

func NewPlus(precision uint8) (*HyperLogLogPlus, error)

NewPlus returns a new initialized HyperLogLogPlus that uses the HyperLogLog++ algorithm.

func (*HyperLogLogPlus) Add

func (h *HyperLogLogPlus) Add(item Hash64)

Add adds a new item to HyperLogLogPlus h.

func (*HyperLogLogPlus) Clear

func (h *HyperLogLogPlus) Clear()

Clear sets HyperLogLogPlus h back to its initial state.

func (*HyperLogLogPlus) Count

func (h *HyperLogLogPlus) Count() uint64

Count returns the cardinality estimate.

func (*HyperLogLogPlus) GobDecode

func (h *HyperLogLogPlus) GobDecode(b []byte) error

Decode gob into a HyperLogLogPlus structure

func (*HyperLogLogPlus) GobEncode

func (h *HyperLogLogPlus) GobEncode() ([]byte, error)

Encode HyperLogLogPlus into a gob

func (*HyperLogLogPlus) Merge

func (h *HyperLogLogPlus) Merge(other *HyperLogLogPlus) error

Merge takes another HyperLogLogPlus and combines it with HyperLogLogPlus h.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL