count-min-log: github.com/seiflotfy/count-min-log Index | Files

package cml

import "github.com/seiflotfy/count-min-log"

Package cml implments the Count-Min-Log datastructure.

It is based on Count-Min-Log sketch: Approximately counting with approximate counters - Guillaume Pitel & Geoffroy Fouquier

http://iswag-symposium.org/2015/pdfs/shortpaper1.pdf

TL;DR: Count-Min-Log Sketch for improved Average Relative Error on low frequency events

Count-Min Sketch is a widely adopted algorithm for approximate event counting in large scale processing. However, the original version of the Count-Min-Sketch (CMS) suffers of some deficiences, especially if one is interested in the low-frequency items, such as in text- mining related tasks. Several variants of CMS have been proposed to compensate for the high relative error for low-frequency events, but the proposed solutions tend to correct the errors instead of preventing them. In this paper, we propose the Count-Min-Log sketch, which uses logarithm-based, approximate counters instead of linear counters to improve the average relative error of CMS at constant memory footprint.

Index

Package Files

doc.go log.go utils.go

type Sketch Uses

type Sketch struct {
    // contains filtered or unexported fields
}

Sketch is a Count-Min-Log Sketch 16-bit registers

func NewForCapacity16 Uses

func NewForCapacity16(capacity uint64, e float64) (*Sketch, error)

NewForCapacity16 returns a new Count-Min-Log Sketch with 16-bit registers optimized for a given max capacity and expected error rate

func NewSketch Uses

func NewSketch(w uint, d uint, exp float64) (*Sketch, error)

NewSketch returns a new Count-Min-Log Sketch with 16-bit registers

func NewSketchForEpsilonDelta Uses

func NewSketchForEpsilonDelta(epsilon, delta float64) (*Sketch, error)

NewSketchForEpsilonDelta for a given error rate epsiolen with a probability of delta

func (*Sketch) BulkUpdate Uses

func (cml *Sketch) BulkUpdate(e []byte, freq uint) bool

BulkUpdate increases the count of `s` by one, return true if added and the current count of `s`

func (*Sketch) Query Uses

func (cml *Sketch) Query(e []byte) float64

Query returns the count of `e`

func (*Sketch) Update Uses

func (cml *Sketch) Update(e []byte) bool

Update increases the count of `s` by one, return true if added and the current count of `s`

Package cml imports 4 packages (graph) and is imported by 3 packages. Updated 2017-02-19. Refresh now. Tools for package owners.