muse

package module

v0.6.0 Latest Latest Go to latest Published: Feb 6, 2021 License: MIT Imports: 12 Imported by: 0

Details

Valid go.mod file

The Go module system was introduced in Go 1.11 and is the official dependency management solution for Go.
Redistributable license

Redistributable licenses place minimal restrictions on how software can be used, modified, and redistributed.
Tagged version

Modules with tagged versions give importers more predictable builds.
Stable version

When a project reaches major version v1 it is considered stable.
Learn more about best practices

Repository

github.com/aouyang1/go-muse

Links

Open Source Insights

README ¶

go-muse

Golang library for comparing one time series with a group of other labeled time series. This library supports arbitrary slicing and dicing of the labeled time series for a more iterative approach to finding visibly similar timeseries. Comparison between two timeseries is done by z-normalizing both series and cross correlating the two using Fast Fourier Transforms (FFT).

Motivation

A common problem in the operations world is finding all the graphs that look like a particular alert or incident. For example a Site Reliability Engineer (SRE) receives an alert which indicates that something is broken. The SRE generally will open up the graph that triggered the alert which is likely one graph of many in a dashboard. Next, the SRE begins scrolling through this dashboard looking for anything that resembles the waveform of the received alert. Once the SRE has filtered down the set of graphs that looks the original alert graph, he/she begins building a story as to why the alert fired and root causing the incident. This whole process can be time consuming depending on the size and complexity of the dashboards. This library aims to provide a first pass filtering of the existing graphs or time series, so that an engineer can focus just on what looks similar.

This library will filter results down to anything that is positively or negatively correlated with the input reference series. You can limit the number of results returned and also specify a score between 0 to 1 with 1 being perfectly correlated. You can also filter down to a number of samples before and after the input reference series to filter our strong matches that are outside your window of interest.

Installation

$ go get -u github.com/aouyang1/go-muse
$ cd $GOPATH/src/github.com/aouyang1/go-muse
$ make all

Quick start

package main

import (
	"fmt"

	"github.com/aouyang1/go-muse"
	"github.com/matrix-profile-foundation/go-matrixprofile/siggen"
)

func main() {
	sampleRate := 1.0 // once per minute
	duration := 480.0 // minutes

	// create a reference rectangular time series with an amplitude of 1.5 centered
	// at 240 minutes and a width of 10 minutes
	ref := NewSeries(
		siggen.Add(
			siggen.Rect(1.5, 240, 10, sampleRate, duration),
			siggen.Noise(0.1, int(sampleRate*duration)),
		), NewLabels(LabelMap{"graph": "CallTime99Pct", "host": "host1"}),
	)

	// create a comparison group of time series that the reference will query against
	comp := NewGroup("comparison")
	comp.Add(
		ref,
		NewSeries(
			siggen.Add(
				siggen.Rect(1.5, 242, 7, sampleRate, duration),
				siggen.Noise(0.1, int(sampleRate*duration)),
			), NewLabels(LabelMap{"graph": "CallTime99Pct", "host": "host2"}),
		),
		NewSeries(
			siggen.Add(
				siggen.Rect(43, 240, 10, sampleRate, duration),
				siggen.Noise(0.1, int(sampleRate*duration)),
			), NewLabels(LabelMap{"graph": "ErrorRate", "host": "host1"}),
		),
		NewSeries(
			siggen.Add(
				siggen.Line(0, 0.1, int(sampleRate*duration)),
				siggen.Noise(0.1, int(sampleRate*duration)),
			), NewLabels(LabelMap{"graph": "ErrorRate", "host": "host2"}),
		),
	)

	maxLag := 15.0   // minutes
	topN := 4        // top 4 grouped series
	threshold := 0.5 // correlation threshold
	m := NewBatch(ref, comp, NewResults(int(maxLag/sampleRate), topN, threshold))

	// Rank each individual time series in the comparison group
	m.Run(nil)
	fmt.Println(m.Results.Fetch())

	// Rank time series grouped by the graph label
	m.Run([]string{"graph"})
	fmt.Println(m.Results.Fetch())

	// Rank time series grouped by the host label
	m.Run([]string{"host"})
	fmt.Println(m.Results.Fetch())
}

Benchmarks

Benchmark name	NumReps	Time/Rep	Memory/Rep	Alloc/Rep
BenchmarkMuseRun-4	214075	5019 ns/op	1680 B/op	20 allocs/op
BenchmarkMuseRunLarge-4	9	128044546 ns/op	2124970 B/op	405 allocs/op
BenchmarkFilterByLabelValues-4	1736983	705 ns/op	496 B/op	8 allocs/op
BenchmarkIndexLabelValues-4	354844	3198 ns/op	2152 B/op	38 allocs/op
BenchmarkZPad-4	1035133	1180 ns/op	4096 B/op	1 allocs/op
BenchmarkZNormalize-4	789307	1287 ns/op	0 B/op	0 allocs/op
BenchmarkXCorr-4	148	8228987 ns/op	2115418 B/op	7 allocs/op
BenchmarkXCorrWithX-4	226	4910405 ns/op	0 B/op	0 allocs/op

Ran on a 2018 MacBookAir on Apr 19, 2020

    Processor: 1.6 GHz Intel Core i5
       Memory: 8GB 2133 MHz LPDDR3
           OS: macOS Mojave v10.14.2
 Logical CPUs: 4
Physical CPUs: 2

$ make bench

Contributing

Fork the repository
Create a new branch (feature_* or bug_*)for the new feature or bug fix
Run tests
Commit your changes
Push code and open a new pull request

Testing

Run all tests including benchmarks

$ make all

Just run benchmarks

$ make bench

Just run tests

$ make test

Contact

Austin Ouyang (aouyang1@gmail.com)

License

The MIT License (MIT). See LICENSE for more details.

Documentation ¶

Overview ¶

Example ¶

sampleRate := 1.0 // once per minute
duration := 480.0 // minutes

// create a reference rectangular time series with an amplitude of 1.5 centered
// at 240 minutes and a width of 10 minutes
ref := NewSeries(
	siggen.Add(
		siggen.Rect(1.5, 240, 10, sampleRate, duration),
		siggen.Noise(0.1, int(sampleRate*duration)),
	), NewLabels(LabelMap{"graph": "CallTime99Pct", "host": "host1"}),
)

// create a comparison group of time series that the reference will query against
comp := NewGroup("comparison")
comp.Add(
	ref,
	NewSeries(
		siggen.Add(
			siggen.Rect(1.5, 242, 7, sampleRate, duration),
			siggen.Noise(0.1, int(sampleRate*duration)),
		), NewLabels(LabelMap{"graph": "CallTime99Pct", "host": "host2"}),
	),
	NewSeries(
		siggen.Add(
			siggen.Rect(43, 240, 10, sampleRate, duration),
			siggen.Noise(0.1, int(sampleRate*duration)),
		), NewLabels(LabelMap{"graph": "ErrorRate", "host": "host1"}),
	),
	NewSeries(
		siggen.Add(
			siggen.Line(0, 0.1, int(sampleRate*duration)),
			siggen.Noise(0.1, int(sampleRate*duration)),
		), NewLabels(LabelMap{"graph": "ErrorRate", "host": "host2"}),
	),
	NewSeries(
		siggen.Line(0, 0.1, int(sampleRate*duration)),
		NewLabels(LabelMap{"graph": "ErrorRate", "host": "host3"}),
	),
)

maxLag := 15.0   // minutes
topN := 4        // top 4 grouped series
threshold := 0.0 // correlation threshold
m, err := NewBatch(ref, comp, NewResults(int(maxLag/sampleRate), topN, threshold, SignFilter_ANY), 2)
if err != nil {
	panic(err)
}

// Rank each individual time series in the comparison group
m.Run(nil)
res, _ := m.Results.Fetch()
fmt.Println("Unique")
for _, s := range res {
	fmt.Printf("%s, Lag: %d, Score: %.3f\n", s.Labels.ID(s.Labels.Keys()), s.Lag, s.PercentScore)
}

// Rank time series grouped by the graph label
m.Run([]string{"graph"})
res, _ = m.Results.Fetch()
fmt.Println("By Graph")
for _, s := range res {
	fmt.Printf("%s, Lag: %d, Score: %.3f\n", s.Labels.ID(s.Labels.Keys()), s.Lag, s.PercentScore)
}

// Rank time series grouped by the host label
m.Run([]string{"host"})
res, _ = m.Results.Fetch()
fmt.Println("By Host")
for _, s := range res {
	fmt.Printf("%s, Lag: %d, Score: %.3f\n", s.Labels.ID(s.Labels.Keys()), s.Lag, s.PercentScore)
}

Output:

Unique
graph:CallTime99Pct,host:host1, Lag: 0, Score: 1.000
graph:ErrorRate,host:host1, Lag: 0, Score: 0.991
graph:CallTime99Pct,host:host2, Lag: -3, Score: 0.822
graph:ErrorRate,host:host3, Lag: 0, Score: 0.000
By Graph
graph:CallTime99Pct,host:host1, Lag: 0, Score: 1.000
graph:ErrorRate,host:host1, Lag: 0, Score: 0.991
By Host
graph:CallTime99Pct,host:host1, Lag: 0, Score: 1.000
graph:CallTime99Pct,host:host2, Lag: -3, Score: 0.822
graph:ErrorRate,host:host3, Lag: 0, Score: 0.000

Index ¶

Constants
type Batch
- func NewBatch(ref *Series, comp *Group, results *Results, cc int) (*Batch, error)
- func (b *Batch) Run(groupByLabels []string) error
type Group
- func NewGroup(name string) *Group
- func (g *Group) Add(series ...*Series) error
- func (g *Group) FilterByLabelValues(labels *Labels) []*Series
- func (g *Group) Length() int
type LabelMap
type Labels
- func NewLabels(labels LabelMap) *Labels
- func (l *Labels) Get(key string) (string, bool)
- func (l Labels) ID(labels []string) string
- func (l *Labels) Keys() []string
- func (l *Labels) Len() int
type Muse
- func New(ref *Series, results *Results) (*Muse, error)
- func (m *Muse) Run(compGraphs []*Series) error
type Results
- func NewResults(maxLag int, topN int, threshold float64, signFilter SignFilter) *Results
- func (r *Results) Fetch() (Scores, float64)
- func (r *Results) Update(s Score)
type Score
type Scores
- func (s Scores) Len() int
- func (s Scores) Less(i, j int) bool
- func (s *Scores) Pop() interface{}
- func (s *Scores) Push(x interface{})
- func (s Scores) Swap(i, j int)
type Series
- func NewSeries(y []float64, labels *Labels) *Series
- func (s *Series) Labels() *Labels
- func (s *Series) Length() int
- func (s *Series) UID() string
- func (s *Series) Values() []float64
type SignFilter

Examples ¶

Package

Constants ¶

View Source

const (
	SignFilter_POS = 1
	SignFilter_NEG = -1
	SignFilter_ANY = 0
)

View Source

const (
	// DefaultLabel is the label name if a series is specified without any labels
	DefaultLabel = "uid"
)

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Batch ¶ added in v0.4.0

type Batch struct {
	Comparison  *Group
	Results     *Results
	Concurrency int
	// contains filtered or unexported fields
}

Batch is used to setup and run a z-normalized cross correlation between a reference series against each individual comparison series while tracking the resulting scores

func NewBatch ¶ added in v0.4.0

func NewBatch(ref *Series, comp *Group, results *Results, cc int) (*Batch, error)

NewBatch creates a new Muse instance with a set reference timeseries, a comparison group of timeseries, and results

func (*Batch) Run ¶ added in v0.4.0

func (b *Batch) Run(groupByLabels []string) error

Run calculates the top N graphs with the highest scores given a reference time series and a group of comparison time series. Number of scores will be the number of unique labels specified in the input. If no groupByLabels is specified, then each timeseries will receive its own score.

type Group ¶

type Group struct {
	Name string
	// contains filtered or unexported fields
}

Group is a collection of timeseries keeping track of all labeled timeseries, All timeseries must be unique regarding their label value pairs

func NewGroup ¶

func NewGroup(name string) *Group

NewGroup creates a new Group and initializes the timeseries label registry

func (*Group) Add ¶

func (g *Group) Add(series ...*Series) error

Add will register a time series with its labels into the current groups registry. If the timeseries with the exact same label values already exists, an error will be returned

func (*Group) FilterByLabelValues ¶

func (g *Group) FilterByLabelValues(labels *Labels) []*Series

FilterByLabelValues returns the slice of timeseries filtered by specified label value pairs

func (*Group) Length ¶

func (g *Group) Length() int

Length returns the length of all timeseries. All timeseries have the same length

type LabelMap ¶

type LabelMap map[string]string

LabelMap is a map of label keys to values

type Labels ¶

type Labels struct {
	// contains filtered or unexported fields
}

Labels is a map of label names to label values

func NewLabels ¶

func NewLabels(labels LabelMap) *Labels

NewLabels creates a Label storing the map and sorted keys

func (*Labels) Get ¶

func (l *Labels) Get(key string) (string, bool)

Get returns the value of a specified key. Returns an empty string if the key is not present

func (Labels) ID ¶

func (l Labels) ID(labels []string) string

ID constructs the unique identifier based on an input set of labels. This does not have to be all the unique label names. Format will have the following "key1:val1,key2:val2" and so on

func (*Labels) Keys ¶

func (l *Labels) Keys() []string

Keys returns the sorted keys of the labels

func (*Labels) Len ¶

func (l *Labels) Len() int

Len returns the number of labels

type Muse ¶

type Muse struct {
	Results *Results
	// contains filtered or unexported fields
}

Muse is the primary struct to setup and run a z-normalized cross correlation between a reference series against an individual comparison series while tracking the resulting scores

func New ¶

func New(ref *Series, results *Results) (*Muse, error)

New creates a new Muse instance with a set reference timeseries, and a comparison timeseries, and results

func (*Muse) Run ¶

func (m *Muse) Run(compGraphs []*Series) error

Run compares a single comparison series against the reference series and updates the score results

type Results ¶

type Results struct {
	sync.Mutex
	MaxLag     int
	TopN       int
	Threshold  float64
	SignFilter SignFilter
	// contains filtered or unexported fields
}

Results tracks the top scores in sorted order given a specified maximum lag, top N and score threshold

func NewResults ¶

func NewResults(maxLag int, topN int, threshold float64, signFilter SignFilter) *Results

NewResults creates a new instance of results to track the top similar graphs

func (*Results) Fetch ¶

func (r *Results) Fetch() (Scores, float64)

Fetch returns the sorted scores in ascending order along with the average absolute percent score

func (*Results) Update ¶ added in v0.4.0

func (r *Results) Update(s Score)

Update records the input score

type Score ¶

type Score struct {
	Labels       *Labels `json:"labels"`
	Lag          int     `json:"lag"`
	PercentScore float64 `json:"percentScore"`
}

Score keeps track of the cross correlation score and the related series

type Scores ¶

type Scores []Score

Scores is a slice of individual Score

func (Scores) Len ¶

func (s Scores) Len() int

func (Scores) Less ¶

func (s Scores) Less(i, j int) bool

func (*Scores) Pop ¶

func (s *Scores) Pop() interface{}

Pop implements the function in the heap interface

func (*Scores) Push ¶

func (s *Scores) Push(x interface{})

Push implements the function in the heap interface

func (Scores) Swap ¶

func (s Scores) Swap(i, j int)

type Series ¶

type Series struct {
	// contains filtered or unexported fields
}

Series is the general representation of timeseries containing only values.

func NewSeries ¶

func NewSeries(y []float64, labels *Labels) *Series

NewSeries creates a new Series with a set of labels. If not labels are specified a unique ID is automatically generated

func (*Series) Labels ¶

func (s *Series) Labels() *Labels

Labels returns the map of label to values for the timeseries

func (*Series) Length ¶

func (s *Series) Length() int

Length returns the length of the timeseries

func (*Series) UID ¶

func (s *Series) UID() string

UID generates the unique identifier string that represents this particular timeseries. This must be unique within a timeseries Group

func (*Series) Values ¶

func (s *Series) Values() []float64

Values returns the y or series values

type SignFilter ¶ added in v0.6.0

type SignFilter int

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL