README
go-matrixprofile
Golang library for computing matrix profiles and matrix profile indexes. Features also include time series discords, time series segmentation, and motif discovery after computing the matrix profile. Visit The UCR Matrix Profile Page for more details on matrix profiles.
Features:
- STMP
- STAMP (parallelized)
- STAMPI
- STOMP (parallelized)
- mSTOMP
- TopKMotifs - finds the top K motifs from a computed matrix profile
- TopKDiscords - finds the top K discords from a computed matrix profile
- Segment - computes the corrected arc curve for time series segmentation
- Annotation Vectors
- Complexity
- Mean Standard Deviation
- Clipping
Contents
- Installation
- Quick start
- Case Studies
- Benchmarks
- Contributing
- Testing
- Other Libraries
- Contact
- License
- Citations
Installation
$ go get -u github.com/aouyang1/go-matrixprofile/matrixprofile
$ cd $GOPATH/src/github.com/aouyang1/go-matrixprofile
$ make setup
Quick start
$ cat example_mp.go
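package main

import (
	"fmt"

	"github.com/aouyang1/go-matrixprofile/matrixprofile"
)

func main() {
	// Three nearly identical pulses, so each subsequence has a close match.
	sig := []float64{0, 0.99, 1, 0, 0, 0.98, 1, 0, 0, 0.96, 1, 0}

	// Passing nil as the second series compares sig against itself,
	// using a subsequence length of 4.
	mp, err := matrixprofile.New(sig, nil, 4)
	if err != nil {
		panic(err)
	}

	// Compute the matrix profile with STOMP using a parallelism of 1.
	if err = mp.Stomp(1); err != nil {
		panic(err)
	}

	fmt.Printf("Signal: %.3f\n", sig)
	fmt.Printf("Matrix Profile: %.3f\n", mp.MP)
	fmt.Printf("Profile Index: %5d\n", mp.Idx)
}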
$ go run example_mp.go
Signal: [0.000 0.990 1.000 0.000 0.000 0.980 1.000 0.000 0.000 0.960 1.000 0.000]
Matrix Profile: [0.014 0.014 0.029 0.029 0.014 0.014 0.029 0.029 0.029]
Profile Index: [ 4 5 6 7 0 1 2 3 4]
Case studies
Matrix Profile
Going through a completely synthetic scenario, we'll cover what features to look for in a matrix profile, and what the additional TopKDiscords, TopKMotifs, and Segment results tell us. We first generate a fake signal composed of sine waves, noise, and sawtooth waves. We then run STOMP on the signal to calculate the matrix profile and matrix profile indexes.
subsequence length: 32
- signal: This shows our raw data. There are several oddities and patterns that can be seen here.
- matrix profile: Running STOMP on this signal generates both the matrix profile and the matrix profile index. The matrix profile contains several spikes, which indicate potential time series discords, or anomalies, in the time series.
- corrected arc curve: This shows the segmentation of the time series. The two lowest dips around index 420 and 760 indicate potential state changes in the time series. At 420 we see the sinusoidal wave move into a more pulsed pattern. At 760 we see the pulsed pattern move into a sawtooth pattern.
- discords: The discords graph shows the top 3 potential discords of the defined subsequence length, m, based on the 3 highest peaks in the matrix profile. These are mostly composed of noise.
- motifs: These represent the top 6 motifs found in the time series. The first is the initial sine wave pattern. The second is the sinusoidal pulse. The third occurs during the pulsed sequence, on the fall from the pulse to the noise. The fourth occurs during the pulsed sequence, on the rise from the noise to the pulse. The fifth is the sawtooth pattern.
The code to generate the graph can be found in this example.
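The discords bullet above picks the highest peaks in the matrix profile. A minimal sketch of that selection, independent of this library, could look like the following. The function topKDiscords and its exclusion parameter are hypothetical and are not the library's TopKDiscords. The sketch assumes that masking a window around each chosen peak is enough to keep one anomaly from being reported several times.

```go
package main

import (
	"fmt"
	"math"
)

// topKDiscords is a hypothetical helper. It returns the start indexes of up
// to k of the highest values in mp, highest first. After each pick, every
// position within `exclusion` of it is masked so overlapping subsequences
// are not selected again.
func topKDiscords(mp []float64, k, exclusion int) []int {
	work := make([]float64, len(mp))
	copy(work, mp)

	var discords []int
	for len(discords) < k {
		best, bestVal := -1, math.Inf(-1)
		for i, v := range work {
			if v > bestVal {
				best, bestVal = i, v
			}
		}
		if best < 0 {
			break // every position is already masked
		}
		discords = append(discords, best)

		lo, hi := best-exclusion, best+exclusion
		if lo < 0 {
			lo = 0
		}
		if hi > len(work)-1 {
			hi = len(work) - 1
		}
		for i := lo; i <= hi; i++ {
			work[i] = math.Inf(-1)
		}
	}
	return discords
}

func main() {
	// Made-up profile values, used only to exercise the helper.
	mp := []float64{0.1, 0.2, 3.0, 2.9, 0.3, 0.2, 2.0, 0.1, 0.4, 1.5}
	fmt.Println(topKDiscords(mp, 3, 2)) // prints [2 6 9]
}
```

Index 3 has the second highest value but is skipped because it overlaps the first pick at index 2. Searching for the lowest values instead of the highest gives the analogous starting point for motifs.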
Multi-Dimensional Matrix Profile
Based on [4], we can extend the matrix profile algorithm to the multi-dimensional scenario.
subsequence length: 25
- signal 0-2: the 3 time series dimensions
- matrix profile 0-2: the k-dimensional matrix profile, representing the choice of k out of d time series. Minima in matrix profile 1 represent motifs that span 2 of the 3 available time series at that time. Minima in matrix profile 2 represent motifs that span all 3 time series at that time.
The plots can be generated by running
$ make example
go test ./... -run=Example
ok github.com/aouyang1/go-matrixprofile/matrixprofile 0.260s
ok github.com/aouyang1/go-matrixprofile/siggen 0.007s [no tests to run]
Two PNG files, mp_sine.png and mp_kdim.png, will be saved in the top level directory of the repository.
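To make the k-dimensional profiles described above more concrete, the sketch below shows one way to combine per-dimension distances for a single pair of subsequences. It assumes the aggregation described in [4]: sort the distances across dimensions in ascending order and take the running mean, so that entry k uses the k+1 best-matching dimensions. The helper kDimDistances is hypothetical and is not taken from this library's mSTOMP code.

```go
package main

import (
	"fmt"
	"sort"
)

// kDimDistances is a hypothetical helper. d holds one distance per
// dimension for the same pair of subsequences. The result at position k is
// the mean of the k+1 smallest distances, so out[0] uses only the best
// matching dimension and out[len(d)-1] uses all of them.
func kDimDistances(d []float64) []float64 {
	sorted := make([]float64, len(d))
	copy(sorted, d)
	sort.Float64s(sorted)

	out := make([]float64, len(sorted))
	var sum float64
	for k, v := range sorted {
		sum += v
		out[k] = sum / float64(k+1)
	}
	return out
}

func main() {
	// Made-up distances for 3 dimensions.
	fmt.Println(kDimDistances([]float64{6, 1, 2})) // prints [1 1.5 3]
}
```

Under this assumption, a pattern that appears in only two of three dimensions keeps a low value at position 1 but is pulled up at position 2 by the dimension that does not match.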
Benchmarks
Benchmark name | NumReps | Time/Rep | Memory/Rep | Alloc/Rep |
---|---|---|---|---|
BenchmarkMStomp-4 | 50 | 34214278 ns/op | 7320233 B/op | 226597 allocs/op |
BenchmarkZNormalize-4 | 10000000 | 192 ns/op | 256 B/op | 1 allocs/op |
BenchmarkMovmeanstd-4 | 50000 | 26181 ns/op | 65537 B/op | 4 allocs/op |
BenchmarkCrossCorrelate-4 | 10000 | 144817 ns/op | 49179 B/op | 3 allocs/op |
BenchmarkMass-4 | 10000 | 151626 ns/op | 49444 B/op | 4 allocs/op |
BenchmarkDistanceProfile-4 | 10000 | 152649 ns/op | 49444 B/op | 4 allocs/op |
BenchmarkCalculateDistanceProfile-4 | 200000 | 10825 ns/op | 2 B/op | 0 allocs/op |
BenchmarkStmp/m32_pts1k-4 | 5 | 304208563 ns/op | 97396022 B/op | 7883 allocs/op |
BenchmarkStmp/m128_pts1k-4 | 5 | 289090801 ns/op | 94091318 B/op | 7499 allocs/op |
BenchmarkStamp/m32_p2_pts1k-4 | 10 | 164346009 ns/op | 97498451 B/op | 7897 allocs/op |
BenchmarkStomp/m32_p1_pts1k-4 | 50 | 34757074 ns/op | 252571 B/op | 22 allocs/op |
BenchmarkStomp/m128_p1_pts1k-4 | 50 | 35568253 ns/op | 252348 B/op | 21 allocs/op |
BenchmarkStomp/m128_p2_pts1k-4 | 100 | 18485672 ns/op | 397347 B/op | 31 allocs/op |
BenchmarkStomp/m128_p2_pts2k-4 | 20 | 73567121 ns/op | 816958 B/op | 32 allocs/op |
BenchmarkStomp/m128_p2_pts5k-4 | 10 | 470744655 ns/op | 2114579 B/op | 33 allocs/op |
BenchmarkStampUpdate-4 | 10 | 130760106 ns/op | 1933241 B/op | 25 allocs/op |
Ran on a 2018 MacBook Air on Jan 08, 2019
Processor: 1.6 GHz Intel Core i5
Memory: 8GB 2133 MHz LPDDR3
OS: macOS Mojave v10.14.2
Logical CPUs: 4
Physical CPUs: 2
$ make bench
Contributing
- Fork the repository
- Create a new branch (feature_* or bug_*) for the new feature or bug fix
- Run tests
- Commit your changes
- Push code and open a new pull request
Testing
Run all tests including benchmarks
$ make all
Just run benchmarks
$ make bench
Just run tests
$ make test
Other libraries
Contact
- Austin Ouyang (aouyang1@gmail.com)
License
The MIT License (MIT). See LICENSE for more details.
Copyright (c) 2018 Austin Ouyang
Citations
[1] Chin-Chia Michael Yeh, Yan Zhu, Liudmila Ulanova, Nurjahan Begum, Yifei Ding, Hoang Anh Dau, Diego Furtado Silva, Abdullah Mueen, Eamonn Keogh (2016). Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View that Includes Motifs, Discords and Shapelets. IEEE ICDM 2016.
[2] Yan Zhu, Zachary Zimmerman, Nader Shakibay Senobari, Chin-Chia Michael Yeh, Gareth Funning, Abdullah Mueen, Philip Brisk, Eamonn Keogh (2016). Matrix Profile II: Exploiting a Novel Algorithm and GPUs to Break the One Hundred Million Barrier for Time Series Motifs and Joins. IEEE ICDM 2016.
[3] Hoang Anh Dau and Eamonn Keogh (2017). Matrix Profile V: A Generic Technique to Incorporate Domain Knowledge into Motif Discovery. KDD 2017.
[4] Chin-Chia Michael Yeh, Nickolas Kavantzas, Eamonn Keogh (2017). Matrix Profile VI: Meaningful Multidimensional Motif Discovery. ICDM 2017.
[5] Shaghayegh Gharghabi, Yifei Ding, Chin-Chia Michael Yeh, Kaveh Kamgar, Liudmila Ulanova, Eamonn Keogh (2017). Matrix Profile VIII: Domain Agnostic Online Semantic Segmentation at Superhuman Performance Levels. ICDM 2017.
Directories
Path | Synopsis |
---|---|
matrixprofile | Package matrixprofile computes the matrix profile and matrix profile index of a time series |
siggen | Package siggen provides basic timeseries generation wrappers |