Documentation ¶
Overview ¶
Package stats implements an online computation of summary statistics.
The central idea of this package is that samples arrive in real time, and descriptive statistics for the shape of the distribution (the mean, variance, and range) must be available on demand. Rather than keeping an array of values, online algorithms update the internal state of the descriptive statistics as each sample arrives, saving memory.
To track statistics in an online fashion, you need to keep track of the aggregates that are used to compute the final descriptive statistics of the distribution. For simple statistics such as the minimum, maximum, standard deviation, and mean, you need to track the number of samples, the sum of samples, and the sum of the squares of all samples (along with the minimum and maximum values seen).
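As a rough illustration of the aggregates described above, the following sketch tracks the count, sum, sum of squares, minimum, and maximum for a stream of float64 samples. The onlineStats type and its methods are hypothetical names, not the package's implementation. The sample (n-1) variance formula is also an assumption, since this documentation does not say which estimator the package uses.

```go
package main

import (
	"fmt"
	"math"
)

// onlineStats is a hypothetical stand-in for the aggregates described above;
// it is not the package's actual Statistics type.
type onlineStats struct {
	n        uint64  // number of samples seen
	total    float64 // sum of samples
	squares  float64 // sum of squared samples
	min, max float64 // extremes seen so far
}

// update folds one sample into the aggregates without storing it.
func (s *onlineStats) update(x float64) {
	if s.n == 0 || x < s.min {
		s.min = x
	}
	if s.n == 0 || x > s.max {
		s.max = x
	}
	s.n++
	s.total += x
	s.squares += x * x
}

// mean returns total/n, or 0.0 when no samples have been seen.
func (s *onlineStats) mean() float64 {
	if s.n == 0 {
		return 0.0
	}
	return s.total / float64(s.n)
}

// variance returns the sample variance (assumed n-1 denominator) computed
// from the aggregates, or 0.0 when fewer than two samples have been seen.
func (s *onlineStats) variance() float64 {
	if s.n < 2 {
		return 0.0
	}
	n := float64(s.n)
	return (n*s.squares - s.total*s.total) / (n * (n - 1))
}

// stddev is the square root of the variance.
func (s *onlineStats) stddev() float64 { return math.Sqrt(s.variance()) }

func main() {
	s := new(onlineStats)
	for _, x := range []float64{2, 4, 4, 4, 5, 5, 7, 9} {
		s.update(x)
	}
	fmt.Println(s.mean(), s.variance(), s.stddev(), s.min, s.max)
}
```

Memory use stays constant no matter how many samples are passed in. Note that this textbook sum-of-squares formula can lose precision when the mean is large relative to the spread of the samples.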
The primary entry point into this package is the Update method, where you pass sample values. All other methods simply compute statistics from the state that Update maintains.
Index ¶
- type Benchmark
- func (s *Benchmark) Append(o *Benchmark)
- func (s *Benchmark) Fastest() time.Duration
- func (s *Benchmark) Mean() time.Duration
- func (s *Benchmark) Range() time.Duration
- func (s *Benchmark) Serialize() map[string]interface{}
- func (s *Benchmark) SetDuration(duration time.Duration)
- func (s *Benchmark) Slowest() time.Duration
- func (s *Benchmark) StdDev() time.Duration
- func (s *Benchmark) Throughput() float64
- func (s *Benchmark) Timeouts() uint64
- func (s *Benchmark) Total() time.Duration
- func (s *Benchmark) Update(durations ...time.Duration)
- func (s *Benchmark) Variance() time.Duration
- type Statistics
- func (s *Statistics) Append(o *Statistics)
- func (s *Statistics) Maximum() float64
- func (s *Statistics) Mean() float64
- func (s *Statistics) Minimum() float64
- func (s *Statistics) N() uint64
- func (s *Statistics) Range() float64
- func (s *Statistics) Serialize() map[string]float64
- func (s *Statistics) StdDev() float64
- func (s *Statistics) Total() float64
- func (s *Statistics) Update(samples ...float64)
- func (s *Statistics) Variance() float64
Types ¶
type Benchmark ¶
type Benchmark struct {
	sync.RWMutex
	Statistics
	// contains filtered or unexported fields
}
Benchmark keeps track of a distribution of durations, e.g. to benchmark the performance or timing of an operation. It returns descriptive statistics as durations so that they can be read as timings. Benchmark works in an online fashion similar to the Statistics object, but works on time.Duration samples instead of floats. Instead of minimum and maximum values it returns the fastest and slowest times.
The primary entry point to the object is via the Update method, where one or more time.Durations can be passed. This object is thread-safe (via a sync.RWMutex), so its fields are unexported and all properties must be accessed through the read-locked accessor methods.
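The locking pattern described above can be sketched as follows. The safeStats type, its methods, and the runConcurrent helper are hypothetical names used only for illustration; the sketch assumes that writers take the full lock and accessors take the read lock, which is the usual way to use a sync.RWMutex.

```go
package main

import (
	"fmt"
	"sync"
)

// safeStats is a hypothetical type that shows the locking pattern only.
type safeStats struct {
	sync.RWMutex
	n     uint64
	total float64
}

// update takes the write lock because it mutates the aggregates.
func (s *safeStats) update(samples ...float64) {
	s.Lock()
	defer s.Unlock()
	for _, x := range samples {
		s.n++
		s.total += x
	}
}

// mean takes only the read lock, so concurrent readers do not block each other.
func (s *safeStats) mean() float64 {
	s.RLock()
	defer s.RUnlock()
	if s.n == 0 {
		return 0.0
	}
	return s.total / float64(s.n)
}

// count returns the number of samples under the read lock.
func (s *safeStats) count() uint64 {
	s.RLock()
	defer s.RUnlock()
	return s.n
}

// runConcurrent updates a single safeStats from several goroutines.
func runConcurrent(workers, perWorker int) *safeStats {
	s := new(safeStats)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				s.update(1.0)
			}
		}()
	}
	wg.Wait()
	return s
}

func main() {
	s := runConcurrent(8, 1000)
	fmt.Println(s.count(), s.mean())
}
```

Reading the fields directly, without going through a locked accessor, would race with concurrent updates, which is why the fields are kept unexported.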
Example ¶
stats := new(Benchmark)
samples, _ := loadBenchData()

for _, sample := range samples {
	stats.Update(sample)
}

data, _ := json.MarshalIndent(stats.Serialize(), "", " ")
fmt.Println(string(data))
Output:
{
 "duration": "0s",
 "fastest": "41.219436ms",
 "mean": "120.993689ms",
 "range": "167.175236ms",
 "samples": 1000000,
 "slowest": "208.394672ms",
 "stddev": "17.283562ms",
 "throughput": 8.264893850648656,
 "timeouts": 0,
 "total": "33h36m33.689461785s",
 "variance": "298.721µs"
}
func (*Benchmark) Append ¶
Append another benchmark object to the current benchmark object, incrementing the distribution from the other object.
func (*Benchmark) Fastest ¶
Fastest returns the minimum value of durations seen. If no durations have been added to the dataset, then this function returns a zero duration.
func (*Benchmark) Mean ¶
Mean computes the average of all durations in float64 seconds and returns it as a time.Duration, which is expressed in int64 nanoseconds. This conversion can lose some precision in the mean value, but it also allows the caller to express the mean in varying timescales. Since microseconds are already a fine granularity for timings, truncating fractional nanoseconds seems acceptable.
If no durations have been recorded, a zero-valued duration is returned.
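The conversion described above might look like the following sketch. The meanDuration helper is a hypothetical name, and the exact arithmetic used by the package is an assumption; the point is only to show where the fractional nanoseconds are truncated.

```go
package main

import (
	"fmt"
	"time"
)

// meanDuration is a hypothetical helper: it averages in float64 seconds and
// converts the result back to a time.Duration (int64 nanoseconds).
func meanDuration(total time.Duration, n uint64) time.Duration {
	if n == 0 {
		return 0 // no durations recorded
	}
	seconds := total.Seconds() / float64(n)
	// The conversion to time.Duration truncates any fractional nanoseconds.
	return time.Duration(seconds * float64(time.Second))
}

func main() {
	fmt.Println(meanDuration(10*time.Second, 4))
	fmt.Println(meanDuration(10*time.Nanosecond, 3))
}
```

The returned time.Duration can then be read in whichever timescale suits the caller, for example via its Seconds or Milliseconds methods.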
func (*Benchmark) Range ¶
Range returns the difference between the slowest and fastest durations. If no samples have been added to the dataset, this function returns a zero duration. It also returns zero if the fastest and slowest durations are equal, e.g. when only one duration has been recorded or all durations have the same value.
func (*Benchmark) Serialize ¶
Serialize returns a map of summary statistics. This map is useful for dumping statistics to disk (using JSON for example) or for reporting the statistics elsewhere. The values in the maps are string representations of the time.Duration objects, which are reported in a human readable form. They can be converted back to durations with time.ParseDuration.
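A minimal sketch of the round trip back to durations, assuming the map holds duration strings such as those in the example output above. The parseField helper and the hand-written map are hypothetical; only time.ParseDuration is a real standard library call.

```go
package main

import (
	"fmt"
	"time"
)

// parseField is a hypothetical helper that converts one serialized duration
// string back into a time.Duration.
func parseField(data map[string]interface{}, key string) (time.Duration, error) {
	s, ok := data[key].(string)
	if !ok {
		return 0, fmt.Errorf("field %q is not a duration string", key)
	}
	return time.ParseDuration(s)
}

func main() {
	// Hand-written stand-in for a serialized map, not real package output.
	data := map[string]interface{}{
		"fastest": "41.219436ms",
		"samples": 1000000,
	}

	fastest, err := parseField(data, "fastest")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(fastest, fastest.Nanoseconds())

	// Non-duration fields such as "samples" are reported as errors.
	if _, err := parseField(data, "samples"); err != nil {
		fmt.Println(err)
	}
}
```

The type assertion matters because the map mixes duration strings with numeric fields such as the sample count.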
TODO: Create Dump and Load functions to get statistical data to and from offline sources.
func (*Benchmark) SetDuration ¶
SetDuration allows the duration to be set externally. This is especially useful when multiple threads are updating the benchmark and the internal measurement of total time might double count concurrent accesses. In fact, it is strongly recommended that the external measurer call this method after all updating is complete.
func (*Benchmark) Slowest ¶
Slowest returns the maximum value of durations seen. If no durations have been added to the dataset, then this function returns a zero duration.
func (*Benchmark) StdDev ¶
StdDev returns the standard deviation of samples, the square root of the variance. This function returns a time.Duration, which entails a loss in precision when converting from float64 seconds to int64 nanoseconds.
If fewer than two durations were recorded, a zero-valued duration is returned.
func (*Benchmark) Throughput ¶
Throughput returns the number of samples per second, measured as the inverse mean: the number of samples divided by the total duration in seconds. The duration is determined in one of two ways:
- if SetDuration is called, that duration is used
- otherwise, the total number of observed seconds is used
This metric does not express a duration, so a float64 value is returned instead. If the duration or number of accesses is zero, 0.0 is returned.
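The rule above can be sketched as a small function. The throughput helper and its parameters are hypothetical, and treating a non-zero external duration as "SetDuration was called" is an assumption made for this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// throughput is a hypothetical helper mirroring the rule described above.
// external stands for a duration set by the caller (zero if never set) and
// observed stands for the sum of all recorded durations.
func throughput(n uint64, external, observed time.Duration) float64 {
	d := observed
	if external > 0 {
		d = external // an externally set duration takes precedence
	}
	if n == 0 || d == 0 {
		return 0.0
	}
	return float64(n) / d.Seconds()
}

func main() {
	// No external duration: fall back to the observed total.
	fmt.Println(throughput(10, 0, 2*time.Second))
	// External duration set: it overrides the observed total.
	fmt.Println(throughput(10, time.Second, 2*time.Second))
}
```

When several workers record durations concurrently, the observed total can exceed the elapsed wall-clock time, so the externally set duration usually gives the more meaningful rate.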
func (*Benchmark) Update ¶
Update the benchmark with a duration or durations (thread-safe). If a duration of 0 is passed, it is interpreted as a timeout, i.e. a maximal duration bound was reached. Timeouts are recorded in a separate counter and can be used to express failure measures.
func (*Benchmark) Variance ¶
Variance computes the variability of samples and describes the distance of the distribution from the mean. This function returns a time.Duration, which can mean a loss of precision below the microsecond level. This is usually acceptable for most applications.
If fewer than two durations were recorded, a zero-valued duration is returned.
type Statistics ¶
Statistics keeps track of descriptive statistics in an online fashion at runtime without saving each individual sample in an array. It does this by updating the internal state of summary aggregates, including the number of samples seen, the sum of values, and the sum of the squared values. It also tracks the minimum and maximum values seen.
The primary entry point to the object is via the Update method, where one or more samples can be passed. This object is thread-safe (via a sync.RWMutex), so its fields are unexported and all properties must be accessed through the read-locked accessor methods.
Example ¶
stats := new(Statistics)
samples, _ := loadTestData()

for _, sample := range samples {
	stats.Update(sample)
}

data, _ := json.MarshalIndent(stats.Serialize(), "", " ")
fmt.Println(string(data))
Output:
{
 "maximum": 5.30507026071,
 "mean": 0.00041124313405184064,
 "minimum": -4.72206033824,
 "range": 10.02713059895,
 "samples": 1000000,
 "stddev": 0.9988808397330513,
 "total": 411.2431340518406,
 "variance": 0.9977629319858057
}
func (*Statistics) Append ¶
func (s *Statistics) Append(o *Statistics)
Append another statistics object to the current statistics object, incrementing the distribution from the other object.
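Because the state consists only of additive aggregates plus extremes, combining two objects can be sketched as below. The aggregates type and its merge method are hypothetical names; the package's own Append may differ in detail, for example in how it handles locking.

```go
package main

import "fmt"

// aggregates is a hypothetical stand-in for the tracked summary state.
type aggregates struct {
	n        uint64
	total    float64
	squares  float64
	min, max float64
}

// merge folds the aggregates of o into a, as if every sample seen by o had
// been passed to a directly.
func (a *aggregates) merge(o *aggregates) {
	if o == nil || o.n == 0 {
		return // nothing to add
	}
	if a.n == 0 || o.min < a.min {
		a.min = o.min
	}
	if a.n == 0 || o.max > a.max {
		a.max = o.max
	}
	a.n += o.n
	a.total += o.total
	a.squares += o.squares
}

func main() {
	a := &aggregates{n: 2, total: 3, squares: 5, min: 1, max: 2}
	b := &aggregates{n: 1, total: 10, squares: 100, min: 10, max: 10}
	a.merge(b)
	fmt.Printf("%+v\n", *a)
}
```

This is what makes it practical to collect statistics separately, for instance one object per worker, and combine them once at the end.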
func (*Statistics) Maximum ¶
func (s *Statistics) Maximum() float64
Maximum returns the maximum value of samples seen. If no samples have been added to the dataset, then this function returns 0.0.
func (*Statistics) Mean ¶
func (s *Statistics) Mean() float64
Mean returns the average for all samples, computed as the sum of values divided by the total number of samples seen. If no samples have been added then this function returns 0.0. Note that 0.0 is a valid mean and does not necessarily mean that no samples have been tracked.
func (*Statistics) Minimum ¶
func (s *Statistics) Minimum() float64
Minimum returns the minimum value of samples seen. If no samples have been added to the dataset, then this function returns 0.0.
func (*Statistics) Range ¶
func (s *Statistics) Range() float64
Range returns the difference between the maximum and minimum of samples. If no samples have been added to the dataset, this function returns 0.0. This function will also return zero if the maximum value equals the minimum value, e.g. in the case only one sample has been added or all of the samples are the same value.
func (*Statistics) Serialize ¶
func (s *Statistics) Serialize() map[string]float64
Serialize returns a map of summary statistics. This map is useful for dumping statistics to disk (using JSON for example) or for reporting the statistics elsewhere.
TODO: Create Dump and Load functions to get statistical data to and from offline sources.
func (*Statistics) StdDev ¶
func (s *Statistics) StdDev() float64
StdDev returns the standard deviation of samples, the square root of the variance. Two or more values are required to compute the standard deviation; if fewer than two samples have been added to the data set, this function returns 0.0.
func (*Statistics) Total ¶
func (s *Statistics) Total() float64
Total returns the sum of the samples.
func (*Statistics) Update ¶
func (s *Statistics) Update(samples ...float64)
Update the statistics with a sample or samples (thread-safe). Note that this object expects float64 values. While statistical computations for integer values are possible, it is simpler to convert the values to floats ahead of time.
func (*Statistics) Variance ¶
func (s *Statistics) Variance() float64
Variance computes the variability of samples and describes the distance of the distribution from the mean. If fewer than two samples have been added to the data set, this function returns 0.0 (two or more values are required to compute variance).