Package stats

v0.4.0
Published: Mar 21, 2023 License: Apache-2.0 Imports: 13 Imported by: 10

README

Statistics package

Variable substitution in Monte Carlo integration

Given the original integral:

I = integral_{x_min..x_max} f(x) dx

replace x by x(t), so the integral becomes over dt:

I = integral_{t_min..t_max} f(x(t)) x'(t) dt

where x(t_min) = x_min, x(t_max) = x_max, and x'(t) = dx/dt is the derivative of x(t) over t.

The interesting case supported here is an N-dimensional integral over a vector X=(x_1, ..., x_N) in R^N, that is the N-dimensional real hyperspace. The original integral is assumed to be of the form:

I = E[g(X)] = integral g(X) * f(X) dX

where f(X) is the p.d.f. of some multivariate distribution of X. The simplest way to compute it is to generate random samples of X using the same distribution. Then the integral I can be approximated as:

I ~= 1/K * sum_{i=1..K} g(X_i)

where X_i are the K generated samples.

In practice, sampling directly from f(X) may require a very large number of samples to cover the area of interest, e.g. where g(X) is large enough to contribute significantly to the integral. Therefore, it may be beneficial to replace each x in the vector X with another variable t uniformly distributed in (-1..1), such that x(t -> -1) -> -Inf, x(t -> 1) -> Inf, x(t) is monotonically increasing and differentiable over (-1..1), and the probability of "interesting" values of x(t) is significant, so the number of required samples can be reduced.

Specifically, our g(X) will often be an indicator function of a subspace, typically used for computing a bucket value in a histogram of the N-compounded sample:

g(X) = (sum(X) in [low .. high]) ? 1 : 0

The substitution is

x(t) = r * t / (1 - t^(2*b))

where r controls the width of a near-uniform distribution of x values around zero, and b controls the portion of samples falling beyond the interval [-r..r].

Empirically, for the N-sum over [low..high], a good choice of parameters is:

r = max(|low|, |high|) / sqrt(N)
b = ceiling(sqrt(N))

However, rather than computing each bucket value separately, note that we are effectively sampling x over the entire range, so each sample can increment its corresponding bucket by f(x(t))*x'(t), thus computing many g(X)'s in one go. The value of r in this case is the maximum absolute value in the buckets' range.
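
To illustrate, here is a minimal sketch of this sampling scheme in Go, using the VarSubst and VarPrime functions documented below (the package is imported as stats) together with math and math/rand. The names pdf (the source one-dimensional p.d.f. f(x)), n, low, high (the buckets' range bounds), numSamples and buckets (a *Buckets, see below) are assumed to be defined by the caller; the actual implementation (see CompoundHistogram and ParallelSamplingConfig) parallelizes this loop.

// A minimal sketch of biased sampling for the n-compounded sum.
r := math.Max(math.Abs(low), math.Abs(high)) / math.Sqrt(float64(n))
b := math.Ceil(math.Sqrt(float64(n)))
weights := make([]float64, buckets.N)
for i := 0; i < numSamples; i++ {
	sum, w := 0.0, 1.0
	for k := 0; k < n; k++ {
		t := 2*rand.Float64() - 1             // approximately uniform over (-1..1)
		x := stats.VarSubst(t, r, b, 0)       // x(t)
		sum += x
		w *= pdf(x) * stats.VarPrime(t, r, b) // f(x(t)) * x'(t) for this dimension
	}
	if j := buckets.Bucket(sum); 0 <= j && j < buckets.N {
		weights[j] += w
	}
}
// weights now hold unnormalized estimates of the compounded p.d.f. per bucket.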

Documentation

Overview

Package stats implements statistical utilities.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ExpectationMC

func ExpectationMC(f func(x float64) float64, random func() float64,
	low, high float64, minIter, maxIter uint, precision float64, relative bool) float64

ExpectationMC computes a (potentially partial) expectation integral: integral_{low..high} [ f(x) * p(x) dx ], where p(x) is the p.d.f. of the sampling distribution, using the simple Monte-Carlo method of sampling f(x) with the given distribution sampler and computing the average. The bounds are inclusive. Note that low may be -Inf, and high may be +Inf.

The sampling stops either when maxIter samples have been reached, or when the estimated standard error becomes less than the required relative or absolute precision. See PreciseEnough for the exact semantics.

In any case, minIter iterations are guaranteed; it should normally be a small number (e.g. 100) to accumulate a reasonable initial error estimate.
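
A minimal usage sketch, assuming the package is imported as stats: estimate the second moment E[X^2] of (approximately) a standard normal. The numeric arguments are illustrative.

d := stats.NewNormalDistribution(0.0, math.Sqrt(2/math.Pi)) // MAD of a sigma=1 normal
secondMoment := stats.ExpectationMC(
	func(x float64) float64 { return x * x }, // f(x)
	d.Rand,                    // sampler of the distribution
	math.Inf(-1), math.Inf(1), // integrate over the entire real line
	100, 1000000,              // minIter, maxIter
	0.01, true,                // 1% relative precision
)
// secondMoment should be close to 1.0 for a standard normal.
_ = secondMoment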

func PreciseEnough added in v0.1.7

func PreciseEnough(x, deviation, epsilon float64, relative bool) bool

PreciseEnough determines if the value of x with an estimated deviation is within an epsilon neighborhood of its true value. This can be used as a termination criterion in iterative approximation methods when a desired precision has been reached.

Note that epsilon provides a relative precision: the true value of x is assumed to be within the [x-dev..x+dev] interval, and the precision is reached when dev/|x| < epsilon for |x| >= 1, and when dev < epsilon otherwise.
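
For example, PreciseEnough can terminate a sampling loop once the standard error of the running mean is small enough (a sketch; sampler, minIter and maxIter are assumed to be defined by the caller, and StandardError is documented below):

var se stats.StandardError
for i := uint(0); i < maxIter; i++ {
	se.Add(sampler())
	stdErr := se.Sigma() / math.Sqrt(float64(se.N())) // standard error of the mean
	if i >= minIter && stats.PreciseEnough(se.Mean(), stdErr, 0.01, true) {
		break // the mean is estimated to be within ~1% of its true value
	}
}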

func SafeLog

func SafeLog(x float64) float64

SafeLog is a "safe" natural logarithm, which for x <= 0 returns -Inf.

func TimeseriesIntersectIndices added in v0.2.7

func TimeseriesIntersectIndices(tss ...*Timeseries) [][]int

TimeseriesIntersectIndices returns the slice of indices S effectively intersecting the given Timeseries by Date. That is:

- len(S) is the number of distinct Dates present in all of the tss;

- len(S[i]) = len(tss) for any i<len(S), so each S[i] is the slice of indices in the corresponding Timeseries such that tss[j].Dates()[S[i][j]] == tss[k].Dates()[S[i][k]] for any j, k < len(tss).

func VarPrime added in v0.1.7

func VarPrime(t, scale, power float64) float64

VarPrime is the value of x'(t), the first derivative of x(t).

func VarSubst added in v0.1.7

func VarSubst(t, scale, power, shift float64) float64

VarSubst computes the value of

x(t) = shift + scale * t / (1 - t^(2*power))

to be used as a variable substitution in an integral over x in (-Inf..Inf). The new bounds for t become (-1..1), excluding the boundaries.

In Monte Carlo integration, the integral_{-Inf..Inf} f(x)dx is approximated by 2 * E[ f(x(t))*x'(t) ], i.e. the length of the t interval times the sample average, for a uniformly distributed t over (-1..1).
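
For example, a sketch of estimating integral_{-Inf..Inf} f(x) dx for the standard normal p.d.f. (which should come out close to 1); scale and power are illustrative choices, and the factor of 2 is the length of the t interval:

f := func(x float64) float64 { return math.Exp(-x*x/2) / math.Sqrt(2*math.Pi) }
scale, power := 1.0, 1.0
const numSamples = 100000
sum := 0.0
for i := 0; i < numSamples; i++ {
	t := 2*rand.Float64() - 1 // approximately uniform over (-1..1)
	sum += f(stats.VarSubst(t, scale, power, 0)) * stats.VarPrime(t, scale, power)
}
integral := 2 * sum / numSamples // interval length times the sample average
// integral should be close to 1.0.
_ = integral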

Types

type Buckets

type Buckets struct {
	N int `json:"n" default:"101"`
	// Indicate that spacing / min / max can be set automatically.
	Auto    bool        `json:"auto bounds" default:"true"`
	Spacing SpacingType `json:"spacing"` // choices:"linear,exponential,symmetric exponential"
	Min     float64     `json:"min" default:"-50"`
	Max     float64     `json:"max" default:"50"`
	Bounds  []float64   `json:"-"` // n+1 bucket boundaries, auto-generated
}

Buckets configures the properties of histogram buckets. It implements message.Message, thus can be directly used in configs.

func NewBuckets

func NewBuckets(n int, minval, maxval float64, spacing SpacingType) (*Buckets, error)

NewBuckets creates and initializes a new buckets object.
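
A minimal usage sketch (the values are illustrative):

b, err := stats.NewBuckets(101, -50.0, 50.0, stats.LinearSpacing)
if err != nil {
	panic(err)
}
i := b.Bucket(3.2) // index of the bucket containing x = 3.2
w := b.Size(i)     // width of that bucket
xs := b.Xs(0.5)    // representative values shifted half-way to the next boundary
_, _ = w, xs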

func (*Buckets) Bucket

func (b *Buckets) Bucket(x float64) int

Bucket computes the bucket index for a sample.

func (*Buckets) FitTo added in v0.1.6

func (b *Buckets) FitTo(data []float64) error

FitTo fits the bucket parameters, such as spacing, min & max, to the data. It assumes that data is sorted in ascending order. In case of an error, the original values are preserved.

func (*Buckets) InitMessage added in v0.0.6

func (b *Buckets) InitMessage(js any) error

func (*Buckets) SameAs added in v0.0.7

func (b *Buckets) SameAs(b2 *Buckets) bool

SameAs checks if b defines the same buckets as b2.

func (*Buckets) Size

func (b *Buckets) Size(i int) float64

Size of the i'th bucket.

func (Buckets) String added in v0.0.7

func (b Buckets) String() string

String prints Buckets. It is a value method, so non-pointer Buckets will print correctly in fmt.Printf.

func (*Buckets) X

func (b *Buckets) X(i int, shift float64) float64

X computes the representative value of x for the i'th bucket, optionally adjusted by the relative shift amount (shift=1.0 is the next bucket boundary).

func (*Buckets) Xs

func (b *Buckets) Xs(shift float64) []float64

Xs returns the list of representative values for all buckets, optionally adjusted by the relative shift amount. It always returns a newly allocated slice, so it is safe to modify it.

type Distribution

type Distribution interface {
	distuv.Rander
	distuv.Quantiler
	Prob(float64) float64 // the p.d.f. value at x
	Mean() float64
	MAD() float64 // mean absolute deviation
	Variance() float64
	CDF(x float64) float64 // returns max. quantile for x
	Copy() Distribution    // shallow-copy with a new instance of rand.Source
	// Set random seed when applicable. Mostly used in tests.
	Seed(uint64)
}

Distribution API for common operations.

type DistributionWithHistogram added in v0.1.5

type DistributionWithHistogram interface {
	Distribution
	Histogram() *Histogram
}

type FastCompoundState added in v0.2.1

type FastCompoundState []float64

FastCompoundState is used in Transform by FastCompoundRandDistribution.

type Histogram

type Histogram struct {
	// contains filtered or unexported fields
}

Histogram stores sample counts for each bucket. The counts are continuous (float64) so that Histogram can be used to represent c.d.f.-based distributions derived numerically.

func CompoundHistogram added in v0.1.7

func CompoundHistogram(ctx context.Context, source Distribution, n int, c *ParallelSamplingConfig) *Histogram

CompoundHistogram computes a histogram of an n-compounded source distribution from its p.d.f. source.Prob(x) method.

func NewHistogram

func NewHistogram(buckets *Buckets) *Histogram

NewHistogram creates and initializes a Histogram. It panics if buckets is nil.

func (*Histogram) Add

func (h *Histogram) Add(xs ...float64)

Add samples to the Histogram.
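
A small usage sketch combining Buckets and Histogram (the values are illustrative):

buckets, err := stats.NewBuckets(101, -50.0, 50.0, stats.LinearSpacing)
if err != nil {
	panic(err)
}
h := stats.NewHistogram(buckets)
h.Add(1.5, -2.3, 0.7, 4.2) // accumulate samples
mean := h.Mean()           // approximate mean
median := h.Quantile(0.5)  // approximate median
pdfs := h.PDFs()           // p.d.f. estimate per bucket, e.g. for plotting against h.Xs()
_, _, _ = mean, median, pdfs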

func (*Histogram) AddHistogram added in v0.0.7

func (h *Histogram) AddHistogram(h2 *Histogram) error

AddHistogram adds h2 samples into the Histogram. h2 must have the same buckets as self.

func (*Histogram) AddWeights added in v0.1.7

func (h *Histogram) AddWeights(weights []float64) error

AddWeights to the histogram directly. Assumes len(weights) = h.Buckets().N.

func (*Histogram) AddWithWeight added in v0.1.7

func (h *Histogram) AddWithWeight(x, weight float64)

func (*Histogram) Buckets

func (h *Histogram) Buckets() *Buckets

Buckets value of the Histogram.

func (*Histogram) CDF

func (h *Histogram) CDF(x float64) float64

CDF value at x, approximated using histogram weights. It is effectively an inverse of Quantile(), interpolating values of x when it falls between bucket boundaries.

func (*Histogram) Count

func (h *Histogram) Count(i int) uint

Count of the i'th bucket. Returns 0 if i is out of range.

func (*Histogram) Counts

func (h *Histogram) Counts() []uint

Counts of the actual (possibly biased) samples in the Histogram. For p.d.f. estimates use Weights.

func (*Histogram) CountsTotal added in v0.1.7

func (h *Histogram) CountsTotal() uint

CountsTotal is the sum total of all counts.

func (*Histogram) MAD added in v0.0.7

func (h *Histogram) MAD() float64

MAD estimates the mean absolute deviation.

func (*Histogram) Mean

func (h *Histogram) Mean() float64

Mean computes the approximate mean of the distribution.

func (*Histogram) PDF

func (h *Histogram) PDF(i int) float64

PDF value at the i'th bucket. Returns 0 if i is out of range. The p.d.f. integrates to 1.0 with dx = h.Buckets().Size(i).

func (*Histogram) PDFs

func (h *Histogram) PDFs() []float64

PDFs lists all the values of PDF for all the buckets. This is suitable for plotting against Xs().

func (*Histogram) Prob added in v0.1.5

func (h *Histogram) Prob(x float64) float64

Prob is the p.d.f. value at x, approximated using histogram weights.

func (*Histogram) Quantile

func (h *Histogram) Quantile(q float64) float64

Quantile computes the approximation of the q'th quantile, where e.g. q=0.5 is the 50th percentile. Quantiles of 0 and 1 can be used as approximations of the minimum and maximum sample values. Panics if q is not within [0..1].

func (*Histogram) Sigma added in v0.0.7

func (h *Histogram) Sigma() float64

Sigma is the estimated standard deviation.

func (*Histogram) StdError added in v0.1.7

func (h *Histogram) StdError(i int) float64

StdError estimates the standard deviation of the p.d.f. value at each bucket.

func (*Histogram) StdErrors added in v0.1.7

func (h *Histogram) StdErrors() []float64

StdErrors is a slice of estimated standard errors for all buckets.

func (*Histogram) Sum added in v0.0.7

func (h *Histogram) Sum(i int) float64

Sum of samples for the i'th bucket. Returns 0 if i is out of range.

func (*Histogram) SumTotal added in v0.0.7

func (h *Histogram) SumTotal() float64

SumTotal of all samples.

func (*Histogram) Sums added in v0.0.7

func (h *Histogram) Sums() []float64

Sums of samples per bucket.

func (*Histogram) Variance added in v0.0.7

func (h *Histogram) Variance() float64

Variance estimation.

func (*Histogram) Weight added in v0.1.7

func (h *Histogram) Weight(i int) float64

Weight of the i'th bucket. Returns 0 if i is out of range.

func (*Histogram) Weights added in v0.1.7

func (h *Histogram) Weights() []float64

Weights of the buckets in the Histogram. These are the true "sizes" of the buckets in a traditional sense of a histogram.

func (*Histogram) WeightsTotal added in v0.1.8

func (h *Histogram) WeightsTotal() float64

WeightsTotal is the sum total of all weights.

func (*Histogram) X added in v0.0.7

func (h *Histogram) X(i int) float64

X returns the mean x value of the i'th bucket, or the logical middle of the bucket if it has no samples.

func (*Histogram) Xs added in v0.0.7

func (h *Histogram) Xs() []float64

Xs returns the list of mean values for all buckets. The slice is always newly allocated.

type HistogramDistribution added in v0.1.5

type HistogramDistribution struct {
	// contains filtered or unexported fields
}

HistogramDistribution creates a Distribution out of a Histogram.

func NewHistogramDistribution added in v0.1.5

func NewHistogramDistribution(h *Histogram) *HistogramDistribution

NewHistogramDistribution creates a new distribution out of h. Note that h is stored as the original pointer and is not deep-copied. The caller must ensure that h is not modified after creating this distribution, otherwise the behavior may be unpredictable.

func (*HistogramDistribution) CDF added in v0.1.5

func (*HistogramDistribution) Copy added in v0.1.5

Copy shallow-copies the distribution. Note that the underlying Histogram is copied by pointer, not deep-copied.

func (*HistogramDistribution) Histogram added in v0.1.5

func (d *HistogramDistribution) Histogram() *Histogram

func (*HistogramDistribution) MAD added in v0.1.5

func (d *HistogramDistribution) MAD() float64

func (*HistogramDistribution) Mean added in v0.1.5

func (d *HistogramDistribution) Mean() float64

func (*HistogramDistribution) Prob added in v0.1.5

func (*HistogramDistribution) Quantile added in v0.1.5

func (d *HistogramDistribution) Quantile(x float64) float64

func (*HistogramDistribution) Rand added in v0.1.5

func (d *HistogramDistribution) Rand() float64

func (*HistogramDistribution) Seed added in v0.1.5

func (d *HistogramDistribution) Seed(seed uint64)

func (*HistogramDistribution) Variance added in v0.1.5

func (d *HistogramDistribution) Variance() float64

type Normal

type Normal struct {
	distuv.Normal
}

Normal distribution.

func NewNormalDistribution

func NewNormalDistribution(mean, MAD float64) *Normal

NewNormalDistribution creates an instance of a normal distribution scaled and shifted for the given mean and MAD (mean absolute deviation).
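
Note that the constructor is parameterized by MAD rather than the standard deviation; for a normal distribution MAD = sigma * sqrt(2/pi), so a standard normal (sigma = 1) can be created as follows:

d := stats.NewNormalDistribution(0.0, math.Sqrt(2/math.Pi)) // MAD ~= 0.798 => sigma = 1
p := d.Prob(0.0)                                            // ~= 0.399, the standard normal p.d.f. at 0
x := d.Rand()                                               // one random sample
_, _ = p, x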

func (*Normal) Copy

func (d *Normal) Copy() Distribution

func (*Normal) MAD

func (d *Normal) MAD() float64

func (*Normal) Mean

func (d *Normal) Mean() float64

func (*Normal) Seed

func (d *Normal) Seed(seed uint64)

type ParallelSamplingConfig added in v0.1.7

type ParallelSamplingConfig struct {
	BatchMin int     `json:"batch size min" default:"10"`
	BatchMax int     `json:"batch size max" default:"10000"`
	Samples  int     `json:"samples" default:"10000"` // for histogram
	Buckets  Buckets `json:"buckets"`
	// Biased sampling parameters, when applicable. Zero values indicate that the
	// caller must set appropriate defaults.
	Scale   float64 `json:"bias scale"` // size of uniform distribution area
	Power   float64 `json:"bias power"` // approach +-Inf near +-1 as 1/(1-t^(2*Power))
	Shift   float64 `json:"bias shift"` // value of x(t=0)
	Workers int     `json:"workers"`    // default: 2*runtime.NumCPU()
	Seed    int     `json:"seed"`       // for use in tests when > 0
}

ParallelSamplingConfig is a set of configuration parameters for RandDistribution suitable for use in user config file schema.

func (*ParallelSamplingConfig) InitMessage added in v0.1.7

func (c *ParallelSamplingConfig) InitMessage(js any) error

type PriceField

type PriceField uint8

PriceField is an enum type indicating which PriceRow field to use.

const (
	PriceOpenUnadjusted PriceField = iota
	PriceOpenSplitAdjusted
	PriceOpenFullyAdjusted
	PriceHighUnadjusted
	PriceHighSplitAdjusted
	PriceHighFullyAdjusted
	PriceLowUnadjusted
	PriceLowSplitAdjusted
	PriceLowFullyAdjusted
	PriceCloseUnadjusted
	PriceCloseSplitAdjusted
	PriceCloseFullyAdjusted
	PriceCashVolume
)

type RandDistribution

type RandDistribution[State any] struct {
	// contains filtered or unexported fields
}

RandDistribution uses a transformed Rand method of a source distribution to create another distribution. In particular, its own Rand function simply calls the source's Rand and applies the transform. It estimates and caches mean, MAD and quantiles (as a histogram) from a set number of samples. It never stores the generated samples, so its memory footprint remains small.

func CompoundRandDistribution

func CompoundRandDistribution(ctx context.Context, source Distribution, n int, cfg *ParallelSamplingConfig) *RandDistribution[struct{}]

CompoundRandDistribution creates a RandDistribution out of source compounded n times. That is, source.Rand() is invoked n times and the sum of its samples is a new single sample in the new distribution.

func FastCompoundRandDistribution added in v0.1.4

func FastCompoundRandDistribution(ctx context.Context, source Distribution, n int, cfg *ParallelSamplingConfig) *RandDistribution[FastCompoundState]

FastCompoundRandDistribution creates a RandDistribution out of source compounded n times. However, the source.Rand() values are not recomputed n times for each new sample, but are taken as the sum of a sliding window in a single sequence of source samples. This reduces the number of generated source samples from N*numSamples to N+numSamples. In practice, multiple such sequences are generated in parallel for further speedup.

func NewRandDistribution

func NewRandDistribution[S any](ctx context.Context, source Distribution, xform *Transform[S], cfg *ParallelSamplingConfig) *RandDistribution[S]

NewRandDistribution creates a Distribution using the transformation of the random sampler function of the source distribution. The source distribution is copied using Distribution.Copy method, and therefore can be sampled independently and in parallel with the original source. It uses the given number of samples to estimate and lazily cache mean, MAD and quantiles.
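
A sketch of a stateless Transform whose samples are the squares of the source's samples, assuming that initializing ParallelSamplingConfig from an empty JSON object via InitMessage populates the documented defaults:

xform := &stats.Transform[struct{}]{
	InitState: func() struct{} { return struct{}{} },
	Fn: func(src stats.Distribution, s struct{}) (float64, struct{}) {
		x := src.Rand()
		return x * x, s
	},
}
source := stats.NewNormalDistribution(0.0, math.Sqrt(2/math.Pi)) // approximately a standard normal
var cfg stats.ParallelSamplingConfig
if err := cfg.InitMessage(map[string]any{}); err != nil { // assumed to populate the defaults
	panic(err)
}
d := stats.NewRandDistribution(context.Background(), source, xform, &cfg)
mean := d.Mean() // lazily estimated and cached; ~= 1 for a squared standard normal
_ = mean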

func (*RandDistribution[S]) CDF

func (d *RandDistribution[S]) CDF(x float64) float64

func (*RandDistribution[S]) Copy

func (d *RandDistribution[S]) Copy() Distribution

func (*RandDistribution[S]) Histogram

func (d *RandDistribution[S]) Histogram() *Histogram

Histogram of the generator, lazily cached.

func (*RandDistribution[S]) MAD

func (d *RandDistribution[S]) MAD() float64

func (*RandDistribution[S]) Mean

func (d *RandDistribution[S]) Mean() float64

func (*RandDistribution[S]) Prob

func (d *RandDistribution[S]) Prob(x float64) float64

func (*RandDistribution[S]) Quantile

func (d *RandDistribution[S]) Quantile(x float64) float64

func (*RandDistribution[S]) Rand

func (d *RandDistribution[S]) Rand() float64

func (*RandDistribution[S]) Seed

func (d *RandDistribution[S]) Seed(seed uint64)

func (*RandDistribution[S]) Variance added in v0.1.2

func (d *RandDistribution[S]) Variance() float64

type Sample

type Sample struct {
	// contains filtered or unexported fields
}

Sample stores an unordered set of numerical data (float64) and computes various statistics over it.

func NewSample

func NewSample(data []float64) *Sample

NewSample creates a new sample initialized with data. Note that it reuses the slice without copying. Use Copy() if you need to decouple your input from the Sample.

func (*Sample) Copy

func (s *Sample) Copy() *Sample

Copy creates a deep copy of the Sample. This can be useful, e.g. like this:

s := NewSample(data).Copy()
// can safely modify data in place without affecting s.

func (*Sample) Data

func (s *Sample) Data() []float64

Data returns the sample data.

func (*Sample) MAD

func (s *Sample) MAD() float64

MAD computes mean absolute deviation of the Sample, cached.

func (*Sample) Mean

func (s *Sample) Mean() float64

Mean computes the mean of the Sample, cached.

func (*Sample) Normalize added in v0.0.5

func (s *Sample) Normalize() (*Sample, error)

Normalize creates a new Sample of {(x - mean) / MAD}, thus its Mean and MAD are 0 and 1, respectively.

func (*Sample) Sigma

func (s *Sample) Sigma() float64

Sigma computes the standard deviation of the Sample, cached.

func (*Sample) Sum

func (s *Sample) Sum() float64

Sum of samples, cached.

func (*Sample) SumDev

func (s *Sample) SumDev() float64

SumDev computes the sum of absolute deviations from the mean, cached.

func (*Sample) SumSquaredDev

func (s *Sample) SumSquaredDev() float64

SumSquaredDev computes the sum of squared deviations from the mean, cached.

func (*Sample) Variance

func (s *Sample) Variance() float64

Variance of the Sample (sigma squared), cached.

type SampleDistribution

type SampleDistribution struct {
	// contains filtered or unexported fields
}

SampleDistribution implements a distribution of a sample.

func CompoundSampleDistribution

func CompoundSampleDistribution(ctx context.Context, source Distribution, n int, cfg *ParallelSamplingConfig) *SampleDistribution

CompoundSampleDistribution creates a SampleDistribution out of source compounded n times. That is, source.Rand() is invoked n times and the sum of its samples becomes a new single sample in the new distribution.

func FastCompoundSampleDistribution added in v0.1.4

func FastCompoundSampleDistribution(ctx context.Context, source Distribution, n int, cfg *ParallelSamplingConfig) *SampleDistribution

FastCompoundSampleDistribution creates a SampleDistribution out of a random generator compounded n times. See FastCompoundRandDistribution.

func NewSampleDistribution

func NewSampleDistribution(sample []float64, buckets *Buckets) *SampleDistribution

NewSampleDistribution creates an instance of a SampleDistribution. It requires Buckets to create a Histogram for computing a reasonable p.d.f. NOTE: it will sort the sample in place and store the slice as is, without deep copying. The caller is responsible for making a copy if the original order is important, or if the sample will later be modified by the caller.
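
A minimal usage sketch; the data slice is copied before the call since NewSampleDistribution sorts it in place:

data := []float64{-1.2, 0.3, 0.8, 1.5, -0.4, 2.1}
buckets, err := stats.NewBuckets(11, -3.0, 3.0, stats.LinearSpacing)
if err != nil {
	panic(err)
}
d := stats.NewSampleDistribution(append([]float64{}, data...), buckets)
median := d.Quantile(0.5) // sample median
p := d.Prob(0.0)          // histogram-based p.d.f. estimate at x = 0
_, _ = median, p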

func NewSampleDistributionFromRand

func NewSampleDistributionFromRand(d Distribution, samples int, buckets *Buckets) *SampleDistribution

NewSampleDistributionFromRand creates an instance of a SampleDistribution by sampling a given distribution. It requires Buckets to create a Histogram for computing a reasonable p.d.f.

func NewSampleDistributionFromRandDist added in v0.1.4

func NewSampleDistributionFromRandDist[S any](d *RandDistribution[S], samples int, buckets *Buckets) *SampleDistribution

NewSampleDistributionFromRandDist is similar to NewSampleDistributionFromRand except that it uses fast stateful sample generation of RandDistribution.

func (*SampleDistribution) CDF

CDF of the sample distribution.

func (*SampleDistribution) Copy

func (d *SampleDistribution) Copy() Distribution

func (*SampleDistribution) Histogram

func (d *SampleDistribution) Histogram() *Histogram

Histogram of the sample distribution.

func (*SampleDistribution) MAD

func (d *SampleDistribution) MAD() float64

func (*SampleDistribution) Mean

func (d *SampleDistribution) Mean() float64

func (*SampleDistribution) Prob

func (d *SampleDistribution) Prob(x float64) float64

func (*SampleDistribution) Quantile

func (d *SampleDistribution) Quantile(x float64) float64

func (*SampleDistribution) Rand

func (d *SampleDistribution) Rand() float64

func (*SampleDistribution) Sample

func (d *SampleDistribution) Sample() *Sample

Sample as the source of the distribution.

func (*SampleDistribution) Seed

func (d *SampleDistribution) Seed(seed uint64)

func (*SampleDistribution) Variance added in v0.1.2

func (d *SampleDistribution) Variance() float64

type SpacingType

type SpacingType uint8

SpacingType is enum for different ways buckets are spaced out.

const (
	LinearSpacing SpacingType = iota
	ExponentialSpacing
	SymmetricExponentialSpacing
)

Values of SpacingType:

  • LinearSpacing divides the interval into n equal parts.

  • ExponentialSpacing divides the log-space interval into n equal parts, so the buckets in the original interval grow exponentially away from zero. Note that Min must be > 0.

  • SymmetricExponentialSpacing makes the exponential spacing symmetric around zero. That is, the buckets grow exponentially away from zero in both directions, and the middle bucket spans [-Min..Min]. It requires n to be odd, and Min > 0, but the actual interval is [-Max..Max].
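
For example, symmetric exponential buckets over [-50..50] with a middle bucket of about [-0.01..0.01] (n must be odd and Min > 0); the exact boundary values are a sketch and depend on the implementation:

b, err := stats.NewBuckets(101, 0.01, 50.0, stats.SymmetricExponentialSpacing)
if err != nil {
	panic(err)
}
// Bounds[0] ~= -50, the middle bucket spans Bounds[50]..Bounds[51] ~= [-0.01..0.01],
// and Bounds[101] ~= 50.
fmt.Println(b.Bounds[0], b.Bounds[50], b.Bounds[51], b.Bounds[101])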

func (*SpacingType) InitMessage added in v0.0.6

func (s *SpacingType) InitMessage(js any) error

func (SpacingType) String added in v0.0.7

func (s SpacingType) String() string

String prints SpacingType. It's a value method, so it prints correctly in fmt.Printf.

type StandardError added in v0.1.7

type StandardError struct {
	// contains filtered or unexported fields
}

StandardError accumulates and estimates the standard deviation of an online sequence of samples. The accumulation of the standard deviation is done in a computationally stable way using a generalization of the Youngs and Cramer formulas, a variant of the more popular Welford's algorithm.

A zero value of StandardError is ready for use, and represents 0 samples.
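
A small usage sketch:

var se stats.StandardError // the zero value is ready for use
for _, x := range []float64{1.0, 2.0, 3.0, 4.0} {
	se.Add(x)
}
se.AddZeros(2)                             // two additional zero-valued samples
fmt.Println(se.N(), se.Mean(), se.Sigma()) // 6 samples, mean ~= 1.667

// Partial accumulators can be merged, e.g. from parallel workers.
var se2 stats.StandardError
se2.Add(5.0)
se.Merge(se2) // se now accounts for all 7 samples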

func (*StandardError) Add added in v0.1.7

func (e *StandardError) Add(x float64)

Add a single sample.

func (*StandardError) AddZeros added in v0.1.7

func (e *StandardError) AddZeros(n uint)

AddZeros adds n zero-valued samples.

func (StandardError) Mean added in v0.1.7

func (e StandardError) Mean() float64

Mean value of all samples.

func (*StandardError) Merge added in v0.1.7

func (e *StandardError) Merge(other StandardError)

Merge the other StandardError into e, so the resulting error estimate is for the union of samples.

func (StandardError) N added in v0.1.7

func (e StandardError) N() uint

N returns the number of accumulated samples.

func (StandardError) Sigma added in v0.1.7

func (e StandardError) Sigma() float64

Sigma is the standard deviation of the accumulated samples.

func (StandardError) Variance added in v0.1.7

func (e StandardError) Variance() float64

Variance of the accumulated samples.

type StudentsT

type StudentsT struct {
	distuv.StudentsT
}

StudentsT distribution.

func NewStudentsTDistribution

func NewStudentsTDistribution(alpha, mean, MAD float64) *StudentsT

NewStudentsTDistribution creates an instance of a Student's T distribution scaled and shifted to have a given mean and MAD (mean absolute deviation).

func (*StudentsT) Copy

func (d *StudentsT) Copy() Distribution

func (*StudentsT) MAD

func (d *StudentsT) MAD() float64

func (*StudentsT) Mean

func (d *StudentsT) Mean() float64

func (*StudentsT) Seed

func (d *StudentsT) Seed(seed uint64)

type Timeseries

type Timeseries struct {
	// contains filtered or unexported fields
}

Timeseries stores numeric values along with timestamps. The timestamps are always sorted in ascending order.

func NewTimeseries

func NewTimeseries(dates []db.Date, data []float64) *Timeseries

NewTimeseries creates a new Timeseries. The dates are expected to be sorted in ascending order (not checked). It panics if dates and data have different lengths. Note that the argument slices are used as is, not copied. Use Copy() if the arguments need to be modified after the call.

func NewTimeseriesFromPrices added in v0.3.0

func NewTimeseriesFromPrices(prices []db.PriceRow, f PriceField) *Timeseries

NewTimeseriesFromPrices initializes Timeseries from PriceRow slice.

func TimeseriesIntersect added in v0.2.7

func TimeseriesIntersect(tss ...*Timeseries) []*Timeseries

TimeseriesIntersect creates a new list of Timeseries whose Dates are identical, dropping the mismatching Dates and Data elements. The resulting slice is guaranteed to be of the same length as the number of arguments and to contain valid Timeseries, even if they are empty.

func (*Timeseries) Add added in v0.2.9

func (t *Timeseries) Add(t2 *Timeseries) *Timeseries

Add two Timeseries pointwise.

func (*Timeseries) AddC added in v0.2.9

func (t *Timeseries) AddC(c float64) *Timeseries

AddC adds a constant to Timeseries data, pointwise.

func (*Timeseries) BinaryOp added in v0.2.9

func (t *Timeseries) BinaryOp(f func(x, y float64) float64, t2 *Timeseries) *Timeseries

BinaryOp applies f to the two Timeseries element-wise. It panics if the lengths or dates (pointwise) differ.

func (*Timeseries) Check

func (t *Timeseries) Check() error

Check that the Timeseries is consistent: dates and data have the same length, and the dates are sorted in ascending order.

func (*Timeseries) Copy

func (t *Timeseries) Copy() *Timeseries

Copy makes a deep copy of the Timeseries.

func (*Timeseries) Data

func (t *Timeseries) Data() []float64

Data of the Timeseries.

func (*Timeseries) Dates

func (t *Timeseries) Dates() []db.Date

Dates of the Timeseries.

func (*Timeseries) Div added in v0.2.9

func (t *Timeseries) Div(t2 *Timeseries) *Timeseries

Div divides Timeseries by another, pointwise.

func (*Timeseries) DivC added in v0.2.9

func (t *Timeseries) DivC(c float64) *Timeseries

DivC divides Timeseries by a constant, pointwise.

func (*Timeseries) Exp added in v0.2.9

func (t *Timeseries) Exp() *Timeseries

Exp of the Timeseries data, pointwise.

func (*Timeseries) Filter added in v0.3.6

func (t *Timeseries) Filter(f func(int) bool) *Timeseries

Filter elements of the Timeseries to only those that satisfy f, by index.

func (*Timeseries) Log added in v0.2.9

func (t *Timeseries) Log() *Timeseries

Log of the Timeseries data, pointwise.

func (*Timeseries) LogProfits added in v0.0.5

func (t *Timeseries) LogProfits(n int, intraday bool) *Timeseries

LogProfits computes a new Timeseries of log-profits {log(x[t+n]) - log(x[t])}. The associated log-profit date is t+n. When intraday is true, log-profits spanning more than one day are skipped.

func (*Timeseries) Mult added in v0.2.9

func (t *Timeseries) Mult(t2 *Timeseries) *Timeseries

Mult multiplies two Timeseries pointwise.

func (*Timeseries) MultC added in v0.2.9

func (t *Timeseries) MultC(c float64) *Timeseries

MultC multiplies Timeseries data by a constant, pointwise.

func (*Timeseries) Range

func (t *Timeseries) Range(start, end db.Date) *Timeseries

Range extracts the sub-series from the inclusive time interval. It may return an empty Timeseries, but never nil.

func (*Timeseries) Shift

func (t *Timeseries) Shift(shift int) *Timeseries

Shift the timeseries in time. A positive shift moves the values into the future, a negative one into the past. Values outside of the date range are dropped. It may return an empty Timeseries, but never nil.

func (*Timeseries) Sub added in v0.2.9

func (t *Timeseries) Sub(t2 *Timeseries) *Timeseries

Sub subtracts another Timeseries from self, pointwise.

func (*Timeseries) SubC added in v0.2.9

func (t *Timeseries) SubC(c float64) *Timeseries

SubC subtracts a constant from Timeseries, pointwise.

func (*Timeseries) UnaryOp added in v0.2.9

func (t *Timeseries) UnaryOp(f func(float64) float64) *Timeseries

UnaryOp applies f pointwise to the Timeseries data.

type Transform added in v0.1.4

type Transform[State any] struct {
	InitState func() State
	Fn        func(d Distribution, state State) (float64, State)
}

Transform is a stateful random variable transformer used by RandDistribution to generate its random values. The initial state generator and the transform function must be goroutine-safe.

The random values Y_i are generated as Y_i, S_i = Fn(d, S_(i-1)), where S_0=InitState(). It is assumed that, asymptotically, generating multiple short sequences is statistically equivalent to generating a single long sequence. If this property doesn't hold, the Y values likely cannot be directly modeled by a random variable.

As an example, a sliding window compounding (the sum of last N d.Rand() values, or the log-profit over N steps) satisfies this property, but the unbounded sum (such as log-price) does not.
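
A sketch of such a sliding-window compounding Transform, keeping the window of the last n source samples as the state; slidingSumTransform is a hypothetical helper shown for illustration (FastCompoundRandDistribution implements this idea internally, in parallel):

// slidingSumTransform returns a Transform whose every sample is the sum of the
// last n source samples.
func slidingSumTransform(n int) *stats.Transform[[]float64] {
	return &stats.Transform[[]float64]{
		InitState: func() []float64 { return nil }, // the window is filled on the first call
		Fn: func(src stats.Distribution, window []float64) (float64, []float64) {
			for len(window) < n { // fill the initial window
				window = append(window, src.Rand())
			}
			window = append(window[1:], src.Rand()) // slide the window by one sample
			sum := 0.0
			for _, x := range window {
				sum += x
			}
			return sum, window
		},
	}
}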
