gopula

package module
v0.0.0-...-284be7f Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 7, 2019 License: GPL-3.0 Imports: 14 Imported by: 0

README

gopula Build Status Go Report Card Coverage Status GoDoc

Introduction

gopula is a pure Go package aimed to deal with Archimedean Copulas. It implements the well known families: Ali-Mikhail-Haq, Clayton, Frank, Gumbel and Joe.

Currently gopula has three main features:

  • Basic computations (pdf, cdf, etc.)
  • Parameter estimation through Maximum Likelihood (with confidence bounds)
  • Data sampling

To get it:

$ go get github.com/asiffer/gopula

Get started

Sampling

gopula uses the McNeil & Nešlehová universal sampling method to draw observations from a given archimedean copula (see references)

// sampling.go

package main

import (
    "fmt"
    "gonum.org/v1/gonum/mat"
    "github.com/asiffer/gopula"
)

func matPrint(X mat.Matrix) {
	fa := mat.Formatted(X, mat.Prefix(""), mat.Squeeze())
	fmt.Printf("%v\n", fa)
}

func main() {
    // Create a new instance of an Archimedean copula
    // NewCopula(family, theta)
    // Available families are:
    //  - "AMH"
    //  - "Clayton"
    //  - "Frank"
    //  - "Gumbel"
    //  - "Joe"
    A := gopula.NewCopula("Clayton", 2.5)

    // Sample some observations in the desired dimension
    // Sample(number of observations, dimension)
    // It returns a gonum matrix 
    M := A.Sample(9000, 3)
    matPrint(M)

    // The matrix can be exported to a csv file
    // SaveCSV(gonum matrix, path, separator)
    err := gopula.SaveCSV(M, "/tmp/data.csv", ',')
    if err != nil {
        fmt.Println(err)
    }
}

We can check the output file:

$ go run sampling.go
$ head /tmp/data.csv
0.517730,0.402611,0.704825
0.523264,0.565531,0.457385
0.110614,0.162048,0.117283
0.126623,0.175349,0.342992
0.205170,0.199397,0.390389
0.275976,0.151787,0.481059
0.149396,0.198625,0.085618
0.151417,0.586827,0.086791
0.143233,0.227281,0.126375
0.576612,0.459220,0.380677
$ wc -l /tmp/data.csv
9000 /tmp/data.csv
Inference

Despite Archimedean copulas is quite a rich class of copulas with a great deal of nice properties, estimating the single parameter 𝜃 from observations is not so easy. Many techniques exist but gopula uses Maximum Likelihood Estimation as it performs rather the best (see the work of Marius Hofert, Martin Mächler and Alexander J. McNeil in [2]).

The inference procedure mainly uses the formulas provided by the authors mentionned above (see [3])

// inference.go

package main

import (
    "fmt"
    "github.com/asiffer/gopula"
)

func main() {
    // Load the previous sampled dataset
    // Remember: they have been generated with 𝜃 = 2.5
    M, _ := gopula.LoadCSV("/tmp/data.csv", ',', false)
    if err != nil {
        fmt.Println(err)
        return
    }

    // Create a new instance of an Archimedean copula.
    // The initial value of theta does not matter in 
    // this case
    A := gopula.NewCopula("Clayton", 7.5)

    // Fit and see the results
    result := A.Fit(M)
    fmt.Println(result)
}

Let us check what the value of 𝜃 is infered:

$ go run inference.go
𝜃* 2.525        [2.481 2.570]
ℓ  10856.550

We can notice that 𝜃* is quite close to those we used to generate the data (the values in brackets are 95% confidence bounds and ℓ is the maximum likelihood reached).

Troubleshooting

The sampling may output out-of-bounds data (coordinates higher than 1). It occurs when the radial quantile function fails. As it uses a bisection search, it is probably due to a lack of function evaluations. You can increase it through the variable gopula.MaxFunEvals.

Next

I am currently implementing a kind of "margin" object aimed to represent an empirical cumulative distribution function (ecdf). In fact, most of the time, you have not uniform margins necessary to fit a copula so you need to transform the real margins with their ecdf (see wikipedia).

References

[1] McNeil, A. J., & Nešlehová, J. (2009). Multivariate Archimedean copulas, d-monotone functions and ℓ1-norm symmetric distributions. The Annals of Statistics, 37(5B), 3059-3097.

[2] Hofert, M., Mächler, M., & McNeil, A. J. (2012). Estimators for Archimedean copulas in high dimensions. arXiv preprint arXiv:1207.1708.

[3] Hofert, M., Mächler, M., & Mcneil, A. J. (2012). Likelihood inference for Archimedean copulas in high dimensions under known margins. Journal of Multivariate Analysis, 110, 133-150.

Documentation

Overview

Package gopula implements common Archimedean Copulas. It aims both to infer a copula from observations and to sample data from a given model.

Index

Constants

This section is empty.

Variables

View Source
var (
	// MaxDim is the maximum dimension for which
	// stirling number are pre-computed
	MaxDim = 12

	// Inf is a 'big' value (for optimizing bound purpose)
	Inf = 15.
)
View Source
var (
	// Eps is the machine floating-point precision
	Eps = 3.e-8
	// MaxFunEval is the maximum allowed number of iterations
	MaxFunEval = 500
)

Functions

func BFGS

func BFGS(f ObjectiveFunction, args interface{}, x0 float64) (float64, float64, int, error)

BFGS uses the gonum implementation of the BFGS algorithm to find the minimum of a function without constraints

func Bisection

func Bisection(f ObjectiveFunction, args interface{}, x1, x2, tol float64) (float64, error)

Bisection finds a root without derivatives

func BrentMinimizer

func BrentMinimizer(f ObjectiveFunction,
	args interface{},
	a,
	b,
	t float64) (float64, float64, int, error)

BrentMinimizer minimizes the function f according to the Brent's method

func BrentRootFinder

func BrentRootFinder(f ObjectiveFunction, args interface{}, x1, x2, tol float64) (float64, error)

BrentRootFinder finds a root of the the function f: x->f(x, args) between x1 and x2 with the Van Wijngaarden–Dekker–Brent method. The implementation directly comes from the book 'Numerical Recipes in C' (p. 361, 362)

func Hist

func Hist(values []float64, bins int, path string) error

Hist plots an histogram

func LoadCSV

func LoadCSV(path string, sep rune, header bool) (*mat.Dense, error)

LoadCSV loads data from a csvfile to a gonum matrix

func PrecomputeStirlingNumbers

func PrecomputeStirlingNumbers()

PrecomputeStirlingNumbers computes first and second kind stirling numbers until MaxDim

func SaveCSV

func SaveCSV(M *mat.Dense, path string, sep rune) error

SaveCSV saves a gonum matrix to a csv file

func Secant

func Secant(fun ObjectiveFunction, args interface{}, x1, x2, xacc float64) (float64, error)

Secant finds the root of a function func thought to lie between x1 and x2. The root is refined until its accuracy is ±xacc. The implementation directly comes from the book 'Numerical Recipes in C' (p. 361, 362)

Types

type AMH

type AMH struct{}

AMH defines the AMH copula

func (*AMH) Cdf

func (c *AMH) Cdf(vector []float64, theta float64) float64

Cdf computes the cumulative distribution function of the copula

func (*AMH) Family

func (c *AMH) Family() string

Family returns the name of the copula family

func (*AMH) LogPdf

func (c *AMH) LogPdf(vector []float64, theta float64) float64

LogPdf computes the logarithm of the density of the copula

func (*AMH) Pdf

func (c *AMH) Pdf(vector []float64, theta float64) float64

Pdf computes the density of the generated copula

func (*AMH) Psi

func (c *AMH) Psi(t float64, theta float64) float64

Psi is the generating function of the copula

func (*AMH) PsiD

func (c *AMH) PsiD(dim int, t float64, theta float64) float64

PsiD is the d-th derivative of Psi

func (*AMH) PsiInv

func (c *AMH) PsiInv(t float64, theta float64) float64

PsiInv is the inverse of the generating function of the copula

func (*AMH) ThetaBounds

func (c *AMH) ThetaBounds() (float64, float64)

ThetaBounds returns the range where the copula is well defined

type ArchimedeanCopula

type ArchimedeanCopula struct {
	// contains filtered or unexported fields
}

ArchimedeanCopula is a generic structure defining an archimedean copula

func NewCopula

func NewCopula(family string, theta float64) *ArchimedeanCopula

NewCopula returns a new copula according to the desired family

func (*ArchimedeanCopula) Cdf

func (arch *ArchimedeanCopula) Cdf(vector []float64) float64

Cdf computes the cumulative distribution function of the copula

func (*ArchimedeanCopula) ConfidenceBounds

func (arch *ArchimedeanCopula) ConfidenceBounds(M *mat.Dense, level float64) (float64, float64)

ConfidenceBounds compute the upper and lower confidence bounds at given level (level = 1-alpha = 0.95 in practice). The parameter theta must be the fitted value.

func (*ArchimedeanCopula) Family

func (arch *ArchimedeanCopula) Family() string

Family returns the name of the copula family

func (*ArchimedeanCopula) Fit

func (arch *ArchimedeanCopula) Fit(M *mat.Dense) *FitResult

Fit estimates the best theta parameter through maximum likelihood estimation according to the input observations

func (*ArchimedeanCopula) LogLikelihood

func (arch *ArchimedeanCopula) LogLikelihood(M *mat.Dense) float64

LogLikelihood computes the log-likelihood of a batch of observations given the underlying archimedean copula

func (*ArchimedeanCopula) LogPdf

func (arch *ArchimedeanCopula) LogPdf(vector []float64) float64

LogPdf computes the log density of the generated copula

func (*ArchimedeanCopula) Pdf

func (arch *ArchimedeanCopula) Pdf(vector []float64) float64

Pdf computes the density of the generated copula

func (*ArchimedeanCopula) RadialCdf

func (arch *ArchimedeanCopula) RadialCdf(x float64, dim int) float64

RadialCdf computes the cdf of the radial part of the ArchimeanCopula

func (*ArchimedeanCopula) RadialPpf

func (arch *ArchimedeanCopula) RadialPpf(p float64, dim int) float64

RadialPpf computes the quantile zp verifying P(X<zp) = p

func (*ArchimedeanCopula) Sample

func (arch *ArchimedeanCopula) Sample(size int, dim int) *mat.Dense

Sample generates random numbers according to the underlying copula

func (*ArchimedeanCopula) Theta

func (arch *ArchimedeanCopula) Theta() float64

Theta returns the current value of 𝜃

type ArchimedeanCopuler

type ArchimedeanCopuler interface {
	Family() string
	ThetaBounds() (float64, float64)
	Psi(t float64, theta float64) float64
	PsiInv(t float64, theta float64) float64
	PsiD(d int, t float64, theta float64) float64
	Cdf(vector []float64, theta float64) float64
	Pdf(vector []float64, theta float64) float64
	LogPdf(vector []float64, theta float64) float64
}

ArchimedeanCopuler is an interface to implement an archimedean copula

type Clayton

type Clayton struct{}

Clayton defines the clayton copula

func (*Clayton) Cdf

func (c *Clayton) Cdf(vector []float64, theta float64) float64

Cdf computes the cumulative distribution function of the copula

func (*Clayton) Family

func (c *Clayton) Family() string

Family returns the name of the copula family

func (*Clayton) LogPdf

func (c *Clayton) LogPdf(vector []float64, theta float64) float64

LogPdf computes the logarithm of the density of the copula

func (*Clayton) Pdf

func (c *Clayton) Pdf(vector []float64, theta float64) float64

Pdf computes the density of the generated copula

func (*Clayton) Psi

func (c *Clayton) Psi(t float64, theta float64) float64

Psi is the generating function of the copula

func (*Clayton) PsiD

func (c *Clayton) PsiD(d int, t float64, theta float64) float64

PsiD is the d-th derivative of Psi

func (*Clayton) PsiInv

func (c *Clayton) PsiInv(t float64, theta float64) float64

PsiInv is the inverse of the generating function of the copula

func (*Clayton) ThetaBounds

func (c *Clayton) ThetaBounds() (float64, float64)

ThetaBounds returns the range where the copula is well defined

type FitResult

type FitResult struct {
	// Theta is the estimated parameter
	Theta float64
	// LogLikelihhod is the correspond log-likelihood (the maximum)
	LogLikelihood float64
	// UpperBound is the 95% upper confidence bound
	UpperBound float64
	// LowerBound is the 95% upper confidence bound
	LowerBound float64
	// Evals is the number of function evaluations
	Evals int
	// Message describes whether the fit has suceeded
	Message string
}

FitResult is a basic structure detailing the output of the fit

func (*FitResult) String

func (fr *FitResult) String() string

type Frank

type Frank struct{}

Frank defines the Frank copula

func (*Frank) Cdf

func (c *Frank) Cdf(vector []float64, theta float64) float64

Cdf computes the cumulative distribution function of the copula

func (*Frank) Family

func (c *Frank) Family() string

Family returns the name of the copula family

func (*Frank) LogPdf

func (c *Frank) LogPdf(vector []float64, theta float64) float64

LogPdf computes the logarithm of the density of the copula

func (*Frank) Pdf

func (c *Frank) Pdf(vector []float64, theta float64) float64

Pdf computes the density of the generated copula

func (*Frank) Psi

func (c *Frank) Psi(t float64, theta float64) float64

Psi is the generating function of the copula

func (*Frank) PsiD

func (c *Frank) PsiD(dim int, t float64, theta float64) float64

PsiD is the d-th derivative of Psi

func (*Frank) PsiInv

func (c *Frank) PsiInv(t float64, theta float64) float64

PsiInv is the inverse of the generating function of the copula

func (*Frank) ThetaBounds

func (c *Frank) ThetaBounds() (float64, float64)

ThetaBounds returns the range where the copula is well defined

type Gumbel

type Gumbel struct{}

Gumbel defines the Gumbel copula

func (*Gumbel) Cdf

func (c *Gumbel) Cdf(vector []float64, theta float64) float64

Cdf computes the cumulative distribution function of the copula

func (*Gumbel) Family

func (c *Gumbel) Family() string

Family returns the name of the copula family

func (*Gumbel) LogPdf

func (c *Gumbel) LogPdf(vector []float64, theta float64) float64

LogPdf computes the logarithm of the density of the copula

func (*Gumbel) Pdf

func (c *Gumbel) Pdf(vector []float64, theta float64) float64

Pdf computes the density of the generated copula

func (*Gumbel) Psi

func (c *Gumbel) Psi(t float64, theta float64) float64

Psi is the generating function of the copula

func (*Gumbel) PsiD

func (c *Gumbel) PsiD(dim int, t float64, theta float64) float64

PsiD is the d-th derivative of Psi

func (*Gumbel) PsiInv

func (c *Gumbel) PsiInv(t float64, theta float64) float64

PsiInv is the inverse of the generating function of the copula

func (*Gumbel) ThetaBounds

func (c *Gumbel) ThetaBounds() (float64, float64)

ThetaBounds returns the range where the copula is well defined

type Joe

type Joe struct{}

Joe defines the Joe copula

func (*Joe) Cdf

func (c *Joe) Cdf(vector []float64, theta float64) float64

Cdf computes the cumulative distribution function of the copula

func (*Joe) Family

func (c *Joe) Family() string

Family returns the name of the copula family

func (*Joe) LogPdf

func (c *Joe) LogPdf(vector []float64, theta float64) float64

LogPdf computes the logarithm of the density of the copula

func (*Joe) Pdf

func (c *Joe) Pdf(vector []float64, theta float64) float64

Pdf computes the density of the generated copula

func (*Joe) Psi

func (c *Joe) Psi(t float64, theta float64) float64

Psi is the generating function of the copula

func (*Joe) PsiD

func (c *Joe) PsiD(d int, t float64, theta float64) float64

PsiD is the d-th derivative of Psi

func (*Joe) PsiInv

func (c *Joe) PsiInv(t float64, theta float64) float64

PsiInv is the inverse of the generating function of the copula

func (*Joe) ThetaBounds

func (c *Joe) ThetaBounds() (float64, float64)

ThetaBounds returns the range where the copula is well defined

type ObjectiveFunction

type ObjectiveFunction func(x float64, args interface{}) float64

ObjectiveFunction defines a function to minimize

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL