stats

package

v1.1.1 Latest Latest Go to latest Published: Feb 11, 2023 License: MIT Imports: 4 Imported by: 0

Details

Valid go.mod file

The Go module system was introduced in Go 1.11 and is the official dependency management solution for Go.
Redistributable license

Redistributable licenses place minimal restrictions on how software can be used, modified, and redistributed.
Tagged version

Modules with tagged versions give importers more predictable builds.
Stable version

When a project reaches major version v1 it is considered stable.
Learn more about best practices

Repository

github.com/aburdulescu/gbenchdiff

Links

Open Source Insights

Documentation ¶

Index ¶

Variables
type LocationHypothesis
type MannWhitneyUTestResult
- func MannWhitneyUTest(x1, x2 []float64, alt LocationHypothesis) (*MannWhitneyUTestResult, error)
type NormalDist
type UDist
- func (d UDist) CDF(U float64) float64

Constants ¶

This section is empty.

Variables ¶

View Source

var (
	ErrSamplesEqual      = errors.New("all samples are equal")
	ErrSampleSize        = errors.New("sample is too small")
	ErrZeroVariance      = errors.New("sample has zero variance")
	ErrMismatchedSamples = errors.New("samples have different lengths")
)

View Source

var MannWhitneyExactLimit = 50

MannWhitneyExactLimit gives the largest sample size for which the exact U distribution will be used for the Mann-Whitney U-test.

Using the exact distribution is necessary for small sample sizes because the distribution is highly irregular. However, computing the distribution for large sample sizes is both computationally expensive and unnecessary because it quickly approaches a normal approximation. Computing the distribution for two 50 value samples takes a few milliseconds on a 2014 laptop.

View Source

var MannWhitneyTiesExactLimit = 25

MannWhitneyTiesExactLimit gives the largest sample size for which the exact U distribution will be used for the Mann-Whitney U-test in the presence of ties.

Computing this distribution is more expensive than computing the distribution without ties, so this is set lower. Computing this distribution for two 25 value samples takes about ten milliseconds on a 2014 laptop.

View Source

var StdNormal = NormalDist{0, 1}

StdNormal is the standard normal distribution (Mu = 0, Sigma = 1)

Functions ¶

This section is empty.

Types ¶

type LocationHypothesis ¶

type LocationHypothesis int

A LocationHypothesis specifies the alternative hypothesis of a location test such as a t-test or a Mann-Whitney U-test. The default (zero) value is to test against the alternative hypothesis that they differ.

const (
	// LocationLess specifies the alternative hypothesis that the
	// location of the first sample is less than the second. This
	// is a one-tailed test.
	LocationLess LocationHypothesis = -1

	// LocationDiffers specifies the alternative hypothesis that
	// the locations of the two samples are not equal. This is a
	// two-tailed test.
	LocationDiffers LocationHypothesis = 0

	// LocationGreater specifies the alternative hypothesis that
	// the location of the first sample is greater than the
	// second. This is a one-tailed test.
	LocationGreater LocationHypothesis = 1
)

type MannWhitneyUTestResult ¶

type MannWhitneyUTestResult struct {
	// N1 and N2 are the sizes of the input samples.
	N1, N2 int

	// U is the value of the Mann-Whitney U statistic for this
	// test, generalized by counting ties as 0.5.
	//
	// Given the Cartesian product of the two samples, this is the
	// number of pairs in which the value from the first sample is
	// greater than the value of the second, plus 0.5 times the
	// number of pairs where the values from the two samples are
	// equal. Hence, U is always an integer multiple of 0.5 (it is
	// a whole integer if there are no ties) in the range [0, N1*N2].
	//
	// U statistics always come in pairs, depending on which
	// sample is "first". The mirror U for the other sample can be
	// calculated as N1*N2 - U.
	//
	// There are many equivalent statistics with slightly
	// different definitions. The Wilcoxon (1945) W statistic
	// (generalized for ties) is U + (N1(N1+1))/2. It is also
	// common to use 2U to eliminate the half steps and Smid
	// (1956) uses N1*N2 - 2U to additionally center the
	// distribution.
	U float64

	// AltHypothesis specifies the alternative hypothesis tested
	// by this test against the null hypothesis that there is no
	// difference in the locations of the samples.
	AltHypothesis LocationHypothesis

	// P is the p-value of the Mann-Whitney test for the given
	// null hypothesis.
	P float64
}

A MannWhitneyUTestResult is the result of a Mann-Whitney U-test.

func MannWhitneyUTest ¶

func MannWhitneyUTest(x1, x2 []float64, alt LocationHypothesis) (*MannWhitneyUTestResult, error)

MannWhitneyUTest performs a Mann-Whitney U-test [1,2] of the null hypothesis that two samples come from the same population against the alternative hypothesis that one sample tends to have larger or smaller values than the other.

This is similar to a t-test, but unlike the t-test, the Mann-Whitney U-test is non-parametric (it does not assume a normal distribution). It has very slightly lower efficiency than the t-test on normal distributions.

Computing the exact U distribution is expensive for large sample sizes, so this uses a normal approximation for sample sizes larger than MannWhitneyExactLimit if there are no ties or MannWhitneyTiesExactLimit if there are ties. This normal approximation uses both the tie correction and the continuity correction.

This can fail with ErrSampleSize if either sample is empty or ErrSamplesEqual if all sample values are equal.

This is also known as a Mann-Whitney-Wilcoxon test and is equivalent to the Wilcoxon rank-sum test, though the Wilcoxon rank-sum test differs in nomenclature.

[1] Mann, Henry B.; Whitney, Donald R. (1947). "On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other". Annals of Mathematical Statistics 18 (1): 50–60.

[2] Klotz, J. H. (1966). "The Wilcoxon, Ties, and the Computer". Journal of the American Statistical Association 61 (315): 772-787.

type NormalDist ¶

type NormalDist struct {
	Mu, Sigma float64
}

NormalDist is a normal (Gaussian) distribution with mean Mu and standard deviation Sigma.

func (NormalDist) Bounds ¶

func (n NormalDist) Bounds() (float64, float64)

func (NormalDist) CDF ¶

func (n NormalDist) CDF(x float64) float64

func (NormalDist) InvCDF ¶

func (n NormalDist) InvCDF(p float64) (x float64)

func (NormalDist) PDF ¶

func (n NormalDist) PDF(x float64) float64

func (NormalDist) Rand ¶

func (n NormalDist) Rand(r *rand.Rand) float64

type UDist ¶

type UDist struct {
	// T is the count of the number of ties at each rank in the
	// input distributions. T may be nil, in which case it is
	// assumed there are no ties (which is equivalent to an M+N
	// slice of 1s). It must be the case that Sum(T) == M+N.
	T []int

	N1, N2 int
}

A UDist is the discrete probability distribution of the Mann-Whitney U statistic for a pair of samples of sizes N1 and N2.

The details of computing this distribution with no ties can be found in Mann, Henry B.; Whitney, Donald R. (1947). "On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other". Annals of Mathematical Statistics 18 (1): 50–60. Computing this distribution in the presence of ties is described in Klotz, J. H. (1966). "The Wilcoxon, Ties, and the Computer". Journal of the American Statistical Association 61 (315): 772-787 and Cheung, Ying Kuen; Klotz, Jerome H. (1997). "The Mann Whitney Wilcoxon Distribution Using Linked Lists". Statistica Sinica 7: 805-813 (the former paper contains details that are glossed over in the latter paper but has mathematical typesetting issues, so it's easiest to get the context from the former paper and the details from the latter).

func (UDist) CDF ¶

func (d UDist) CDF(U float64) float64

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL