dpagg

package
v1.1.2 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Feb 7, 2022 License: Apache-2.0 Imports: 9 Imported by: 6

Documentation

Overview

Package dpagg contains differentially private aggregation primitives.

Index

Constants

View Source
const (
	DefaultTreeHeight      = 4
	DefaultBranchingFactor = 16
)

Constants used for QuantileTrees.

Variables

View Source
var LargestRepresentableDelta = 1 - math.Pow(2, -53)

LargestRepresentableDelta is the largest delta we could support in 64 bit precision, approximately equal to one.

Functions

func ClampFloat64

func ClampFloat64(e, lower, upper float64) (float64, error)

ClampFloat64 clamps e within lower and upper, such that lower is returned if e < lower, and upper is returned if e > upper. Otherwise, e is returned.

func ClampInt64

func ClampInt64(e, lower, upper int64) (int64, error)

ClampInt64 clamps e within lower and upper. Returns lower if e < lower. Returns upper if e > upper.

Types

type BoundedMeanFloat64

type BoundedMeanFloat64 struct {

	// State variables
	NormalizedSum BoundedSumFloat64
	Count         Count
	// contains filtered or unexported fields
}

BoundedMeanFloat64 calculates a differentially private mean of a collection of float64 values.

The mean is computed by dividing a noisy sum of the entries by a noisy count of the entries. To improve utility, all entries are normalized by setting them to the difference between their actual value and the middle of the input range before summation. The original mean is recovered by adding the midpoint in a post-processing step. This idea is taken from Algorithm 2.4 of "Differential Privacy: From Theory to Practice", by Ninghui Li, Min Lyu, Dong Su and Weining Yang (section 2.5.5, page 28). In contrast to Algorithm 2.4, we do not return the midpoint if the noisy count is less or equal to 1. Instead, we set the noisy count to 1. Since this is a mere post-processing step, the DP bounds are preserved. Moreover, for small numbers of entries, this approach will return results that are closer to the actual mean in expectation.

BoundedMeanFloat64 supports privacy units that contribute to multiple partitions (via the MaxPartitionsContributed parameter) as well as contribute to the same partition multiple times (via the MaxContributionsPerPartition parameter), by scaling the added noise appropriately.

For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions.

Note: Do not use when your results may cause overflows for float64 values. This aggregation is not hardened for such applications yet.

Not thread-safe.

func NewBoundedMeanFloat64

func NewBoundedMeanFloat64(opt *BoundedMeanFloat64Options) *BoundedMeanFloat64

NewBoundedMeanFloat64 returns a new BoundedMeanFloat64.

func (*BoundedMeanFloat64) Add

func (bm *BoundedMeanFloat64) Add(e float64)

Add an entry to a BoundedMeanFloat64. It skips NaN entries and doesn't count them in the final result because introducing even a single NaN entry will result in a NaN mean regardless of other entries, which would break the indistinguishability property required for differential privacy.

func (*BoundedMeanFloat64) ComputeConfidenceInterval

func (bm *BoundedMeanFloat64) ComputeConfidenceInterval(alpha float64) (noise.ConfidenceInterval, error)

ComputeConfidenceInterval computes a confidence interval that contains the true mean with probability greater than or equal to 1 - alpha. The computation is based exclusively on noised data and the privacy parameters. Thus no privacy budget is consumed by this operation.

Result() needs to be called before ComputeConfidenceInterval, otherwise this will return an error.

func (*BoundedMeanFloat64) GobDecode

func (bm *BoundedMeanFloat64) GobDecode(data []byte) error

GobDecode decodes Count.

func (*BoundedMeanFloat64) GobEncode

func (bm *BoundedMeanFloat64) GobEncode() ([]byte, error)

GobEncode encodes Count.

func (*BoundedMeanFloat64) Merge

func (bm *BoundedMeanFloat64) Merge(bm2 *BoundedMeanFloat64)

Merge merges bm2 into bm (i.e., adds to bm all entries that were added to bm2). bm2 is consumed by this operation: bm2 may not be used after it is merged into bm.

func (*BoundedMeanFloat64) Result

func (bm *BoundedMeanFloat64) Result() float64

Result returns a differentially private estimate of the average of bounded elements added so far. The method can be called only once.

Note that the returned value is not an unbiased estimate of the raw bounded mean.

type BoundedMeanFloat64Options

type BoundedMeanFloat64Options struct {
	Epsilon                      float64 // Privacy parameter ε. Required.
	Delta                        float64 // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise.
	MaxPartitionsContributed     int64   // How many distinct partitions may a single user contribute to? Defaults to 1.
	MaxContributionsPerPartition int64   // How many times may a single user contribute to a single partition? Required.
	// Lower and Upper bounds for clamping. Default to 0; must be such that Lower < Upper.
	Lower, Upper float64
	Noise        noise.Noise // Type of noise used in BoundedMean. Defaults to Laplace noise.
}

BoundedMeanFloat64Options contains the options necessary to initialize a BoundedMeanFloat64.

type BoundedQuantiles

type BoundedQuantiles struct {
	Noise noise.Noise
	// contains filtered or unexported fields
}

BoundedQuantiles calculates a differentially private quantiles of a collection of float64 values using a quantile tree mechanism. See https://github.com/google/differential-privacy/blob/main/common_docs/Differentially_Private_Quantile_Trees.pdf.

It supports privacy units that contribute to multiple partitions (via the MaxPartitionsContributed parameter) as well as contribute to the same partition multiple times (via the MaxContributionsPerPartition parameter), by scaling the added noise appropriately.

For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions.

Note: Do not use when your results may cause overflows for float64 values. This aggregation is not hardened for such applications yet.

Not thread-safe.

func NewBoundedQuantiles

func NewBoundedQuantiles(opt *BoundedQuantilesOptions) *BoundedQuantiles

NewBoundedQuantiles returns a new BoundedQuantiles.

func (*BoundedQuantiles) Add

func (bq *BoundedQuantiles) Add(e float64)

Add adds an entry to BoundedQuantiles. It skips (ignores) NaN values because their contribution to the final result is not well defined.

func (*BoundedQuantiles) GobDecode

func (bq *BoundedQuantiles) GobDecode(data []byte) error

GobDecode decodes BoundedQuantiles.

func (*BoundedQuantiles) GobEncode

func (bq *BoundedQuantiles) GobEncode() ([]byte, error)

GobEncode encodes BoundedQuantiles.

func (*BoundedQuantiles) Merge

func (bq *BoundedQuantiles) Merge(bq2 *BoundedQuantiles)

Merge merges bq2 into bq (i.e., adds to bq all entries that were added to bq2). bq2 is consumed by this operation: bq2 may not be used after it is merged into bq.

func (*BoundedQuantiles) Result

func (bq *BoundedQuantiles) Result(rank float64) float64

Result calculates and returns a differentially private quantile of the values added. The specified rank must be between 0.0 and 1.0.

This function can be called multiple times to compute different quantiles. Privacy budget is paid only once, on its first invocation. Calling this method repeatedly for the same rank will return the same result. The results of repeated calls are guaranteed to be monotonically increasing in the sense that r_1 < r_2 implies that Result(r_1) <= Result(r_2).

Note that the returned values is not an unbiased estimate of the raw bounded quantile.

type BoundedQuantilesOptions

type BoundedQuantilesOptions struct {
	Epsilon                      float64 // Privacy parameter ε. Required.
	Delta                        float64 // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise.
	MaxPartitionsContributed     int64   // How many distinct partitions may a single privacy unit contribute to? Defaults to 1.
	MaxContributionsPerPartition int64   // How many times may a single user contribute to a single partition? Required.
	// Lower and Upper bounds for clamping. Default to 0; must be such that Lower < Upper.
	Lower, Upper float64
	Noise        noise.Noise // Type of noise used in BoundedSum. Defaults to Laplace noise.
	// It is not recommended to set TreeHeight and BranchingFactor since they require
	// implementation-specific insight to modify and they only apply to QuantileTree
	// algorithm, which might become obsolote if another algorithm is used.
	TreeHeight      int // Height of the QuantileTree. Defaults to defaultTreeHeight.
	BranchingFactor int // Number of children of every non-leaf node. Defaults to defaultBranchingFactor.
}

BoundedQuantilesOptions contains the options necessary to initialize a BoundedQuantiles.

type BoundedStandardDeviation added in v1.1.0

type BoundedStandardDeviation struct {
	// State variables
	Variance BoundedVariance
	// contains filtered or unexported fields
}

BoundedStandardDeviation calculates a differentially private standard deviation of a collection of float64 values.

The output will be clamped between 0 and (upper - lower).

The implementation simply computes the bounded variance and takes the square root, which is differentially private by the post-processing theorem. It relies on the fact that the bounded variance algorithm guarantees that the output is non-negative.

For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions.

Note: Do not use when your results may cause overflows for float64 values. This aggregation is not hardened for such applications yet.

Not thread-safe.

func NewBoundedStandardDeviation added in v1.1.0

func NewBoundedStandardDeviation(opt *BoundedStandardDeviationOptions) *BoundedStandardDeviation

NewBoundedStandardDeviation returns a new BoundedStandardDeviation.

func (*BoundedStandardDeviation) Add added in v1.1.0

func (bstdv *BoundedStandardDeviation) Add(e float64)

Add an entry to a BoundedStandardDeviation. It skips NaN entries and doesn't count them in the final result because introducing even a single NaN entry will result in a NaN standard deviation regardless of other entries, which would break the indistinguishability property required for differential privacy.

func (*BoundedStandardDeviation) GobDecode added in v1.1.0

func (bstdv *BoundedStandardDeviation) GobDecode(data []byte) error

GobDecode decodes BoundedStandardDeviation.

func (*BoundedStandardDeviation) GobEncode added in v1.1.0

func (bstdv *BoundedStandardDeviation) GobEncode() ([]byte, error)

GobEncode encodes BoundedStandardDeviation.

func (*BoundedStandardDeviation) Merge added in v1.1.0

func (bstdv *BoundedStandardDeviation) Merge(bstdv2 *BoundedStandardDeviation)

Merge merges bstdv2 into bstdv (i.e., adds to bstdv all entries that were added to bstdv2). bstdv2 is consumed by this operation: bstdv2 may not be used after it is merged into bstdv.

func (*BoundedStandardDeviation) Result added in v1.1.0

func (bstdv *BoundedStandardDeviation) Result() float64

Result returns a differentially private estimate of the standard deviation of bounded elements added so far. The method can be called only once.

Note that the returned value is not an unbiased estimate of the raw bounded standard deviation.

type BoundedStandardDeviationOptions added in v1.1.0

type BoundedStandardDeviationOptions struct {
	Epsilon                      float64 // Privacy parameter ε. Required.
	Delta                        float64 // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise.
	MaxPartitionsContributed     int64   // How many distinct partitions may a single user contribute to? Defaults to 1.
	MaxContributionsPerPartition int64   // How many times may a single user contribute to a single partition? Required.
	// Lower and Upper bounds for clamping. Default to 0; must be such that Lower < Upper.
	Lower, Upper float64
	Noise        noise.Noise // Type of noise used in BoundedStandardDeviation. Defaults to Laplace noise.
}

BoundedStandardDeviationOptions contains the options necessary to initialize a BoundedStandardDeviation.

type BoundedSumFloat64

type BoundedSumFloat64 struct {
	Noise noise.Noise
	// contains filtered or unexported fields
}

BoundedSumFloat64 calculates a differentially private sum of a collection of float64 values. It supports privacy units that contribute to multiple partitions (via the MaxPartitionsContributed parameter) by scaling the added noise appropriately. However, it assumes that for each BoundedSumFloat64 instance (partition), each privacy unit contributes at most one value. If a privacy unit contributes more, the contributions should be pre-aggregated before passing them to BoundedSumFloat64.

The provided differentially private sum is an unbiased estimate of the raw bounded sum meaning that its expected value is equal to the raw bounded sum.

For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions,

Not thread-safe.

func NewBoundedSumFloat64

func NewBoundedSumFloat64(opt *BoundedSumFloat64Options) *BoundedSumFloat64

NewBoundedSumFloat64 returns a new BoundedSumFloat64, whose sum is initialized at 0.

func (*BoundedSumFloat64) Add

func (bs *BoundedSumFloat64) Add(e float64)

Add adds a new summand to the BoundedSumFloat64. It ignores NaN summands because introducing even a single NaN summand will result in a NaN sum regardless of other summands, which would break the indistinguishability property required for differential privacy.

func (*BoundedSumFloat64) ComputeConfidenceInterval

func (bs *BoundedSumFloat64) ComputeConfidenceInterval(alpha float64) (noise.ConfidenceInterval, error)

ComputeConfidenceInterval computes a confidence interval that contains the true sum with a probability greater than or equal to 1 - alpha using the noised sum computed by Result(). The computation is based exclusively on the noised sum returned by Result(). Thus no privacy budget is consumed by this operation.

Result() needs to be called before ComputeConfidenceInterval, otherwise this will return an error.

See https://github.com/google/differential-privacy/tree/main/common_docs/confidence_intervals.md.

func (*BoundedSumFloat64) GobDecode

func (bs *BoundedSumFloat64) GobDecode(data []byte) error

GobDecode decodes BoundedSumInt64.

func (*BoundedSumFloat64) GobEncode

func (bs *BoundedSumFloat64) GobEncode() ([]byte, error)

GobEncode encodes BoundedSumInt64.

func (*BoundedSumFloat64) Merge

func (bs *BoundedSumFloat64) Merge(bs2 *BoundedSumFloat64)

Merge merges bs2 into bs (i.e., adds to bs all entries that were added to bs2). bs2 is consumed by this operation: bs2 may not be used after it is merged into bs.

func (*BoundedSumFloat64) Result

func (bs *BoundedSumFloat64) Result() float64

Result returns a differentially private estimate of the sum of bounded elements added so far. The method can be called only once.

The returned value is an unbiased estimate of the raw bounded sum.

The returned value may sometimes be outside the set of possible raw bounded sums, e.g., the differentially private bounded sum may be positive although neither the lower nor the upper bound are positive. This can be corrected by the caller of this method, e.g., by snapping the result to the closest value representing a bounded sum that is possible. Note that such post processing introduces bias to the result.

func (*BoundedSumFloat64) ThresholdedResult

func (bs *BoundedSumFloat64) ThresholdedResult(thresholdDela float64) *float64

ThresholdedResult is similar to Result() but applies thresholding to the result. So, if the result is less than the threshold specified by the noise, mechanism, it returns nil. Otherwise, it returns the result.

type BoundedSumFloat64Options

type BoundedSumFloat64Options struct {
	Epsilon                  float64 // Privacy parameter ε. Required.
	Delta                    float64 // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise.
	MaxPartitionsContributed int64   // How many distinct partitions may a single privacy unit contribute to? Defaults to 1.
	// Lower and Upper bounds for clamping. Default to 0; must be such that Lower < Upper.
	Lower, Upper float64
	Noise        noise.Noise // Type of noise used in BoundedSum. Defaults to Laplace noise.
	// contains filtered or unexported fields
}

BoundedSumFloat64Options contains the options necessary to initialize a BoundedSumFloat64.

type BoundedSumInt64

type BoundedSumInt64 struct {
	Noise noise.Noise
	// contains filtered or unexported fields
}

BoundedSumInt64 calculates a differentially private sum of a collection of int64 values. It supports privacy units that contribute to multiple partitions (via the MaxPartitionsContributed parameter) by scaling the added noise appropriately. However, it assumes that for each BoundedSumInt64 instance (partition), each privacy unit contributes at most one value. If a privacy unit contributes more, the contributions should be pre-aggregated before passing them to BoundedSumInt64.

The provided differentially private sum is an unbiased estimate of the raw bounded sum in the sense that its expected value is equal to the raw bounded sum.

For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions.

Note: Do not use when your results may cause overflows for int64 values. This aggregation is not hardened for such applications yet.

Not thread-safe.

func NewBoundedSumInt64

func NewBoundedSumInt64(opt *BoundedSumInt64Options) *BoundedSumInt64

NewBoundedSumInt64 returns a new BoundedSumInt64, whose sum is initialized at 0.

func (*BoundedSumInt64) Add

func (bs *BoundedSumInt64) Add(e int64)

Add adds a new summand to the BoundedSumInt64.

func (*BoundedSumInt64) ComputeConfidenceInterval

func (bs *BoundedSumInt64) ComputeConfidenceInterval(alpha float64) (noise.ConfidenceInterval, error)

ComputeConfidenceInterval computes a confidence interval with integer bounds that contains the true sum with a probability greater than or equal to 1 - alpha using the noised sum computed by Result(). The computation is based exclusively on the noised sum returned by Result(). Thus no privacy budget is consumed by this operation.

Result() needs to be called before ComputeConfidenceInterval, otherwise this will return an error.

See https://github.com/google/differential-privacy/tree/main/common_docs/confidence_intervals.md.

func (*BoundedSumInt64) GobDecode

func (bs *BoundedSumInt64) GobDecode(data []byte) error

GobDecode decodes BoundedSumInt64.

func (*BoundedSumInt64) GobEncode

func (bs *BoundedSumInt64) GobEncode() ([]byte, error)

GobEncode encodes BoundedSumInt64.

func (*BoundedSumInt64) Merge

func (bs *BoundedSumInt64) Merge(bs2 *BoundedSumInt64)

Merge merges bs2 into bs (i.e., adds to bs all entries that were added to bs2). bs2 is consumed by this operation: bs2 may not be used after it is merged into bs.

func (*BoundedSumInt64) Result

func (bs *BoundedSumInt64) Result() int64

Result returns a differentially private estimate of the sum of bounded elements added so far. The method can be called only once.

The returned value is an unbiased estimate of the raw bounded sum.

The returned value may sometimes be outside the set of possible raw bounded sums, e.g., the differentially private bounded sum may be positive although neither the lower nor the upper bound are positive. This can be corrected by the caller of this method, e.g., by snapping the result to the closest value representing a bounded sum that is possible. Note that such post processing introduces bias to the result.

func (*BoundedSumInt64) ThresholdedResult

func (bs *BoundedSumInt64) ThresholdedResult(thresholdDelta float64) *int64

ThresholdedResult is similar to Result() but applies thresholding to the result. So, if the result is less than the threshold specified by the parameters of BoundedSumInt64 as well as thresholdDelta, it returns nil. Otherwise, it returns the result.

Note that the nil results should not be published when the existence of a partition in the output depends on private data.

type BoundedSumInt64Options

type BoundedSumInt64Options struct {
	Epsilon                  float64 // Privacy parameter ε. Required.
	Delta                    float64 // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise.
	MaxPartitionsContributed int64   // How many distinct partitions may a single privacy unit contribute to? Defaults to 1.
	// Lower and Upper bounds for clamping. Default to 0; must be such that Lower < Upper.
	Lower, Upper int64
	Noise        noise.Noise // Type of noise used in BoundedSum. Defaults to Laplace noise.
	// contains filtered or unexported fields
}

BoundedSumInt64Options contains the options necessary to initialize a BoundedSumInt64.

type BoundedVariance added in v1.1.0

type BoundedVariance struct {

	// State variables
	NormalizedSumOfSquares BoundedSumFloat64
	NormalizedSum          BoundedSumFloat64
	Count                  Count
	// contains filtered or unexported fields
}

BoundedVariance calculates a differentially private variance of a collection of float64 values.

The output will be clamped between 0 and (upper - lower)². Since the result is guaranteed to be positive, this algorithm can be used to compute a differentially private standard deviation.

The algorithm is a variation of the algorithm for differentially private mean from "Differential Privacy: From Theory to Practice", section 2.5.5: https://books.google.com/books?id=WFttDQAAQBAJ&pg=PA24#v=onepage&q&f=false

BoundedVariance supports privacy units that contribute to multiple partitions (via the MaxPartitionsContributed parameter) as well as contribute to the same partition multiple times (via the MaxContributionsPerPartition parameter), by scaling the added noise appropriately.

For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions.

Note: Do not use when your results may cause overflows for float64 values. This aggregation is not hardened for such applications yet.

Not thread-safe.

func NewBoundedVariance added in v1.1.0

func NewBoundedVariance(opt *BoundedVarianceOptions) *BoundedVariance

NewBoundedVariance returns a new BoundedVariance.

func (*BoundedVariance) Add added in v1.1.0

func (bv *BoundedVariance) Add(e float64)

Add an entry to a BoundedVariance. It skips NaN entries and doesn't count them in the final result because introducing even a single NaN entry will result in a NaN variance regardless of other entries, which would break the indistinguishability property required for differential privacy.

func (*BoundedVariance) GobDecode added in v1.1.0

func (bv *BoundedVariance) GobDecode(data []byte) error

GobDecode decodes BoundedVariance.

func (*BoundedVariance) GobEncode added in v1.1.0

func (bv *BoundedVariance) GobEncode() ([]byte, error)

GobEncode encodes BoundedVariance.

func (*BoundedVariance) Merge added in v1.1.0

func (bv *BoundedVariance) Merge(bv2 *BoundedVariance)

Merge merges bv2 into bv (i.e., adds to bv all entries that were added to bv2). bv2 is consumed by this operation: bv2 may not be used after it is merged into bv.

func (*BoundedVariance) Result added in v1.1.0

func (bv *BoundedVariance) Result() float64

Result returns a differentially private estimate of the variance of bounded elements added so far. The method can be called only once.

Note that the returned value is not an unbiased estimate of the raw bounded variance.

type BoundedVarianceOptions added in v1.1.0

type BoundedVarianceOptions struct {
	Epsilon                      float64 // Privacy parameter ε. Required.
	Delta                        float64 // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise.
	MaxPartitionsContributed     int64   // How many distinct partitions may a single user contribute to? Defaults to 1.
	MaxContributionsPerPartition int64   // How many times may a single user contribute to a single partition? Required.
	// Lower and Upper bounds for clamping. Default to 0; must be such that Lower < Upper.
	Lower, Upper float64
	Noise        noise.Noise // Type of noise used in BoundedVariance. Defaults to Laplace noise.
}

BoundedVarianceOptions contains the options necessary to initialize a BoundedVariance.

type Count

type Count struct {
	Noise noise.Noise
	// contains filtered or unexported fields
}

Count calculates a differentially private count of a collection of values using the Laplace or Gaussian mechanism

It supports privacy units that contribute to multiple partitions (via the MaxPartitionsContributed parameter) by scaling the added noise appropriately. However, it does not support multiple contributions to a single partition from the same privacy unit. For that use case, BoundedSumInt64 should be used instead.

The provided differentially private count is an unbiased estimate of the raw count meaning that its expected value is equal to the raw count.

For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions.

Note: Do not use when your results may cause overflows for int64 values. This aggregation is not hardened for such applications yet.

Not thread-safe.

func NewCount

func NewCount(opt *CountOptions) *Count

NewCount returns a new Count, initialized at 0.

func (*Count) ComputeConfidenceInterval

func (c *Count) ComputeConfidenceInterval(alpha float64) (noise.ConfidenceInterval, error)

ComputeConfidenceInterval computes a confidence interval with integer bounds that contains the true count with a probability greater than or equal to 1 - alpha using the noised count computed by Result(). The computation is based exclusively on the noised count returned by Result(). Thus no privacy budget is consumed by this operation.

Result() needs to be called before ComputeConfidenceInterval, otherwise this will return an error.

See https://github.com/google/differential-privacy/tree/main/common_docs/confidence_intervals.md.

func (*Count) GobDecode

func (c *Count) GobDecode(data []byte) error

GobDecode decodes Count.

func (*Count) GobEncode

func (c *Count) GobEncode() ([]byte, error)

GobEncode encodes Count.

func (*Count) Increment

func (c *Count) Increment()

Increment increments the count by one.

func (*Count) IncrementBy

func (c *Count) IncrementBy(count int64)

IncrementBy increments the count by the given value. Note that this shouldn't be used to count multiple contributions to a single partition from the same privacy unit.

func (*Count) Merge

func (c *Count) Merge(c2 *Count)

Merge merges c2 into c (i.e., adds to c all entries that were added to c2). c2 is consumed by this operation: it may not be used after it is merged into c.

func (*Count) Result

func (c *Count) Result() int64

Result returns a differentially private estimate of the current count. The method can be called only once.

The returned value is an unbiased estimate of the raw count.

The returned value may sometimes be negative. This can be corrected by setting negative results to 0. Note that such post processing introduces bias to the result.

func (*Count) ThresholdedResult

func (c *Count) ThresholdedResult(thresholdDelta float64) *int64

ThresholdedResult is similar to Result() but applies thresholding to the result. So, if the result is less than the threshold specified by the parameters of Count as well as thresholdDelta, it returns nil. Otherwise, it returns the result.

Note that the nil results should not be published when the existence of a partition in the output depends on private data.

type CountOptions

type CountOptions struct {
	Epsilon                  float64     // Privacy parameter ε. Required.
	Delta                    float64     // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise.
	MaxPartitionsContributed int64       // How many distinct partitions may a single privacy unit contribute to? Defaults to 1.
	Noise                    noise.Noise // Type of noise used. Defaults to Laplace noise.
	// contains filtered or unexported fields
}

CountOptions contains the options necessary to initialize a Count.

type PreAggSelectPartition

type PreAggSelectPartition struct {
	// contains filtered or unexported fields
}

PreAggSelectPartition is used to compute an (ε,δ)-differentially private decision of whether to materialize a partition.

Many differential privacy mechanisms work by performing an aggregation and adding noise. They achieve (ε_m,δ_m)-differential privacy under the assumption that partitions are chosen in advance. In other words, they assume that even if no data is associated with a partition, noise is added to the empty aggregation, and the noisy result is materialized. However, when only partitions containing data are materialized, such mechanisms fail to protect privacy for partitions containing data from a single privacy ID (e.g., user). To fix this, partitions with small numbers of privacy IDs must sometimes be dropped in order to maintain privacy. This process of partition selection incurs an additional (ε,δ) differential privacy budget resulting in a total differential privacy budget of (ε+ε_m,δ+δ_m) being used for the aggregation with partition selection.

Depending on the l0sensitivity, the PreAggSelectPartition uses one of two differentially private partition selection algorithms.

When l0sensitivity ≤ 3, the partition selection process is made (ε,δ) differentially private by applying the definition of differential privacy to the count of privacy IDs. Supposing l0Sensitivity bounds the number of partitions a privacy ID may contribute to, we define:

pε := ε/l0Sensitivity
pδ := δ/l0Sensitivity

to be the per-partition differential privacy losses incurred by the partition selection process. Letting n denote the number of privacy IDs in a partition, the probability of selecting a partition is given by the following recurrence relation:

keepPartitionProbability(n) = min(
        keepPartitionProbability(n-1) * exp(pε) + pδ,              (1)
        1 - exp(-pε) * (1-keepPartitionProbability(n-1)-pδ),       (2)
        1                                                   (3)
    )

with base case keepPartitionProbability(0) = 0. This formula is optimal in terms of maximizing the likelihood of selecting a partition under (ε,δ)-differential privacy, with the caveat that the input values for pε and pδ are lower bound approximations. For efficiency, we use a closed-form solution to this recurrence relation. See [Differentially private partition selection paper] https://arxiv.org/pdf/2006.03684.pdf for details on the underlying mathematics.

When l0sensitivity > 3, the partition selection process is made (ε,δ) differentially private by using the ThresholdedResult() of the Count primitive with Gaussian noise. Count computes a (ε,δ/2) differentially private count of the privacy IDs in a partition by adding Gaussian noise. Then, it computes a threshold T for which the probability that a (ε,δ/2) differentially private count of a single privacy ID can exceed T is δ/2. It keeps the partition iff differentially private count exceeds the threshold.

The reason two different algorithms for deciding whether to keep a partition is used is because the first algorithm ("magic partition selection") is optimal when l0sensitivity ≤ 3 but is outperformed by Gaussian-based thresholding when l0sensitivity > 3.

PreAggSelectPartition is a utility for maintaining the count of IDs in a single partition and then determining whether the partition should be materialized. Use Increment() to increment the count of IDs and ShouldKeepPartition() to decide if the partition should be materialized.

func NewPreAggSelectPartition

func NewPreAggSelectPartition(opt *PreAggSelectPartitionOptions) *PreAggSelectPartition

NewPreAggSelectPartition constructs a new PreAggSelectPartition from opt.

func (*PreAggSelectPartition) GetHardThreshold

func (s *PreAggSelectPartition) GetHardThreshold() int

GetHardThreshold returns a threshold k, where if there are at least k privacy units in a partition, we are guaranteed to keep that partition.

This is the conceptual equivalent of the post-aggregation threshold of the noise.Noise interface with the difference that here there is 0 probability of not keeping the partition if it has at least k privacy units, whereas with the post-aggregation threshold there is a non-zero probability (however small).

func (*PreAggSelectPartition) GobDecode

func (s *PreAggSelectPartition) GobDecode(data []byte) error

GobDecode decodes PreAggSelectPartition.

func (*PreAggSelectPartition) GobEncode

func (s *PreAggSelectPartition) GobEncode() ([]byte, error)

GobEncode encodes PreAggSelectPartition.

func (*PreAggSelectPartition) Increment

func (s *PreAggSelectPartition) Increment()

Increment increments the ids count by one. The caller must ensure this methods called at most once per privacy ID.

func (*PreAggSelectPartition) Merge

Merge merges s2 into s (i.e., add the idCount of s2 to s). This implicitly assumes that s and s2 act on distinct privacy IDs. s2 is consumed by this operation: s2 may not be used after it is merged into s.

Preconditions: s and s2 must have the same privacy parameters. In addition, ShouldKeepPartition() may not be called yet for either s or s2.

func (*PreAggSelectPartition) ShouldKeepPartition

func (s *PreAggSelectPartition) ShouldKeepPartition() bool

ShouldKeepPartition returns whether the partition should be materialized.

func (*PreAggSelectPartition) String

func (s *PreAggSelectPartition) String() string

type PreAggSelectPartitionOptions

type PreAggSelectPartitionOptions struct {
	// Epsilon and Delta specify the (ε,δ)-differential privacy budget used for
	// partition selection. Required.
	Epsilon float64
	Delta   float64
	// MaxPartitionsContributed is the number of distinct partitions a single
	// privacy unit can contribute to. Defaults to 1.
	MaxPartitionsContributed int64
}

PreAggSelectPartitionOptions is used to set the privacy parameters when constructing a PreAggSelectPartition.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL