probabilisticsamplerprocessor

package module
v0.99.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 22, 2024 License: Apache-2.0 Imports: 19 Imported by: 16

README

Probabilistic Sampling Processor

Status
Stability alpha: logs
beta: traces
Distributions core, contrib
Issues Open issues Closed issues
Code Owners @jpkrohling, @jmacd

The probabilistic sampler supports two types of sampling for traces:

  1. sampling.priority semantic convention as defined by OpenTracing
  2. Trace ID hashing

The sampling.priority semantic convention takes priority over trace ID hashing. As the name implies, trace ID hashing samples based on hash values determined by trace IDs. See Hashing for more information.

The following configuration options can be modified:

  • hash_seed (no default): An integer used to compute the hash algorithm. Note that all collectors for a given tier (e.g. behind the same load balancer) should have the same hash_seed.
  • sampling_percentage (default = 0): Percentage at which traces are sampled; >= 100 samples all traces

Examples:

processors:
  probabilistic_sampler:
    hash_seed: 22
    sampling_percentage: 15.3

The probabilistic sampler supports sampling logs according to their trace ID, or by a specific log record attribute.

The probabilistic sampler optionally may use a hash_seed to compute the hash of a log record. This sampler samples based on hash values determined by log records. See Hashing for more information.

The following configuration options can be modified:

  • hash_seed (no default, optional): An integer used to compute the hash algorithm. Note that all collectors for a given tier (e.g. behind the same load balancer) should have the same hash_seed.
  • sampling_percentage (required): Percentage at which logs are sampled; >= 100 samples all logs, 0 rejects all logs.
  • attribute_source (default = traceID, optional): defines where to look for the attribute in from_attribute. The allowed values are traceID or record.
  • from_attribute (default = null, optional): The optional name of a log record attribute used for sampling purposes, such as a unique log record ID. The value of the attribute is only used if the trace ID is absent or if attribute_source is set to record.
  • sampling_priority (default = null, optional): The optional name of a log record attribute used to set a different sampling priority from the sampling_percentage setting. 0 means to never sample the log record, and >= 100 means to always sample the log record.

Hashing

In order for hashing to work, all collectors for a given tier (e.g. behind the same load balancer) must have the same hash_seed. It is also possible to leverage a different hash_seed at different collector tiers to support additional sampling requirements. Please refer to config.go for the config spec.

Examples:

Sample 15% of the logs:

processors:
  probabilistic_sampler:
    sampling_percentage: 15

Sample logs according to their logID attribute:

processors:
  probabilistic_sampler:
    sampling_percentage: 15
    attribute_source: record # possible values: one of record or traceID
    from_attribute: logID # value is required if the source is not traceID

Sample logs according to the attribute priority:

processors:
  probabilistic_sampler:
    sampling_percentage: 15
    sampling_priority: priority

Refer to config.yaml for detailed examples on using the processor.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func NewFactory

func NewFactory() processor.Factory

NewFactory returns a new factory for the Probabilistic sampler processor.

Types

type AttributeSource added in v0.67.0

type AttributeSource string

type Config

type Config struct {

	// SamplingPercentage is the percentage rate at which traces or logs are going to be sampled. Defaults to zero, i.e.: no sample.
	// Values greater or equal 100 are treated as "sample all traces/logs".
	SamplingPercentage float32 `mapstructure:"sampling_percentage"`

	// HashSeed allows one to configure the hashing seed. This is important in scenarios where multiple layers of collectors
	// have different sampling rates: if they use the same seed all passing one layer may pass the other even if they have
	// different sampling rates, configuring different seeds avoids that.
	HashSeed uint32 `mapstructure:"hash_seed"`

	// AttributeSource (logs only) defines where to look for the attribute in from_attribute. The allowed values are
	// `traceID` or `record`. Default is `traceID`.
	AttributeSource `mapstructure:"attribute_source"`

	// FromAttribute (logs only) The optional name of a log record attribute used for sampling purposes, such as a
	// unique log record ID. The value of the attribute is only used if the trace ID is absent or if `attribute_source` is set to `record`.
	FromAttribute string `mapstructure:"from_attribute"`

	// SamplingPriority (logs only) enables using a log record attribute as the sampling priority of the log record.
	SamplingPriority string `mapstructure:"sampling_priority"`
}

Config has the configuration guiding the sampler processor.

func (*Config) Validate

func (cfg *Config) Validate() error

Validate checks if the processor configuration is valid

Directories

Path Synopsis
internal

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL