tel

v2.3.5
Published: Dec 23, 2023 License: MIT Imports: 48 Imported by: 25

README

= Telemetry.V2 Otel Protocol

A framework which aims to ease all telemetry concerns: `Logs`, `Traces` and `Metrics`.

V2 adopts the OpenTelemetry specification for all logging directions.
This means that all logging propagators use the `OTEL` protocol.

Tel uses `zap.Logger` as the heart of the system.
That is why it passes all zap functions through.

== Motto

Only context, all logs.

Decrease external dependencies as much as possible.

== Features

.All-In-One
The library establishes a connection via the GRPC OTLP protocol with `opentelemetry-collector-contrib` (the official OTEL server) and sends `logs`, `traces` and `metrics`.
The collector in turn distributes them to `Loki`, `Tempo` and `Prometheus`, or to any other services you prefer that the collector supports.
Furthermore, we have prepared working `dashboards` in the `./__grafana` folder, built for our `middlewares` for the most popular servers and clients.

.Logs
Our goal is to support the https://grafana.com/docs/loki/latest/logql/log_queries/#logfmt[logfmt] format for viewing in `Loki`,
via the simple `zap` library interface.

By the way, you can enrich the log with attributes which are written only when they are really needed:

[source,go]
----
// create a copy of ctx which we enrich with log attributes
ctx := tel.Global().Copy().Ctx()

// pass ctx through the controller->store->root layers and enrich it with information
err := func(ctx context.Context) error {
	tel.FromCtx(ctx).PutAttr(extract...)
	return fmt.Errorf("some error")
}(ctx)

// write the log message only when it is really needed,
// with all attributes put via ctx in ALL earlier layers.
// No need to look through previous info/debug messages:
// all needed information is in one message with all the attributes you already added,
// which are written only when you actually call `Error()`, `Info()`, `Debug()` and so on
//
// for example: only when you got an error
if err != nil {
	tel.FromCtx(ctx).Error("error happened", tel.Error(err))
}

----

.Trace
The library simplifies creating trace `Spans`.

You can also send log messages to the trace as `trace events`:

[source,go]
----
	span, ctx := tel.StartSpanFromContext(req.Context(), "my trace")
	defer span.End()

	tel.FromCtx(ctx).Info("this message will be saved both in LogInstance and trace",
		// and this will come to the trace as attribute
		tel.Int("code", errCode))
----

.Metrics
Simplifies working with metrics.

[source,go]
----
	m := tel.Global().Meter("github.com/MyRepo/MyLib/myInstrumentation")

	requestLatency, err := m.SyncFloat64().Histogram("demo_client.request_latency",
		instrument.WithDescription("The latency of requests processed"))
	if err != nil {
		t.Fatal("metric load error", tel.Error(err))
	}

	...
	start := time.Now()
	...

	// elapsed time in microseconds
	elapsed := float64(time.Since(start).Microseconds())

	requestLatency.Record(ctx, elapsed,
		attribute.String("userID", "e64916d9-bfd0-4f79-8ee3-847f2d034d20"),
		attribute.Int("orderID", 1),
	)
----

.Middleware

* Recovery flow
* Instantiate new copy of `tel` for further handler
* Basic metrics with respective dashboard for grafana
* Trace propagation
** client part - send (inject) current trace span to the server
** server part - read (extract) the trace and create a new child span (or an entirely new trace if no trace info was provided or the info was not properly wrapped via the propagator protocol of the OTEL specification)

== Logging stack

Logging data is exported via `OTEL's` GRPC protocol. `tel` is designed to pass it through the https://github.com/open-telemetry/opentelemetry-collector[open-telemetry collector], which should route log data to any desired log receivers.

Keep in mind that the collector has a plugin version, https://github.com/open-telemetry/opentelemetry-collector-contrib[collector contrib]. It is a gateway adapter to numerous protocols which do not yet support `OTEL`, for example Grafana Loki.

For instance, you can use `opentelemetry-collector-contrib` as the `tel` receiver and route logging data to `Grafana Loki`, trace data to `Grafana Tempo` and metric data to `Prometheus + Grafana ;)`

=== Grafana references feature

==== loki to tempo

`tel` puts a `traceID` field with the actual trace ID into the logs.
All our middlewares should do that; otherwise the developer should do it manually.

Just call `UpdateTraceFields` before writing any logs:

[source,go]
----
tel.UpdateTraceFields(ctx)
----

For Grafana to understand this field, set up `derivedFields` for the Loki data source:

[source,yaml]
----
  - name: Loki
    type: loki
    url: http://loki:3100
    uid: loki
    jsonData:
      derivedFields:
        - datasourceUid: tempo
          matcherRegex: "traceID=(\\w+)"
          name: trace
          url: '$${__value.raw}'
----
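
To check what `matcherRegex` captures, the same pattern can be tried against a sample line. This is a small sketch using Go's `regexp` package; the log line is made up for illustration, and Grafana evaluates the pattern with its own regex engine, so treat this only as a quick sanity check.

```go
package main

import (
	"fmt"
	"regexp"
)

// traceIDRe is the same pattern as matcherRegex above, without YAML escaping.
var traceIDRe = regexp.MustCompile(`traceID=(\w+)`)

// extractTraceID returns the captured trace ID, or "" when the line has none.
func extractTraceID(line string) string {
	m := traceIDRe.FindStringSubmatch(line)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	// Hypothetical logfmt line; the field values are invented for this example.
	line := `level=error msg="error happened" traceID=4bf92f3577b34da6a3ce929d0e0e4736 service_name=demo`
	fmt.Println(extractTraceID(line))
}
```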

==== tempo to loki

We match `tempo` with `loki` by the `service_name` label.
All logs should contain the traceID under any key, as well as `service_name`.
In Grafana, the Tempo data source should be configured with `tracesToLogs`.

==== prometheus to loki
[source,yaml]
----
  - name: Tempo
    type: tempo
    access: proxy
    orgId: 1
    url: http://tempo:3200
    basicAuth: false
    isDefault: false
    version: 1
    editable: false
    apiVersion: 1
    uid: tempo
    jsonData:
      nodeGraph:
        enabled: true
      tracesToLogs:
        datasourceUid: loki
        filterBySpanID: false
        filterByTraceID: true
        mapTagNamesEnabled: false
        tags:
          - service_name
----

== Install

[source,bash]
----
go get github.com/tel-io/tel/v2@latest
----

=== collector

The OTEL collector configuration (labels) is part of the setup: if you do not set it up properly, you will not see the expected results.

[source,yaml]
----
include::docker/otel-collector-config.yaml[]
----

== Features

* `OTEL` logs implementation

== Env

.OTEL_SERVICE_NAME
service name

`type`: string

.NAMESPACE
project namespace

`type`: string

.DEPLOY_ENVIRONMENT
ENUM: dev, stage, prod

`type`: string

.LOG_LEVEL
log level, default: `info`

`type`: string

NOTE: valid values: debug, info, warn, error, dpanic, panic, fatal


.LOG_ENCODE
valid options: `console`, `json` or `none`

`none` disables printing to the console (only OTEL or critical errors)

.DEBUG
for IsDebug() function

`type`: bool


.MONITOR_ENABLE
default: `true`

.MONITOR_ADDR
address on which the `health` and `prometheus` endpoints listen

NOTE: the address format is described in the `net.Listen` documentation

.OTEL_ENABLE
default: `true`

.OTEL_COLLECTOR_GRPC_ADDR
Address of the otel collector server (GRPC protocol)

.OTEL_EXPORTER_WITH_INSECURE
With insecure ...

.OTEL_ENABLE_COMPRESSION
default: `true`

Enables gzip compression for grpc connections

.OTEL_METRIC_PERIODIC_INTERVAL_SEC
default: "15"

Interval, in seconds, at which metrics are gathered

.OTEL_COLLECTOR_TLS_SERVER_NAME
Server name used to check the DNS name in the certificate presented by the server.

Disables `OTEL_EXPORTER_WITH_INSECURE` if set

.LOGGING_OTEL_CLIENT
default: `false`

requires `OTEL_ENABLE` = true

Injects a logger adapter into the otel library related to the grpc client, to get log information about this transport

.LOGGING_OTEL_PROCESSOR
default: `false`

requires `OTEL_ENABLE` = true

Injects a logger adapter into the otel processor library, to get log information about the collector's behaviour

.LOGS_ENABLE_RETRY
default: `false`

Enable retrying to send logs to collector.

.LOGS_SYNC_INTERVAL
default: `1s`

Limits how often logs with level.Error trigger a flush.

Example: `1s` allows at most 1 flush per second when the logs contain level.Error entries.

.LOGS_MAX_MESSAGE_SIZE
default: `256`

Limit message size. If limit is exceeded, message is truncated.

.LOGS_MAX_MESSAGES_PER_SECOND
default: `100`

Limit rate of messages per second. If limit is exceeded, warning is logged and messages are dropped. Value 0 disables limit.

.LOGS_MAX_LEVEL_MESSAGES_PER_SECOND
default: ``

The same as `LOGS_MAX_MESSAGES_PER_SECOND`, but lets you configure the limit per level.

Value format: `<level1>=<n>,<level2>=<n>`. Example: `LOGS_MAX_LEVEL_MESSAGES_PER_SECOND="error=0,info=100"`
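
The value format can be parsed in a few lines. This is a sketch of one possible parser for the documented `<level>=<n>` pairs; `parseLevelLimits` is a hypothetical helper and the library's own parsing and validation may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseLevelLimits parses "<level1>=<n>,<level2>=<n>" into a map.
// An empty string yields an empty map, meaning no per-level limits.
func parseLevelLimits(s string) (map[string]int, error) {
	limits := make(map[string]int)
	if strings.TrimSpace(s) == "" {
		return limits, nil
	}
	for _, pair := range strings.Split(s, ",") {
		level, raw, found := strings.Cut(strings.TrimSpace(pair), "=")
		if !found || level == "" {
			return nil, fmt.Errorf("invalid pair %q, want <level>=<n>", pair)
		}
		n, err := strconv.Atoi(raw)
		if err != nil || n < 0 {
			return nil, fmt.Errorf("invalid limit in %q", pair)
		}
		limits[level] = n
	}
	return limits, nil
}

func main() {
	limits, err := parseLevelLimits("error=0,info=100")
	fmt.Println(limits, err)
}
```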

.TRACES_ENABLE_RETRY
default: `false`

Enable retrying to send traces to collector.

.TRACES_SAMPLER
default: `statustraceidratio:0.1`

Sets the sampling strategy. The options are:

* never
* always
* traceidratio:<float64>
* statustraceidratio:<float64>

where <float64> is required and must be a valid floating point number from 0.0 to 1.0
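
The option syntax above is easy to validate up front. The sketch below shows one way to parse it; `samplerSpec` and `parseSampler` are hypothetical names, and the code only checks the syntax, it does not build an OTEL sampler.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// samplerSpec is a hypothetical parsed form of a TRACES_SAMPLER value.
type samplerSpec struct {
	Kind  string
	Ratio float64 // only meaningful for the *ratio kinds
}

// parseSampler accepts "never", "always", "traceidratio:<float64>"
// and "statustraceidratio:<float64>" with a ratio between 0.0 and 1.0.
func parseSampler(s string) (samplerSpec, error) {
	kind, arg, hasArg := strings.Cut(s, ":")
	switch kind {
	case "never", "always":
		if hasArg {
			return samplerSpec{}, fmt.Errorf("%s takes no argument", kind)
		}
		return samplerSpec{Kind: kind}, nil
	case "traceidratio", "statustraceidratio":
		if !hasArg {
			return samplerSpec{}, fmt.Errorf("%s requires a ratio", kind)
		}
		ratio, err := strconv.ParseFloat(arg, 64)
		// The negated range check also rejects NaN.
		if err != nil || !(ratio >= 0 && ratio <= 1) {
			return samplerSpec{}, fmt.Errorf("invalid ratio %q, want 0.0 to 1.0", arg)
		}
		return samplerSpec{Kind: kind, Ratio: ratio}, nil
	default:
		return samplerSpec{}, fmt.Errorf("unknown sampler %q", s)
	}
}

func main() {
	spec, err := parseSampler("statustraceidratio:0.1")
	fmt.Printf("%+v %v\n", spec, err)
}
```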

.TRACES_ENABLE_SPAN_TRACK_LOG_MESSAGE
default: `false`

Enable adding all log messages to active span as event.

.TRACES_ENABLE_SPAN_TRACK_LOG_FIELDS
default: `true`

Enable adding all log fields to active span as attributes.

.TRACES_CARDINALITY_DETECTOR_ENABLE
default: `true`

Enable cardinality check for span names.

.TRACES_CARDINALITY_DETECTOR_MAX_CARDINALITY
default: `0`

Limit cardinality of span's attributes. Not used, so default value is 0.

.TRACES_CARDINALITY_DETECTOR_MAX_INSTRUMENTS
default: `500`

Limit the number of unique span names.

.TRACES_CARDINALITY_DETECTOR_DIAGNOSTIC_INTERVAL
default: `10m`

Enable diagnostic loop that checks for cardinality violations and logs a warning.

You can disable it by setting the value to 0.

.METRICS_ENABLE_RETRY
default: `false`

Enable retrying to send metrics to collector.

.METRICS_CARDINALITY_DETECTOR_ENABLE
default: `true`

Enable cardinality check for metrics' labels.

.METRICS_CARDINALITY_DETECTOR_MAX_CARDINALITY
default: `100`

Limits the cardinality of a metric's labels. If the limit is exceeded, the metric is ignored, but previously registered metrics keep working as before.
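
One way to read this rule: each metric keeps a set of label combinations it has already seen, known combinations always pass, and new ones pass only while the set is below the limit. The sketch below implements that interpretation; it is an assumption about the behaviour, and `cardinalityDetector`, `labelKey` and `accept` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// cardinalityDetector tracks distinct label sets per metric name.
type cardinalityDetector struct {
	limit int
	seen  map[string]map[string]struct{} // metric name -> set of label-set keys
}

func newCardinalityDetector(limit int) *cardinalityDetector {
	return &cardinalityDetector{limit: limit, seen: make(map[string]map[string]struct{})}
}

// labelKey builds an order-independent key for a label set.
func labelKey(labels map[string]string) string {
	parts := make([]string, 0, len(labels))
	for k, v := range labels {
		parts = append(parts, k+"="+v)
	}
	sort.Strings(parts)
	return strings.Join(parts, ",")
}

// accept reports whether a measurement should be recorded.
// Known label sets always pass; a new one passes only while the metric is under the limit.
func (d *cardinalityDetector) accept(metric string, labels map[string]string) bool {
	sets := d.seen[metric]
	if sets == nil {
		sets = make(map[string]struct{})
		d.seen[metric] = sets
	}
	key := labelKey(labels)
	if _, ok := sets[key]; ok {
		return true
	}
	if len(sets) >= d.limit {
		return false
	}
	sets[key] = struct{}{}
	return true
}

func main() {
	d := newCardinalityDetector(2)
	fmt.Println(d.accept("latency", map[string]string{"userID": "a"}))
	fmt.Println(d.accept("latency", map[string]string{"userID": "b"}))
	fmt.Println(d.accept("latency", map[string]string{"userID": "c"})) // over the limit
	fmt.Println(d.accept("latency", map[string]string{"userID": "a"})) // already known
}
```

This is why unbounded values such as user IDs make poor metric labels: every new value consumes one slot of the limit.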

.METRICS_CARDINALITY_DETECTOR_MAX_INSTRUMENTS
default: `500`

Limit the number of unique metric names (without labels. only name).

.METRICS_CARDINALITY_DETECTOR_DIAGNOSTIC_INTERVAL
default: `10m`

Enable diagnostic loop that checks for cardinality violations and logs a warning.

You can disable it by setting the value to 0.

.OTEL_COLLECTOR_TLS_CA_CERT
TLS CA certificate body

.OTEL_COLLECTOR_TLS_CLIENT_CERT
TLS client certificate

.OTEL_COLLECTOR_TLS_CLIENT_KEY
TLS client key

.OTEL_RESOURCE_ATTRIBUTES
This optional variable is handled by the open-telemetry SDK.
Separator is semicolon.
Use it to add extra resource attributes; very convenient!


== ToDo

* [ ] Expose health check to specific metric
* [ ] Duplicate trace messages for root - ztrace.New just add to chain tree

== Usage

Take a look in the `example/demo` folder.

Documentation

Overview

Package tel represents the Telemetry service. We support context as the source of Telemetry, to gracefully support middleware. We do not pass a reference to Telemetry, for better handling of different log instances.

Index

Constants

const (
	ServiceNameKey       = attribute.Key("service")
	ServiceInstanceIDKey = attribute.Key("service_instance_id")
)
const DisableLog = "none"

Variables

var (
	ErrNoTLS    = errors.New("no tls configuration")
	ErrCaAppend = errors.New("append certs from pem")
)
var (
	Any        = zap.Any
	Binary     = zap.Binary
	ByteString = zap.ByteString
	Bool       = zap.Bool
	Duration   = zap.Duration
	Float32    = zap.Float32
	Float64    = zap.Float64
	Int        = zap.Int
	Int64      = zap.Int64
	Int32      = zap.Int32
	Int16      = zap.Int16
	Int8       = zap.Int8
	String     = zap.String
	Time       = zap.Time
	Uint       = zap.Uint
	Uint64     = zap.Uint64
	Uint32     = zap.Uint32
	Uint16     = zap.Uint16
	Uint8      = zap.Uint8
	Uintptr    = zap.Uintptr
	Error      = zap.Error
)
var (
	Strings = zap.Strings
	Ints    = zap.Ints
)
var DefaultHistogramBoundaries = []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10}

DefaultHistogramBoundaries have been copied from prometheus.DefBuckets.

Note we anticipate the use of a high-precision histogram sketch as the standard histogram aggregator for OTLP export. (https://github.com/open-telemetry/opentelemetry-specification/issues/982).
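
To see how a recorded value maps onto these boundaries, a binary search is enough. The sketch below assumes upper-inclusive buckets, as in Prometheus `le` buckets; `boundaries` repeats the values above and `bucketIndex` is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"sort"
)

// boundaries repeats DefaultHistogramBoundaries for a self-contained example.
var boundaries = []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10}

// bucketIndex returns the index of the first boundary >= v,
// or len(bounds) when v falls into the overflow bucket above the last boundary.
func bucketIndex(bounds []float64, v float64) int {
	return sort.SearchFloat64s(bounds, v)
}

func main() {
	for _, v := range []float64{0.003, 0.3, 42} {
		fmt.Println(v, "->", bucketIndex(boundaries, v))
	}
}
```

Values far above the last boundary all land in the overflow bucket, which is the case WithHistogram and HistogramOpt address by registering custom buckets per metric.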

var (
	GenServiceName = defaultServiceFmt
)

Functions

func CreateRes

func CreateRes(ctx context.Context, l Config) *resource.Resource

func SetGlobal

func SetGlobal(t Telemetry)

WARN: NON THREAD SAFE

func SetLogOutput

func SetLogOutput(log *Telemetry) *bytes.Buffer

SetLogOutput is a debug function which duplicates log input into a bytes.Buffer

func StartSpanFromContext

func StartSpanFromContext(ctx context.Context, name string, opts ...trace.SpanStartOption) (
	trace.Span, context.Context)

StartSpanFromContext starts a telemetry span which creates a new trace or continues an existing one. To continue a trace gracefully, ctx should contain both the span and the tele instance

func UpdateTraceFields

func UpdateTraceFields(ctx context.Context)

UpdateTraceFields updates the tracing fields; calling it during session start is a good way to do so. @prefix - used to split different inter-service calls: kafka, grpc, db, etc.

func WithContext

func WithContext(ctx context.Context, l Telemetry) context.Context

func WrapContext

func WrapContext(ctx context.Context, l *Telemetry) context.Context

Types

type Config

type Config struct {
	Service     string `env:"OTEL_SERVICE_NAME"`
	Namespace   string `env:"NAMESPACE" envDefault:"default"`
	Environment string `env:"DEPLOY_ENVIRONMENT" envDefault:"dev"`
	Version     string `env:"VERSION" envDefault:"dev"`
	LogLevel    string `env:"LOG_LEVEL" envDefault:"info"`
	// Valid values are "json", "console" or "none"
	LogEncode string `env:"LOG_ENCODE" envDefault:"json"`
	Debug     bool   `env:"DEBUG" envDefault:"false"`

	MonitorConfig
	OtelConfig
}

func DefaultConfig

func DefaultConfig() Config

func DefaultDebugConfig

func DefaultDebugConfig() Config

func GetConfigFromEnv

func GetConfigFromEnv() Config

GetConfigFromEnv uses DefaultConfig and overwrites only the variables present in env

func (*Config) Level

func (c *Config) Level() zapcore.Level

type HistogramOpt added in v2.2.0

type HistogramOpt struct {
	MetricName string
	Bucket     []float64
}

HistogramOpt represents the histogram bucket configuration for a specific metric

type Logger

type Logger interface {
	Check(lvl zapcore.Level, msg string) *zapcore.CheckedEntry

	Debug(msg string, fields ...zap.Field)
	Info(msg string, fields ...zap.Field)
	Warn(msg string, fields ...zap.Field)
	Error(msg string, fields ...zap.Field)
	Panic(msg string, fields ...zap.Field)
	Fatal(msg string, fields ...zap.Field)

	Sync() error

	Core() zapcore.Core
}

type MonitorConfig

type MonitorConfig struct {
	Enable      bool   `env:"MONITOR_ENABLE" envDefault:"true"`
	MonitorAddr string `env:"MONITOR_ADDR" envDefault:"0.0.0.0:8011"`
	// contains filtered or unexported fields
}

type Option

type Option interface {
	// contains filtered or unexported methods
}

Option interface used for setting optional config properties.

func WithHealthCheckers

func WithHealthCheckers(c ...health.Checker) Option

WithHealthCheckers provides checkers to the monitoring system for checking the health status of the service

func WithHistogram added in v2.2.0

func WithHistogram(list ...HistogramOpt) Option

WithHistogram registers metrics with a custom bucket list

func WithMonitorEnable

func WithMonitorEnable(enable bool) Option

WithMonitorEnable enables monitoring

func WithMonitoringAddr

func WithMonitoringAddr(addr string) Option

WithMonitoringAddr overwrites the monitoring addr

func WithNamespace

func WithNamespace(ns string) Option

WithNamespace sets the service namespace

func WithServiceName

func WithServiceName(name string) Option

WithServiceName sets the service name

func WithTraceSampler added in v2.2.3

func WithTraceSampler(sampler sdktrace.Sampler) Option

WithTraceSampler allows using your own sampling strategy for traces

type OtelConfig

type OtelConfig struct {
	Enable bool `env:"OTEL_ENABLE" envDefault:"true"`

	// OtelAddr address where grpc open-telemetry exporter serve
	Addr string `env:"OTEL_COLLECTOR_GRPC_ADDR" envDefault:"127.0.0.1:4317"`
	// WithInsecure controls whether a client verifies the server's
	// certificate chain and host name. If InsecureSkipVerify is true, crypto/tls
	// accepts any certificate presented by the server and any host name in that
	// certificate. In this mode, TLS is susceptible to machine-in-the-middle
	// attacks unless custom verification is used. This should be used only for
	// testing or in combination with VerifyConnection or VerifyPeerCertificate.
	WithInsecure bool `env:"OTEL_EXPORTER_WITH_INSECURE" envDefault:"true"`

	// WithCompression enables gzip compression for all connections: logs, traces, metrics
	WithCompression bool `env:"OTEL_ENABLE_COMPRESSION" envDefault:"true"`

	MetricsPeriodicIntervalSec int `env:"OTEL_METRIC_PERIODIC_INTERVAL_SEC" envDefault:"15"`

	// ServerName is used to verify the hostname on the returned
	// certificates unless InsecureSkipVerify is given. It is also included
	// in the client's handshake to support virtual hosting unless it is
	// an IP address.
	// Disable WithInsecure option if set
	ServerName string `env:"OTEL_COLLECTOR_TLS_SERVER_NAME"`

	Logs struct {
		// OtelClient is logger of otel clients
		OtelClient bool `env:"LOGGING_OTEL_CLIENT"`

		// OtelProcessor is logger of otel processor
		OtelProcessor bool `env:"LOGGING_OTEL_PROCESSOR"`

		EnableRetry               bool          `env:"LOGS_ENABLE_RETRY" envDefault:"false"`
		SyncInterval              time.Duration `env:"LOGS_SYNC_INTERVAL" envDefault:"1s"`
		MaxMessageSize            int           `env:"LOGS_MAX_MESSAGE_SIZE" envDefault:"256"`
		MaxMessagesPerSecond      int           `env:"LOGS_MAX_MESSAGES_PER_SECOND" envDefault:"100"`
		MaxLevelMessagesPerSecond string        `env:"LOGS_MAX_LEVEL_MESSAGES_PER_SECOND" envDefault:""`
	}

	Traces tracesConfig

	Metrics struct {
		EnableRetry         bool `env:"METRICS_ENABLE_RETRY" envDefault:"false"`
		CardinalityDetector struct {
			Enable             bool          `env:"METRICS_CARDINALITY_DETECTOR_ENABLE" envDefault:"true"`
			MaxCardinality     int           `env:"METRICS_CARDINALITY_DETECTOR_MAX_CARDINALITY" envDefault:"100"`
			MaxInstruments     int           `env:"METRICS_CARDINALITY_DETECTOR_MAX_INSTRUMENTS" envDefault:"500"`
			DiagnosticInterval time.Duration `env:"METRICS_CARDINALITY_DETECTOR_DIAGNOSTIC_INTERVAL" envDefault:"10m"`
		}
	}

	// Raw parses a public/private key pair from a pair of
	// PEM encoded data. On successful return, Certificate.Leaf will be nil because
	// the parsed form of the certificate is not retained.
	Raw struct {
		CA   []byte `env:"OTEL_COLLECTOR_TLS_CA_CERT"`
		Cert []byte `env:"OTEL_COLLECTOR_TLS_CLIENT_CERT"`
		Key  []byte `env:"OTEL_COLLECTOR_TLS_CLIENT_KEY"`
	}
	// contains filtered or unexported fields
}

TODO: Review overlapping options (WithInsecure, WithCompression, etc).
TODO: Add TEL_ prefix to avoid env conflicts

func (*OtelConfig) IsTLS added in v2.1.2

func (c *OtelConfig) IsTLS() bool

type Telemetry

type Telemetry struct {
	*zap.Logger
	// contains filtered or unexported fields
}

func FromCtx

func FromCtx(ctx context.Context) *Telemetry

FromCtx retrieves the tel object from ctx

func Global

func Global() Telemetry

func New

func New(ctx context.Context, cfg Config, options ...Option) (Telemetry, func())

New creates a telemetry instance

func NewNull

func NewNull() Telemetry

func NewSimple

func NewSimple(cfg Config) Telemetry

NewSimple creates a simple logger without OTEL propagation

func (Telemetry) Copy

func (t Telemetry) Copy() Telemetry

Copy copies the receiver instance and gives us a more convenient way to use pipelines

func (Telemetry) Ctx

func (t Telemetry) Ctx() context.Context

Ctx initiates a new ctx with the Telemetry and the span instance, if one occurred

func (Telemetry) IsDebug

func (t Telemetry) IsDebug() bool

IsDebug reports whether ENV DEBUG was true

func (Telemetry) LogLevel

func (t Telemetry) LogLevel() zapcore.Level

LogLevel safely parses the log level; in case of an error it returns InfoLevel

func (Telemetry) Meter

func (t Telemetry) Meter(ins string, opts ...metric.MeterOption) metric.Meter

Meter creates a new metric instance which should be treated as new

func (Telemetry) MetricProvider

func (t Telemetry) MetricProvider() metric.MeterProvider

MetricProvider used in constructor creation

func (*Telemetry) Printf

func (t *Telemetry) Printf(msg string, items ...interface{})

Printf exposes the fx.Printer interface as debug output

func (*Telemetry) PutAttr

func (t *Telemetry) PutAttr(attr ...attribute.KeyValue) *Telemetry

PutAttr puts opentelemetry attributes. WARN: NON THREAD SAFE. Be careful using this method with tel.Global()

func (*Telemetry) PutFields

func (t *Telemetry) PutFields(fields ...zap.Field) *Telemetry

PutFields updates the current logger instance with new fields, which affect only the next write-log call of the current tele instance. Because it works by reference, it also affects the context; this approach is covered in Test_telemetry_With. WARN: NON THREAD SAFE. Be careful using this method with tel.Global()

func (*Telemetry) PutSpan added in v2.2.3

func (t *Telemetry) PutSpan(in trace.Span)

PutSpan ... WARN: NON THREAD SAFE. Be careful using this method with tel.Global()

func (Telemetry) Span added in v2.2.3

func (t Telemetry) Span() trace.Span

Span returns the last created span

func (*Telemetry) StartSpan

func (t *Telemetry) StartSpan(ctx context.Context, name string, opts ...trace.SpanStartOption) (trace.Span, context.Context)

StartSpan starts a new trace telemetry span. If ctx contains an embedded trace, it will continue the chain. Keep in mind that this function does not continue any trace, it only creates a new one; to continue a span use StartSpanFromContext. In addition, it registers a new root span in a new ctx instance.

It returns a context in which the telemetry with the span writer is embedded.

func (Telemetry) T

func (t Telemetry) T() trace.Tracer

T returns the tracer instance

func (Telemetry) Tracer

func (t Telemetry) Tracer(name string, opts ...trace.TracerOption) Telemetry

Tracer instantiates a tracer with the specific name and tel options. It returns a new Telemetry pointing to this one

func (Telemetry) TracerProvider

func (t Telemetry) TracerProvider() trace.TracerProvider

TracerProvider is used in constructor creation

func (Telemetry) WithContext

func (t Telemetry) WithContext(ctx context.Context) context.Context

WithContext puts a new copy of telemetry into the context

func (Telemetry) WithSpan

func (t Telemetry) WithSpan(s trace.Span) *Telemetry

WithSpan creates a span logger with which we can duplicate messages to both the tracer and the logger. Furthermore, it creates a new log instance with trace fields

Directories

Path Synopsis
logskd
TODO: Fix type: s/logskd/logsdk
pkg
sdk
