log

package v2.9.4

Published: Apr 8, 2024 License: Apache-2.0 Imports: 49 Imported by: 0

Documentation

Overview

Package log is Pachyderm's logger.

As a developer in the Pachyderm codebase, logging should be very easy. All context.Contexts available throughout Pachyderm are capable of logging a message associated with that request or subsystem.

HOW TO LOG

The API surface that we provide is simply:

Debug(ctx, message, [fields...])
Info(ctx, message, [fields...])
Error(ctx, message, [fields...])

The fields are zap.Field objects; 99% of the time the field you're looking for is in the zap package, but we have a few in this package as well. "go doc zap.Field" and "go doc log.Field" will list them all; there are about 100 of them.
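
For example, a typical call site might look like this (a minimal sketch; doWork and the field names are illustrative, not part of this package):

start := time.Now()
if err := doWork(ctx); err != nil {
    Error(ctx, "cannot do work", zap.Error(err))
    return err
}
Info(ctx, "work finished", zap.Duration("elapsed", time.Since(start)))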

LEVELS

Debug messages are interesting to you, the developer. Info messages are interesting to the team running Pachyderm day-to-day. If you want an operator or support to see a message, choose the Info level. Error messages are for actionable errors; if you can't handle a request and there is no way for an end user to see an error message, choose Error. (It's assumed that the operator or Pachyderm support will have to jump in whenever these show up. Misleading errors, especially errors that resolve themselves after a retry, are a big waste of customer time.)

Error messages that end users should see must be in the API response or UI, not in log messages, of course. End users should never have to read the logs.

STRUCTURED LOGGING

All logs are structured; prefer:

Debug(ctx, "did a thing", zap.String("thing", thing), zap.String("reason", "because"))

over a non-structured:

Debugf(ctx, "did a thing (%v) because %v", thing, "because")

Structured logging is nice because you don't have to think about how to nicely phrase the message to make the fields make sense; just stick as much data in there as you like. And, automation or people debugging can filter for relevant messages with tools like jlog <https://github.com/jrockway/json-logs> or jq.

It is conventional to name your fields with camelCase, not snake_case or whatever-this-is-case. When unpacking a compound object into individual fields, object.key1, object.key2 are acceptable. (Example: grpc.code, grpc.message in gRPC responses.)
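
For instance (the field values here are purely illustrative):

// camelCase for simple fields; dotted keys when unpacking a compound object.
Info(ctx, "finished RPC", zap.String("grpc.code", "OK"), zap.Duration("requestTime", elapsed))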

SPANS

A common pattern is logging how long a certain operation ran for, and whether or not it succeeded. Most uses are covered by:

func doSomething(ctx context.Context, to *Object) (retErr error) {
    defer log.Span(ctx, "doSomething", zap.Object("to", to))(log.Errorp(&retErr))
    do()
    things()
    and()
    return them()
}

This will log:

DEBUG doSomething; span starting {to: {my: object}}
DEBUG doSomething; span finished ok {span_duration: 100ms, to: {my: object}}

or

DEBUG doSomething; span starting {to: {my: object}}
DEBUG doSomething; span failed {span_duration: 100ms, error: "them failed", errorVerbose: "<stack trace>", to: {my: object}}

There is `SpanL` if you want to set a log level other than DEBUG. log.Span takes fields as well, if you want to annotate the span with an ID or other information to help you associate the "end" log with the "start" log. The fields will be logged for both the start and end events.

There are the fields `Errorp` (which marks the span as failed if the pointed-to error is non-nil when the span ends, useful with retErr as above), `ErrorpL` (to change the log level on failure), zap.Error (for an error that is already available when you call the callback), and `ErrorL` (to change the log level if that error is non-nil). (As an aside, `zap.Error` won't log anything if its argument is nil, so you can always include an error without checking for nil-ness first. Our wrappers also conform to that convention.)
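
For instance, because of that nil-safety, an error field can be attached unconditionally (err here is whatever error the surrounding code has in scope):

// zap.Error is a no-op when err is nil, so no nil check is needed here.
Info(ctx, "finished cleanup", zap.Error(err))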

There is also SpanContext, where the operation itself wants to add details to the span:

func doSomething(ctx context.Context) (retErr error) {
    ctx, end := log.SpanContext(ctx, "doSomething", zap.String("theThing", "coolThing"))
    defer end(log.ErrorpL(&retErr, log.LevelError))
    do(ctx)
    things(ctx)
    and(ctx)
    return them(ctx)
}

This will log:

DEBUG doSomething doSomething; span starting {theThing: coolThing}
DEBUG doSomething inside do {theThing: coolThing}
DEBUG doSomething inside things {theThing: coolThing}
INFO  doSomething them is having trouble {theThing: coolThing, howMuchTrouble: a lot}
ERROR doSomething doSomething; span failed {span_duration: 1ms, theThing: coolThing, error: "them failed"}

There is `SpanContextL` for controlling the level of the starting message.

Finally, there is `LogStep`, which used to be called middleware.LogStep. It takes a callback instead of requiring you to defer things with pointers.
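
A sketch of LogStep usage (migrateTable and the "table" field are illustrative):

if err := log.LogStep(ctx, "migrateTable", func(ctx context.Context) error {
    return migrateTable(ctx) // migrateTable is a hypothetical operation; use the ctx passed to the callback
}, zap.String("table", "commits")); err != nil {
    return err
}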

That is all you need to know to start logging things! The rest of the functionality is more advanced.

SAMPLING

Some subsystems can be pretty noisy. The loggers we provide are configured to rate-limit messages over a short period of time (on the order of 2 messages every 3 seconds). This lets you see the first few log messages of a noisy routine without filling up all available storage with copies of the same message.

Rate limiting is keyed on message and severity; one noisy message won't rate limit other, less frequent messages. Design your messages and log levels accordingly. For example, in the gRPC logging interceptor, we include the service name and method in the log message (in addition to fields), simply to ensure that one frequent gRPC call doesn't prevent other calls from being logged. Severity counts as well, so if you're writing a backoff routine, the first attempts might log at level DEBUG, but the last attempt can log at INFO. This ensures that even after a bunch of retries in a short period of time, the final outcome will still appear in the logs. (The backoff package does not yet do this.)
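
For example, a retry loop might escalate the level on its last attempt so the final outcome survives sampling (a sketch; connect and maxAttempts are illustrative):

for i := 0; i < maxAttempts; i++ {
    err := connect(ctx) // connect is a hypothetical operation
    if err == nil {
        break
    }
    if i == maxAttempts-1 {
        // The last attempt logs at INFO so it isn't sampled away with the DEBUG messages above.
        Info(ctx, "connect failed; giving up", zap.Error(err), RetryAttempt(i, maxAttempts))
    } else {
        Debug(ctx, "connect failed; will retry", zap.Error(err), RetryAttempt(i, maxAttempts))
    }
}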

At the end of the day, it's about 100ns per message to log to a rate-limited logger (including some non-rate-limited messages; see BenchmarkFieldsSampled), so don't worry about logging too much. (Normal messages take about 2µs to process and render; most of this time is spent providing the "caller" field.)

You should feel perfectly free to log things like:

for i := 0; i < 10000; i++ {
    DoThing(items, i)
    Debug(ctx, "did a thing to items", zap.Int("i", i), zap.Int("max", 10000))
}

If it proceeds quickly, there will only be a few log lines. If DoThing is unexpectedly slow sometimes, then those logs will magically appear. Just don't change the message or level every iteration.

GETTING A LOGGER

By the time you read this, all code should have been converted to transmit loggers through the context. If you run into some code that can't accept contexts, use `pctx.TODO()` if you can't fix the API today. TODO uses the global logger, but doesn't warn about the logger not being initialized, as Child() would do on an empty context. (For safety, there is no circumstance where a log function will ever panic or fail to log, but we do warn if the provided context doesn't have a logger inside of it. It's probably a bug, and the stack trace attached to the warning should let you track it down easily.)

Top-level applications need to initialize the global logger, from which background services and pctx.TODO() derive their loggers. We have three ways of doing this: InitPachdLogger, InitWorkerLogger, and InitPachctlLogger. The pachd logger is designed for compatibility with third-party logging systems, like Stackdriver, and should be the default for any server-side application. The worker logger is designed to emit only messages that can be parsed by `pachctl logs`, and is used only by the worker (the user code container, not the storage container). The pachctl logger is for command-line apps, and pretty-prints messages similar to those in this documentation. (It also adds color if the terminal is capable and it's not disabled by pachctl flags.) If you're writing a new CLI utility, you probably want the pachctl logger.

The root context in applications is `pctx.Background()`. Call it exactly once after Init(Pachd|Worker|Pachctl)Logger.
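
A sketch of CLI startup, following the calls as written above:

func main() {
    // Initialize the global logger exactly once, before creating the root context.
    log.InitPachctlLogger()
    ctx := pctx.Background()
    log.Info(ctx, "starting up")
    // The rest of the program threads ctx through everything it calls.
}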

In tests, use pctx.TestContext(t). In the future we plan to add other accouterments to contexts inside Pachyderm; use of this function will ensure that your tests are ready. The test context contains a logger scoped to that test; it is safe to run tests in parallel when each uses its own TestContext. Anything logged to that logger will appear in the test logs (go test -v ...) in a format similar to pachctl's logs. For logging inside tests, rather than in production code called by tests, just use `t.Log`.
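
A minimal test sketch (DoTheThing is a hypothetical function under test):

func TestSomething(t *testing.T) {
    t.Parallel() // safe: each TestContext is scoped to its own test
    ctx := pctx.TestContext(t)
    // Anything DoTheThing logs to ctx appears in this test's output (go test -v).
    if err := DoTheThing(ctx); err != nil {
        t.Fatal(err)
    }
    t.Log("use t.Log for the test's own commentary")
}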

Do not use `zap.L()` and `zap.S()` outside of this package. They are set up and work, but prefer the context propagation.

CONFIGURING LOGGING

Logging is configured by environment variables, independent of service environment initialization.

The knobs you can tweak are:

  • PACHYDERM_LOG_LEVEL={debug, info, error}. This controls the minimum severity of log messages that will be printed. We default to info, but it's configurable in the helm chart.

  • PACHYDERM_DEVELOPMENT_LOGGER=1. If set to "1" or "true", any warning generated by the logging system will cause a panic. It can't be turned on in the helm chart; these warnings should never reach production releases, but if they do, we just want to carry on and not crash.

  • PACHYDERM_DISABLE_LOG_SAMPLING=1. If set to "1" or "true", no log sampling will occur. This might be helpful if there are a lot of duplicate messages that you really need to see in order to debug. It can't be turned on in the helm chart.

If you mess up the syntax of any of these environment variables, a warning will be the first message printed to the logger after it starts up. These will also panic in development mode.

At startup, the values of these configuration items will be stored and propagated to all workers. That's why they have the PACHYDERM_ prefix, so as not to conflict with user code.

You can change the log level across the entire fleet of services at runtime with `pachctl debug log-level ...`.

MISCELLANEOUS

Some third-party packages expect a certain type of logger. We typically add adaptors to this package to convert a context into a logger for that subsystem.

See:

NewAmazonLogger(ctx)         # For the S3 client
NewLogrus(ctx)               # For logrus users, like dex and snowflake.
NewStdLogAt(ctx, level)      # For "log" users, like the net/http server.
AddLoggerToEtcdServer(ctx, server)        # To add a logger to an in-memory etcd.
AddLoggerToHttpServer(ctx, name, server)  # To add a logger to incoming requests in a net/http server.

Do not use a NewLogrus or a NewStdLogAt as the primary logger in any code you're writing. They are only for adapting to the needs of third-party packages.
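
For example, a net/http server might be wired up like this (a sketch; mux and the server name are illustrative):

srv := &http.Server{
    Addr:    ":8080",
    Handler: mux, // mux is an illustrative http.Handler
    // ErrorLog is the standard library's *log.Logger, backed by ctx at level ERROR.
    ErrorLog: log.NewStdLogAt(ctx, log.ErrorLevel),
}
// Make (*http.Request).Context() loggable for incoming requests.
log.AddLoggerToHTTPServer(ctx, "httpServer", srv)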

EXTENDING THE LOGGING PACKAGE

Sometimes the loggers available might not be quite what you need. Add the utility function you need to this package. To test it, there are private methods in testing.go that create a logger for testing the logging system. They create a pachd-style logger, that logs to an object you can iterate over to extract the parsed messages.

Because logging is widespread throughout the codebase, this package should not have any dependencies inside Pachyderm. For example, there is a function in here that generates UUIDs, but it can't depend on internal/uuid, because internal/uuid depends on internal/backoff, which depends on this package. It should always be safe to add logging no matter where you are in the codebase.

Index

Constants

const (
	// EnvLogLevel is the name of the log level environment variable.  It's read by worker_rc.go
	// to propagate our value to any workers the PPS master creates.
	EnvLogLevel           = "PACHYDERM_LOG_LEVEL"
	EnvDevelopmentLogger  = "PACHYDERM_DEVELOPMENT_LOGGER"
	EnvDisableLogSampling = "PACHYDERM_DISABLE_LOG_SAMPLING"
)

Variables

var WorkerLogConfig = StartupLogConfig{}

WorkerLogConfig is the configuration that worker_rc.go reads to propagate the startup state of the logging subsystem to new workers. Some of this state can change as pachd runs (the log level), so we capture a view at app startup and only propagate that view to the workers.

Functions

func AddLogger

func AddLogger(ctx context.Context) context.Context

AddLogger is used by the pctx package to create empty contexts. It should not be used outside of the pctx package.

func AddLoggerToEtcdServer

func AddLoggerToEtcdServer(ctx context.Context, config *embed.Config) zap.AtomicLevel

AddLoggerToEtcdServer adds the context's logger to an in-memory etcd server.

func AddLoggerToHTTPServer

func AddLoggerToHTTPServer(rctx context.Context, name string, s *http.Server)

AddLoggerToHTTPServer mutates the provided *http.Server so that (*http.Request).Context() returns a context that can be logged to.

TODO(jonathan): Move this to pctx.

func ChildLogger

func ChildLogger(ctx context.Context, name string, opts ...LogOption) context.Context

ChildLogger is used by the pctx package to create child contexts. It should not be used outside of the pctx package.

func CombineLogger

func CombineLogger(ctx, loggerCtx context.Context) context.Context

CombineLogger extracts a logger from loggerCtx, and associates the logger with ctx. It's only public until grpc interceptors move into this package.

func DPanic added in v2.8.0

func DPanic(ctx context.Context, msg string, fields ...Field)

DPanic is a message that panics in development, but is only logged in production.

func Debug

func Debug(ctx context.Context, msg string, fields ...Field)

Debug logs a message, with fields, at level DEBUG. Level debug is appropriate for messages that are interesting to Pachyderm developers, but not operators or users; or logs that are generated by internal action at a rate of more than 1 message per second.

Most tracing and background operations are DEBUG logs.

func Error

func Error(ctx context.Context, msg string, fields ...Field)

Error logs a message, with fields, at level ERROR. Level error is appropriate for messages that indicate a malfunction that an operator of Pachyderm can repair.

func Exit

func Exit(ctx context.Context, msg string, fields ...Field)

Exit logs a message, with fields, at level FATAL and then exits with status 1. Level fatal is only appropriate for use in interactive scripts.

func GetEtcdClientConfig

func GetEtcdClientConfig(ctx context.Context) clientv3.Config

GetEtcdClientConfig returns an etcd client configuration object with the logger set to this context's logger.

func HealthCheckLogger

func HealthCheckLogger(ctx context.Context) context.Context

HealthCheckLogger returns a logger to log health checks. If the gRPC interceptor moves to this package, it shouldn't need to be public anymore.

func Info

func Info(ctx context.Context, msg string, fields ...Field)

Info logs a message, with fields, at level INFO. Level info is appropriate for messages that are interesting to operators of Pachyderm (like information about applying configuration changes), or that represent user action (like incoming RPC requests).

Most warnings and user-caused errors are INFO logs.

func InitBatchLogger added in v2.8.0

func InitBatchLogger(logFile string) func(err error)

InitBatchLogger creates a new logger for command-line tools that need to retain their logs on error. If the returned callback is called with no error, then the log file is deleted. If it's called with an error, the path to the log is printed, the log file is retained, and log.Exit is called to log the error (and ensure everything is flushed). If logFile is non-empty, then the log file is always retained.
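
A sketch of the intended call pattern (runTool and ctx are illustrative):

done := log.InitBatchLogger("") // empty logFile: the temporary log is retained only on error
err := runTool(ctx)             // runTool is the hypothetical body of the tool
// A nil error deletes the log file; a non-nil error prints the log path, keeps
// the file, and calls log.Exit to log the error and exit.
done(err)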

func InitPachctlLogger

func InitPachctlLogger()

InitPachctlLogger creates a new logger for command-line tools like pachctl.

func InitPachdLogger

func InitPachdLogger()

InitPachdLogger creates a new global zap logger suitable for use in pachd/enterprise server/etc.

func InitWorkerLogger

func InitWorkerLogger()

InitWorkerLogger creates a new global zap logger suitable for use in the worker.

func LogStep

func LogStep(ctx context.Context, name string, cb func(ctx context.Context) error, fields ...Field) (retErr error)

LogStep logs how long it takes to perform an operation.

This used to be miscutil.LogStep. New code prefers Span().

func NewAmazonLogger

func NewAmazonLogger(ctx context.Context) aws.Logger

NewAmazonLogger returns an aws.Logger.

func NewBenchLogger

func NewBenchLogger(sample bool) (context.Context, *atomic.Int64)

NewBenchLogger returns a logger suitable for benchmarks and an *atomic.Int64 containing the number of bytes written to the logger.

func NewKindLogger added in v2.9.0

func NewKindLogger(ctx context.Context) kindlog.Logger

func NewLogrus

func NewLogrus(ctx context.Context) *logrus.Logger

NewLogrus returns a *logrus.Logger that logs to the provided context (or a context passed with each logged entry, if available).

func NewPGX added in v2.7.0

func NewPGX(name string) pgx.Logger

NewPGX returns a new logger for pgx.

func NewResettableLevelAt

func NewResettableLevelAt(l zapcore.Level) resettableLevel

NewResettableLevelAt creates a new resettable level set to the provided level, and set to revert to that level if changed with SetLevelFor.

func NewStdLogAt

func NewStdLogAt(ctx context.Context, lvl Level) *log.Logger

NewStdLogAt returns a *log.Logger (from the standard library "log" package) that logs to the provided context, at the provided level.

func RequestID added in v2.8.0

func RequestID(ctx context.Context) string

RequestID returns the RequestID associated with this HTTP request. This is added by the middleware above.

func SetGRPCLogLevel

func SetGRPCLogLevel(l zapcore.Level)

SetGRPCLogLevel changes the grpc logger level. It is safe to call at any time from multiple goroutines.

Note: to see any messages, the overall log level has to be lower than the grpc log level. (For example, if you SetLevel to ERROR and then SetGRPCLogLevel to DEBUG, you will only see ERROR logs from GRPC. But, if you SetLogLevel(DEBUG) and SetGRPCLogLevel(DEBUG), you will see DEBUG grpc logs.)

Note: SetGRPCLogLevel takes a zapcore.Level instead of an internal/log.Level. That's because gRPC has more log levels than Pachyderm.

func SetLevel

func SetLevel(l Level)

SetLevel changes the global logger level. It is safe to call at any time from multiple goroutines.

func Test

Test returns a new Context appropriate for use in tests. Unless you are for some reason reliant upon the global logger, use pctx.TestContext(t).

func TestParallel

func TestParallel(ctx context.Context, t testing.TB, opts ...zaptest.LoggerOption) context.Context

TestParallel returns a new Context appropriate for use in parallel tests, at the cost of not logging messages sent to the global logger. This function is only public so that pctx.TestContext(t) can call it, and you should use that instead.

func TestWithCapture

func TestWithCapture(t testing.TB, opts ...zap.Option) (context.Context, *history)

TestWithCapture returns a new Context appropriate for testing the logging system itself; in addition to a context, you also get a capture object that retains all logged messages for analysis. The underlying logger is not rate-limited as a production logger is. Note: the logged messages are not printed to the test log.

Most logging functionality that needs tests should be in this package; this function is only public for the use of pctx.

func WatchDroppedLogs

func WatchDroppedLogs(ctx context.Context, d time.Duration)

WatchDroppedLogs periodically prints a report on how many logs were dropped.

func WriterAt

func WriterAt(ctx context.Context, lvl Level) io.WriteCloser

WriterAt creates a new io.Writer that logs each line written to it as a log message at the provided level.
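
For example, to surface a subprocess's output as log lines (a sketch; the command is illustrative):

w := log.WriterAt(ctx, log.InfoLevel)
defer w.Close() // close the writer when the subprocess is done with it
cmd := exec.CommandContext(ctx, "make", "all")
cmd.Stdout, cmd.Stderr = w, w
if err := cmd.Run(); err != nil {
    log.Error(ctx, "make failed", zap.Error(err))
}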

Types

type EndSpanFunc

type EndSpanFunc = func(fields ...Field)

EndSpanFunc is a function that ends a span.

func Span

func Span(ctx context.Context, event string, fields ...Field) EndSpanFunc

Span starts a new span at level debug. See SpanContextL for details.

func SpanContext

func SpanContext(rctx context.Context, event string, fields ...Field) (context.Context, EndSpanFunc)

SpanContext starts a new span at level debug. See SpanContextL for details.

func SpanContextL

func SpanContextL(rctx context.Context, event string, level Level, fields ...Field) (context.Context, EndSpanFunc)

SpanContextL starts a new span, returning a context with a logger scoped to that span and a function to end the span. A span is a simple way to mark the start and end of an operation. To end a span in failure, pass a log.ErrorL(err, level), log.Error(err), log.Errors([]error), etc. to the end function. If error is nil, the span is considered successful.

The returned EndSpanFunc must be called from a defer statement, due to how Go stacks work.

It is safe to use the returned context to log from multiple goroutines, and safe to log after the EndSpanFunc has been called. (But might be confusing!)

func SpanL

func SpanL(ctx context.Context, event string, level Level, fields ...Field) EndSpanFunc

SpanL starts a new span, returning a function that marks the end of the span. See SpanContextL for details.

type Field

type Field = zap.Field

Field is a typed log field. It is lazily evaluated at log marshaling time.

func ErrorL

func ErrorL(err error, level Level) Field

ErrorL is a Field that marks a span as failed, and logs the end of the span at the provided level.

func Errorp

func Errorp(err *error) Field

Errorp is a Field that marks a span as failed. See ErrorpL.

func ErrorpL

func ErrorpL(err *error, level Level) Field

ErrorpL is a Field that marks a span as failed if *err is a non-nil error at the time when the span ends. See ErrorL.

func Metadata

func Metadata(name string, md metadata.MD) Field

Metadata is a Field that logs the provided metadata (canonicalizing keys, collapsing single-element values to strings, and removing the Pachyderm auth token).

func OutgoingMetadata

func OutgoingMetadata(ctx context.Context) Field

OutgoingMetadata is a Field that logs the outgoing metadata associated with the provided context.

func Proto

func Proto(name string, msg proto.Message) Field

Proto is a Field containing a protocol buffer message.

func RetryAttempt

func RetryAttempt(i int, max int) Field

RetryAttempt is a Field that encodes the current retry (0-indexed) and the total number of retries. It's intended for a for loop where "i" is the loop iterator and "max" is the upper bound "i < max".
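
For instance (try and max are illustrative):

const max = 5
for i := 0; i < max; i++ {
    if err := try(ctx); err != nil { // try is a hypothetical operation
        log.Debug(ctx, "attempt failed", log.RetryAttempt(i, max), zap.Error(err))
        continue
    }
    break
}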

type Level

type Level int

Level is the level at which to generate Span logs.

const (
	DebugLevel Level = 1
	InfoLevel  Level = 2
	ErrorLevel Level = 3
)

type LevelChanger

type LevelChanger interface {
	// See resettableLevel.SetLevelFor.
	SetLevelFor(zapcore.Level, time.Duration, func(string, string))
}

LevelChanger is a log level that can be changed for a period of time.

var (

	// These log levels are for the src/server/debug package, which changes log levels at
	// runtime based on an RPC.  The Debug package takes special care to not change LogLevel
	// level to one we don't use; zapcore.Level has more levels than Pachyderm.
	LogLevel, GRPCLevel LevelChanger = logLevel, grpcLevel
)

type LogOption

type LogOption func(l *zap.Logger) *zap.Logger

LogOption is an option to add to a logger.

func WithFields

func WithFields(f ...Field) LogOption

WithFields adds fields to each message logged by the logger.

func WithOptions

func WithOptions(o ...zap.Option) LogOption

WithOptions adds zap options to the logger.

func WithServerID

func WithServerID() LogOption

WithServerID adds a server ID to each message logged by the logger.

func WithoutRatelimit

func WithoutRatelimit() LogOption

WithoutRatelimit removes any rate limiting on the child logger. Essential for things like user code output.

type StartupLogConfig

type StartupLogConfig struct {
	LogLevel           zapcore.Level
	DevelopmentLogger  bool
	DisableLogSampling bool
}

StartupLogConfig is the logging configuration at startup time.

func (StartupLogConfig) AsKubernetesEnvironment

func (c StartupLogConfig) AsKubernetesEnvironment() []v1.EnvVar

AsKubernetesEnvironment returns environment variables that should be set to propagate the logging config.
