godeltaprof

package module
v0.1.7
Published: Jan 22, 2024 License: Apache-2.0 Imports: 5 Imported by: 1

README

godeltaprof

godeltaprof is an efficient delta profiler for memory, mutex, and block.

Why

In Go, allocation, mutex, and block profiles are cumulative. They only grow over time and show allocations that happened since the program started. Not only do the values grow, but the size of the profile itself grows as well. It can reach megabytes for long-running processes. These multi-megabyte profiles are called huge profiles in this document.

In many cases, it's more useful to see the difference between two points in time. The original runtime/pprof package supports this with what is called a delta profile. Requesting a delta profile requires passing a seconds argument in the pprof endpoint query.

go tool pprof http://localhost:6060/debug/pprof/heap?seconds=30

What this does:

  1. Dump profile p0
  2. Sleep
  3. Dump profile p1
  4. Decompress and parse protobuf p0
  5. Decompress and parse protobuf p1
  6. Subtract p0 from p1
  7. Serialize protobuf and compress the result

The resulting profile is usually much smaller (p0 may be megabytes, while the result is usually tens of kilobytes).
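The same delta request can also be made from Go code. The following is a minimal sketch, assuming the standard net/http/pprof handlers are registered on http.DefaultServeMux and served by a local test server; the helper names fetchDeltaHeap, isGzip, and runDemo are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
)

// fetchDeltaHeap requests a heap delta profile covering the given
// number of seconds and returns the raw response body.
func fetchDeltaHeap(baseURL string, seconds int) ([]byte, error) {
	url := fmt.Sprintf("%s/debug/pprof/heap?seconds=%d", baseURL, seconds)
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// isGzip reports whether data starts with the gzip magic bytes.
func isGzip(data []byte) bool {
	return len(data) >= 2 && data[0] == 0x1f && data[1] == 0x8b
}

// runDemo starts a throwaway server and fetches a one-second delta profile.
func runDemo() ([]byte, error) {
	srv := httptest.NewServer(http.DefaultServeMux)
	defer srv.Close()
	return fetchDeltaHeap(srv.URL, 1)
}

func main() {
	data, err := runDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("delta profile: %d bytes, gzip=%v\n", len(data), isGzip(data))
}
```

The response body is a compressed protobuf profile, the same artifact that go tool pprof downloads when given the URL above.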

There are a number of issues with this approach:

  1. The heap profile contains both allocation values and in-use values. In-use values are not cumulative, so they are corrupted by the subtraction. Note: this could be fixed if the runtime/pprof package used p0.ScaleN([]float64{-1,-1,0,0}) instead of p0.Scale(-1); that would subtract allocation values and zero out in-use values in p0.
  2. It requires dumping two profiles.
  3. It produces a lot of allocations putting pressure on GC.
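The first issue is easiest to see with numbers. The following is a small sketch, assuming a single heap sample whose four values are ordered alloc_objects, alloc_space, inuse_objects, inuse_space. The type heapValues and the helpers scaleN, add, naiveDelta, and fixedDelta are hypothetical stand-ins for the profile arithmetic, not the pprof API.

```go
package main

import "fmt"

// heapValues holds one sample's values in the assumed order:
// alloc_objects, alloc_space, inuse_objects, inuse_space.
type heapValues [4]int64

// scaleN multiplies each value by its own ratio.
func scaleN(v heapValues, ratios [4]float64) heapValues {
	var out heapValues
	for i := range v {
		out[i] = int64(float64(v[i]) * ratios[i])
	}
	return out
}

// add sums two samples value by value, like a profile merge.
func add(a, b heapValues) heapValues {
	var out heapValues
	for i := range a {
		out[i] = a[i] + b[i]
	}
	return out
}

// naiveDelta negates every value of p0 before merging, so the
// non-cumulative in-use values are subtracted as well.
func naiveDelta(p0, p1 heapValues) heapValues {
	return add(p1, scaleN(p0, [4]float64{-1, -1, -1, -1}))
}

// fixedDelta negates only the allocation values and zeroes the
// in-use values of p0, so p1's in-use values survive the merge.
func fixedDelta(p0, p1 heapValues) heapValues {
	return add(p1, scaleN(p0, [4]float64{-1, -1, 0, 0}))
}

func main() {
	p0 := heapValues{100, 4096, 10, 512}
	p1 := heapValues{150, 6144, 12, 640}
	fmt.Println("naive:", naiveDelta(p0, p1)) // in-use values are corrupted
	fmt.Println("fixed:", fixedDelta(p0, p1)) // in-use values match p1
}
```

With these inputs the naive subtraction reports 2 in-use objects and 128 in-use bytes, although 12 objects and 640 bytes are actually live at the time of the second dump.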

DataDog's fastdelta

DataDog's fastdelta profiler uses another approach.

It improves the runtime/pprof approach by keeping a copy of the previous profile and subtracting it from the current profile. The fastdelta profiler uses a custom protobuf pprof parser that doesn't allocate as much memory. This approach is more efficient, faster, and produces less garbage. It also doesn't require dumping two profiles. However, the fastdelta profiler still parses huge profiles, up to megabytes in size, just to discard most of the data.

godeltaprof

godeltaprof does a similar job but slightly differently.

Delta computation happens before any pprof data is serialized, using runtime.MemProfileRecord and runtime.BlockProfileRecord. This way, huge profiles don't need to be parsed. The delta is computed on raw records, records whose delta is all zeros are dropped, and the results are serialized and compressed.

The source code for godeltaprof is forked from the original runtime/pprof package. It is modified to compute the delta before serialization and to expose the new endpoints. There are other small improvements and benefits:

  • Using github.com/klauspost/compress/gzip instead of compress/gzip
  • Optional lazy mappings reading (they don't change over time for most applications)
  • Separate package from runtime, so it can be updated independently

benchmarks

These benchmarks used memory profiles from the pyroscope server.

  • BenchmarkOG - dumps a memory profile with the runtime/pprof package
  • BenchmarkFastDelta - dumps a memory profile with the runtime/pprof package and computes the delta using fastdelta
  • BenchmarkGodeltaprof - does not dump a profile with runtime/pprof; computes the delta and outputs the result

Each benchmark also outputs the sizes of the profiles it produced.

BenchmarkOG
      63         181862189 ns/op
profile sizes: [209117 209107 209077 209089 209095 209076 209088 209082 209090 209092]

BenchmarkFastDelta
      43         273936764 ns/op
profile sizes: [169300 10815 8969 9511 9752 9376 9545 8959 10357 9536]

BenchmarkGodeltaprof
     366          31148264 ns/op
profile sizes: [208898 11485 9347 9967 10291 9848 10085 9285 11033 9986]

Notice how BenchmarkOG profile sizes are ~200k, while BenchmarkGodeltaprof and BenchmarkFastDelta profile sizes are ~10k after the first dump. That is because a lot of samples with zero values are discarded after delta computation.

Source code of benchmarks can be found here

CPU profiles: BenchmarkOG, BenchmarkFastDelta, BenchmarkGodeltaprof

upstreaming

TODO(korniltsev): create a golang issue and ask whether godeltaprof could be considered for merging into the upstream golang repo in some way (maybe not as is, maybe with different APIs)

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type BlockProfiler

type BlockProfiler struct {
	// contains filtered or unexported fields
}

BlockProfiler is a stateful profiler for goroutine blocking events and mutex contention in Go programs. Depending on the function used to create the BlockProfiler, it uses either runtime.BlockProfile or runtime.MutexProfile. The BlockProfiler provides similar functionality to pprof.Lookup("block").WriteTo and pprof.Lookup("mutex").WriteTo, but with some key differences.

The BlockProfiler tracks the delta of blocking events or mutex contention since the last profile was written, effectively providing a snapshot of the changes between two points in time. This is in contrast to the pprof.Lookup functions, which accumulate profiling data and result in profiles that represent the entire lifetime of the program.

The BlockProfiler is safe for concurrent use, as it serializes access to its internal state using a sync.Mutex. This ensures that multiple goroutines can call the Profile method without causing any data race issues.
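The stateful, mutex-guarded pattern described above can be illustrated with a reduced model. This is a sketch only, assuming a single cumulative counter in place of real profile records; deltaProfiler, its read field, and sumConcurrentDeltas are hypothetical names and do not reflect the package's internals.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
	"sync/atomic"
)

// deltaProfiler is a hypothetical stateful profiler: it remembers the
// last cumulative value and writes only the change since then.
type deltaProfiler struct {
	mu   sync.Mutex
	read func() int64 // returns the current cumulative value
	prev int64
}

// Profile writes the delta since the previous call. The mutex makes
// the read-subtract-store sequence atomic across goroutines.
func (p *deltaProfiler) Profile(w io.Writer) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	cur := p.read()
	delta := cur - p.prev
	p.prev = cur
	_, err := fmt.Fprintf(w, "delta=%d\n", delta)
	return err
}

// sumConcurrentDeltas calls Profile from several goroutines and
// returns the sum of all reported deltas and the final cumulative value.
func sumConcurrentDeltas(workers int) (sum, cumulative int64, err error) {
	var events atomic.Int64
	p := &deltaProfiler{read: events.Load}
	bufs := make([]bytes.Buffer, workers+1)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			events.Add(1)
			_ = p.Profile(&bufs[i]) // writes to bytes.Buffer do not fail
		}(i)
	}
	wg.Wait()
	// One last call picks up anything not yet reported.
	if err := p.Profile(&bufs[workers]); err != nil {
		return 0, 0, err
	}
	for i := range bufs {
		var d int64
		if _, err := fmt.Sscanf(bufs[i].String(), "delta=%d", &d); err != nil {
			return 0, 0, err
		}
		sum += d
	}
	return sum, events.Load(), nil
}

func main() {
	sum, cumulative, err := sumConcurrentDeltas(8)
	fmt.Println(sum, cumulative, err)
}
```

Because each call subtracts the previous value while holding the lock, the deltas never overlap and always add up to the cumulative total, regardless of how the goroutines interleave.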

func NewBlockProfiler

func NewBlockProfiler() *BlockProfiler

NewBlockProfiler creates a new BlockProfiler instance for profiling goroutine blocking events. The resulting BlockProfiler uses runtime.BlockProfile as its data source.

Usage:

bp := godeltaprof.NewBlockProfiler()
...
err := bp.Profile(someWriter)

func NewBlockProfilerWithOptions added in v0.1.6

func NewBlockProfilerWithOptions(options ProfileOptions) *BlockProfiler

func NewMutexProfiler

func NewMutexProfiler() *BlockProfiler

NewMutexProfiler creates a new BlockProfiler instance for profiling mutex contention. The resulting BlockProfiler uses runtime.MutexProfile as its data source.

Usage:

mp := godeltaprof.NewMutexProfiler()
...
err := mp.Profile(someWriter)

func NewMutexProfilerWithOptions added in v0.1.6

func NewMutexProfilerWithOptions(options ProfileOptions) *BlockProfiler

func (*BlockProfiler) Profile

func (d *BlockProfiler) Profile(w io.Writer) error

type HeapProfiler

type HeapProfiler struct {
	// contains filtered or unexported fields
}

HeapProfiler is a stateful profiler for heap allocations in Go programs. It is based on runtime.MemProfile and provides similar functionality to pprof.WriteHeapProfile, but with some key differences.

The HeapProfiler tracks the delta of heap allocations since the last profile was written, effectively providing a snapshot of the changes in heap usage between two points in time. This is in contrast to the pprof.WriteHeapProfile function, which accumulates profiling data and results in profiles that represent the entire lifetime of the program.

The HeapProfiler is safe for concurrent use, as it serializes access to its internal state using a sync.Mutex. This ensures that multiple goroutines can call the Profile method without causing any data race issues.

Usage:

hp := godeltaprof.NewHeapProfiler()
...
err := hp.Profile(someWriter)

func NewHeapProfiler

func NewHeapProfiler() *HeapProfiler

func NewHeapProfilerWithOptions added in v0.1.6

func NewHeapProfilerWithOptions(options ProfileOptions) *HeapProfiler

func (*HeapProfiler) Profile

func (d *HeapProfiler) Profile(w io.Writer) error

type ProfileOptions added in v0.1.6

type ProfileOptions struct {
	// for go1.21+ if true - use runtime_FrameSymbolName - produces frames with generic types, for example [go.shape.int]
	// for go1.21+ if false - use runtime.Frame->Function - produces frames with generic types omitted [...]
	// pre 1.21 - always use runtime.Frame->Function - produces frames with generic types omitted [...]
	GenericsFrames bool
	LazyMappings   bool
}

Directories

Path Synopsis
compat module
http
internal
