stirling/

directory
v0.0.0-...-79886a4 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 9, 2024 License: Apache-2.0

README

Stirling

Stirling is Pixie's data collector on nodes. Stirling uses Linux APIs, including eBPF technology to gather application metrics and events from the Linux kernel, system libraries, or the application itself.

Some of the primary data collected by Stirling includes:

  • Application CPU, memory and network utilization.
  • Application network messaging events for a variety of protocols (e.g. HTTP, MySQL, Postgres, ...).

The Stirling architecture consists of a core infrastructure into which "source connectors" can be added to extract information from different data sources. The source connectors export data tables which are published via the Stirling API.

Directory Structure

The Stirling directory structure is as follows:

binaries            # Standalone executable versions of Stirling.
bpf_tools           # Tools for managing the compilation and deployment of BPF code.
BUILD.bazel         # Top-level build file.
core                # Core Stirling infrastructure, without data source connectors.
e2e_tests           # Top-level stirling tests (mostly shell tests).
LICENSE.txt         # Stirling specific license
obj_tools           # Tools for managing binary objects (e.g. ELF and DWARF information).
proto               # Public messages for publishing data tables.
README.md           # This README.
scripts             # Utility scripts.
source_connectors   # Directory containing all source connectors, which are responsible for gathering data.
stirling.cc         # Top-level Stirling source code.
stirling.h          # Top-level Stirling header file.
testing             # Testing utilities.
utils               # Utilities.
BPF code

Many of the source_connectors use BPF (managed by BCC) to collect their data.

We use BCC to write the BPF C code and manage the runtime of the source connectors that use BPF.

For each of these BPF based source_connectors, there should be at least two subdirectories:

  • bcc_bpf: Contains all the BPF code that is compiled by BCC. This code is licensed differently than the rest of Stirling and is intentionally kept isolated. No code outside the directory should #include or link to the code in this directory. All data transfers are done through BPF maps or buffers, data structures for which are described in bcc_bpf_intf.
  • bcc_bpf_intf: Contains public data structures that are used to transport data out from the kernel to user-space. The headers in this file are typically #included by both the BPF code and our user-space C++ code.

Building & Testing

The Stirling code base can be build with the following command:

bazel build //src/stirling/...

Certain Stirling tests use BPF, which requires root privileges. Those tests are marked with the bazel tag requires_bpf. When you run the standard bazel test, only those tests that do not require BPF are run.

bazel test //src/stirling/...

To run an individual BPF test, you can use the sudo_bazel_run.sh convenience script. For example:

sudo_bazel_run.sh //src/stirling/source_connectors/socket_tracer:socket_trace_bpf_test

The sudo_bazel_run.sh builds the target, and then uses sudo to run the script directory.

Note: if a test uses sharding, --test_sharding_strategy=disabled is required to run the test.

Running the BPF test suite

To run the entire BPF test suite, you can deploy to Jenkins, which will run these tests with the appropriate priveleges and environment.

To run the tests locally, you can also use a container:

$PIXIE_ROOT/scripts/run_docker.sh

bazel test --config=bpf //path/to:test

Stirling Wrapper

stirling_wrapper is a stand-alone version of Stirling that one can run locally. It does not require the rest of the PEM, or even k8s. It will collect the data from its sources and print the results directly to STDOUT.

You can run stirling_wrapper via the following command, which builds stirling_wrapper and runs the executable with a sudo prompt.

$PIXIE_ROOT/src/stirling/scripts/stirling_wrapper.sh

Stirling docker container environment

Because Stirling uses the Linux kernel APIs (including BPF), special flags are required to make sure Stirling runs properly when Stirling is itself inside a container. Since the kernel is not part of the container, Stirling's view is that it should be tracing the host. For example, the BPF probes will naturally be on the host and not confined to the container.

To make sure Stirling runs properly in a container environment, the following flags are required:

  • --privileged: BPF requires root permissions.
  • --volume=/sys:/sys: Required for BPF probes to function correctly.
  • --pid=host: Uses host's PID namespace. Without this, Stirling's calls to getpid() won't be correctly resolved. Also used in tests to easily find the target PID and ensure it's same as what BPF sees. This is used inside Jenkinsfile.
  • --volume=/:/host: Make the entire host file system to /host inside container. Stirling needs this to access all of the data files on the host. One of them is the system headers, which is used by BCC to compile C code. Also required for accessing debug info of binaries in other containers.
  • --env PL_HOST_PATH=/host: Related to the line above.

If any additional flags become required, be sure to update at least the following files:

  • scripts/run_docker.sh
  • k8s/pem/pem_daemonset.yaml
  • src/stirling/k8s: all yaml files
  • src/stirling/scripts/docker_run_stirling_wrapper.sh
  • src/stirling/e2e_tests: files that run stirling in a container

Dynamically installed Linux headers in Stirling's BPF runtime

Stirling's BPF runtime requires kernel headers to compile the BPF code. If the target host does not have kernel headers pre-installed, Stirling looks for the appropriate headers package bundled with the PEM image, and extracts them to the correct path inside the container.

In order to limit the size of the PEM image, only a subset of kernel versions are included. Stirling picks the headers package with the closest version number to the host's kernel version.

As a result, the dynamically-installed kernel headers might not exactly match the host kernel. This may result in unexpected runtime behavior if certain structs are not the right match. For instance, in places where Stirling's BPF code accesses kernel data structures, an accessed member's offset in a struct may have changed between versions, and the byte code compiled with an older version of the kernel headers would result in accessing wrong data. More rarely, the BPF code compilation may fail altogether if struct member names have changed.

To avoid such issues, we manually examine the history of the kernel data structures accessed in Stirling's BPF code, and make sure the correct linux headers version are included in the PEM image.

FAQ

Why do some of my records have exactly the same latency?

Rarely, a few successive records may have the exactly-same latency value. That is because theose records, despite being separated into multiple req/resp pairs, were actually sent in batch by syscalls. Stirling captures data from those syscalls, and cannot assign different timestamps to those messages. As a result, records have the exactly-same latency.

This is an expected behavior, and a result of Stirling's use of eBPF kprobes.

Directories

Path Synopsis
demo_apps
source_connectors
socket_tracer/testing/containers/pgsql
This file is adapted from README.md of https://github.com/jmoiron/sqlx.
This file is adapted from README.md of https://github.com/jmoiron/sqlx.
testing
demo_apps/go_http/go_http_client
The intention to have matching http client in Go, is to mimic the typical setup of calling a restful service in Go.
The intention to have matching http client in Go, is to mimic the typical setup of calling a restful service in Go.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL