prometheus-sql-adapter

command module v0.4.0
Published: Feb 28, 2020 License: MIT Imports: 23 Imported by: 0

Prometheus SQL Adapter

Adapter to connect Prometheus' remote write endpoint to a PostgreSQL server, preferably running TimescaleDB. Caches labels for each timeseries to reduce writes, linking them to samples by metric fingerprint.

This adapter was inspired by the Timescale PostgreSQL adapter and maintains a compatible schema, so queries may be used with either, but this adapter does not require the pg_prometheus extension, making it compatible with Aurora PostgreSQL, Azure Database for PostgreSQL, and other managed PostgreSQL services.

While it is possible to use this adapter and most of the schema without TimescaleDB, it will become difficult to prune older data, compression will not be available, and queries will be slower. If you can use TimescaleDB, please do.

Features

  • batteries included
    • example queries for alerting, reports, and schema metadata
    • Grafana dashboards for Kubernetes workloads, hardware metrics, and schema metadata
    • Jupyter notebooks for long-term reporting
    • PostgreSQL server image with schema and grant setup scripts
    • Prometheus rules for derived metrics
  • compatible schema
    • query compatible with Timescale's official pg_prometheus schema
    • does not require pg_prometheus extension
    • does not require superuser or extension privileges
  • efficient storage
    • hashed & cached label IDs
    • samples in compressed hypertable
    • uses Go SQL and bulk copy for samples

Getting Started

  • create the schema:
    • deploy kubernetes/server.yml
    • or docker run --rm -p 5432:5432 ssube/prometheus-sql-adapter:master-postgres-11 -c 'shared_preload_libraries=timescaledb'
  • or run ./scripts/schema-create.sh [license-level] [retain-live] [retain-total] against an existing database
  • configure adapters:
    • create a role for each set of adapters to write
    • run ./scripts/schema-grant.sh [role-name] adapter
    • deploy kubernetes/adapter.yml
  • configure Grafana:
    • create a role for each Grafana instance to read
    • run ./scripts/schema-grant.sh [role-name] grafana
    • add a Postgres data source
    • import dashboards from grafana/
  • configure humans:
    • create a role for each developer to read
    • run ./scripts/schema-grant.sh [role-name] human
    • show off your sweet new graphs

The schema scripts are idempotent and safe to run repeatedly, including schema-create.sh.

Non-breaking upgrades can be performed by running the schema scripts again, in the same order.
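With an adapter deployed, Prometheus must be pointed at it through a standard remote_write entry. A minimal sketch; the host and port below are placeholders for wherever the adapter is reachable in your cluster, not documented defaults:

```yaml
# prometheus.yml (fragment)
remote_write:
  # Placeholder address: substitute the service name and port
  # your adapter deployment actually exposes.
  - url: "http://prometheus-sql-adapter:9201/write"
```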

Schema

This schema is compatible with the Timescale pg_prometheus adapter but does not require the pg_prometheus extension or SUPERUSER permissions.

Captured labels and samples are split into two tables: labels are stored uniquely and identified by their FNV-1a hash, and samples reference that hash.

Labels are stored once for each timeseries using the metric's hashed fingerprint, or label ID (lid). This is provided by the Prometheus SDK and uses the 64-bit FNV-1a hash, which is then stored as a UUID column. The remote write adapters each maintain an LRU cache of label IDs, and can be configured not to rewrite labels.
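The fingerprint-to-lid mapping can be sketched in Go. The 64-bit FNV-1a hash is available from the standard library's hash/fnv (the adapter itself takes the fingerprint from the Prometheus SDK); the zero-padded UUID layout below is an assumption for illustration, since the README does not specify how the 64-bit hash is packed into the UUID column:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// fingerprint hashes a canonical label string with 64-bit FNV-1a,
// the same hash family Prometheus uses for series fingerprints.
func fingerprint(labels string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(labels))
	return h.Sum64()
}

// lidUUID formats the 64-bit fingerprint as a UUID-shaped string by
// zero-padding the high bits. NOTE: this padding scheme is assumed for
// illustration; the adapter's actual encoding may differ.
func lidUUID(fp uint64) string {
	return fmt.Sprintf("00000000-0000-0000-%04x-%012x", fp>>48, fp&0xffffffffffff)
}

func main() {
	fp := fingerprint(`{__name__="up",job="node"}`)
	fmt.Println(lidUUID(fp))
}
```

Because the hash is deterministic, every adapter computes the same lid for the same label set without any coordination.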

Using the metric's fingerprint provides a short, deterministic identifier for each label set, or timeseries. The adapters do not need to coordinate and can safely write in parallel, using an ON CONFLICT clause to skip or update existing label sets. While a numeric counter might be shorter than the current hash-as-UUID, it would require coordination between the adapters or within the database. The hashed lid avoids lookups when writing to an existing timeseries.
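The parallel-safe write described above amounts to a plain INSERT with an ON CONFLICT clause keyed on the lid. A sketch; the table and column names (metric_labels, lid, labels) are assumptions for illustration, not necessarily the adapter's exact schema:

```sql
-- Upsert a label set keyed by its fingerprint. Concurrent adapters
-- may race on the same lid; the conflict clause makes that safe.
INSERT INTO metric_labels (lid, time, labels)
VALUES (
  '00000000-0000-0000-cbf2-9ce484222325',
  now(),
  '{"__name__": "up", "job": "node"}'::jsonb
)
ON CONFLICT (lid) DO UPDATE SET time = EXCLUDED.time;
```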

Maximum time ranges may be enforced by the metrics view to limit the amount of raw data that can be fetched at once, but deduplication and aggregation typically need more context to determine the correct operators, and must happen later. The maximum range allowed by metrics is used for the compression policy as well, compacting data when it can no longer be queried directly. Finally, the view makes this schema compatible with the original pg_prometheus schema and the v0.1 schema (which featured a single metrics table with both value and labels).
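The range limit can live in the view definition itself, with the compression policy set to the same interval. A minimal sketch under assumed names (metric_samples, metric_labels) and an illustrative one-hour window, in TimescaleDB 1.x style:

```sql
-- Expose only data newer than the maximum query range.
CREATE VIEW metrics AS
SELECT s.time, s.value, l.labels
FROM metric_samples s
JOIN metric_labels l ON l.lid = s.lid
WHERE s.time > now() - INTERVAL '1 hour';

-- Compress chunks once they age out of the view's window.
SELECT add_compress_chunks_policy('metric_samples', INTERVAL '1 hour');
```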

Views

Views within the schema are split into three groups: aggregated samples, materialized catalogs of the schema, and compatibility with the pg_prometheus schema.

Aggregate views are prefixed with agg_ and use TimescaleDB's continuous aggregates to occasionally refresh a materialized view and aggregate samples into larger time buckets.
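Such a view can be defined with TimescaleDB 1.x continuous-aggregate syntax. A sketch with assumed names (agg_avg_5m, metric_samples) and an illustrative five-minute bucket:

```sql
CREATE VIEW agg_avg_5m
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('5 minutes', time) AS bucket,
  lid,
  avg(value) AS value_avg,
  max(value) AS value_max
FROM metric_samples
GROUP BY bucket, lid;
```

TimescaleDB refreshes the materialization in the background, so queries against the aggregate stay cheap even over long ranges.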

Catalog views are prefixed with cat_ and materialize expensive views of the metric labels, enriching them with collected metadata.

Compatibility views ensure the schema is fully compatible with the pg_prometheus extension's schema, despite slightly different underlying storage.
