carbon-tagger

command module
v0.0.0-...-1a095a0

Note: this package is not in the latest version of its module.
Published: Mar 24, 2015 License: Apache-2.0 Imports: 16 Imported by: 0

README

Carbon-tagger

graphite protocol extension

Standard graphite protocol (proto1)

string.based.metric[...] value unix_timestamp

"string", "based", and "metric" are the nodes of the metric. Metric names are usually unorganized, unstandardized, and lacking information.

Carbon-tagger implements metrics 2.0, which aims to make metrics entirely self-describing, structured, and standardized.

It does this by using an extended graphite protocol (backwards compatible):

  • nodes can be old-style values, or new-style key=val or key_is_val tag pairs. (Some versions of graphite-web/graphite-api struggle with equals signs, so the latter format is recommended)
  • if there's a "=" or "_is_" in one or more of the nodes, we'll try to parse it as proto2 and add it to the index, provided the conditions below are met:
  • there must be a tag pair with unit as tag key.
  • there must be at least one other tag.
  • you can freely choose the order of the nodes for every metric, but when you change the order, you change the metric key.
  • old-style nodes (i.e. not "key=val" or key_is_val format) within a proto2 metric implicitly get an "nX" tag key where X is the node position in the string, starting from 1.
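The parsing rules above can be sketched roughly as follows. This is an illustration, not carbon-tagger's actual implementation; the function name and exact detection logic are assumptions:

```go
package main

import (
	"fmt"
	"strings"
)

// parseProto2 splits a dotted metric key into tag pairs per the rules above:
// "key=val" or "key_is_val" nodes become tags, and plain old-style nodes get
// an implicit "nX" key (X = 1-based node position). It reports ok=false
// unless a "unit" tag plus at least one other tag are present.
func parseProto2(key string) (map[string]string, bool) {
	tags := make(map[string]string)
	for i, node := range strings.Split(key, ".") {
		switch {
		case strings.Contains(node, "="):
			kv := strings.SplitN(node, "=", 2)
			tags[kv[0]] = kv[1]
		case strings.Contains(node, "_is_"):
			kv := strings.SplitN(node, "_is_", 2)
			tags[kv[0]] = kv[1]
		default:
			tags[fmt.Sprintf("n%d", i+1)] = node
		}
	}
	_, hasUnit := tags["unit"]
	return tags, hasUnit && len(tags) >= 2
}

func main() {
	tags, ok := parseProto2("dc1.service=web.unit=Reqps")
	fmt.Println(ok, tags["n1"], tags["service"], tags["unit"]) // true dc1 web Reqps
}
```

Note that a purely old-style key like "string.based.metric" fails the unit check and stays a legacy metric.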

You'll probably want to follow the metrics naming conventions, and in particular apply the correct units.

So this defines a metric as a set of tags, and while we're at it, it also specifies the metric_id that will be used by the current carbon and related tools. carbon-tagger will maintain a database of metrics and their tags, and pass it on (unaltered) to a daemon like carbon-cache or carbon-relay. So the protocol is fully backwards compatible.

This is metrics 2.0, but:

  • using dots as delimiters until we can fix graphite.
  • you can use units like "Mbps" or "Errps" to mean "Mb/s" and "Err/s". Graphite treats slashes as delimiters. Carbon-tagger will set the proper unit tag.
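The unit rewrite could look something like this sketch; the exact rule carbon-tagger applies is an assumption here:

```go
package main

import (
	"fmt"
	"strings"
)

// expandUnit turns a slash-free "ps" suffix into "/s", e.g. "Mbps" -> "Mb/s".
// Illustrative only: the precise rewrite carbon-tagger performs may differ
// (note the length guard keeps a bare unit like "ps" untouched).
func expandUnit(u string) string {
	if strings.HasSuffix(u, "ps") && len(u) > len("ps") {
		return strings.TrimSuffix(u, "ps") + "/s"
	}
	return u
}

func main() {
	fmt.Println(expandUnit("Mbps"), expandUnit("Errps"), expandUnit("B")) // Mb/s Err/s B
}
```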

indexing

  • metrics 2.0 are indexed in full (_id and tags)
  • legacy metrics: only the _id (with an empty tags property), so you can search for them. It's up to a tool like graph-explorer to create or update documents for legacy metrics with tags enabled.

how does this affect the rest of my stack?

  • carbon-relay, carbon-cache: unaffected; they receive the same data as usual, the identifiers just look a little different.
  • graphite-web: unaffected. Metrics 2.0 still show up in the tree of the graphite composer, although that's an inferior UI model that I want to phase out. Ideally, dashboards leverage the tag database, like graph-explorer does.
  • aggregators like statsd need proto2 support. statsdaemon is a drop-in statsd replacement that, for metrics in metrics 2.0 format, correctly represents its aggregations and statistical summaries by updating the key/value pairs of metric names, rather than the traditional prefix/suffix "features", which leave metrics even vaguer than they were.
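The difference can be illustrated with two hypothetical naming helpers. The "stat" tag key and the function names are assumptions for the sake of the example, not statsdaemon's exact output:

```go
package main

import "fmt"

// legacyName mimics traditional statsd prefix/suffix naming for a
// statistical summary of a timer metric.
func legacyName(key, summary string) string {
	return "stats.timers." + key + "." + summary
}

// proto2Name instead appends a key/value pair, keeping the metric
// self-describing in metrics 2.0 style.
func proto2Name(key, summary string) string {
	return key + ".stat=" + summary
}

func main() {
	fmt.Println(legacyName("render", "upper_90"))              // stats.timers.render.upper_90
	fmt.Println(proto2Name("service=web.unit=ms", "upper_90")) // service=web.unit=ms.stat=upper_90
}
```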

why do you process the same metrics every time they are submitted?

To have a realtime database. You could instead collect all metric names later and process them offline, which lowers resource usage but introduces higher delays.

internal metrics

They are in proto2 format and are submitted to a carbon endpoint (typically your relay). They are also available over HTTP at /debug/vars2.

performance

Currently, not very optimized at all! But it's probably speedy enough, and there's a big buffer that smooths out the effect of new metrics.

  • I reach about 15k metrics/s processing speed, even when the temp buffer is full and it's syncing to ES. In fact, I don't see a discernible difference between buffer full (unblocked) and buffer full (blocked); carbon-cache (whisper) was probably the bottleneck.

  • space used: 176B/metric (21M for 125k metrics, twice that if we'd enable indexing/analyzing)

TODO

  • make sure the bulk thing uses 'create' and doesn't update/replace the doc every time
  • it seems like ES doesn't contain all metrics (on 2M unique inserts, ES' count is 1889300)
  • better mapping, _source, type analyzing?
  • don't store legacy metrics with empty tags if a document already exists. For now this will temporarily break graph-explorer's structured metrics (until the latter indexes again)

future optimisations

  • populate an ES cache from disk on program start
  • forward_lines channel buffering, so that the runtime doesn't have to switch goroutines all the time?
  • GOMAXPROCS

if/when ES performance becomes an issue, consider:

  • edge-ngrams/reverse edge-ngrams to do it at index time
  • use prefix match with regular/reverse fields
  • query_string maybe

building

if you already have a working Go setup, adjust accordingly:

```sh
mkdir -p ~/go/
export GOPATH=~/go
go get github.com/Vimeo/carbon-tagger
go get github.com/mjibson/party
go build github.com/Vimeo/carbon-tagger
```

installation

Documentation


There is no documentation for this package.

Directories

Path Synopsis
_third_party
github.com/Dieterbe/go-metrics
Go port of Coda Hale's Metrics library <https://github.com/Dieterbe/go-metrics> Coda Hale's original work: <https://github.com/codahale/metrics>
github.com/Dieterbe/go-metrics/exp
Hook go-metrics into expvar on any /debug/vars2 request, load all vars from the registry into expvar, and execute regular expvar handler
github.com/bitly/go-hostpool
A Go package to intelligently and flexibly pool among multiple hosts from your Go application.
github.com/metrics20/go-metrics20
Package metrics20 provides functions that manipulate a metric string to represent a given operation if the metric is detected to be in metrics 2.0 format, the change will be in that style, if not, it will be a simple string prefix/postfix like legacy statsd.
github.com/pelletier/go-toml
TOML markup language parser.
github.com/stvp/go-toml-config
Package config implements simple TOML-based configuration variables, based on the flag package in the standard Go library (In fact, it's just a simple wrapper around flag.FlagSet).
